
Add is_overshadowed column to sys.segments table #7425

Merged

Conversation

surekhasaharan

@surekhasaharan commented Apr 8, 2019

Addresses #7233

This PR adds a column `is_overshadowed` to the `sys.segments` table.
Adds a new class `SegmentWithOvershadowedStatus` to capture the overshadowed status of each segment on the Coordinator.
Adds an optional new query parameter `includeOvershadowedStatus` to the Coordinator API `/druid/coordinator/v1/metadata/segments`.
Updates tests and docs.
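For reference, below is a minimal sketch of the wrapper class described above, assuming the composition approach that was ultimately kept in this PR (a DataSegment field plus an overshadowed flag, with compareTo based only on the SegmentId); the accessor names are illustrative rather than copied from the merged code.

import com.fasterxml.jackson.annotation.JsonCreator;
import com.fasterxml.jackson.annotation.JsonProperty;
import org.apache.druid.timeline.DataSegment;

public class SegmentWithOvershadowedStatus implements Comparable<SegmentWithOvershadowedStatus>
{
  // The wrapped segment plus its overshadowed status; both are immutable.
  private final DataSegment dataSegment;
  private final boolean overshadowed;

  @JsonCreator
  public SegmentWithOvershadowedStatus(
      @JsonProperty("dataSegment") DataSegment dataSegment,
      @JsonProperty("overshadowed") boolean overshadowed
  )
  {
    this.dataSegment = dataSegment;
    this.overshadowed = overshadowed;
  }

  @JsonProperty
  public DataSegment getDataSegment()
  {
    return dataSegment;
  }

  @JsonProperty
  public boolean isOvershadowed()
  {
    return overshadowed;
  }

  @Override
  public int compareTo(SegmentWithOvershadowedStatus o)
  {
    // Ordering considers only the SegmentId of the wrapped DataSegment.
    return dataSegment.getId().compareTo(o.dataSegment.getId());
  }
}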

@vogievetsky
Contributor

Does it make sense to add this to the web console within this PR?

@surekhasaharan
Author

Does it make sense to add this to the web console within this PR?

I think it'd be better to do it as a separate PR, which should be merged after this one.

@jon-wei self-assigned this Apr 10, 2019
@jon-wei
Contributor

jon-wei commented Apr 10, 2019

Can you update the PR description so that it refers to includeOvershadowedStatus and SegmentWithOvershadowedStatus?

@surekhasaharan
Author

Yes, I missed that here; done.

*/
public class SegmentWithOvershadowedStatus implements Comparable<SegmentWithOvershadowedStatus>
{
private final boolean isOvershadowed;
Contributor

nit: suggest calling this overshadowed

Author

done

@@ -609,6 +609,7 @@ Note that a segment can be served by more than one stream ingestion tasks or His
|is_published|LONG|Boolean is represented as long type where 1 = true, 0 = false. 1 represents this segment has been published to the metadata store with `used=1`|
|is_available|LONG|Boolean is represented as long type where 1 = true, 0 = false. 1 if this segment is currently being served by any process(Historical or realtime)|
|is_realtime|LONG|Boolean is represented as long type where 1 = true, 0 = false. 1 if this segment is being served on any type of realtime tasks|
|is_overshadowed|LONG|Boolean is represented as long type where 1 = true, 0 = false. 1 if this segment is published and is overshadowed by some other published segments. Currently, is_overshadowed is always false for unpublished segments, although this may change in the future. You can filter for segments that "should be published" by filtering for `is_published = 1 AND is_overshadowed = 0`. Segments can briefly be both published and overshadowed if they were recently replaced, but have not been unpublished yet.|
Contributor

1 if this segment is published and is overshadowed by some other published segments.

I think this should mention that this returns 1 only for fully overshadowed segments

Author

Yes, I think it's important to mention that; changed.

final Set<SegmentId> overshadowedSegments = findOvershadowedSegments(druidDataSources);
//transform DataSegment to SegmentWithOvershadowedStatus objects
final Stream<SegmentWithOvershadowedStatus> segmentsWithOvershadowedStatus = metadataSegments.map(segment -> {
if (overshadowedSegments.contains(segment.getId())) {
Contributor

This block could be

return new SegmentWithOvershadowedStatus(
  segment, 
  overshadowedSegments.contains(segment.getId())
);

Author

yes, that looks nicer, thanks.

@@ -92,6 +93,9 @@
private static final String SERVER_SEGMENTS_TABLE = "server_segments";
private static final String TASKS_TABLE = "tasks";

private static final long AVAILABLE_IS_OVERSHADOWED_VALUE = 0L;
Contributor

hm, if we use these, I think IS_OVERSHADOWED_FALSE would be clearer, and it should have IS_OVERSHADOWED_TRUE as well

Author

ok, yeah, I was having a hard time coming up with good constant names for these.

isAvailable,
isRealtime,
val.isOvershadowed() ? 1L : 0L,
Contributor

this should use the constants instead of 1L and 0L

Author

done

@@ -295,6 +302,7 @@ public TableType getJdbcTableType()
val.getValue().isPublished(),
val.getValue().isAvailable(),
val.getValue().isRealtime(),
AVAILABLE_IS_OVERSHADOWED_VALUE,
Contributor

I think this should have a comment that this assumes unpublished segments are never overshadowed.

Author

added comment
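For illustration, the constants and the comment discussed in this thread might look like the following; the names follow the review suggestion above and the wording is an assumption, not the merged code.

// Unpublished (e.g. realtime-only) segments are assumed never to be overshadowed,
// so their is_overshadowed value is always reported as false.
private static final long IS_OVERSHADOWED_FALSE = 0L;
private static final long IS_OVERSHADOWED_TRUE = 1L;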

Response.ResponseBuilder builder = Response.status(Response.Status.OK);
return builder.entity(stream).build();
final Function<DataSegment, Iterable<ResourceAction>> raGenerator = segment -> Collections.singletonList(
AuthorizationUtils.DATASOURCE_READ_RA_GENERATOR.apply(segment.getDataSource()));
Member

Not formatted properly

Member

Not addressed

Author

manually changed

}

/**
* find fully overshadowed segments
Member

Please write proper sentences in Javadocs.

Author

sure, changed.

*
* @return set of overshadowed segments
*/
private Set<SegmentId> findOvershadowedSegments(Collection<ImmutableDruidDataSource> druidDataSources)
Member

How is this method similar to or different from findOvershadowed()?

Member

Does this method belong in MetadataResource, or is there a better place for it?

Author

You mean VersionedIntervalTimeline#findOvershadowed()? I think that one finds partially overshadowed segments as well, while this method only looks for fully overshadowed segments. Also, that method returns a TimelineObjectHolder.

Author

As for whether this method belongs here: yeah, I also thought about it and considered adding it to VersionedIntervalTimeline, but since that's in the core package, there was a dependency issue with taking ImmutableDruidDataSource as an argument. Maybe that can be worked around by passing DataSegment objects instead, if it makes sense to move this to VersionedIntervalTimeline and if it's going to be used by other code.

Contributor

Maybe it could be a static method on ImmutableDruidDataSource that accepts a collection, or an instance method on ImmutableDruidDataSource like getFullyOvershadowedSegments()

Author

@surekhasaharan Apr 16, 2019

thanks, moved this method to ImmutableDruidDataSource as an instance method.

*/
private Set<SegmentId> findOvershadowedSegments(Collection<ImmutableDruidDataSource> druidDataSources)
{
final Stream<DataSegment> segmentStream = druidDataSources
Member

What's the point of extracting segmentStream variable?

Author

It was used for building timelines, and it is still passed to the new method I created in VersionedIntervalTimeline.

final Stream<DataSegment> segmentStream = druidDataSources
.stream()
.flatMap(t -> t.getSegments().stream());
final Set<DataSegment> usedSegments = segmentStream.collect(Collectors.toSet());
Member

What's the point of the creation of this collection instead of iterating existing ImmutableDruidDataSource objects?

Author

I can get rid of this one now.

segment -> new SegmentWithOvershadowedStatus(
segment,
overshadowedSegments.contains(segment.getId())
)).collect(Collectors.toList()).stream();
Member

Not formatted properly

Member

Why is the .collect(Collectors.toList()).stream() part needed?

Author

Fixed the formatting, and that part is actually not needed.

@@ -155,7 +158,8 @@ public Response getDatabaseSegmentDataSource(@PathParam("dataSourceName") final
@Produces(MediaType.APPLICATION_JSON)
public Response getDatabaseSegments(
Member

I think this method is too big now, it should be split into smaller methods.

Member

Not addressed

Author

I looked at this; I'm not sure of the best way to split it, but I took out the parts that find and authorize SegmentWithOvershadowedStatus into a helper method.

@@ -73,8 +73,10 @@
private final BrokerSegmentWatcherConfig segmentWatcherConfig;

private final boolean isCacheEnabled;
// Use ConcurrentSkipListMap so that the order of segments is deterministic and
Member

Use Javadocs for commenting fields

Author

changed to javadocs

AuthorizationUtils.DATASOURCE_READ_RA_GENERATOR.apply(segment.getDataSource()));
if (includeOvershadowedStatus != null) {
final Set<SegmentId> overshadowedSegments = findOvershadowedSegments(druidDataSources);
//transform DataSegment to SegmentWithOvershadowedStatus objects
Member

This comment doesn't add meaning

Author

removed the comment

*
* SegmentWithOvershadowedStatus's {@link #compareTo} method considers only the {@link SegmentId} of the DataSegment object.
*/
public class SegmentWithOvershadowedStatus implements Comparable<SegmentWithOvershadowedStatus>
Member

Did you explore the possibility for this class to extend DataSegment for memory saving purposes?

Author

Yes, in fact I started with extends DataSegment, but in order to call super() I had to pass the DataSegment reference to SegmentWithOvershadowedStatus so that I could get the properties for the super constructor call, something like this:

 @JsonCreator
  public SegmentWithOvershadowedStatus(
      @JsonProperty("dataSegment") DataSegment segment,
      @JsonProperty("overshadowed") boolean overshadowed
  )
  {
    super(
      segment.getDataSource(),
      segment.getInterval(),
      segment.getVersion(),
      segment.getLoadSpec(),
      segment.getDimensions(),
      segment.getMetrics(),
      segment.getShardSpec(),
      segment.getBinaryVersion(),
      segment.getSize()
  );
    this.dataSegment = segment;
    this.overshadowed = overshadowed;
  }

which didn't seem correct to me, as I am both extending the class and passing a reference to that same class into the subclass, so I decided to just keep DataSegment as a member of this class. Is there a better way of doing this?

Member

I don't see why SegmentWithOvershadowedStatus should have a "dataSegment" field rather than having all the fields deconstructed. In fact, deconstructing would save a little serialization/deserialization work and reduce the number of bytes sent over the network as well.

Author

If I deconstruct the DataSegment object, we might save a few bytes on a reference to DataSegment, but then the memory savings in the Broker, where the interned DataSegment is used, would be lost. (Whether to get rid of interning is another issue, which should be addressed outside of this PR.) If the concern is bytes sent over the network, then moving to Smile format instead of JSON can provide a considerable reduction in the number of bytes transferred, which I plan to do in a follow-up PR later.

Member

@leventov Apr 18, 2019

Is SegmentWithOvershadowedStatus stored somewhere for a long time? Interning an object upon deserialization and throwing it away soon doesn't make a lot of sense.

And even if the overshadowed status should be kept around on some node for a long time, you would be better off applying mapping techniques such as those described in #7395 instead of using plain Guava interners. When you do this, you can insert the "overshadowed" flag wherever you want, or have something like ConcurrentHashMap<DataSegment, SegmentWithOvershadowedStatus> for storage, etc.

Contributor

The coordinator API is evolvable, and is already evolving in this patch via request parameters: its response structure is different based on whether or not the includeOvershadowedStatus parameter is provided. If it needs to evolve further, then that would be okay and doable. (Although if all we do is switch to Smile, I don't think structural evolution is needed, since I imagine we would do that switch by making various APIs support both JSON and Smile based on a client header.)

By the way, we could get rid of the old formats after a few releases if we want, by deprecating them and then introducing a hard barrier that rolling updates cannot cross. We usually try to avoid doing this too often but it can be done.

Member

BTW, I think we should have something like @ClusterInternalAPI annotation for this.

Author

Sounds like a good idea to add an annotation for all internal APIs in a separate PR.


thanks

Contributor

@jon-wei left a comment

LGTM

@@ -195,7 +204,7 @@ private void poll()
sb.append("datasources=").append(ds).append("&");
}
sb.setLength(sb.length() - 1);
query = "/druid/coordinator/v1/metadata/segments?" + sb;
query = "/druid/coordinator/v1/metadata/segments?includeOvershadowedStatus?" + sb;
Contributor

I'm not sure if doing two ? in the same URL works, but even if it does, it's poor form; the second one should be a &.

Author

Yes, I didn't realize I had two ?; will change.

// timestamp is used to filter deleted segments
publishedSegments.put(interned, timestamp);
Contributor

This introduces a bug: since there are two possible SegmentWithOvershadowedStatus for each underlying DataSegment, the same segment can now be in publishedSegments twice for a period of time. There are a few ways to deal with this:

  1. Make publishedSegments a TreeSet<SegmentWithOvershadowedStatus> and update the entire map atomically. This is a super clean solution but would burst to higher memory usage (it would need to keep two entire copies of the map in memory when replacing them).
  2. Make publishedSegments a ConcurrentSkipListMap<DataSegment, CachedSegmentInfo> where CachedSegmentInfo is some static class, defined in this file, containing the updated timestamp and the overshadowed boolean. If you do this, the SegmentWithOvershadowedStatus won't be stored long term anymore. You could minimize memory footprint of CachedSegmentInfo, if you want, by making the timestamp a long rather than DateTime.
  3. Make publishedSegments a ConcurrentSkipListSet<SegmentWithOvershadowedStatus>, make SegmentWithOvershadowedStatus mutable (in a thread-safe way), make its equals, hashCode, and compareTo methods only based on the dataSegment field, let its overshadowed field be modified, and add a timestamp field to it. When syncing the cache, get the current object and mutate the overshadowed field if necessary. Btw, a ConcurrentSkipListSet uses a ConcurrentSkipListMap under the hood, so the memory footprint savings of this aren't as much as you might expect relative to (2).

(2) is the variant that's closest to what the code was doing before this patch. One thing I don't love about it is that it is racy: it means that if a set of segments is overshadowed all at once, callers will not necessarily see a consistent view, because the map is being concurrently updated. They'll see the overshadowed flag get set for the underlying segments one at a time. But the same problem existed in the old code, so fixing it could be considered out of scope for this patch.

Contributor

If I were you, I would probably do (2) and then consider whether there's a way to address the raciness in a future patch.

Author

Omg, thanks for catching this issue. Actually I feel like going with option (1); I didn't do it initially because of the memory bump it'd cause, like you said, but it would have avoided such bugs and it updates the cache atomically. I tried option (2) as well; one thing it would cause is that the return type of getPublishedSegments() changes, so CachedSegmentInfo cannot be private to this class. getPublishedSegments() might need to be split into two methods, getPublishedSegments() and getCachedPublishedSegments(), or we'd create yet another wrapper class and return that. Another potential ugliness is that clients would need to know whether the cache is enabled and call the right method to get published segments.

Author

Okay, I added changes to do (2) now, as we discussed.

Member

I don't have numbers, but I'm concerned about a ConcurrentSkipListMap of all-segments-in-system cardinality. Maybe take Gian's approach 1), but instead of TreeMap using sorted arrays of DataSegment objects. (Or Guava's ImmutableSortedMap, which uses the same approach underneath).

See https://webcache.googleusercontent.com/search?q=cache:csIsYj1a5oAJ:https://gist.github.com/gaul/7108880+&cd=2&hl=en&ct=clnk&gl=es&lr=lang_en%7Clang_ru:

ConcurrentSkipListMap    36
ImmutableSortedMap        8

So I'm pretty sure even temporarily having two ImmutableSortedMaps in memory will still beat one ConcurrentSkipListMap.

Using sorted arrays directly, even materializing those two maps in memory can be avoided.

Author

I used ImmutableSortedSet; now the cache gets replaced atomically. @gianm @leventov let me know if you see any issues with this.

Contributor

Option 1 (what you went with) sounds good to me, especially in light of ConcurrentSkipListMap's additional overhead.
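To make the chosen approach concrete, here is a small, self-contained sketch of an atomically swapped immutable snapshot guarded by a latch, in the spirit of the cache changes discussed above; the class and method names are hypothetical, not the actual MetadataSegmentView code.

import com.google.common.collect.ImmutableSortedSet;
import java.util.Set;
import java.util.concurrent.CountDownLatch;

// Hypothetical, simplified cache holder illustrating option (1).
class PublishedSegmentsCache<T extends Comparable<T>>
{
  private final CountDownLatch cachePopulated = new CountDownLatch(1);

  // volatile: readers always observe the latest fully built snapshot
  private volatile ImmutableSortedSet<T> snapshot = ImmutableSortedSet.of();

  void replaceSnapshot(Set<T> polled)
  {
    // Build a new sorted snapshot, then swap the reference atomically.
    // The old and new sets coexist only briefly, which is the memory trade-off noted above.
    snapshot = ImmutableSortedSet.copyOf(polled);
    cachePopulated.countDown();
  }

  Iterable<T> getPublishedSegments() throws InterruptedException
  {
    // Block until the first poll has populated the cache, instead of throwing.
    cachePopulated.await();
    return snapshot;
  }
}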

*
* @return set of overshadowed segments
*/
public Set<SegmentId> getFullyOvershadowedSegments()
Member

The name of an expensive computation method should start with get-, which usually implies cheap, no-allocation method in Java. It can be compute-, determine-, or find-.

Author

renamed to determineOvershadowedSegments

@@ -109,6 +112,27 @@ public long getTotalSizeOfSegments()
return totalSizeOfSegments;
}

/**
* This method finds the fully overshadowed segments in this datasource
Member

What does the prefix "fully" mean in this context? Can a segment be "just" overshadowed?

Contributor

A segment can be partially overshadowed.

Member

The term "overshadowed" is used throughout the codebase in the fully overshadowed meaning. If partially overshadowed concept is ever used in the codebase, it should be called partiallyOvershadowed.

Compare with determineOvershadowedSegments() method. We should either have "FullyOveshadowed" everywhere or nowhere. I think we should have it nowhere.

Contributor

Sure, I think it makes sense to just use the word "overshadowed" here.

Author

renamed "fully overshadowed" -> "overshadowed"

{
final Collection<DataSegment> segments = this.getSegments();
final Map<String, VersionedIntervalTimeline<String, DataSegment>> timelines = VersionedIntervalTimeline.buildTimelines(
segments);
Member

Not formatted properly.

Author

Not sure if I am missing something; I re-imported druid_intellij_formatting.xml again, and this is what I get when I reformat the code. I tried adding a manual line break and reformatting; see if it looks okay now?

}

final Map<String, VersionedIntervalTimeline<String, DataSegment>> timelines = VersionedIntervalTimeline.buildTimelines(
params.getAvailableSegments());
Member

Not formatted properly

Author

same here, reformat doesn't change the formatting

@@ -141,13 +139,8 @@ public DruidCoordinatorRuntimeParams run(DruidCoordinatorRuntimeParams params)

private Set<DataSegment> determineOvershadowedSegments(DruidCoordinatorRuntimeParams params)
Member

Looks like this method is the same as getFullyOvershadowedSegments(), can they be merged?

Author

Yeah, it's similar, except for the input args and return type. I think getFullyOvershadowedSegments would be useful, as it's public and can be used by anyone with a datasource object; I don't see how I can use that one in here, though.

Member

It seems to me there can be a static method determineOvershadowedSegments(Iterable<DataSegment> segments). DataSource's method can delegate to that static method for the convenience of API. DruidCoordinatorRuleRunner's can be removed and determineOvershadowedSegments(params.getAvailableSegments()) is used instead.

Author

ok, made determineOvershadowedSegments static in ImmutableDruidDataSource and removed the one from here
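As a conceptual illustration of what "overshadowed" means in this PR: a segment is considered overshadowed when segments with strictly higher versions cover its entire interval. The self-contained sketch below is a deliberate simplification with a hypothetical Seg type standing in for DataSegment; the real determineOvershadowedSegments builds a VersionedIntervalTimeline per datasource and also accounts for partition completeness, which this sketch ignores.

import java.util.ArrayList;
import java.util.Collection;
import java.util.Comparator;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class OvershadowedSketch
{
  // Hypothetical stand-in for DataSegment: an id, a millisecond interval, and a version.
  static class Seg
  {
    final String id;
    final long start;
    final long end; // exclusive
    final int version;

    Seg(String id, long start, long end, int version)
    {
      this.id = id;
      this.start = start;
      this.end = end;
      this.version = version;
    }
  }

  static Set<String> determineOvershadowedSegments(Collection<Seg> segments)
  {
    final Set<String> overshadowed = new HashSet<>();
    for (Seg s : segments) {
      // Gather the intervals of strictly-higher-version segments that overlap s.
      final List<long[]> higher = new ArrayList<>();
      for (Seg other : segments) {
        if (other.version > s.version && other.start < s.end && other.end > s.start) {
          higher.add(new long[]{other.start, other.end});
        }
      }
      if (covers(higher, s.start, s.end)) {
        overshadowed.add(s.id);
      }
    }
    return overshadowed;
  }

  // True if the union of the intervals covers [start, end) with no gaps.
  private static boolean covers(List<long[]> intervals, long start, long end)
  {
    intervals.sort(Comparator.comparingLong(i -> i[0]));
    long coveredUpTo = start;
    for (long[] interval : intervals) {
      if (interval[0] > coveredUpTo) {
        return false; // a gap remains before this interval begins
      }
      coveredUpTo = Math.max(coveredUpTo, interval[1]);
    }
    return coveredUpTo >= end;
  }
}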

final Iterable<DataSegment> authorizedSegments =
AuthorizationUtils.filterAuthorizedResources(req, metadataSegments::iterator, raGenerator, authorizerMapper);
final Function<SegmentWithOvershadowedStatus, Iterable<ResourceAction>> raGenerator = segment -> Collections.singletonList(
AuthorizationUtils.DATASOURCE_READ_RA_GENERATOR.apply(segment.getDataSegment().getDataSource()));
Member

Not addressed

Response.ResponseBuilder builder = Response.status(Response.Status.OK);
return builder.entity(stream).build();
final Function<DataSegment, Iterable<ResourceAction>> raGenerator = segment -> Collections.singletonList(
AuthorizationUtils.DATASOURCE_READ_RA_GENERATOR.apply(segment.getDataSource()));
Member

Not addressed

@@ -155,7 +158,8 @@ public Response getDatabaseSegmentDataSource(@PathParam("dataSourceName") final
@Produces(MediaType.APPLICATION_JSON)
public Response getDatabaseSegments(
Member

Not addressed

@surekhasaharan
Author

Thanks @leventov for the review. I believe I have addressed all the comments that are in scope for this PR; let me know if you have more comments.

/**
* DataSegment object plus the overshadowed status for the segment. An immutable object.
*
* SegmentWithOvershadowedStatus's {@link #compareTo} method considers only the {@link SegmentId} of the DataSegment object.
Member

Line longer than 120 cols

Author

fixed


Stream<DataSegment> metadataSegments
)
{
final Set<SegmentId> overshadowedSegments = new HashSet<>();
Member

Please add a comment like the following:

It is fine to add all overshadowed segments to a single collection because only
a small fraction of the segments in the cluster are expected to be overshadowed,
so building this collection shouldn't generate a lot of garbage.

Author

sure, added the comment

@@ -73,8 +72,12 @@
private final BrokerSegmentWatcherConfig segmentWatcherConfig;

private final boolean isCacheEnabled;
/**
* Use {@link ConcurrentSkipListMap} so that the order of segments is deterministic and sys.segments queries return the segments in sorted order based on segmentId
Member

Line longer than 120 cols

Author

fixed

@@ -195,7 +206,7 @@ private void poll()
sb.append("datasources=").append(ds).append("&");
}
sb.setLength(sb.length() - 1);
query = "/druid/coordinator/v1/metadata/segments?" + sb;
query = "/druid/coordinator/v1/metadata/segments?includeOvershadowedStatus&" + sb;
Member

Is the & at the end intended?

Author

Yes, I think so. This will only be used if the watchedDataSources set is non-empty (let's say it contains a datasource named "dummy"); then the URL would look like /druid/coordinator/v1/metadata/segments?includeOvershadowedStatus&datasources=dummy, etc.


@gianm
Contributor

gianm commented Apr 26, 2019

Is this patch ready to merge? Anything I can help with?

@surekhasaharan
Author

@leventov any more comments, specifically here?

@@ -109,6 +109,15 @@ public static void addSegments(
);
}

public static Map<String, VersionedIntervalTimeline<String, DataSegment>> buildTimelines(Iterable<DataSegment> segments)
Contributor

This interface looks somewhat unintuitive because, as this method also implies, a VersionedIntervalTimeline is per dataSource. I think it would be better to first group segments by their dataSources and then call VersionedIntervalTimeline.forSegments per dataSource. What do you think?

Author

You're right, it should not live here; moved it to ImmutableDruidDataSource since that's the only place it's used from.
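A rough sketch of the grouping suggested above, assuming the VersionedIntervalTimeline.forSegments factory mentioned in the comment (exact signatures and the final location of this helper may differ):

// Group segments by datasource, then build one timeline per datasource.
final Map<String, List<DataSegment>> segmentsByDataSource = new HashMap<>();
for (DataSegment segment : segments) {
  segmentsByDataSource.computeIfAbsent(segment.getDataSource(), ds -> new ArrayList<>()).add(segment);
}
final Map<String, VersionedIntervalTimeline<String, DataSegment>> timelines = new HashMap<>();
segmentsByDataSource.forEach(
    (dataSource, dataSourceSegments) ->
        timelines.put(dataSource, VersionedIntervalTimeline.forSegments(dataSourceSegments))
);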

{
if (isCacheEnabled) {
Preconditions.checkState(
lifecycleLock.awaitStarted(1, TimeUnit.MILLISECONDS) && cachePopulated.get(),
"hold on, still syncing published segments"
);
return publishedSegments.keySet().iterator();
synchronized (lock) {
Member

This lock is not needed. If publishedSegments was a mutable collection, such lock wouldn't prevent a race: https://github.com/code-review-checklists/java-concurrency#unsafe-concurrent-iteration. But since it's immutable, you don't need a lock at all. You can just make publishedSegments field volatile if you wish.

Author

removed lock and made publishedSegments volatile

cachePopulated.set(true);
}

public Iterator<DataSegment> getPublishedSegments()
public Iterator<SegmentWithOvershadowedStatus> getPublishedSegments()
{
if (isCacheEnabled) {
Preconditions.checkState(
lifecycleLock.awaitStarted(1, TimeUnit.MILLISECONDS) && cachePopulated.get(),
Member

This type of "wait" which always throws IllegalStateException intimidates me. Can you replace cachePopulated with CountDownLatch(1), call cachePopulated.countDown() instead of cachePopulated.set(true) and call Uninterruptibles.awaitUninterruptibly(cachePopulated) here before accessing publishedSegments.iterator(), exception-free?

Author

replaced AtomicBoolean with CountDownLatch here, no exception on wait.

/**
* Use {@link ImmutableSortedSet} so that the order of segments is deterministic and
* sys.segments queries return the segments in sorted order based on segmentId
*/
@Nullable
Member

Please annotate @MonotonicNonNull instead of @Nullable

Author

ok, good to know about this annotation.

// so building this collection shouldn't generate a lot of garbage.
final Set<DataSegment> overshadowedSegments = new HashSet<>();
for (ImmutableDruidDataSource dataSource : druidDataSources) {
overshadowedSegments.addAll(ImmutableDruidDataSource.determineOvershadowedSegments(dataSource.getSegments()));
Member

What if there are 20 brokers querying this endpoint on Coordinator? They all recompute overshadowed status (which is expensive and memory-intensive, because requires to build a VersionedIntervalTimeline) again and again.

I suggest the following:

  • isOvershadowed becomes a non-final field of DataSegment object itself, not participating in equals() and hashCode().
  • Add interface SegmentsAccess { ImmutableDruidDataSource prepare(String dataSource); Iterable<DataSegment> iterateAll(); } (strawman naming)
  • Add DataSourceAccess computeOvershadowed() method to SQLSegmentMetadataManager, which performs this computation for every snapshot of SQLSegmentMetadataManager.dataSources (which is updated in poll()) at most once, lazily.
  • Both endpoints in MetadataResource and Coordination balancing logic (which currently computes isOvershadowed status on its own, too) use this API.
  • On the side of MetadataSegmentView, maintain something like a Map<DataSegment, DataSegment> and update overshadowed status like map.get(segmentFromCoordinator).setOvershadowed(segmentFromCoordinator.isOvershadowed()).

Result: we don't do any repetitive computations of overshadowed segments for every SQLSegmentMetadataManager.dataSegments snapshot whatsoever.

Author

Thanks for your suggestion, but this code change is not contained in this method and would affect other places in the code, some of which are not part of the original PR (e.g. the coordinator balancing logic). I would prefer to do this change separately, as it involves changing the DataSegment object and adding new interfaces, so there may be more follow-up discussions. Created #7571 to address this comment.

Member

@leventov Apr 30, 2019

this code change is not contained in this method and would affect other places in the code, some of which are not part of original PR( eg coordinator balancing logic

I don't see how the files touched in the original PR are special. If you had implemented the above suggestion from the beginning, those files would have been part of the "original" PR.

Touching relatively unrelated files is normal when you do refactoring; in fact, that's one of the objectives of refactoring: to gather functionality that happens to be scattered across unrelated places into a single place.

I would prefer to do this change separately as it suggests changing DataSegment object and adding new interfaces, so there may be more follow-up discussions.

I won't block this PR from merging if the other reviewers of this PR (@gianm @jihoonson @jon-wei) agree with that design on a high level (or propose another solution that solves the same problem) and it's implemented just after this PR, because the current design doesn't seem reasonable to me at this point. (So there won't be much difference from just doing the implementation right in this PR, but if you wish you can separate it into two PRs.)

Author

@leventov I am working on #7571. I agree the current API is not the most efficient, and I acknowledge your concern. While I am not yet sure of the most appropriate way to avoid recalculating overshadowed segments, I looked at the suggested changes and I have some questions, which I have asked in #7571. Could we agree to discuss the design there? I think it'll make it easier for you, me, and others to review those changes, as this PR is getting crowded and we may miss parts of the new changes when they get mixed up with the existing ones.

Contributor

@jon-wei May 1, 2019

I haven't really formed an opinion on DataSegment mutability presently, but I think @leventov's suggestion for lazily computing the overshadowed view at most once per SQLSegmentMetadataManager poll() and sharing that view with the metadata retrieval APIs and the coordinator balancing logic makes a lot of sense.

Because the current design doesn't seem reasonable to me at this point. (So there won't be much difference from as if you just do the implementation right in this PR, but if you wish you can separate in two PRs.)

I agree with making the adjustment to the overshadowed view computation as an immediate follow-on; I think a separate PR is a bit better:

  • The coordinator balancing logic is a pretty "core" part of the system, and I feel like it would be better to change that in a separate PR that calls attention more explicitly to that/isolates that change more
  • This PR is getting a bit long, a little tedious to navigate

Member

Thanks

@leventov merged commit 15d19f3 into apache:master May 1, 2019
@jihoonson added this to the 0.15.0 milestone May 16, 2019