Reconcile terminology and method naming to 'used/unused segments'; Rename MetadataSegmentManager to MetadataSegmentsManager #7306

Merged Jan 27, 2020 (67 commits). Changes shown below are from 51 of the 67 commits.

Commits
87556bc
Reconcile terminology and method naming to 'used/unused segments'; Do…
leventov Mar 19, 2019
e44e199
Merge remote-tracking branch 'upstream/master' into used-segments
leventov Mar 20, 2019
25de70d
Fix brace
leventov Mar 20, 2019
fe13d9c
Import order
leventov Mar 21, 2019
050bd3b
Rename withKillDataSourceWhitelist to withSpecificDataSourcesToKill
leventov Mar 21, 2019
562f5b4
Fix tests
leventov Mar 21, 2019
a6138da
Fix tests by adding proper methods without interval parameters to Ind…
leventov Mar 25, 2019
cb6fe7f
More aligned names of DruidCoordinatorHelpers, rename several Coordin…
leventov Mar 28, 2019
164f783
Merge remote-tracking branch 'upstream/master' into used-segments
leventov Mar 28, 2019
d959bb0
Rename ClientCompactTaskQuery to ClientCompactionTaskQuery for consis…
leventov Mar 28, 2019
01c4494
More variable and method renames
leventov Mar 28, 2019
06ec389
Rename MetadataSegments to SegmentsMetadata
leventov Apr 1, 2019
09fab95
Merge remote-tracking branch 'upstream/master' into used-segments
leventov Apr 1, 2019
f4774e1
Javadoc update
leventov Apr 1, 2019
8d5200d
Simplify SegmentsMetadata.getUnusedSegmentIntervals(), more javadocs
leventov Apr 1, 2019
255f320
Merge remote-tracking branch 'upstream/master' into used-segments
leventov Apr 9, 2019
f34eef0
Update Javadoc of VersionedIntervalTimeline.iterateAllObjects()
leventov Apr 9, 2019
2ebb23c
Reorder imports
leventov Apr 9, 2019
ba8ed62
Rename SegmentsMetadata.tryMark... methods to mark... and make them t…
leventov Apr 19, 2019
64b0404
Merge remote-tracking branch 'upstream/master' into used-segments
leventov Apr 19, 2019
231a529
Complete merge
leventov Apr 19, 2019
a113231
Add CollectionUtils.newTreeSet(); Refactor DruidCoordinatorRuntimePar…
leventov Apr 22, 2019
abad3f3
Remove MetadataSegmentManager
leventov Apr 22, 2019
cb1f3b0
Rename millisLagSinceCoordinatorBecomesLeaderBeforeCanMarkAsUnusedOve…
leventov Apr 22, 2019
a5840b8
Fix tests, refactor DruidCluster creation in tests into DruidClusterB…
leventov Apr 26, 2019
f4283d2
Merge with master
leventov May 3, 2019
559f2c8
Merge remote-tracking branch 'upstream/master' into used-segments
leventov May 8, 2019
93c12cd
Merge remote-tracking branch 'upstream/master' into used-segments
leventov May 9, 2019
dfc0dbb
Merge remote-tracking branch 'upstream/master' into used-segments
leventov Jul 19, 2019
553b804
Merge remote-tracking branch 'upstream/master' into used-segments
leventov Jul 22, 2019
886e580
Fix inspections
leventov Jul 22, 2019
77fc5e8
Fix SQLMetadataSegmentManagerEmptyTest and rename it to SqlSegmentsMe…
leventov Jul 22, 2019
1258fa0
Rename SegmentsAndMetadata to SegmentsAndCommitMetadata to reduce the…
leventov Jul 23, 2019
15cb9ae
Merge remote-tracking branch 'upstream/master' into used-segments
leventov Jul 24, 2019
c2c7547
Rename DruidCoordinatorHelper to CoordinatorDuty, refactor DruidCoord…
leventov Jul 25, 2019
9cd239e
Merge remote-tracking branch 'upstream/master' into used-segments
leventov Jul 26, 2019
a723553
Unused import
leventov Jul 26, 2019
26a8381
Optimize imports
leventov Jul 29, 2019
afcd301
Merge remote-tracking branch 'upstream/master' into used-segments
leventov Jul 29, 2019
1557acc
Rename IndexerSQLMetadataStorageCoordinator.getDataSourceMetadata() t…
leventov Jul 29, 2019
96bcab7
Unused import
leventov Jul 29, 2019
d934853
Merge remote-tracking branch 'upstream/master' into used-segments
leventov Jul 31, 2019
c3e488e
Update terminology in datasource-view.tsx
leventov Jul 31, 2019
f65e654
Fix label in datasource-view.spec.tsx.snap
leventov Jul 31, 2019
5e8f3e4
Fix lint errors in datasource-view.tsx
leventov Aug 1, 2019
697f0d5
Doc improvements
leventov Aug 1, 2019
66a23f9
Another attempt to please TSLint
leventov Aug 2, 2019
d3882c4
Another attempt to please TSLint
leventov Aug 2, 2019
4b9d992
Merge remote-tracking branch 'upstream/master' into used-segments
leventov Aug 2, 2019
b77a327
Merge remote-tracking branch 'upstream/master' into used-segments
leventov Aug 8, 2019
d355aa6
Merge remote-tracking branch 'upstream/master' into used-segments
leventov Aug 12, 2019
718bb23
Merge remote-tracking branch 'upstream/master' into used-segments
leventov Aug 28, 2019
19b7f3a
Merge remote-tracking branch 'upstream/master' into used-segments
leventov Aug 28, 2019
2a6bcff
Merge remote-tracking branch 'upstream/master' into used-segments
leventov Sep 11, 2019
22e71d1
Merge remote-tracking branch 'upstream/master' into used-segments
leventov Nov 14, 2019
c575e6e
Merge remote-tracking branch 'upstream/master' into used-segments
leventov Nov 14, 2019
c5d22a0
Style fixes
leventov Nov 14, 2019
285ebc9
Fix IndexerSQLMetadataStorageCoordinator.createUsedSegmentsSqlQueryFo…
leventov Nov 14, 2019
53f572a
Try to fix docs build issue
leventov Nov 14, 2019
da7667c
Javadoc and spelling fixes
leventov Nov 15, 2019
91a5c8c
Merge remote-tracking branch 'upstream/master' into used-segments
leventov Nov 20, 2019
a566b6d
Merge remote-tracking branch 'upstream/master' into used-segments
leventov Nov 21, 2019
6da0b9d
Merge remote-tracking branch 'upstream/master' into used-segments
leventov Dec 8, 2019
d0bf20a
Rename SegmentsMetadata to SegmentsMetadataManager, address other com…
leventov Jan 22, 2020
10587c1
Merge remote-tracking branch 'upstream/master' into used-segments
leventov Jan 22, 2020
2ad6dd5
Address more comments
leventov Jan 23, 2020
6745f4a
Fix a bug in DataSourceOptimizer
leventov Jan 26, 2020
13 changes: 11 additions & 2 deletions .idea/inspectionProfiles/Druid.xml

(The diff for this generated file is not rendered by default.)

@@ -23,9 +23,9 @@
import com.google.common.collect.ImmutableMap;
import org.apache.druid.client.DataSourcesSnapshot;
import org.apache.druid.java.util.common.DateTimes;
import org.apache.druid.server.coordinator.helper.CompactionSegmentIterator;
import org.apache.druid.server.coordinator.helper.CompactionSegmentSearchPolicy;
import org.apache.druid.server.coordinator.helper.NewestSegmentFirstPolicy;
import org.apache.druid.server.coordinator.duty.CompactionSegmentIterator;
import org.apache.druid.server.coordinator.duty.CompactionSegmentSearchPolicy;
import org.apache.druid.server.coordinator.duty.NewestSegmentFirstPolicy;
import org.apache.druid.timeline.DataSegment;
import org.apache.druid.timeline.VersionedIntervalTimeline;
import org.apache.druid.timeline.partition.NumberedShardSpec;
@@ -22,7 +22,6 @@
import org.apache.druid.java.util.common.logger.Logger;

import javax.annotation.Nullable;

import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.lang.reflect.Method;
@@ -19,6 +19,7 @@

package org.apache.druid.metadata;

import javax.annotation.Nullable;
import java.util.List;

/**
@@ -36,7 +37,7 @@ Void insertOrUpdate(
byte[] value
);

byte[] lookup(
@Nullable byte[] lookup(
Contributor:

Would you add a Javadoc saying when it can be null? Perhaps it would be:

  /**
   * Returns the value of the valueColumn when there is only one row matched to the given key.
   * This method returns null if there is no such row and throws an error if there is more than one row.
   */

Member Author:

Thanks for the suggestion, applied.

String tableName,
String keyColumn,
String valueColumn,
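For reference, here is a minimal caller-side sketch of the contract suggested in the thread above: lookup() may return null when no row matches. It is not code from this PR; the interface name (MetadataStorageConnector) and the trailing key parameter are inferred from the surrounding file rather than shown in this excerpt, and the table/column/key values are hypothetical.

import javax.annotation.Nullable;

import org.apache.druid.metadata.MetadataStorageConnector;

class ConfigLookupSketch
{
  /**
   * Reads a raw config payload, assuming the suggested contract: null means "no matching row",
   * a single matching row returns its value, and more than one matching row is an error.
   */
  @Nullable
  static byte[] readPayload(MetadataStorageConnector connector)
  {
    // Hypothetical table/column/key names; the final "key" argument is not visible in this hunk.
    final byte[] payload = connector.lookup("druid_config", "name", "payload", "coordinator.config");
    if (payload == null) {
      // No stored value yet, so the caller handles null instead of dereferencing the result.
      return null;
    }
    return payload;
  }
}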
@@ -21,6 +21,7 @@

import com.fasterxml.jackson.annotation.JsonProperty;
import org.apache.druid.java.util.common.StringUtils;

import java.util.Properties;

/**
@@ -37,7 +37,6 @@
import java.util.Collection;
import java.util.Comparator;
import java.util.HashMap;
import java.util.HashSet;
import java.util.IdentityHashMap;
import java.util.Iterator;
import java.util.List;
@@ -49,28 +48,44 @@
import java.util.TreeMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantReadWriteLock;
import java.util.stream.Collectors;
import java.util.stream.StreamSupport;

/**
* VersionedIntervalTimeline is a data structure that manages objects on a specific timeline.
*
* It associates a jodatime Interval and a generically-typed version with the object that is being stored.
* It associates an {@link Interval} and a generically-typed version with the object that is being stored.
*
* In the event of overlapping timeline entries, timeline intervals may be chunked. The underlying data associated
* with a timeline entry remains unchanged when chunking occurs.
*
* After loading objects via the add() method, the lookup(Interval) method can be used to get the list of the most
* recent objects (according to the version) that match the given interval. The intent is that objects represent
* a certain time period and when you do a lookup(), you are asking for all of the objects that you need to look
* at in order to get a correct answer about that time period.
* After loading objects via the {@link #add} method, the {@link #lookup(Interval)} method can be used to get the list
* of the most recent objects (according to the version) that match the given interval. The intent is that objects
* represent a certain time period and when you do a {@link #lookup(Interval)}, you are asking for all of the objects
* that you need to look at in order to get a correct answer about that time period.
*
* The findFullyOvershadowed() method returns a list of objects that will never be returned by a call to lookup() because
* they are overshadowed by some other object. This can be used in conjunction with the add() and remove() methods
* to achieve "atomic" updates. First add new items, then check if those items caused anything to be overshadowed, if
* so, remove the overshadowed elements and you have effectively updated your data set without any user impact.
* The {@link #findFullyOvershadowed} method returns a list of objects that will never be returned by a call to {@link
* #lookup} because they are overshadowed by some other object. This can be used in conjunction with the {@link #add}
* and {@link #remove} methods to achieve "atomic" updates. First add new items, then check if those items caused
* anything to be overshadowed, if so, remove the overshadowed elements and you have effectively updated your data set
* without any user impact.
*/
public class VersionedIntervalTimeline<VersionType, ObjectType extends Overshadowable<ObjectType>> implements TimelineLookup<VersionType, ObjectType>
public class VersionedIntervalTimeline<VersionType, ObjectType extends Overshadowable<ObjectType>>
implements TimelineLookup<VersionType, ObjectType>
{
public static VersionedIntervalTimeline<String, DataSegment> forSegments(Iterable<DataSegment> segments)
{
return forSegments(segments.iterator());
}

public static VersionedIntervalTimeline<String, DataSegment> forSegments(Iterator<DataSegment> segments)
{
final VersionedIntervalTimeline<String, DataSegment> timeline =
new VersionedIntervalTimeline<>(Comparator.naturalOrder());
addSegments(timeline, segments);
return timeline;
}

private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock(true);

// Below timelines stores only *visible* timelineEntries
@@ -95,19 +110,6 @@ public VersionedIntervalTimeline(Comparator<? super VersionType> versionComparat
this.versionComparator = versionComparator;
}

public static VersionedIntervalTimeline<String, DataSegment> forSegments(Iterable<DataSegment> segments)
{
return forSegments(segments.iterator());
}

public static VersionedIntervalTimeline<String, DataSegment> forSegments(Iterator<DataSegment> segments)
{
final VersionedIntervalTimeline<String, DataSegment> timeline =
new VersionedIntervalTimeline<>(Comparator.naturalOrder());
addSegments(timeline, segments);
return timeline;
}

public static void addSegments(
VersionedIntervalTimeline<String, DataSegment> timeline,
Iterator<DataSegment> segments
@@ -147,6 +149,11 @@ public Collection<ObjectType> iterateAllObjects()
);
egor-ryashin marked this conversation as resolved.
}

public int getNumObjects()
{
return numObjects.get();
}

public void add(final Interval interval, VersionType version, PartitionChunk<ObjectType> object)
{
addAll(Iterators.singletonIterator(object), o -> interval, o -> version);
@@ -354,62 +361,69 @@ public Set<TimelineObjectHolder<VersionType, ObjectType>> findFullyOvershadowed(
lock.readLock().lock();
try {
// 1. Put all timelineEntries and remove all visible entries to find out only non-visible timelineEntries.
final Map<Interval, Map<VersionType, TimelineEntry>> overShadowed = new HashMap<>();
for (Map.Entry<Interval, TreeMap<VersionType, TimelineEntry>> versionEntry : allTimelineEntries.entrySet()) {
@SuppressWarnings("unchecked")
Map<VersionType, TimelineEntry> versionCopy = (TreeMap) versionEntry.getValue().clone();
overShadowed.put(versionEntry.getKey(), versionCopy);
}

for (Entry<Interval, TimelineEntry> entry : completePartitionsTimeline.entrySet()) {
Map<VersionType, TimelineEntry> versionEntry = overShadowed.get(entry.getValue().getTrueInterval());
if (versionEntry != null) {
versionEntry.remove(entry.getValue().getVersion());
if (versionEntry.isEmpty()) {
overShadowed.remove(entry.getValue().getTrueInterval());
}
}
}

for (Entry<Interval, TimelineEntry> entry : incompletePartitionsTimeline.entrySet()) {
Map<VersionType, TimelineEntry> versionEntry = overShadowed.get(entry.getValue().getTrueInterval());
if (versionEntry != null) {
versionEntry.remove(entry.getValue().getVersion());
if (versionEntry.isEmpty()) {
overShadowed.remove(entry.getValue().getTrueInterval());
}
}
}

final Set<TimelineObjectHolder<VersionType, ObjectType>> retVal = new HashSet<>();
for (Entry<Interval, Map<VersionType, TimelineEntry>> versionEntry : overShadowed.entrySet()) {
for (Entry<VersionType, TimelineEntry> entry : versionEntry.getValue().entrySet()) {
final TimelineEntry timelineEntry = entry.getValue();
retVal.add(timelineEntryToObjectHolder(timelineEntry));
}
}
final Map<Interval, Map<VersionType, TimelineEntry>> overshadowedPartitionsTimeline =
computeOvershadowedPartitionsTimeline();

final Set<TimelineObjectHolder<VersionType, ObjectType>> overshadowedObjects = overshadowedPartitionsTimeline
.values()
.stream()
.flatMap(
(Map<VersionType, TimelineEntry> entry) -> entry.values().stream().map(this::timelineEntryToObjectHolder)
)
.collect(Collectors.toSet());

// 2. Visible timelineEntries can also have overshadowed segments. Add them to the result too.
// 2. Visible timelineEntries can also have overshadowed objects. Add them to the result too.
for (TimelineEntry entry : incompletePartitionsTimeline.values()) {
final List<PartitionChunk<ObjectType>> entryOvershadowed = entry.partitionHolder.getOvershadowed();
if (!entryOvershadowed.isEmpty()) {
retVal.add(
final List<PartitionChunk<ObjectType>> overshadowedEntries = entry.partitionHolder.getOvershadowed();
if (!overshadowedEntries.isEmpty()) {
overshadowedObjects.add(
new TimelineObjectHolder<>(
entry.trueInterval,
entry.version,
new PartitionHolder<>(entryOvershadowed)
new PartitionHolder<>(overshadowedEntries)
)
);
}
}

return retVal;
return overshadowedObjects;
}
finally {
lock.readLock().unlock();
}
}

private Map<Interval, Map<VersionType, TimelineEntry>> computeOvershadowedPartitionsTimeline()
{
final Map<Interval, Map<VersionType, TimelineEntry>> overshadowedPartitionsTimeline = new HashMap<>();
allTimelineEntries.forEach((Interval interval, TreeMap<VersionType, TimelineEntry> versionEntry) -> {
@SuppressWarnings("unchecked")
Map<VersionType, TimelineEntry> versionEntryCopy = (TreeMap) versionEntry.clone();
overshadowedPartitionsTimeline.put(interval, versionEntryCopy);
});

for (TimelineEntry entry : completePartitionsTimeline.values()) {
overshadowedPartitionsTimeline.computeIfPresent(
entry.getTrueInterval(),
(Interval interval, Map<VersionType, TimelineEntry> versionEntry) -> {
versionEntry.remove(entry.getVersion());
return versionEntry.isEmpty() ? null : versionEntry;
}
);
}

for (TimelineEntry entry : incompletePartitionsTimeline.values()) {
overshadowedPartitionsTimeline.computeIfPresent(
entry.getTrueInterval(),
(Interval interval, Map<VersionType, TimelineEntry> versionEntry) -> {
versionEntry.remove(entry.getVersion());
return versionEntry.isEmpty() ? null : versionEntry;
}
);
}
return overshadowedPartitionsTimeline;
}

public boolean isOvershadowed(Interval interval, VersionType version, ObjectType object)
{
lock.readLock().lock();
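The updated Javadoc above describes the add()/lookup()/findFullyOvershadowed() workflow in prose; the sketch below shows the "atomic update" pattern it refers to, using the forSegments() and addSegments() helpers visible in this diff. It is an illustration rather than code from this PR, and the way overshadowed chunks are iterated (via TimelineObjectHolder.getObject()) is an assumption about the surrounding API.

import java.util.List;
import java.util.Set;

import org.apache.druid.timeline.DataSegment;
import org.apache.druid.timeline.TimelineObjectHolder;
import org.apache.druid.timeline.VersionedIntervalTimeline;
import org.apache.druid.timeline.partition.PartitionChunk;

class AtomicTimelineUpdateSketch
{
  /**
   * Adds newer segments and then drops whatever they fully overshadow, so that lookup()
   * never observes a partially updated view (the "atomic" update described in the Javadoc).
   */
  static void addAndPrune(List<DataSegment> currentSegments, List<DataSegment> newSegments)
  {
    // Build a timeline over the currently known segments (factory method shown in this diff).
    final VersionedIntervalTimeline<String, DataSegment> timeline =
        VersionedIntervalTimeline.forSegments(currentSegments);

    // 1. Add the new segments; lookup() keeps answering with the most recent versions.
    VersionedIntervalTimeline.addSegments(timeline, newSegments.iterator());

    // 2. Anything that is now fully overshadowed will never be returned by lookup(),
    //    so it can be removed without any user-visible impact.
    final Set<TimelineObjectHolder<String, DataSegment>> overshadowed = timeline.findFullyOvershadowed();
    for (TimelineObjectHolder<String, DataSegment> holder : overshadowed) {
      // Assumption: the holder exposes its chunks via an iterable PartitionHolder.
      for (PartitionChunk<DataSegment> chunk : holder.getObject()) {
        timeline.remove(holder.getInterval(), holder.getVersion(), chunk);
      }
    }
  }
}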
@@ -88,22 +88,22 @@ enum State

// (start partitionId, end partitionId) -> minorVersion -> atomicUpdateGroup
private final TreeMap<RootPartitionRange, Short2ObjectSortedMap<AtomicUpdateGroup<T>>> standbyGroups;
private final TreeMap<RootPartitionRange, Short2ObjectSortedMap<AtomicUpdateGroup<T>>> visibleGroup;
private final TreeMap<RootPartitionRange, Short2ObjectSortedMap<AtomicUpdateGroup<T>>> visibleGroups;
Contributor:

Hmm, I intentionally used the singular form here to say that there is only one visibleGroup in any case. Perhaps the code should be refactored to be clearer.

Member Author (@leventov, Jan 22, 2020):

I renamed it to visibleGroupPerRange; does that make more sense? I also added a comment as per #8788.

Contributor:

That sounds good to me. Thanks for adding the comment.

private final TreeMap<RootPartitionRange, Short2ObjectSortedMap<AtomicUpdateGroup<T>>> overshadowedGroups;

OvershadowableManager()
{
this.knownPartitionChunks = new HashMap<>();
this.standbyGroups = new TreeMap<>();
this.visibleGroup = new TreeMap<>();
this.visibleGroups = new TreeMap<>();
this.overshadowedGroups = new TreeMap<>();
}

OvershadowableManager(OvershadowableManager<T> other)
{
this.knownPartitionChunks = new HashMap<>(other.knownPartitionChunks);
this.standbyGroups = new TreeMap<>(other.standbyGroups);
this.visibleGroup = new TreeMap<>(other.visibleGroup);
this.visibleGroups = new TreeMap<>(other.visibleGroups);
this.overshadowedGroups = new TreeMap<>(other.overshadowedGroups);
}
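For readers less familiar with OvershadowableManager: the field comment above ("(start partitionId, end partitionId) -> minorVersion -> atomicUpdateGroup") and the visibleGroup/visibleGroupPerRange thread describe a two-level sorted map that holds at most one visible atomic-update group per root partition range. The sketch below mirrors only that shape using plain JDK types; RootPartitionRange and AtomicUpdateGroup are internal to the real class, so every name here is illustrative, not Druid API.

import java.util.List;
import java.util.TreeMap;

/**
 * Illustration only: mirrors the map shape described by the comment
 * "(start partitionId, end partitionId) -> minorVersion -> atomicUpdateGroup"
 * with plain JDK types.
 */
class GroupsByStateSketch<T>
{
  static class PartitionRange implements Comparable<PartitionRange>
  {
    final int startPartitionId;
    final int endPartitionId;

    PartitionRange(int startPartitionId, int endPartitionId)
    {
      this.startPartitionId = startPartitionId;
      this.endPartitionId = endPartitionId;
    }

    @Override
    public int compareTo(PartitionRange other)
    {
      int cmp = Integer.compare(startPartitionId, other.startPartitionId);
      return cmp != 0 ? cmp : Integer.compare(endPartitionId, other.endPartitionId);
    }
  }

  // range -> minorVersion -> group of chunks that must be updated atomically
  private final TreeMap<PartitionRange, TreeMap<Short, List<T>>> standbyGroups = new TreeMap<>();
  private final TreeMap<PartitionRange, TreeMap<Short, List<T>>> visibleGroupPerRange = new TreeMap<>();
  private final TreeMap<PartitionRange, TreeMap<Short, List<T>>> overshadowedGroups = new TreeMap<>();

  /** The invariant from the thread above: at most one visible group per range at any time. */
  void markVisible(PartitionRange range, short minorVersion, List<T> group)
  {
    final TreeMap<Short, List<T>> single = new TreeMap<>();
    single.put(minorVersion, group);
    // The real class also transitions the previously visible group into another state map;
    // that bookkeeping is omitted in this sketch.
    visibleGroupPerRange.put(range, single);
  }
}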

@@ -123,7 +123,7 @@ private TreeMap<RootPartitionRange, Short2ObjectSortedMap<AtomicUpdateGroup<T>>>
case STANDBY:
return standbyGroups;
case VISIBLE:
return visibleGroup;
return visibleGroups;
case OVERSHADOWED:
return overshadowedGroups;
default:
@@ -669,7 +669,7 @@ boolean addChunk(PartitionChunk<T> chunk)
final AtomicUpdateGroup<T> newAtomicUpdateGroup = new AtomicUpdateGroup<>(chunk);

// Decide the initial state of the new atomicUpdateGroup
final boolean overshadowed = visibleGroup
final boolean overshadowed = visibleGroups
.values()
.stream()
.flatMap(map -> map.values().stream())
@@ -785,7 +785,7 @@ private List<AtomicUpdateGroup<T>> findLatestNonFullyAvailableAtomicUpdateGroups
}

final List<AtomicUpdateGroup<T>> visibles = new ArrayList<>();
for (Short2ObjectSortedMap<AtomicUpdateGroup<T>> map : manager.visibleGroup.values()) {
for (Short2ObjectSortedMap<AtomicUpdateGroup<T>> map : manager.visibleGroups.values()) {
visibles.addAll(map.values());
}
return visibles;
@@ -807,7 +807,7 @@ private List<AtomicUpdateGroup<T>> findLatestFullyAvailableOvershadowedAtomicUpd

final OvershadowableManager<T> manager = new OvershadowableManager<>(overshadowedGroups);
final List<AtomicUpdateGroup<T>> visibles = new ArrayList<>();
for (Short2ObjectSortedMap<AtomicUpdateGroup<T>> map : manager.visibleGroup.values()) {
for (Short2ObjectSortedMap<AtomicUpdateGroup<T>> map : manager.visibleGroups.values()) {
for (AtomicUpdateGroup<T> atomicUpdateGroup : map.values()) {
if (!atomicUpdateGroup.isFull()) {
return Collections.emptyList();
@@ -885,12 +885,12 @@ PartitionChunk<T> removeChunk(PartitionChunk<T> partitionChunk)

public boolean isEmpty()
{
return visibleGroup.isEmpty();
return visibleGroups.isEmpty();
}

public boolean isComplete()
{
return visibleGroup.values().stream().allMatch(map -> Iterables.getOnlyElement(map.values()).isFull());
return visibleGroups.values().stream().allMatch(map -> Iterables.getOnlyElement(map.values()).isFull());
}

@Nullable
Expand All @@ -915,7 +915,7 @@ PartitionChunk<T> getChunk(int partitionId)

List<PartitionChunk<T>> getVisibleChunks()
{
return getAllChunks(visibleGroup);
return getAllChunks(visibleGroups);
}

List<PartitionChunk<T>> getOvershadowedChunks()
Expand Down Expand Up @@ -954,14 +954,14 @@ public boolean equals(Object o)
OvershadowableManager<?> that = (OvershadowableManager<?>) o;
return Objects.equals(knownPartitionChunks, that.knownPartitionChunks) &&
Objects.equals(standbyGroups, that.standbyGroups) &&
Objects.equals(visibleGroup, that.visibleGroup) &&
Objects.equals(visibleGroups, that.visibleGroups) &&
Objects.equals(overshadowedGroups, that.overshadowedGroups);
}

@Override
public int hashCode()
{
return Objects.hash(knownPartitionChunks, standbyGroups, visibleGroup, overshadowedGroups);
return Objects.hash(knownPartitionChunks, standbyGroups, visibleGroups, overshadowedGroups);
}

@Override
Expand All @@ -970,7 +970,7 @@ public String toString()
return "OvershadowableManager{" +
"knownPartitionChunks=" + knownPartitionChunks +
", standbyGroups=" + standbyGroups +
", visibleGroup=" + visibleGroup +
", visibleGroups=" + visibleGroups +
", overshadowedGroups=" + overshadowedGroups +
'}';
}