Skip to content
This repository has been archived by the owner on Aug 13, 2019. It is now read-only.

Ensure postings are always sorted #150

Merged
merged 1 commit into from
Sep 21, 2017
Merged

Ensure postings are always sorted #150

merged 1 commit into from
Sep 21, 2017

Conversation

fabxc
Copy link
Contributor

@fabxc fabxc commented Sep 20, 2017

@@ -1038,8 +1038,6 @@ func (s *stripeSeries) getOrSet(hash uint64, series *memSeries) (*memSeries, boo
return prev, false
}
s.hashes[i].set(hash, series)

s.hashes[i][hash] = append(s.hashes[i][hash], series)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was a leftover and we were actually storing series twice – without any negative effects aside from space consumption.

// be generated independently before adding them to postings.
// We repair order violations on insert. The invariant is that the first n-1
// items in the list are already sorted.
for i := len(list) - 1; i >= 1; i-- {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is still a race here, but I might be wrong. Also this case is a long shot, but hear me out :)

Let the existing list be: [1, 2, 3]
Now we insert: 4, 5, 6, 7 concurrently in this way:
[1, 2, 3, 6, 4, 5, 7].

Now, if the check for 5 runs before check for 4, which means in the end, we end up with: [1, 2, 3, 4, 6, 5, 7]

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, if we have [1, 2, 3] and insert 6 and then 4 and then 5.
We get at one point:

list := [1, 2, 3, 6, 4, 5]
p.m[l] = list

Even the swapping for 6, 4 that happens is not visible to the new list as it is a copy, AFAICS.

Copy link
Contributor Author

@fabxc fabxc Sep 20, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The insertion itself is guarded by mutex, so there is no concurrency at play here.
Before each insert all elements so far are sorted. Just the last one is potentially out of order.
So the cases you describe are not possible due to this invariant.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aah, got it. LGTM!

@gouthamve
Copy link
Collaborator

👍

@mattbostock
Copy link
Contributor

What's the impact if they're not sorted? It might be good to include an explanation in the commit message.

@fabxc
Copy link
Contributor Author

fabxc commented Sep 21, 2017

Done.

@fabxc fabxc merged commit 5fa1c99 into master Sep 21, 2017
@fabxc fabxc deleted the postingssort branch September 21, 2017 07:37
IDs for new series are handed out before the postings are locked. Thus
series are not indexed in order of their IDs, which could result in only
partially sorted postings list.
Iterating over those silently skipped elements as the sort invariant was
violated.
@mattbostock
Copy link
Contributor

Thanks @fabxc.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants