Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC: new style pgls #4919

Closed
wants to merge 66 commits into from
Closed

RFC: new style pgls #4919

wants to merge 66 commits into from

Conversation

jcsp
Copy link
Contributor

@jcsp jcsp commented Jun 10, 2015

No description provided.

@jcsp
Copy link
Contributor Author

jcsp commented Jun 10, 2015

@liewegas @athanatos see what you think of this so far?

The tests don't actually pass; I'm finding that the filestore is not actually returning items in the expected order (i.e. sorted by bit-reversed hash) -- assuming that there's something in its implementation causing that (haven't looked).

The math in ReplicatedPG::do_pg_op to calculate the hash offset of the start of the next PG is a bit hacky (takes account of non-power-of-two pg_num though), there is probably a simpler way of expressing that.

@liewegas
Copy link
Member

On Wed, 10 Jun 2015, John Spray wrote:

@liewegas @athanatos see what you think of this so far?

The tests don't actually pass; I'm finding that the filestore is not actually returning items in the expected order (i.e. sorted by bit-reversed
hash) -- assuming that there's something in its implementation causing that (haven't looked).

oh.. yeah, there is. the DIR_$hexdigit dirs are sorted lexicographically
now instead of in reverse bit order. my branch fixing all this up is
still half baked, sorry. focusing on getting wip-temp merged first!

@liewegas
Copy link
Member

liewegas commented Aug 5, 2015

wip-newstore-sort is the branch to rebase this onto

@liewegas liewegas added this to the infernalis milestone Aug 12, 2015
Signed-off-by: Sage Weil <sage@redhat.com>
Signed-off-by: Sage Weil <sage@redhat.com>
We'll need this later when we sort the subdirs bitwise instead of
nibblewise.

Signed-off-by: Sage Weil <sage@redhat.com>
Signed-off-by: Sage Weil <sage@redhat.com>
Note that this breaks compatibility for the moment.

Signed-off-by: Sage Weil <sage@redhat.com>
We will sort [g]hobject_t's bitwise (instead of nibblewise) if all
OSDs who have participated in peering support the feature.

Note that this means a latecomer PG who does a notify may not have
the feature while the acting set operates with the new sort order.
We will need to be careful about last_backfill in this case.. that patch
it coming.

Signed-off-by: Sage Weil <sage@redhat.com>
Note that this doesn't build, but it will be easier to review this way.

Signed-off-by: Sage Weil <sage@redhat.com>
Signed-off-by: Sage Weil <sage@redhat.com>
Signed-off-by: Sage Weil <sage@redhat.com>
Signed-off-by: Sage Weil <sage@redhat.com>
Signed-off-by: Sage Weil <sage@redhat.com>
Signed-off-by: Sage Weil <sage@redhat.com>
Signed-off-by: Sage Weil <sage@redhat.com>
Signed-off-by: Sage Weil <sage@redhat.com>
Signed-off-by: Sage Weil <sage@redhat.com>
Signed-off-by: Sage Weil <sage@redhat.com>
We need to separate the bitwise and nibblewise implementations because
they diverge too much.  Start by converting the old code to a working
nibblewise sort (compensating for the new ghobject_t sort and new args).

Signed-off-by: Sage Weil <sage@redhat.com>
Handle listing by hash with bitwise sorting.  This involves futzing with
the hex digit sort order by reversing the digit nibbles and so on.

Signed-off-by: Sage Weil <sage@redhat.com>
In particular, make SimpleListTest check both sorting methods and asssert
that the result is sorted.

Signed-off-by: Sage Weil <sage@redhat.com>
Signed-off-by: Sage Weil <sage@redhat.com>
We need to know what sort order last_backfill is based on.

Signed-off-by: Sage Weil <sage@redhat.com>
liewegas and others added 21 commits August 14, 2015 15:48
It's doesn't matter, since it's a deterministic permutation of hte value
and we're checking for equality, but avoid the filestore-specific
nibblewise key to avoid confusion.

Signed-off-by: Sage Weil <sage@redhat.com>
Signed-off-by: Sage Weil <sage@redhat.com>
…comparator

Signed-off-by: Sage Weil <sage@redhat.com>
We initialize all BackfillIntervals when backfill starts.  At that point we
set the Comparator order via reset().

Signed-off-by: Sage Weil <sage@redhat.com>
Reset the map and set so that the comparator matches the sort order at the
start of each backfill.

Signed-off-by: Sage Weil <sage@redhat.com>
We need to take care not to clear() any of the maps or sets or else we'll lose
the comparator.

Signed-off-by: Sage Weil <sage@redhat.com>
Signed-off-by: Sage Weil <sage@redhat.com>
…kfill

Signed-off-by: Sage Weil <sage@redhat.com>
Some work we need to do on any new interval, *including* when the PG is
first created and there is no "interval change".  Factor out the (mostly
new) bits of start_peering_interval() that also need to happen on the
initial interval after the PG is first instantiated into on_new_interval().

Signed-off-by: Sage Weil <sage@redhat.com>
Signed-off-by: Sage Weil <sage@redhat.com>
We need to force a backfill restart even if last_update is up to date.

Signed-off-by: Sage Weil <sage@redhat.com>
This shouldn't normally happen, except that another byzantine path put us
in backfill even though we had a log from 0'0.  If we get back here (for
any other reason), behave by scanning the local objects and backfilling
them.

Signed-off-by: Sage Weil <sage@redhat.com>
If the log is empty, last_backfill != max is meaningless, and will confuse
peers later one.

Signed-off-by: Sage Weil <sage@redhat.com>
…s complete

If we have a full log (from 0'0) then we shouldn't backfill at all--don't
force_restart_backfill in that case!

Signed-off-by: Sage Weil <sage@redhat.com>
Allow the use of a comparator that has a value, and for that
comparator to be adjusted at runtime for the purposes for get_next
sort order.  Since get_next uses the weak_refs map, that is the
only map that needs to be rebuilt.

We assume, for simplicity, that the comparator can be constructed
with no arguments with some default... otherwise this is messier.

Signed-off-by: Sage Weil <sage@redhat.com>
Scrub uses the object_contexts sort order to check whether it is
safe to extend the scrub interval (verifying there aren't ops in
flight).  For this to work we need to use the correct sort order
for the object_context map (as seen by SharedLRU<>::get_next()).

Signed-off-by: Sage Weil <sage@redhat.com>
Calculate which PG comes next in the bit-reversed
hash ordering, and set response.handle to the lowest
hash in that PG's range.

We do this OSD side rather than client side so that
the client doesn't have to wait for the OSD's map
epoch in order to correctly calculate the next hash
in the case of PG splitting.

Signed-off-by: John Spray <john.spray@redhat.com>
Signed-off-by: John Spray <john.spray@redhat.com>
Signed-off-by: John Spray <john.spray@redhat.com>
Because some unit tests benefit from having a
non-power-of-two pg_num.

Signed-off-by: John Spray <john.spray@redhat.com>
Signed-off-by: John Spray <john.spray@redhat.com>
@liewegas
Copy link
Member

see wip-9663-sage

@liewegas liewegas removed this from the infernalis milestone Aug 18, 2015
@liewegas
Copy link
Member

I think we should do this after infernalis unless there is a real pressing need for it there

@ghost
Copy link

ghost commented Sep 1, 2015

@jcsp this needs to be rebased

@jcsp jcsp closed this Oct 28, 2015
@jcsp
Copy link
Contributor Author

jcsp commented Oct 28, 2015

Closed in favour of #6405

@jcsp jcsp deleted the wip-9663-hashorder branch February 17, 2016 10:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
4 participants