Skip to content

14705#265

Closed
aweisberg wants to merge 27 commits intoapache:trunkfrom
belliottsmith:14705
Closed

14705#265
aweisberg wants to merge 27 commits intoapache:trunkfrom
belliottsmith:14705

Conversation

@aweisberg
Copy link
Copy Markdown
Contributor

No description provided.

// There are simply no extra replicas to speculate.
// Handle this separately so it can record failed attempts to speculate due to lack of replicas
if (replicaPlan.contact().size() >= replicaPlan.liveOnly().all().size())
if (replicaPlan.contact().size() == replicaPlan.candidates().size())
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So >= is not necessary because contact should be a subset of candidates (or it should be an unavailable)? Is it more robust to use >= in case that doesn't hold due to a mistake or have an assertion for that?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I modified to >= in an earlier version of this patch, and decided it was a silly change and rolled it back. Pretty much by definition, if we select contact from candidates, it should always be <=. I guess we could insert an assertion, but at some point we have to drawn the line on what we verify and what we do not. I'm willing to put an assertion in, but I feel it is unnecessary, given we produce almost all contact via a call to candidates.filter

public UnfilteredPartitionIterators.MergeListener getMergeListener(P replicaPlan)
{
return new PartitionIteratorMergeListener<>(replicaPlan, command, this.replicaPlan.consistencyLevel(), this);
// TODO: why are we referencing a different replicaPlan here?
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TODO indeed looking at DataResolver it's modifying the replica plan without updating the shared one https://github.com/apache/cassandra/pull/265/files#diff-7e5dd130632299911e49b12afe86c85aR121
So they would be different?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that was a stale comment. The merging never modified the replicaPlan, and it should never be modified during the SRP. Logically, it should be a snapshot of only the relevant replicas for the SRP.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I think I understand, the original replica plan contacted nodes that never responded and weren't part of the read repair. It's a new replica plan that only includes the nodes that responded.. The consistency level happens to be unchanged across both of them. It would be less misdirecting if it fetched the consistency level from the plan that it is given.

* we progressively modify via various forms of speculation (initial speculation, rr-read and rr-write)
* @param <P>
*/
public static class Shared<P extends ReplicaPlan<?, ?, ?>>
Copy link
Copy Markdown
Contributor Author

@aweisberg aweisberg Sep 10, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is kind of a weird place it end up in terms of mutability and immutability. It's not thread safe and if it's supposed to always be referencing the latest instance why immutable?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I really understand the question or criticism. Could it be mutable? For starters, it would require a lot of other changes to ReplicaPlan and ReplicaCollection, that don't seem warranted for this single use case. Besides that, yes, but this would be less threadsafe (this version is actually perfectly safe to have multiple readers, as we do, they just may not see the latest 'contacts' value promptly). I'd be happy to upgrade it to fully threadsafe if it would make us happier, but this was changed to non-threadsafe specifically in response to prior feedback about the old code that it unnecessarily used volatile. It would be trivial to make it fully threadsafe, and the cost would actually be fairly low, and while I'm fairly sure it isn't needed, it would be easier to reason about.


@Override
public ForRange withSelected(EndpointsForRange newSelected)
public AbstractBounds<PartitionPosition> range()
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems to get copied around but is unused?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It wasn't unused, but it can be made so - and it is now. But for consistency with ReplicaLayout.ForTokenRead, I think it's fairly harmless.

A wider question is whether we actually want a ReplicaLayout for reads at all. They're only a single Endpoints, and I do wonder if it's worth it. But, again, for consistency with ReplicaPlan I think it's a very small price to pay.

public SharedForRangeRead(ForRangeRead replicaPlan) { this.replicaPlan = replicaPlan; }
public void addToContact(Replica replica) { replicaPlan = replicaPlan.withContact(Endpoints.append(replicaPlan.contact(), replica)); }
public ForRangeRead get() { return replicaPlan; }
public ForRangeRead getWithContact(EndpointsForRange newContact) { return replicaPlan.withContact(newContact); }
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wait, so it's not shared now? We made a new one without updating the reference?

Copy link
Copy Markdown
Contributor

@belliottsmith belliottsmith Sep 12, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a getter for the case where we don't want to update the reference. I will remove it to avoid any ambiguity. It was introduced so that we could eliminate a lot of type parameters - the withContacts() methods are all now on the most concrete types, so in order to provide one here, we need to implement it per Shared. I will comment to this effect.

{
private ForRangeRead replicaPlan;
public SharedForRangeRead(ForRangeRead replicaPlan) { this.replicaPlan = replicaPlan; }
public void addToContact(Replica replica) { replicaPlan = replicaPlan.withContact(Endpoints.append(replicaPlan.contact(), replica)); }
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

addToContact or addContact, grammar seems odd.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It does; in general the non-pluralised form of contact doesn't work well with the method names. Perhaps I should just pluralise and be done with it? Not sure why I was averse to this.

* See {@link ReplicaLayout#haveWriteConflicts}
* @return a 'pending' replica collection, that has had its conflicts with natural repaired
*/
public static <E extends Endpoints<E>> E resolveWriteConflictsInPending(E natural, E pending)
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do these need to be public?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not anymore

// TODO: move to replica layout as well?
final int blockfor;
final L replicaLayout;
final int blockFor; // TODO: move to replica plan as well?
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Move it or keep it?

}

/**
* TODO: this method is brittle for its purpose of deciding when we should fail a query;
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be a JIRA rather than a TODO or just get done?


EndpointsForToken replicas = StorageService.instance.getNaturalAndPendingReplicasForToken(ks, tk);
Replicas.temporaryAssertFull(replicas); // TODO in CASSANDRA-14549
// TODO: this logic could do with revisiting at some point, as it is unclear what its rationale is
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

JIRA vs this TODO?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants