New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Send 202 if quorum not met but copy is written #11

Closed
wants to merge 4 commits into
from

Conversation

Projects
None yet
2 participants
@bdionne
Contributor

bdionne commented Aug 23, 2011

This requires a branch of chttpd of the same name

@kocolosk

View changes

Show outdated Hide outdated src/fabric_doc_update.erl
{timeout, Acc} ->
{_, _, W1, _, DocReplDict} = Acc,
{_, Resp} = dict:fold(fun force_reply/3, {W1,[]}, DocReplDict),
{accepted, [R || R <- couch_util:reorder_results(AllDocs, Resp), R =/= noreply]};

This comment has been minimized.

@kocolosk

kocolosk Aug 24, 2011

Member

Are you certain accepted is always the correct response here? We should only send accepted if at least one copy of each document was saved.

@kocolosk

kocolosk Aug 24, 2011

Member

Are you certain accepted is always the correct response here? We should only send accepted if at least one copy of each document was saved.

@kocolosk

This comment has been minimized.

Show comment
Hide comment
@kocolosk

kocolosk Aug 24, 2011

Member

I haven't dug deeply into the possible replies from the coordinator, but we want to cover more cases than just timeouts, right? For example, if two copies are down we should hopefully be able to find that out before the 'timeout' arrives. Perhaps we can't until that other pull request is merged.

Member

kocolosk commented Aug 24, 2011

I haven't dug deeply into the possible replies from the coordinator, but we want to cover more cases than just timeouts, right? For example, if two copies are down we should hopefully be able to find that out before the 'timeout' arrives. Perhaps we can't until that other pull request is merged.

@bdionne

This comment has been minimized.

Show comment
Hide comment
@bdionne

bdionne Sep 6, 2011

Contributor

@kocolosk, at a glance it appears we've added an addition layer of {ok, .... to results which messes up reorder_results

This last commit fixes it but I'm testing it further still

Contributor

bdionne commented Sep 6, 2011

@kocolosk, at a glance it appears we've added an addition layer of {ok, .... to results which messes up reorder_results

This last commit fixes it but I'm testing it further still

@kocolosk

This comment has been minimized.

Show comment
Hide comment
@kocolosk

kocolosk Sep 7, 2011

Member

Good catch @bdionne, but that doesn't look like the right fix to me. The coordinator is returning {stop, {Health, Reply}} which gets converted to {ok, {Health, Reply}} by the recv function. The problem with the code is that it treats the first element of the output of the recv function as the Health when it should be splitting the second element into {Health, Reply} instead. I'm not sure about the change you made to the force_reply fold in the timeout clause there.

Member

kocolosk commented Sep 7, 2011

Good catch @bdionne, but that doesn't look like the right fix to me. The coordinator is returning {stop, {Health, Reply}} which gets converted to {ok, {Health, Reply}} by the recv function. The problem with the code is that it treats the first element of the output of the recv function as the Health when it should be splitting the second element into {Health, Reply} instead. I'm not sure about the change you made to the force_reply fold in the timeout clause there.

@bdionne

This comment has been minimized.

Show comment
Hide comment
@bdionne

bdionne Sep 7, 2011

Contributor

Wow, talk about tunnel vision. I fixed the first and reverted the change to the force_reply in the timeout clause.

Contributor

bdionne commented Sep 7, 2011

Wow, talk about tunnel vision. I fixed the first and reverted the change to the force_reply in the timeout clause.

@kocolosk

This comment has been minimized.

Show comment
Hide comment
@kocolosk

kocolosk Sep 7, 2011

Member

Cool, that seems better. The one thing I was thinking we could do to improve this PR is to label each updated document with the appropriate ok/accepted/error tag in the response. If memory serves the _bulk_docs response body has this tag embedded, so it'd be one way to provide more information about the specific state of each update request rather than simply lumping it all together in the status code.

Member

kocolosk commented Sep 7, 2011

Cool, that seems better. The one thing I was thinking we could do to improve this PR is to label each updated document with the appropriate ok/accepted/error tag in the response. If memory serves the _bulk_docs response body has this tag embedded, so it'd be one way to provide more information about the specific state of each update request rather than simply lumping it all together in the status code.

@kocolosk

This comment has been minimized.

Show comment
Hide comment
@kocolosk

kocolosk Sep 20, 2011

Member

This latest commit should address my suggestion from a couple of weeks ago. Of course, we'll need to update the HTTP layer to compensate for this change.

Member

kocolosk commented Sep 20, 2011

This latest commit should address my suggestion from a couple of weeks ago. Of course, we'll need to update the HTTP layer to compensate for this change.

@bdionne

This comment has been minimized.

Show comment
Hide comment
@bdionne

bdionne Sep 20, 2011

Contributor

sorry, I misunderstood this as a more substantial change you wanted to push out.

Most of the paths into this come thru fabric:update_doc which already responds with {accepted, Rev} so it appears only a minor change is needed there and in chttpd_db:update_doc_result_to_json

Contributor

bdionne commented Sep 20, 2011

sorry, I misunderstood this as a more substantial change you wanted to push out.

Most of the paths into this come thru fabric:update_doc which already responds with {accepted, Rev} so it appears only a minor change is needed there and in chttpd_db:update_doc_result_to_json

@kocolosk

View changes

Show outdated Hide outdated src/fabric.erl
@@ -182,6 +182,10 @@ update_doc(DbName, Doc, Options) ->
case update_docs(DbName, [Doc], opts(Options)) of
{ok, [{ok, NewRev}]} ->
{ok, NewRev};
{accepted, [{ok, NewRev}]} ->

This comment has been minimized.

@kocolosk

kocolosk Sep 20, 2011

Member

I think it's impossible for this clause to ever match the response given the latest changes. The only way to set the overall health to accepted is if at least one of the individual document updates has an accepted status.

@kocolosk

kocolosk Sep 20, 2011

Member

I think it's impossible for this clause to ever match the response given the latest changes. The only way to set the overall health to accepted is if at least one of the individual document updates has an accepted status.

@kocolosk

This comment has been minimized.

Show comment
Hide comment
@kocolosk

kocolosk Sep 20, 2011

Member

What more substantial change did you have in mind?

Member

kocolosk commented Sep 20, 2011

What more substantial change did you have in mind?

@bdionne

This comment has been minimized.

Show comment
Hide comment
@bdionne

bdionne Sep 20, 2011

Contributor

not sure, somehow I had in mind that tagging each doc would require more substantial refactoring of chttpd.

Contributor

bdionne commented Sep 20, 2011

not sure, somehow I had in mind that tagging each doc would require more substantial refactoring of chttpd.

@kocolosk

This comment has been minimized.

Show comment
Hide comment
@kocolosk

kocolosk Sep 21, 2011

Member

The last assertion in the first test still fails. The test was written expecting that the documents are in two different shards, but the generated partition map contains only logical shard.

Member

kocolosk commented Sep 21, 2011

The last assertion in the first test still fails. The test was written expecting that the documents are in two different shards, but the generated partition map contains only logical shard.

@kocolosk

This comment has been minimized.

Show comment
Hide comment
@kocolosk

kocolosk Sep 21, 2011

Member

This is looking good from my end. I'd like to squash a few of the simple fixups at least. Need to see if there are logical groupings to the changes that we could keep as separate commits.

Member

kocolosk commented Sep 21, 2011

This is looking good from my end. I'd like to squash a few of the simple fixups at least. Need to see if there are logical groupings to the changes that we could keep as separate commits.

Robert Dionne and others added some commits Aug 18, 2011

Robert Dionne
Send 202 if quorum not met but copy is written
Detects timeouts and returns {accepted,... up thru to the chttpd
layer which becomes a 202 in the case that some docs were written
even though the quorum was not met.

BugzID:12533
Track per-document quorms, respond with overall health
We have a bug where documents that did not elicit any response from a
server are assumed to be executed with the replicated_changes option.
If that's not the case we ought to be handing out 'error' responses for
those documents, but instead it seems we just suppress them from the
response entirely. This patch adds a failing test.

Other than that, the idea here is to have the module respond with
'ok|accepted|error' as the overall health, and then each document gets a
similar tag. An overall health of 'accepted' means each document was
saved at least once, but some documents did not meet the write quorum.
An overall health of 'error' means at least one document was not saved
at all.

BugzID:12533
Tag individual doc results as 'error' or 'accepted'
Check for any successful shard update; if one is found, tag the document
as 'accepted', otherwise, use an arbitrary unsuccessful response.

BugzID:12533
@bdionne

This comment has been minimized.

Show comment
Hide comment
@bdionne

bdionne Sep 21, 2011

Contributor

This seems reasonable, I broke up c3cd53 and put the test case changes in with bbc5625

Contributor

bdionne commented Sep 21, 2011

This seems reasonable, I broke up c3cd53 and put the test case changes in with bbc5625

@bdionne

This comment has been minimized.

Show comment
Hide comment
@bdionne

bdionne Sep 21, 2011

Contributor

Thanks for fixing the tests.

Contributor

bdionne commented Sep 21, 2011

Thanks for fixing the tests.

@kocolosk kocolosk closed this in 92d9f8f Sep 21, 2011

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment