Fix for CompactTest failure #2127
@kocolosk can you please review this one? If you are not the right person, can you suggest who is the right person to review this? Thanks! |
If I recall correctly, this used to be a flaky test. The current check seems to be correct: |
Hi @iilyak, thanks for the update. Also, I saw that the following check tests the Was not aware that we cannot detect compaction completion reliably. Please let me know if this is incorrect. |
I am still convinced that we have a bug in either the compactor or the code which detects that compaction is finished. Your numbers confirm that. |
There is a third possibility, which is an incorrect calculation of sizes for attachments. The current test you are trying to fix uses attachments. I remember there were fixes (by @eiri) in this area. The PR is not merged yet. |
@iilyak can you please suggest a way forward? This is the only test failure that we are encountering, and it remains a blocker for our activities. Is there a way we can skip this test for now, until the root cause in compaction or in the calculation of attachment sizes is resolved? |
@iilyak any updates for me on this one? |
fwiw my none-approved fixes were for I thought this test is already in skip list, the tag |
@eiri The check in |
@wohali Ah, I see. I should admit I'm still not very familiar with the Elixir test suite. The fix doesn't look right to me, tbh. I mean, it's strange to call a test "compaction reduces size of deleted docs" and then change it to assert the exact opposite 😄 If the test is repeatedly failing, we need to find the root cause of the failure and fix it. I appreciate it's not a simple task: size calculation, especially in combination with compaction, is convoluted. So maybe someone more proficient than me in the Elixir suite can disable this test for the time being, and we'll get back to it in version 3.x |
data_size is deprecated. Does info["sizes.file"] or info["sizes"]["file"] give a better answer? |
@sansato it's the exact same data... we're just going to remove the data_size field at some point from the output. see: couchdb/src/couch_mrview/src/couch_mrview_index.erl Lines 83 to 89 in 91b299d
data_size = sizes.external |
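On the `info["sizes.file"]` versus `info["sizes"]["file"]` question: when the info document is parsed into a nested map, only the second form resolves, because `sizes` is an object and not a dotted key. A small Python sketch of the difference follows; the `info` dict and its numbers are invented for illustration and are not output from a real database, and `get_size` is a hypothetical helper name.

```python
# Illustrative info document; the numbers are invented for this example.
info = {
    "sizes": {"file": 4096, "active": 1024, "external": 2048},
}


def get_size(info, kind):
    """Return info["sizes"][kind], or None if the field is absent."""
    return info.get("sizes", {}).get(kind)


file_size = get_size(info, "file")    # nested lookup finds the value
flat_lookup = info.get("sizes.file")  # no such top-level key, so None
```

The same nested access pattern should carry over to whatever map type the test suite uses for parsed JSON.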
@sansato I'd say But I agree with @iilyak: the fact that in your trace the final data and disk sizes are bigger than the originals indicates that compaction had not finished at the time of the final test assertion, i.e. it's not a question of what to compare, but more a question of when to compare. |
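To illustrate the "when to compare" point, here is a minimal Python sketch of polling until compaction is no longer reported as running, and only then reading the sizes. This is not the Elixir test suite's code. `wait_for_compaction`, `make_fake_db` and the `get_info` callable are hypothetical names, the sizes are invented, and the only CouchDB detail assumed is the `compact_running` field of the database info document. As noted earlier in the thread, polling like this only helps if that flag can be trusted.

```python
import time


def wait_for_compaction(get_info, timeout=30.0, interval=0.1):
    """Poll get_info() until compaction is no longer reported as running.

    get_info is a hypothetical callable returning the database info as a
    dict (for example, the parsed JSON body of GET /{db}).
    Returns the last info dict, or raises TimeoutError.
    """
    deadline = time.monotonic() + timeout
    while True:
        info = get_info()
        # Treat a missing field as "not running".
        if not info.get("compact_running", False):
            return info
        if time.monotonic() >= deadline:
            raise TimeoutError("compaction still running after %.1fs" % timeout)
        time.sleep(interval)


def make_fake_db(polls_until_done):
    """Stand-in for a real database: reports compaction as running for a
    fixed number of polls, then reports a smaller (invented) file size."""
    state = {"polls": 0}

    def get_info():
        state["polls"] += 1
        running = state["polls"] <= polls_until_done
        return {
            "compact_running": running,
            "sizes": {"file": 1000 if running else 400},
        }

    return get_info


if __name__ == "__main__":
    final_info = wait_for_compaction(make_fake_db(3), timeout=5.0, interval=0.01)
    print(final_info["sizes"]["file"])
```

A test written this way would take its final size measurement from the returned info instead of from a read made after a fixed delay.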
@iilyak @eiri @sansato I have run this compaction test with increased timeouts of 99999999 ms (~27 hours), yet the compaction doesn't complete and the test fails/times out.
@eiri I agree with you on this. I am quite new to couchdb/elixir/erlang myself and it was a challenging task, though I could figure out the above details with the limited knowledge I could gather. |
The decision for this PR is:
|
If we do end up disabling the test, I would encourage adding a comment pointing to this issue so that future code spelunkers will understand the history behind the decision. |
I debugged the failing CompactTest test case and managed to find the root cause of the failure.
After understanding the logic of the test case and tracing the code flow, I realised that the failure was due to an incorrect assert check at the following location:
https://github.com/apache/couchdb/blob/master/test/elixir/test/compact_test.exs#L46
This is because the final data size, measured after deletion and subsequent compaction, is larger than the data size measured after deletion alone, before compaction.
The test asserted the opposite, which is why it failed consistently.
Following are the values of the variables in question that I managed to trace:-
I have made the necessary changes and submitted this PR for the above fix.