CompactTest fails while executing CouchDB elixir test suite. #2099

Closed
sarveshtamba opened this issue Aug 7, 2019 · 8 comments

@sarveshtamba
Contributor

sarveshtamba commented Aug 7, 2019

- CouchDB v2.3 (master branch)
- Erlang OTP v21
- Elixir v1.8.2

I am seeing the issue below while running the CouchDB test suite using make check or make elixir.

I have set the timeouts at the following two places to 999_999. (Raising them from the existing 5000 to 10000 still caused some tests to fail; I will try to find the optimum value after some more trials.)

https://github.com/apache/couchdb/blob/master/test/elixir/lib/couch.ex#L182
https://github.com/apache/couchdb/blob/master/test/elixir/lib/couch/db_test.ex#L293

Right now the Elixir test suite is in progress and all the tests so far seem to pass, but it has been stuck at the following test for quite some time:

CompactTest
  * test compaction reduces size of deleted docs

After the long timeout, the test finally fails as below:

CompactTest
  * test compaction reduces size of deleted docs (1017513.1ms)

  1) test compaction reduces size of deleted docs (CompactTest)
     test/elixir/test/compact_test.exs:17
     ** (RuntimeError) timed out after 1000053 ms
     code: retry_until(fn ->
     stacktrace:
       (couchdbtest) test/elixir/lib/couch/db_test.ex:301: Couch.DBTest.retry_until/4
       test/elixir/test/compact_test.exs:38: (test)

This is the only test failure that I am seeing right now, and all the other tests pass successfully.
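
For readers unfamiliar with the pattern, the error in the trace above comes from a polling helper that keeps re-running a check until it passes or a timeout expires. Below is a minimal sketch of that pattern. It is not the actual Couch.DBTest.retry_until/4 code; the module name RetrySketch, the argument order, and the default values are assumptions made for illustration only.

```elixir
defmodule RetrySketch do
  # Hypothetical helper, NOT the real Couch.DBTest.retry_until/4.
  # Calls `condition` every `sleep_ms` milliseconds until it returns a
  # truthy value, and raises once more than `timeout_ms` has elapsed.
  # The defaults (100 and 5_000) are illustrative assumptions.
  def retry_until(condition, sleep_ms \\ 100, timeout_ms \\ 5_000) do
    start = System.monotonic_time(:millisecond)
    loop(condition, sleep_ms, timeout_ms, start)
  end

  defp loop(condition, sleep_ms, timeout_ms, start) do
    if condition.() do
      :ok
    else
      elapsed = System.monotonic_time(:millisecond) - start

      if elapsed > timeout_ms do
        raise "timed out after #{elapsed} ms"
      end

      Process.sleep(sleep_ms)
      loop(condition, sleep_ms, timeout_ms, start)
    end
  end
end
```

With a helper of this shape, a condition that can never become true will always run until the timeout, however large it is. That would be consistent with the test running for roughly the full 999_999 ms before failing, which suggests the timeout value itself may not be the problem.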

@sarveshtamba
Contributor Author

@wohali @kocolosk any inputs on this one?

@kocolosk
Member

Hard to say without any additional details like database log files. I don't think I've ever seen that one fail like that on our CI system yet so I don't have any other context to go on.

@sarveshtamba
Contributor Author

sarveshtamba commented Aug 20, 2019

Hi @kocolosk ,

I debugged the failing CompactTest test case and have managed to find the root cause of the failure.

After understanding the logic of the test case and tracing the code flow, I realised that the failure happened due to the incorrect assert check at the following location:-
https://github.com/apache/couchdb/blob/master/test/elixir/test/compact_test.exs#L46

This is because the final data size (measured after deletion and subsequent compaction) is larger than the deleted data size (measured after deletion only, before compaction).
The test was asserting the opposite, which is why it failed consistently.

Following are the values of the variables in question that I managed to trace:-

	CompactTest
	  * test compaction reduces size of deleted docs

	Value of orig_data_size = 4436.

	Value of orig_disk_size = 103907.

	Value of deleted_data_size = 7455.

	Value of final_data_size = 11924.

	Value of final_disk_size = 218681.

	  * test compaction reduces size of deleted docs (18819.2ms)
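
To make the comparison concrete, here is a small sketch that plugs the traced values into both orderings. Only the numbers are taken from the trace above; the snippet is not the code from compact_test.exs.

```elixir
# Values copied from the trace above.
deleted_data_size = 7455
final_data_size = 11924

# The ordering the test is described as expecting does not hold here:
IO.inspect(final_data_size < deleted_data_size, label: "final < deleted")
# prints: final < deleted: false

# The opposite ordering does hold for these values:
IO.inspect(final_data_size > deleted_data_size, label: "final > deleted")
# prints: final > deleted: true
```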

I have made the necessary changes and submitted a PR for the same.
#2127

@sarveshtamba
Contributor Author

The entire test suite for CouchDB v2.3 (current master) now passes with Erlang v21 and Elixir v1.8 on PowerPC64LE. Closing this issue. Thanks for all your help and support in getting this through.

@kocolosk
Member

Great stuff @sarveshtamba. We are looking to add PPC64LE to the CI matrix soon, so that should help us keep this one green.

@sarveshtamba
Contributor Author

@kocolosk any timelines to add PPC64LE to the CI matrix?

@sarveshtamba
Contributor Author

Hi @kocolosk ,

The script to build CouchDB along with all of its dependencies on PowerPC64LE is available at the location below:
https://github.com/ppc64le/build-scripts/blob/master/couchdb/couchdb_ubuntu_16.04.sh

Note that this is tested for building CouchDB v2.3 (current master) with Erlang v21 and Elixir v1.8 only.

Also as per here - https://cwiki.apache.org/confluence/display/INFRA/Jenkins+node+labels , below are the ppc64le nodes available for ASF projects:-

|-------+---------------------------+-|
|ppc64le|                           |2|
|       |hadoop-ppc64le-1,          | |
|       |ubuntu-ppc64-le            | |
|-------+---------------------------+-|

There are currently 2 ppc64le nodes present at https://builds.apache.org/computer/ . They are

  1. https://builds.apache.org/computer/hadoop-ppc64le-1/ and
  2. https://builds.apache.org/computer/ubuntu-ppc64le/

@wohali
Member

wohali commented Sep 17, 2019

@sarveshtamba Please don't issue hijack, this isn't the right issue for this topic.

We're aware of those nodes and can't use the hadoop one. Because of our need for redundant builders, we've requested 2 nodes from OSU and received them yesterday. We're also moving to a new Jenkins/CloudBees Core install soon and will set those nodes up then.

@apache apache locked as off-topic and limited conversation to collaborators Sep 17, 2019