Fix out of memory #775

Merged
merged 13 commits into from
Feb 14, 2015

im-denisenko
Contributor

This PR fixes the OutOfMemory exception in Travis builds (it really ought to be named something like OutOfThreadLimit, because the problem is not related to memory at all). The last 132 builds of the last commit (22 runs * 6 PHP versions) were successful, except for 3 that failed in provision.sh at the apt-get install or pip install step due to network issues.

Summary of changes:

  • The ES thread pool size was limited with the processors: 1 setting
  • The thread stack size was limited (if you have 100M for all threads and each thread consumes 10M, you can start 10 threads; with 1M per thread, you can start 100). 1024 is the minimum value at which an ES instance can be started
  • The third ES instance was removed, because two are enough for all tests, including nodeShutdown and clusterShutdown
  • The memory limit was fixed (the parameter for ulimit -l must be either an integer such as 256 or "unlimited", but the string "256M" was used)
  • The modification of vm.max_map_count was removed, because we can't use sysctl inside a Travis machine
  • _createIndex now generates a unique name for the created index if no name is provided as the first argument
  • The test cluster now accepts connections only from localhost, because I saw in some logs that a node with a random name (neither "Silver Fox" nor "Skywalker") had joined the cluster
  • I/O operations were reduced by adding the index.store.type: memory setting
  • Test\Base::tearDown now deletes all indices and caches between tests
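
The stack-size item in the summary comes down to integer division. Here is a minimal sketch in Python, assuming a fixed budget shared by all thread stacks. The helper name `max_threads` is hypothetical, and the 100M budget is the illustrative figure from the list, not a measured Travis limit.

```python
def max_threads(total_stack_budget_mb: int, stack_size_mb: int) -> int:
    """Upper bound on thread count when every thread reserves a fixed stack.

    Both arguments are illustrative; real limits also depend on the JVM,
    the kernel, and the container's own thread or process caps.
    """
    if stack_size_mb <= 0:
        raise ValueError("stack size must be positive")
    return total_stack_budget_mb // stack_size_mb


if __name__ == "__main__":
    print(max_threads(100, 10))  # 10 threads with 10M stacks
    print(max_threads(100, 1))   # 100 threads with 1M stacks
```

Shrinking the per-thread stack raises the ceiling on how many threads can be created, which is why the error goes away even though no extra memory was added.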

Travis uses OpenVZ virtualization, so we can't modify kernel settings.
See: elastic/elasticsearch#4978
See: http://changelog.travis-ci.com/post/45177235333/builds-now-running-on-openvz
Actually, two really is enough.
One of them shuts down in testNodeShutdown and the second in testClusterShutdown.
If we keep using three, OutOfMemory always happens, regardless of the settings used.
So it's much simpler to remove one node than to try to find out wtf is going on in Travis.
Never accept connections from the same datacenter in test builds.
The new script waits until both the 9200 and 9201 nodes have started.
Then it waits 30 seconds and checks them again.
If a node has suddenly become unavailable, it tries to restart the cluster.
If the cluster has been restarted 10 times and still has not come up, it's a failure: go and fix the code.
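
The wait-and-restart behaviour described in this commit message can be sketched roughly as follows. This is not the actual script from the PR, only an illustration in Python. The function names, the `startup_timeout` and `poll_seconds` parameters, and the use of a plain TCP connect as the "node is up" check are all assumptions. The ports, the 30-second settle time, and the limit of 10 restarts come from the text.

```python
import socket
import time
from typing import Callable, Iterable


def port_open(port: int, host: str = "127.0.0.1", timeout: float = 1.0) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def wait_for_cluster(
    restart: Callable[[], None],
    is_up: Callable[[int], bool] = port_open,
    ports: Iterable[int] = (9200, 9201),
    settle_seconds: float = 30.0,
    max_restarts: int = 10,
    startup_timeout: float = 60.0,  # assumption: not stated in the PR
    poll_seconds: float = 1.0,      # assumption: not stated in the PR
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Return True once all nodes are up and still up after the settle time.

    Assumes the caller has already started the cluster once. `restart` is
    called at most `max_restarts` times before giving up with False.
    """
    ports = tuple(ports)
    for attempt in range(max_restarts + 1):
        if attempt > 0:
            restart()
        # Poll until every node answers or the startup timeout runs out.
        waited = 0.0
        while not all(is_up(p) for p in ports) and waited < startup_timeout:
            sleep(poll_seconds)
            waited += poll_seconds
        if all(is_up(p) for p in ports):
            # Nodes can die shortly after start, so check a second time.
            sleep(settle_seconds)
            if all(is_up(p) for p in ports):
                return True
    return False
```

Passing `is_up` and `sleep` as parameters keeps the loop testable without real nodes or real waiting; a CI script would pass a `restart` callable that stops and relaunches both instances.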
Limiting the stack size allows Java to spawn more threads,
so this fixes OutOfMemory in Travis builds.
This makes sure that the tests do not depend on each other.
Also, prefer using _createIndex over manual creation.
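
The "unique name when none is given" idea behind _createIndex could look roughly like this. It is a Python illustration only, not the PHP helper from the test suite; the function name `index_name` and the `test` prefix are hypothetical.

```python
import uuid
from typing import Optional


def index_name(name: Optional[str] = None, prefix: str = "test") -> str:
    """Return the given name, or a fresh unique one when none is supplied."""
    if name:
        return name
    # uuid4().hex is 32 lowercase hex characters, so the generated name
    # stays lowercase, which Elasticsearch requires for index names.
    return f"{prefix}_{uuid.uuid4().hex}"
```

Giving every test its own index name means a test cannot accidentally read documents or mappings left behind by an earlier one.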
A sleep before the assert makes sure that all nodes in the cluster have created the type.
@coveralls

Coverage Status

Coverage increased (+0.12%) to 84.12% when pulling a03fff2 on im-denisenko:fix-out-of-memory into 29b62a7 on ruflin:master.

ruflin added a commit that referenced this pull request Feb 14, 2015
@ruflin ruflin merged commit 452baf9 into ruflin:master Feb 14, 2015
@ruflin
Owner

ruflin commented Feb 14, 2015

Nice, thanks. What would be the consequence of still having 3 instances for testing? Would we run into any limitations because of the criteria you mentioned above?

@im-denisenko
Contributor Author

Yep, there will be problems.

I tried hard to avoid this, but in the end I failed to get a stable 3-node cluster. The third node always dies, either immediately after start or in the middle of the test suite, regardless of the settings used. It dies even if phpunit sleeps between tests and runs them very slowly.

Anyway, I'm sure that running multiple nodes on a machine with such low performance is not a usual way to use Elasticsearch, so it's a good thing that at least two instances can be started.

@im-denisenko im-denisenko deleted the fix-out-of-memory branch February 14, 2015 21:52
@ruflin
Owner

ruflin commented Feb 14, 2015

Ok, good to know. So I assume the issues appear on travis and not on the local machine?

@ruflin
Owner

ruflin commented Feb 14, 2015

BTW: The only issue I have with the new tests is that they take quite a bit longer. I assume this is because of the removal of the indices. But here stability goes over speed.

@im-denisenko
Contributor Author

Yes, I still can't reproduce it in VirtualBox. I think it could be because of the difference in virtualization systems. In Travis this is OpenVZ, and in VirtualBox this is em... virtualboxvz? I'm not good at this, so I will stop making further assumptions :)

@ruflin
Owner

ruflin commented Feb 14, 2015

I'm happy it works now ;-) Thx a lot.
