Implement parallelization of feature calculation per kind #35
Conversation
Have you checked the maximal memory usage? Are the time series containers per kind duplicated?
The maximal memory usage depends on the size of the input, as the time series containers per kind (which we internally store in one dict) are copied to the worker processes. It seems that the Travis jobs stall at some point. All tests run fine on my machine. It might be a memory issue. I will investigate this further and update the PR.
I looked into the process and memory issues. Travis only gives you 3 GB of memory and 2 cores. The combination of parallelizing the individual tests and then parallelizing the different p-value calculations could cause Travis to time out. We could use just one joblib process if the environment variable "TRAVIS" is set.
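A minimal sketch of that idea (the function name and the fallback to one process per CPU are illustrative assumptions, not the actual tsfresh code):

```python
import multiprocessing
import os

def default_n_processes():
    # Illustrative: use a single worker process when running on Travis CI
    # (which exports TRAVIS=true), otherwise one process per CPU.
    if os.environ.get("TRAVIS") == "true":
        return 1
    return multiprocessing.cpu_count()
```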
pytest-xdist was set to 4 slaves, maybe too much for 3 GB of RAM and 2 cores. I set it to auto.
Finally it is passing... so we should deploy xdist with only 2 slaves. I would prefer to use `-n auto` in setup.cfg and then change this line in the travis.yml. Furthermore, we have to look at the coverage.
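Concretely, that could look like the following (illustrative fragments; the exact section names and file contents depend on the pytest and Travis versions in use):

```ini
# setup.cfg
[tool:pytest]
addopts = -n auto
```

```yaml
# .travis.yml -- override the worker count on CI
script:
  - py.test -n 2
```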
I don't understand Travis's behavior at this point. It stalls at random unit tests. If I reduce the number of xdist worker nodes to 1 or 2, it passes on Python 3.5.2 but fails on Python 2.7. The error I get is
I cannot reproduce any of the errors locally. I will have to install a local Travis Docker image and investigate further.
I couldn't reproduce the stalling tests in a similarly sized VM with the same operating system as Travis; I have yet to try with Docker. Looking around a bit, I found a Travis file containing:

```yaml
# The next couple lines fix a crash with multiprocessing on Travis and are not specific to using Miniconda
- sudo rm -rf /dev/shm
- sudo ln -s /run/shm /dev/shm
```

Maybe they ran into the same issue. However, I'd rather not switch to a sudo Travis image. We could just set the default number of processes to 1 and leave it at that. An alternative would be to set an environment variable in the Travis file and set the number of processes accordingly. Regarding the unit test for parallel extraction: in my opinion we don't need it. We rely on multiprocessing.Pool to do its job, and I trust it is tested well enough to work as expected.
The thing is, sometimes the stalling unit tests are not even the ones where the Pool method is called. I also tried reducing the number of processes to 1, see commit 3cc9d56. The tests still failed :-/
We should at least test the aggregation of the different jobs for the different kinds of time series. Maybe we can mock the library?
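One way to test only the aggregation is to replace the pool with a serial stand-in via a mock. This is a sketch: `extract_per_kind`, `_job`, and the toy mean feature are invented for illustration and are not the tsfresh API.

```python
import multiprocessing
from unittest import mock

def _job(item):
    # Worker: compute one toy feature for a single kind of time series.
    kind, series = item
    return kind, {"%s__mean" % kind: sum(series) / len(series)}

def extract_per_kind(kind_to_series):
    # One job per kind; the per-kind results are aggregated into one dict.
    with multiprocessing.Pool() as pool:
        results = pool.map(_job, sorted(kind_to_series.items()))
    return dict(results)

class SerialPool:
    """Stand-in pool that runs every job in the calling process."""
    def map(self, func, iterable):
        return [func(x) for x in iterable]
    def __enter__(self):
        return self
    def __exit__(self, *exc):
        return False

def test_aggregation():
    # Patch out the real pool, so only the aggregation logic is exercised.
    with mock.patch("multiprocessing.Pool", return_value=SerialPool()):
        out = extract_per_kind({"a": [1, 2, 3], "b": [4, 4]})
    assert out == {"a": {"a__mean": 2.0}, "b": {"b__mean": 4.0}}
```

This keeps the test fast and deterministic while still covering how results from several kinds are merged.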
I think we have to debug this in a Docker box. Here is an explanation of how to set up a Docker Travis box
Travis is weird. If I rerun the branch tests with Python 2.7 and Python 3.5, the Python 2.7 job fails. If I then restart just the Python 2.7 job, it passes. Anyway, I turned off xdist and joblib and got a memory allocation error. So it seems that memory is indeed the issue here.
Done in 80803d6 |
If there are several kinds of time series, their features are calculated in parallel using a process pool. The default behavior is one process per CPU. This setting can be overridden in the FeatureExtractionSettings object provided to extract_features.
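The mechanism described above could be sketched roughly as follows (a simplified illustration: function names, the toy features, and the `n_processes` parameter are assumptions, not the actual tsfresh API):

```python
import multiprocessing

def _features_for_kind(args):
    # Worker: compute a few toy features for a single kind of time series.
    kind, series = args
    return kind, {"mean": sum(series) / float(len(series)),
                  "maximum": max(series)}

def extract_features_per_kind(kind_to_series, n_processes=None):
    # One job per kind of time series. By default one process per CPU;
    # a smaller value can be passed in, mirroring the settings override.
    if n_processes is None:
        n_processes = multiprocessing.cpu_count()
    jobs = list(kind_to_series.items())
    if n_processes == 1:
        # Serial fallback, avoiding process-pool overhead entirely.
        results = [_features_for_kind(job) for job in jobs]
    else:
        pool = multiprocessing.Pool(n_processes)
        try:
            results = pool.map(_features_for_kind, jobs)
        finally:
            pool.close()
            pool.join()
    return dict(results)
```

Because each kind is an independent job, memory use grows with the number of worker processes that each hold a copy of their input, which matches the memory pressure discussed in the conversation above.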