"TypeError: cannot pickle '_io.TextIOWrapper' object" issue blocking final step; python version issue? #570
Comments
Hi Phil, Thanks for raising this and for the extra info supplied. I'll take a look and see if there's a fix I can make in the OrthoFinder code to resolve this. Best wishes |
Hi Phil, Thanks again for reporting this. The notes below mainly record the changes I've made to fix this and why, so feel free to ignore them if they're not of interest.

On macOS, as of Python 3.8, the spawn start method is now the default for multiprocessing instead of fork: https://docs.python.org/3/library/multiprocessing.html#contexts-and-start-methods. Spawn does not allow the pickling of TextIOWrapper objects, and these are essential for this particular bit of multiprocessing code in OrthoFinder. OrthoFinder parallelises over the gene trees, and each of these parallel processes identifies all the orthologs in each gene tree it processes and writes those orthologs to the ortholog results files.

I can see only one way to get around this, detailed below. It would be a large amount of work to implement and has some significant downsides, which might not be resolvable. For that reason I've switched back to forking on macOS.

Alternative: there would be a single file-writer process responsible for writing all the orthologs, duplications etc. to file. All the other processes would pass orthologs etc. to this process, and it would write them to file. As it would wholly own the TextIOWrapper objects, there would be no need to pass them between the multiple processes, and this would eliminate the issue. However, there are significant load-distribution issues. If the orthologs, gene duplications etc. from the parallel tree-processing threads were produced faster than the serial ortholog-writing thread could manage, an increasingly large backlog could build up, and this could exceed the amount of RAM on the machine, causing a crash. It might be possible to parallelise the writing of the different ortholog results files over different threads, but this would be complex and would still require very delicate balancing to prevent such problems.

Currently, each tree-processing thread is also responsible for writing its own orthologs, so no such backlog can build up and the method automatically balances itself. |
Hi @philoel, I've submitted a fix but I have limited ability to test things on mac, if you'd be able to try it and let me know if it fixes the issue for you that'd be really helpful, thanks! |
Hi there - I'm running my first analysis in a few months, this time on a new machine with fresh installs of everything, including Python. After getting almost to the end of the whole run, I noticed that the species tree wasn't inferred correctly, so I passed a manually corrected tree using orthofinder -ft path_to_dir -s manual_tree.txt.
It's hanging up within a heartbeat, with this message (copied to .txt and attached).
I googled the final TypeError, and found GoogleCloudPlatform/gsutil#961 , which suggests that it might be an error with a recent version of python? I'm using Python 3.9.4 for this.
I didn't see "TypeError: cannot pickle '_io.TextIOWrapper' object" in the issues for Orthofinder so I'm not sure if this is new or not.
Anyway, I downgraded my Python to 3.6.0 to see if the error might really be solved just by a Python change, and it worked nicely.
So, heads up that there is some issue arising in Python 3.9 that interferes with ... pickling things? I'm not sure if there is a better venue to let you know this than by opening an issue.
orthofinder_TypeError_output.txt
Best,
Phil