Celery TypeError "cannot serialize '_io.BufferedReader' object" #148
Could you try upgrading billiard and kombu? Otherwise, can I get the output?
This happens for every single script I run, my own as well as the example scripts, including the "sum" script in the "Adding and Managing Scripts" part of the documentation. I just tried upgrading and I already have the latest versions. Output of pip freeze:
Thank you, I'll see if I can replicate this with your setup.
Thanks! In case it's relevant, in order to get it working on my Windows 10 machine my Procfile reads
I am unable to reproduce it on a Linux machine. I'll check out Python on a Windows VM later today. What Python distribution are you running?
Python 3.5.1
Just got the same crash on Windows 10 with Python 3.5. I'll see if I can find a workaround.
The issue is that Python 3.5 on Windows cannot pickle the subprocess.PIPE handles used for capturing stdout/stderr. I don't know an immediate workaround for this.
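For reference, the failure is easy to isolate outside of Celery. This is a minimal sketch, not Wooey's actual code: on Windows there is no fork, so worker state is serialized with pickle, and any object graph holding an open pipe reader fails.

```python
import pickle
import subprocess
import sys

# Minimal reproduction sketch: proc.stdout is an _io.BufferedReader, and
# pickling any open file/pipe object raises TypeError in Python 3.
proc = subprocess.Popen([sys.executable, "-c", "print('hello')"],
                        stdout=subprocess.PIPE)
try:
    pickle.dumps(proc.stdout)
except TypeError as exc:
    # On Python 3.5 the message reads "cannot serialize '_io.BufferedReader' object"
    print("pickle failed:", exc)
finally:
    proc.communicate()
```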
That is extremely disappointing. I was trying to Google to figure out what this means, and I came across this StackOverflow question that seems to be addressing the same problem, but I doubt the solution transfers very well.
Are you able to test this PR: It works in my Windows test, but I'd like to verify it works for you first, if possible.
Well, good news: no crashes, no errors, and dynamic updating. Bad news: jobs never end; they just execute into infinity and don't complete. Looking at the folder where results are saved, it seems not to be reaching the step where it saves things to zip and tar packages. I did try making the thread a daemon, hoping it would close on its own, but that didn't seem to work.

I don't know if this is related (I've been assuming it's a Celery bug, and Celery no longer officially supports Windows), but after clicking "Stop" on the job page I am forced to close and restart my Celery worker before it will take any additional jobs. Edit:
It's actually the line ending that is causing the problem in this bit of code: I'm looking into a cross-platform solution now and fixing our AppVeyor Windows CI so that it reproduces this error (oddly, it was passing tests).
I've tested this and I get no more errors! One odd quirk: my scripts print several status updates to stdout when they hit various points in their 3-10 minute run time, and I'm finding that only the first two or three ever show up, and when they do it is at quite a lag (2-3 minutes). When the job is finished, only the messages that showed up on the console are displayed on the "Results" page. Is this expected behavior?
That's likely due to buffering. How continuous are the messages? Could you elaborate on this statement as well: |
So the messages should come in bunches: two together at once, then a third a second later, then three or four seconds before the fourth, then 60 seconds until the next, then 60 seconds more, then 2-5 minutes, then several at once, then the job completes. I'm only getting the first two, and those only show up when the job is almost done. The rest of the messages never show up; they aren't stored anywhere as part of the job information, so if they don't appear on the console as the job is running, they get dropped and not saved.
Ok, definitely shouldn't be like that. Any way I can see the script you are running? I tested with the 8 queens problem script and the output from that was as intended. |
I can't send you the script because it's driving a piece of hardware that you don't have. However, I put together a quick math-with-pauses script and had the same (partial) result.
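A hypothetical reconstruction of such a math-with-pauses test script (the particular arithmetic and intervals are invented, not the original's):

```python
import time

# Stand-in for the "math-with-pauses" script described above: it prints a
# status line, pauses, and repeats, so console-update timing can be observed
# without any special hardware.
total = 0
for i in range(1, 6):
    total += i * i
    print("step %d: running total is %d" % (i, total))
    time.sleep(0.5)
print("final total: %d" % total)
```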
Also, sometimes it displays the command-line parameters in white (like in the screenshot) and sometimes that line doesn't show up while running but is present if you navigate to the "Results" page for that job. |
Just a heads up I'm trying to find time to resolve this, just busy atm with work. |
Can you give it another try? It should work, in the sense that it will eventually give you the output, but realtime console output isn't really possible on Windows. To get it, you need to unbuffer the output by setting PYTHONUNBUFFERED=1 in your environment. In the future, I'm going to implement custom parameters for the python executable so you may pass '-u' to it if desired on Windows.
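An in-script alternative worth noting (available since Python 3.3) is to flush each status line explicitly, which pushes it through stdout's buffer immediately without touching the environment or the interpreter flags:

```python
import time

# Sketch: flush=True forces each line out immediately, an in-script
# alternative to setting PYTHONUNBUFFERED=1 or running "python -u".
for step in range(3):
    print("status update %d" % step, flush=True)
    time.sleep(0.1)
```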
Thank you so much for working on all these Windows-caused problems. I really appreciate it, especially since it seems I'm your only user on Windows. This is very, very close! The screen updates properly (at a lag, as you mention, which is not a problem for me) with two small caveats. The final zip and tarball packages, and any other file-based results, never show up in the "All Files" section after the script is finished running; you have to navigate to the completed job under the Results tab to find them. And on that final static results page, the information that was displayed in the console has vanished and isn't saved with the job. This doesn't seem to happen with the smaller scripts, though, like the math-based one above.
Thanks. It seems like the hard part is done. I'll try to wrap this up soon! |
I put in a fix for the file issue and took care of an embarrassing use of |
If by latency you mean total time frame, the scripts take 3-10 minutes to run. After updating with your new fixes, I just tried one that prints every 10-30 seconds over 3-4 minutes, and it wrote to the console with very minimal lag (maybe 2 seconds), which is wonderful. However, when it finished and I checked the saved job, the console messages had vanished again. After a bit of testing with the Mandelbrot script, I discovered that the distinguishing factor might be whether there is a saved file involved: no saved file, console readout persists; saved file, no console readout. EDIT: After more testing I don't think this is 100% true.
I tested with a script that outputs files, so it isn't that. Are you using sqlite? Maybe there is an issue with concurrency?
Would it be possible to screencast a session where this happens? |
I believe I am using sqlite, but only one Celery worker due to Windows issues (--pool=solo, as mentioned above). I'm sorry, I can't screencast due to confidentiality. But I did a series of tests and I can isolate it: which print statements output to the console matters. If I eliminate a specific print statement that records the contents of a specific list, the problem goes away. I cannot seem to create a test case that replicates this, however, so I don't know if it's an interaction with other factors. This amount of data does not get saved:
Removing those "Time in Queue" print statements leads to the preservation of the console log. |
That is interesting. Can you paste what print statement is being used? Is it |
This is so strange! Every single output statement I'm using is |
Awesome! I just replicated it with this: |
Yes, that's it exactly! Is this an easy fix, or should I just change all my threaded prints to sys.stdout.write? |
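For anyone following along, the trade-off being asked about can be sketched like this (a hypothetical illustration with invented worker names, not Wooey's actual code): print() may issue separate write() calls for the text and the trailing newline, while a single sys.stdout.write() emits the whole line in one call, which matters when a replacement stdout handles writes from several threads.

```python
import sys
import threading

# Each thread emits its whole line in one write() call, so a line-oriented
# stdout replacement sees complete lines even under concurrency.
def report(i):
    sys.stdout.write("thread %d: time in queue logged\n" % i)

threads = [threading.Thread(target=report, args=(i,)) for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```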
Fixed! It should be good to go now. Thank you so much for your effort! |
Oh great, thanks so much for all your work! |
Fixed in #149 |
I just updated my Wooey install to today's new version and immediately hit a problem. Any script I try to run returns an error (with live updating, which is wonderful!) and in the Celery worker window I get the following message:
I tried downgrading my Celery install from 3.1.23, first to 3.1.15 then to 3.1.12 (which is what I had before the Wooey upgrade) to no avail. Do you have any suggestions?