OSError: Apache/mod_wsgi log object is not associated with a file descriptor #890
Comments
Whatever Python package you are using expects to only ever be run in a process context where stdout/stderr are linked to a tty device or file. Python doesn't technically guarantee that, and a file-like object is not obligated to provide a Can you paste (as text, not an image) the full stack trace so it is possible to see which Python package you are using? If it is your own code, it needs to be able to deal with
Thanks for the fast reply. This project does web scraping using an OpenAI key, and I am providing sample code so that you can understand it.
It is also showing this: The error "OSError: Apache/mod_wsgi log object is not associated with a file descriptor" typically occurs when a Python script or application, in this case a Ray program, is trying to use a logger that is not compatible with the mod_wsgi environment.
I still need to see the full Python stack trace from the Apache error log to understand what the calling sequence is.
root@vps:/var/www/recruitment-app/RECRUITMENT |
The problem is here: That package takes code from faulthandler, which in the past was known not to cope well when stdout/stderr were not associated with a file descriptor. They then used that same bad practice themselves to get a file descriptor to use as stderr for a sub process. This is a bad way of doing things. They shouldn't override stderr at all and should just let the sub process inherit the process state for it. I have no idea whether it will work or not, but you might be able to set the following directive in Apache, although I can't remember what the implications for logging by Apache will be if that is done:
WSGIRestrictStdout Off
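As a rough illustration of the "just inherit the process state" advice, here is a minimal sketch using only the standard library `subprocess` module. The helper name `run_child` and the child command are hypothetical and are not taken from the package under discussion. Because no `stderr` argument is passed, the child simply inherits the parent's file descriptor 2, and `sys.stderr.fileno()` is never called.

```python
import subprocess
import sys


def run_child(args):
    """Run a child process without overriding its stderr (hypothetical helper).

    stderr is deliberately left unspecified, so the child inherits whatever
    file descriptor 2 refers to in the parent. The Python-level sys.stderr
    object is never asked for a file descriptor.
    """
    return subprocess.run(args, stdout=subprocess.PIPE, check=True)


# Example child command (an assumption for illustration only).
completed = run_child([sys.executable, "-c", "print('hello from child')"])
print(completed.stdout.decode().strip())
```

By contrast, passing `stderr=sys.stderr` would make `subprocess` call `sys.stderr.fileno()`, which is the kind of call that fails when the stream is a log object with no file descriptor behind it.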
I have no idea what Playwright is for. I am presuming that where it is failing has nothing to do with logging as such. There wouldn't be any point in changing Of note, faulthandler in the Python standard library is now implemented in C code and is tolerant of any exception when accessing The C version is closer to doing:
In other words, on any exception, ignore it and try the alternate lookup. I realise you can't readily fix the Playwright code exception by using
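The "on any exception, ignore it and try the alternate lookup" idea could be sketched in Python roughly as follows. This is an illustration only: the helper name `get_stderr_fileno` is hypothetical, and treating `sys.__stderr__` as the alternate lookup is an assumption rather than something copied from faulthandler or Playwright.

```python
import sys
from typing import Optional


def get_stderr_fileno() -> Optional[int]:
    """Return a file descriptor for stderr, or None if none can be found.

    Hypothetical helper: tries sys.stderr first, then sys.__stderr__
    (assumed here to be the alternate lookup). Any exception at all,
    including OSError from a log object with no descriptor, is ignored
    and the next candidate is tried.
    """
    for stream in (sys.stderr, getattr(sys, "__stderr__", None)):
        try:
            return stream.fileno()
        except Exception:
            # Covers None streams, missing fileno(), and OSError alike.
            continue
    return None
```

The point of catching `Exception` rather than a short list of exception types is that a replacement stream object is free to raise whatever it likes from `fileno()`.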
Okay, looking at Playwright, it is forking browser processes. Doing that inside a web server such as Apache is a really bad idea. You should look at changing the architecture and use a task queueing system like Celery to provide a distinct service which you can ask to do your scraping, and have the web service part talk to that to do the work and wait for results. In other words, creating major sub processes from web applications is generally not recommended, since sub processes would inherit a lot of strange state from the web server processes, e.g. open socket connections for incoming requests, and so could interfere with the operation of the web server. Thus one usually farms out such work to a separate independent service.
So which web server would you recommend?
It is not a case of which web server, but the nature of any front end web server, which makes it not necessarily a suitable host for doing significant forked sub process execution. The problem is that web servers are usually handling multiple concurrent socket connections from remote HTTP clients. Web servers can have strange setups for log files, including piped loggers or in-process log file rotation mechanisms. And finally, web servers can have multiple worker processes which are spun up and destroyed dynamically as necessary to handle requests. The consequence of fork/execing a non-trivial application is that those sub processes would normally inherit all the open file descriptors/sockets of the parent web server process. In the worst case, a non-trivial forked application could interfere with the operation of the web server by interacting with those inherited open connections. You could also have issues where your application assumes only one instance of a forked sub process will be run at a time, since the web server may have multiple worker processes and thus you could have more than one. Finally, if you expect your forked sub process to keep running indefinitely, you may have issues, as it could get killed off when the web server decides to shut down worker processes. For that reason, you are better off not directly forking complicated sub processes out of your main public web server application. Instead, create a separate, more constrained service application whose task is to run specific jobs independent of whether or not they are being done as part of a web request. Then have the front end web server make requests to that service application to do the work. One way of implementing such a service application for running the jobs is to use an application server such as Celery (https://docs.celeryq.dev/en/stable/).
If you really still wanted to implement your application task server as a web service, then use a very lightweight single process web server instead. This should still sit behind the front end web server though, as these lightweight servers often aren't as secure and robust as your main web server. For this you might use aiohttp. So it is an architectural design issue. If you don't care and
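To make the "lightweight single process web service behind the front end" idea concrete, here is a minimal sketch. It swaps in the standard library `http.server` purely for illustration, not the server suggested above, and `run_job`, `start_job_service`, `submit_job`, and the JSON payload shape are all hypothetical names and choices. In a real deployment the job service would run as its own process, started independently of Apache; it is run in a background thread here only so the sketch is self-contained.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


def run_job(url: str) -> dict:
    # Placeholder for the real work (for example, the scraping step).
    return {"url": url, "status": "done"}


class JobHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read the JSON job description sent by the front end application.
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length) or b"{}")
        body = json.dumps(run_job(payload.get("url", ""))).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the sketch quiet


def start_job_service(host: str = "127.0.0.1", port: int = 0) -> HTTPServer:
    # Single process, one request at a time; port 0 picks a free port.
    server = HTTPServer((host, port), JobHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def submit_job(base_url: str, url: str, timeout: float = 30.0) -> dict:
    # What the front end web application would call instead of forking.
    request = urllib.request.Request(
        base_url,
        data=json.dumps({"url": url}).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    # The service is local, so bypass any proxy configured in the environment.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(request, timeout=timeout) as response:
        return json.load(response)
```

The web application running under mod_wsgi would then only call something like `submit_job("http://127.0.0.1:<port>/", target_url)` and wait for the result, so no browser or other heavy sub process is ever forked from an Apache worker.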
Thank you for your patience and response. You are doing a great job.