[hackathon/real life] non-cvmfs version of pilot does not run at RAL-LCG2 #7657
Comments
Isn't this solved by the last DIRACOS release?
Why did it show up in the hackathon then?
The last release was created only yesterday: https://github.com/DIRACGrid/DIRACOS2/releases/tag/2.42
Keep the ticket open until the workshop, when we do another hackathon?
We are facing a similar problem with the latest tag, 2.43, but only at some sites. I don't know whether the error was also present before. Here is the error we get:
Any suggestions? Thank you.
The solution is somewhere in mamba-org/mamba#2501. Run with
Yes, thank you, but this means that ulimit must be changed by site admins, right?
We can limit it in the pilot.
OK, but how can I do it?
You do not have to do anything: #7891
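For context on why this can be handled in the pilot rather than by site admins: on POSIX systems an unprivileged process may lower its own soft resource limits, and child processes started afterwards inherit them. The sketch below illustrates that mechanism only. It is not taken from #7891, and the thread does not say which limit or value is involved; `RLIMIT_NOFILE`, the ceiling of 1024, and the helper name `cap_soft_limit` are all assumptions made for illustration.

```python
import resource


def cap_soft_limit(which: int, ceiling: int) -> tuple[int, int]:
    """Lower the soft limit for `which` to `ceiling` if it is higher.

    Hypothetical helper: never raises a limit and never touches the
    hard limit, so it needs no special privileges.
    """
    soft, hard = resource.getrlimit(which)
    # An "unlimited" soft limit is reported as RLIM_INFINITY, which is
    # not guaranteed to compare as larger than `ceiling`, so check it
    # explicitly.
    if soft == resource.RLIM_INFINITY or soft > ceiling:
        resource.setrlimit(which, (ceiling, hard))
    return resource.getrlimit(which)


# Assumption: the open-files limit is the one of interest, and 1024 is
# an arbitrary illustrative ceiling.
soft, hard = cap_soft_limit(resource.RLIMIT_NOFILE, 1024)
print(f"soft={soft} hard={hard}")
```

Anything launched from the same process after this call (for example via `subprocess`) starts with the lowered soft limit, which is the sense in which a wrapper could apply a limit without changes to the site configuration.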
OK, thank you
During the hackathon, pilot jobs at RAL-LCG2 kept failing. I was not able to retrieve the logs of the failed jobs, but I managed to retrieve the following excerpts from the running jobs:
pilot.log
pilot.error
We've seen the same issue on our production instance, and we are working around it by getting the pilot off cvmfs.
Simon thinks this might be related to:
mamba-org/mamba#2501
Note that this behaviour produces several hundred jobs per hour that then fail, and that this is how my DN got banned at RAL before. (Hence, killing all user jobs targeting RAL before leaving the hackathon is a necessity.)