[20.01] Make watcher more resilient to missing files #9738

Merged
merged 2 commits into from May 8, 2020


nuwang
Member

@nuwang nuwang commented May 7, 2020

This is to solve:

FileNotFoundError: [Errno 2] No such file or directory: '/galaxy/server/database/jobs_directory/000/148/galaxy_148.e'
urllib3.connectionpool DEBUG 2020-05-07 12:26:50,738 https://10.43.0.1:443 "GET /apis/batch/v1/namespaces/initial/jobs?labelSelector=app%3Dgalaxy-galaxy-1588845970-148 HTTP/1.1" 200 145
galaxy.jobs.runners.kubernetes ERROR 2020-05-07 12:26:50,745 No Jobs are available under expected selector app=galaxy-galaxy-1588845970-148
galaxy.jobs.runners ERROR 2020-05-07 12:26:50,746 Unhandled exception checking active jobs
Traceback (most recent call last):
  File "/galaxy/server/lib/galaxy/jobs/runners/__init__.py", line 708, in monitor
    self.check_watched_items()
  File "/galaxy/server/lib/galaxy/jobs/runners/__init__.py", line 735, in check_watched_items
    new_async_job_state = self.check_watched_item(async_job_state)
  File "/galaxy/server/lib/galaxy/jobs/runners/kubernetes.py", line 469, in check_watched_item
    with open(job_state.error_file, 'w') as error_file:
FileNotFoundError: [Errno 2] No such file or directory: '/galaxy/server/database/jobs_directory/000/148/galaxy_148.e'

with open(job_state.error_file, 'w') as error_file:
    error_file.write("No Kubernetes Jobs are available under expected selector app=%s\n" % job_state.job_id)
self.mark_as_failed(job_state)
try:
Good! This should at least keep a missing file from derailing the loop that checks the jobs.
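
For readers following along, here is a minimal sketch of the pattern discussed above. It is not the change merged in this PR: `write_error_file` and `check_all` are hypothetical names and are not part of Galaxy. It assumes the failure comes from opening an error file whose parent directory no longer exists, and shows how catching the exception lets the remaining items still be checked.

```python
import logging
import os
import tempfile

log = logging.getLogger(__name__)


def write_error_file(error_path, message):
    """Try to write `message` to `error_path`; return True on success.

    Hypothetical helper for illustration only, not part of Galaxy.
    """
    try:
        with open(error_path, "w") as error_file:
            error_file.write(message)
        return True
    except OSError as exc:  # FileNotFoundError is a subclass of OSError
        log.warning("Could not write error file %s: %s", error_path, exc)
        return False


def check_all(error_paths, message):
    """Visit every path, even if some error files cannot be written."""
    return [write_error_file(path, message) for path in error_paths]


tmp = tempfile.mkdtemp()
# The parent directories of this path do not exist, so open() raises FileNotFoundError.
missing = os.path.join(tmp, "000", "148", "galaxy_148.e")
# This path sits directly in the temporary directory, so the write succeeds.
present = os.path.join(tmp, "galaxy_149.e")
results = check_all([missing, present], "No Kubernetes Jobs are available\n")
```

Without the `try`/`except`, the first path would raise and the second would never be visited, which mirrors the "Unhandled exception checking active jobs" error in the log above.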

@dannon
Member

dannon commented May 7, 2020

Can you rebase this against the target branch? We had to force push it to correct something that had been erroneously merged.

@nsoranzo nsoranzo changed the title Make watcher more resilient to missing files [20.01] Make watcher more resilient to missing files May 7, 2020
@nuwang nuwang marked this pull request as ready for review May 7, 2020 15:31
@nuwang
Member Author

nuwang commented May 7, 2020

@dannon Done.

@mvdbeek mvdbeek merged commit 9f07c78 into galaxyproject:release_20.01 May 8, 2020
@mvdbeek mvdbeek added this to the 20.05 milestone May 8, 2020
@galaxyproject galaxyproject deleted a comment from galaxybot May 8, 2020
afgane added a commit that referenced this pull request Feb 5, 2021
Nuwan Goonasekera (@nuwang) has been an active member of the Galaxy community for nearly a decade. He has made numerous and significant contributions across the spectrum of Galaxy Project repositories: [bioblend](https://github.com/galaxyproject/bioblend/commits?author=nuwang), [ansible-galaxy](https://github.com/galaxyproject/ansible-galaxy/commits?author=nuwang), [galaxy-helm](https://github.com/galaxyproject/galaxy-helm/commits?author=nuwang), and others.

He actively helps users and other team members by answering questions via chat and issues on GitHub, providing constructive feedback and dedicating time to follow up, e.g. https://gitter.im/galaxyproject/wg-deployment?at=601ae0a655359c58bf1c8ff7, https://gitter.im/galaxyproject/FederatedGalaxy?at=6019abed428d9727dd4ebacb, CloudVE/cloudbridge#258.

He has made a number of improvements to the Galaxy codebase, demonstrating his grasp of and commitment to the core project:
- https://github.com/galaxyproject/galaxy/pulls?page=2&q=is%3Apr+author%3Anuwang
The following pull requests exemplify his contributions over the years:
- #10681
- #9738
- #1814

I propose we add Nuwan to the Galaxy committers group to both recognize his contributions and benefit more directly from his expertise. 

@galaxyproject/core please vote.