Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix harbinger timeout bug and _multipleFileReader docstring (resolves #953, resolves #949) #954

Merged

Conversation

arkal
Copy link
Contributor

@arkal arkal commented Jun 8, 2016

Resolves #953
Resolves #949

Harbinger timeout bug was fixed by adding logic for jobs to ensure that the process downloading the
requested file from the job store is still alive, and to handle the case where it isn't.

Docstring in AbstractCacheTest._multipleFileReader was updated to reflect the changes made to it since
the function was written.

@hannes-ucsc hannes-ucsc changed the title Fix harbinger timeout bug and _multipleFileReader docstring (Resolves #953, #949) Fix harbinger timeout bug and _multipleFileReader docstring (resolves #953 and #949) Jun 8, 2016
@hannes-ucsc hannes-ucsc changed the title Fix harbinger timeout bug and _multipleFileReader docstring (resolves #953 and #949) Fix harbinger timeout bug and _multipleFileReader docstring (resolves #953, resolves #949) Jun 8, 2016
@hannes-ucsc
Copy link
Member

Resolves should be lowercase in first commit message.

pid = int(open(harbingerFileName).read())
except ValueError:
# There's a chance the downloading process died before it could write
# into the harbinger file
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then you have to create the file atomically. How do you know that it didn't write half of the pid which would still parse as an int?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good point

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this still here? With the atomic creation you added, it is highly unlikely that you'll get a ValueError, correct? If so, I would rather let the ValueError through.

@arkal arkal force-pushed the issues/953-fix-harbinger-timeout-bug branch from 36ee67f to c99eeae Compare June 8, 2016 03:37
@arkal arkal removed the needs work label Jun 8, 2016
@arkal
Copy link
Contributor Author

arkal commented Jun 8, 2016

@hannes-ucsc i implemented the suggested changes

@hannes-ucsc
Copy link
Member

Build failure tracked as #955.

@hannes-ucsc
Copy link
Member

@arkal, see line notes.

arkal added 2 commits June 8, 2016 13:09
resolves DataBiosphere#953

The harbinger file now stores the PID of the downloading process and other requesting jobs can handle a
situation where the downloading job dies without deleting the harbinger file.
@arkal arkal force-pushed the issues/953-fix-harbinger-timeout-bug branch from c99eeae to 5496501 Compare June 8, 2016 20:12
@arkal
Copy link
Contributor Author

arkal commented Jun 8, 2016

@hannes-ucsc i implemented the suggested changes

@arkal arkal removed the needs work label Jun 8, 2016
@hannes-ucsc hannes-ucsc merged commit a6ba90c into DataBiosphere:master Jun 8, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Caching hangs on harbinger file in sortTest AbstractCacheTest._multipleFileReader docstring is incorrect
2 participants