`yield_checksumfiles` function doesn't handle ChecksumFiles with file paths as their name correctly #666
Comments
I have this fixed on a local branch; once I add a regression test I'll open a PR for it.
@mvandenburgh and I talked offline, @mvandenburgh is going to try to come up with another example for me to think about this from another angle.
@banesullivan I looked more into this a bit, and there does indeed seem to be a bug, but I think my original theory of what was causing it was wrong. I still have some investigation to do, but so far it seems like the messed up file path happens during file ingestion, not during download. Unfortunately I messed around with the caching too much on my local environment and seem to have completely broken it, so I'll need to rebuild my docker setup from scratch. I can carve out some time tomorrow to do that and hopefully conclude this.
I think I've figured out the problem - it's coming from the I think what's happening is this -
And Does that make sense @banesullivan?
Aha, thanks for looking into this further and finding the culprit. I will review your PR and help to land this fix.
Well, I put in the fix I suggested and added some tests for this edge case, and they fail. It seems like I was actually right the first time (:sweat_smile:), and the bug is related to the
Say we call this on a So there's two possible solutions - one is to do what I did originally, which is to strip the @banesullivan I believe you wrote most/all of the file caching code, do you have any thoughts on which approach makes sense?
@mvandenburgh, I have pushed to your #667 in a169f58 and e6dd7d5 which should fix this issue. The problem was that `ResonantGeoData/django-rgd/rgd/models/file.py` (line 290 in 460aabc) is mounting under the root cache directory and handles nesting the file with its relative path in the
To reproduce, run this management script and then try to view the `Raster` that it creates. The tile requests will fail with 500 errors with errors about wrong file paths in the server log.

I came across this bug when working on Danesfield. On that project we encode the file path of a given `ChecksumFile` inside the `name` column - for example, `foo/bar/foobar.txt` represents a file hierarchy of

This line in RGD is incompatible with this approach, since it assumes the ChecksumFile name is a flat file name instead of a file path. For example, if we have a ChecksumFile with the name `foo.tiff`, the current code would work:

Note that `path.parent` would be `/tmp/rgd/file_cache/foo.tiff`, which is correct. But if the file name is `a/b/c/foo.tiff`:

the call to `path.parent` would evaluate to `/tmp/rgd/file_cache/a/b/c`, which is incorrect and leads to 500 errors since the server can't find the files in that location.
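
For anyone following along, here is a minimal standalone sketch of the `path.parent` behavior described above. It is not the RGD code and not the fix from #667. The directory `/tmp/cache/file-1` and both helper functions are hypothetical, made up only to show why `.parent` stops pointing at the per-file cache directory once the name contains slashes, and one possible way to recover that directory from the relative name.

```python
from pathlib import PurePosixPath


def dir_via_parent(file_dir: str, name: str) -> PurePosixPath:
    """Naive approach: assume the file sits directly inside ``file_dir``."""
    path = PurePosixPath(file_dir) / name
    return path.parent


def dir_via_relative_name(file_dir: str, name: str) -> PurePosixPath:
    """Walk up one level per component of ``name`` to get back to ``file_dir``."""
    path = PurePosixPath(file_dir) / name
    depth = len(PurePosixPath(name).parts)  # 1 for a flat name, 4 for a/b/c/foo.tiff
    return path.parents[depth - 1]


# Hypothetical per-file cache directory, for illustration only.
file_dir = "/tmp/cache/file-1"

flat = dir_via_parent(file_dir, "foo.tiff")
nested = dir_via_parent(file_dir, "a/b/c/foo.tiff")
recovered = dir_via_relative_name(file_dir, "a/b/c/foo.tiff")

print(flat)       # /tmp/cache/file-1
print(nested)     # /tmp/cache/file-1/a/b/c
print(recovered)  # /tmp/cache/file-1
```

With a flat name both helpers agree. With a nested name only the second one returns the directory the file was mounted under, which is the mismatch that shows up as wrong file paths in the server log.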