Error when downloading Landsat with force-level1-landsat search #262

Closed
JariPekko opened this issue Jan 4, 2023 · 10 comments
Labels
non-critical error Error/Warning that doesn't affect job or results

Comments

@JariPekko

Hi!
I get the following error message when downloading Landsat images from USGS with force-level1-landsat search, and I don't know whether it's a problem.

Error message

Exception in thread Thread-3:
Traceback (most recent call last):
  File "/usr/lib/python3.8/threading.py", line 932, in _bootstrap_inner
    self.run()
  File "/usr/lib/python3.8/threading.py", line 870, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/lib/python3.8/multiprocessing/pool.py", line 592, in _handle_results
    cache[job]._set(i, obj)
  File "/usr/lib/python3.8/multiprocessing/pool.py", line 776, in _set
    self._callback(self._value)
  File "/usr/local/lib/python3.8/dist-packages/landsatlinks/download.py", line 79, in callback
    create_force_queue(url, output_dir, queue_fp)
  File "/usr/local/lib/python3.8/dist-packages/landsatlinks/download.py", line 44, in create_force_queue
    scene_name = f'{re.search(utils.PRODUCT_ID_REGEX, url).group(0)}.tar'
AttributeError: 'NoneType' object has no attribute 'group'

I used the following command:
dforce force-level1-landsat search Landsat_tiles.txt images/ --cloudcover 0,70 --queue-file queue.txt --secret usgs_m2m_access.txt --download

Behaviour
The download starts as expected and images are downloaded. After a few minutes the error message above appears, but the process is not aborted and the download continues.

Setup
FORCE version 3.7.10 using Docker
Ubuntu 20.04.5 LTS
Linux Server
500G RAM, 80 CPUs

Question
Do I have to worry?
Is it just a warning that a URL didn't work?

@ernstste
Collaborator

ernstste commented Jan 4, 2023

Hi Jari,

the error occurs when trying to add the scene that was just downloaded to the QUEUE file, so you probably don't need to worry about the download itself.
It's not easy to say what the issue is with the information at hand. Does this only happen once? Can you specify which scene it was?

Thanks,
Stefan

@JariPekko
Author

Hi Stefan,

thanks for the quick reply. So the download itself works correctly, but the QUEUE file is not updated correctly?
I just counted the files in the download dir (6129) and the lines of the QUEUE file (5567), in case this is helpful.

I think so far the error has happened only once each time I started the process.
I don't know which scene caused it, but I can give you all the scenes I'm trying to download.

Sensor(s): TM, ETM, OLI
Tile(s): 171074,172074,172075,173074,173075,174074,174075,175074,175075,176073,176074,176075,177070,177071,177072,177073,177074,177075,178070,178071,178072,178073,178074,178075,179070,179072,179073,179074,179075,180072,180073
Date range: 1970-01-01 to 2023-01-04
Included months: 1,2,3,4,5,6,7,8,9,10,11,12
Cloud cover: 0% to 70%

20793 Landsat Level 1 scenes matching criteria found
10.97 TB data volume found
5850 product bundles found in output directory, 14943 not downloaded yet.
Remaining download size: 9.78 TB
Downloading:   1%|=>                                                     | 102/14909 [18:55<42:35:28, 10.36s/product bundle]
Downloading:   1%|==                                                     | 152/14909 [27:11<30:28:13,  7.43s/product bundle]

@davidfrantz
Owner

Could this be a file conflict caused by downloading images in parallel?

@ernstste
Collaborator

ernstste commented Jan 4, 2023

@JariPekko Thanks, it looks like there is definitely an issue with writing the file queue. Please make sure to create the queue for processing yourself before starting the Level 2 processing.

@davidfrantz There is potential for this to happen in the current version. However, according to the traceback the issue here is that the callback function (called after downloading a scene) isn't getting the url passed on properly.
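
The AttributeError in the traceback is what Python raises when re.search finds no match and returns None, and .group() is then called on that None. A minimal sketch of a guarded lookup follows. The regex and the function name scene_name_from_url are assumptions made for illustration; the actual utils.PRODUCT_ID_REGEX in landsatlinks may look different.

```python
import re
from typing import Optional

# Hypothetical pattern for a Landsat Level 1 product ID.
# The real utils.PRODUCT_ID_REGEX in landsatlinks may differ.
PRODUCT_ID_REGEX = re.compile(
    r"L[COTEM]\d{2}_L1[A-Z]{2}_\d{6}_\d{8}_\d{8}_\d{2}_(?:T1|T2|RT)"
)


def scene_name_from_url(url: Optional[str]) -> Optional[str]:
    """Return '<product id>.tar', or None if the URL holds no product ID."""
    if not url:
        return None
    match = PRODUCT_ID_REGEX.search(url)
    if match is None:
        # re.search returns None when nothing matches; calling .group()
        # on that None is what produces the AttributeError above.
        return None
    return f"{match.group(0)}.tar"
```

With a guard like this, a callback could log the offending URL and skip the queue entry instead of raising inside the result-handler thread.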

@davidfrantz davidfrantz added the non-critical error Error/Warning that doesn't affect job or results label Jan 5, 2023
@ernstste
Collaborator

ernstste commented Jan 6, 2023

I have run several tests and was unfortunately not able to reproduce the issue.

However, the way that the force queue file is created has been reworked to make sure that there aren't conflicts due to parallel access of processes on the same file. Instead of using a callback, we now use multiprocessing.Queue and a dedicated process that listens for results of the other processes and writes the queue file.

@JariPekko maybe you can try to pull the latest davidfrantz/force:latest image and let us know if that solves your issue? Thanks!
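
The reworked design described above (a multiprocessing.Queue plus one dedicated process that writes the queue file) can be sketched roughly as follows. This is not the landsatlinks implementation: download_scene is a stand-in that downloads nothing, all function names are hypothetical, and the "path QUEUED" line format is an assumption about the FORCE queue file.

```python
import multiprocessing as mp
from pathlib import Path

SENTINEL = None  # tells the writer that no more results will arrive


def download_scene(scene_name, output_dir, result_queue):
    """Stand-in for a real download: only reports the target path."""
    scene_path = Path(output_dir) / f"{scene_name}.tar"
    result_queue.put(str(scene_path))


def write_queue_file(result_queue, queue_fp):
    """Sole writer: append one line per finished download."""
    with open(queue_fp, "a") as f:
        while True:
            item = result_queue.get()
            if item is SENTINEL:
                break
            # Assumed line format: '<scene path> QUEUED'
            f.write(f"{item} QUEUED\n")
            f.flush()


def run_downloads(scene_names, output_dir, queue_fp):
    result_queue = mp.Queue()
    writer = mp.Process(target=write_queue_file, args=(result_queue, queue_fp))
    writer.start()
    # One process per scene keeps the sketch short; real code would
    # more likely use a fixed-size pool of workers.
    workers = [
        mp.Process(target=download_scene, args=(name, output_dir, result_queue))
        for name in scene_names
    ]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    result_queue.put(SENTINEL)  # all workers are done; stop the writer
    writer.join()
```

Because only the writer process ever opens the queue file, the download workers cannot interleave or overwrite each other's lines. When run as a script, the call to run_downloads should sit under an if __name__ == "__main__": guard so that it also works with the spawn start method.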

@ernstste ernstste self-assigned this Jan 6, 2023
@JariPekko
Author

I pulled the latest image and used it without changing anything else, and it seems to be working as intended now.
The download has been running for 1h now with no error.

A small update on how the process behaved before the latest davidfrantz/force:latest image:
The downloads did stop completely at some point. The process was still running, but nothing was downloaded for several hours. After aborting manually and starting again it was always the same pattern:
Download works as intended -> after a short while the error message from above appears but the download is still ongoing -> some time later the download stops

Thanks, and I'll post an update about how it went

@JariPekko
Author

I'm happy to report that the download went flawlessly and rather quickly. In one day the ~11TB were downloaded.

However, the QUEUE file didn't seem to update at all. I downloaded 20792 scenes (as requested, minus 1), but the QUEUE file had only 5771 lines, the same number it had before I switched to the new davidfrantz/force:latest image. The QUEUE file had to be written manually afterwards.

Thanks a lot for your quick help!!
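
For anyone who also has to write the QUEUE file by hand, here is a minimal sketch that appends an entry for every downloaded archive not yet listed. It assumes that downloads are .tar files somewhere below the download directory and that each queue line has the form "path STATUS"; rebuild_queue is a hypothetical helper, not part of FORCE or landsatlinks. Paths are resolved to absolute paths, so inside Docker they need to match the paths the container sees.

```python
from pathlib import Path


def rebuild_queue(download_dir, queue_fp, status="QUEUED"):
    """Append a queue line for every .tar below download_dir not yet listed.

    Returns the number of lines added.
    """
    queue_path = Path(queue_fp)
    listed = set()
    if queue_path.exists():
        for line in queue_path.read_text().splitlines():
            if line.strip():
                # Assumed layout: '<scene path> <status>'
                listed.add(line.rsplit(" ", 1)[0])
    missing = sorted(
        str(p.resolve())
        for p in Path(download_dir).rglob("*.tar")
        if str(p.resolve()) not in listed
    )
    with queue_path.open("a") as f:
        for scene in missing:
            f.write(f"{scene} {status}\n")
    return len(missing)
```

Calling rebuild_queue("images/", "queue.txt") leaves existing lines and their status untouched, so running it a second time adds nothing.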

@ernstste
Collaborator

ernstste commented Jan 9, 2023

Thanks for the feedback Jari!

I also noticed that the download speed has improved by orders of magnitude. I hope there have been changes to the infrastructure and it will stay like this now.

Glad to hear the issue is solved! To be honest, I'm a bit puzzled that the queue file wasn't updated in your case. This was tested successfully here, and someone who contacted me privately about the same issue had no problems writing the queue file after the update. Could the file have been locked by another process?

@JariPekko
Author

I'm new to Linux and may be overlooking something, but I can't think of another process that would have locked the QUEUE file. I stopped the download process (force-level1-landsat search --download) that was running from before the update. The only other command I ran involving the QUEUE file was an occasional line count with wc -l queue.txt.

For testing purposes I just downloaded another scene with a new QUEUE file and new directories. This time the QUEUE file was updated correctly.

@ernstste
Collaborator

ernstste commented Jan 10, 2023

Good to hear, thanks.

Leaving the commit for reference and closing this as completed.
Feel free to re-open if needed.
