
Download and multiprocessing updates #226

Merged — 4 commits into OGGM:master, May 4, 2017

Conversation

@TimoRoth (Member) commented May 3, 2017

Goes back to the previous download behavior, while keeping the changed config intact.

While testing, I also found that it'd be nice to see which glacier crashed in which task while multiprocessing, so I added a wrapper-exception around it with the rgi_id and task name if those are available.
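The wrapper-exception idea can be sketched as follows. This is a hypothetical illustration, not OGGM's actual implementation; the names `TaskError` and `run_task` are invented for this sketch:

```python
class TaskError(Exception):
    """Carries the glacier ID and task name alongside the original error."""

    def __init__(self, rgi_id, task_name, original):
        self.rgi_id = rgi_id
        self.task_name = task_name
        self.original = original
        super().__init__(
            f"Task {task_name!r} failed on glacier {rgi_id!r}: {original}"
        )


def run_task(task, gdir):
    """Run a task on a glacier directory, wrapping any failure so the
    glacier ID and task name survive the trip back from a worker process."""
    try:
        return task(gdir)
    except Exception as err:
        rgi_id = getattr(gdir, "rgi_id", None)
        task_name = getattr(task, "__name__", "?")
        raise TaskError(rgi_id, task_name, err) from err
```

With a pattern like this, the traceback printed by the parent process names the glacier and task instead of an anonymous worker failure.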

@fmaussion (Member)

> While testing, I also found that it'd be nice to see which glacier crashed in which task while multiprocessing, so I added a wrapper-exception around it with the rgi_id and task name if those are available.

Yes, there are plenty of things that would be nice to know after a run. I forgot to write them down, but I'm going to do new runs soon, and maybe we can define a strategy for better runs.

One of the things I'd really like to have is a list of the tasks successfully applied to a glacier. Then, if you want to apply the same task again on that glacier, it is ignored unless forced by the user.
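That skip-unless-forced behaviour could be sketched with a decorator that records successful tasks on the glacier directory. Everything here is illustrative (the decorator name, the `task_log` attribute, and the `force` keyword are assumptions, not the project's actual API):

```python
import functools


def entity_task(func):
    """Record successful runs of `func` on a glacier directory and skip
    repeat applications unless force=True is passed."""

    @functools.wraps(func)
    def wrapper(gdir, *args, force=False, **kwargs):
        log = getattr(gdir, "task_log", None)
        if log is None:
            log = gdir.task_log = set()
        if func.__name__ in log and not force:
            return None  # already applied successfully: skip
        out = func(gdir, *args, **kwargs)
        log.add(func.__name__)  # only reached if the task succeeded
        return out

    return wrapper
```

A failed task never reaches the `log.add` line, so it stays eligible for a retry on the next run.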

@fmaussion (Member)

I'm not sure why the test is suddenly failing; I think we can merge this.

@TimoRoth (Member, Author) commented May 3, 2017

I have one more idea for an addition, the test failure seems unrelated.

@TimoRoth (Member, Author) commented May 3, 2017

This adds a parameter to override the cache layout: if some data files are usually stored in a specific layout, that layout can now be declared in the configuration instead of manually intervening or copying the files around.
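One way such a cache-layout parameter could look is a template string that maps a URL to a relative path inside the cache. This is purely illustrative; the function name and template syntax are assumptions, not the PR's actual interface:

```python
import os
from urllib.parse import urlparse


def cached_path(cache_dir, url, layout="{host}/{path}"):
    """Resolve where a downloaded file lands in the local cache.

    `layout` is a format string with `host` and `path` placeholders,
    so a non-default on-disk layout can be declared instead of moving
    files around by hand.
    """
    parts = urlparse(url)
    rel = layout.format(host=parts.netloc, path=parts.path.lstrip("/"))
    return os.path.join(cache_dir, rel)
```

For example, `layout="{path}"` would drop the hostname directory level and store files under their URL path only.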

@fmaussion (Member)

@TimoRoth if you have time could you look at this PR on xarray? pydata/xarray#1393

Here they implement a relatively simple checksum to make the download algorithm more robust. Like Stephan, I am quite surprised that it's not implemented in urlretrieve by default, but if it can make our downloads more robust...

@TimoRoth (Member, Author) commented May 3, 2017

That's a special case for downloads from GitHub, since GitHub can provide a checksum for every file.
As most (all but one) of our downloads are not from GitHub, we can't do something like that.
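For the cases where an expected checksum is known up front, verification after download could look like this. This is a generic hashlib sketch, not the urlretrieve-level hook discussed in the xarray PR, and `verify_file` is an invented name:

```python
import hashlib


def verify_file(path, expected_sha256):
    """Return True if the file at `path` hashes to `expected_sha256`.

    Reads in chunks so large downloads don't have to fit in memory.
    """
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            h.update(chunk)
    return h.hexdigest() == expected_sha256
```

A download routine could call this after fetching and retry (or delete the cached copy) on a mismatch, which is essentially the robustness the xarray PR is after.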

@fmaussion fmaussion merged commit c7a08b9 into OGGM:master May 4, 2017