Fetcher downloads data every time it is called #1638

Closed
BramshQamar opened this issue Sep 13, 2018 · 10 comments

Comments

@BramshQamar
Contributor

Every fetcher in fetcher.py downloads its data every time it is called, even when the data is already in place.

For example:

from dipy.data.fetcher import fetch_scil_b0, read_siemens_scil_b0
fetch_scil_b0()

Output:

Data size is approximately 9.2MB
Downloading "datasets_multi-site_all_companies.zip" to /Users/bramshqamar/.dipy
Download Progress: [##################################] 100.00% of 9.19 MB
Files successfully downloaded to /Users/bramshqamar/.dipy
Out[1]:
({'datasets_multi-site_all_companies.zip': ('https://digital.lib.washington.edu/researchworks/bitstream/handle/1773/38479/datasets_multi-site_all_companies.zip', None)},
 '/Users/bramshqamar/.dipy')

Let's fetch again:
fetch_scil_b0()

Output:

Data size is approximately 9.2MB
Downloading "datasets_multi-site_all_companies.zip" to /Users/bramshqamar/.dipy
Download Progress: [##################################] 100.00% of 9.19 MB
Files successfully downloaded to /Users/bramshqamar/.dipy
Out[2]:
({'datasets_multi-site_all_companies.zip': ('https://digital.lib.washington.edu/researchworks/bitstream/handle/1773/38479/datasets_multi-site_all_companies.zip', None)},
 '/Users/bramshqamar/.dipy')

This is the case with every fetcher.

@BramshQamar
Contributor Author

I am on a Mac, with Python 3.6.1, NumPy 1.12.1, and NiBabel 2.3.0.

@arokem
Contributor

arokem commented Sep 18, 2018

Is the data still on your hard drive between calls? Look in ~/.dipy.

@skoudoro
Member

This dataset does not have an MD5 checksum. Could that be the problem, @arokem?

@arokem
Contributor

arokem commented Sep 18, 2018

Might be. I can replicate this bug with the SCIL b0 dataset, but not with other datasets. @BramshQamar: Did you also experience this with other datasets?

I bet we need to change something around this line: https://github.com/nipy/dipy/blob/master/dipy/data/fetcher.py#L175
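
For readers following along, here is a minimal sketch of how a missing checksum could produce this behaviour. It assumes the skip check requires both that the file exists and that its MD5 equals the registered value. The function names are hypothetical, and this is not a copy of dipy's code.

```python
import hashlib
import os


def file_md5(path):
    """Return the hex MD5 digest of the file at `path` (reads it whole)."""
    with open(path, "rb") as f:
        return hashlib.md5(f.read()).hexdigest()


def needs_download(path, expected_md5):
    """Hypothetical strict check: skip only if the file exists AND hashes match.

    When `expected_md5` is None (no checksum registered), the comparison
    can never succeed, so the file is fetched again on every call.
    """
    if os.path.exists(path) and file_md5(path) == expected_md5:
        return False
    return True


def needs_download_lenient(path, expected_md5):
    """One possible alternative: trust an existing file if no checksum is known."""
    if not os.path.exists(path):
        return True
    if expected_md5 is None:
        # Assumption: without a checksum we cannot verify, so we keep the file.
        return False
    return file_md5(path) != expected_md5
```

With the strict check, a dataset registered with `None` as its checksum always looks stale. The lenient variant avoids the repeated download but cannot detect a corrupted or changed file, which is why registering a real checksum is the safer option.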

@skoudoro
Member

> I bet we need to change something around this line:

Or update the dataset. Having a checksum such as MD5 should be good practice, right? It lets us make sure that the dataset has not changed.

What do you think?

@arokem
Contributor

arokem commented Sep 18, 2018

Yes. Even better.

@skoudoro
Member

So, can you update it? I do not know how to get access to https://digital.lib.washington.edu/

@arokem
Contributor

arokem commented Sep 18, 2018

I don't think you need to access that website. Just add this line: #1643

@BramshQamar
Contributor Author

I was testing with some fetchers I wrote and with fetch_scil_b0. I will add MD5 checksums to my fetchers and test them.
Thank you, @arokem and @skoudoro.
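
In case it helps anyone adding a checksum to their own fetcher, below is a rough sketch of computing the MD5 of an already-downloaded file with the standard-library `hashlib`. The `{filename: (url, md5)}` shape mirrors the return value printed in the original report. The helper names are made up for illustration, and how the checksum is registered inside dipy is not shown in this thread.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # iter() with a sentinel stops when read() returns an empty bytes object.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def make_entry(filename, url, local_path):
    """Hypothetical helper: build a {filename: (url, md5)} mapping.

    `local_path` points to a trusted local copy of the file at `url`.
    """
    return {filename: (url, md5_of_file(local_path))}
```

Reading in chunks keeps memory use flat for large archives. The resulting hex string is the value that would replace the `None` seen in the output above.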

@skoudoro
Member

Fixed by #1643.
