Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow spaces in filenames in registry files #315

Merged
merged 4 commits into from
Jul 21, 2022

Conversation

dokempf
Copy link
Contributor

@dokempf dokempf commented Jul 18, 2022

Previously, the tokenization would not allow this. The commit introduces
shlex.split as a tokenizer that is aware of quotes and escapes.

This fixes #313

Previously, the tokenization would not use this. The commit introduces
shlex.split as a tokenizer that is aware of quotes and escapes.
@welcome
Copy link

welcome bot commented Jul 18, 2022

💖 Thank you for opening your first pull request in this repository! 💖

A few things to keep in mind:

No matter what, we are really grateful that you put in the effort to do this!

@dokempf
Copy link
Contributor Author

dokempf commented Jul 18, 2022

I think this is ready for review. I had a look at the CodeCov failure and it seems quite unrelated, the changes on this PR should be all covered.

Copy link
Member

@leouieda leouieda left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @dokempf! A few minor questions and suggestions only. I don't know why codecov failed but maybe it will go away when the CI runs again.

pooch/tests/data/tiny space data.txt Outdated Show resolved Hide resolved
pooch/tests/data/registry-spaces.txt Outdated Show resolved Hide resolved
@dokempf
Copy link
Contributor Author

dokempf commented Jul 20, 2022

@leouieda Thanks for the review, I have integrate your feedback!

@dokempf dokempf requested a review from leouieda July 20, 2022 19:37
@leouieda
Copy link
Member

Thanks @dokempf! Looks great to me. Just waiting on CI to finish 🤖.

@leouieda leouieda merged commit 65d74c9 into fatiando:main Jul 21, 2022
@welcome
Copy link

welcome bot commented Jul 21, 2022

🎉 Congrats on merging your first pull request and welcome to the team! 🎉

If you would like to be added as a author on the Zenodo archive of the next release, add your full name, affiliation, and ORCID (optional) to the AUTHORS.md file of this repository. Feel free to do this in a new pull request if needed.

We hope that this was a good experience for you. Let us know if there is any way that the contributing process could be improved.

@leouieda
Copy link
Member

Thanks for this @dokempf! I'll make a release soon with this and your other PR.

@dokempf
Copy link
Contributor Author

dokempf commented Jul 22, 2022

@leouieda Great, I would try to implement #317 as well before that release.

@mscheltienne
Copy link

I'm running v1.7.0, and it seems like this is not fully resolved.
Filename: "LICENSE (copy)", with a space.

Run: pooch.make_registry(DATASET, output=REGISTRY, recursive=True) with DATASET the path to the folder containing LICENSE (copy) and more. It results in this registry:

LICENSE 1f7e3edb3fc584df7f618e17cef4befa18232b99302dd8d55e519ab3d9e028b4
LICENSE (copy) 1f7e3edb3fc584df7f618e17cef4befa18232b99302dd8d55e519ab3d9e028b4
ssp/ssp_0_230120.fif b34bd4c97e8e32854c40dd01025bb5f236f6def3cf77fbb16b8e4c0f0c3e69d1
ssp/ssp_0_ias_230120.fif ba8d51427be963268884f6dc2b994b494268e1e80648aaf8cffca2a01240cb19
ssp/ssp_60_230120.fif 9465dcabfb9808b13eced54a644b9104f3d5d8b348bb4315f76dd572eff35ef2
ssp/ssp_60_ias_230120.fif 14f55a49ae8fba62c868fb117daf902f32981cecf7bc8675a4e21e44018d9ce8
ssp/ssp_68_230120.fif c6b5f56a30582b930e5c83f4404c82579929fde8cc650d062dc33429e95ebc4f
ssp/ssp_68_ias_230120.fif 379ea05dc48869c57368d354f17e65e943d555322d63b3b5caba3ae7221374cf
version.txt e2556a181068db2c7e3b2b127de33540448820fb1e97da29239833b6a8e09764

And now fetcher.load_registry(REGISTRY) with fetcher being an instance of Pooch yields:

fetcher.load_registry(REGISTRY)
fetcher.registry

>>>
{'LICENSE': '(copy)',
 'ssp/ssp_0_230120.fif': 'b34bd4c97e8e32854c40dd01025bb5f236f6def3cf77fbb16b8e4c0f0c3e69d1',
 'ssp/ssp_0_ias_230120.fif': 'ba8d51427be963268884f6dc2b994b494268e1e80648aaf8cffca2a01240cb19',
 'ssp/ssp_60_230120.fif': '9465dcabfb9808b13eced54a644b9104f3d5d8b348bb4315f76dd572eff35ef2',
 'ssp/ssp_60_ias_230120.fif': '14f55a49ae8fba62c868fb117daf902f32981cecf7bc8675a4e21e44018d9ce8',
 'ssp/ssp_68_230120.fif': 'c6b5f56a30582b930e5c83f4404c82579929fde8cc650d062dc33429e95ebc4f',
 'ssp/ssp_68_ias_230120.fif': '379ea05dc48869c57368d354f17e65e943d555322d63b3b5caba3ae7221374cf',
 'version.txt': 'e2556a181068db2c7e3b2b127de33540448820fb1e97da29239833b6a8e09764'}

Which is wrong.

@dokempf
Copy link
Contributor Author

dokempf commented Jul 28, 2023

@mscheltienne You need to adapt your registry file to put ".." around the filename that contains spaces or use \ to escape the whitespace character. I guess there is no way around this limitation as the parsing would be ambiguous otherwise.

@mscheltienne
Copy link

I agree, but that should be done automatically by pooch since this registry file was created with pooch.make_registry.

@dokempf
Copy link
Contributor Author

dokempf commented Jul 28, 2023

Oh, sorry for my poor reading skills. I agree, pooch should output a quoted filename in this case.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Registry files cannot handle file names containing spaces
3 participants