-
Notifications
You must be signed in to change notification settings - Fork 3
add file list func #8
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
21d48c8 to
a4addda
Compare
|
Added a test for the local file list function |
dsaps/models.py
Outdated
| for root, dirs, files in os.walk(directory, topdown=True): | ||
| for file in files: | ||
| if file.endswith(file_extension): | ||
| full_file_path = os.path.join(root, file).replace('\\', '/') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the reason for using the replace here? I'm assuming it has something to do with Windows, but I'd like to better understand the requirement this is solving.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is legacy code from my last job that I copied without thinking, it doesn't seem necessary so I'll remove it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In this case, I would just get rid of this function entirely. The standard library already provides the ability to recursively generate a list of files matching a specific extension:
import glob
files = glob.glob('/path/to/search/**/*.pdf', recursive=True)If there are potentially going to be lots (thousands) of files I'd suggest using iglob instead. This functionality is also in the pathlib module.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense, I will save this for when I create a click command for the full workflow. Thanks!
907d9e5 to
327d496
Compare
tests/test_models.py
Outdated
| assert '1234' in child_list | ||
|
|
||
|
|
||
| def test_build_file_list_local(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you remove the function that this is testing this test would obviously no longer be needed. For future reference though, I'd suggest using one of pytest's temp dir fixtures. I've found a few isolated cases where they don't quite fit, but they will generally be a better option than trying to manage temp files and temp dirs on your own.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great, though I hope I don't have consider too many local file operations in the future
2ff9e2d to
63ca607
Compare
aaab6b3 to
be4795e
Compare
What does this PR do?
Adding functions to build a file list from a local file system as well as a web directory. These functions are the first components of a workflow to ingest new content into DSpace. I haven't figured how I want to test the local file system function yet but there is a test for the web directory function.
Includes new or updated dependencies?
YES