Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a folder of outputs to acro #130

Merged
merged 15 commits into from
Aug 8, 2023
Merged

Add a folder of outputs to acro #130

merged 15 commits into from
Aug 8, 2023

Conversation

mahaalbashir
Copy link
Contributor

Create a function that:

  1. Takes a folder from the user and adds each file in that folder to an acro object as a custom output.
  2. Rename the outputs from numbers (output_0, output_1,..) to the name of the files.
  3. Call "finalise" to create the SDC files (config.json, checksums, results.json).

@codecov
Copy link

codecov bot commented Aug 6, 2023

Codecov Report

Merging #130 (8e86986) into main (9f3ec7e) will increase coverage by 0.05%.
Report is 3 commits behind head on main.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##             main     #130      +/-   ##
==========================================
+ Coverage   97.14%   97.20%   +0.05%     
==========================================
  Files           6        6              
  Lines         666      680      +14     
==========================================
+ Hits          647      661      +14     
  Misses         19       19              
Files Changed Coverage Δ
acro/acro.py 99.02% <100.00%> (+0.04%) ⬆️
acro/record.py 98.51% <100.00%> (+0.03%) ⬆️

@mahaalbashir
Copy link
Contributor Author

@rpreen I would like your advice on something, please.
The case:
If the user calls the add_to_acro function twice. For example: if the user calls it first then adds other files and calls it again.

The problem:
When the user first calls the function, the SDC files config.json, checksums, and results.json are created. Then if the user calls it again the function will assume that the SDC files are part of the user's files and will add them to the results.json and to the checksum folder.

The solution (now):
Before the function add the files to the acro object, it checks if the SDC files exist and deletes them if they are.

if "results.json" in os.listdir(path):  # pragma: no cover
         os.remove(f"{path}/results.json")
if "config.json" in os.listdir(path):  # pragma: no cover
        os.remove(f"{path}/config.json")
if "checksums" in os.listdir(path):  # pragma: no cover
        shutil.rmtree(f"{path}/checksums")

Do you think there is a better solution?

@rpreen
Copy link
Collaborator

rpreen commented Aug 6, 2023

Maybe have the user specify both src and dest paths; and don't worry about what happens in the dest in this PR?

In another PR, we need to prompt to ask the user if they are ok with overwriting or changing the dest for finalise() anyway, so that will then cover this use case?

acro/record.py Outdated
Comment on lines 173 to 176
for root, _, files in os.walk(os.getcwd()):
if root != os.path.join(os.getcwd(), path) and filename in files:
shutil.copy(os.path.join(root, filename), path)
output.append(Path(os.path.join(root, filename)).name)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This wont work if (a) If the file doesn't exist (b) the file is in a folder in a parent directory (c) what if there are multiple subdirectories which have the same filename - it'll just use the first one found? There is also no warning if any of these things happen, it just doesn't make it to the destination folder.

Actually, I think if the file doesn't exist the current custom_output() will just pretend to add the file without actually checking whether it exists so we should fix that here too...

Signed-off-by: mahaalbashir <mahaalialbashir@gmail.com>
acro/record.py Outdated
self.output_id += 1
logger.info("add_custom(): %s", output.uid)
else:
logger.info("The file %s doesn't exists", filename) # pragma: no cover
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the message should be more explicit:

logger.info("WARNING: Unable to add %s because the file does not exist", filename)

Part of me thinks this should be using warnings.warn() but then I think it will print the source line in record.py which is probably too much info for the researcher here

@rpreen rpreen linked an issue Aug 8, 2023 that may be closed by this pull request
@rpreen rpreen merged commit fffeaf3 into main Aug 8, 2023
5 checks passed
@rpreen rpreen deleted the add_outputs_to_acro branch August 8, 2023 10:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

script to create acro object post-hoc from directory of outputs
2 participants