Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Raise exception on trying to compute MD5 sum of directory. #256

Closed
wants to merge 1 commit into from
Closed

Raise exception on trying to compute MD5 sum of directory. #256

wants to merge 1 commit into from

Conversation

madig
Copy link

@madig madig commented May 3, 2018

For clarity.

@coveralls
Copy link

Coverage Status

Coverage increased (+0.0001%) to 99.739% when pulling 561b913 on madig:raise-on-md5-dir into 088d2ba on pydoit:master.

Copy link
Member

@schettino72 schettino72 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks.

So what was happening when passing a directory?

Can we have a separate unit-test?

@@ -36,6 +36,9 @@ def get_file_md5(path):
@param path: (string) file path
@return: (string) md5
"""
if os.path.isdir(path):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

actually this is probably not a good location for doing this check. check_modified was carefully implemented to avoid getting file_stat more than once. since this is a relatively expensive operation.

@madig
Copy link
Author

madig commented May 4, 2018

Before, I'd get this stack trace:

Traceback (most recent call last):
  File "c:\program files\python36\lib\site-packages\doit\doit_cmd.py", line 177, in run
    return command.parse_execute(args)
  File "c:\program files\python36\lib\site-packages\doit\cmd_base.py", line 127, in parse_execute
    return self.execute(params, args)
  File "c:\program files\python36\lib\site-packages\doit\cmd_base.py", line 420, in execute
    return self._execute(**exec_params)
  File "c:\program files\python36\lib\site-packages\doit\cmd_run.py", line 260, in _execute
    return runner.run_all(self.control.task_dispatcher())
  File "c:\program files\python36\lib\site-packages\doit\runner.py", line 256, in run_all
    self.run_tasks(task_dispatcher)
  File "c:\program files\python36\lib\site-packages\doit\runner.py", line 223, in run_tasks
    self.process_task_result(node, catched_excp)
  File "c:\program files\python36\lib\site-packages\doit\runner.py", line 187, in process_task_result
    self.dep_manager.save_success(task)
  File "c:\program files\python36\lib\site-packages\doit\dependency.py", line 512, in save_success
    state = self.checker.get_state(dep, self._get(task.name, dep))
  File "c:\program files\python36\lib\site-packages\doit\dependency.py", line 400, in get_state
    md5 = get_file_md5(dep)
  File "c:\program files\python36\lib\site-packages\doit\dependency.py", line 39, in get_file_md5
    with open(path, 'rb') as file_data:
PermissionError: [Errno 13] Permission denied: 'sources\\newFont-Bold.ufo'

The use case here is that in the font world, the UFO format is a directory with all data for one font contained within in separate files and subdirectories. I originally wanted to pass in the path to the directory in the hope that doit will know when something within the directory changed.

So where would that check be better kept?

@schettino72
Copy link
Member

on get_status() there is already some code to check if file exist or not. I guess it would make sense to check if file is a directory on same location.

Some use-cases have thousands of file_dep, so performance need to be taken into account.
It would be preferable to not actually not check if file is directory and deal with a raised exception (if possible). Otherwise use the stat [1] result. Not the high-level isdir() function.

[1] https://docs.python.org/3/library/stat.html#stat.S_ISDIR

@madig
Copy link
Author

madig commented May 7, 2018

Some use-cases have thousands of file_dep

Yep, that's our use-case ;)

It would be preferable to not actually not check if file is directory and deal with a raised exception

You mean in get_status()?

@schettino72
Copy link
Member

@madig so what the status on this?

About checking path vs catching exception. I see you get a permission denied exception and the goal was to give a clear error message... need to be careful not to say a folder is not supported when permission is the real problem.

I think all code changes should go in get_status()

@madig
Copy link
Author

madig commented May 30, 2018

Didn't have time yet, still on my todo list.

@schettino72
Copy link
Member

Closing this as there is no activity for a long time... feel free to reopen if fix the pending issues.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants