fs/_ftp_parse.py fails when parsing directory listing on NT format if folder or filename contains AM or PM #395

Closed
xoriath opened this issue May 16, 2020 · 0 comments · Fixed by #396

xoriath commented May 16, 2020

The regex used to parse directory listings from NT servers matches the modified-date column with a greedy quantifier (the default). As a result, the `modified` capture group extends to the last AM or PM on the line, which may be part of the file or folder name rather than the timestamp.

Example file listing that fails:

01-29-20  04:11AM       <DIR>          Clock at AM or PM
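
The greedy behaviour can be reproduced in isolation with a minimal sketch. The two patterns below are simplified stand-ins written for this illustration (an assumption, not copied from `fs/_ftp_parse.py`); they differ only in using `.*` versus `.*?` inside the `modified` group.

```python
import re

LINE = "01-29-20  04:11AM       <DIR>          Clock at AM or PM"

# Hypothetical, simplified patterns for illustration only.
# GREEDY: ".*" runs to the LAST "AM"/"PM" on the line.
GREEDY = re.compile(
    r"^(?P<modified>.*(AM|PM))\s*(?P<size>(<DIR>|\d*))\s*(?P<name>.*)$"
)
# LAZY: ".*?" stops at the FIRST "AM"/"PM" on the line.
LAZY = re.compile(
    r"^(?P<modified>.*?(AM|PM))\s*(?P<size>(<DIR>|\d*))\s*(?P<name>.*)$"
)

greedy = GREEDY.match(LINE).groupdict()
lazy = LAZY.match(LINE).groupdict()

# Greedy: the whole line lands in "modified"; "size" and "name" are empty.
print(repr(greedy["modified"]), repr(greedy["size"]), repr(greedy["name"]))

# Lazy: each column lands in its own group.
print(repr(lazy["modified"]), repr(lazy["size"]), repr(lazy["name"]))
```

With the greedy stand-in, nothing is left for `size` or `name` when the listing line above is parsed.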

Parsing this listing fails with the following traceback:

  File ".venv\lib\site-packages\fs\ftpfs.py", line 746, in scandir
    if not self.supports_mlst and not self.getinfo(path).is_dir:
  File ".venv\lib\site-packages\fs\ftpfs.py", line 614, in getinfo
    directory = self._read_dir(dir_name)
  File ".venv\lib\site-packages\fs\ftpfs.py", line 495, in _read_dir
    _list = [Info(raw_info) for raw_info in ftp_parse.parse(lines)]
  File ".venv\lib\site-packages\fs\_ftp_parse.py", line 70, in parse
    raw_info = parse_line(line)
  File ".venv\lib\site-packages\fs\_ftp_parse.py", line 81, in parse_line
    return decode_callable(line, match)
  File ".venv\lib\site-packages\fs\_ftp_parse.py", line 163, in decode_windowsnt
    raw_info["details"]["size"] = int(match.group("size"))
ValueError: invalid literal for int() with base 10: ''
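
The `ValueError` follows from the `size` group being an empty string: it is not a number, so `int()` rejects it. A rough sketch of that step, using a hypothetical `decode_size` helper (the `<DIR>` check is an assumption about how directories are handled, not the library's actual code):

```python
def decode_size(size_field):
    """Hypothetical helper: None for directories, else the size as an int."""
    if size_field == "<DIR>":
        return None
    # An empty string reaches this line when the date group swallowed
    # the size column; int("") raises ValueError.
    return int(size_field)


try:
    decode_size("")
except ValueError as exc:
    error = str(exc)
    print(error)
```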
xoriath added a commit to xoriath/pyfilesystem2 that referenced this issue May 16, 2020
By making this capture pattern non-greedy it will no longer
capture until the last AM or PM in a directory listing line.
xoriath added a commit to xoriath/pyfilesystem2 that referenced this issue Sep 30, 2020
By making this capture pattern non-greedy it will no longer
capture until the last AM or PM in a directory listing line.
althonos pushed a commit that referenced this issue Sep 30, 2020
By making this capture pattern non-greedy it will no longer
capture until the last AM or PM in a directory listing line.