This repository has been archived by the owner on Feb 19, 2021. It is now read-only.

ENH: Enable parsing of date from filename (without strict filename structure) #440

Merged


jat255
Contributor

@jat255 jat255 commented Nov 16, 2018

This PR adds a new, optional feature that pulls a date out of the filename. I have a document source whose filenames are often just YYYY-MM-DD followed by a text description, so none of the strict filename parsing in https://github.com/danielquinn/paperless/blob/5c3cb1e4ab2bb286659339c721323406c5f7cc04/src/documents/models.py#L431-L435 worked for me. We already have a great date parser for the text of a file, so this extends it to the filename as well, provided PAPERLESS_FILENAME_DATE_ORDER is defined in the config file.
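For readers who want a feel for the idea, here is a minimal standard-library sketch. It is not the code merged in this PR: the helper name `parse_filename_date`, the regex, and the assumption that the date order is a string such as `"YMD"` or `"DMY"` are all illustrative choices.

```python
import re
from datetime import date

# Hypothetical helper for illustration only; not the implementation in this PR.
# Matches three numeric groups separated by "-", "." or "/", not glued to other digits.
_NUMERIC_DATE = re.compile(r"(?<!\d)(\d{1,4})[-./](\d{1,2})[-./](\d{1,4})(?!\d)")


def parse_filename_date(filename, date_order):
    """Return the first valid date found in `filename`, or None.

    `date_order` is assumed to be a permutation of "Y", "M", "D" (e.g. "YMD"),
    standing in for a value like PAPERLESS_FILENAME_DATE_ORDER.
    """
    if not date_order:
        # Setting not defined: the feature stays off.
        return None
    for match in _NUMERIC_DATE.finditer(filename):
        # Pair each letter of the order with the corresponding numeric group.
        parts = dict(zip(date_order.upper(), map(int, match.groups())))
        try:
            return date(parts["Y"], parts["M"], parts["D"])
        except (KeyError, ValueError):
            # Not a real calendar date under this order; try the next candidate.
            continue
    return None
```

With these assumptions, `parse_filename_date("2018-11-16 electricity bill.pdf", "YMD")` yields `date(2018, 11, 16)`, and passing `None` as the order returns `None` without inspecting the filename.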

I also modified the massive date regex to permit year-first dates, and to match dates that are preceded or followed by either - or _ (rather than only by word breaks).
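The underscore case matters because `_` counts as a word character in Python's `re`, so `\b` never matches between `_` and a digit. The two patterns below are simplified stand-ins (assumed for illustration, far smaller than the project's actual regex) that show the difference:

```python
import re

# Simplified year-first date; the real project regex covers many more formats.
DATE = r"\d{4}-\d{2}-\d{2}"

# Only word breaks delimit the date.
STRICT = re.compile(r"\b(" + DATE + r")\b")

# A word break OR an adjacent "-" / "_" delimits the date.
LOOSE = re.compile(r"(?:(?<=[-_])|\b)(" + DATE + r")(?:(?=[-_])|\b)")

name = "scan_2018-11-16_invoice.pdf"
strict_hit = STRICT.search(name)  # None: no word break between "_" and "2"
loose_hit = LOOSE.search(name)    # matches "2018-11-16"
```

Both patterns still match a filename like `2018-11-16 invoice.pdf`, where ordinary word breaks surround the date.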

Against @danielquinn's wishes, I run with a time zone specified in my settings. I've therefore changed the date tests to use the user's configured time zone and made the date format explicit in each test, so there shouldn't be problems testing my code from Europe or running the original tests in the US. I also added a blurb to the documentation and a couple of tests for the filename date parsing, and confirmed that all the tests pass on my local machine with these additions.

Please feel free to add comments/suggestions/etc.

@danielquinn danielquinn merged commit 730daa3 into the-paperless-project:master Dec 1, 2018
@danielquinn
Collaborator

All done :-) I had to go through and conform the code to the style guide, but outside of that, it's some great work and great to have in the project, thanks!

@jat255 jat255 deleted the ENH_filename_date_parsing branch December 3, 2018 02:34

2 participants