Multi word match #277

Merged
5 commits merged into the-paperless-project:master on Dec 27, 2017

@ishirav

commented Dec 23, 2017

Hi Daniel,

Thank you for the Paperless project, it's exactly what I was looking for and it's extremely well-built.

Here's a pull request for your consideration: support for multi-word search terms in match text.

Use case: suppose I want to add Bob Marley as a correspondent, but his name may appear either as "Bob Marley" or "Marley, Bob". So with this feature, I can use the MATCH_ANY mode and set the match text to "Bob Marley" "Marley, Bob".
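A minimal sketch of the use case above, not the code from this pull request. The names `FIND_TERMS`, `split_terms` and `matches_any` are hypothetical, and the case-insensitive substring check is a simplifying assumption made only for illustration:

```python
import re

# Hypothetical pattern: group 1 captures a double-quoted phrase,
# group 2 captures a bare word that is not inside quotes.
FIND_TERMS = re.compile(r'"([^"]+)"|(\S+)')


def split_terms(match_text):
    """Split match text into terms, keeping quoted phrases together."""
    return [(quoted or bare).strip() for quoted, bare in FIND_TERMS.findall(match_text)]


def matches_any(match_text, document_text):
    """Return True if at least one term occurs in the document text.

    Assumption: a plain case-insensitive substring test stands in for
    whatever matching the project really performs.
    """
    haystack = document_text.lower()
    return any(term.lower() in haystack for term in split_terms(match_text))


terms = split_terms('"Bob Marley" "Marley, Bob"')
print(terms)
print(matches_any('"Bob Marley" "Marley, Bob"', "Royalties due to Marley, Bob"))
```

With the match text from the example, both spellings of the name become separate terms, so a document containing either one would count as a match in an "any" mode.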

ishirav added some commits Dec 23, 2017

@danielquinn

Collaborator

commented Dec 27, 2017

I think this may be the cleanest pull request this project has ever seen. Nice work, and thanks for your contribution!

@danielquinn danielquinn merged commit 0611792 into the-paperless-project:master Dec 27, 2017

1 check passed

continuous-integration/travis-ci/pr The Travis CI build passed
@danielquinn

Collaborator

commented Jan 6, 2018

@ishirav I just ran the tests with --pythonwarnings=all and I got this DeprecationWarning from the code you added in this PR:

documents/tests/test_matchables.py::TestMatching::test_match_all
  /home/daniel/Projects/Paperless/src/documents/models.py:136: DeprecationWarning: bad escape \s
    return [normspace(r"\s+", (t[0] or t[1]).strip()) for t in findterms(self.match)]

The complaint appears to be coming from a list comprehension you added containing a \s+ here, but I don't really understand what you're trying to do there. Can you help me out?
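For context, the warning most likely comes from the replacement argument rather than the pattern: `re.sub` interprets backslash escapes in the replacement string, and `\s` is not a valid replacement escape. Python 3.6 reports this as a DeprecationWarning, and Python 3.7 and later raise `re.error` instead. The sketch below is only an illustration; the helper name `try_sub` is hypothetical:

```python
import re
import warnings


def try_sub(repl, text):
    """Replace runs of whitespace in text with repl, reporting rejections.

    Warnings are promoted to exceptions so that Python 3.6 (which warns)
    and Python 3.7+ (which raises re.error) behave the same way here.
    """
    with warnings.catch_warnings():
        warnings.simplefilter("error")
        try:
            return re.sub(r"\s+", repl, text)
        except (re.error, DeprecationWarning) as exc:
            return "rejected: {}".format(exc)


# A raw "\s+" replacement is treated as an (invalid) escape sequence.
print(try_sub(r"\s+", "Daniel  Quinn"))

# Doubling the backslash makes the replacement a literal backslash + "s+".
print(try_sub(r"\\s+", "Daniel  Quinn"))

# A plain space is always a valid replacement.
print(try_sub(" ", "Daniel  Quinn"))
```

Escaping the backslash in the replacement, or avoiding `re.sub` for that step altogether, are both ways to sidestep the problem.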

@ishirav

Author

commented Jan 15, 2018

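The idea behind this line is to build a regular expression such as "Daniel\s+Quinn", so that any run of whitespace is accepted between the words.
One way to get rid of the warning is to collapse the whitespace to a single space first, and then insert the \s+ with a plain str.replace, which does not interpret backslashes: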

return [normspace(" ", (t[0] or t[1]).strip()).replace(" ", r"\s+") for t in findterms(self.match)]
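A self-contained sketch of that suggestion follows. The definitions of `normspace` and `findterms` are not shown in this thread, so the ones below are assumptions inferred from how the line uses them, and the function name `split_match` is hypothetical:

```python
import re

# Assumed helper: substitute runs of whitespace (bound .sub of a compiled pattern).
normspace = re.compile(r"\s+").sub

# Assumed helper: yields (quoted_phrase, bare_word) tuples, one of which is empty.
findterms = re.compile(r'"([^"]+)"|(\S+)').findall


def split_match(match):
    """Turn match text into regex fragments that tolerate any whitespace.

    Whitespace inside each term is first collapsed to one space with re.sub
    (a safe replacement), then each space becomes the literal text \\s+ via
    str.replace, which performs no escape processing.
    """
    return [
        normspace(" ", (t[0] or t[1]).strip()).replace(" ", r"\s+")
        for t in findterms(match)
    ]


terms = split_match('"Daniel   Quinn" invoice')
print(terms)

# The first fragment matches the name even when split across lines.
print(bool(re.search(terms[0], "Daniel\n\tQuinn")))
```

Note that in this sketch the terms are not passed through `re.escape`, so any regex metacharacters in the match text would be interpreted as part of the pattern.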
@danielquinn

Collaborator

commented Jan 17, 2018

Excellent. Thank you :-)
