Create tags from sub directories #69

jayme-github · 2020-11-29T15:05:40Z

The names of sub directories in the consumer directory will be added as tags for the document to be consumed.
To enable this, set:
PAPERLESS_CONSUMER_RECURSIVE=1
PAPERLESS_CONSUMER_SUBDIRS_AS_TAGS=1

Fixes #50
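
As a rough illustration of the behaviour described above, here is a minimal sketch of how tag names could be derived from a file's location below the consumer directory. The function name `tag_names_from_path` and the example paths are hypothetical and not taken from this PR; the actual consumer code may differ.

```python
from pathlib import PurePosixPath
from typing import List


def tag_names_from_path(consume_dir: str, file_path: str) -> List[str]:
    """Return the sub directory names between consume_dir and the file.

    Hypothetical helper for illustration only: each directory component
    below the consumer directory becomes one tag name.
    """
    # relative_to() raises ValueError if file_path is not below consume_dir.
    relative = PurePosixPath(file_path).relative_to(PurePosixPath(consume_dir))
    # Drop the last component (the file name); the rest are directories.
    return list(relative.parts[:-1])


# A file directly in the consumer directory yields no tags.
print(tag_names_from_path("/consume", "/consume/scan.pdf"))
# A nested file yields one tag per sub directory.
print(tag_names_from_path("/consume", "/consume/taxes/2020/scan.pdf"))
```

Under this assumption, a document dropped into `taxes/2020/` would be tagged `taxes` and `2020`.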

While this basically works, I had a bad time trying to test this. Debugging tests is a bit hard in general because of the async nature but in particular this always gives me table locking errors (for document_tags). While that might make sense, I also tried to not create tags in the document_consumer (just Tags.objects.get()) which seems to still lock the table.

I'm not an expert at the Django ORM etc., so maybe you have an idea on how to work around this @jonaswinkler

jonaswinkler · 2020-11-29T15:26:52Z

Well, the test case clearly complains about 'Space Tag' not being found, so something might be amiss here.

jayme-github · 2020-11-29T15:31:24Z

> Well, the test case clearly complains about 'Space Tag' not being found, so something might be amiss here.

Exactly what I said, see: https://travis-ci.org/github/jonaswinkler/paperless-ng/jobs/746576607#L529

jonaswinkler · 2020-11-29T15:32:05Z

OH. okay.

jonaswinkler · 2020-11-29T15:53:38Z

Swap out TestCase base class for TransactionTestCase, which does not put each test case in a single transaction.
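
One plausible reading of the lock errors, sketched with the standard library only: if the test holds an open write transaction while the consumer writes through a second connection, the second writer is blocked. This assumes an SQLite test database and a consumer with its own connection, neither of which is confirmed in this thread; the function name is hypothetical, and the table name is borrowed from the error above.

```python
import os
import sqlite3
import tempfile


def second_connection_write(commit_first: bool) -> str:
    """Try to write from a second connection while the first one has
    written inside a transaction. Returns "ok" or the error message."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.sqlite3")
        # isolation_level=None: transactions are controlled explicitly.
        test_conn = sqlite3.connect(path, timeout=0.1, isolation_level=None)
        consumer_conn = sqlite3.connect(path, timeout=0.1, isolation_level=None)
        try:
            test_conn.execute("CREATE TABLE document_tags (name TEXT)")
            # Stand-in for a test wrapped in a single transaction.
            test_conn.execute("BEGIN IMMEDIATE")
            test_conn.execute("INSERT INTO document_tags VALUES ('Space Tag')")
            if commit_first:
                # Stand-in for a test that commits its changes.
                test_conn.execute("COMMIT")
            try:
                consumer_conn.execute("INSERT INTO document_tags VALUES ('other')")
                return "ok"
            except sqlite3.OperationalError as exc:
                return str(exc)
        finally:
            test_conn.close()
            consumer_conn.close()


print(second_connection_write(commit_first=False))  # an OperationalError message
print(second_connection_write(commit_first=True))   # ok
```

With the transaction still open, the second connection fails with a lock error; once the first connection commits, the same write goes through.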

jayme-github · 2020-11-29T16:52:38Z

> Swap out TestCase base class for TransactionTestCase, which does not put each test case in a single transaction.

Sweet, thanks!

jonaswinkler · 2020-11-30T11:12:14Z

Any reason you switched to pyinotify? It is old and hasn't seen changes in 5 years. I just did some more testing and wanted to merge.

jonaswinkler · 2020-11-30T11:21:22Z

And it works in all cases. The only thing you missed was the read_delay=1000, which is important for certain scenarios.

jayme-github · 2020-11-30T11:21:37Z

> Any reason you switched to pyinotify? It is old and hasn't seen changes in 5 years. I just did some more testing and wanted to merge.

I was unable to get the tests green with inotify_simple/inotifyrecursive, as read_delay does not seem to work reliably with recursive watchers (not getting any events at all), and disabling read_delay breaks test_slow_write_and_move, for example. As this felt very fragile overall (the tests sometimes do succeed locally), I wanted to see if pyinotify would make a difference (as it's not ctypes but a C extension).

jonaswinkler · 2020-11-30T11:42:10Z

The read_delay applies to folders as well: make a folder, instantly copy a file into it, and it won't get picked up. Not ideal. I've got some ideas on how to make that better, but for now, it works.

And inotify_simple has been used in this project for a very long time, and people didn't complain :)

jayme-github · 2020-11-30T13:20:08Z

> The read_delay applies to folders as well: make a folder, instantly copy a file into it, and it won't get picked up. Not ideal. I've got some ideas on how to make that better, but for now, it works.

Yeah. That's why I wanted to see if pyinotify would handle that any better (e.g. create the sub-watch immediately)

> And inotify_simple has been used in this project for a very long time, and people didn't complain :)

Sure, but neither with recursive mode nor with read_delay. ;-)

Anyways. I pushed back the inotifyrecursive version plus a sleep in the test case after creating the directories. Maybe the read_delay could be lowered quite a bit. As it seems it's mostly useful during move, a shorter period should be fine as well.


jonaswinkler · 2020-11-30T13:49:48Z

Well, you see what I was trying to test in that test case. Some scanners like to do that: write files to file.~df, and move to file.pdf when done. At some point I'll write a check that curates supported file extensions from registered parsers and checks against that, which would remove the need for read_delay.
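
A minimal sketch of what such an extension check might look like. The set `SUPPORTED_EXTENSIONS` and the function `is_consumable` are hypothetical names for illustration; in the idea described above, the set would be collected from the registered parsers rather than hard-coded.

```python
import os

# Hypothetical, hard-coded stand-in for extensions gathered from parsers.
SUPPORTED_EXTENSIONS = {".pdf", ".png", ".jpg"}


def is_consumable(filename: str, supported=SUPPORTED_EXTENSIONS) -> bool:
    """Return True if the file's extension is one a parser can handle."""
    extension = os.path.splitext(filename)[1].lower()
    return extension in supported


# A scanner's temporary file is ignored until it is renamed.
print(is_consumable("file.~df"))  # False
print(is_consumable("file.pdf"))  # True
```

With a check like this, the temporary `file.~df` would be skipped no matter how slowly it is written, and only the final move to `file.pdf` would trigger consumption.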

jayme-github · 2020-11-30T14:15:17Z

> Well, you see what I was trying to test in that test case. Some scanners like to do that: write files to file.~df, and move to file.pdf when done. At some point I'll write a check that curates supported file extensions from registered parsers and checks against that, which would remove the need for read_delay.

Oh, okay. I did not recognize that as another scanner quirk. :) In that case the longer timeout does make sense ofc.


jayme-github force-pushed the feature-directory-tags branch from 3311561 to 6a055d9 Compare November 29, 2020 16:52

jayme-github force-pushed the feature-directory-tags branch from 6a055d9 to cd94284 Compare November 30, 2020 10:35

jayme-github force-pushed the feature-directory-tags branch from cd94284 to 46c72c7 Compare November 30, 2020 13:17

Create tags from sub directories

fa9a5cc

The names of sub directories in the consumer directory will be added as tags for the document to be consumed. To enable this, set: PAPERLESS_CONSUMER_RECURSIVE=1 PAPERLESS_CONSUMER_SUBDIRS_AS_TAGS=1 Fixes the-paperless-project#50

jayme-github force-pushed the feature-directory-tags branch from 46c72c7 to fa9a5cc Compare November 30, 2020 13:22

jonaswinkler merged commit c5dbd7a into jonaswinkler:dev Nov 30, 2020

mweimerskirch pushed a commit to mweimerskirch/paperless-ng that referenced this pull request Feb 17, 2022

Merge pull request jonaswinkler#69 from paperless-ngx/l10n_dev

f8b5ec0

New Crowdin updates
