Fix: Classifier special case when no items are set to automatic matching #3858

Merged
merged 2 commits into from Jul 24, 2023

Conversation

stumpylog
Member

Proposed change

Fixes a fairly special case: items once existed to train the models with, but none do any longer. If the classifier file still exists, the consumer loads it and passes it via signals to set_xyz, which then uses it to predict items that should no longer be predicted.

As an extra precaution, the prediction matching now also checks that the item is still set to MATCH_AUTO. This covers another special case, where a matching item has been changed from AUTO to something else but the classifier hasn't yet been retrained with that change.
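The extra check described above can be illustrated with a minimal, framework-free sketch. This is not the code from this PR: `Tag`, `filter_predictions`, and the numeric constant values are hypothetical stand-ins, and the real project works with database models rather than a plain dict. The idea is only that a prediction from a possibly stale classifier is accepted if the item still exists and is still set to MATCH_AUTO.

```python
from dataclasses import dataclass

# Illustrative constants (assumed values); the real ones live on the
# project's matching model.
MATCH_NONE = 0
MATCH_AUTO = 6


@dataclass
class Tag:
    """Hypothetical stand-in for a matchable item (tag, correspondent, ...)."""

    pk: int
    name: str
    matching_algorithm: int


def filter_predictions(predicted_ids, items_by_pk):
    """Keep only predicted items that still exist and are still MATCH_AUTO."""
    kept = []
    for pk in predicted_ids:
        item = items_by_pk.get(pk)
        if item is not None and item.matching_algorithm == MATCH_AUTO:
            kept.append(item)
    return kept


items = {
    1: Tag(1, "invoice", MATCH_AUTO),
    2: Tag(2, "receipt", MATCH_NONE),  # changed away from AUTO after training
}
stale_predictions = [1, 2, 3]  # pk 3 was deleted after the model was trained
result = filter_predictions(stale_predictions, items)
```

Here only the item with pk 1 survives the filter, since pk 2 is no longer automatic and pk 3 no longer exists.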

Fixes #3848

Type of change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Other (please explain):

Checklist:

  • I have read & agree with the contributing guidelines.
  • If applicable, I have included testing coverage for new code in this PR, for backend and / or front-end changes.
  • If applicable, I have tested my code for new features & regressions on both mobile & desktop devices, using the latest version of major browsers.
  • If applicable, I have checked that all tests pass, see documentation.
  • I have run all pre-commit hooks, see documentation.
  • I have made corresponding changes to the documentation as needed.
  • I have checked my modifications for any breaking changes.

…hing, in case the classifier hasn't been run since a type was changed
@stumpylog stumpylog requested a review from a team as a code owner July 24, 2023 17:28
@paperless-ngx-secretary paperless-ngx-secretary bot added backend non-trivial Requires approval by several team members labels Jul 24, 2023
@github-actions github-actions bot added the bug Bug report or a Bug-fix label Jul 24, 2023
@codecov

codecov bot commented Jul 24, 2023

Codecov Report

Merging #3858 (90cac98) into dev (8c7554e) will decrease coverage by 0.02%.
The diff coverage is 50.00%.

@@            Coverage Diff             @@
##              dev    #3858      +/-   ##
==========================================
- Coverage   95.26%   95.24%   -0.02%     
==========================================
  Files         335      335              
  Lines       12659    12663       +4     
  Branches     1039     1039              
==========================================
+ Hits        12059    12061       +2     
- Misses        595      597       +2     
  Partials        5        5              
| Flag | Coverage Δ | |
|---|---|---|
| backend | 94.05% <50.00%> (-0.03%) | ⬇️ |
| frontend | 96.66% <ø> (ø) | |

| Impacted Files | Coverage Δ | |
|---|---|---|
| src/documents/matching.py | 91.83% <ø> (ø) | |
| src/documents/tasks.py | 95.13% <50.00%> (-1.29%) | ⬇️ |

@shamoon
Member

shamoon commented Jul 24, 2023

Thanks for digging into this. So the added condition on listing predictions would handle cases where only some of the models were changed to None, whereas removing the classifier completely would only apply if all were changed to None, correct?

@stumpylog
Member Author

Yes. That's the idea. There's also a window where the classifier will still exist (hasn't been retrained), but the matching items have been changed and could be returned from predictions. The added filtering will catch that as well, until the file is removed.
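The other half of the fix, removing the classifier when nothing is left to train on, might look roughly like the sketch below. The function name `train_or_cleanup`, the file name, and the way the count of automatic items is obtained are all assumptions made for illustration; the actual change lives in the project's training task.

```python
import tempfile
from pathlib import Path


def train_or_cleanup(auto_item_count: int, model_file: Path) -> bool:
    """Return True if training should proceed.

    When no items use automatic matching, delete any stale model file so
    that nothing loads it later, and report that training is skipped.
    """
    if auto_item_count == 0:
        if model_file.is_file():
            model_file.unlink()
        return False
    return True


with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical file name for a previously trained model.
    stale = Path(tmp) / "classification_model.pickle"
    stale.write_bytes(b"old model")
    proceeded = train_or_cleanup(0, stale)
    still_there = stale.exists()
```

Until the next training run performs this cleanup, the stale file can still be loaded, which is the window the prediction filtering is meant to cover.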

Member

@shamoon shamoon left a comment

Awesome, thanks.

@stumpylog stumpylog merged commit 802e559 into dev Jul 24, 2023
29 checks passed
@stumpylog stumpylog deleted the fix/3848-classifier-special-case branch July 24, 2023 19:31
@github-actions
Contributor

This pull request has been automatically locked since there has not been any recent activity after it was closed. Please open a new discussion or issue for related concerns.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Aug 24, 2023
Labels
backend bug Bug report or a Bug-fix non-trivial Requires approval by several team members
Development

Successfully merging this pull request may close these issues.

[BUG] Trouble with None Matching
2 participants