Fix: Don't crash on bad ASNs during indexing #2586

Merged
merged 2 commits into from Feb 3, 2023

Conversation

stumpylog
Member

Proposed change

I don't really know how the ASN value in the linked issue was allowed past the migration, but I guess it was. Perhaps the choice of DB backend affects it?

So, during indexing, if an ASN value would cause the indexing to fail, instead log an error and reset the ASN to an allowed value. I do think this is better than a straight-up crash, but if there are differing opinions, let me know.
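
A rough sketch of that behaviour, for illustration only and not the code in this PR. The names `ASN_MAX`, `ASN_FALLBACK` and `sanitize_asn` are hypothetical, the upper bound assumes an unsigned 32-bit limit, and the fallback of 0 is likewise an assumption:

```python
import logging

logger = logging.getLogger("asn_index_sketch")

# Assumed bounds: an unsigned 32-bit range. Not taken from the actual validators.
ASN_MIN = 0
ASN_MAX = 2**32 - 1
# Hypothetical "allowed value" that a bad ASN is reset to.
ASN_FALLBACK = 0


def sanitize_asn(asn, doc_id):
    """Return an ASN that is safe to index, logging an error if it was reset."""
    if asn is None:
        # Assumed: documents without an ASN pass through unchanged.
        return None
    if ASN_MIN <= asn <= ASN_MAX:
        return asn
    logger.error(
        "Document %s has out-of-range ASN %s, resetting to %s",
        doc_id,
        asn,
        ASN_FALLBACK,
    )
    return ASN_FALLBACK
```

The point of the sketch is that the indexer keeps going: a bad value produces a log entry and a fallback rather than an exception.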

Fixes #2583

Type of change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Other (please explain)

Checklist:

  • I have read & agree with the contributing guidelines.
  • If applicable, I have tested my code for new features & regressions on both mobile & desktop devices, using the latest version of major browsers.
  • If applicable, I have checked that all tests pass, see documentation.
  • I have run all pre-commit hooks, see documentation.
  • I have made corresponding changes to the documentation as needed.
  • I have checked my modifications for any breaking changes.

@stumpylog stumpylog requested review from a team as code owners February 2, 2023 16:43
@paperless-ngx-secretary paperless-ngx-secretary bot added backend non-trivial Requires approval by several team members labels Feb 2, 2023
@github-actions github-actions bot added the bug Bug report or a Bug-fix label Feb 2, 2023
@codecov

codecov bot commented Feb 2, 2023

Codecov Report

Merging #2586 (476fac2) into dev (06e2500) will increase coverage by 0.04%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##              dev    #2586      +/-   ##
==========================================
+ Coverage   92.67%   92.72%   +0.04%     
==========================================
  Files         139      139              
  Lines        5996     6003       +7     
==========================================
+ Hits         5557     5566       +9     
+ Misses        439      437       -2     
Flag Coverage Δ
backend 92.72% <100.00%> (+0.04%) ⬆️

Flags with carried forward coverage won't be shown.

Impacted Files Coverage Δ
src/documents/consumer.py 95.97% <100.00%> (ø)
src/documents/index.py 93.49% <100.00%> (+0.15%) ⬆️
src/documents/models.py 98.29% <100.00%> (+0.02%) ⬆️
src/paperless_tika/parsers.py 80.64% <0.00%> (+3.22%) ⬆️

@stumpylog
Member Author

Ah, now that I can see the source files, the coverage change in src/paperless_tika/parsers.py is because Gotenberg or Tika had a problem converting at least once, which covered the exception handling. So a flaky bit has been discovered, and it can be fixed with a test that covers it directly.

Member

@shamoon shamoon left a comment

I agree the crash on startup is a bad thing, so I'm good with this.

It is potentially modifying someone's data in a kind of silent way (yes, it logs the error of course, but our users don't typically see these without looking), which obviously isn't ideal. But I still don't really see the use case for an ASN > 9 million or whatever, so on balance I think this is the better compromise. Again, my opinion.

@stumpylog
Member Author

The range is about 4.2 billion with the validators as they are now. I really do think that should be enough. The issue that spawned this had an ASN of a few quadrillion, which seems to have been added accidentally anyway.

@shamoon
Member

shamoon commented Feb 2, 2023

Ha yes, 4.2 billion = enough

@stumpylog stumpylog merged commit faecd59 into dev Feb 3, 2023
@stumpylog stumpylog deleted the fix/2583-dont-allow-bad-asns branch February 3, 2023 16:31
@shamoon shamoon added this to the v1.12.3 milestone Feb 9, 2023
@github-actions
Contributor

This pull request has been automatically locked since there has not been any recent activity after it was closed. Please open a new discussion or issue for related concerns.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Apr 17, 2023
Labels
backend bug Bug report or a Bug-fix non-trivial Requires approval by several team members
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

[BUG] Re-indexing fails with out of range error
2 participants