Check WMArchive doc size and cut off big docs #11967

Open
wants to merge 2 commits into
base: master
from

Conversation

vkuznet
Contributor

@vkuznet vkuznet commented Apr 15, 2024

Fixes #11960

Status

In development

Description

Check the size of each newly created WMArchive document before sending it to the WMArchive service. The size threshold is configurable and defaults to 8MB (the current limit on CMSWEB nginx). To avoid flooding the log with very large documents, a short version of the WMArchive document (a slice) is logged together with the full size of the document and the threshold used.
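
For illustration only, the check described above could look roughly like the sketch below. It is not the code in this PR: the names `check_doc_size`, `DEFAULT_MAX_DOC_SIZE` and `slice_len` are hypothetical, and it assumes the document is a JSON-serializable dict whose size is measured as the UTF-8 byte length of its JSON encoding.

```python
import json
import logging

# Assumption: the 8MB default is expressed in bytes.
DEFAULT_MAX_DOC_SIZE = 8 * 1024 * 1024


def check_doc_size(doc, threshold=DEFAULT_MAX_DOC_SIZE, slice_len=200, logger=None):
    """Return True if the serialized doc fits within the threshold.

    When the doc is too large, log only a short slice of it together with
    its full size and the threshold, so the log is not flooded.
    """
    logger = logger or logging.getLogger(__name__)
    payload = json.dumps(doc)
    size = len(payload.encode("utf-8"))
    if size > threshold:
        logger.warning(
            "WMArchive doc size %d exceeds threshold %d, doc slice: %s",
            size, threshold, payload[:slice_len])
        return False
    return True
```

A caller would skip sending any document for which the function returns False; what happens to the rejected document afterwards is outside the scope of this sketch.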

Is it backward compatible (if not, which system it affects?)

YES

Related PRs

External dependencies / deployment changes

@cmsdmwmbot

Jenkins results:

  • Python3 Unit tests: failed
    • 1 new failures
    • 1 tests no longer failing
  • Python3 Pylint check: failed
    • 4 warnings and errors that must be fixed
    • 1 warnings
    • 2 comments to review
  • Pylint py3k check: succeeded
  • Pycodestyle check: succeeded
    • 3 comments to review

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/DMWM-WMCore-PR-test/15014/artifact/artifacts/PullRequestReport.html

Contributor

@amaltaro amaltaro left a comment

There is one component configuration to be fixed.
In addition, the problem with this solution is that we will keep piling up large documents in the database, and the component will continue to load them cycle after cycle.

A better solution would be to find the worst offender field, truncate it and move on with document injection.

However, Andrea and I have been trying to put this on hold so that we can focus on the containerization. Given that this problem is under control at the moment, I'd suggest coming back to this in the future (or if things are still failing), once we are done with other developments.

@vkuznet
Contributor Author

vkuznet commented Apr 17, 2024

A better solution would be to find the worst offender field, truncate it and move on with document injection.

Alan, I doubt your suggestion of a better solution stands. We still need to find what is causing the large data size, and for that we need some way to identify the rejected docs. Without actual details of how you want to hunt for the worst offenders (please keep in mind that a combination of fields may contribute to the large data size), I think there is no such solution; we still need to reject and dump these docs to get a full sense of the actual issue.
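
As a rough illustration of how a dumped document could be inspected afterwards, the sketch below ranks top-level fields by serialized size. The helper name `field_sizes` is hypothetical and not part of WMCore; it assumes the dumped document is a JSON-serializable dict and only looks at top-level keys, so several large fields would all show up near the top of the list.

```python
import json


def field_sizes(doc):
    """Return (field, size_in_bytes) pairs for top-level fields, largest first."""
    sizes = {
        key: len(json.dumps(value).encode("utf-8"))
        for key, value in doc.items()
    }
    return sorted(sizes.items(), key=lambda item: item[1], reverse=True)
```

Running this over a few rejected documents would show whether a single field dominates or whether the size is spread over several fields.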

@cmsdmwmbot

Jenkins results:

  • Python3 Unit tests: succeeded
    • 3 changes in unstable tests
  • Python3 Pylint check: failed
    • 4 warnings and errors that must be fixed
    • 1 warnings
    • 2 comments to review
  • Pylint py3k check: succeeded
  • Pycodestyle check: succeeded
    • 3 comments to review

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/DMWM-WMCore-PR-test/15020/artifact/artifacts/PullRequestReport.html

@vkuznet
Contributor Author

vkuznet commented May 6, 2024

@amaltaro, @anpicci should we resume this activity? I have requested a review of this PR once again; please let me know whether I should proceed with it or not.

@anpicci
Contributor

anpicci commented May 6, 2024

@vkuznet In my understanding, the situation is similar to three weeks ago, so it would probably be better to keep this issue on hold until we finalize the containerization problem, even though I see your concern.

Development

Successfully merging this pull request may close these issues.

WMArchive Grafana Monitoring Down / No Data
4 participants