Full Python code fusion / refactor and hardening 2nd edition #13188
I've spent another week trying to track down some strange race conditions that only happened on PyPy 3.7 (?), where the thread that checked the timeout did not finish in time for the process monitor to be informed that a timeout had happened. @ottorei Could you update the patch and report back, please?
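For readers unfamiliar with this kind of race, here is a minimal sketch of one way to avoid it. It is not the code from this PR: `run_with_timeout` and its internals are hypothetical names, and the approach (record the timeout in a `threading.Event` before killing the process, then join the watchdog thread before reading the result) is only an assumption about how such a fix could look.

```python
import subprocess
import sys
import threading


def run_with_timeout(cmd, timeout):
    """Run cmd and return (exit_code, timed_out).

    Hypothetical helper for illustration only, not the PR's API.
    """
    process = subprocess.Popen(cmd)
    timed_out = threading.Event()  # set by the watchdog when the timeout fires
    finished = threading.Event()   # set by the monitor once the process exited

    def watchdog():
        # Block until the process finished or the timeout elapsed.
        if not finished.wait(timeout):
            # Record the timeout BEFORE killing, so the flag is already
            # visible by the time the monitor sees the process exit.
            timed_out.set()
            process.kill()

    watcher = threading.Thread(target=watchdog, name="timeout-watchdog", daemon=True)
    watcher.start()

    exit_code = process.wait()
    finished.set()
    # Wait for the watchdog to finish before reading its verdict.
    watcher.join()
    return exit_code, timed_out.is_set()


if __name__ == "__main__":
    sleeper = [sys.executable, "-c", "import time; time.sleep(5)"]
    print(run_with_timeout(sleeper, timeout=0.5))
```

The two points that matter are the ordering inside `watchdog` and the `join()` in the caller: without them, the monitor could observe the killed process before the timeout flag is set.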
Since the last 3 commits, all tests that previously succeeded now fail with missing [EDIT] Seems to be okay now, something strange in the build farm I guess [/EDIT]
Sure, I'll update my servers tomorrow with the latest commits.
@ottorei Any news?
Sorry for the delay, just updated today. The new code has been running for a few hours now without noticeable issues. I will let it run for some time and check the journalctl logs for anything interesting.
I've been testing this on my dev server with 100 devices and I haven't had any issues.
@deajan ready to try to merge it?
@murrant I'm on the road right now.
I have been running this for the past few days now on a 5 server cluster with ~8000 devices. I checked the journalctl logs of the poller service on multiple servers for error lines and did not see any noticeable issues other than planned service breaks. Good job ;)
This pull request has been mentioned on LibreNMS Community. There might be relevant details there:
…s#13188)
* New service/discovery/poller wrapper
* Convert old wrapper scripts to bootstrap loaders for wrapper.py
* Move wrapper.py to LibreNMS module directory
* Reformat files
* File reformatting
* bootstrap files reformatting
* Fusion service and wrapper database connections and get_config_data functions
* Moved subprocess calls to command_runner
* LibreNMS library and __init__ fusion
* Reformat files
* Normalize logging use
* Reformatting code
* Fix missing argument for error log
* Fix refactor typo in DBConfig class
* Add default timeout for config.php data fetching
* distributed discovery should finish with a timestamp instead of an epoch
* Fix docstring inside dict prevents service key to work
* Fix poller insert statement
* Fix service wrapper typo
* Update docstring since we changed function behavior
* Normalize SQL statements
* Convert optparse to argparse
* Revert discovery thread number
* Handle debug logging
* Fix file option typo
* Reformat code
* Add credits to source package
* Rename logs depending on the wrapper type
* Cap max logfile size to 10MB
* Reformat code
* Add exception for Redis < 5.0
* Make sure we always log something from service
* Fix bogus description
* Add an error message on missing config file
* Improve error message when .env file cannot be loaded
* Improve wrapper logging
* Fix cron run may fail when environment path is not set
* Add missing -wrapper suffix for logs
* Conform to prior naming scheme
* Linter fix
* Add inline copy of command_runner
* Another linter fix
* Raise exception after logging
* Updated inline command_runner
* Add command_runner to requirements
* I guess I love linter fixes ;)
* Don't spawn more threads than devices
* Fix typo in log call
* Add exit codes to log on error, add command line to debug log
* Add thread name to error message
* Log errors in end message for easier debugging
* Typo fix
* In love of linting
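Two of the commits above ("Cap max logfile size to 10MB" and "Don't spawn more threads than devices") describe behavior that is easy to picture in code. The following is only a sketch under assumptions, not the wrapper's actual implementation: `effective_thread_count`, `build_logger`, `MAX_LOGFILE_SIZE`, the log format and the single backup file are all hypothetical choices; only `RotatingFileHandler` and its `maxBytes`/`backupCount` parameters come from the Python standard library.

```python
import logging
from logging.handlers import RotatingFileHandler

# 10MB cap, matching the size named in the commit list (constant name is hypothetical)
MAX_LOGFILE_SIZE = 10 * 1024 * 1024


def effective_thread_count(requested_threads, device_count):
    """Never start more worker threads than there are devices to process."""
    return min(requested_threads, device_count)


def build_logger(name, log_path):
    """Return a logger whose file is rotated once it reaches MAX_LOGFILE_SIZE."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    # Keeping one rotated backup is an assumption made for this sketch.
    handler = RotatingFileHandler(log_path, maxBytes=MAX_LOGFILE_SIZE, backupCount=1)
    handler.setFormatter(
        logging.Formatter("%(asctime)s :: %(levelname)s :: %(message)s")
    )
    logger.addHandler(handler)
    return logger
```

With 16 requested threads and 4 devices, `effective_thread_count(16, 4)` yields 4, so no idle workers are started.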
Please give a short description what your pull request is for
This PR is a full refactor of all the Python code, which has accumulated a lot of duplication over time.
It replaces #13090, #13094 and #13125.
The main aim is to make the codebase leaner and more maintainable, while hardening the code.
Hopefully, it should have no user-visible effect.
Enhancements:
Fixes:
Please note
Testers
If you would like to test this pull request then please run: ./scripts/github-apply <pr_id>, i.e. ./scripts/github-apply 5926
After you are done testing, you can remove the changes with ./scripts/github-remove. If there are schema changes, you can ask on discord how to revert.