- Reverting the ptrace removal changes and fixing the typo in requirements.txt
- Temporarily removing the ptrace dependency to try to fix Jenkins check failures
- Cleaning up mainthread.name from log messages
- Cleanup
- Find sys.executable dynamically inside forkRestart && Typo
- Revert the @emulator decorator related change for DataCollectorAPI
- Decrease the scan interval for AgentWatchdogScanner to once per hour.
- Adding an exception for AgentWatchdog when it comes to building the sets of running threads
- Fix N/A status in WMStats for components with bad threads
- Fix typo in shutdown function message causing string format error
- Implement the logic for restarting the timers of any still alive but previously delayed threads, considering the new setup of one timer per thread.
- Reduce AgentWatchdogPoller main cycle logging verbosity
- Split AgentWatchdogPoller and AgentWatchdogScanner polling cycles
- Add ASCII graph to the AgentWatchdogPoller docstring
- Skip creation of WorkQueueManagerReqMgrPoller timer for local WorkQueue
- Fix bug caused by missing dataLocationInterval config parameter
- Adjust AgentWatchdogPoller algorithm to the new timer per thread logic && Fix bugs for early component exits
- Set configsection per thread && Fix bad timer names
- Fix alert action execution bug && Improve timer logging and redirect exception logging as well && Delay AgentWatchdog threads startup
- Moving from timer per component to timer per thread && Setting default timer action as alert instead of forkRestart
- Fix wrong thread_info key in ProcessStatus exception
- Fix AlertManagerAPI bug
- Add alert functionality && Switch AgentWatchdogPoller's timers action from forkRestart to alert
- Add the AgentWatchdogScanner.py file
- Create the AgentWatchdogScanner Class and Thread
- Add type hints to WatchdogAction
- Fix log messages
- Make actionLimit configurable
- Wrap the timer modification methods in try: except: blocks to avoid losing timer threads and exception logging && Improve log message details
- Simplify the restartUpdateAction signature to avoid bypassing the timer callback signature checks at init time && Add timer and actionCounter reset capabilities && Reset only the timer value when an action is applied, and both timer and actionCounter when a signal is received, to avoid limited timer lifetime
- Reduce expPids lists && Update timers on every action && Improve logging
- Add restartUpdateAction wrapper && Fix a previously unnoticed bug in forkRestart && In the main thread, update the timer if it is still alive and restart it if dead
- Set the actionLimit mechanism in motion && Add Timer restart logic for components which have changed running states
- Add protection for active timers to timer.restart() in order to avoid thread duplication in the background
- Add an internal Timer.restart method for full reconfiguration and restart
- Add timer's actionCounter and actionLimit to control how many times an action can be applied before the timer gets destroyed
- Fix missing imports for ptrace.process exception handlers
- Fix ProcessTracers arguments
- Add the functionality for building basic ptrace-based tests to wmcoreDTools.isComponentAlive
- Add wmcoreDTools.shutdown optional params to the docstring && Print one forgotten error message
- Add a helper _loadConfig function to wmcoreDTools
- Add a serializable actionString as timer attribute
- Add missing self.config attributes to service instances which lack them
- Add python-ptrace to requirements.txt
- Add all components threads checks in the main algorithm
- Turn reset failure info messages into warnings for all components
- Add proper logic for finding the component name based on comparison of the current object namespace and all config sections namespaces.
- Add proper exception handling when calling getComponentThreads
- Move wmcoreD and wmcoreDTools from printouts to proper logging
- Enable AgentWatchdog for all components and threads
- Add mechanism to find the shortest pollInterval among all of a component's threads
- Implementing a mechanism for regular refresh of timers data on disk
- Fully enable the watchdog mechanism for ErrorHandler && Remove a minor bug with a missing process on reruns of AgentWatchdog polling cycle
- Finish implementation of resetWatchdogTimer && Move it under Utils.wmcoreDTools
- Preserve timers on disk
- Make Timer class as agnostic as possible && Prepare for future timer's data communication between components' threads
- Move Watchdog timers to a separate class && Inherit from Thread
- Rename resetTimer method to sigHandler
- Adding an extremely important comment line
- Fix main thread algorithm logic to avoid endless runs && Properly set polling cycles to fit the AgentWatchdog needs
- Add random factor to timers' intervals && Set default watchdog polling cycle to 1 min && Improve comments
- Move the main algorithm for AgentWatchdogPoller to be a blocking indefinite cycle && Implement the proper signal redirecting from the main thread to the timers
- Add basic re-configure timers logic in the main algorithm && Improve printouts
- Add protection for receiving signals for expired timers
- Developing the timers threads && Adding the basic signal handling mechanisms
- Separating AgentWatchdog as an independent component
- Add useWmcoreD flag to forkRestart && Add loadPath set and get methods to the Configuration class.
- Build robust calls to restart actions for AgentWatchdog
- Fix thread count logic in wmcoreDTools printout
- Fix startup logic and threads.json creation && Add proper parsing of threads.json to getComponentThreads.
- Avoid bash actions for fetching daemon status
- Add support for both config path and config objects && Fix connectionTest signature && fix old docstrings
- Adding the first lines for AgentWatchdog thread as part of AgentStatusWatcher component
- Rename checkComponentThreads to getComponentThreads
- Update pid and thread Info data structures && Add trace flag for the isComponentAlive calls
- Add option to check components by PIDS also
- Fix isComponentAlive docstring
- Fix orphan vs. lost threads logic
- Start building the construction for thread health checks
- Catch up with master changes && Add base isComponentAlive function signature
- Avoid sys.exit from inside functions
- Return full pidTree from wmcoreDTools.status
- Rename checkProcessThreads && Return orphanThreads as well
- Return list of running threads from checkProcessThreads
- Split wmcoreD in functional and actuator parts.
- Merge pull request #12440 from khurtado/gpufix
- Merge pull request #12442 from vlimant/vlimant-Extension
- add Extension as SubRequestType
- Unquote gpu related classads to be compatible with new requirements format:
- Merge pull request #12434 from amaltaro/fix-12421
- Test: turn SC_Nano template into a RelVal sub-request
- Make NanoAOD input data chunk policy consistent for RelVal workflows