2.4.4rc2

@todor-ivanov tagged this 24 Sep 07:11
  - Reverting the ptrace removing changes and fixing the typo in requirements.txt
  - Temporarily removing ptrace dependency to try to fix check Jenkins failures
  - Cleaning up mainthread.name from log messages
  - Cleanup
  - Find sys.executable dynamically inside forkRestart && Typo
  - Revert the @emulator decorator related change for DataCollectorAPI
  - Decrease the scan interval for AgentWatchdogScanner to once per hour.
  - Adding an exception for AgentWatchdog when it comes to building the sets of running threads
  - Fix N/A status in WMStats for components with bad threads
  - Fix typo in shutdown function message causing string format error
  - Implement the logic for restarting the timers for any still alive but previously delayed threads, considering the new setup of timer per thread.
  - Reduce AgentWatchdogPoller main cycle logging verbosity
  - Split AgentWatchdogPoller and AgentWatchdogScanner polling cycles
  - Add ASCII graph to the AgentWatchdogPoller docstring
  - Skip creation of WorkQueueManagerReqMgrPoller timer for local WorkQueue
  - Fix bug caused by missing dataLocationInterval config parameter
  - Adjust AgentWatchdogPoller algorithm to the new timer per thread logic && Fix bugs for early component exits
  - Set configsection per thread && Fix bad timer names
  - Fix alert action execution bug && Improve timer logging and redirect exception logging as well && Delay AgentWatchdog threads startup
  - Moving from timer per component to timer per thread && Setting default timer action as alert instead of forkRestart
  - Fix wrong thread_info key in ProcessStatus exception
  - Fix AlertManagerAPI bug
  - Add alert functionality && switch AgentWatchdogPoller's timers action from forkRestart to alert
  - Add the AgentWatchdogScanner.py file
  - Create the AgentWatchdogScanner Class and Thread
  - Add type hints to WatchdogAction
  - Fix log messages
  - Make actionLimit configurable
  - Wrap the timer modification methods in try: except: blocks to avoid losing timer threads and exception logging && Improve log message details
  - Simplify the restartUpdateAction signature to avoid bypassing the timer callback signature checks at init time && Add timer and actionCounter reset capabilities && Reset only timer value when action is applied and both timer and actionCounter when signal is received to avoid limited timer lifetime
  - Reduce expPids lists && Update timers on every action && Improve logging
  - Add restartUpdateAction wrapper && Fix a previously unnoticed bug in forkRestart && In main thread: update timer if it is still alive, restart it if dead
  - Set the actionLimit mechanism in motion && Add Timer restart logic for components which have changed running states
  - Add protection for active timers to timer.restart() in order to avoid thread duplication in the background
  - Add an internal Timer.restart method for full reconfiguration and restart
  - Add timer's actionCounter and actionLimit to control how many times an action can be applied before the timer gets destroyed
  - Fix missing imports for ptrace.process exception handlers
  - Fix ProcessTracers arguments
  - Add the functionality for building basic ptrace-based tests to wmcoreDTools.isComponentAlive
  - Add wmcoreDTools.shutdown optional params to the docstring && Print one forgotten error message
  - Add a helper _loadConfig function to wmcoreDTools
  - Add a serialisable actionString as timer attribute
  - Add missing self.config attributes to service instances that lack them
  - Add python-ptrace to requirements.txt
  - Add all components threads checks in the main algorithm
  - Turn reset failure info message to warnings for all components
  - Add proper logic for finding the component name based on comparison of the current object namespace and all config sections namespaces.
  - Add proper exception handling when calling getComponentThreads
  - Move wmcoreD and wmcoreDTools from printouts to proper logging
  - Enable AgentWatchdog for all components and threads
  - Add mechanism to find the shortest pollInterval among all component's threads
  - Implementing a mechanism for regular refresh of timers data on disk
  - Fully enable the watchdog mechanism for ErrorHandler && Remove a minor bug with a missing process on reruns of AgentWatchdog polling cycle
  - Finish implementation of resetWatchdogTimer && move it under Utils.wmcoreDTools
  - Preserve timers on disk
  - Make Timer class as agnostic as possible && Prepare for future timer's data communication between components' threads
  - Move Watchdog timers to a separate class && inherit from Thread
  - Rename resetTimer method to sigHandler
  - Adding an extremely important comment line
  - Fix main thread algorithm logic to avoid endless runs && Properly set polling cycles to fit the AgentWatchdog needs
  - Add random factor to timers' intervals && Set default watchdog polling cycle to 1 min && Improve comments
  - Move the main algorithm for AgentWatchdogPoller to be a blocking indefinite cycle && Implement the proper signal redirecting from the main thread to the timers
  - Add basic re-configure timers logic in the main algorithm && Improve printouts
  - Add protection for receiving signals for expired timers
  - Developing the timers threads && Adding the basic signal handling mechanisms
  - Separating AgentWatchdog as an independent component
  - Add useWmcoreD flag to forkRestart && Add loadPath set and get methods to the Configuration class.
  - Build robust calls to restart actions for AgentWatchdog
  - Fix thread count logic in wmcoreDTools printout logic
  - Fix startup logic and threads.json creation && Add proper parsing of threads.json to getComponentThreads.
  - Avoid bash actions for fetching daemon status
  - Add support for both config path and config objects && Fix connectionTest signature && fix old docstrings
  - Adding the first lines for AgentWatchdog thread as part of AgentStatusWatcher component
  - Rename checkComponentThreads to getComponentThreads
  - Update pid and thread Info data structures && Add trace flag for the isComponentAlive calls
  - Add option to check components by PIDS also
  - Fix isComponentAlive docstring
  - Fix orphan vs. lost threads logic
  - Start building the construction for thread health checks
  - Catch up with master changes && Add base isComponentAlive function signature
  - Avoid sys.exit from inside functions
  - Return full pidTree from wmcoreDTools.status
  - Rename checkProcessThreads && return orphanThreads as well
  - Return list of running threads from checkProcessThreads
  - Split wmcoreD in functional and actuator parts.
  - Merge pull request #12440 from khurtado/gpufix
  - Merge pull request #12442 from vlimant/vlimant-Extension
  - add Extension as SubRequestType
  - Unquote gpu related classads to be compatible with new requirements format:
  - Merge pull request #12434 from amaltaro/fix-12421
  - Test: turn SC_Nano template into a RelVal sub-request
  - Make NanoAOD input data chunk policy consistent for RelVal workflows
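
Several entries above describe a watchdog timer that runs as its own thread, applies an action when it expires, and is destroyed once an actionLimit is reached unless the watched thread signals in time. The following is a minimal sketch of that idea only, not the WMCore Timer class: the class name `WatchdogTimerSketch`, its methods, and the use of `threading.Event` in place of OS signals are all assumptions made for illustration.

```python
import threading


class WatchdogTimerSketch(threading.Thread):
    """Hypothetical per-thread watchdog timer (illustration only, not WMCore code)."""

    def __init__(self, name, interval, action, actionLimit=3):
        super().__init__(name=name, daemon=True)
        self.interval = interval        # seconds to wait before applying the action
        self.action = action            # callable taking the timer name
        self.actionLimit = actionLimit  # max actions before the timer ends
        self.actionCounter = 0
        self._resetEvent = threading.Event()  # stands in for the reset signal
        self._stopEvent = threading.Event()

    def reset(self):
        """Watched thread is alive: reset both the timer and the action counter."""
        self.actionCounter = 0
        self._resetEvent.set()

    def stop(self):
        """Ask the timer thread to exit without applying any further action."""
        self._stopEvent.set()
        self._resetEvent.set()

    def run(self):
        while not self._stopEvent.is_set() and self.actionCounter < self.actionLimit:
            # wait() returns True if reset/stop arrived before the interval expired
            if self._resetEvent.wait(self.interval):
                self._resetEvent.clear()
                continue
            # Interval expired: apply the action; only the timer value restarts,
            # the counter keeps growing until a reset arrives.
            self.actionCounter += 1
            self.action(self.name)
```

In this sketch an alert-style action would simply be a callable such as `lambda name: logging.warning("timer %s expired", name)`; a restart-style action would be passed in the same way.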