Stackstorm jobs are stuck and not running #6013

@nehadubx

SUMMARY

StackStorm jobs get stuck and the platform hangs when 400+ alerts are received in the same time interval.
Issue: Alerts get stuck in the Scheduled or Delayed state
Volume: 400+ alerts
Pattern: Each alert is meant to run sequentially

STACKSTORM VERSION

st2 --version: st2 3.7.0, on Python 3.8.10

OS, environment, install method

ST2: Docker in our CaaS container.
Kubernetes HA. System requirements are aligned with the ST2 documentation.

Steps to reproduce the problem

The issue occurs when we receive alerts of the same type for the same host and check value, which is the pair we use for correlation. When multiple such alerts arrive within the same minute, the first alert starts running and all the others hang. We have a volume of 400+ alerts coming into StackStorm that require execution.
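
To make the pattern concrete, here is a small Python model of the behavior we see. It is not StackStorm code: the function `schedule`, the alert fields `id`, `host` and `check`, and the `threshold` of one execution per correlation key are all assumptions made for illustration. It only shows how a burst of alerts sharing one host and check value ends up with a single running entry and everything else waiting.

```python
from collections import defaultdict


def schedule(alerts, threshold=1):
    """Split alert ids into running and delayed, per (host, check) key.

    Hypothetical model only: at most `threshold` alerts per key may run,
    and every further alert with the same key is held back as delayed.
    """
    running = defaultdict(list)
    delayed = defaultdict(list)
    for alert in alerts:
        key = (alert["host"], alert["check"])
        if len(running[key]) < threshold:
            running[key].append(alert["id"])
        else:
            delayed[key].append(alert["id"])
    return dict(running), dict(delayed)


# A burst of 400 alerts for the same (made-up) host and check value.
burst = [{"id": i, "host": "host-a", "check": "disk"} for i in range(400)]
running, delayed = schedule(burst)
print(len(running[("host-a", "disk")]), "running,",
      len(delayed[("host-a", "disk")]), "delayed")
```

In this model the delayed alerts would be picked up one at a time as the running one finishes. In our environment they stay stuck instead.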

Expected Results

Alerts should not get stuck and should run as expected.

Actual Results

Alerts get stuck and then require manual cleanup and handling.

Need immediate support.

Thanks!
