Traffic Stats does not failover to another TM when one stops responding #5942

Closed
mabrodis opened this issue Jun 15, 2021 · 0 comments · Fixed by #5983
Labels: bug (something isn't working as intended), Traffic Stats (related to Traffic Stats)
Milestone: 6.0.0

Comments

@mabrodis

I'm submitting a ...

  • bug report

Traffic Control components affected ...

  • Traffic Stats

Current behavior:

Traffic Stats (TS) does not gracefully handle a Traffic Monitor (TM) that stops responding. Currently, if the TM that TS is getting data from goes away, TS stops collecting stats and writes no stats, data, or metrics to its output platform (e.g. InfluxDB).

Expected behavior:

Ideally, Traffic Stats should keep a list of Online TMs that it can get data from. If one does not respond, it should move on to another one, continue collecting stats, and keep writing them to whatever the desired output platform is.

Minimal reproduction of the problem with instructions:

  1. Have TS running with two Online TMs.
  2. Stop TM1. If TS does not log an error, start TM1 again and stop TM2.
  3. Once you stop the TM that TS is connecting to, messages like these stop appearing in traffic_stats.log:
    [INFO] 2021-06-15 21:03:40 Collected XXX deliveryservice stats values for cdn @ 1623791020
    [INFO] 2021-06-15 21:03:40 Collected XXXX cache stats values for cdn @ 1623791020
  4. Those messages stop because TS is no longer collecting stats: the TM it was using is not responding, and TS does not try the other TM. It appears to be stuck trying to use the one that is no longer responding.

Versions in the test:
TS: 3.0
TM: 5.2

Anything else:

@mabrodis mabrodis added the bug something isn't working as intended label Jun 15, 2021
@ocket8888 ocket8888 added the Traffic Stats related to Traffic Stats label Jun 16, 2021
@rawlinp rawlinp added this to the 6.0.0 milestone Jun 16, 2021