Prometheus: Duplicated Timeseries in CollectorRegistry #83802

Closed
veleek opened this issue Dec 12, 2022 · 6 comments

Comments

@veleek
Contributor

veleek commented Dec 12, 2022

The problem

The Prometheus integration is unexpectedly raising errors. The error is not actionable for the user, so it is not clear what caused the problem or how to fix it.

What version of Home Assistant Core has the issue?

core-2022.12.0

What was the last working version of Home Assistant Core?

No response

What type of installation are you running?

Home Assistant Container

Integration causing the issue

Prometheus

Link to integration documentation on our website

https://www.home-assistant.io/integrations/prometheus/

Diagnostics information

No response

Example YAML snippet

No response

Anything in the logs that might be useful for us?

2022-12-12 04:34:54.548 ERROR (MainThread) [homeassistant] Error doing job: Future exception was never retrieved
Traceback (most recent call last):
  File "/usr/src/homeassistant/homeassistant/components/prometheus/__init__.py", line 258, in _metric
    return self._metrics[metric]
KeyError: 'sensor_unit_floors'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.10/concurrent/futures/thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/usr/src/homeassistant/homeassistant/components/prometheus/__init__.py", line 176, in handle_state_changed
    getattr(self, handler)(state)
  File "/usr/src/homeassistant/homeassistant/components/prometheus/__init__.py", line 507, in _handle_sensor
    _metric = self._metric(metric, self.prometheus_cli.Gauge, documentation)
  File "/usr/src/homeassistant/homeassistant/components/prometheus/__init__.py", line 263, in _metric
    self._metrics[metric] = factory(
  File "/usr/local/lib/python3.10/site-packages/prometheus_client/metrics.py", line 316, in __init__
    super(Gauge, self).__init__(
  File "/usr/local/lib/python3.10/site-packages/prometheus_client/metrics.py", line 107, in __init__
    registry.register(self)
  File "/usr/local/lib/python3.10/site-packages/prometheus_client/registry.py", line 27, in register
    raise ValueError(
ValueError: Duplicated timeseries in CollectorRegistry: {'hass_sensor_unit_floors'}

Additional information

No response

@home-assistant

Hey there @knyar, mind taking a look at this issue as it has been labeled with an integration (prometheus) you are listed as a code owner for? Thanks!

Code owner commands

Code owners of prometheus can trigger bot actions by commenting:

  • @home-assistant close Closes the issue.
  • @home-assistant rename Awesome new title Change the title of the issue.
  • @home-assistant reopen Reopen the issue.
  • @home-assistant unassign prometheus Removes the current integration label and assignees on the issue, add the integration domain after the command.

(message by CodeOwnersMention)


prometheus documentation
prometheus source
(message by IssueLinks)

@knyar
Contributor

knyar commented Dec 14, 2022

Thank you for reporting this. Looking at the code in question, it seems that creating duplicate metrics should only be possible if _metric gets called concurrently with the same metric name (two threads can hit the KeyError exception at the same time and each attempt to create a new metric with the same name). I am not really familiar with the Home Assistant concurrency model, but currently the Prometheus component does not use any synchronization primitives and mostly assumes sequential execution.

Do you have a set of steps that I could use to reproduce this on a fresh Home Assistant instance? That would help with investigating and fixing it.
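
For anyone following along, the check-then-create race described above can be sketched outside Home Assistant. This is only an illustration, not the integration's code: FakeRegistry, UnsafeMetrics and run_race are hypothetical names, the registry is a standard-library stand-in rather than prometheus_client, and the barrier exists purely to force both threads past the cache lookup before either one stores a metric.

```python
import threading


class FakeRegistry:
    """Hypothetical stand-in for a collector registry that rejects duplicate names."""

    def __init__(self):
        self._names = set()
        self._lock = threading.Lock()

    def register(self, name):
        with self._lock:
            if name in self._names:
                raise ValueError(f"Duplicated timeseries in registry: {name}")
            self._names.add(name)


class UnsafeMetrics:
    """Check-then-create cache with no synchronization (illustrative only)."""

    def __init__(self, registry, barrier):
        self._registry = registry
        self._metrics = {}
        self._barrier = barrier  # demo hook: holds both threads after the lookup

    def metric(self, name):
        try:
            return self._metrics[name]
        except KeyError:
            pass
        # By the time the barrier releases, both threads have missed the cache.
        self._barrier.wait(timeout=5)
        self._registry.register(name)  # the second caller raises ValueError
        self._metrics[name] = object()
        return self._metrics[name]


def run_race():
    """Call metric() from two threads at once and return the exceptions raised."""
    metrics = UnsafeMetrics(FakeRegistry(), threading.Barrier(2))
    errors = []

    def worker():
        try:
            metrics.metric("hass_sensor_unit_floors")
        except ValueError as exc:
            errors.append(exc)

    threads = [threading.Thread(target=worker) for _ in range(2)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return errors


if __name__ == "__main__":
    for error in run_race():
        print(error)
```

In a real deployment there is no barrier, so the overlap depends on timing, which would be consistent with the error appearing only intermittently.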

@veleek
Contributor Author

veleek commented Dec 19, 2022

Hey @knyar, sorry. I swear this was reproducing continually on my install, but I can't seem to get it to happen anymore. If I manage to get it to occur again I'll try to investigate further myself, but maybe it was just a transient issue. I say close the bug for now unless there's a reasonable way for you to handle it gracefully.

Sorry for the randomization! Feel free to ping me for a PR in return for your time!

@lmvlmv

lmvlmv commented Feb 22, 2023

If helpful, I've just been fiddling with Home Assistant and Prometheus and ran into this. It might be relevant that the RPi running hass is not very powerful, and Home Assistant complains that "recorder" is taking a long time to start. Perhaps that's leading to a timing issue? I did briefly see some values show up in Prometheus, so this can work...

Logs from the nomadjob/container:

s6-rc: info: service legacy-services successfully started
2023-02-22 10:56:25.211 WARNING (MainThread) [homeassistant.setup] Setup of recorder is taking over 10 seconds.
2023-02-22 10:56:46.408 ERROR (MainThread) [homeassistant] Error doing job: Future exception was never retrieved
Traceback (most recent call last):
  File "/usr/src/homeassistant/homeassistant/components/prometheus/__init__.py", line 262, in _metric
    return self._metrics[metric]
KeyError: 'device_tracker_state'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.10/concurrent/futures/thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/usr/src/homeassistant/homeassistant/components/prometheus/__init__.py", line 180, in handle_state_changed
    getattr(self, handler)(state)
  File "/usr/src/homeassistant/homeassistant/components/prometheus/__init__.py", line 364, in _handle_device_tracker
    metric = self._metric(
  File "/usr/src/homeassistant/homeassistant/components/prometheus/__init__.py", line 267, in _metric
    self._metrics[metric] = factory(
  File "/usr/local/lib/python3.10/site-packages/prometheus_client/metrics.py", line 316, in __init__
    super(Gauge, self).__init__(
  File "/usr/local/lib/python3.10/site-packages/prometheus_client/metrics.py", line 107, in __init__
    registry.register(self)
  File "/usr/local/lib/python3.10/site-packages/prometheus_client/registry.py", line 27, in register
    raise ValueError(
ValueError: Duplicated timeseries in CollectorRegistry: {'hass_device_tracker_state'}
2023-02-22 10:56:46.429 ERROR (MainThread) [homeassistant] Error doing job: Future exception was never retrieved
Traceback (most recent call last):
  File "/usr/src/homeassistant/homeassistant/components/prometheus/__init__.py", line 262, in _metric
    return self._metrics[metric]
KeyError: 'device_tracker_state'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.10/concurrent/futures/thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/usr/src/homeassistant/homeassistant/components/prometheus/__init__.py", line 180, in handle_state_changed
    getattr(self, handler)(state)
  File "/usr/src/homeassistant/homeassistant/components/prometheus/__init__.py", line 364, in _handle_device_tracker
    metric = self._metric(
  File "/usr/src/homeassistant/homeassistant/components/prometheus/__init__.py", line 267, in _metric
    self._metrics[metric] = factory(
  File "/usr/local/lib/python3.10/site-packages/prometheus_client/metrics.py", line 316, in __init__
    super(Gauge, self).__init__(
  File "/usr/local/lib/python3.10/site-packages/prometheus_client/metrics.py", line 107, in __init__
    registry.register(self)
  File "/usr/local/lib/python3.10/site-packages/prometheus_client/registry.py", line 27, in register
    raise ValueError(
ValueError: Duplicated timeseries in CollectorRegistry: {'hass_device_tracker_state'}
2023-02-22 10:56:46.434 ERROR (MainThread) [homeassistant] Error doing job: Future exception was never retrieved
Traceback (most recent call last):
  File "/usr/src/homeassistant/homeassistant/components/prometheus/__init__.py", line 262, in _metric
    return self._metrics[metric]
KeyError: 'device_tracker_state'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.10/concurrent/futures/thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/usr/src/homeassistant/homeassistant/components/prometheus/__init__.py", line 180, in handle_state_changed
    getattr(self, handler)(state)
  File "/usr/src/homeassistant/homeassistant/components/prometheus/__init__.py", line 364, in _handle_device_tracker
    metric = self._metric(
  File "/usr/src/homeassistant/homeassistant/components/prometheus/__init__.py", line 267, in _metric
    self._metrics[metric] = factory(
  File "/usr/local/lib/python3.10/site-packages/prometheus_client/metrics.py", line 316, in __init__
    super(Gauge, self).__init__(
  File "/usr/local/lib/python3.10/site-packages/prometheus_client/metrics.py", line 107, in __init__
    registry.register(self)
  File "/usr/local/lib/python3.10/site-packages/prometheus_client/registry.py", line 27, in register
    raise ValueError(
ValueError: Duplicated timeseries in CollectorRegistry: {'hass_device_tracker_state'}

@knyar
Contributor

knyar commented Feb 22, 2023

This seems like a duplicate of #80656. If someone would like to prepare a PR introducing locking, I'll be happy to review.
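
A minimal sketch of what such locking might look like, assuming a single threading.Lock guarding both the lookup and the creation. LockedMetrics, the factory callable and demo are hypothetical names for illustration; this is not the change that was made in Home Assistant, and the right primitive there depends on its concurrency model.

```python
import threading


class LockedMetrics:
    """Metric cache where lookup and creation happen under one lock."""

    def __init__(self, factory):
        self._factory = factory  # hypothetical callable that builds a metric
        self._metrics = {}
        self._lock = threading.Lock()

    def metric(self, name):
        # Holding the lock across the check and the create means only one
        # thread can ever observe the missing key for a given name.
        with self._lock:
            try:
                return self._metrics[name]
            except KeyError:
                created = self._factory(name)
                self._metrics[name] = created
                return created


def demo(thread_count=8):
    """Request the same metric from several threads; return factory calls and results."""
    calls = []

    def factory(name):
        calls.append(name)
        return object()

    metrics = LockedMetrics(factory)
    results = []

    def worker():
        results.append(metrics.metric("hass_device_tracker_state"))

    threads = [threading.Thread(target=worker) for _ in range(thread_count)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return calls, results


if __name__ == "__main__":
    calls, results = demo()
    print(len(calls), len(results))
```

The factory runs once no matter how many threads ask for the same name, so a registry that rejects duplicates is never asked to register it twice. The trade-off is that all metric lookups are serialized behind one lock.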

@issue-triage-workflows

There hasn't been any activity on this issue recently. Due to the high number of incoming GitHub notifications, we have to clean some of the old issues, as many of them have already been resolved with the latest updates.
Please make sure to update to the latest Home Assistant version and check if that solves the issue. Let us know if that works for you by adding a comment 👍
This issue has now been marked as stale and will be closed if no further activity occurs. Thank you for your contributions.

@issue-triage-workflows issue-triage-workflows bot closed this as not planned (won't fix, can't repro, duplicate, stale) May 30, 2023
@github-actions github-actions bot locked and limited conversation to collaborators Jun 29, 2023