Fix compiling missing statistics losing rows #101616

Merged
frenck merged 17 commits into dev from orm_rows_short_term_stats on Oct 8, 2023

Conversation

bdraco
Member

@bdraco bdraco commented Oct 7, 2023

Note this is easier to review without white space since there is a context manager removed

Proposed change

When we compiled missing statistics, we would create a session inside another session. Since SQLAlchemy sessions are thread local when used with the scoped_session context manager, the inner session would call session.close() when it finished and delete all the pending inserts in the outer session (because it's actually the same session, since it's the same thread). The end result was that all the stats would get lumped together.
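The failure mode described above can be sketched with a toy model. This is not SQLAlchemy or the recorder code: `ToySession`, `get_session`, `toy_session_scope`, `read_latest_stats`, and `compile_with_nested_scope` are hypothetical names, and the only assumptions are the ones the description states (one session per thread, and `close()` discarding uncommitted rows).

```python
import threading
from contextlib import contextmanager


class ToySession:
    """Hypothetical stand-in for an ORM session: rows stay pending until commit."""

    def __init__(self):
        self.pending = []
        self.committed = []

    def add(self, row):
        self.pending.append(row)

    def commit(self):
        self.committed.extend(self.pending)
        self.pending.clear()

    def close(self):
        # Closing discards anything that was never committed.
        self.pending.clear()


_registry = threading.local()


def get_session():
    """Return the single session bound to the current thread."""
    if not hasattr(_registry, "session"):
        _registry.session = ToySession()
    return _registry.session


@contextmanager
def toy_session_scope(read_only=False):
    session = get_session()
    try:
        yield session
        if not read_only:
            session.commit()
    finally:
        session.close()


def read_latest_stats():
    # Opens its "own" scope, but on this thread it is the caller's session.
    with toy_session_scope(read_only=True) as inner:
        return list(inner.committed)


def compile_with_nested_scope():
    with toy_session_scope() as outer:
        outer.add("stat-1")
        outer.add("stat-2")
        read_latest_stats()  # the inner close() drops both pending rows
        outer.add("stat-3")
    return get_session().committed


result = compile_with_nested_scope()
print(result)
```

In this model only the row added after the nested read survives the commit, which mirrors the lost-rows symptom reported in the linked issue.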

This changes compiling platform stats to pass the session to each platform so we avoid the session within a session pattern. It's possible a custom component may implement its own platform stats, which would make this a breaking change, but that seems unlikely, and fixing this is likely worth any fallout if that is the case.

This change refactors compile_missing_statistics to use get_latest_short_term_statistics_with_session, which uses the same session for the whole process. This avoids the session within a session pattern and ensures session.close() only happens when we are actually finished.
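A minimal sketch of the "pass the session down" pattern, again as a toy model rather than the actual recorder code. `ToySession`, `toy_session_scope`, `latest_stats_with_session`, and `compile_with_shared_session` are hypothetical names; the point is only that the helper borrows the caller's session and never closes it.

```python
from contextlib import contextmanager


class ToySession:
    """Hypothetical stand-in for an ORM session: rows stay pending until commit."""

    def __init__(self):
        self.pending = []
        self.committed = []

    def add(self, row):
        self.pending.append(row)

    def commit(self):
        self.committed.extend(self.pending)
        self.pending.clear()

    def close(self):
        # Closing discards anything that was never committed.
        self.pending.clear()


@contextmanager
def toy_session_scope(session):
    """Commit on success and close exactly once, at the outermost level."""
    try:
        yield session
        session.commit()
    finally:
        session.close()


def latest_stats_with_session(session):
    # Borrows the caller's session: no new scope, no close().
    return list(session.committed)


def compile_with_shared_session(session):
    with toy_session_scope(session) as outer:
        outer.add("stat-1")
        outer.add("stat-2")
        latest_stats_with_session(outer)  # pending rows are left untouched
        outer.add("stat-3")
    return session.committed


result = compile_with_shared_session(ToySession())
print(result)
```

Because only the outermost scope owns the session's lifetime, all three rows reach the commit in this model.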

We had no explicit tests for compile_missing_statistics (only indirect coverage). They have been added to ensure this does not regress in the future.

fixes #101613

Type of change

  • Dependency upgrade
  • Bugfix (non-breaking change which fixes an issue)
  • New integration (thank you!)
  • New feature (which adds functionality to an existing integration)
  • Deprecation (breaking change to happen in the future)
  • Breaking change (fix/feature causing existing functionality to break)
  • Code quality improvements to existing code or addition of tests

Additional information

  • This PR fixes or closes issue: fixes #
  • This PR is related to issue:
  • Link to documentation pull request:

Checklist

  • The code change is tested and works locally.
  • Local tests pass. Your PR cannot be merged unless tests pass
  • There is no commented out code in this PR.
  • I have followed the development checklist
  • I have followed the perfect PR recommendations
  • The code has been formatted using Black (black --fast homeassistant tests)
  • Tests have been added to verify that the new code works.

If the code communicates with devices, web services, or third-party tools:

  • The manifest file has all fields filled out correctly.
    Updated and included derived files by running: python3 -m script.hassfest.
  • New or updated dependencies have been added to requirements_all.txt.
    Updated by running python3 -m script.gen_requirements_all.
  • For the updated dependencies - a link to the changelog, or at minimum a diff between library versions is added to the PR description.
  • Untested files have been added to .coveragerc.

@home-assistant

home-assistant bot commented Oct 7, 2023

Hey there @home-assistant/core, mind taking a look at this pull request as it has been labeled with an integration (recorder) you are listed as a code owner for? Thanks!

Code owner commands

Code owners of recorder can trigger bot actions by commenting:

  • @home-assistant close Closes the pull request.
  • @home-assistant rename Awesome new title Renames the pull request.
  • @home-assistant reopen Reopen the pull request.
  • @home-assistant unassign recorder Removes the current integration label and assignees on the pull request, add the integration domain after the command.

@bdraco
Member Author

bdraco commented Oct 7, 2023

It looks like we don't have good coverage for compile_missing_statistics

@bdraco
Member Author

bdraco commented Oct 8, 2023

Either I've screwed up the test or I'm actually hitting the bug in #101613

I think I'm actually hitting the bug 👍

Anyways I'm about 9 hours into this today so I need to pick it back up tomorrow

@bdraco
Member Author

bdraco commented Oct 8, 2023

So the problem here is that we create sessions inside sessions. When .close() gets called on the inner session, it loses the data in the session, and since the session is per thread, closing the session from the inner query deletes the pending inserts.

@bdraco bdraco changed the title Always fetch orm rows for latest short term stats Fix compiling missing statistics loosing rows Oct 8, 2023
@bdraco bdraco changed the title Fix compiling missing statistics loosing rows Fix compiling missing statistics losing rows Oct 8, 2023
@bdraco bdraco added this to the 2023.10.2 milestone Oct 8, 2023
Comment on lines -384 to -391
# There is already an active session when this code is called since
# it is called from the recorder statistics. We need to make sure
# this session never gets committed since it would be out of sync
# with the recorder statistics session so we mark it as read only.
#
# If we ever need to write to the database from this function we
# will need to refactor the recorder statistics to use a single
# session.
Member Author

@bdraco bdraco Oct 8, 2023

I should have refactored this as soon as I noticed this was happening, when I added this comment in f6f3565. That commit wasn't the source of the problem, but it's when I noticed this pattern.

I added the comment because I was worried this was a bit brittle, but I didn't realize at the time that it was actually a problem as well, because I didn't understand the full impact of the nested sessions.

The irony is that I was so concerned about refactoring risk that I under-estimated the impact here, even though I thought it was enough of a problem to add this comment 🤦

Comment on lines +453 to 455
last_stats = statistics.get_latest_short_term_statistics_with_session(
    hass, session, to_query, {"last_reset", "state", "sum"}, metadata=old_metadatas
)
Member Author

@bdraco bdraco Oct 8, 2023

This was the only call that created a new session here, which is the source of the issue.

@bdraco
Member Author

bdraco commented Oct 8, 2023

Ran on production overnight.

All good

I'm much more confident in this code now that we have explicit coverage for it.

@bdraco bdraco marked this pull request as ready for review October 8, 2023 16:06
@bdraco bdraco requested a review from a team as a code owner October 8, 2023 16:06
@frenck frenck merged commit c6ed022 into dev Oct 8, 2023
34 checks passed
@frenck frenck deleted the orm_rows_short_term_stats branch October 8, 2023 17:43
@bdraco
Member Author

bdraco commented Oct 8, 2023

Thanks

@github-actions github-actions bot locked and limited conversation to collaborators Oct 9, 2023
Development

Successfully merging this pull request may close these issues.

Energy dashboard displays incorrects statistics in 2023.10.1 (after daily restart at 4am)
2 participants