Fix extraneous cloud sessions #1504
Merged
…at are being aggregated.
…e being re-aggregated
When you re-aggregate a row, it should automatically remove the corresponding rows from the session table. This is done in the
I think I have found the answer to my question. But the question now is: why didn't you use the dedicated aggregation class that is designed for handling aggregation with session|job tables?
jpwhite4 approved these changes on Mar 19, 2021
https://app.asana.com/0/808093868887967/1199187338120726
This fixes an issue where extra sessions may exist in the session_records table when ingesting data to fill a time gap where data was not ingested. An example of such a gap is the file copying not running on a given day, followed by a catch-up run later. The CloudStateReconstructorTransformIngestor constructs all sessions for all instances that have an event greater than the last_modified_start_date specified when you run xdmod-ingestor or the relevant etl_overseer.php command, and places them in the event_reconstructed table.

To make sure we have the correct sessions, this change removes all the sessions for instances we are aggregating from the session_records table. This means that when the cloud-session-records action runs, it will load all the sessions from the event_reconstructed table, which has the correct sessions for each instance and none of the extra sessions.

It also removes rows from cloudfact_by_day_sessionlist that have session_ids tied to instances that are being aggregated. If this were not done, cloudfact_by_day_sessionlist would contain rows with session_ids that no longer exist.

Motivation and Context
Preventing incorrect sessions from existing in the session_records table

Tests performed
Tested in docker
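
For readers who want to see the cleanup order described above in miniature, here is a rough sketch using Python's sqlite3 module and an in-memory database. It is not the code in this PR: the column names (instance_id, start_time, end_time, agg_id), the function names rebuild_sessions and demo, and the sample rows are all assumptions made for illustration. In XDMoD the reload is performed by the cloud-session-records action, not by a single function like this one.

```python
import sqlite3


def rebuild_sessions(conn, instance_ids):
    """Drop stale sessions for the given instances, then reload them.

    Hypothetical helper: table names follow the PR description, but the
    columns and this function are illustrative assumptions.
    """
    if not instance_ids:
        return
    marks = ",".join("?" for _ in instance_ids)
    with conn:  # run all three statements in one transaction
        # 1. Remove session-list rows first, while the session_ids they
        #    point at can still be looked up in session_records.
        conn.execute(
            "DELETE FROM cloudfact_by_day_sessionlist WHERE session_id IN "
            f"(SELECT session_id FROM session_records "
            f"WHERE instance_id IN ({marks}))",
            instance_ids,
        )
        # 2. Remove every existing session for the affected instances.
        conn.execute(
            f"DELETE FROM session_records WHERE instance_id IN ({marks})",
            instance_ids,
        )
        # 3. Reload sessions from the reconstructed events (stand-in for
        #    what the cloud-session-records action does).
        conn.execute(
            "INSERT INTO session_records (instance_id, start_time, end_time) "
            "SELECT instance_id, start_time, end_time "
            f"FROM event_reconstructed WHERE instance_id IN ({marks})",
            instance_ids,
        )


def demo():
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE event_reconstructed (
            instance_id INTEGER, start_time INTEGER, end_time INTEGER);
        CREATE TABLE session_records (
            session_id INTEGER PRIMARY KEY,
            instance_id INTEGER, start_time INTEGER, end_time INTEGER);
        CREATE TABLE cloudfact_by_day_sessionlist (
            agg_id INTEGER, session_id INTEGER);
        """
    )
    # After a catch-up run, instance 1 has a single reconstructed session.
    conn.execute("INSERT INTO event_reconstructed VALUES (1, 10, 50)")
    # session_records still holds two older sessions for instance 1,
    # plus one session for instance 2 that must not be touched.
    conn.executemany(
        "INSERT INTO session_records VALUES (?, ?, ?, ?)",
        [(1, 1, 10, 20), (2, 1, 30, 50), (3, 2, 5, 15)],
    )
    conn.executemany(
        "INSERT INTO cloudfact_by_day_sessionlist VALUES (?, ?)",
        [(100, 1), (100, 2), (101, 3)],
    )
    rebuild_sessions(conn, [1])
    sessions = conn.execute(
        "SELECT instance_id, start_time, end_time FROM session_records "
        "ORDER BY instance_id, start_time"
    ).fetchall()
    dangling = conn.execute(
        "SELECT COUNT(*) FROM cloudfact_by_day_sessionlist WHERE session_id "
        "NOT IN (SELECT session_id FROM session_records)"
    ).fetchone()[0]
    return sessions, dangling
```

In this sketch the session-list rows are deleted before the sessions themselves, because the lookup from instance to session_id depends on session_records still holding the old rows. The sketch does not repopulate cloudfact_by_day_sessionlist; that is assumed to happen during aggregation.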
Checklist: