fix(reports): Clear last value when state is WORKING #19941

john-bodley · 2022-05-04T03:17:23Z

SUMMARY

This PR fixes an issue with the alert feature, specifically the execution log reports different alert values for different states within the same execution run. Per this Slack thread the underlying issue is the report schedule synopsis—from which the execution log is created, i.e., a point-in-time snapshot—is when the state is WORKING the last_value and last_value_row_json need to be cleared otherwise they incorrectly reflect the values of the previous execution run.

I’m actually not sure whether there’s any value logging the WORKING state to the execution log. Does it provide any relevant information to the user?

BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF

Before Migration

After Migration

TESTING INSTRUCTIONS

Added unit tests and migration. The migration successfully ran in under a minute on a replica of Airbnb's production database with ~ 50,000 execution logs.

ADDITIONAL INFORMATION

Has associated issue:
Required feature flags:
Changes UI
Includes DB Migration (follow approval process in SIP-59)
- Migration is atomic, supports rollback & is backwards-compatible
- Confirm DB migration upgrade and downgrade tested
- Runtime estimates and downtime expectations provided
Introduces new feature or API
Removes existing feature or API

john-bodley · 2022-05-04T03:18:27Z

superset/reports/commands/execute.py

@@ -94,35 +94,39 @@ def __init__(
        self._start_dttm = datetime.utcnow()
        self._execution_id = execution_id

-    def set_state_and_log(
+    def update_report_schedule_and_log(


I felt that update_report_schedule_and_log was more reflective of the function logic.

john-bodley · 2022-05-04T03:20:01Z

superset/reports/commands/execute.py

        self._report_schedule.last_state = state
-        self._report_schedule.last_eval_dttm = dttm
+        self._report_schedule.last_eval_dttm = datetime.utcnow()


There's no need to pass in the timestamp which isn't used elsewhere.

john-bodley · 2022-05-04T03:20:54Z

superset/reports/commands/execute.py

@@ -132,7 +136,7 @@ def create_log(
            end_dttm=datetime.utcnow(),
            value=self._report_schedule.last_value,
            value_row_json=self._report_schedule.last_value_row_json,
-            state=state,
+            state=self._report_schedule.state,


There's no need to pass in the state given the report schedule record has been updated with the new state.

codecov · 2022-05-04T03:24:20Z

Codecov Report

Merging #19941 (17ed438) into master (aff10a7) will decrease coverage by 0.19%.
The diff coverage is 86.66%.

@@            Coverage Diff             @@
##           master   #19941      +/-   ##
==========================================
- Coverage   66.52%   66.33%   -0.20%     
==========================================
  Files        1714     1712       -2     
  Lines       65052    64049    -1003     
  Branches     6722     6722              
==========================================
- Hits        43279    42489     -790     
+ Misses      20061    19848     -213     
  Partials     1712     1712

Flag	Coverage Δ
hive	`53.67% <20.00%> (+0.76%)`	⬆️
mysql	`82.03% <86.66%> (+0.09%)`	⬆️
postgres	`82.09% <86.66%> (+0.10%)`	⬆️
presto	`53.53% <20.00%> (+0.77%)`	⬆️
python	`82.52% <86.66%> (+0.11%)`	⬆️
sqlite	`81.83% <86.66%> (+0.08%)`	⬆️
unit	`49.03% <0.00%> (+0.99%)`	⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files	Coverage Δ
superset/reports/commands/execute.py	`91.60% <86.66%> (+0.09%)`	⬆️
superset/views/users/api.py	`86.20% <0.00%> (-2.69%)`	⬇️
superset/datasets/commands/importers/v0.py	`91.47% <0.00%> (-2.15%)`	⬇️
superset/connectors/base/models.py	`86.70% <0.00%> (-1.95%)`	⬇️
superset/tasks/scheduler.py	`67.39% <0.00%> (-1.50%)`	⬇️
superset/models/helpers.py	`90.40% <0.00%> (-0.74%)`	⬇️
superset/sql_validators/presto_db.py	`82.43% <0.00%> (-0.47%)`	⬇️
superset/initialization/__init__.py	`91.42% <0.00%> (-0.30%)`	⬇️
superset/reports/dao.py	`83.20% <0.00%> (-0.27%)`	⬇️
superset/views/utils.py	`80.69% <0.00%> (-0.24%)`	⬇️
... and 43 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update aff10a7...17ed438. Read the comment docs.

ktmud · 2022-05-05T02:02:27Z

Is the report scheduler currently adding two execution logs for every execution, one in WORKING state, one in COMPLETE? Was the original intention to update current WORKING execution to COMPLETE (or FAILED) once an execution is done?

eschutho · 2022-05-09T23:26:52Z

superset/migrations/versions/cbe71abde154_fix_report_schedule_and_log.py

+        .all()
+    ):
+        schedule.last_value = None
+        schedule.last_value_row_json = None


@john-bodley do you think there's a chance that if there is a report that's in a working state, that we could potentially wind up altering the value after the report has completed? Just based off your migration timing of around a minute and most of these reports take under a minute to run, we may wind up with some reports that wind up with a completed state and no value, i.e., the commit happens here after the state has changed on the report side. Just wondering if it's safer not to change reports that are in progress.

@eschutho I guess with any migration there's a risk of a race condition happening whilst the migration is running. I've replicated the session.commit() to ensure that there's also a commit directly after the ReportSchedule which likely only has a handful of reports in WORKING state.

eschutho

Looks great. Just one question, but not a blocker.

eschutho · 2022-05-09T23:42:11Z

Is the report scheduler currently adding two execution logs for every execution, one in WORKING state, one in COMPLETE?

Yes, the logs are immutable, so you'll get two logs, one for each state.

Was the original intention to update current WORKING execution to COMPLETE (or FAILED) once an execution is done?

No, I don't believe this was the intention.

eschutho · 2022-05-09T23:45:23Z

I’m actually not sure whether there’s any value logging the WORKING state to the execution log. Does it provide any relevant information to the user?

@john-bodley I suppose it could give the user a start time. It's usually the same as the scheduled time, but if the report is taking a long time to run, it could indicate to the user that something is happening.

ktmud · 2022-05-10T00:11:10Z

No, I don't believe this was the intention.

Then I don't understand why we have an end_dttm (and "duration") for each log? If we are not updating the log record, i.e., it is a "fire and forgot" operation, shouldn't a start_ddm be enough? The log list table is also less readable when you have duplicate records with same execution id, same timestamps but one has empty values and a spinner icon that never stops.

john-bodley · 2022-05-13T04:16:18Z

I think this PR makes things less wrong so it's directionally correct. There's likely merit in rethinking what the desirable UX is from a reporting standpoint, though that can be handled in a different PR.

* fix(reports): Clear last value when state is WORKING * Update cbe71abde154_fix_report_schedule_and_log.py Co-authored-by: John Bodley <john.bodley@airbnb.com>

john-bodley requested a review from a team as a code owner May 4, 2022 03:17

pull-request-size bot added the size/L label May 4, 2022

john-bodley commented May 4, 2022

View reviewed changes

john-bodley requested review from eschutho and yousoph May 4, 2022 03:21

fix(reports): Clear last value when state is WORKING

6903c48

john-bodley force-pushed the john-bodley--fix-report-last-value branch from 39d2abf to 6903c48 Compare May 4, 2022 21:38

john-bodley requested a review from ktmud May 5, 2022 00:11

john-bodley requested a review from etr2460 May 6, 2022 17:18

eschutho reviewed May 9, 2022

View reviewed changes

eschutho approved these changes May 9, 2022

View reviewed changes

Update cbe71abde154_fix_report_schedule_and_log.py

17ed438

john-bodley merged commit 89a844a into apache:master May 13, 2022

john-bodley deleted the john-bodley--fix-report-last-value branch May 13, 2022 04:54

mistercrunch added 🏷️ bot A label used by `supersetbot` to keep track of which PR where auto-tagged with release labels 🚢 2.0.0 labels Mar 13, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(reports): Clear last value when state is WORKING #19941

fix(reports): Clear last value when state is WORKING #19941

john-bodley commented May 4, 2022 •

edited

john-bodley May 4, 2022

john-bodley May 4, 2022

john-bodley May 4, 2022

codecov bot commented May 4, 2022 •

edited

ktmud commented May 5, 2022 •

edited

eschutho May 9, 2022

john-bodley May 13, 2022

eschutho left a comment

eschutho commented May 9, 2022 •

edited

eschutho commented May 9, 2022 •

edited

ktmud commented May 10, 2022 •

edited

john-bodley commented May 13, 2022

fix(reports): Clear last value when state is WORKING #19941

fix(reports): Clear last value when state is WORKING #19941

Conversation

john-bodley commented May 4, 2022 • edited

SUMMARY

BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF

Before Migration

After Migration

TESTING INSTRUCTIONS

ADDITIONAL INFORMATION

john-bodley May 4, 2022

Choose a reason for hiding this comment

john-bodley May 4, 2022

Choose a reason for hiding this comment

john-bodley May 4, 2022

Choose a reason for hiding this comment

codecov bot commented May 4, 2022 • edited

Codecov Report

ktmud commented May 5, 2022 • edited

eschutho May 9, 2022

Choose a reason for hiding this comment

john-bodley May 13, 2022

Choose a reason for hiding this comment

eschutho left a comment

Choose a reason for hiding this comment

eschutho commented May 9, 2022 • edited

eschutho commented May 9, 2022 • edited

ktmud commented May 10, 2022 • edited

john-bodley commented May 13, 2022

john-bodley commented May 4, 2022 •

edited

codecov bot commented May 4, 2022 •

edited

ktmud commented May 5, 2022 •

edited

eschutho commented May 9, 2022 •

edited

eschutho commented May 9, 2022 •

edited

ktmud commented May 10, 2022 •

edited