[migration] metadata for dashboard filters #9109

@graceguo-supercat graceguo-supercat commented Feb 10, 2020

CATEGORY

Choose one

  • Bug Fix
  • Enhancement (new features, refinement)
  • Refactor
  • Add tests
  • Build / Development Environment
  • Documentation

SUMMARY

Migrate the filter_immune_slices and filter_immune_slice_fields dashboard metadata to filter_scopes, now that dashboard-scoped filter metadata is enabled.

TEST PLAN

The migration does not keep a copy of the old filter_immune_slices and filter_immune_slice_fields metadata. If the upgrade fails, users will lose the filter-immune settings in their dashboards.

System admins should therefore back up the dashboards table before running this upgrade.

ADDITIONAL INFORMATION

  • Has associated issue:
  • Changes UI
  • Requires DB Migration.
  • Confirm DB Migration upgrade and downgrade tested.
  • Introduces new feature or API
  • Removes existing feature or API

REVIEWERS

@serenajiang @etr2460 @mistercrunch

codecov-io commented Feb 10, 2020

Codecov Report

Merging #9109 into master will decrease coverage by <.01%.
The diff coverage is n/a.

@@            Coverage Diff            @@
##           master   #9109      +/-   ##
=========================================
- Coverage    59.1%   59.1%   -0.01%     
=========================================
  Files         372     372              
  Lines       11920   11922       +2     
  Branches     2917    2919       +2     
=========================================
+ Hits         7045    7046       +1     
- Misses       4693    4694       +1     
  Partials      182     182

Impacted Files                                Coverage Δ
superset-frontend/src/chart/chartAction.js    43.33% <0%> (+0.09%) ⬆️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 2913063...008ce3f.

@graceguo-supercat graceguo-supercat force-pushed the gg-MigrateScopeFiltersMetadata branch 2 times, most recently from e29866a to 9331512 Compare February 10, 2020 19:13
@graceguo-supercat graceguo-supercat changed the title [migration] metadata for dashboard filters [WIP][migration] metadata for dashboard filters Feb 10, 2020
@graceguo-supercat graceguo-supercat force-pushed the gg-MigrateScopeFiltersMetadata branch 8 times, most recently from dd1e53d to 24206ab Compare February 11, 2020 05:12
@graceguo-supercat graceguo-supercat changed the title [WIP][migration] metadata for dashboard filters [migration] metadata for dashboard filters Feb 11, 2020
Contributor

@serenajiang serenajiang left a comment

Mostly 🐍 nits. How long does this take to run?

viz_type = Column(String(250))


dashboard_slices = Table(
Contributor

Just for consistency, can you make this the same format as Slice and Dashboard?

class DashboardSlices(Base):
    __tablename__ = "dashboard_slices"
    id = Column(Integer, primary_key=True)
    # ...and so on

Author

@graceguo-supercat graceguo-supercat Feb 11, 2020

I found a few places in our codebase that use a variable (instead of a class) for a relationship table.

Comment on lines 82 to 87
slice_ids = [slice.id for slice in dashboard.slices]
filters = (
    session.query(Slice)
    .filter(and_(Slice.id.in_(slice_ids), Slice.viz_type == "filter_box"))
    .all()
)
Contributor

I don't think the query for filters is necessary, since the slices shouuuld have all the info loaded, so you could just do something like:

filters = [slice for slice in dashboard.slices if slice.viz_type == "filter_box"]

Author

it works! thank you.

)

# if dashboard has filter_box
if len(filters):
Contributor

I slightly prefer

if filters:

Author

fixed.


dashboards = session.query(Dashboard).all()
for i, dashboard in enumerate(dashboards):
    print("scanning dashboard ({}/{}) >>>>".format(i + 1, len(dashboards)))
Contributor

Hmm, I don't have a strong preference for which, but we should probably either stick to print or logging.info and not do both.

Author

@graceguo-supercat graceguo-supercat Feb 11, 2020

In my test, scanning 11,000 records takes ~4 minutes.
The print calls show progress in the console, so during deployment we can see how much work is done. Otherwise it would look like the db upgrade is hanging.
logging.info is used to save a record (it can't be seen from the console).
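
A minimal sketch of that split between console progress and logged records, assuming a generic list of records. The helper name `scan_with_progress` and the `handle` callback are hypothetical and are not part of this PR:

```python
import logging

logger = logging.getLogger(__name__)


def scan_with_progress(records, handle):
    """Call handle(record) for each record and return how many were processed.

    Hypothetical helper for illustration; the migration in this PR may differ.
    """
    total = len(records)
    processed = 0
    for i, record in enumerate(records, start=1):
        # Console output, so a long-running upgrade does not look like it is hanging.
        print(f"scanning dashboard ({i}/{total}) >>>>")
        handle(record)
        # Log entry, kept as a record even when it is not shown on the console.
        logger.info("processed record %s", record)
        processed += 1
    return processed
```

Keeping the two channels in one helper at least makes it explicit that they serve different purposes.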


    session.merge(dashboard)
except Exception as e:
    logging.exception(f"dashboard {dashboard.id} has error: {e}")
Contributor

When you tested this, were there ever any errors? I'm curious what kind of errors might occur.

Author

@graceguo-supercat graceguo-supercat Feb 11, 2020

I mean to catch any exception caused by user data, such as a null-pointer-style error. For example, a user may have added invalid json_metadata that I can't parse. It should be very rare, though.
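
As an illustration of the kind of failure meant here, a rough sketch of defensive parsing, assuming json_metadata is stored as a JSON string or None. The helper name `load_json_metadata` is hypothetical, and it catches narrower exceptions than the broad `except Exception` in the migration:

```python
import json
import logging


def load_json_metadata(raw):
    """Return the parsed metadata dict, or None when it is missing or unparseable.

    Hypothetical helper for illustration only.
    """
    if not raw:
        # Old dashboards may have no json_metadata at all.
        return None
    try:
        parsed = json.loads(raw)
    except (TypeError, ValueError) as ex:
        # json.JSONDecodeError is a subclass of ValueError.
        logging.exception(f"invalid json_metadata: {ex}")
        return None
    # Anything other than a JSON object is treated as invalid user data.
    return parsed if isinstance(parsed, dict) else None
```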

slice_params = filter_slice.params or {}
configs = slice_params.get("filter_configs") or []

if slice_params.get("date_filter", False):
Contributor

Suggested change
- if slice_params.get("date_filter", False):
+ if slice_params.get("date_filter"):

Author

fixed

Comment on lines 82 to 84
slice_ids = [slice.id for slice in dashboard.slices]
filters = [
    slice
    for slice in dashboard.slices
    if slice.id.in_(slice_ids) and slice.viz_type == "filter_box"
]
Contributor

Don't need the extra slice_ids step; can just do:

filters = [slice for slice in dashboard.slices if slice.viz_type == "filter_box"]

@graceguo-supercat graceguo-supercat force-pushed the gg-MigrateScopeFiltersMetadata branch 3 times, most recently from b396284 to 2f427c6 Compare February 11, 2020 22:26
Member

@john-bodley john-bodley left a comment

@graceguo-supercat could you add a line item in UPDATING.md with the warning you mentioned in the PR description?


json_metadata.pop("filter_immune_slices", None)
json_metadata.pop("filter_immune_slice_fields", None)
if filter_scopes:
Member

It seems this logic can be put within the if filters: block.

Author

fixed.

logging.info(
    f"Adding filter_scopes for dashboard {dashboard.id}: {json.dumps(filter_scopes)}"
)
if json_metadata:
Member

Don’t we also need to update the JSON metadata of the record if it’s falsey?

Author

Correct! I fixed it.
Some old dashboards have no json_metadata, so after this migration I want their json_metadata attribute to stay None instead of becoming {}.
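
Roughly, the intended behavior could be sketched as follows, assuming json_metadata is stored as a JSON string or None. The helper name `updated_json_metadata` is hypothetical and the actual migration code may be structured differently:

```python
import json


def updated_json_metadata(raw, filter_scopes):
    """Return the new json_metadata string, or None when nothing is left to store.

    Hypothetical helper for illustration only.
    """
    json_metadata = json.loads(raw or "{}")
    # Drop the old filter-immune settings; they are replaced by filter_scopes.
    json_metadata.pop("filter_immune_slices", None)
    json_metadata.pop("filter_immune_slice_fields", None)
    if filter_scopes:
        json_metadata["filter_scopes"] = filter_scopes
    # Keep None (not "{}") for dashboards that end up with no metadata.
    return json.dumps(json_metadata) if json_metadata else None
```

With this shape, the record is always rewritten, including the case where the metadata is falsy.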

    "filter_immune_slice_fields", {}
).items():
    for column in columns:
        if immuned_by_column.get(column, None) is None:
Member

Let's use a collections.defaultdict here.

Author

fixed.
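
For readers following along, the defaultdict suggestion might look something like this sketch. It assumes filter_immune_slice_fields maps a slice id (a string key, as in JSON) to a list of column names; the function name `immune_slices_by_column` is hypothetical:

```python
from collections import defaultdict


def immune_slices_by_column(json_metadata):
    """Invert {slice_id: [columns]} into {column: [slice_ids]}.

    Hypothetical helper; assumes slice ids are JSON string keys.
    """
    immuned_by_column = defaultdict(list)
    fields = json_metadata.get("filter_immune_slice_fields") or {}
    for slice_id, columns in fields.items():
        for column in columns:
            # defaultdict creates the list on first access, so no None check is needed.
            immuned_by_column[column].append(int(slice_id))
    return immuned_by_column
```

This removes the `immuned_by_column.get(column, None) is None` branch entirely.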

@graceguo-supercat
Author

ping @john-bodley

slice_params = json.loads(filter_slice.params or "{}")
configs = slice_params.get("filter_configs") or []

if slice_params.get("date_filter"):
Member

How do we know this is the exhaustive list of slice parameters?

Author

[Screenshot: Screen_Shot_2020-02-14_at_10_29_54_AM]

I was thinking about adding an iteration over all the possible date-filter-related keys. But this function will only be used by:

  • the db migration, which runs only once,
  • dashboard import, to convert filter_immune metadata in old dashboards to the new filter_scopes.

Given that the usage is backward compatible rather than future compatible, I feel enumeration is probably good enough?
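
To make the enumeration idea concrete, a small sketch. Only the date_filter key appears in this thread; the field name "__time_range" and the whole second entry are assumptions added for illustration, and the names `DATE_FILTER_FIELDS` and `enabled_date_fields` are hypothetical:

```python
# Maps a filter_box option to the filter field it enables.
# Only "date_filter" is taken from the thread; "__time_range" and the
# second entry are assumptions for illustration.
DATE_FILTER_FIELDS = {
    "date_filter": "__time_range",
    "show_sqla_time_column": "__time_col",
}


def enabled_date_fields(slice_params, known=DATE_FILTER_FIELDS):
    """Return the filter fields whose options are switched on in slice_params."""
    return [field for key, field in known.items() if slice_params.get(key)]
```

Because the list is fixed, a new date-related option would need a new entry here, which is the trade-off being discussed.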

@graceguo-supercat graceguo-supercat merged commit f4ad15e into apache:master Feb 14, 2020
@graceguo-supercat graceguo-supercat deleted the gg-MigrateScopeFiltersMetadata branch June 11, 2020 23:20
@mistercrunch mistercrunch added 🏷️ bot A label used by `supersetbot` to keep track of which PRs were auto-tagged with release labels 🚢 0.36.0 labels Feb 28, 2024
Labels
🏷️ bot A label used by `supersetbot` to keep track of which PRs were auto-tagged with release labels size/L 🚢 0.36.0
5 participants