Skip to content

Conversation

@amitmiran137
Copy link
Member

@amitmiran137 amitmiran137 commented Aug 9, 2021

SUMMARY

Superset enables us to run different caching strategies under celery beat.
The ability here is to provide additional strategies from the config that can later be run if the strategy is chosen

In order to use Strategy type like
EXTRA_CACHING_STRATEGIES: List[Type[Strategy]] = []
It was required to split from the single cache.py file into smaller modules in order to avoid importing superset.app
image

BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF

TESTING INSTRUCTIONS

ADDITIONAL INFORMATION

  • Has associated issue:
  • Changes UI
  • Includes DB Migration (follow approval process in SIP-59)
    • Migration is atomic, supports rollback & is backwards-compatible
    • Confirm DB migration upgrade and downgrade tested
    • Runtime estimates and downtime expectations provided
  • Introduces new feature or API
  • Removes existing feature or API

@pull-request-size pull-request-size bot added size/S and removed size/XS labels Aug 9, 2021
@codecov
Copy link

codecov bot commented Aug 9, 2021

Codecov Report

Merging #16150 (beab7dc) into master (7b72443) will increase coverage by 0.01%.
The diff coverage is 66.84%.

❗ Current head beab7dc differs from pull request most recent head 919c36d. Consider uploading reports for the commit 919c36d to get more accurate results
Impacted file tree graph

@@            Coverage Diff             @@
##           master   #16150      +/-   ##
==========================================
+ Coverage   76.47%   76.49%   +0.01%     
==========================================
  Files         997     1002       +5     
  Lines       53245    53272      +27     
  Branches     6777     6777              
==========================================
+ Hits        40720    40750      +30     
+ Misses      12295    12292       -3     
  Partials      230      230              
Flag Coverage Δ
hive 81.27% <66.84%> (?)
mysql ?
postgres ?
presto 81.34% <66.84%> (?)
python 81.70% <66.84%> (+0.02%) ⬆️
sqlite ?

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
superset/common/query_actions.py 92.85% <ø> (ø)
superset/connectors/sqla/models.py 89.70% <ø> (+1.65%) ⬆️
superset/dashboards/api.py 93.03% <ø> (ø)
superset/dashboards/commands/importers/v0.py 89.94% <ø> (ø)
superset/datasets/api.py 89.23% <ø> (-2.70%) ⬇️
superset/db_engine_specs/bigquery.py 87.75% <ø> (ø)
superset/db_engine_specs/elasticsearch.py 90.24% <0.00%> (ø)
superset/db_engine_specs/mysql.py 94.04% <ø> (-3.58%) ⬇️
superset/db_engine_specs/presto.py 89.95% <ø> (+6.48%) ⬆️
superset/db_engines/hive.py 85.18% <ø> (+85.18%) ⬆️
... and 64 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 7b72443...919c36d. Read the comment docs.

@pull-request-size pull-request-size bot added size/XS and removed size/S labels Aug 9, 2021
@junlincc junlincc requested review from dpgaspar and ktmud August 12, 2021 02:55
amitmiran137 added 4 commits August 16, 2021 17:35
…gies

* upstream/master: (64 commits)
  check roles before fetching reports (#16260)
  chore: upgrade mypy and add type guards (#16227)
  fix: pivot columns with ints for name (#16259)
  chore(pylint): Bump Pylint to 2.9.6 (#16146)
  fix examples tab for dashboard (#16253)
  chore: bump superset-ui packages to 0.17.84 (#16251)
  chore: Shows the dataset description in the gallery dropdown (#16200)
  fix(Dashboard): Omnibar dropdown visibility and keyboard commands (#16168)
  chore: bump py version for integration test (#16213)
  fix: skip perms on query context update (#16250)
  refactor: external metadata fetch API (#16193)
  feat(dao): admin can remove self from object owners (#15149)
  fix(dashboard): cross filter chart highlight when filters badge icon clicked (#16233)
  fix: validate_parameters and query (#16241)
  fix: Remove Advanced Analytics tag for 2 charts (#16240)
  Revert "feat: Changing Dataset names (#16199)" (#16235)
  feat: Allow users to connect via legacy SQLA form (#16201)
  fix: remove encryption from db params (#16214)
  fix(Explore): Show the tooltip only when label does not fit the container in METRICS/FILTERS/GROUP BY/SORT BY of the DATA panel (#16060)
  Show/hide tooltips (#16192)
  ...

# Conflicts:
#	superset/tasks/caching/cache_strategy.py
* master:
  chore(pylint): Reenable ungrouped-imports check (#16256)
  chore(pylint): Re-enable super-with-arguments check (#16138)
  fix: disable text reports for now (#16257)
  fix: pivot col names in post_process (#16262)
  chore: Improves the flow to create a new chart (#16252)
  Move pagination outside of table (#16232)
@amitmiran137 amitmiran137 requested a review from a team August 17, 2021 10:33
Copy link
Member

@craig-rueda craig-rueda left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left some comments. Please add some tests :).

This task periodically hits charts to warm up the cache.

"""
logger.info("Loading strategy")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be DEBUG and should have a little more context - like Running cache warmup strategy x...

"""
logger.info("Loading strategy")
class_ = None
extra_strategies: List[Type[Strategy]] = app.config["EXTRA_CACHING_STRATEGIES"]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of EXTRA_CACHING_STRATEGIES, we should have CACHING_STRATEGIES, which is set to a default set of strategies. This way, downstreams can override the complete set of strategies easily.

if class_.__name__ == strategy_name:
break
else:
message = f"No strategy {strategy_name} found!"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No f-strings in log messages.

Nit: Cache warmup strategy xxx not found

logger.error(message, exc_info=True)
return message

logger.info("Loading %s", class_.__name__)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be DEBUG

except TypeError:
message = "Error loading strategy!"
logger.exception(message)
return message
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why return, just raise?

for url in strategy.get_urls():
try:
logger.info("Fetching %s", url)
request.urlopen(url) # pylint: disable=consider-using-with
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How does this deal with auth? I'm assuming most of these charts will live behind some auth mechanism?

Copy link
Member

@dpgaspar dpgaspar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left a question related with the scoped session, and adding tests would be great. @craig-rueda question regarding auth bugs me also


def get_urls(self) -> List[str]:
urls = []
session = db.create_scoped_session()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why? we should be using with session_scope(nullpool=True) as session

@mayurnewase
Copy link
Contributor

mayurnewase commented Dec 30, 2021

hi @amitmiran137 I see this was closed without merge, what was the reason?
I have few questions here.

  • it calls charts explore url and request lib doesn't load javascript so it doesn’t really gets the data from endpoint which is POST explore_json or POST data which actually loads data into cache.
  • also this won't generate thumbnail urls, which would be usefull to warm up as well.

so should we just invoke headless driver like we do for reports?
I can help take over this along with fixing authentication.

@craig-rueda craig-rueda deleted the feat/extra_strategies branch January 1, 2022 17:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants