feat(database): Database Filtering via custom configuration #24580

Conversation

Antonio-RiveroMartnez
Member

SUMMARY

There are scenarios where you may want to apply custom filtering to the list of databases returned by the DatabaseRestApi. Right now, the only way to do so is monkey patching. This PR adds a new config definition to the app and uses it in the DatabaseFilter, which is applied to all searches. You can now add custom filtering via config when needed, which gives our searches live filtering capabilities and eases customization.

BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF

TESTING INSTRUCTIONS

  1. Add a filtering function to your config file so that all GET and GET list requests use it and return a filtered result.
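
A minimal sketch of what such a filtering function could look like, assuming the hook receives a query-like object and returns a narrowed one. `FakeQuery` and `only_names_starting_with_b` are hypothetical stand-ins so the snippet runs without Superset; in a real config the function would presumably receive a SQLAlchemy `Query` instead.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class FakeQuery:
    """Hypothetical stand-in for a SQLAlchemy Query over database names."""

    names: List[str]

    def filter(self, predicate: Callable[[str], bool]) -> "FakeQuery":
        # Return a new query holding only the rows the predicate accepts.
        return FakeQuery([name for name in self.names if predicate(name)])


def only_names_starting_with_b(query: FakeQuery) -> FakeQuery:
    """Hypothetical filter: keep databases whose name starts with 'b'."""
    return query.filter(lambda name: name.startswith("b"))


result = only_names_starting_with_b(FakeQuery(["alpha", "beta", "bravo"]))
print(result.names)
```

The function takes a query and returns a query, so it can be chained with whatever filtering is applied afterwards.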

ADDITIONAL INFORMATION

  • Has associated issue:
  • Required feature flags:
  • Changes UI
  • Includes DB Migration (follow approval process in SIP-59)
    • Migration is atomic, supports rollback & is backwards-compatible
    • Confirm DB migration upgrade and downgrade tested
    • Runtime estimates and downtime expectations provided
  • Introduces new feature or API
  • Removes existing feature or API

@codecov

codecov bot commented Jul 3, 2023

Codecov Report

Merging #24580 (a47bdc2) into master (226c7f8) will decrease coverage by 0.02%.
The diff coverage is 86.06%.

❗ Current head a47bdc2 differs from pull request most recent head dac34ff. Consider uploading reports for the commit dac34ff to get more accurate results

@@            Coverage Diff             @@
##           master   #24580      +/-   ##
==========================================
- Coverage   69.08%   69.06%   -0.02%     
==========================================
  Files        1906     1906              
  Lines       74168    74114      -54     
  Branches     8164     8165       +1     
==========================================
- Hits        51239    51187      -52     
+ Misses      20807    20804       -3     
- Partials     2122     2123       +1     
Flag Coverage Δ
hive 54.14% <39.87%> (+0.20%) ⬆️
mysql 79.48% <83.54%> (+0.08%) ⬆️
postgres ?
presto 54.04% <39.87%> (+0.20%) ⬆️
python 83.46% <86.07%> (-0.03%) ⬇️
sqlite 78.13% <70.25%> (+0.08%) ⬆️
unit 54.81% <48.73%> (+0.12%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
superset/daos/annotation.py 100.00% <ø> (+12.76%) ⬆️
superset/examples/utils.py 0.00% <0.00%> (ø)
superset/views/base.py 73.33% <25.00%> (ø)
superset/databases/commands/update.py 74.44% <50.00%> (+0.28%) ⬆️
...erset-frontend/src/SqlLab/components/App/index.jsx 82.75% <83.33%> (-0.58%) ⬇️
superset/extensions/metastore_cache.py 96.61% <90.00%> (-1.51%) ⬇️
superset-frontend/src/logger/LogUtils.ts 97.29% <100.00%> (+0.07%) ⬆️
...otation_layers/annotations/commands/bulk_delete.py 87.50% <100.00%> (ø)
...t/annotation_layers/annotations/commands/delete.py 83.33% <100.00%> (-1.29%) ⬇️
superset/annotation_layers/commands/bulk_delete.py 84.61% <100.00%> (ø)
... and 30 more

... and 12 files with indirect coverage changes


- Add new configuration so we can inject extra filters to our databases when running the DatabaseFilter in base_filters
- Add tests for our new config and its usage
assert rv.status_code == 200

# Cleanup
first_model = db.session.query(Database).get(first_response.get("id"))
Member

we need to figure out how to do this via fixture or after every test so we can always land back in a normal state
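
One way this cleanup could be sketched, assuming a `unittest`-style test case like the ones in this PR: register the teardown with `addCleanup` at creation time, so it runs even when a later assertion fails. `FAKE_DB_TABLE` and `create_database` are hypothetical stand-ins for the real session and model, not Superset code.

```python
import unittest

# Hypothetical in-memory stand-in for the databases table.
FAKE_DB_TABLE: dict = {}


class DatabaseApiCleanupTest(unittest.TestCase):
    def create_database(self, db_id: int, name: str) -> None:
        FAKE_DB_TABLE[db_id] = name
        # Registered cleanups run after the test, even if an assertion fails.
        self.addCleanup(FAKE_DB_TABLE.pop, db_id, None)

    def test_create(self) -> None:
        self.create_database(1, "first")
        self.assertIn(1, FAKE_DB_TABLE)


suite = unittest.defaultTestLoader.loadTestsFromTestCase(DatabaseApiCleanupTest)
outcome = unittest.TextTestRunner(verbosity=0).run(suite)
```

After the run the table is empty again, which is the "normal state" the comment above asks for.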

uri = f"api/v1/database/"
rv = self.client.get(uri)
data = json.loads(rv.data.decode("utf-8"))
self.assertEqual(data["count"], len(dbs))
Member Author

@hughhhh here I am testing our current (default) behavior, where all databases must be returned if nothing is set in the config, so dynamic_filter is not defined. Then I add the patch for the config to add the filter function and test that it is applied because dynamic_filter is defined.

Member Author

Added explicit assertions to check whether the filter method has been called when defined.
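
The two cases described in this thread can be sketched with `unittest.mock`, assuming the endpoint simply skips the hook when none is configured. `list_databases` and the plain `config` dict are hypothetical stand-ins for the REST endpoint and the patched app config.

```python
from unittest.mock import MagicMock


def list_databases(rows, config):
    """Hypothetical stand-in for the list endpoint."""
    dynamic_filter = config.get("dynamic_filter")
    if dynamic_filter is not None:
        rows = dynamic_filter(rows)
    return rows


rows = ["alpha", "beta"]
mock_filter = MagicMock(
    side_effect=lambda rs: [r for r in rs if r.startswith("b")]
)

# Default behavior: nothing configured, so every database is returned
# and the filter is never invoked.
unfiltered = list_databases(rows, {})
mock_filter.assert_not_called()

# Configured behavior: the filter is invoked exactly once with the rows.
filtered = list_databases(rows, {"dynamic_filter": mock_filter})
mock_filter.assert_called_once_with(rows)
```

Asserting on the mock itself, and not only on the returned count, distinguishes "the filter ran and kept everything" from "the filter never ran".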

Member

here I am testing our current (default) behavior, where all databases must be returned if nothing is set in the config, so dynamic_filter is not defined. Then I add the patch for the config to add the filter function and test that it is applied because dynamic_filter is defined.

Can you write that down in the function docstring? :)

Member

@john-bodley john-bodley left a comment

Could you provide some scenarios where you would use this? I also wonder if this should be generalized for all entity types.

Additionally, I’m not saying I’m a supporter of monkey patching, but we should be cognizant that the method is sometimes used to relax (as opposed to further restrict) filters, whereas this approach only addresses the latter.

@@ -41,6 +41,16 @@ class DatabaseFilter(BaseFilter): # pylint: disable=too-few-public-methods
# TODO(bogdan): consider caching.

def apply(self, query: Query, value: Any) -> Query:
# Dynamic Filters need to be applied to the Query before we filter
Member

Feels like a good place for a docstring comment.
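
A rough sketch of the ordering that comment in the diff describes, with the explanation moved into a docstring. This is not the actual `DatabaseFilter.apply` body, which is not shown here: `apply_database_filter`, the list-of-dicts rows, and the `allowed_ids` permission check are all assumptions made for illustration.

```python
from typing import Callable, List, Optional, Set

Row = dict


def apply_database_filter(
    rows: List[Row],
    dynamic_filter: Optional[Callable[[List[Row]], List[Row]]],
    allowed_ids: Set[int],
) -> List[Row]:
    """Narrow the rows with the optional dynamic filter, then by permission.

    The dynamic filter runs first so the permission check only sees rows
    the configuration has already let through. When no dynamic filter is
    configured, only the permission check applies.
    """
    if dynamic_filter is not None:
        rows = dynamic_filter(rows)
    # Hypothetical permission step: keep rows the user may access.
    return [row for row in rows if row["id"] in allowed_ids]


def starts_with_b(rows: List[Row]) -> List[Row]:
    return [row for row in rows if row["name"].startswith("b")]


rows = [
    {"id": 1, "name": "alpha"},
    {"id": 2, "name": "beta"},
    {"id": 3, "name": "bravo"},
]
without_hook = apply_database_filter(rows, None, {1, 2})
with_hook = apply_database_filter(rows, starts_with_b, {1, 2})
```

Because both steps only remove rows, the hook in this sketch can restrict the result further but cannot widen it.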

- Add explicit assertions to validate the filter is not called when not defined
- Use docstring comment
@Antonio-RiveroMartnez
Member Author

Antonio-RiveroMartnez commented Jul 5, 2023

Could you provide some scenarios where you would use this? I also wonder if this should be generalized for all entity types.

Additionally, I’m not saying I’m a supporter of monkey patching, but we should be cognizant that the method is sometimes used to relax (as opposed to further restrict) filters, whereas this approach only addresses the latter.

There might be cases where I have a given database that I want to make visible/hidden to my users based on a Feature Flag (live change). Right now, our options are overriding the entire DatabaseRestApi or monkey patching it. With this change we extend those options: we can define a custom filter method in the config and have it applied to the responses easily.

As for generalizing this to all entities, I will use the same concept we use for EXTRA_RELATED_QUERY_FILTERS and define a databases key, which is the one the DatabaseFilter pulls from. That way the config can be extended later for other entities if needed, e.g. by adding dashboards or charts keys. As part of this, I am also renaming the config to EXTRA_DYNAMIC_QUERY_FILTERS.
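
A minimal sketch of that keyed structure combined with the feature-flag scenario, assuming the config maps an entity key to a filter callable. The `EXTRA_DYNAMIC_QUERY_FILTERS` name and the `databases` key come from the comment above; `FEATURE_FLAGS`, `HIDE_LEGACY_DB`, `hide_legacy_database`, `filter_for`, and the use of plain name lists instead of queries are hypothetical.

```python
from typing import Callable, Dict, List, Optional

# Hypothetical flag store; flipping the value changes results live.
FEATURE_FLAGS: Dict[str, bool] = {"HIDE_LEGACY_DB": True}


def hide_legacy_database(names: List[str]) -> List[str]:
    """Hypothetical filter: hide the 'legacy' database while the flag is on."""
    if FEATURE_FLAGS.get("HIDE_LEGACY_DB"):
        return [name for name in names if name != "legacy"]
    return names


# One key per entity type; only "databases" is defined for now.
EXTRA_DYNAMIC_QUERY_FILTERS: Dict[str, Callable[[List[str]], List[str]]] = {
    "databases": hide_legacy_database,
}


def filter_for(entity: str) -> Optional[Callable[[List[str]], List[str]]]:
    # Entities without a configured key simply get no extra filter.
    return EXTRA_DYNAMIC_QUERY_FILTERS.get(entity)


names = ["main", "legacy"]
flag_on = filter_for("databases")(names)

FEATURE_FLAGS["HIDE_LEGACY_DB"] = False  # live change, no restart
flag_off = filter_for("databases")(names)
```

Adding a `dashboards` or `charts` key later would not require touching the lookup, which is the extensibility argument made above.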

- Generalize the new config so we can use it for other entities down the road.
- Pull the new database key from the config so we apply any given filter method in the databases API
- Adjust our tests with the new config name and structure
Member

@betodealmeida betodealmeida left a comment

Elegant!


- Add more info to our comments in our tests
@Antonio-RiveroMartnez Antonio-RiveroMartnez merged commit 6657353 into apache:master Jul 6, 2023
@john-bodley
Member

@Antonio-RiveroMartnez regarding your comment,

There might be cases where I have a given database that I want to make visible/hidden to my users based on a Feature Flag (live change).

If that were the case, shouldn't the permission override logic be handled in the security manager? Note I'm not blocking this change; I just think there's merit in making sure we measure twice and cut once when adding new functionality.

@mistercrunch mistercrunch added 🏷️ bot A label used by `supersetbot` to keep track of which PR where auto-tagged with release labels 🚢 3.1.0 labels Mar 8, 2024
vinothkumar66 pushed a commit to vinothkumar66/superset that referenced this pull request Nov 11, 2024
Labels
🏷️ bot A label used by `supersetbot` to keep track of which PR where auto-tagged with release labels size/L 🚢 3.1.0

5 participants