Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Add Tableau dashboard metadata extractors #333

Merged
merged 45 commits into from Aug 24, 2020

Conversation

ccarterlandis
Copy link
Contributor

Summary of Changes

This PR introduces 5 extractors for Tableau dashboard metadata:

  • TableauDashboardMetadataExtractor
  • TableauDashboardTableExtractor
  • TableauDashboardQueryExtractor
  • TableauDashboardLastModifiedtExtractor
  • TableauExternalTableExtractor

See amundsen-io/amundsen#565 for the related feature proposal.

Tests

Basic unit tests were added for all extractors except the external table extractor. Tests for Tableau authencation and the utility methods have not been created yet.

Documentation

Added sections for each of the extractors to the README.md.

CheckList

Make sure you have checked all steps below to ensure a timely review.

  • PR title addresses the issue accurately and concisely. Example: "Updates the version of Flask to v1.0.2"
  • PR includes a summary of changes.
  • PR adds unit tests, updates existing unit tests, OR documents why no test additions or modifications are needed.
  • In case of new functionality, my PR adds documentation that describes how to use it.
    • All the public functions and the classes in the PR contain docstrings that explain what it does
  • [] PR passes make test

TODO

  • Fix any linting errors
  • Make sure this PR passes make test

@ccarterlandis
Copy link
Contributor Author

@feng-tao I'm guessing those upstream commits + merge conflicts are because master has changed since I created this PR. This PR should remain draft until the linting errors have been fixed and & make test passes, after that it can be opened for review

@ccarterlandis
Copy link
Contributor Author

Tagging for visibility 🚀 @alevene @GustoFergy

@alevene alevene force-pushed the tableau-dashboard-extractors branch 2 times, most recently from b2bf2fb to 84326fb Compare August 18, 2020 21:58
@alevene
Copy link
Contributor

alevene commented Aug 19, 2020

@dorianj I've applied the typing changes. Please review, thanks!

@feng-tao
Copy link
Member

if the pr is ready for review, would be good to change the pr status.

@alevene
Copy link
Contributor

alevene commented Aug 19, 2020

@feng-tao could you do this for me? I'm getting this note:

Only those with write access to this repository can mark a draft pull request as ready for review.

@feng-tao feng-tao marked this pull request as ready for review August 19, 2020 17:06
@feng-tao
Copy link
Member

@feng-tao could you do this for me? I'm getting this note:

Only those with write access to this repository can mark a draft pull request as ready for review.

maybe because the pr is created by carter? Anyway I have opened the pr from draft to review :)

import unittest

from mock import patch
from pyhocon import ConfigFactory # noqa: F401
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: don't add new noqas, these aren't needed anymore

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missed these, will fix. Thanks

logging.basicConfig(level=logging.INFO)


def mock_query(*_args, **_kwargs):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm surprised that mypy isn't mad about there not being a return type on this, I have no idea why it's not... but it doesn't matter here

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I know why. There was a mixup along the way and our branch did a rename of tests/unit/extractor/dashboard/__init__.py to tests/unit/extractor/dashboard/tableau/__init__.py. So, it's not treated as a module and mypy skips it.
I'll fix.

@dorianj
Copy link
Contributor

dorianj commented Aug 19, 2020

@dorianj I've applied the typing changes. Please review, thanks!

Typings LGTM!

@feng-tao
Copy link
Member

@alevene @ccarterlandis I will go over the pr early next week.

@@ -57,7 +57,7 @@ def create_next_relation(self) -> Union[Dict[str, Any], None]:

def _create_relation_iterator(self) -> Optional[Iterator[Dict[str, Any]]]:
for table_id in self._table_ids:
m = re.match('(\w+)://(\w+)\.(\w+)\/(\w+)', table_id)
m = re.match(r'(\w+)://(\w+)\.(\w+)\/(\w+)', table_id)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could you rebase this line to r'([^./]+)://([^./]+)\.([^./]+)\/([^./]+)'

Copy link
Member

@feng-tao feng-tao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the code is very well written, lgtm. Let me know once you rebased. thanks @alevene @ccarterlandis

def execute(self) -> Iterator[Dict[str, Any]]:
response = self.execute_query()

for query in response['customSQLTables']:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a comment on which API it is using?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you clarify what you'd like to see? There is a comment for this class that calls out what model is expected (DashboardQuery). The API relates to the query found in the TableauDashboardQueryExtractor class in this file.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, I miss that part. No worry then.

@@ -0,0 +1,366 @@
"""
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could you add the header here as well?

This patch adds a very barebones Tableau dashboard integration.

Currently, it supports adding workbooks, users and owners (which are fake), and
finding the last updated timestamp. To run it, startup Amundsen as normal and then
run `python example/scripts/sample_data_loader.py`. You'll need to make sure that
you've added your Tableau site config and credentials to the top of the sample loader
script. It also does not support creation of tables and columns at this time, so you'll
need to add those manually using a CSV job.

Signed-off-by: Carter Landis <carter.landis@gusto.com>
This patch introduces a new Tableau dashboard extractor
for gathering the queries used in Tableau workbooks.

In Tableau, these are classified as "custom SQL tables", and appear as a
datasource for the sheets in which they are used. Luckily, the Tableau
Metadata API is smart enough to include the tables used in a custom SQL
query in the workbook <> table relation.

Signed-off-by: Carter Landis <carter.landis@gusto.com>
Signed-off-by: Carter Landis <carter.landis@gusto.com>
Signed-off-by: Carter Landis <carter.landis@gusto.com>
Signed-off-by: Carter Landis <carter.landis@gusto.com>
Signed-off-by: Carter Landis <carter.landis@gusto.com>
Signed-off-by: Carter Landis <carter.landis@gusto.com>
Signed-off-by: Carter Landis <carter.landis@gusto.com>
Signed-off-by: Carter Landis <carter.landis@gusto.com>
Signed-off-by: Carter Landis <carter.landis@gusto.com>
Signed-off-by: Carter Landis <carter.landis@gusto.com>
Signed-off-by: Carter Landis <carter.landis@gusto.com>
Signed-off-by: Carter Landis <carter.landis@gusto.com>
ccarterlandis and others added 26 commits August 24, 2020 12:14
Signed-off-by: Carter Landis <carter.landis@gusto.com>
(cherry picked from commit e8284e0935ea60eb65733433f1cba6c866466fbc)
Signed-off-by: Carter Landis <carter.landis@gusto.com>
Signed-off-by: Carter Landis <carter.landis@gusto.com>
(cherry picked from commit 80196e534b402db863a62e6d381e037a5ff95337)
Signed-off-by: Carter Landis <carter.landis@gusto.com>
The external table extractor needs to use query variables
in order to filter which external data connections are captured
as external tables.

(cherry picked from commit 32462fab306b117ca41c3e05593f7f124fe51efd)
Signed-off-by: Carter Landis <carter.landis@gusto.com>
Signed-off-by: Carter Landis <carter.landis@gusto.com>
(cherry picked from commit d587624aab55b5a950c4791d706a6bcc892c7ee0)
Signed-off-by: Carter Landis <carter.landis@gusto.com>
Signed-off-by: Carter Landis <carter.landis@gusto.com>
(cherry picked from commit 84b1ed91591d54ad34b7e54e843923634321e9b9)
Signed-off-by: Carter Landis <carter.landis@gusto.com>
(cherry picked from commit 124ccb388f9d8d28ecc1b5a43eeb71a825e63ce0)
Signed-off-by: Carter Landis <carter.landis@gusto.com>
Signed-off-by: Carter Landis <carter.landis@gusto.com>
(cherry picked from commit ed8e0a6a9b2b1968f63027045464b47de5886100)
Signed-off-by: Carter Landis <carter.landis@gusto.com>
…sanitize methods

Signed-off-by: Carter Landis <carter.landis@gusto.com>
Signed-off-by: Carter Landis <carter.landis@gusto.com>
Signed-off-by: Carter Landis <carter.landis@gusto.com>
Signed-off-by: Carter Landis <carter.landis@gusto.com>
Signed-off-by: Carter Landis <carter.landis@gusto.com>
Signed-off-by: Carter Landis <carter.landis@gusto.com>
Python 3.8 is smart enough to understand named string interpolation
with out the call to .format() added by this patch. Python 3.7 is not
smart enough to make the same assumption.

Signed-off-by: Carter Landis <carter.landis@gusto.com>
Signed-off-by: Carter Landis <carter.landis@gusto.com>
Signed-off-by: Carter Landis <carter.landis@gusto.com>
Signed-off-by: Carter Landis <carter.landis@gusto.com>
Signed-off-by: Carter Landis <carter.landis@gusto.com>
Signed-off-by: Alex Levene <alex.levene@gusto.com>
Signed-off-by: Alex Levene <alex.levene@gusto.com>
Signed-off-by: Alex Levene <alex.levene@gusto.com>
Signed-off-by: Alex Levene <alex.levene@gusto.com>
Signed-off-by: Alex Levene <alex.levene@gusto.com>
…ter module re-instated

Signed-off-by: Alex Levene <alex.levene@gusto.com>
Signed-off-by: Alex Levene <alex.levene@gusto.com>
Signed-off-by: Alex Levene <alex.levene@gusto.com>
@alevene alevene force-pushed the tableau-dashboard-extractors branch from c19efe8 to 49c9c8b Compare August 24, 2020 19:08
Copy link
Member

@feng-tao feng-tao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm! thanks @alevene @ccarterlandis for the great contribution!
could you also add Tableau icon like amundsen-io/amundsenfrontendlibrary#501 into frontend repo
and update https://github.com/amundsen-io/amundsen#dashboard-connectors to include tableau? thanks a lot

@feng-tao feng-tao merged commit 46207ee into amundsen-io:master Aug 24, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
4 participants