Skip to content

[AIRFLOW-6790] Add basic Tableau Integration#7410

Merged
kaxil merged 3 commits intoapache:masterfrom
feluelle:feature/AIRFLOW-6790-tableau-integration-basic
Feb 22, 2020
Merged

[AIRFLOW-6790] Add basic Tableau Integration#7410
kaxil merged 3 commits intoapache:masterfrom
feluelle:feature/AIRFLOW-6790-tableau-integration-basic

Conversation

@feluelle
Copy link
Member

@feluelle feluelle commented Feb 13, 2020

This PR adds a TableauHook which wraps the tableauserverclient python library in an AirflowHook to connect to a Tableau Server. It also contains a TableauRefreshWorkbookOperator which refreshes a Tableau workbook / extract and a TableauJobStatusSensor which pokes the status of a Tableau Job until it has finished.


Issue link: AIRFLOW-6790

Make sure to mark the boxes below before creating PR: [x]

  • Description above provides context of the change
  • Commit message/PR title starts with [AIRFLOW-NNNN]. AIRFLOW-NNNN = JIRA ID*
  • Unit tests coverage for changes (not needed for documentation changes)
  • Commits follow "How to write a good git commit message"
  • Relevant documentation is updated including usage instructions.
  • I will engage committers as explained in Contribution Workflow Example.

* For document-only changes commit message can start with [AIRFLOW-XXXX].


In case of fundamental code change, Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in UPDATING.md.
Read the Pull Request Guidelines for more information.

Copy link
Member Author

@feluelle feluelle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tableau does belong to Salesforce but can be used separately.

So the question is whether we want to install it separately or together with simple-salesforce.

WDYT?

@feluelle feluelle force-pushed the feature/AIRFLOW-6790-tableau-integration-basic branch from 7672250 to 7ee25f9 Compare February 13, 2020 10:55
@feluelle feluelle force-pushed the feature/AIRFLOW-6790-tableau-integration-basic branch 2 times, most recently from 2a84b4f to f4ffb15 Compare February 17, 2020 14:12
@feluelle
Copy link
Member Author

Ready for another review @eladkal :)

@feluelle feluelle force-pushed the feature/AIRFLOW-6790-tableau-integration-basic branch 3 times, most recently from 17e281b to f76a451 Compare February 18, 2020 17:27
@codecov-io
Copy link

codecov-io commented Feb 18, 2020

Codecov Report

Merging #7410 into master will decrease coverage by 0.59%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff            @@
##           master    #7410     +/-   ##
=========================================
- Coverage   86.66%   86.06%   -0.6%     
=========================================
  Files         877      881      +4     
  Lines       41252    41323     +71     
=========================================
- Hits        35751    35565    -186     
- Misses       5501     5758    +257
Impacted Files Coverage Δ
airflow/models/connection.py 95.07% <ø> (ø) ⬆️
...s/salesforce/operators/tableau_refresh_workbook.py 100% <100%> (ø)
airflow/providers/salesforce/hooks/tableau.py 100% <100%> (ø)
...providers/salesforce/sensors/tableau_job_status.py 100% <100%> (ø)
...e/example_dags/example_tableau_refresh_workbook.py 100% <100%> (ø)
airflow/utils/db.py 98.29% <100%> (+0.01%) ⬆️
...flow/providers/apache/cassandra/hooks/cassandra.py 21.51% <0%> (-72.16%) ⬇️
...w/providers/apache/hive/operators/mysql_to_hive.py 100% <0%> (ø) ⬆️
airflow/kubernetes/volume_mount.py 44.44% <0%> (-55.56%) ⬇️
airflow/api/auth/backend/kerberos_auth.py 83.09% <0%> (ø) ⬆️
... and 20 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 7c95c81...f76a451. Read the comment docs.

Copy link
Contributor

@Fokko Fokko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work @feluelle. Having a Tableau operator would be a great addition. I've added some minor comments.

setup.py Outdated
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think Salesforce and Tableau are tightly integrated. I would split this into another group.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay. I am fine with this :)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I put it into a separate group, does it mean it also belongs to the tableau provider? No, right?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Fokko do you think we should move it to a new provider called Tableau?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that makes sense to me :)

Copy link
Contributor

@eladkal eladkal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tested against our Tableau server and it works as expected.

@feluelle feluelle force-pushed the feature/AIRFLOW-6790-tableau-integration-basic branch from 6bd1605 to 7b7082c Compare February 20, 2020 14:09
@feluelle feluelle requested a review from Fokko February 20, 2020 14:11
@feluelle
Copy link
Member Author

@eladkal please take another look.

@feluelle feluelle changed the title [AIRFLOW-6790] Add basic Tableau Integration WIP: [AIRFLOW-6790] Add basic Tableau Integration Feb 20, 2020
@feluelle feluelle changed the title WIP: [AIRFLOW-6790] Add basic Tableau Integration [WIP] [AIRFLOW-6790] Add basic Tableau Integration Feb 20, 2020
@feluelle feluelle force-pushed the feature/AIRFLOW-6790-tableau-integration-basic branch from 7b7082c to 8710e5a Compare February 20, 2020 16:04
@feluelle feluelle changed the title [WIP] [AIRFLOW-6790] Add basic Tableau Integration [AIRFLOW-6790] Add basic Tableau Integration Feb 20, 2020
@feluelle feluelle changed the title [AIRFLOW-6790] Add basic Tableau Integration [DO NOT MERGE] [AIRFLOW-6790] Add basic Tableau Integration Feb 20, 2020
@feluelle feluelle force-pushed the feature/AIRFLOW-6790-tableau-integration-basic branch 2 times, most recently from f05e32a to 54c842c Compare February 20, 2020 16:57
@feluelle feluelle changed the title [DO NOT MERGE] [AIRFLOW-6790] Add basic Tableau Integration [AIRFLOW-6790] Add basic Tableau Integration Feb 20, 2020
- add TableauHook which wraps the tableauserverclient python library in an AirflowHook to connect to a Tableau Server
- add TableauRefreshWorkbookOperator which refreshes a Tableau workbook / extract
- add TableauJobStatusSensor which pokes the status of a Tableau Job until it has finished
@feluelle feluelle force-pushed the feature/AIRFLOW-6790-tableau-integration-basic branch from 54c842c to 7769a52 Compare February 20, 2020 17:23
@feluelle
Copy link
Member Author

Ready for another round @Fokko @eladkal. I am going to continue tomorrow :)

Thank you for your time!

Copy link
Contributor

@Fokko Fokko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good @feluelle

# The following task queries the status of the workbook refresh job until it succeeds.
task_check_job_status = TableauJobStatusSensor(
site_id='my_site',
job_id="{{ ti.xcom_pull(task_ids='refresh_tableau_workbook') }}",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't it be refresh_tableau_workbook_non_blocking ?

:type tableau_conn_id: str
"""

def __init__(self, site_id: Optional[str] = None, tableau_conn_id: str = 'tableau_default'):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I asked in previous review about the switching the default of site_id from empty string to None. You had a good point that empty string is the default of the pacakge
https://tableau.github.io/server-client-python/docs/api-ref#authentication
Why did you enetualy decided to switch to None?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.


db.merge_conn(
models.Connection(
conn_id='tableau_test_password',
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why this connection is required? Can't it use the tableau_default that was added to utils/db.py?

@kaxil kaxil merged commit a9ad0a9 into apache:master Feb 22, 2020
@kaxil
Copy link
Member

kaxil commented Feb 22, 2020

Good work @feluelle 🎉

galuszkak pushed a commit to FlyrInc/apache-airflow that referenced this pull request Mar 5, 2020
@feluelle feluelle deleted the feature/AIRFLOW-6790-tableau-integration-basic branch April 30, 2020 07:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants