[ENHANCEMENT] BigQuery integration should use a view instead of a table #2082

alessandrolacorte · 2020-11-25T11:02:59Z

Changes proposed in this pull request:

The BigQuery integration should use a view instead of a table, for cost reasons. The current implementation creates a temporary table, which can be a full copy of the original table. In BigQuery you pay for the amount of data scanned, so for tables holding terabytes of data this can become very expensive. The alternative is to use a view, which incurs no cost for duplicating the source table.
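
To make the difference concrete, here is a minimal sketch of the two statements being compared. The helper name `build_staging_statement` and its parameters are hypothetical and are not taken from the Great Expectations code base; only the `CREATE OR REPLACE TABLE ... AS` and `CREATE OR REPLACE VIEW ... AS` statements are standard BigQuery DDL.

```python
def build_staging_statement(name: str, query: str, use_view: bool = True) -> str:
    """Build BigQuery DDL that exposes `query` under `name`.

    Hypothetical helper for illustration only. With use_view=False the
    statement copies the query result into a new table; with
    use_view=True it only stores the query definition as a view.
    """
    kind = "VIEW" if use_view else "TABLE"
    # Backticks quote the BigQuery identifier (for example dataset.object).
    return f"CREATE OR REPLACE {kind} `{name}` AS {query}"


if __name__ == "__main__":
    source_query = "SELECT * FROM `my_dataset.big_table`"  # placeholder table name
    print(build_staging_statement("my_dataset.ge_tmp", source_query, use_view=False))
    print(build_staging_statement("my_dataset.ge_tmp", source_query, use_view=True))
```

The only difference between the two statements is the object kind, which is why the switch can be small in code while still avoiding the copy of the source data.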

eugmandel · 2020-11-30T20:51:27Z

@alessandrolacorte Thank you for submitting this PR! We will review it this week.

eugmandel · 2020-12-09T23:47:10Z

@alessandrolacorte GE stores the results of a query in a table to make it faster to run queries against it (during validation). Do you know if using a view instead of a table will preserve the execution speed?

alessandrolacorte · 2020-12-10T12:25:54Z

> @alessandrolacorte GE stores the results of a query in a table to make it faster to run queries against it (during validation). Do you know if using a view instead of a table will preserve the execution speed?

Hello! I can guarantee that the view will preserve the execution speed. We could go into the details of how BigQuery works and how Dremel (the BigQuery engine) generates the execution plan, but that might be overkill.
In essence, views in BigQuery are not materialized and have zero execution penalty: querying a view is as fast as querying the underlying table.

eugmandel · 2020-12-14T22:44:40Z

@alessandrolacorte Awesome - will merge and it will go out with today's release.

alexsherstinsky

LGTM

…le (great-expectations#2082) * Change from using a table to a view * Updating changelog.rst Co-authored-by: Eugene Mandel <eugene.mandel@gmail.com> Co-authored-by: Alex Sherstinsky <alexsherstinsky@users.noreply.github.com>

alessandrolacorte added 2 commits November 20, 2020 19:08

Change from using a table to a view

8d67400

Updating changelog.rst

f2e2f68

alessandrolacorte mentioned this pull request Nov 25, 2020

[ENHANCEMENT] BigQuery integration should use a view instead of a table #2081

Closed

Merge branch 'develop' into feature-update-bigquery-temp-table-to-view

8799d58

alessandrolacorte marked this pull request as ready for review November 25, 2020 11:26

Merge branch 'develop' into feature-update-bigquery-temp-table-to-view

5a7630f

eugmandel added 2 commits December 1, 2020 11:28

Merge branch 'develop' into feature-update-bigquery-temp-table-to-view

5c4c94f

Merge branch 'develop' into feature-update-bigquery-temp-table-to-view

f7204f7

Merge branch 'develop' into feature-update-bigquery-temp-table-to-view

a986a9e

alexsherstinsky approved these changes Dec 14, 2020

alexsherstinsky added 2 commits December 14, 2020 15:53

Merge branch 'develop' into feature-update-bigquery-temp-table-to-view

b87c8e4

Merge branch 'develop' into feature-update-bigquery-temp-table-to-view

efd77d8

alexsherstinsky merged commit bbace6f into great-expectations:develop Dec 15, 2020
