Merge current branch into next (#1719)
* add content

* add content

* add content

* Update website/blog/2022-06-30-extract-sql-function.md

Co-authored-by: Kira Furuichi <kirafuruichi2019@u.northwestern.edu>

* add content

* add content

* Update dimensional-modeling.md

* add content

* add content

* small grammar fix

* changes to include expression metrics
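For context on the expression-metrics item above, here is a minimal, hypothetical sketch of an expression-type metric as introduced in the dbt v1.2 metrics spec. The metric names, timestamp column, and grains are made up, and the exact field names should be verified against the metrics docs:

```yaml
version: 2

metrics:
  # Hypothetical derived ("expression") metric: built from two existing metrics
  # rather than aggregating a column directly. All names here are illustrative.
  - name: average_order_value
    label: Average Order Value
    type: expression
    sql: "{{ metric('total_revenue') }} / {{ metric('order_count') }}"
    timestamp: order_date
    time_grains: [day, week, month]
```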

* Fix all broken links in docs subfolder

* Fix blog post broken links

* fixes!

* Remove unused tag item from blog sidebar

* Update redirect

* Fix best practice link

* updating for link

* Update Oracle profile
- Changes related to the new Python driver `python-oracledb`, which vastly simplifies dbt-oracle installation
- Wallet configuration is explained more clearly for one-way TLS and mTLS
- Reorganized content

* Update website/docs/docs/building-a-dbt-project/metrics.md

Co-authored-by: Jeremy Cohen <jeremy@dbtlabs.com>

* updating wording in metrics

* Update website/blog/2022-07-12-change-data-capture-metrics.md

Co-authored-by: Grace Goheen <53586774+graciegoheen@users.noreply.github.com>

* Update website/blog/2022-07-12-change-data-capture-metrics.md

Co-authored-by: Grace Goheen <53586774+graciegoheen@users.noreply.github.com>

* Update website/blog/2022-07-12-change-data-capture-metrics.md

Co-authored-by: Grace Goheen <53586774+graciegoheen@users.noreply.github.com>

* Update website/blog/2022-07-12-change-data-capture-metrics.md

Co-authored-by: Grace Goheen <53586774+graciegoheen@users.noreply.github.com>

* Update website/blog/2022-07-12-change-data-capture-metrics.md

Co-authored-by: Grace Goheen <53586774+graciegoheen@users.noreply.github.com>

* Update website/blog/2022-07-12-change-data-capture-metrics.md

Co-authored-by: Grace Goheen <53586774+graciegoheen@users.noreply.github.com>

* Update website/blog/2022-07-12-change-data-capture-metrics.md

Co-authored-by: Grace Goheen <53586774+graciegoheen@users.noreply.github.com>

* Update website/blog/2022-07-12-change-data-capture-metrics.md

Co-authored-by: Grace Goheen <53586774+graciegoheen@users.noreply.github.com>

* Update website/blog/2022-07-12-change-data-capture-metrics.md

Co-authored-by: Grace Goheen <53586774+graciegoheen@users.noreply.github.com>

* Update website/blog/2022-07-12-change-data-capture-metrics.md

Co-authored-by: Grace Goheen <53586774+graciegoheen@users.noreply.github.com>

* reflecting changes in upgrade doc

* Add bold record to data change tables

* Update website/docs/terms/data-warehouse.md

Co-authored-by: Kira Furuichi <kirafuruichi2019@u.northwestern.edu>

* Update website/docs/terms/data-warehouse.md

Co-authored-by: Kira Furuichi <kirafuruichi2019@u.northwestern.edu>

* Update website/docs/terms/data-warehouse.md

Co-authored-by: Kira Furuichi <kirafuruichi2019@u.northwestern.edu>

* Update website/docs/terms/data-warehouse.md

Co-authored-by: Kira Furuichi <kirafuruichi2019@u.northwestern.edu>

* Update website/blog/2022-06-30-lower-sql-function.md

Co-authored-by: Kira Furuichi <kirafuruichi2019@u.northwestern.edu>

* Update website/blog/2022-06-30-lower-sql-function.md

Co-authored-by: Kira Furuichi <kirafuruichi2019@u.northwestern.edu>

* Update website/blog/2022-06-30-lower-sql-function.md

Co-authored-by: Kira Furuichi <kirafuruichi2019@u.northwestern.edu>

* Update website/blog/2022-06-30-lower-sql-function.md

Co-authored-by: Kira Furuichi <kirafuruichi2019@u.northwestern.edu>

* Update website/blog/2022-06-30-lower-sql-function.md

Co-authored-by: Kira Furuichi <kirafuruichi2019@u.northwestern.edu>

* Add links

* Update website/blog/2022-06-30-lower-sql-function.md

Co-authored-by: Kira Furuichi <kirafuruichi2019@u.northwestern.edu>

* Update 2022-06-30-lower-sql-function.md

* Update website/blog/2022-06-30-extract-sql-function.md

Co-authored-by: Kira Furuichi <kirafuruichi2019@u.northwestern.edu>

* Update website/blog/2022-06-30-coalesce-sql.md

Co-authored-by: Kira Furuichi <kirafuruichi2019@u.northwestern.edu>

* Apply suggestions from code review

Co-authored-by: Kira Furuichi <kirafuruichi2019@u.northwestern.edu>

* Add links

* Apply suggestions from code review

Co-authored-by: Kira Furuichi <kirafuruichi2019@u.northwestern.edu>

* Add strikethrough

* small syntax fixes

* Fix hyperlink

* Update website/docs/terms/data-warehouse.md

Co-authored-by: Kira Furuichi <kirafuruichi2019@u.northwestern.edu>

* Apply suggestions from code review

Co-authored-by: Kira Furuichi <kirafuruichi2019@u.northwestern.edu>

* updating docs

* fixing conflict

* Add updated redirects

* Fix jinja links

* Fix conflicts

* Update jinja reference redirect

* Add links

* Update website/blog/2022-06-30-extract-sql-function.md

Co-authored-by: Kira Furuichi <kirafuruichi2019@u.northwestern.edu>

* Update website/blog/2022-06-30-extract-sql-function.md

Co-authored-by: Kira Furuichi <kirafuruichi2019@u.northwestern.edu>

* Add truncate

* Update website/blog/2022-06-30-coalesce-sql.md

Co-authored-by: Kira Furuichi <kirafuruichi2019@u.northwestern.edu>

* Shorten reading list

* Update website/blog/2022-07-05-date-trunc-sql-love-letter.md

Co-authored-by: Kira Furuichi <kirafuruichi2019@u.northwestern.edu>

* Update website/blog/2022-07-05-date-trunc-sql-love-letter.md

Co-authored-by: Kira Furuichi <kirafuruichi2019@u.northwestern.edu>

* Update website/blog/2022-07-05-date-trunc-sql-love-letter.md

Co-authored-by: Kira Furuichi <kirafuruichi2019@u.northwestern.edu>

* Set featured to true

* Apply suggestions from code review

Co-authored-by: Kira Furuichi <kirafuruichi2019@u.northwestern.edu>

* https://github.com/dbt-labs/dbt-core/discussions/5468

* Update redirects

* Replace two more migration redirects

* Update website/blog/2022-07-12-change-data-capture-metrics.md

Co-authored-by: Grace Goheen <53586774+graciegoheen@users.noreply.github.com>

* Update website/blog/2022-07-12-change-data-capture-metrics.md

Co-authored-by: Grace Goheen <53586774+graciegoheen@users.noreply.github.com>

* Update website/blog/2022-07-12-change-data-capture-metrics.md

Co-authored-by: Grace Goheen <53586774+graciegoheen@users.noreply.github.com>

* Update website/blog/2022-07-12-change-data-capture-metrics.md

Co-authored-by: Grace Goheen <53586774+graciegoheen@users.noreply.github.com>

* remove images

* Update revenue-meme.png

* Update revenue-meme.png

* Update revenue-meme.png

* Update redirect

* Update website/blog/2022-07-12-change-data-capture-metrics.md

* Changes in response to review comments
- Changed the phrase "one-way TLS" to simply "TLS"
- Included a link to the official ADB documentation for wallet-less TLS connections
- Changed service names to example service names
- Changed the ordering of connection methods to: TNS alias, connect string, and database hostname
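As a rough illustration of those connection methods, here is a hypothetical dbt-oracle `profiles.yml` sketch. The hosts, service names, and credentials are placeholders, and the field names reflect our reading of the dbt-oracle profile spec rather than a verified reference:

```yaml
# Hypothetical profiles.yml targets for dbt-oracle; all values are examples only
jaffle_shop:
  target: tns
  outputs:
    tns:                        # 1. TNS alias resolved via tnsnames.ora / wallet
      type: oracle
      user: "{{ env_var('DBT_ORACLE_USER') }}"
      pass: "{{ env_var('DBT_ORACLE_PASSWORD') }}"
      tns_name: example_high
      schema: analytics
      threads: 4
    connect_string:             # 2. Full connect string (assumed field name)
      type: oracle
      user: "{{ env_var('DBT_ORACLE_USER') }}"
      pass: "{{ env_var('DBT_ORACLE_PASSWORD') }}"
      connection_string: "tcps://adb.example.oraclecloud.com:1522/example_high.adb.oraclecloud.com"
      schema: analytics
      threads: 4
    hostname:                   # 3. Database hostname, port, and service
      type: oracle
      user: "{{ env_var('DBT_ORACLE_USER') }}"
      pass: "{{ env_var('DBT_ORACLE_PASSWORD') }}"
      protocol: tcps
      host: adb.example.oraclecloud.com
      port: 1522
      service: example_high.adb.oraclecloud.com
      schema: analytics
      threads: 4
```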

* Update website/docs/guides/migration/versions/06-upgrading-to-v1.2.md

Co-authored-by: Jeremy Cohen <jeremy@dbtlabs.com>

* Update release date

* remove next.docs redirect

* remove extra properties

* Update redirects

* Add redirects back

* add GitHub Action to merge current branch into next

* Add wildcard for migration guides

* Add back specific redirects

* v1.2 docs omnibus (#1698)

* Add itertools module #1368

* Manifest v6 #1667

* Add set + zip #1635

* File selector method #1627

* Selector inheritance #1628

* Global config for target-path, log-path #1687
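To make the selector inheritance item concrete, here is a hypothetical `selectors.yml` that reuses one selector inside another. Selector and node names are made up, and the exact YAML shape should be checked against the v1.2 node-selection docs:

```yaml
# Hypothetical selectors.yml showing selector inheritance (new in dbt v1.2)
selectors:
  - name: nightly
    definition:
      union:
        - method: fqn
          value: staging
        - method: tag
          value: nightly

  - name: nightly_plus_marts
    description: "Everything in 'nightly', plus the marts folder"
    definition:
      union:
        - method: selector      # reuse the selector defined above
          value: nightly
        - method: fqn
          value: marts
```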

* Update website/docs/reference/dbt-jinja-functions/zip.md

Co-authored-by: Anders <swanson.anders@gmail.com>

* Update website/docs/reference/dbt-jinja-functions/set.md

Co-authored-by: Anders <swanson.anders@gmail.com>

* PR feedback

Co-authored-by: Anders <swanson.anders@gmail.com>

* Update to `grants` documentation (#1707)

* Update docs on hooks re grants

* Update BQ language, link to docs

* Configuring grants, inheritance, reorg

* PR feedback
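For reference, the feature those grants docs describe is configured in YAML along these lines — a minimal sketch assuming a folder-level config in `dbt_project.yml`; the project, folder, and role names are placeholders:

```yaml
# Hypothetical dbt_project.yml snippet; grants shipped in dbt Core v1.2
models:
  my_project:
    marts:
      +grants:
        select: ["reporter", "bi_user"]   # placeholder role names
```

The linked documentation also covers how grants interact with hooks and how grant configs are inherited and merged across the project, folder, and model levels.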

* Apply correct redirects for older versions

* Update snowflake-configs.md

In researching the use of Time Travel with dbt, I found the documentation to be misleading and thought I needed to change my dbt configuration. The statement that transient tables do not participate in Time Travel is not correct. Per Snowflake's documentation, transient tables persist until they are dropped, have a default Time Travel retention period of 1 day, and do not participate in Fail-safe.
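If that retention behavior matters for your project, the relevant dbt-side setting is the Snowflake `transient` config — a minimal sketch assuming a hypothetical project and folder name:

```yaml
# dbt_project.yml — opt a folder of models out of Snowflake's default transient tables
models:
  my_project:
    marts:
      +transient: false   # permanent tables allow longer Time Travel retention;
                          # transient tables default to 1 day and skip Fail-safe
```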

* add links

* Add links

* Fix note

* Fix notes

Co-authored-by: johnblust <97988576+johnblust@users.noreply.github.com>
Co-authored-by: Kira Furuichi <kirafuruichi2019@u.northwestern.edu>
Co-authored-by: Callum McCann <cmccann51@gmail.com>
Co-authored-by: john-rock <johnmrock.jr@gmail.com>
Co-authored-by: Anders <swanson.anders@gmail.com>
Co-authored-by: Abhishek Singh <abhishek.o.singh@oracle.com>
Co-authored-by: Callum McCann <101437052+callum-mcdata@users.noreply.github.com>
Co-authored-by: Jeremy Cohen <jeremy@dbtlabs.com>
Co-authored-by: Grace Goheen <53586774+graciegoheen@users.noreply.github.com>
Co-authored-by: Anders Swanson <anders.swanson@dbtlabs.com>
Co-authored-by: John Rock <46692803+john-rock@users.noreply.github.com>
Co-authored-by: Jason Karlavige <jkarlavige@gmail.com>
Co-authored-by: Beaulieuj <90111776+JoniBeaulieu@users.noreply.github.com>
Co-authored-by: Leona B. Campbell <3880403+runleonarun@users.noreply.github.com>
15 people committed Jul 15, 2022
1 parent 65ab008 commit 608558f
Showing 107 changed files with 1,786 additions and 319 deletions.
18 changes: 18 additions & 0 deletions .github/workflows/create_next_pr.yml
@@ -0,0 +1,18 @@
on:
  push:
    branches:
      - "current"

jobs:
  pull-request:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - name: pull-request
        uses: repo-sync/pull-request@v2
        with:
          source_branch: "current"
          destination_branch: "next"
          pr_title: "Merge current branch into next"
          pr_body: "*An automated PR to keep the next branch up to date with current*"
          github_token: ${{ secrets.GITHUB_TOKEN }}
26 changes: 21 additions & 5 deletions _redirects
@@ -6,13 +6,15 @@
/dbt-cli/installation-guides/ubuntu-debian /dbt-cli/install/overview 302
/dbt-cli/installation-guides/windows /dbt-cli/install/overview 302
/dbt-cli/installation /dbt-cli/install/overview 302
/dbt-jinja-functions /reference/dbt-jinja-functions 302
/docs /docs/introduction 302
/docs/adapter /docs/writing-code-in-dbt/jinja-context/adapter 302
/docs/analyses /docs/building-a-dbt-project/analyses 302
/docs/api-variable /docs/writing-code-in-dbt/api-variable 302
/docs/archival /docs/building-a-dbt-project/archival 302
/docs/artifacts /docs/dbt-cloud/using-dbt-cloud/artifacts 302
/docs/best-practices /guides/best-practices 302
/docs/guides/best-practices /guides/best-practices 302
/docs/bigquery-configs /reference/resource-configs/bigquery-configs 302
/docs/building-a-dbt-project/building-models/bigquery-configs /reference/resource-configs/bigquery-configs 302
/docs/building-a-dbt-project/building-models/configuring-models /reference/model-configs
@@ -96,6 +98,9 @@
/docs/global-cli-flags /reference/global-cli-flags 302
/docs/graph /docs/writing-code-in-dbt/jinja-context/graph 302
/docs/guides/writing-custom-schema-tests /docs/guides/writing-custom-generic-tests
/docs/guides/best-practices /guides/best-practices
/docs/guides/best-practices#choose-your-materializations-wisely /guides/best-practices 302
/docs/guides/best-practices#version-control-your-dbt-project /guides/legacy/best-practices#version-control-your-dbt-project 302
/docs/hooks /docs/building-a-dbt-project/hooks-operations 302
/docs/init /reference/commands/init 302
/docs/install-from-source /dbt-cli/installation 302
@@ -251,6 +256,8 @@
/docs/writing-code-in-dbt/macros /docs/building-a-dbt-project/jinja-macros 302
/docs/writing-code-in-dbt/using-jinja /guides/getting-started/learning-more/using-jinja 302
/faqs/getting-help/ /guides/legacy/getting-help 302
/migration-guide/upgrading-to-0-17-0 /guides/migration/versions 302
/migration-guide/upgrading-to-0-18-0 /guides/migration/versions 302
/reference/accounts /dbt-cloud/api 302
/reference/api /dbt-cloud/api 302
/reference/connections /dbt-cloud/api 302
@@ -278,6 +285,7 @@ https://tutorial.getdbt.com/* https://docs.getdbt.com/:splat 301!
/reference/project-configs/modules-paths /reference/project-configs/packages-install-path 302
/docs/dbt-cloud/using-dbt-cloud/cloud-slack-notifications /docs/dbt-cloud/using-dbt-cloud/cloud-notifications 302
/reference/warehouse-profiles/presto-profile /reference/profiles.yml 302
/setting-up /guides/getting-started/getting-set-up/setting-up-bigquery 302
/tutorial/setting-up /guides/getting-started 302
/tutorial/test-and-document-your-project /guides/getting-started/building-your-first-project/test-and-document-your-project 302
/tutorial/build-your-first-models /guides/getting-started/building-your-first-project/build-your-first-models 302
@@ -295,15 +303,23 @@ https://tutorial.getdbt.com/* https://docs.getdbt.com/:splat 301!
/tutorial/learning-more/* /guides/getting-started/learning-more/:splat 301
/tutorial/getting-set-up/* /guides/getting-started/getting-set-up/:splat 301
/tutorial/building-your-first-project/* /guides/getting-started/building-your-first-project/:splat 301
/tutorial/refactoring-legacy-sql /guides/getting-started/learning-more/refactoring-legacy-sql 302
# migration and legacy guides
/docs/guides/migration-guide/upgrading-from-0-10-to-0-11 /guides/migration/versions/upgrading-to-0-11-0 302
/docs/guides/migration-guide/upgrading-to-014 /guides/migration/versions/upgrading-to-0-14-0 302
/docs/upgrading-to-014 /guides/migration/versions/upgrading-to-0-14-0 302
/docs/upgrading-to-0-14-1 /guides/migration/versions/upgrading-to-0-14-1 302
/docs/upgrading-to-0-16-0 /guides/migration/versions/upgrading-to-0-16-0 302
/docs/guides/migration-guide/upgrading-to-0-14-0 /guides/migration/versions 302
/docs/guides/migration-guide/upgrading-to-0-15-0 /guides/migration/versions 302
/docs/guides/migration-guide/upgrading-to-0-16-0 /guides/migration/versions 302
/docs/guides/migration-guide/upgrading-to-0-17-0 /guides/migration/versions 302
/docs/guides/migration-guide/upgrading-to-0-18-0 /guides/migration/versions 302
/docs/guides/migration-guide/upgrading-to-0-19-0 /guides/migration/versions 302
/docs/guides/migration-guide/upgrading-from-0-10-to-0-11 /guides/migration/versions 302
/docs/guides/migration-guide/upgrading-to-014 /guides/migration/versions 302
/docs/upgrading-to-014 /guides/migration/versions 302
/docs/upgrading-to-0-14-1 /guides/migration/versions 302
/docs/upgrading-to-0-16-0 /guides/migration/versions 302
/docs/guides/migration-guide/upgrading-to-0-20-0 /guides/migration/versions/upgrading-to-v0.20 302
/docs/guides/migration-guide/upgrading-to-0-21-0 /guides/migration/versions/upgrading-to-v0.21 302
/docs/guides/migration-guide/upgrading-to-1-0-0 /guides/migration/versions/upgrading-to-v1.0 302
/docs/guides/migration-guide/upgrading-to-v1.0 /guides/migration/versions/upgrading-to-v1.0 302
/docs/guides/getting-help /guides/legacy/getting-help 302
/docs/guides/migration-guide/* /guides/migration/versions/:splat 301!
/docs/guides/best-practices /guides/best-practices
6 changes: 3 additions & 3 deletions website/blog/2019-05-01-how-we-structure-dbt-projects.md
@@ -24,11 +24,11 @@ It’s important to note that **this is not the only, or the objectively best, w

* our views on data model design; which in turn are influenced by:
* the kinds of analytics problems we are solving for clients
* the data stack we typically work within, in which multiple data sources are loaded by third party tools, and the data warehouse is optimized for analytical queries (therefore we aren’t tightly bounded by performance optimization considerations).
* the data stack we typically work within, in which multiple data sources are loaded by third party tools, and the <Term id="data-warehouse" /> is optimized for analytical queries (therefore we aren’t tightly bounded by performance optimization considerations).

Our opinions are **almost guaranteed to change over time** as we update our views on modeling, are exposed to more analytics problems, and data stacks evolve. It’s also worth clearly stating here: the way we structure dbt projects makes sense for our projects, but may not be the best fit for yours! This article exists on Discourse so that we can have a conversation – I would love to know how others in the community are structuring their projects.

In comparison, the (recently updated) [best practices](/docs/guides/best-practices) reflect principles that we believe to be true for any dbt project. Of course, these two documents go hand in hand – our projects are structured in such a way that makes the those principles easy to observe, in particular:
In comparison, the (recently updated) [best practices](/guides/best-practices) reflect principles that we believe to be true for any dbt project. Of course, these two documents go hand in hand – our projects are structured in such a way that makes the those principles easy to observe, in particular:

* Limit references to raw data
* Rename and recast fields once
@@ -127,7 +127,7 @@ Some dbt users prefer to have one `.yml` file per model (e.g. `stg_braintree__cu

Earlier versions of the dbt documentation recommended implementing “base models” as the first layer of transformation – and we used to organize and name our models in this way, for example `models/braintree/base/base_payments.sql`.

We realized that while the reasons behind this convention were valid, the naming was an opinion, so in our recent update to the [best practices](/docs/guides/best-practices), we took the mention of base models out. Instead, we replaced it with the principles of “renaming and recasting once” and “limiting the dependencies on raw data”.
We realized that while the reasons behind this convention were valid, the naming was an opinion, so in our recent update to the [best practices](/guides/best-practices), we took the mention of base models out. Instead, we replaced it with the principles of “renaming and recasting once” and “limiting the dependencies on raw data”.

That being said, in our dbt projects every source flows through exactly one model of the following form:

4 changes: 2 additions & 2 deletions website/blog/2021-02-05-dbt-project-checklist.md
@@ -87,7 +87,7 @@ This post is the checklist I created to guide our internal work, and I’m shari
## ✅ Project structure
------------------------------------------------------------------------------------------------------------------------------------------------------

* If you are using dimensional modeling techniques, do you have staging and marts models?
* If you are using <Term id="dimensional-modeling" /> techniques, do you have staging and marts models?
* Do they use table prefixes like ‘fct\_’ and ‘dim\_’?
* Is the code modular? Is it one transformation per one model?
* Are you filtering as early as possible?
@@ -156,7 +156,7 @@ This post is the checklist I created to guide our internal work, and I’m shari

**Useful links**

* [Version control](/docs/guides/best-practices/#version-control-your-dbt-project)
* [Version control](/guides/legacy/best-practices#version-control-your-dbt-project)
* [dbt Labs' PR Template](/blog/analytics-pull-request-template)

## ✅ Documentation
@@ -148,7 +148,7 @@ This approach is nearly identical to the former (completely separate repositorie
* Does not prevent conflicting business logic or duplicate macros
* All models must have unique names across all packages

\*\* The project will include the information from the dbt projects but might be missing information that is pulled from your data warehouse if you are on multiple Snowflake accounts/Redshift instances. This is because dbt is only able to query the information schema from that one connection.
\*\* The project will include the information from the dbt projects but might be missing information that is pulled from your <Term id="data-warehouse" /> if you are on multiple Snowflake accounts/Redshift instances. This is because dbt is only able to query the information schema from that one connection.

## So… to mono-repo or not to mono-repo?
-------------------------------------------------------------------------------
2 changes: 1 addition & 1 deletion website/blog/2021-09-11-sql-dateadd.md
@@ -81,7 +81,7 @@ I am sorry - that’s just a blank 2x2 matrix. I've surrendered to just searchin

But couldn’t we be doing something better with those keystrokes, like typing out and then deleting a tweet?

dbt (and the [dbt_utils](https://hub.getdbt.com/dbt-labs/dbt_utils/latest/#dateadd-source-macros-cross_db_utils-dateadd-sql-) macro package) helps us smooth out these wrinkles of writing SQL across data warehouses.
dbt (and the [dbt_utils](https://hub.getdbt.com/dbt-labs/dbt_utils/latest/#dateadd-source-macros-cross_db_utils-dateadd-sql-) macro package) helps us smooth out these wrinkles of writing SQL across <Term id="data-warehouse">data warehouses</Term>.

Instead of looking up the syntax each time you use it, you can just write it the same way each time, and the macro compiles it to run on your chosen warehouse:

2 changes: 1 addition & 1 deletion website/blog/2021-11-22-primary-keys.md
@@ -89,7 +89,7 @@ Having tests configured and running in production using the [`dbt test`](https:/

Does your warehouse even _support_ primary keys at all? If it does, how can you actually find out if a table has a primary key set, and what that primary key is?

Let’s walk through primary key support + access across the major cloud data warehouse platforms.
Let’s walk through primary key support + access across the major cloud <Term id="data-warehouse" /> platforms.


### TL;DR on primary key support across warehouses
2 changes: 1 addition & 1 deletion website/blog/2021-11-22-sql-surrogate-keys.md
@@ -156,7 +156,7 @@ output:
| `null` | 123 | \|123 |


Let’s take a look at how generating surrogate keys specifically looks in practice across data warehouses, and how you can use one simple dbt macro ([dbt_utils.surrogate_key](https://github.com/dbt-labs/dbt-utils#surrogate_key-source)) to abstract away the null value problem.
Let’s take a look at how generating surrogate keys specifically looks in practice across <Term id="data-warehouse">data warehouses</Term>, and how you can use one simple dbt macro ([dbt_utils.surrogate_key](https://github.com/dbt-labs/dbt-utils#surrogate_key-source)) to abstract away the null value problem.


### A surrogate_key macro to the rescue
@@ -69,7 +69,7 @@ In addition to learning the basic pieces of dbt, we're familiarizing ourselves w

If we decide not to do this, we end up missing out on what the dbt workflow has to offer. If you want to learn more about why we think analytics engineering with dbt is the way to go, I encourage you to read the [dbt Viewpoint](/docs/about/viewpoint)!

In order to learn the basics, we’re going to [port over the SQL file](/tutorial/refactoring-legacy-sql) that powers our existing "patient_claim_summary" report that we use in our KPI dashboard in parallel to our old transformation process. We’re not ripping out the old plumbing just yet. In doing so, we're going to try dbt on for size and get used to interfacing with a dbt project.
In order to learn the basics, we’re going to [port over the SQL file](/guides/getting-started/learning-more/refactoring-legacy-sql) that powers our existing "patient_claim_summary" report that we use in our KPI dashboard in parallel to our old transformation process. We’re not ripping out the old plumbing just yet. In doing so, we're going to try dbt on for size and get used to interfacing with a dbt project.

**Project Appearance**

@@ -75,7 +75,7 @@ JaffleGaggle has to keep track of information about their interactions with thei

All of these questions require aggregating + syncing data from application usage, workspace information, and orders into the CRM for the sales team to have at their fingertips.

This aggregation process requires an analytics warehouse, as all of these things need to be synced together outside of the application database itself to incorporate other data sources (billing / events information, past touchpoints in the CRM, etc). Thus, we can create our fancy customer 360 within JaffleGaggle’s data warehouse, which is a standard project for a B2B company’s data team.
This aggregation process requires an analytics warehouse, as all of these things need to be synced together outside of the application database itself to incorporate other data sources (billing / events information, past touchpoints in the CRM, etc). Thus, we can create our fancy customer 360 within JaffleGaggle’s <Term id="data-warehouse" />, which is a standard project for a B2B company’s data team.

**Diving into data modeling**

2 changes: 1 addition & 1 deletion website/blog/2022-02-23-founding-an-AE-team-smartsheet.md
@@ -30,7 +30,7 @@ Enter this story. I’m Nate and I manage the Analytics Engineering team at [Sma

## State of Analytics Before Analytics Engineering

Smartsheet, in general, has a great analytics setup. Strong data engineering and data analytics teams. A cloud data warehouse and an on-prem BI tool for front-end data visibility.  However, even with that foundation, there were some limitations under the hood requiring action:
Smartsheet, in general, has a great analytics setup. Strong data engineering and data analytics teams. A cloud <Term id="data-warehouse" /> and an on-prem BI tool for front-end data visibility.  However, even with that foundation, there were some limitations under the hood requiring action:

### (1) Multiple undocumented transformation databases

2 changes: 1 addition & 1 deletion website/blog/2022-04-19-complex-deduplication.md
@@ -15,7 +15,7 @@ Let’s get rid of these dupes and send you on your way to do the rest of the *s

<!--truncate-->

You’re here because your duplicates are *special* duplicates. These special dupes are not the basic ones that have same exact values in every column and duplicate <Term id="primary-key">primary keys</Term> that can be easily fixed by haphazardly throwing in a `distinct` (yeah that’s right, I called using `distinct` haphazard!). These are *partial* duplicates, meaning your entity of concern's primary key is not unique *on purpose* (or perhaps you're just dealing with some less than ideal data syncing). You may be capturing historical, type-two slowly changing dimensional data, or incrementally building a table with an append-only strategy, because you actually want to capture some change over time for the entity your recording. (Or, as mentioned, your loader may just be appending data indiscriminately on a schedule without much care for your time and sanity.) Whatever has brought you here, you now have a table where the <Term id="grain" /> is not your entity’s primary key, but instead the entity’s primary key + the column values that you’re tracking. Confused? Let’s look at an example.
You’re here because your duplicates are *special* duplicates. These special dupes are not the basic ones that have same exact values in every column and duplicate <Term id="primary-key">primary keys</Term> that can be easily fixed by haphazardly throwing in a `distinct` (yeah that’s right, I called using `distinct` haphazard!). These are *partial* duplicates, meaning your entity of concern's primary key is not unique *on purpose* (or perhaps you're just dealing with some less than ideal data syncing). You may be capturing historical, type-two slowly changing <Term id="dimensional-modeling">dimensional</Term> data, or incrementally building a table with an append-only strategy, because you actually want to capture some change over time for the entity your recording. (Or, as mentioned, your loader may just be appending data indiscriminately on a schedule without much care for your time and sanity.) Whatever has brought you here, you now have a table where the <Term id="grain" /> is not your entity’s primary key, but instead the entity’s primary key + the column values that you’re tracking. Confused? Let’s look at an example.

Here’s your raw table:

@@ -40,7 +40,7 @@ Analysts are interfacing with data from the outside in. They are in meetings wit

- Precomputed views/tables in a BI tool
- Read-only access to the dbt Cloud IDE docs
- Full list of tables and views in their data warehouse
- Full list of tables and views in their <Term id="data-warehouse" />

#### Precomputed views/tables in a BI tool
