# Analytics Engineering

### Introduction

In the last couple of lessons, we learned about the various tools involved at the top of the funnel (awareness), the middle of the funnel (nurturing), and the bottom of the funnel (sales), as well as tools for tracking active users.

The next step is to learn about the data engineer's role in all of this.

### 1. Dashboarding and Internal Analysis

The first component is internal analysis.  This involves a stack for migrating the data captured by our various marketing tools, like Facebook Ads for the top of the funnel, event tracking for the middle of the funnel, HubSpot for the bottom of the funnel, and Stripe for payments.

Ultimately, a company wants to know which ads, site features, or sales efforts are driving users to make a purchase, and which of these components are less productive.  The company then needs to see how these efforts tie to active users and, in the end, to revenue.

To capture this information, a data engineer collects data from the third-party tools with an EL (extract and load) tool like Fivetran or Meltano, and loads that data into an analytics database.  Then we transform and merge the data with dbt, and either present it in a dashboard or organize it for further analysis by other stakeholders like data scientists or business analysts.
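As a rough sketch of the load, transform, and merge steps described above, the example below uses Python's built-in `sqlite3` module as a stand-in for the analytics database. The table names, sample rows, and the `revenue_by_campaign` helper are all assumptions made for illustration; in practice an EL tool would do the loading, and the `SELECT` statement is only the kind of query a dbt model might hold.

```python
import sqlite3

# Hypothetical rows, standing in for what an EL tool would load from each source.
ad_clicks = [
    ("u1", "facebook_spring_sale"),
    ("u2", "facebook_spring_sale"),
    ("u3", "facebook_retargeting"),
]
payments = [
    ("u1", 50.0),
    ("u3", 120.0),
    ("u3", 30.0),
]


def load_raw(conn):
    """Load step: copy the source rows into raw tables, untransformed."""
    conn.execute("CREATE TABLE raw_ad_clicks (user_id TEXT, campaign TEXT)")
    conn.execute("CREATE TABLE raw_payments (user_id TEXT, amount REAL)")
    conn.executemany("INSERT INTO raw_ad_clicks VALUES (?, ?)", ad_clicks)
    conn.executemany("INSERT INTO raw_payments VALUES (?, ?)", payments)


# Transform step: merge ad data with payment data, one row per campaign.
REVENUE_BY_CAMPAIGN = """
SELECT c.campaign,
       COUNT(DISTINCT c.user_id) AS users,
       COALESCE(SUM(p.amount), 0) AS revenue
FROM raw_ad_clicks AS c
LEFT JOIN raw_payments AS p ON p.user_id = c.user_id
GROUP BY c.campaign
ORDER BY c.campaign
"""


def revenue_by_campaign(conn):
    """Run the transform and return (campaign, users, revenue) tuples."""
    return conn.execute(REVENUE_BY_CAMPAIGN).fetchall()


conn = sqlite3.connect(":memory:")
load_raw(conn)
for row in revenue_by_campaign(conn):
    print(row)
```

The resulting rows are what a dashboard would chart: for each campaign, how many users clicked and how much revenue those users went on to generate.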

### 2. Integrating Tools

In addition to collecting data for internal analysis, data engineers help enrich data on existing platforms.  There are a few ways that data engineers can assist with this.

1. Share data between SaaS tools

One task of data engineers is to take data from one tool and populate it into another tool.  For example, imagine that in HubSpot a salesperson identifies the industry of our customer.  If that customer later contacts customer support through Zendesk, it may be helpful to show the customer's industry there as well, to provide more context.  Or maybe our web tracking software identifies the location of the user, and we want that information also captured in a sales tool like HubSpot.

This process of capturing data from our tools and then loading it into other tools is called *reverse ETL*.  It's called reverse ETL because, where ETL pulls data from our third-party tools into our analytics database, reverse ETL goes the other way and loads data from our database back into those tools.
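To make the direction of reverse ETL concrete, here is a minimal sketch that pushes a customer's industry from warehouse rows into a support tool. Everything named here is hypothetical: `FakeSupportTool` only records updates in memory and is not any vendor's real client, and the rows stand in for the result of a query against the analytics database.

```python
from dataclasses import dataclass, field


@dataclass
class FakeSupportTool:
    """Hypothetical stand-in for a support tool's API client; it only records updates."""

    contacts: dict = field(default_factory=dict)

    def update_contact(self, email, properties):
        self.contacts.setdefault(email, {}).update(properties)


def reverse_etl(warehouse_rows, destination):
    """Push the industry stored in the warehouse out to the destination tool."""
    synced = 0
    for row in warehouse_rows:
        if not row.get("industry"):
            continue  # nothing to sync for this customer
        destination.update_contact(row["email"], {"industry": row["industry"]})
        synced += 1
    return synced


# Hypothetical rows, as if queried from a customers table in the analytics database.
warehouse_rows = [
    {"email": "ana@example.com", "industry": "Healthcare"},
    {"email": "bo@example.com", "industry": None},
]

support_tool = FakeSupportTool()
print(reverse_etl(warehouse_rows, support_tool))
print(support_tool.contacts)
```

Only customers with a known industry are synced, so the destination tool is never overwritten with empty values.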

Currently, there are off-the-shelf products that will automatically pull data from some tools and load it into others.  There are reverse ETL tools like [Census](https://www.getcensus.com/).  There are also Customer Data Platforms (CDPs), which both perform reverse ETL and provide web tracking.

2. CDP Web Tracking

We've already heard about tools like Mixpanel and Google Analytics that perform web tracking for us.  So why would we want a CDP to also perform web tracking?  The idea of web tracking with a CDP is to have a single service that tracks the various user events.  Then the CDP can send these events to Google Analytics to track certain web events, and *also* send them to Mixpanel for other analysis.

The idea is to use the CDP as an event tracking API that multiple services can listen in on.  Using the CDP reduces engineering work, as we do not need to add separate code each time a different service wants to listen to the same event.  Using a CDP also allows companies to change the services they use at limited engineering cost.
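The fan-out idea can be sketched with a toy event hub. This is only an illustration of the pattern, not how any particular CDP is implemented: the `EventHub` class, its method names, and the two list-backed destinations are all assumptions.

```python
class EventHub:
    """Toy stand-in for a CDP: one track() call fans out to every destination."""

    def __init__(self):
        self._destinations = {}

    def add_destination(self, name, handler):
        self._destinations[name] = handler

    def remove_destination(self, name):
        self._destinations.pop(name, None)

    def track(self, user_id, event, properties=None):
        payload = {"user_id": user_id, "event": event, "properties": properties or {}}
        for handler in self._destinations.values():
            handler(payload)


# Hypothetical destinations: lists that stand in for two analytics services.
web_analytics_events = []
product_analytics_events = []

hub = EventHub()
hub.add_destination("web_analytics", web_analytics_events.append)
hub.add_destination("product_analytics", product_analytics_events.append)

# Application code makes a single call, whatever the number of destinations.
hub.track("u1", "signup_clicked", {"plan": "pro"})

# Dropping a service changes the hub's configuration, not the tracking calls.
hub.remove_destination("product_analytics")
hub.track("u1", "checkout_started")

print(len(web_analytics_events), len(product_analytics_events))
```

The application only ever calls `hub.track`, which is why adding or removing a downstream service does not require touching the code that emits events.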

In theory, a Customer Data Platform is supposed to be a central data hub for our analytics services.  It listens for events and sends the event data to multiple services.  And it performs reverse ETL, pulling data from one third-party tool and sending it to another.
> Our data warehouse, by contrast, captures the data needed for internal analysis, but does not need to capture data related to reverse ETL as the CDP can take care of this.

Examples of CDPs include [Rudderstack](https://rudderstack.com/), [Segment](https://segment.com/), and [Snowplow](https://snowplowanalytics.com/).

3. Customer Enrichment Platform

Finally, there is information about our customers that we may not be capturing internally, but that other companies are capturing.  For example, a service like [Clearbit](https://clearbit.com/) will attempt to identify visitors to our website, or we can send it a captured email address, and it will pull in additional information that can then be sent to other services.
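A small sketch of what enrichment does to a record is shown below. It does not call Clearbit or any real API: `PROVIDER_DATA` is a hypothetical lookup table standing in for an enrichment provider, and the `enrich` helper and its merge rule (our own values win over the provider's) are assumptions chosen for illustration.

```python
# Hypothetical data an enrichment provider might hold, keyed by email domain.
PROVIDER_DATA = {
    "acme.example": {"company": "Acme", "industry": "Manufacturing"},
}


def enrich(contact, provider_data=PROVIDER_DATA):
    """Return a copy of the contact with any provider fields we do not already have."""
    domain = contact["email"].rsplit("@", 1)[-1].lower()
    extra = provider_data.get(domain, {})
    # Values we captured ourselves take precedence over the provider's.
    return {**extra, **contact}


print(enrich({"email": "kim@acme.example", "industry": "Robotics"}))
print(enrich({"email": "lee@unknown.example"}))
```

A contact whose domain the provider does not recognize comes back unchanged, so enrichment only ever adds fields.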

### Resources

[Amplitude Guide to Product Analytics](https://amplitude.com/product-analytics)

[Marketo Cost of Acquisition](https://www.youtube.com/watch?v=d626-SXyh8I)

[Startup Founders Analytics](https://thinkgrowth.org/the-startup-founders-guide-to-analytics-1d2176f20ac1)

[Rittman Analytics - Modern Stack](https://rittmananalytics.com/blog/category/Modern+Analytics+Stack)