# Building Staging Views

### Introduction

As we know, we begin with our source tables and then use dbt to transform them, starting with a staging step.  In staging, we do not combine data from multiple sources; we just apply some initial transformations to one source at a time.  Let's see how in this lesson.

### Getting set up

Let's get started by creating a new branch called `build_staging_customers`.  Then, under the `models` folder, we can create a new folder called `staging`.

Inside that folder, let's create a new file called `stg_rds_customers.sql`.

So now our file tree should look like the following.

> <img src="./file_tree.png" width="60%">

Ok, now let's move to writing our staging table.

### Our staging table

The code for our staging table should look something like the following:

<img src="./initial_staging.png" width="80%">

As we can see from the above, the code for our staging table is broken into three segments:

1. Import CTEs
2. Logical CTEs
3. Final select statement

Let's go through these in turn.

### Import CTEs

We can see the import CTE at the top of the file.

```sql
WITH source as (
  SELECT * FROM "FIVETRAN_DATABASE"."POSTGRES_NORTHWINDS_RDS_PUBLIC"."CUSTOMERS"
), 
```

The import CTEs are simply where we specify the tables needed for our file.  These import CTEs are analogous to the `import` statements at the top of a Python file: we specify the dependencies of the file up top.

So with each import CTE, we select all columns from the relevant table, and then give the result a name.  Above we're using the customers table, so we select all of its columns, and name the CTE `source`.

> When we are only loading one table, as we often are in staging, we can name the CTE `source`.  If we were loading more than one table, then something like `customers` would be more descriptive.
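
To make the naming point concrete, here is a minimal sketch of a file with two import CTEs, each named after the table it loads.  The `ORDERS` table and its `order_id` and `customer_id` columns are assumptions for illustration only, and a join like this would belong in a later model rather than in staging.

```sql
-- Hypothetical example: ORDERS and its columns are assumed, not taken from this lesson.
WITH customers as (
  SELECT * FROM "FIVETRAN_DATABASE"."POSTGRES_NORTHWINDS_RDS_PUBLIC"."CUSTOMERS"
),
orders as (
  SELECT * FROM "FIVETRAN_DATABASE"."POSTGRES_NORTHWINDS_RDS_PUBLIC"."ORDERS"
)
-- With descriptive CTE names, it stays clear which table each column comes from.
SELECT customers.customer_id, orders.order_id
FROM customers
JOIN orders ON orders.customer_id = customers.customer_id
```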

### Logical CTEs

Above, our logical CTE is the following:

```sql
renamed as (
    SELECT customer_id, country, 
    SPLIT_PART(contact_name, ' ', 1) as first_name,
    SPLIT_PART(contact_name, ' ', -1) as last_name
    FROM source
)
```

As we can see, the logical CTE is where we do most of our work.  For example, above, we split the `contact_name` column into the columns `first_name` and `last_name`.
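
To see what the two `SPLIT_PART` calls do, here is a small sketch that runs them on a literal string instead of the `contact_name` column.  The name `'Maria Anders'` is just an illustrative value, and the negative index assumes a warehouse such as Snowflake, where a negative position counts parts from the end of the string.

```sql
-- Illustrative literal only; in the staging file the input is contact_name.
SELECT
    SPLIT_PART('Maria Anders', ' ', 1)  as first_name,  -- first space-separated part
    SPLIT_PART('Maria Anders', ' ', -1) as last_name    -- last space-separated part
```

Because we only keep the first and last parts, a contact name with three or more words would lose its middle parts, which is worth keeping in mind when checking the results.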

### Final select statement

Then we end our file with the final select statement, which specifies the final form of our view.  Below, we simply select all of the columns from our logical CTE like so:

```sql
select * FROM renamed
```

### All together now

So now let's take another look at the entire staging file.  Identify each of the three components, and explain what each component does.

```sql
WITH source as (
  SELECT * FROM "FIVETRAN_DATABASE"."POSTGRES_NORTHWINDS_RDS_PUBLIC"."CUSTOMERS"
), 
renamed as (
    SELECT customer_id, country, 
    SPLIT_PART(contact_name, ' ', 1) as first_name,
    SPLIT_PART(contact_name, ' ', -1) as last_name
    FROM source
)
select * FROM renamed
```
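
Once dbt has built this model, one way to check the result is to query the view directly.  The sketch below assumes the view is named `stg_rds_customers` after its file and that it lives in the current schema of your session; adjust the name to match your own target schema.

```sql
-- Assumes the view stg_rds_customers exists in the current schema.
SELECT customer_id, first_name, last_name, country
FROM stg_rds_customers
LIMIT 5
```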

### Summary

In this lesson, we learned how to structure a staging file, and about the work involved in staging.  In our staging files, we clean up our source data.  With these transformations, we primarily work with one source table at a time.  And we structure our staging files with the following sequence:

```sql
-- import CTEs
WITH source as (
  SELECT * FROM "FIVETRAN_DATABASE"."POSTGRES_NORTHWINDS_RDS_PUBLIC"."CUSTOMERS"
),  
-- logical CTEs
renamed as (
    SELECT customer_id, country, 
    SPLIT_PART(contact_name, ' ', 1) as first_name,
    SPLIT_PART(contact_name, ' ', -1) as last_name
    FROM source
)
-- final select statement
select * FROM renamed
```