# Loading HubSpot Data

### Introduction

In this lesson, we'll load some Northwinds HubSpot data from Amazon S3 into Snowflake. Let's get started.

### Getting started

We have uploaded sample HubSpot data and Northwinds data to S3.

<img src="./northwinds.png" width="100%">

We can access these files at the following URLs:

* `s3://jigsaw-labs-student/northwinds/northwinds_hubspot.csv`
* `s3://jigsaw-labs-student/northwinds/northwinds_mixpanel.csv`

Remember that the API keys for reading these files are the following:

* KEY_ID: `'AKIARIMMA5YSEFN6KMGI'`
* SECRET: `'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'`

Now use that information to create a stage for our HubSpot data.

> Make sure to create the stage in the database that dbt has access to.
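
As a rough guide, the stage definition might look like the sketch below. The stage name `hubspot_stage` is a hypothetical choice, and `<KEY_ID>` and `<SECRET>` are placeholders for the key values listed above. Run it with the correct database selected.

```sql
-- Sketch only: hubspot_stage is a hypothetical name, not one required by the lesson.
-- Replace <KEY_ID> and <SECRET> with the key values listed above.
CREATE OR REPLACE STAGE hubspot_stage
  URL = 's3://jigsaw-labs-student/northwinds/northwinds_hubspot.csv'
  CREDENTIALS = (AWS_KEY_ID = '<KEY_ID>' AWS_SECRET_KEY = '<SECRET>');
```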

Then let's see how we did by selecting the first six columns from that stage.
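
One way to write that query is to reference staged columns by position with `$1`, `$2`, and so on. This is a sketch that assumes the stage is named `hubspot_stage` (a hypothetical name). The `LIMIT` is only there to keep the output short.

```sql
-- Staged files have no column names yet, so columns are addressed by position.
-- hubspot_stage is a hypothetical stage name; use the one you created.
SELECT t.$1, t.$2, t.$3, t.$4, t.$5, t.$6
FROM @hubspot_stage t
LIMIT 5;
```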

> We should see something like the following:

<img src="./select-data.png" width="100%">

If we look at the last column, we can see that this data is a little off. The issue is that some of our business names contain commas (like "Fadel, Luilwitz and Nitzsche"), and Snowflake is reading the parts as separate values. We can tell Snowflake not to split on commas inside quotation marks by adding a FILE_FORMAT option of `FIELD_OPTIONALLY_ENCLOSED_BY='"'` to our stage, like so.

```sql
-- Treat text inside double quotes as a single field, even if it contains commas
CREATE OR REPLACE STAGE ...
FILE_FORMAT = (FIELD_OPTIONALLY_ENCLOSED_BY='"');
```

Add this option, and we should now see only five columns, which look like the following:

<img src="./hubspot-select.png" width="100%">

Ok, now we're ready to load our data into a table. Create a schema called `hubspot` and a table called `contacts` with five columns, one for each of the columns above.
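
A minimal sketch of that step follows. The column names `column_1` through `column_5` are placeholders, since the real names come from the screenshot above. Typing every column as `VARCHAR` is an assumption that keeps the load permissive, and you may prefer stricter types.

```sql
-- Sketch only: column names are placeholders; rename them to match the staged data.
CREATE SCHEMA IF NOT EXISTS hubspot;

CREATE OR REPLACE TABLE hubspot.contacts (
    column_1 VARCHAR,
    column_2 VARCHAR,
    column_3 VARCHAR,
    column_4 VARCHAR,
    column_5 VARCHAR  -- strings accept any value, including hard-to-parse timestamps
);
```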

Now it's time to copy in the data. We can accomplish this without tolerating any errors, so do not include an `ON_ERROR=CONTINUE` parameter. Instead, in the file format we need to provide two parameters: one to skip the first row, and the `FIELD_OPTIONALLY_ENCLOSED_BY='"'` parameter.
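
The copy step described above could be sketched as follows, again assuming a stage named `hubspot_stage` (hypothetical). `SKIP_HEADER = 1` skips the first row, and leaving out `ON_ERROR` means the load aborts on the first bad row, which is Snowflake's default for bulk loads.

```sql
-- Sketch only: hubspot_stage is a hypothetical stage name.
-- No ON_ERROR option is given, so any parsing error aborts the load.
COPY INTO hubspot.contacts
FROM @hubspot_stage
FILE_FORMAT = (
    TYPE = 'CSV'
    SKIP_HEADER = 1                     -- skip the first row of the file
    FIELD_OPTIONALLY_ENCLOSED_BY = '"'  -- keep quoted commas inside one field
);
```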

Then if you select the first five rows from the table, you should see something like the following:

<img src="./first-five-rows.png" width="100%">

Once that is done, we are ready to start working with this data in dbt.

### Summary

In this lesson, we practiced loading data into our database using a stage. To initially load our timestamp data, which Snowflake had a hard time interpreting, we relaxed the restrictions on that data by changing its data type to a string.

### Resources

[Transform timestamp](https://docs.snowflake.com/en/user-guide/data-load-transform.html#current-time-current-timestamp-default-column-values)

[Snowflake copy into](https://docs.snowflake.com/en/sql-reference/sql/copy-into-table.html)