Welcome to Snowflake! This guide helps you get started with Snowpark for data exploration and analysis. In this exercise, you will:

 * Load data from Snowflake tables into Snowpark DataFrames
 * Perform exploratory data analysis on Snowpark DataFrames
 * Pivot and join data from multiple tables using Snowpark DataFrames
 * Save transformed data into a Snowflake table

## Import Snowpark and create a Snowpark session

In [None]:
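import snowflake.snowpark as snowpark
# Note: this `sum` is Snowpark's aggregate function; it shadows Python's built-in sum() in this notebook.
from snowflake.snowpark.functions import month, year, col, sum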

In [None]:
from snowflake.snowpark.context import get_active_session
# Reuse the session that the Snowflake environment has already set up for this notebook.
session = get_active_session()

## Load `campaign_spend` and `monthly_revenue` tables into Snowpark DataFrames

In [None]:
snow_df_spend = session.table('campaign_spend')
snow_df_revenue = session.table('monthly_revenue')

## Total Spend per Year and Month For All Channels
Let's transform the campaign spend data so we can see the total cost per year and month for each channel, using the Snowpark DataFrame functions `group_by()` and `agg()`.

In [None]:
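# The grouping expressions produce columns named "YEAR(DATE)" and "MONTH(DATE)"; rename them to plain YEAR and MONTH.
snow_df_spend_per_channel = snow_df_spend.group_by(year('DATE'), month('DATE'),'CHANNEL').\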
                                            agg(sum('TOTAL_COST').as_('TOTAL_COST')).\
                                            with_column_renamed('"YEAR(DATE)"',"YEAR").\
                                            with_column_renamed('"MONTH(DATE)"',"MONTH").\
                                            sort('YEAR','MONTH')
snow_df_spend_per_channel.show()
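
To make the grouping step concrete, here is a minimal plain-Python sketch of what grouping by year, month, and channel and then summing the cost computes. It does not use Snowpark, and the sample rows are made up for illustration; they are not taken from the `campaign_spend` table.

```python
from collections import defaultdict
from datetime import date

# Hypothetical sample rows: (DATE, CHANNEL, TOTAL_COST). Not real campaign data.
rows = [
    (date(2023, 1, 5), "email", 100.0),
    (date(2023, 1, 20), "email", 50.0),
    (date(2023, 1, 7), "video", 200.0),
    (date(2023, 2, 3), "email", 75.0),
]

# Group by (year, month, channel) and sum the cost within each group.
totals = defaultdict(float)
for day, channel, cost in rows:
    totals[(day.year, day.month, channel)] += cost

# Sort the grouped rows by year and month (ties are broken by channel name).
spend_per_channel = sorted(
    (yr, mo, channel, cost) for (yr, mo, channel), cost in totals.items()
)

for row in spend_per_channel:
    print(row)
```

The two January `email` rows collapse into a single row with their costs added together, which is the same shape of result the Snowpark query above produces.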

## Total Spend per Year and Month
Let's further transform the campaign spend data by pivoting on the `CHANNEL` column. This places the spend for all channels in a given month on a single row, with one column per channel.

In [None]:
snow_df_spend_per_month = snow_df_spend_per_channel.pivot('CHANNEL',['search_engine','social_media','video','email']).\
                                                    sum('TOTAL_COST').\
                                                    sort('YEAR','MONTH')

snow_df_spend_per_month = snow_df_spend_per_month.select(
        col("YEAR"),
        col("MONTH"),
        col("'search_engine'").as_("SEARCH_ENGINE"),
        col("'social_media'").as_("SOCIAL_MEDIA"),
        col("'video'").as_("VIDEO"),
        col("'email'").as_("EMAIL")
    )

snow_df_spend_per_month.show()
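
As an illustration of the reshaping a pivot performs, the following plain-Python sketch turns long-format rows (one row per channel) into wide-format rows (one row per month, one column per channel). The rows are hypothetical, and `None` stands in for a channel with no spend in that month; this is a sketch of the idea, not how Snowpark implements `pivot()`.

```python
# Hypothetical long-format rows: (YEAR, MONTH, CHANNEL, TOTAL_COST).
long_rows = [
    (2023, 1, "email", 150.0),
    (2023, 1, "video", 200.0),
    (2023, 2, "email", 75.0),
]

# The pivot values: each one becomes a column in the output.
channels = ["search_engine", "social_media", "video", "email"]

# Build one output row per (YEAR, MONTH), with one entry per channel.
wide = {}
for yr, mo, channel, cost in long_rows:
    if channel not in channels:
        continue  # values outside the pivot list are ignored
    row = wide.setdefault((yr, mo), {name: None for name in channels})
    row[channel] = (row[channel] or 0.0) + cost

for key in sorted(wide):
    print(key, wide[key])
```

Each month now appears once, and the channel names have moved from row values to column names.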

## Total Revenue per Year and Month
Now let's transform the revenue data into revenue per year and month using the `group_by()` and `agg()` functions.

In [None]:
snow_df_revenue_per_month = snow_df_revenue.group_by('YEAR','MONTH').\
                                            agg(sum('REVENUE')).\
                                            sort('YEAR','MONTH').\
                                            with_column_renamed('SUM(REVENUE)','REVENUE')
snow_df_revenue_per_month.show()

## Join Total Spend and Total Revenue per Year and Month Across All Channels
Next, let's use `join()` to combine the revenue data with the transformed campaign spend data so we can analyze spend and revenue side by side.

In [None]:
snow_df_spend_and_revenue_per_month = snow_df_spend_per_month.join(snow_df_revenue_per_month, ["YEAR","MONTH"])
snow_df_spend_and_revenue_per_month.show()
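
The join above matches rows on `YEAR` and `MONTH`. When no join type is given, Snowpark's `join()` performs an inner join, so only months present in both DataFrames are kept. The plain-Python sketch below shows that matching behavior on hypothetical data; the values and the reduced set of columns are assumptions for illustration only.

```python
# Hypothetical monthly spend rows, keyed by (YEAR, MONTH).
spend = {
    (2023, 1): {"EMAIL": 150.0, "VIDEO": 200.0},
    (2023, 2): {"EMAIL": 75.0, "VIDEO": 0.0},
}

# Hypothetical monthly revenue, keyed by (YEAR, MONTH).
revenue = {
    (2023, 1): 1000.0,
    (2023, 3): 500.0,
}

# Inner join: keep only the keys that appear on both sides.
joined = [
    {"YEAR": yr, "MONTH": mo, **cols, "REVENUE": revenue[(yr, mo)]}
    for (yr, mo), cols in sorted(spend.items())
    if (yr, mo) in revenue
]

for row in joined:
    print(row)
```

Only January 2023 survives, because February has no revenue row and March has no spend row. If months seem to be missing from the joined result, an unmatched key on one side is a likely cause.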

## Examine DataFrame Explain Plan
Snowpark makes it easy to inspect a DataFrame's query and execution plan using the `explain()` DataFrame function.

In [None]:
snow_df_spend_and_revenue_per_month.explain()

## Save Transformed Data into Snowflake Table
Let's save the transformed data into the Snowflake table `SPEND_AND_REVENUE_PER_MONTH`.

In [None]:
# mode('overwrite') replaces the table's contents if it already exists.
snow_df_spend_and_revenue_per_month.write.mode('overwrite').save_as_table('SPEND_AND_REVENUE_PER_MONTH')

## Continue your learning!

This notebook is a "Hello World" for data engineering with Snowpark. To learn more advanced data engineering with Snowflake, see https://quickstarts.snowflake.com/guide/data_engineering_with_notebooks/index.html.