In this example, we demonstrate how to move from a development prototype to production with Git integration.

We will walk through a simple data pipeline with a single query. Changing the `MODE` variable to `DEV` or `PROD` switches between different warehouse and schema configurations.

For `DEV`, we use an extra-small warehouse on a sample of the TPC-H data (scale factor 1).
For `PROD`, we use a large warehouse on a sample of the TPC-H data that is 100 times the size of the `DEV` sample (scale factor 100).

In [None]:
MODE = "DEV"  # Parameter to control whether to run in DEV or PROD mode

if MODE == "DEV":
    # For development, use XSMALL warehouse on TPCH data with scale factor of 1
    warehouse_name = "GIT_EXAMPLE_DEV_WH"
    schema_name = "TPCH_SF1"
    size = 'XSMALL'
elif MODE == "PROD":
    # For production, use LARGE warehouse on TPCH data with scale factor of 100
    warehouse_name = "GIT_EXAMPLE_PROD_WH"
    schema_name = "TPCH_SF100"
    size = 'LARGE'
else:
    # Fail early so later cells never run with undefined settings
    raise ValueError(f"Unknown MODE {MODE!r}; expected 'DEV' or 'PROD'")
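
The same mode switch can also be written as a lookup table, which keeps each environment's settings in one place. The following is a minimal sketch, not part of the original notebook: `CONFIGS` and `resolve_config` are hypothetical names introduced for illustration, while the warehouse names, schema names, and sizes are the ones used above.

```python
# Hypothetical alternative to the if/elif chain: one dictionary entry per mode.
CONFIGS = {
    "DEV": {
        "warehouse_name": "GIT_EXAMPLE_DEV_WH",
        "schema_name": "TPCH_SF1",
        "size": "XSMALL",
    },
    "PROD": {
        "warehouse_name": "GIT_EXAMPLE_PROD_WH",
        "schema_name": "TPCH_SF100",
        "size": "LARGE",
    },
}


def resolve_config(mode):
    """Return the settings for `mode`, or raise ValueError for an unknown mode."""
    try:
        return CONFIGS[mode]
    except KeyError:
        raise ValueError(
            f"Unknown MODE {mode!r}; expected one of {sorted(CONFIGS)}"
        ) from None


config = resolve_config("DEV")
# Expose the same variable names that the SQL cells reference.
warehouse_name = config["warehouse_name"]
schema_name = config["schema_name"]
size = config["size"]
```

With this layout, adding a third environment would only require a new dictionary entry, and a misspelled mode fails immediately instead of leaving the variables undefined.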

Let's create and use a warehouse with the specified name and size.

In [None]:
-- Create warehouse with specified name and size
CREATE OR REPLACE WAREHOUSE {{warehouse_name}} WITH WAREHOUSE_SIZE = {{size}};

In [None]:
-- Use specified warehouse for subsequent query
USE WAREHOUSE {{warehouse_name}};
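
The `{{warehouse_name}}` and `{{size}}` placeholders in the SQL cells are filled in from the Python variables of the same name. To make the substitution concrete, here is a rough sketch that imitates it in plain Python. The `render` helper is hypothetical and is not the notebook's actual templating engine; it only handles simple `{{name}}` placeholders.

```python
import re


def render(template, variables):
    """Replace each {{name}} placeholder with str(variables[name])."""

    def substitute(match):
        # match.group(1) is the variable name between the double braces
        return str(variables[match.group(1)])

    return re.sub(r"\{\{\s*(\w+)\s*\}\}", substitute, template)


sql = render(
    "CREATE OR REPLACE WAREHOUSE {{warehouse_name}} WITH WAREHOUSE_SIZE = {{size}};",
    {"warehouse_name": "GIT_EXAMPLE_DEV_WH", "size": "XSMALL"},
)
print(sql)
```

In `DEV` mode this renders a statement that creates `GIT_EXAMPLE_DEV_WH` with size `XSMALL`; a placeholder with no matching variable raises a `KeyError` in this sketch.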

Use the TPC-H sample dataset at the scale factor that matches the selected mode.
- Note: Sample data sets are provided in a database named SNOWFLAKE_SAMPLE_DATA that has been shared with your account from the Snowflake SFC_SAMPLES account. If you do not see the database, you can create it yourself. Refer to [Using the Sample Database](https://docs.snowflake.com/en/user-guide/sample-data-using).

In [None]:
USE SCHEMA SNOWFLAKE_SAMPLE_DATA.{{schema_name}};

Check the number of rows in the `LINEITEM` table.

In [None]:
SELECT COUNT(*) FROM LINEITEM;

Now let's run a query on this dataset:
- The query lists totals for quantity, extended price, discounted extended price, and discounted extended price plus tax, along with average quantity, average extended price, and average discount. These aggregates are grouped by RETURNFLAG and LINESTATUS, and listed in ascending order of RETURNFLAG and LINESTATUS. A count of the number of line items in each group is included. Only line items shipped on or before the date 90 days before 1998-12-01 are considered.

In [None]:
select
       l_returnflag,
       l_linestatus,
       sum(l_quantity) as sum_qty,
       sum(l_extendedprice) as sum_base_price,
       sum(l_extendedprice * (1-l_discount)) as sum_disc_price,
       sum(l_extendedprice * (1-l_discount) * (1+l_tax)) as sum_charge,
       avg(l_quantity) as avg_qty,
       avg(l_extendedprice) as avg_price,
       avg(l_discount) as avg_disc,
       count(*) as count_order
 from
       lineitem
 where
       l_shipdate <= dateadd(day, -90, to_date('1998-12-01'))
 group by
       l_returnflag,
       l_linestatus
 order by
       l_returnflag,
       l_linestatus;
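
To make the arithmetic in the query concrete, the sketch below reproduces the date cutoff from the `where` clause and the per-row expressions behind `sum_disc_price` and `sum_charge` in plain Python. The price, discount, and tax values are made up for illustration and do not come from the TPC-H data; `charge` is a hypothetical helper name.

```python
from datetime import date, timedelta
from decimal import Decimal

# Cutoff used in the where clause: 90 days before 1998-12-01, i.e. 1998-09-02.
cutoff = date(1998, 12, 1) - timedelta(days=90)


def charge(extended_price, discount, tax):
    """Return (discounted price, discounted price plus tax) for one line item."""
    disc_price = extended_price * (1 - discount)  # l_extendedprice * (1 - l_discount)
    return disc_price, disc_price * (1 + tax)     # ... * (1 + l_tax)


# Illustrative values only: price 100.00, 10% discount, 5% tax.
disc_price, total = charge(Decimal("100.00"), Decimal("0.10"), Decimal("0.05"))
print(cutoff, disc_price, total)
```

The query applies the same two expressions to every qualifying row and then sums them within each `(l_returnflag, l_linestatus)` group.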

Using cell referencing, we get the query ID of the query we just ran and then look up its entry in the query history.

In [None]:
# Get query ID of the referenced cell: the ID is the first single-quoted
# string in the SQL text returned by result_scan_sql()
query_id = cell11.result_scan_sql().split("'")[1]
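
The `split("'")[1]` idiom works because the query ID is the first single-quoted string in the SQL text. The sketch below shows the idea on a stand-in string. Both the exact shape of the text returned by `result_scan_sql()` and the `<query-id>` placeholder are assumptions made for illustration, and `extract_query_id` is a hypothetical helper that fails with a clear message when no quoted value is present.

```python
import re

# Assumed shape of the SQL text; '<query-id>' is a placeholder, not a real ID.
example_sql = "select * from table(result_scan('<query-id>'))"

# Splitting on the quote yields [text before, quoted value, text after].
parts = example_sql.split("'")
query_id = parts[1]


def extract_query_id(sql):
    """Return the first single-quoted value in `sql`, or raise ValueError."""
    match = re.search(r"'([^']+)'", sql)
    if match is None:
        raise ValueError("no quoted query ID found in SQL text")
    return match.group(1)
```

Indexing `parts[1]` directly raises an `IndexError` if the text contains no quotes, so the explicit helper may be easier to debug.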

In [None]:
select * from table(information_schema.query_history_by_warehouse('{{warehouse_name}}')) 
where query_id = '{{query_id}}';

Finally, we compile this information into a report that documents the run.

In [None]:
from datetime import datetime

import streamlit as st
from snowflake.snowpark.context import get_active_session

st.header(f"[{MODE}] Run Report")
st.markdown(f"Generated on: {datetime.now()}")

st.markdown("### System Information")
# Print session information; [1:-1] drops the first and last character
# (the quotes around each identifier)
session = get_active_session()
st.markdown(f"**Database:** {session.get_current_database()[1:-1]}")
st.markdown(f"**Schema:** {session.get_current_schema()[1:-1]}")
st.markdown(f"**Warehouse:** {session.get_current_warehouse()[1:-1]}")

st.markdown("### Query Information")
# Print query information from the query history cell
st.markdown(f"**Query ID:** {query_id}")
result_info = cell14.to_pandas()
st.markdown("**Query Text:**")
st.code(result_info["QUERY_TEXT"].values[0], language='sql', line_numbers=True)
st.markdown("**Runtime information:**")
st.dataframe(result_info[['START_TIME', 'END_TIME', 'TOTAL_ELAPSED_TIME']])
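
`TOTAL_ELAPSED_TIME` in the query history is reported in milliseconds. If a more readable figure is wanted in the report, a small formatting helper along these lines could be used. `format_elapsed` is a hypothetical name, and the sample values are illustrative rather than measured runtimes.

```python
def format_elapsed(milliseconds):
    """Format a duration given in milliseconds as seconds with two decimals."""
    return f"{milliseconds / 1000:.2f} s"


# Illustrative values only, not measured runtimes.
print(format_elapsed(1234))
print(format_elapsed(500))
```

Comparing this figure between a `DEV` run and a `PROD` run gives a quick view of how the larger warehouse and dataset affect runtime.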