<img width="10%" alt="Naas" src="https://landen.imgix.net/jtci2pxwjczr/assets/5ice39g4.png?w=160"/>

# Snowflake - Basic usage and querying data

<a href="https://app.naas.ai/user-redirect/naas/downloader?url=https://raw.githubusercontent.com/jupyter-naas/awesome-notebooks/master/Snowflake/Snowflake_Basics_and_querying_data.ipynb" target="_parent"><img src="https://naasai-public.s3.eu-west-3.amazonaws.com/open_in_naas.svg"/></a>

**Tags:** #snowflake #data #warehouse #naas_drivers #snippet

**Author:** [Mateusz Polakowski](https://www.linkedin.com/in/polakowski/)

This notebook shows the basic usage of the Snowflake driver.
Below you will find the essential operations for setting up the environment and querying data that already exists in your data warehouse.

## Input

### Import library

In [2]:
import os
from naas_drivers import snowflake
from snowflake.connector.errors import ProgrammingError

### Setup Snowflake account

If you don't have a Snowflake account yet, you can set up a [30-day trial account with a $400 budget here](https://signup.snowflake.com/).

### Credentials

In [3]:
# Snowflake credentials are read from environment variables here,
# but any other way of supplying them works as well

sf_username=os.environ['SNOWFLAKE_USER']
sf_password=os.environ['SNOWFLAKE_PASSWORD']
sf_account=os.environ['SNOWFLAKE_ACCOUNT']
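
A missing variable makes `os.environ[...]` raise a bare `KeyError`. As a minimal sketch, the hypothetical helper `load_snowflake_credentials` below (not part of `naas_drivers`) collects the three variables used above and reports every missing one in a single message. It accepts any mapping, so it can be tried without touching the real environment.

```python
import os
from typing import Dict, Mapping

# Variable names taken from the credentials cell above.
REQUIRED_VARS = ("SNOWFLAKE_USER", "SNOWFLAKE_PASSWORD", "SNOWFLAKE_ACCOUNT")


def load_snowflake_credentials(env: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Hypothetical helper: return the credentials or fail with one clear message."""
    # Treat unset and empty variables alike.
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {
        "username": env["SNOWFLAKE_USER"],
        "password": env["SNOWFLAKE_PASSWORD"],
        "account": env["SNOWFLAKE_ACCOUNT"],
    }
```

The returned keys mirror the keyword arguments passed to `snowflake.connect` in this notebook, so the dictionary could presumably be unpacked straight into that call.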

## Model

### Connecting to your Snowflake account

In [4]:
snowflake.connect(
    username=sf_username,
    password=sf_password,
    account=sf_account
)

### Environment setup

In [5]:
snowflake.database is None

True

In [6]:
snowflake.warehouse = 'COMPUTE_WH'
snowflake.database = 'SNOWFLAKE_SAMPLE_DATA'
snowflake.schema = 'TPCH_SF100'
snowflake.role = 'ACCOUNTADMIN'

In [7]:
snowflake.warehouse, snowflake.database, snowflake.schema, snowflake.role

('COMPUTE_WH', 'SNOWFLAKE_SAMPLE_DATA', 'TPCH_SF100', 'ACCOUNTADMIN')

## Output

### Creating new database and schema

In [8]:
snowflake.api.database.create('NAAS', or_replace=True, return_statement=True)

{'results': [('Database NAAS successfully created.',)],
 'description': [ResultMetadata(name='status', type_code=2, display_size=None, internal_size=16777216, precision=None, scale=None, is_nullable=True)],
 'statement': 'CREATE OR REPLACE DATABASE NAAS'}

In [9]:
snowflake.database = 'NAAS'

In [10]:
snowflake.api.schema.create('INGESTION_SCHEMA')

{'results': [('Schema INGESTION_SCHEMA successfully created.',)],
 'description': [ResultMetadata(name='status', type_code=2, display_size=None, internal_size=16777216, precision=None, scale=None, is_nullable=True)],
 'statement': ''}

In [11]:
snowflake.schema = 'INGESTION_SCHEMA'

### Executing custom query with cursor

In [12]:
snowflake.cursor.execute('SHOW SCHEMAS;').fetchall()

[(datetime.datetime(2022, 8, 3, 12, 17, 45, 933000, tzinfo=<DstTzInfo 'America/Los_Angeles' PDT-1 day, 17:00:00 DST>),
  'INFORMATION_SCHEMA',
  'N',
  'N',
  'NAAS',
  '',
  'Views describing the contents of schemas in this database',
  '',
  '1'),
 (datetime.datetime(2022, 8, 3, 12, 17, 43, 223000, tzinfo=<DstTzInfo 'America/Los_Angeles' PDT-1 day, 17:00:00 DST>),
  'INGESTION_SCHEMA',
  'N',
  'Y',
  'NAAS',
  'ACCOUNTADMIN',
  '',
  '',
  '1'),
 (datetime.datetime(2022, 8, 3, 12, 17, 41, 53000, tzinfo=<DstTzInfo 'America/Los_Angeles' PDT-1 day, 17:00:00 DST>),
  'PUBLIC',
  'N',
  'N',
  'NAAS',
  'ACCOUNTADMIN',
  '',
  '',
  '1')]
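
`fetchall()` returns plain tuples, so fields are addressed by position. Judging by the output above, the schema name is the second element of each row. The sketch below relies on that observation (an assumption about the `SHOW SCHEMAS` column order, not a guarantee) and uses shortened sample rows instead of a live cursor; `schema_names` is a hypothetical helper.

```python
# Shortened stand-ins for the rows printed above: (created_on, name, ...).
sample_rows = [
    ("2022-08-03 12:17:45", "INFORMATION_SCHEMA", "N", "N", "NAAS"),
    ("2022-08-03 12:17:43", "INGESTION_SCHEMA", "N", "Y", "NAAS"),
    ("2022-08-03 12:17:41", "PUBLIC", "N", "N", "NAAS"),
]


def schema_names(rows):
    """Return the second field of every row, assumed to be the schema name."""
    return [row[1] for row in rows]


print(schema_names(sample_rows))
```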

### Querying data - wrong schema

In [13]:
snowflake.warehouse, snowflake.database, snowflake.schema, snowflake.role

('COMPUTE_WH', 'NAAS', 'INGESTION_SCHEMA', 'ACCOUNTADMIN')

In [14]:
query = 'SELECT * FROM CUSTOMER;'

In [15]:
# Querying a table that doesn't exist in NAAS/INGESTION_SCHEMA
try:
    results_1_not_working = snowflake.execute(query)
except ProgrammingError as pe:
    print('Something went wrong!')
    print(pe)

Something went wrong!
002003 (42S02): SQL compilation error:
Object 'CUSTOMER' does not exist or not authorized.
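
The error arises because an unqualified table name is resolved against the session's current database and schema. Snowflake also accepts fully qualified names of the form `database.schema.object`, which do not depend on the session context. As a sketch, the hypothetical helper `qualified_name` (not part of `naas_drivers`) builds such a name from the values used in this notebook; it assumes simple, unquoted identifiers.

```python
def qualified_name(database: str, schema: str, table: str) -> str:
    """Hypothetical helper: join simple, unquoted identifiers with dots."""
    parts = (database, schema, table)
    for part in parts:
        # Reject anything that would need quoting; this sketch does not handle it.
        if not part.replace("_", "").isalnum():
            raise ValueError(f"Unsupported identifier: {part!r}")
    return ".".join(parts)


table = qualified_name("SNOWFLAKE_SAMPLE_DATA", "TPCH_SF100", "CUSTOMER")
qualified_query = f"SELECT * FROM {table};"
print(qualified_query)
```

Passing `qualified_query` to `snowflake.execute` should then succeed regardless of the current `snowflake.database` and `snowflake.schema`, provided the active role can read the sample data.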


### Querying data - valid command with ad-hoc session overrides

In [16]:
# Ad-hoc environment changes that apply only to the command being executed
results_1_working = snowflake.execute(query, database='SNOWFLAKE_SAMPLE_DATA', schema='TPCH_SF100')
results_1_working

{'results': [(12000001,
   'Customer#012000001',
   'LAhNvSqdEb7R63OpaEVPwkCfkK5Sugao1loIoIsT',
   10,
   '20-467-775-1131',
   Decimal('6012.48'),
   'FURNITURE',
   'requests. final foxes integrate after the fluffily thin deposi'),
  (12000002,
   'Customer#012000002',
   'CBog13b1IQqDUe153w0LXM5wji',
   4,
   '14-883-132-9248',
   Decimal('3722.74'),
   'MACHINERY',
   ' furiously. unusual requests run'),
  (12000003,
   'Customer#012000003',
   'NdDkQhHTcXhpJWroQGkxpKa3,,8xUob8Y',
   2,
   '12-818-591-4368',
   Decimal('9085.06'),
   'BUILDING',
   'ully blithe accounts. fluffily unusual foxes about the furiously even asymptotes w'),
  (12000004,
   'Customer#012000004',
   'UgudUPqA61',
   5,
   '15-765-367-1664',
   Decimal('-273.06'),
   'BUILDING',
   'xes haggle according to the slyly unusual dolphins. quickly e'),
  (12000005,
   'Customer#012000005',
   'f4Tvx 0vXUxxc',
   7,
   '17-910-236-8625',
   Decimal('6607.49'),
   'MACHINERY',
   'lent packages across the quickly re

### Querying data - valid command run

In [17]:
# Updating session environment
snowflake.database = 'SNOWFLAKE_SAMPLE_DATA'
snowflake.schema = 'TPCH_SF100'

In [18]:
results_2 = snowflake.execute(query, n=100)

print(f"Rows returned: {len(results_2['results'])}")
results_2['results'][:2]

Rows returned: 100


[(12000001,
  'Customer#012000001',
  'LAhNvSqdEb7R63OpaEVPwkCfkK5Sugao1loIoIsT',
  10,
  '20-467-775-1131',
  Decimal('6012.48'),
  'FURNITURE',
  'requests. final foxes integrate after the fluffily thin deposi'),
 (12000002,
  'Customer#012000002',
  'CBog13b1IQqDUe153w0LXM5wji',
  4,
  '14-883-132-9248',
  Decimal('3722.74'),
  'MACHINERY',
  ' furiously. unusual requests run')]

In [19]:
# The `query` function returns the same results
results_3 = snowflake.query(query, n=100)

print(f"Rows returned: {len(results_3['results'])}")
results_3['results'][:2]

Rows returned: 100


[(12000001,
  'Customer#012000001',
  'LAhNvSqdEb7R63OpaEVPwkCfkK5Sugao1loIoIsT',
  10,
  '20-467-775-1131',
  Decimal('6012.48'),
  'FURNITURE',
  'requests. final foxes integrate after the fluffily thin deposi'),
 (12000002,
  'Customer#012000002',
  'CBog13b1IQqDUe153w0LXM5wji',
  4,
  '14-883-132-9248',
  Decimal('3722.74'),
  'MACHINERY',
  ' furiously. unusual requests run')]
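
Rows come back as positional tuples, which are easy to misread. The sketch below pairs each value with a column name to build dictionaries. It assumes that query results carry a `description` list whose items expose a `name` attribute, as in the `ResultMetadata` entries printed in the database-creation output earlier. `Column` is a stand-in for that class and `rows_to_dicts` is a hypothetical helper, not part of `naas_drivers`.

```python
from collections import namedtuple
from decimal import Decimal

# Stand-in for ResultMetadata; only the `name` attribute is used here.
Column = namedtuple("Column", ["name"])


def rows_to_dicts(result):
    """Hypothetical helper: map each row tuple to a {column_name: value} dict."""
    names = [col.name for col in result["description"]]
    return [dict(zip(names, row)) for row in result["results"]]


# Sample shaped like the output above, trimmed to three columns.
sample = {
    "results": [(12000001, "Customer#012000001", Decimal("6012.48"))],
    "description": [Column("C_CUSTKEY"), Column("C_NAME"), Column("C_ACCTBAL")],
}
print(rows_to_dicts(sample)[0]["C_NAME"])
```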

### Querying data - mapping results to Pandas DataFrame

In [20]:
# Querying Snowflake data and turning it into a Pandas DataFrame
results_pandas = snowflake.query_pd(query, n=100)

print(f'Rows returned: {len(results_pandas)}')
results_pandas.head(2)

Rows returned: 100


| | C_CUSTKEY | C_NAME | C_ADDRESS | C_NATIONKEY | C_PHONE | C_ACCTBAL | C_MKTSEGMENT | C_COMMENT |
|---|---|---|---|---|---|---|---|---|
| 0 | 12000001 | Customer#012000001 | LAhNvSqdEb7R63OpaEVPwkCfkK5Sugao1loIoIsT | 10 | 20-467-775-1131 | 6012.48 | FURNITURE | requests. final foxes integrate after the fluf... |
| 1 | 12000002 | Customer#012000002 | CBog13b1IQqDUe153w0LXM5wji | 4 | 14-883-132-9248 | 3722.74 | MACHINERY | furiously. unusual requests run |


## Extra

### Objects: `cursor` and `connection`

Both are provided by the Snowflake connector and give access to any functionality available in the original connector.

In [21]:
snowflake.cursor.execute('SELECT CURRENT_WAREHOUSE()').fetchall()

[('COMPUTE_WH',)]

In [22]:
snowflake.connection.database

'SNOWFLAKE_SAMPLE_DATA'