# Stored Procedures Examples

This notebook contains different examples of how to create Stored Procedures using the Snowpark API.

In [None]:
# Make sure we do not get line breaks when doing show on wide dataframes
from IPython.core.display import HTML
display(HTML("<style>pre { white-space: pre !important; }</style>"))

# Snowpark imports 
import snowflake.snowpark as S
from snowflake.snowpark import Session, DataFrame
from snowflake.snowpark import functions as F
from snowflake.snowpark import types as T

# Used for reading creds.json
import json

# Print the version of Snowpark we are using
print(f"Using Snowpark: {S.__version__}")

## Connect to Snowflake

This example uses a JSON file with the following structure:
```
{
    "account":"MY SNOWFLAKE ACCOUNT",
    "user": "MY USER",
    "password":"MY PASSWORD",
    "role":"MY ROLE",
    "warehouse":"MY WH",
    "database":"MY DB",
    "schema":"MY SCHEMA"
}

```

In [None]:
with open('../creds.json') as f:
    connection_parameters = json.load(f)

snf_session = Session.builder.configs(connection_parameters).create()
print("Current role: " + snf_session.get_current_role() + ", Current schema: " + snf_session.get_fully_qualified_current_schema() + ", Current WH: " + snf_session.get_current_warehouse())
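Since the cell above concatenates the current role, schema and warehouse into a string, a missing entry in the JSON file can surface as a confusing error. The following is a minimal sketch, not part of Snowpark, that checks a parameter dictionary for the keys listed in the structure above before a session is built. The helper name `missing_connection_keys` and the inline example dictionary are assumptions made for illustration.

```python
import json

# Keys taken from the creds.json structure shown in this notebook.
REQUIRED_KEYS = ("account", "user", "password", "role", "warehouse", "database", "schema")


def missing_connection_keys(params: dict) -> list:
    """Hypothetical helper: return the expected keys that are absent or empty."""
    return [key for key in REQUIRED_KEYS if not params.get(key)]


# Stand-in for the content of creds.json, with "schema" left out on purpose.
example_params = json.loads(
    '{"account": "MY SNOWFLAKE ACCOUNT", "user": "MY USER", "password": "MY PASSWORD",'
    ' "role": "MY ROLE", "warehouse": "MY WH", "database": "MY DB"}'
)

missing = missing_connection_keys(example_params)
print(missing)
```

With a complete file the returned list is empty, and the dictionary can be passed on to `Session.builder.configs` as in the cell above.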

# Python Stored Procedures

A Stored Procedure can be created using the **@sproc** decorator, the **sproc** function or the **sproc.register** method of the session object. It can be permanent or temporary.

## Using a function
Start by creating a Stored Procedure (SP) that returns a string. By setting *is_permanent=False* the SP will only be available to our user, and only until the active Snowflake session is closed.
By using **session.clear_imports()** and **session.clear_packages()** we make sure that old imports and packages are not included for the creation.

The function for the Stored Procedure needs to have a Snowflake session type as the first argument, followed by any additional arguments.

In [None]:
snf_session.clear_imports()
snf_session.clear_packages()
@F.sproc(name="hello_sp", is_permanent=False, replace=True, packages=['snowflake-snowpark-python'], session=snf_session)
def hello_sp(session: Session, name: str) -> str:
    curr_db = session.get_current_database()
    return f'Hello {name} this is running in the {curr_db} database!'
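Because the handler only receives a session object and plain arguments, its logic can be tried out locally before it is registered. The sketch below assumes a hypothetical `FakeSession` class that exposes only the one method the handler uses; it is a stand-in for illustration and not part of Snowpark. The function `hello_handler` repeats the body of the cell above without the decorator.

```python
class FakeSession:
    """Hypothetical stand-in exposing only the method the handler needs."""

    def __init__(self, database: str):
        self._database = database

    def get_current_database(self) -> str:
        return self._database


def hello_handler(session, name: str) -> str:
    # Same logic as the registered handler, without the decorator.
    curr_db = session.get_current_database()
    return f'Hello {name} this is running in the {curr_db} database!'


local_result = hello_handler(FakeSession("MY_DB"), "mats")
print(local_result)
```

This only exercises the Python logic; anything that depends on real data or on the Snowflake runtime still needs a call against the registered procedure.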

The Stored Procedure can be called using the **call** method of the Snowflake session object or using SQL (CALL sp_name(arg1, arg2, ..)). When calling a Python Stored Procedure through **call** or SQL, the session is never provided as an argument; it is handled by Snowflake.

In [None]:
hello_sp(snf_session, 'mats')

In [None]:
snf_session.call("hello_sp", "mats")

To call a stored procedure with SQL

In [None]:
snf_session.sql("call hello_sp('mats')").show(max_width=150)

Get information about the Stored Procedure

In [None]:
snf_session.sproc.describe(hello_sp).show(max_width=150)

## Default values for arguments
We can add default values to the parameters, but the caller still needs to provide every input. So instead of relying on a default value, we check for Null/None in the function and pass a Null/None value to indicate that the default value should be used.

In [None]:
snf_session.clear_imports()
snf_session.clear_packages()
@F.sproc(name="hello_default_sp", is_permanent=False, replace=True, packages=['snowflake-snowpark-python'], session=snf_session)
def hello_default_sp(session: Session, name: str, age:int) -> str:
    if age is None:
        age = 45
    return f'Hello {name} with the age of {age}!'

In [None]:
snf_session.call("hello_default_sp", "mats", 49)

In [None]:
snf_session.call("hello_default_sp", "mats", None)

In [None]:
hello_default_sp(snf_session,"mats", None)
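The None check can be looked at in isolation as plain Python. The sketch below mirrors the handler body without a session and is only an illustration; the names `hello_default` and `DEFAULT_AGE` are made up for this example. It also shows why `is None` is used rather than a shortcut such as `age or 45`: a legitimate value of 0 would otherwise be replaced by the default.

```python
from typing import Optional

DEFAULT_AGE = 45  # same fallback value as in the cell above


def hello_default(name: str, age: Optional[int] = None) -> str:
    # Only a missing value (None) triggers the default; 0 is kept as-is.
    if age is None:
        age = DEFAULT_AGE
    return f'Hello {name} with the age of {age}!'


with_age = hello_default("mats", 49)
with_none = hello_default("mats", None)
with_zero = hello_default("mats", 0)
print(with_age, with_none, with_zero, sep="\n")
```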

We can also use the fact that a schema can hold multiple Stored Procedures with the same name, as long as they have a different number of parameters (or different parameter types).

So, if we create a hello_default_sp without the age parameter:

In [None]:
@F.sproc(name="hello_default_sp", is_permanent=False, replace=True, packages=['snowflake-snowpark-python'], session=snf_session)
def hello_default_sp(session: Session, name: str) -> str:
    age = 45
    return f'Hello {name} with the default age of {age}!'

We can then call the Stored Procedure without the age parameter

In [None]:
snf_session.call("hello_default_sp", "mats")

If we then add the age argument, the previously created Stored Procedure will be called:

In [None]:
snf_session.call("hello_default_sp", "mats", 49)
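The behaviour above can be pictured as a lookup keyed by procedure name and number of arguments. The following toy model is only an analogy written in plain Python; it is not how Snowflake resolves overloads internally, and the names `register`, `call` and `registry` are hypothetical.

```python
# Toy registry: (procedure name, number of arguments) -> handler.
registry = {}


def register(name: str, n_args: int, handler) -> None:
    registry[(name, n_args)] = handler


def call(name: str, *args):
    handler = registry.get((name, len(args)))
    if handler is None:
        raise LookupError(f"No procedure {name} taking {len(args)} argument(s)")
    return handler(*args)


# Two handlers sharing one name but differing in arity, as in the cells above.
register("hello_default_sp", 2, lambda name, age: f"Hello {name} with the age of {age}!")
register("hello_default_sp", 1, lambda name: f"Hello {name} with the default age of 45!")

one_arg = call("hello_default_sp", "mats")
two_args = call("hello_default_sp", "mats", 49)
print(one_arg)
print(two_args)
```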

## Register a file as a Stored Procedure

We can create a stored procedure from a Python file using **sproc.register_from_file**. The file can be local or on a Snowflake stage.

### Create a Stored Procedure from a local file
When using a local file, the file is uploaded to Snowflake and becomes part of the Stored Procedure, meaning that if we update the file we need to recreate the Stored Procedure.

Start by creating a file locally with a simple function in it

In [None]:
%%writefile ../py_scripts/sp_examples/file_sp.py
from snowflake.snowpark import Session

def hello(session: Session, name: str) -> str:
    curr_db = session.get_current_database()
    return f'Hello {name} this SP is created from a local Python file and is running in the {curr_db} database!'

Register a Stored Procedure using the file we just created.

In [None]:
snf_session.clear_imports()
snf_session.clear_packages()

file_sp = snf_session.sproc.register_from_file(name="local_file_sp", file_path="../py_scripts/sp_examples/file_sp.py", func_name="hello"
                                           , packages=['snowflake-snowpark-python'], replace=True, is_permanent=False)


Call the Stored Procedure

In [None]:
snf_session.call("local_file_sp", "mats")

### Create a Stored Procedure from a file on a Snowflake stage

By first storing our Python file on a Snowflake stage we can update the file without having to recreate the Stored Procedure.

Start by creating a new file with a function in it.

In [None]:
%%writefile ../py_scripts/sp_examples/stage_sp.py
from snowflake.snowpark import Session

def hello(session: Session, name: str) -> str:
    curr_db = session.get_current_database()
    return f'Hello {name} this SP is created from a Python file on a stage and is running in the {curr_db} database!'

We also need a Snowflake stage to store the file. We can either use an external stage (AWS S3, Azure Blob Storage, Google Cloud Storage) or an internal stage (managed by Snowflake). In this example we are using a Snowflake internal stage.

In [None]:
snf_session.sql("create or replace stage python_files").collect()

To add the file to the stage we can use **file.put** if the stage is a Snowflake internal stage; if it is an external stage, we need to use the cloud provider's tools to upload it.

In [None]:
snf_session.file.put('../py_scripts/sp_examples/stage_sp.py', '@python_files/sp_examples/', auto_compress=False, overwrite=True)

When creating a Python Stored Procedure from a file on a stage we need to provide the data types of the arguments and the return value through the **return_type** and **input_types** parameters.

In [None]:
snf_session.clear_imports()
snf_session.clear_packages()

file_sp = snf_session.sproc.register_from_file(name="file_stage_sp", file_path="@python_files/sp_examples/stage_sp.py", func_name="hello"
                                           , return_type=T.StringType(), input_types=[T.StringType()]
                                           , packages=['snowflake-snowpark-python']
                                           , replace=True, is_permanent=False)
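One way to think about why explicit types are needed here: for a function defined locally, Python type hints can be inspected, whereas a file that lives only on a stage offers nothing to inspect on the client. The sketch below shows the local inspection with the standard library; it illustrates the idea and makes no claim about how Snowpark does it internally. The empty `Session` class is a hypothetical placeholder for `snowflake.snowpark.Session`.

```python
from typing import get_type_hints


class Session:
    """Hypothetical placeholder for snowflake.snowpark.Session."""


def hello(session: Session, name: str) -> str:
    return f"Hello {name}"


hints = get_type_hints(hello)
return_hint = hints.pop("return")
# The session parameter is supplied by the runtime, so it is not an input type.
input_hints = [hint for param, hint in hints.items() if param != "session"]

print(input_hints, return_hint)
```

In the registration above, the same information is spelled out by hand as `input_types=[T.StringType()]` and `return_type=T.StringType()`.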


Test the Stored Procedure

In [None]:
snf_session.call("file_stage_sp", "mats")

If we update the file...

In [None]:
%%writefile ../py_scripts/sp_examples/stage_sp.py
from snowflake.snowpark import Session

def hello(session: Session, name: str) -> str:
    curr_db = session.get_current_database()
    return f'Hello {name} this SP is using a updated Python file from a stage and is running in the {curr_db} database!'

In [None]:
snf_session.file.put('../py_scripts/sp_examples/stage_sp.py', '@python_files/sp_examples/', auto_compress=False, overwrite=True)

And if we call the Stored Procedure again we are now using the new version

In [None]:
snf_session.call("file_stage_sp", "mats")

## Using a function in a Python file
### Using a local Python file
We can use functions in Python files from a Python Stored Procedure; we only need to add those files using the **imports** parameter.

Start by creating a file

In [None]:
%%writefile ../py_scripts/sp_examples/modules/local_module.py
def hello_name(name: str) -> str:
    return f'Hello {name} using a Python file for this function!'


Create Python Stored Procedure that is using the function in the file

In [None]:
snf_session.clear_imports()
snf_session.clear_packages()
@F.sproc(name="hello_file_sp", is_permanent=False, replace=True, packages=['snowflake-snowpark-python'],imports=['../py_scripts/sp_examples/modules/local_module.py'], session=snf_session)
def hello_file_sp(session: Session, name: str) -> str:
    import local_module
    return local_module.hello_name(name)

In [None]:
snf_session.call("hello_file_sp", "mats")

We can also import all files in a local directory. When doing that we need to import from the folder name.

Start by creating an additional file

In [None]:
%%writefile ../py_scripts/sp_examples/modules/local_second_module.py
def reverse_name(name: str) -> str:
    return name[::-1]


Create Python Stored Procedure that imports the functions from the files

In [None]:
snf_session.clear_imports()
snf_session.clear_packages()
@F.sproc(name="module_dir_sp", is_permanent=False, replace=True, packages=['snowflake-snowpark-python'],imports=['../py_scripts/sp_examples/modules'], session=snf_session)
def module_dir_sp(session: Session, name: str) -> str:
    from modules.local_module import hello_name
    from modules.local_second_module import reverse_name
    hello_str = hello_name(name)
    # Use a separate name for the result so the imported function is not shadowed
    reversed_name = reverse_name(name)
    return f'{hello_str} your name reversed is {reversed_name}'

If we look at the description of the Stored procedure we can see that the folder is added as modules.zip

In [None]:
snf_session.sproc.describe(module_dir_sp).show(max_width=150)
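Why the import has to start from the folder name can be reproduced locally with a zip archive on `sys.path`. The sketch below is a plain-Python illustration that assumes the archive keeps the folder name as its top-level prefix. It uses a hypothetical folder name `demo_modules` instead of `modules`, and adds an `__init__.py` so the example behaves the same across Python versions; the notebook's folder does not contain one.

```python
import importlib
import sys
import tempfile
import zipfile
from pathlib import Path

PACKAGE = "demo_modules"  # hypothetical stand-in for the "modules" folder

# Build an archive whose entries are prefixed with the folder name.
zip_path = Path(tempfile.mkdtemp()) / f"{PACKAGE}.zip"
with zipfile.ZipFile(zip_path, "w") as archive:
    archive.writestr(f"{PACKAGE}/__init__.py", "")
    archive.writestr(
        f"{PACKAGE}/local_module.py",
        "def hello_name(name):\n    return f'Hello {name}'\n",
    )

# Putting the archive on sys.path makes its top-level folder importable.
sys.path.insert(0, str(zip_path))
importlib.invalidate_caches()

from demo_modules.local_module import hello_name

greeting = hello_name("mats")
print(greeting)
```

Importing `local_module` directly would fail in this setup, because inside the archive the file sits under the folder prefix rather than at the top level.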

Test the Stored Procedure

In [None]:
snf_session.call("module_dir_sp", "mats")

### Using files in a stage

We can import files that are on a stage to use in a Stored Procedure. However, we need to import each file individually.

In [None]:
snf_session.file.put('../py_scripts/sp_examples/modules/*.py', '@python_files/modules/', auto_compress=False, overwrite=True)

In [None]:
snf_session.sql("ls @python_files/modules/").show()

In this case since we are using files on a stage we import each file individually in our code

In [None]:
snf_session.clear_imports()
snf_session.clear_packages()
@F.sproc(name="hello_stage_sp", is_permanent=False, replace=True, packages=['snowflake-snowpark-python']
         ,imports=['@python_files/modules/local_module.py', '@python_files/modules/local_second_module.py'], session=snf_session)
def hello_dir_sp(session: Session, name: str) -> str:
    from local_module import hello_name
    from local_second_module import reverse_name
    hello_str = hello_name(name)
    # Use a separate name for the result so the imported function is not shadowed
    reversed_name = reverse_name(name)
    return f'{hello_str} your name reversed is {reversed_name}'

If we get information about the Stored Procedure we can see that each file is added individually

In [None]:
snf_session.sproc.describe(hello_dir_sp).show(max_width=150)

In [None]:
snf_session.call("hello_stage_sp", "mats")

### Returning DataFrame

Starting with Snowpark for Python 1.5.0 you can create a Python Stored Procedure using the Snowpark API that returns a table/DataFrame.

In [None]:
snf_session.clear_imports()
snf_session.clear_packages()
@F.sproc(name="table_name_sp", is_permanent=False, replace=True, packages=['snowflake-snowpark-python'], session=snf_session)
def table_name_sp(snf_session: Session) -> DataFrame:
    sp_df = snf_session.table("CAMPAIGN_SPEND")
    df_spend_yearly = sp_df.group_by(F.year("DATE"), "CHANNEL").sum("TOTAL_COST").sort("YEAR(DATE)")
    
    return df_spend_yearly


In [None]:
return_df = snf_session.call("table_name_sp")
return_df.show()

However, the returned object is not a real DataFrame with multiple columns, as we can see by looking at the query and schema for return_df

In [None]:
return_df.queries

In [None]:
return_df.schema

In [None]:
sp_df = return_df.cache_result()

### Logging

The Python logger can be used in Python Stored Procedures. Logging needs to be set up beforehand according to https://docs.snowflake.com/en/developer-guide/logging-tracing/logging-python

In [None]:
snf_session.clear_imports()
snf_session.clear_packages()
@F.sproc(name="logging_sp", is_permanent=False, replace=True, packages=['snowflake-snowpark-python'], session=snf_session)
def logging_sp(session: Session) -> str:
    import logging

    logger = logging.getLogger("logging_sp_logger")
    logger.info("Starting sp")
    curr_db = session.get_current_database()
    logger.info(f"Using db: {curr_db}")
    logger.info("End logging_sp")
    
    return "Done logging for this time!"
  

We need to set the level for the Stored Procedure so the logging is captured

In [None]:
snf_session.sql("ALTER PROCEDURE logging_sp() SET LOG_LEVEL=INFO").collect()
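As a local analogy for what a level setting does, the standard `logging` module drops messages below the configured level. The sketch below runs entirely in plain Python and does not touch Snowflake or an event table. The `ListHandler` class and the logger name `logging_sp_logger_demo` are hypothetical and exist only for this illustration.

```python
import logging


class ListHandler(logging.Handler):
    """Hypothetical handler that keeps formatted messages in a list."""

    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record: logging.LogRecord) -> None:
        self.messages.append(f"{record.levelname}:{record.getMessage()}")


logger = logging.getLogger("logging_sp_logger_demo")
logger.setLevel(logging.INFO)  # analogous to choosing INFO as the level
logger.propagate = False       # keep the demo output out of the root logger
handler = ListHandler()
logger.addHandler(handler)

logger.debug("below INFO, so this is dropped")
logger.info("Starting sp")
logger.warning("above INFO, so this is kept")

captured = handler.messages
print(captured)
```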

If we now call the Stored Procedure, the log messages at the previously set level will be captured

In [None]:
snf_session.call("logging_sp")

To see the logged messages we need to query the event table that was created. It can take a couple of minutes before the messages are visible in the table.

In [None]:
snf_session.table("event_db.logging.logging_events").filter(F.col("SCOPE")['name'] == 'logging_sp_logger').order_by(F.col("OBSERVED_TIMESTAMP").desc()).select("OBSERVED_TIMESTAMP", "VALUE").show(max_width=150)

### Tracing
Tracing can be used to ...

In order to use tracing we need to set up an event table, as above for logging, and install the snowflake-telemetry-python Python library

In [None]:
from snowflake import telemetry

snf_session.clear_imports()
snf_session.clear_packages()
@F.sproc(name="logging_tracing_sp", is_permanent=False, replace=True, packages=['snowflake-snowpark-python', 'snowflake-telemetry-python'], session=snf_session)
def logging_tracing_sp(session: Session) -> str:
    import logging

    logger = logging.getLogger("logging_tracing_sp_logger")
    logger.info("Starting sp")
    curr_db = session.get_current_database()
    logger.info(f"Using db: {curr_db}")

    nbr_tables = session.table("information_schema.tables").filter(F.col("TABLE_TYPE") == 'BASE TABLE').count()
    telemetry.add_event("logging_tracing_sp.proc.do_tracing")
    telemetry.set_span_attribute("database_used", curr_db)
    telemetry.set_span_attribute("tables_in_database", nbr_tables)
    telemetry.add_event("logging_tracing_sp.proc.with_attribute", {"one_attribute": 1, "string_attribute":"string"})

    logger.info("End logging_tracing_sp")
    
    return "Done logging and tracing"
  

Set the Trace Level for the Stored Procedure

In [None]:
snf_session.sql("ALTER PROCEDURE logging_tracing_sp() SET TRACE_LEVEL = ON_EVENT").collect()

In [None]:
snf_session.call("logging_tracing_sp")

By querying the event table we can get the trace events. It can take a couple of minutes until they are visible

In [None]:
snf_session.table("event_db.logging.logging_events").order_by(F.col("TIMESTAMP").desc()).filter(F.col("RESOURCE_ATTRIBUTES")['snow.executable.name'].like('LOGGING_TRACING_SP%')).select("TIMESTAMP", "RECORD_TYPE","RECORD","RECORD_ATTRIBUTES" ).show(max_width=150)

In [None]:
snf_session.close()