# 2. Importing Packages That Aren't In Snowflake's Anaconda Channel

Snowflake's Anaconda channel (https://repo.anaconda.com/pkgs/snowflake/) has a lot of packages, but sometimes you need a package that is not yet included, or one that you have built yourself. Fear not, these can be used too. Let's find out how.




In [1]:
import json
import numpy as np
import pandas as pd
from snowflake.snowpark.session import Session
import snowflake.snowpark.functions as F
import snowflake.snowpark.types as T
from snowflake.snowpark.types import PandasDataFrameType, IntegerType, StringType, FloatType, DateType
from snowflake.ml.modeling.xgboost import XGBRegressor
from snowflake.ml.modeling.linear_model import LinearRegression
from snowflake.ml.registry import model_registry
from snowflake.ml._internal.utils import identifier

# 2.1 Reading Snowflake Connection Details and Creating a Session

In [2]:
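# Read the connection parameters from a local JSON file; the context manager closes the file afterwards
with open("/Users/mitaylor/Documents/creds/creds.json") as f:  # <--- Update this path
    snowflake_connection_cfg = json.load(f)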
session = Session.builder.configs(snowflake_connection_cfg).create()
session.sql("USE DATABASE HOL_DEMO").collect()
session.sql("CREATE OR REPLACE STAGE YOUR_STAGE").collect()
session.sql("CREATE OR REPLACE WAREHOUSE ASYNC_WH WITH WAREHOUSE_SIZE='MEDIUM' WAREHOUSE_TYPE = 'SNOWPARK-OPTIMIZED'").collect()

[Row(status='Warehouse ASYNC_WH successfully created.')]
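
The contents of `creds.json` are not shown in this notebook. As a rough sketch, assuming it is a flat JSON object holding the usual Snowflake connection parameters for password authentication, a small loader could check that the basics are present before a session is built. The helper name `load_connection_cfg`, the `REQUIRED_KEYS` tuple, and every placeholder value below are illustrative assumptions, not part of the original notebook.

```python
import json

# Keys assumed for password authentication; adjust to your own setup.
REQUIRED_KEYS = ("account", "user", "password")


def load_connection_cfg(path):
    """Load connection parameters from a JSON file and check the basics."""
    with open(path) as f:
        cfg = json.load(f)
    missing = [key for key in REQUIRED_KEYS if key not in cfg]
    if missing:
        raise KeyError(f"Missing connection parameters: {missing}")
    return cfg


# Example file contents (placeholder values, not real credentials).
EXAMPLE_CFG = {
    "account": "<your_account_identifier>",
    "user": "<your_user>",
    "password": "<your_password>",
    "role": "<your_role>",
    "warehouse": "<your_warehouse>",
    "database": "HOL_DEMO",
    "schema": "<your_schema>",
}
```

The returned dictionary could then be passed to `Session.builder.configs(...)` in the same way as in the cell above.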

# 2.2 Let's Run a SPROC

In [29]:
def hello_world(session: Session) -> T.Variant:
    return "hello world"

# Register sproc
hello_world_demo = session.sproc.register(
                              func=hello_world, 
                              name='hello_world', 
                              is_permanent=True, 
                              replace=True,
                              stage_location='@YOUR_STAGE', 
                              packages=['snowflake-snowpark-python'])
# Call sproc
hello_world_demo()

'"hello world"'
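
The call above returns `'"hello world"'` with the double quotes embedded, because the procedure is declared to return a `Variant` and the value arrives on the client as JSON text. A small sketch of decoding it with the standard library follows; `decode_variant` is a hypothetical helper name introduced here for illustration.

```python
import json


def decode_variant(result):
    """Turn the JSON text returned for a VARIANT into a Python object."""
    return json.loads(result) if isinstance(result, str) else result


print(decode_variant('"hello world"'))  # hello world
print(decode_variant('null'))           # None
```

A procedure that returns nothing comes back as `'null'`, which decodes to `None`.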

# 2.3 Trying to Create a SPROC Using the Python backtesting Library

In [27]:
def sproc_test_backtesting(session: Session) -> T.Variant:
    import backtesting as bt

# Register sproc
sproc_test_backtesting_demo = session.sproc.register(
                              func=sproc_test_backtesting, 
                              name='YOUR_SPROC_NAME', 
                              is_permanent=True, 
                              replace=True,
                              stage_location='@YOUR_STAGE', 
                              packages=['snowflake-snowpark-python', 'backtesting', 'bokeh'])
# Call sproc
sproc_test_backtesting_demo()

RuntimeError: Cannot add package backtesting because it is not available in Snowflake and Session.custom_package_usage_config['enabled'] is not set to True. To upload these packages, you can set it to True or find the directory of these packages and add it via Session.add_import. See details at https://docs.snowflake.com/en/developer-guide/snowpark/python/creating-udfs.html#using-third-party-packages-from-anaconda-in-a-udf.
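
The error message names two ways forward: setting `Session.custom_package_usage_config['enabled']` to `True`, or adding the package files yourself. A minimal sketch of the first option is shown here. It assumes that assigning a dictionary with an `enabled` key is enough for your Snowpark version, and that the package can be installed in your local environment so that Snowpark can upload it. The wrapper name `enable_custom_packages` is hypothetical.

```python
def enable_custom_packages(session):
    """Opt in to Snowpark uploading packages missing from the Snowflake channel."""
    # Assumption: a plain dict with "enabled" is sufficient for this session.
    session.custom_package_usage_config = {"enabled": True}
    return session.custom_package_usage_config
```

The rest of this notebook takes the second route and imports a wheel file from a stage.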

# 2.4 Creating a SPROC Using the backtesting Library From a Staged Wheel
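
The registration in this section expects `wheel_loader.py` and the wheel file to already be on the stage. The upload step is not shown in the notebook, so here is a sketch of how it might be done with `session.file.put`. The helper name `stage_files` is an assumption, and the local paths in the usage comment are assumed to sit in the current working directory.

```python
def stage_files(session, local_paths, stage_location="@YOUR_STAGE"):
    """Upload local files to a stage without compressing them."""
    results = []
    for path in local_paths:
        # auto_compress=False keeps the original file name (no .gz suffix),
        # so the names listed in imports=[...] still match.
        results.extend(
            session.file.put(path, stage_location, auto_compress=False, overwrite=True)
        )
    return results


# Usage (assumes both files are in the current working directory):
# stage_files(session, ["wheel_loader.py",
#                       "Backtesting-0.3.4.dev30+g0ce24d8-py3-none-any.whl"])
```

Because section 2.1 runs `CREATE OR REPLACE STAGE YOUR_STAGE`, the stage is emptied each time that cell is executed, so the upload needs to happen after it.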

In [12]:
# Define sproc to test that the import was successful
def sproc_test_backtesting(session: Session) -> T.Variant:
    import wheel_loader
    wheel_loader.load('Backtesting-0.3.4.dev30+g0ce24d8-py3-none-any.whl')
    import pandas as pd
    import backtesting as bt


# Register sproc
sproc_test_backtesting_demo = session.sproc.register(
                              func=sproc_test_backtesting, 
                              name='YOUR_SPROC_NAME', 
                              is_permanent=True, 
                              replace=True,
                              stage_location='@YOUR_STAGE', 
                              packages=['snowflake-snowpark-python', 'bokeh'], # Needed as dependency
                              imports=["@YOUR_STAGE/wheel_loader.py",
                                       "@YOUR_STAGE/Backtesting-0.3.4.dev30+g0ce24d8-py3-none-any.whl"])
# Call sproc
sproc_test_backtesting_demo()

'null'
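
The `wheel_loader` module imported above is not included in this notebook. The following is a minimal sketch of what such a helper might look like, not the author's actual module. It assumes that staged imports are available in the directory named by `sys._xoptions["snowflake_import_directory"]` and that `/tmp` is writable inside the procedure. The `import_dir` and `target_root` parameters are additions for illustration so that the function can also be tried locally.

```python
import os
import sys
import zipfile


def load(whl_name, import_dir=None, target_root="/tmp"):
    """Extract a staged wheel and put its contents on sys.path."""
    if import_dir is None:
        # Assumption: inside Snowflake, staged imports live in this directory.
        import_dir = sys._xoptions["snowflake_import_directory"]
    whl_path = os.path.join(import_dir, whl_name)
    # Use the wheel's file name (minus ".whl") as the extraction folder.
    extract_dir = os.path.join(target_root, os.path.splitext(whl_name)[0])
    if not os.path.isdir(extract_dir):
        # A wheel is a zip archive, so zipfile can unpack it.
        with zipfile.ZipFile(whl_path) as zf:
            zf.extractall(extract_dir)
    if extract_dir not in sys.path:
        sys.path.insert(0, extract_dir)
    return extract_dir
```

This only suits pure-Python wheels (`py3-none-any`), and it does no locking, so concurrent extraction into the same folder is not handled.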

# 2.5 Using the backtesting Library Locally

In [None]:
import backtesting as bt
from backtesting.lib import crossover
import pandas as pd

def MovingAverage(closes:pd.Series, n:int) -> pd.Series:
    return pd.Series(closes).rolling(n).mean()

class SmaCross(bt.Strategy):
    sma_fast = 12 
    sma_slow = 35
    
    def init(self):
        self.sma1 = self.I(MovingAverage, self.data.Close, self.sma_fast)
        self.sma2 = self.I(MovingAverage, self.data.Close, self.sma_slow)

    def next(self):
        if not self.position and crossover(self.sma1, self.sma2):  # no open position, and the fast SMA crosses above the slow SMA
            self.buy()
        elif self.position and crossover(self.sma2, self.sma1):
            self.position.close()
            
data = session.sql("""SELECT * FROM FS_DATASET WHERE SYMBOL = 'GOOG'""").to_pandas()
# backtesting expects Open, High, Low and Close columns;
# this example uses the closing price for all four.
data['High'] = data['CLOSE']
data['Open'] = data['CLOSE']
data['Low'] = data['CLOSE']
data['Close'] = data['CLOSE']
btest = bt.Backtest(data, SmaCross, cash=10_000, commission=0, exclusive_orders=True)
stats = btest.run()
pd.DataFrame(stats).T
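
To make the rule inside `SmaCross.next` concrete, here is a plain-Python sketch of the same idea on a toy price list: enter when the fast average crosses above the slow one, exit when the slow average crosses back above the fast one. It is not the backtesting library's implementation. It only marks the bars where a signal fires and does not model order execution, cash, or commission. All function names and the toy prices are made up for illustration.

```python
def moving_average(values, n):
    """Simple moving average; None until n values are available."""
    out = []
    for i in range(len(values)):
        if i + 1 < n:
            out.append(None)
        else:
            out.append(sum(values[i + 1 - n : i + 1]) / n)
    return out


def crossed_above(a, b, i):
    """True if series a moves from below b at bar i-1 to above b at bar i."""
    if i < 1 or None in (a[i - 1], b[i - 1], a[i], b[i]):
        return False
    return a[i - 1] < b[i - 1] and a[i] > b[i]


def sma_cross_signals(closes, fast, slow):
    """Return (bar_index, action) pairs for a long-only SMA crossover rule."""
    fast_ma = moving_average(closes, fast)
    slow_ma = moving_average(closes, slow)
    signals, in_position = [], False
    for i in range(len(closes)):
        if not in_position and crossed_above(fast_ma, slow_ma, i):
            signals.append((i, "buy"))
            in_position = True
        elif in_position and crossed_above(slow_ma, fast_ma, i):
            signals.append((i, "close"))
            in_position = False
    return signals


toy_closes = [5, 4, 3, 2, 1, 2, 3, 4, 5, 4, 3, 2, 1]
print(sma_cross_signals(toy_closes, fast=2, slow=3))
# [(6, 'buy'), (10, 'close')]
```

On the toy series the price falls, recovers, and falls again, so the rule buys once during the recovery and closes once during the second decline.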

# 2.6 Putting It All Together

In [None]:
# Define sproc to test that the import was successful
def sproc_test_backtesting(session: Session) -> T.Variant:
    import wheel_loader
    wheel_loader.load('Backtesting-0.3.4.dev30+g0ce24d8-py3-none-any.whl')
    import pandas as pd
    import backtesting as bt

    ########################################################
    # Insert some code here, hint it'll look A LOT like 2.5
    ########################################################


# Register sproc
sproc_test_backtesting_demo = session.sproc.register(
                              func=sproc_test_backtesting, 
                              name='YOUR_SPROC_NAME', 
                              is_permanent=True, 
                              replace=True,
                              stage_location='@YOUR_STAGE', 
                              packages=['snowflake-snowpark-python', 'bokeh'], # Needed as dependency
                              imports=["@YOUR_STAGE/wheel_loader.py",
                                       "@YOUR_STAGE/Backtesting-0.3.4.dev30+g0ce24d8-py3-none-any.whl"])
# Call sproc
sproc_test_backtesting_demo()

# 2.7 Parallelise!

In [24]:
# plot() is a method of the Backtest instance (btest), not of the backtesting module (bt)
btest.plot(filename="test.html", open_browser=False)



1 - Just price prediction: model built and registered <-- we're off to the races, 35 lines of code!
2 - Build a strategy off that prediction, and backtest it
3 - Get more data, enhance the strategy
4 - Scale up the search (Cartesian product of symbol and backtest settings? or maybe strategy?)
5 - Streamlit the results

Bonus - Orchestrate with tasks?



UDTF to clean data
UDF to deploy our model to the data
UDTF to do a backtesting strategy on the data

UDTF:
Contains the strategy
Gets the data (from Snowflake, based on input criteria)
Partitions over symbol
Runs the backtest on each symbol
Cartesian product of symbol and smastats
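
The Cartesian product mentioned in these notes can be sketched with `itertools.product`. This is one possible reading of the plan, assuming the search varies the fast and slow SMA windows for each symbol. The helper name, the second symbol, and the window values are placeholders rather than anything taken from the notebook.

```python
from itertools import product


def build_search_grid(symbols, fast_windows, slow_windows):
    """One row per (symbol, sma_fast, sma_slow), keeping only fast < slow."""
    return [
        {"symbol": symbol, "sma_fast": fast, "sma_slow": slow}
        for symbol, fast, slow in product(symbols, fast_windows, slow_windows)
        if fast < slow
    ]


# "SYMBOL_B" and the window sizes are placeholder values.
grid = build_search_grid(["GOOG", "SYMBOL_B"], [5, 12, 20], [20, 35])
print(len(grid))  # 10
print(grid[0])    # {'symbol': 'GOOG', 'sma_fast': 5, 'sma_slow': 20}
```

Each row describes one backtest to run, so a grid like this could be loaded into a table and partitioned by symbol, in line with the UDTF outline above.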