# Vectorized User Defined Functions

1. Operationalize UDFs and UDTFs in Snowpark

    Compare scalar and vectorized operations


2. Enhance performance in Snowpark Applications

    Vectorization

        Understanding the difference between vectorized and scalar UDFs

        Vectorized UDFs for batching
    

For more information, see the following links:

1. [Snowpark Performance Best Practices](https://www.phdata.io/blog/snowpark-performance-best-practices/)

2. [Vectorized Python UDFs](https://docs.snowflake.com/en/developer-guide/udf/python/udf-python-batch)
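
The difference between scalar and vectorized operations listed in the outline can be sketched locally with plain pandas, with no Snowflake session involved. The function names `add_10_scalar` and `add_10_vectorized` are hypothetical and exist only for this illustration: the first handles one row per call, the way a scalar UDF handler does, while the second receives whole columns as `pd.Series` and computes every row in one call, the way a vectorized UDF handler does.

```python
import pandas as pd


def add_10_scalar(a: int, b: int) -> int:
    # Scalar style: one call per row, plain Python values in and out.
    return a + b + 10


def add_10_vectorized(a: pd.Series, b: pd.Series) -> pd.Series:
    # Vectorized style: one call per batch, whole columns in and out.
    return a + b + 10


# Same sample values as the notebook's DataFrame, held locally.
a = pd.Series([1, 3, 5])
b = pd.Series([2, 4, 6])

# Scalar path: a Python-level loop that invokes the function once per row.
scalar_result = [add_10_scalar(x, y) for x, y in zip(a.tolist(), b.tolist())]

# Vectorized path: a single call; pandas does the arithmetic column-wise.
vectorized_result = add_10_vectorized(a, b)

print(scalar_result)               # [13, 17, 21]
print(vectorized_result.tolist())  # [13, 17, 21]
```

Both paths return the same values. The vectorized form replaces many per-row Python calls with one column-wise operation, which is where the performance benefit typically comes from on larger inputs.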



In [None]:
import pandas as pd
from snowflake.snowpark.functions import col, call_udf, pandas_udf
from snowflake.snowpark.types import IntegerType, PandasSeriesType, PandasDataFrameType
from snowflake.snowpark.context import get_active_session

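# Reuse the session of the running notebook and set the working context.
session = get_active_session()
session.use_database("snowpark_db")
session.use_schema("sourced")
# Tag the queries issued from this notebook so they are easy to find in query history.
session.query_tag = "create-vectorized-udfs"

# Small sample DataFrame with two integer columns, "a" and "b".
df = session.create_dataframe(
    [[1, 2], [3, 4], [5, 6]],
    schema=["a", "b"])
df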

In [None]:
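# vectorized anonymous UDF (with pandas_udf + lambda)
# The input type is a single PandasDataFrameType, so the lambda receives all
# argument columns as one pandas DataFrame and selects them by position.
add_10 = pandas_udf(
    lambda df: df[0] + df[1] + 10,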
    input_types=[PandasDataFrameType([IntegerType(), IntegerType()])],
    return_type=PandasSeriesType(IntegerType()))

df.select(add_10("a", "b").alias("res"))

In [None]:
# vectorized named UDF (with @pandas_udf)
# Each argument column arrives as its own pandas Series.
@pandas_udf(
    name="add_11",
    input_types=[PandasSeriesType(IntegerType()), PandasSeriesType(IntegerType())],
    return_type=PandasSeriesType(IntegerType()),
    replace=True)
def add_11(col1: pd.Series, col2: pd.Series) -> pd.Series:
    return col1 + col2 + 11

# Call the UDF by its registered name.
df.select(call_udf("add_11", col("a"), col("b")).alias("res"))
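
The "Vectorized UDFs for batching" item from the outline can be illustrated with a local simulation. This is a minimal sketch under stated assumptions, not a description of how Snowflake schedules batches internally: the helper `run_in_batches` and the `batch_size` value are hypothetical, and the handler body mirrors `add_11` above. The point is that a vectorized handler is invoked once per batch of rows rather than once per row, and that the per-batch results line up with the input rows.

```python
import pandas as pd


def add_11(col1: pd.Series, col2: pd.Series) -> pd.Series:
    # Same logic as the named vectorized UDF, run locally.
    return col1 + col2 + 11


def run_in_batches(handler, frame: pd.DataFrame, batch_size: int) -> pd.Series:
    # Hypothetical helper: slice the rows into batches, call the handler
    # once per batch, then stitch the partial results back together.
    results = []
    for start in range(0, len(frame), batch_size):
        batch = frame.iloc[start:start + batch_size]
        results.append(handler(batch["a"], batch["b"]))
    return pd.concat(results)


# Local stand-in for the notebook's Snowpark DataFrame.
local_df = pd.DataFrame({"a": [1, 3, 5], "b": [2, 4, 6]})

# Three rows with batch_size=2 means two handler calls (2 rows, then 1 row).
res = run_in_batches(add_11, local_df, batch_size=2)
print(res.tolist())  # [14, 18, 22]
```

In Snowpark itself the batch boundaries are chosen by Snowflake, so a handler should not depend on a particular batch size; `pandas_udf` also accepts a `max_batch_size` argument that caps the number of rows per batch.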