We will use the Python "Faker" library to generate the test data required for this project. You don't need to learn Python to use Dynamic Tables; it is only used here to generate sample datasets. To run this Python code in Snowflake, we will build and use Python UDTFs (user-defined table functions).

We will create three UDTFs to generate our source data. The first table is CUST_INFO; we insert 1000 customers into it using the new Python UDTF gen_cust_info.

In [None]:
use database SNOW_DYNAMIC_TABLES_DE;
use schema data;

In [None]:
create or replace function gen_cust_info(num_records number)
returns table (custid number(10), cname varchar(100), spendlimit number(10,2))
language python
runtime_version=3.8
handler='CustTab'
packages = ('Faker')
as $$
from faker import Faker
import random

fake = Faker()
# Generate a list of customers  

class CustTab:
    # Generate multiple customer records
    def process(self, num_records):
        customer_id = 1000 # Starting customer ID                 
        for _ in range(num_records):
            custid = customer_id + 1
            cname = fake.name()
            spendlimit = round(random.uniform(1000, 10000),2)
            customer_id += 1
            yield (custid,cname,spendlimit)

$$;

create or replace table cust_info as select * from table(gen_cust_info(1000)) order by 1;
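
To see what the handler produces without connecting to Snowflake, the same pattern can be sketched locally. This is a minimal sketch, not the UDTF itself: `CustTabLocal`, `FIRST_NAMES` and `LAST_NAMES` are hypothetical names, and the small name lists stand in for Faker's `fake.name()` so the example needs no third-party package. The point it illustrates is that `process` is a generator and each yielded tuple becomes one row of the table function's output.

```python
import random
from typing import Iterator, Tuple

# Hypothetical stand-in for Faker's name provider (assumption, not part of the UDTF).
FIRST_NAMES = ["Ana", "Ben", "Chen", "Dana"]
LAST_NAMES = ["Ito", "Khan", "Lopez", "Smith"]


class CustTabLocal:
    """Local mirror of the CustTab handler: process() yields one tuple per row."""

    def process(self, num_records: int) -> Iterator[Tuple[int, str, float]]:
        for offset in range(1, num_records + 1):
            custid = 1000 + offset  # IDs start at 1001, as in the UDTF
            cname = f"{random.choice(FIRST_NAMES)} {random.choice(LAST_NAMES)}"
            spendlimit = round(random.uniform(1000, 10000), 2)
            yield (custid, cname, spendlimit)


# Materialize five rows, roughly what table(gen_cust_info(5)) would return.
rows = list(CustTabLocal().process(5))
for row in rows:
    print(row)
```

The names and spend limits are random, so only the customer IDs are predictable from run to run.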

Just to show another feature quickly: we can build a .jar, upload it to a stage, and then call it from a user-defined function.

I'm going to zoom through a few steps, but first, let's load up IntelliJ and take a look at some code we created.

This code creates some fake data for us. We can run it locally, but we then want to build a jar (with dependencies).

When that is complete, we'll move the jar to Snowflake with the PUT command.

* You need to connect via SnowSQL to upload the jar if it is over 50 MB: `snowsql -a <YOUR_ACCOUNT_HERE> -w <YOUR_WAREHOUSE_HERE> -u <YOUR_USER_HERE>`
* Then make sure you are using the right database and schema.
* This command uploads your jar to the stage: `put file:////Users/bharris/src/github/snowpark-java-examples/EnrichData/target/EnrichData-1.0-SNAPSHOT-jar-with-dependencies.jar @~/JAVA_STAGE/ auto_compress = false OVERWRITE=true;`

Once the jar has been uploaded to the stage, you can create your function(s):

```
create or replace function generateFull(num number)
returns variant
language java
imports = ('@~/JAVA_STAGE/EnrichData-1.0-SNAPSHOT-jar-with-dependencies.jar')
handler = 'com.snowflake.udf.FakeData.generateFull';

create or replace function generateBasic(num number(9,0))
returns variant
language java
imports = ('@~/JAVA_STAGE/EnrichData-1.0-SNAPSHOT-jar-with-dependencies.jar')
handler = 'com.snowflake.udf.FakeData.generateBasic';

create or replace function generatebasictablerows(num number)
returns table(firstName varchar, lastName varchar, email varchar, phone varchar, address varchar)
language java
imports = ('@~/JAVA_STAGE/EnrichData-1.0-SNAPSHOT-jar-with-dependencies.jar')
handler = 'com.snowflake.udf.FakeDataTable';
```


And while you can't run Java code in notebooks, you can still use SQL to create a function whose handler is written inline in Java:

In [None]:
create or replace function echo_varchar(x varchar, y number)
returns varchar
language java
called on null input
handler='TestFunc.echoVarchar'
target_path='@~/testfuncloopv2.jar'
as
'class TestFunc {
  public static String echoVarchar(String x, Integer y) {

    StringBuffer rtn = new StringBuffer();
    
    for(int z=0;z<y;z++)
    {
        rtn.append(x + " ");
    }
    return rtn.toString();
  }
}';

In [None]:
select echo_varchar('Hi!', 5);
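
For readers less familiar with Java, the handler's logic can be mirrored in a few lines of Python. This is only an illustrative equivalent of the inline Java above (the Python function is not registered anywhere): it appends the input string followed by a space, `y` times, so the result keeps a trailing space.

```python
def echo_varchar(x: str, y: int) -> str:
    """Repeat x followed by a space, y times, like the Java echoVarchar handler."""
    parts = []
    for _ in range(y):
        parts.append(x + " ")
    return "".join(parts)


result = echo_varchar("Hi!", 5)
print(repr(result))  # 'Hi! Hi! Hi! Hi! Hi! '
```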

The next table is PROD_STOCK_INV; we insert the inventory for 100 products into it using a new Python UDTF.



In [None]:
create or replace function gen_prod_inv(num_records number)
returns table (pid number(10), pname varchar(100), stock number(10,2), stockdate date)
language python
runtime_version=3.8
handler='ProdTab'
packages = ('Faker')
as $$
from faker import Faker
import random
from datetime import datetime, timedelta
fake = Faker()

class ProdTab:
    # Generate multiple product records
    def process(self, num_records):
        product_id = 100 # Starting product ID
        for _ in range(num_records):
            pid = product_id + 1
            pname = fake.catch_phrase()
            stock = round(random.uniform(500, 1000),0)
            # Get the current date
            current_date = datetime.now()
            
            # Calculate the minimum date (90 days before now)
            min_date = current_date - timedelta(days=90)
            
            # Generate a random date within the date range
            stockdate = fake.date_between_dates(min_date,current_date)

            product_id += 1
            yield (pid,pname,stock,stockdate)

$$;

create or replace table prod_stock_inv as select * from table(gen_prod_inv(100)) order by 1;
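
The stock date logic is a random pick from a look-back window that ends today. A minimal stdlib sketch of that idea follows; `random_date_within` is a hypothetical helper that approximates what `fake.date_between_dates(min_date, current_date)` is used for in the UDTF, and it is not a Faker API.

```python
import random
from datetime import date, timedelta


def random_date_within(days_back: int, today: date) -> date:
    """Return a random date in the window [today - days_back, today]."""
    min_date = today - timedelta(days=days_back)
    # Pick a whole-day offset from the start of the window.
    return min_date + timedelta(days=random.randint(0, days_back))


today = date.today()
stockdate = random_date_within(90, today)
print(stockdate)
```

The same helper with `days_back` set to a smaller value describes the purchase-date window used for the sales data.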


The next table is SALESDATA, which stores raw product sales by customer and purchase date.



In [None]:
create or replace function gen_cust_purchase(num_records number,ndays number)
returns table (custid number(10), purchase variant)
language python
runtime_version=3.8
handler='genCustPurchase'
packages = ('Faker')
as $$
from faker import Faker
import random
from datetime import datetime, timedelta

fake = Faker()

class genCustPurchase:
    # Generate multiple customer purchase records
    def process(self, num_records,ndays):       
        for _ in range(num_records):
            c_id = fake.random_int(min=1001, max=1999)
            
            #print(c_id)
            customer_purchase = {
                'custid': c_id,
                'purchased': []
            }
            # Get the current date
            current_date = datetime.now()
            
            # Calculate the minimum date (ndays before now)
            min_date = current_date - timedelta(days=ndays)
            
            # Generate a random date within the date range
            pdate = fake.date_between_dates(min_date,current_date)
            
            purchase = {
                'prodid': fake.random_int(min=101, max=199),
                'quantity': fake.random_int(min=1, max=5),
                'purchase_amount': round(random.uniform(10, 1000),2),
                'purchase_date': pdate
            }
            customer_purchase['purchased'].append(purchase)
            
            #customer_purchases.append(customer_purchase)
            yield (c_id,purchase)

$$;

-- Create table and insert records 
create or replace table salesdata as select * from table(gen_cust_purchase(10000,10));
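
Each row of SALESDATA pairs a customer ID with a semi-structured purchase record. The sketch below builds one such row locally to show the shape of the payload. It is an illustration under assumptions: `make_purchase` is a hypothetical helper, stdlib `random` replaces Faker, and the date is converted to an ISO string only so that `json.dumps` can print it; how Snowflake renders the date inside the VARIANT column may differ.

```python
import json
import random
from datetime import date, timedelta
from typing import Tuple


def make_purchase(ndays: int, today: date) -> Tuple[int, dict]:
    """Build one (custid, purchase) row shaped like the UDTF output."""
    custid = random.randint(1001, 1999)
    pdate = today - timedelta(days=random.randint(0, ndays))
    purchase = {
        "prodid": random.randint(101, 199),
        "quantity": random.randint(1, 5),
        "purchase_amount": round(random.uniform(10, 1000), 2),
        # ISO string is an assumption made for local JSON printing only.
        "purchase_date": pdate.isoformat(),
    }
    return custid, purchase


custid, purchase = make_purchase(10, date.today())
print(custid, json.dumps(purchase))
```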

This completes our sample data, stored in raw base tables. In the real world, you would load this data into Snowflake using the COPY command, connectors, Snowpipe, or Snowpipe Streaming.

In [None]:
-- customer information table, each customer has spending limits
select * from cust_info limit 10;

In [None]:
-- product stock table, each product has stock level from fulfilment day
select * from prod_stock_inv limit 10;

In [None]:
-- sales data for products purchased online by various customers
select * from salesdata limit 10;