### 1. Build conda environment to be used in this demo

> <b>Run the following commands from a terminal.</b>

``` bash
conda create --name demosnowparkdemo --override-channels -c https://repo.anaconda.com/pkgs/snowflake python=3.8
conda activate demosnowparkdemo
conda install snowflake-snowpark-python pandas pyarrow streamlit
```
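Before moving on, it can help to confirm that the new environment actually exposes the packages installed above. The following is a minimal sketch; the helper name `missing_modules` and the list of import names are illustrative assumptions, not part of the demo.

```python
import importlib.util

# Import names assumed to correspond to the conda packages installed above.
REQUIRED_MODULES = ["snowflake.snowpark", "pandas", "pyarrow", "streamlit"]


def missing_modules(modules):
    """Return the module names that cannot be located in the active environment."""
    missing = []
    for name in modules:
        try:
            if importlib.util.find_spec(name) is None:
                missing.append(name)
        except ModuleNotFoundError:
            # Raised when a parent package (for example "snowflake") is absent.
            missing.append(name)
    return missing


if __name__ == "__main__":
    print("Missing modules:", missing_modules(REQUIRED_MODULES) or "none")
```

Run it with the `demosnowparkdemo` environment activated; an empty result suggests the installation succeeded.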

### 2. Build docker image and push the image to image registry

> <b>Run the following commands from a terminal. Make sure Docker is running on your machine. Replace ORGNAME-ACCTNAME with your Snowflake organization and account names, and supply your username (you will be prompted for your password, so have it handy).</b>

``` bash
cd ..
cd audio2text

# See audio2text/Dockerfile for image details

docker build --no-cache --platform linux/amd64 -t ORGNAME-ACCTNAME.registry.snowflakecomputing.com/llmdemo/public/images/whisper-audio2text:latest . 

# The username and password are the same as your Snowflake credentials

docker login ORGNAME-ACCTNAME.registry.snowflakecomputing.com -u <username>

docker push ORGNAME-ACCTNAME.registry.snowflakecomputing.com/llmdemo/public/images/whisper-audio2text:latest
```
> <b>*** Allow docker push to complete before moving on (takes several minutes, so now is a good time for a bio break) ***</b>
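The image tag used in the commands above follows the pattern `registry-host/database/schema/repository/image:tag`. As a rough sketch, a small helper can assemble it so the same string is used for `docker build`, `docker push`, and the service spec. The function name `image_url` and its parameters are hypothetical; lowercasing is applied because Docker rejects uppercase characters in repository names.

```python
def image_url(org_account, database, schema, repository, image, tag="latest"):
    """Assemble a registry image URL from its parts (hypothetical helper)."""
    # Docker requires lowercase repository names, so normalize every part.
    host = f"{org_account.lower()}.registry.snowflakecomputing.com"
    path = "/".join(part.lower() for part in (database, schema, repository))
    return f"{host}/{path}/{image.lower()}:{tag}"


if __name__ == "__main__":
    print(image_url("ORGNAME-ACCTNAME", "llmdemo", "public", "images",
                    "whisper-audio2text"))
```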

### 3. Create internal stages to hold our images and files 

In [35]:
import json
from snowflake.snowpark.session import Session
import snowflake.snowpark.functions as F

In [36]:
# Connection.json file should use the SPCS_PSE_ROLE 

# Use a context manager so the file handle is closed after reading.
with open('../connection.json') as f:
    connection_parameters = json.load(f)
session = Session.builder.configs(connection_parameters).create()
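The contents of `connection.json` are not shown here. The sketch below assumes it holds the usual Snowpark connection keys and checks for them before a session is built; the key list and the helper name `check_connection` are assumptions, so adjust them to your authentication setup (for example, if you do not use a password).

```python
# Keys this sketch assumes connection.json contains; adjust as needed.
REQUIRED_KEYS = {"account", "user", "password", "role",
                 "warehouse", "database", "schema"}


def check_connection(params):
    """Raise KeyError if any assumed connection key is missing; return params."""
    missing = REQUIRED_KEYS - params.keys()
    if missing:
        raise KeyError(f"connection.json is missing: {sorted(missing)}")
    return params


if __name__ == "__main__":
    example = {key: "<value>" for key in REQUIRED_KEYS}
    example["role"] = "SPCS_PSE_ROLE"  # role named in the comment above
    print(sorted(check_connection(example)))
```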

In [37]:
# Run the command below to create the required stages
stages=['WHISPER_APP','AUDIO_FILES','SPECS','CSV_FILES']
for stg in stages:
    session.sql(f'''
                CREATE OR REPLACE STAGE {stg} ENCRYPTION = (TYPE = 'SNOWFLAKE_SSE') 
                DIRECTORY = (ENABLE = TRUE);
                ''').collect()
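Note that `CREATE OR REPLACE STAGE` recreates each stage every time the cell runs, which for an internal stage discards files that were already uploaded. As a hedged alternative, the sketch below builds the same DDL but defaults to `CREATE STAGE IF NOT EXISTS`; the helper name `stage_ddl` is hypothetical, and the identifier check is a deliberately conservative assumption.

```python
import re

STAGES = ["WHISPER_APP", "AUDIO_FILES", "SPECS", "CSV_FILES"]


def stage_ddl(name, replace=False):
    """Build the stage DDL used in this demo (hypothetical helper)."""
    # Accept only simple unquoted identifiers to avoid malformed SQL.
    if not re.fullmatch(r"[A-Za-z_][A-Za-z0-9_$]*", name):
        raise ValueError(f"not a simple unquoted identifier: {name!r}")
    head = "CREATE OR REPLACE STAGE" if replace else "CREATE STAGE IF NOT EXISTS"
    return (f"{head} {name} "
            "ENCRYPTION = (TYPE = 'SNOWFLAKE_SSE') "
            "DIRECTORY = (ENABLE = TRUE)")


if __name__ == "__main__":
    for stage in STAGES:
        print(stage_ddl(stage))
```

Each generated statement could then be passed to `session.sql(...).collect()` in place of the inline f-string.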


### 4. Create the SPCS Service
Update the image URL line in [whisper_spec.yml](./whisper_spec.yml) with your org/account info before executing the following put command.

e.g. image: <i>myorg-myaccount</i>.registry.snowflakecomputing.com/llmdemo/public/images/whisper-audio2text:latest

Note: <b>Run the following commands using the SPCS role, not ACCOUNTADMIN! (This should already be set in your [connection.json](../connection.json) file.)</b>
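Editing the image line by hand is easy to get wrong, so here is a minimal sketch that rewrites the registry host in a spec's `image:` line. The function name `set_registry_account` is hypothetical, and the sample spec layout is an assumption since the real `whisper_spec.yml` is not reproduced here.

```python
import re

# Matches the account prefix of the registry host on an `image:` line.
_IMAGE_HOST = re.compile(
    r"^(\s*(?:-\s*)?image:\s*)[^./\s]+(\.registry\.snowflakecomputing\.com/)",
    re.MULTILINE,
)


def set_registry_account(spec_text, org_account):
    """Return spec_text with the registry host prefix set to org_account."""
    return _IMAGE_HOST.sub(
        lambda m: f"{m.group(1)}{org_account.lower()}{m.group(2)}", spec_text
    )


if __name__ == "__main__":
    # Assumed spec fragment, for illustration only.
    sample = (
        "spec:\n"
        "  containers:\n"
        "  - name: audio-whisper-app\n"
        "    image: ORGNAME-ACCTNAME.registry.snowflakecomputing.com"
        "/llmdemo/public/images/whisper-audio2text:latest\n"
    )
    print(set_registry_account(sample, "myorg-myaccount"))
```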


In [38]:

session.file.put("./whisper_spec.yml", "@specs",auto_compress=False)

[PutResult(source='whisper_spec.yml', target='whisper_spec.yml', source_size=546, target_size=546, source_compression='NONE', target_compression='NONE', status='UPLOADED', message='')]

In [39]:
# Create the service
session.sql('''
CREATE SERVICE IF NOT EXISTS Whisper_Audio_text_SVC
  IN COMPUTE POOL PR_GPU_S
  FROM @specs
  SPEC='whisper_spec.yml'
  EXTERNAL_ACCESS_INTEGRATIONS = (ALLOW_ALL_EAI)
  MIN_INSTANCES=1
  MAX_INSTANCES=1;
            ''').collect()

[Row(status='Service WHISPER_AUDIO_TEXT_SVC successfully created.')]

> The service must be in the READY state before you proceed. Run the following command to confirm.

In [41]:
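# Service activation may take a few minutes.
res = session.sql('''
SELECT SYSTEM$GET_SERVICE_STATUS('Whisper_Audio_text_SVC', 1)
''').collect()[0][0]
# The status is returned as a JSON string, so parse it with json
# (imported in step 3) rather than evaluating it as a Python literal.
json.loads(res)[0]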

{'status': 'READY',
 'message': 'Running',
 'containerName': 'audio-whisper-app',
 'instanceId': '0',
 'serviceName': 'WHISPER_AUDIO_TEXT_SVC',
 'image': 'sfsenorthamerica-demo-psheehan.registry.snowflakecomputing.com/llmdemo/public/images/whisper-audio2text:latest',
 'restartCount': 0,
 'startTime': '2024-06-21T16:44:28Z'}

>  Be sure the service is in the Ready State before proceeding.
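Rather than re-running the status cell by hand, the check can be wrapped in a polling loop. This is a sketch under assumptions: `wait_until_ready` is a hypothetical helper, and `fetch_status` is any callable that returns the JSON string produced by `SYSTEM$GET_SERVICE_STATUS`.

```python
import json
import time


def wait_until_ready(fetch_status, timeout_s=600, interval_s=10, sleep=time.sleep):
    """Poll until every container reports READY, or raise TimeoutError."""
    deadline = time.monotonic() + timeout_s
    while True:
        containers = json.loads(fetch_status())
        if containers and all(c.get("status") == "READY" for c in containers):
            return containers
        if time.monotonic() >= deadline:
            raise TimeoutError(f"service not READY; last status: {containers}")
        sleep(interval_s)


if __name__ == "__main__":
    # Stand-in for the real query, so the sketch runs without a session.
    fake = iter(['[{"status": "PENDING"}]', '[{"status": "READY"}]'])
    print(wait_until_ready(lambda: next(fake), interval_s=0))
```

With a live session, `fetch_status` could be `lambda: session.sql("SELECT SYSTEM$GET_SERVICE_STATUS('Whisper_Audio_text_SVC')").collect()[0][0]`.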

In [42]:
# Check the service log for errors.
session.sql('''SELECT value AS log_line
FROM TABLE(
 SPLIT_TO_TABLE(SYSTEM$GET_SERVICE_LOGS('Whisper_Audio_text_SVC', 0, 'audio-whisper-app'), '\n')
  )''').to_pandas()


Unnamed: 0,LOG_LINE
0,
1,=============
2,== PyTorch ==
3,=============
4,
5,NVIDIA Release 23.06 (build 63009835)
6,PyTorch Version 2.1.0a0+4136153
7,
8,"Container image Copyright (c) 2023, NVIDIA COR..."
9,
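The log can run to many lines, so scanning it by eye is error-prone. The following sketch flags lines that look like problems; the marker words and the helper name `problem_lines` are assumptions, not an exhaustive list of what the container may print.

```python
# Substrings treated as signs of trouble (an assumed, non-exhaustive list).
PROBLEM_MARKERS = ("error", "exception", "traceback", "failed")


def problem_lines(log_text, markers=PROBLEM_MARKERS):
    """Return (line_number, line) pairs whose text contains any marker."""
    hits = []
    for number, line in enumerate(log_text.splitlines(), start=1):
        lowered = line.lower()
        if any(marker in lowered for marker in markers):
            hits.append((number, line.strip()))
    return hits


if __name__ == "__main__":
    sample = "== PyTorch ==\nRuntimeError: CUDA out of memory\nServer started"
    print(problem_lines(sample))
```

The raw string returned by `SYSTEM$GET_SERVICE_LOGS` can be passed in directly as `log_text`.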


### 5. Create the service functions

In [43]:
#   Function to get duration of the audio files
session.sql('''CREATE OR REPLACE FUNCTION DURATION(AUDIO_FILE TEXT)
RETURNS VARIANT
SERVICE=Whisper_Audio_text_SVC
ENDPOINT=API
AS '/audio-duration'
            ''').collect()


[Row(status='Function DURATION successfully created.')]

In [44]:
# Function to transcribe the audio files
session.sql('''CREATE OR REPLACE FUNCTION TRANSCRIBE(TASK TEXT, LANGUAGE TEXT, AUDIO_FILE TEXT, ENCODE BOOLEAN)
RETURNS VARIANT
SERVICE=Whisper_Audio_text_SVC
ENDPOINT=API
AS '/asr'
            ''').collect()

[Row(status='Function TRANSCRIBE successfully created.')]

In [45]:
# Function to detect language of the audio file
session.sql('''CREATE OR REPLACE FUNCTION DETECT_LANGUAGE(AUDIO_FILE TEXT, ENCODE BOOLEAN)
RETURNS VARIANT
SERVICE=Whisper_Audio_text_SVC
ENDPOINT=API
AS '/detect-language'
            ''').collect()

[Row(status='Function DETECT_LANGUAGE successfully created.')]
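Each service function above maps a SQL call to an HTTP endpoint of the container. Snowflake sends rows in batches as JSON of the form `{"data": [[row_index, arg1, ...], ...]}` and expects `{"data": [[row_index, result], ...]}` in return. The sketch below shows that contract in isolation; `handle_batch` and the stand-in `fake_duration` are hypothetical and do not reproduce the container's actual code.

```python
def handle_batch(request_body, compute):
    """Apply compute to each row's arguments, preserving the row index."""
    rows = request_body["data"]
    return {"data": [[row[0], compute(*row[1:])] for row in rows]}


def fake_duration(audio_file):
    """Stand-in for the real endpoint logic; returns a fixed duration."""
    return {"call_duration_seconds": 1.5}


if __name__ == "__main__":
    request = {"data": [[0, "https://example.invalid/a.mp3"],
                        [1, "https://example.invalid/b.mp3"]]}
    print(handle_batch(request, fake_duration))
```

Because every result is keyed by its row index, the function's `RETURNS VARIANT` value lines up with the input row even when rows are processed in batches.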

In [46]:
# Creating the Table to load the Audio file raw text along with duration and other attributes

# Duration is in seconds

session.sql('''
    CREATE OR REPLACE TABLE ALL_CLAIMS_RAW (
        DATETIME DATE,
        AUDIOFILE VARCHAR(16777216),
        CONVERSATION VARCHAR(16777216),
        PRESIGNED_URL_PATH VARCHAR(16777216),
        DURATION FLOAT NOT NULL
    )''').collect()


[Row(status='Table ALL_CLAIMS_RAW successfully created.')]

In [47]:
#  Uploading the audio files to Internal Stage

_ = session.file.put("./audiofiles/*.*", "@AUDIO_FILES/2024-01-26/", auto_compress=False,overwrite=True)

session.sql(f'''ALTER STAGE AUDIO_FILES REFRESH''').collect()


[Row(file='stages/259abeb8-fe6d-4f18-b069-0126596699c8/2024-01-26/harvard (1).wav', status='REGISTERED_NEW', description='File registered successfully.'),
 Row(file='stages/259abeb8-fe6d-4f18-b069-0126596699c8/2024-01-26/Sample_ATT_Inbound_Call-MONO_47sec.mp3', status='REGISTERED_NEW', description='File registered successfully.'),
 Row(file='stages/259abeb8-fe6d-4f18-b069-0126596699c8/2024-01-26/common_voice_de_37888599.mp3', status='REGISTERED_NEW', description='File registered successfully.'),
 Row(file='stages/259abeb8-fe6d-4f18-b069-0126596699c8/2024-01-26/Health-Insurance-1.mp3', status='REGISTERED_NEW', description='File registered successfully.'),
 Row(file='stages/259abeb8-fe6d-4f18-b069-0126596699c8/2024-01-26/common_voice_de_37942822.mp3', status='REGISTERED_NEW', description='File registered successfully.'),
 Row(file='stages/259abeb8-fe6d-4f18-b069-0126596699c8/2024-01-26/jackhammer.wav', status='REGISTERED_NEW', description='File registered successfully.')]

In [48]:
session.sql('ls @AUDIO_FILES/2024-01-26').collect()

[Row(name='audio_files/2024-01-26/Health-Insurance-1.mp3', size=587565, md5='889b7aa9c9c09f78b376e04ca53585c3', last_modified='Fri, 21 Jun 2024 16:57:59 GMT'),
 Row(name='audio_files/2024-01-26/Sample_ATT_Inbound_Call-MONO_47sec.mp3', size=335727, md5='273123d02e1082fba024ba4429b3f03f', last_modified='Fri, 21 Jun 2024 16:58:00 GMT'),
 Row(name='audio_files/2024-01-26/common_voice_de_37888599.mp3', size=63333, md5='cdc9ef276ed8a5ae9378795bd5ada737', last_modified='Fri, 21 Jun 2024 16:57:59 GMT'),
 Row(name='audio_files/2024-01-26/common_voice_de_37942822.mp3', size=63333, md5='68b5c52ffa45c560f851f77bb237372f', last_modified='Fri, 21 Jun 2024 16:58:00 GMT'),
 Row(name='audio_files/2024-01-26/harvard (1).wav', size=3249924, md5='0547986abb83074dc44469b94167f629', last_modified='Fri, 21 Jun 2024 16:58:00 GMT'),
 Row(name='audio_files/2024-01-26/jackhammer.wav', size=600204, md5='8ed1a3f104be95530dbace9fea26eca9', last_modified='Fri, 21 Jun 2024 16:57:59 GMT')]

In [49]:
# Insert records into the RAW table.
# To get different DATETIME values, store your audio files in subfolders named in yyyy-mm-dd format,
# e.g. 2024-01-10. Files at the stage root get the current date instead.
session.sql('''
INSERT INTO ALL_CLAIMS_RAW
(
DATETIME,
AUDIOFILE,
PRESIGNED_URL_PATH,
CONVERSATION,
DURATION
)
SELECT CAST(CASE WHEN split(RELATIVE_PATH,'/')[1]::string IS NULL THEN GETDATE() 
            ELSE split(RELATIVE_PATH,'/')[0]::string END AS DATE) as Datetime, 
        CASE WHEN split(RELATIVE_PATH,'/')[1]::string is null then split(RELATIVE_PATH,'/')[0]::string 
            ELSE split(RELATIVE_PATH,'/')[1]::string END as RELATIVE_PATH,
       GET_PRESIGNED_URL('@AUDIO_FILES', RELATIVE_PATH) AS PRESIGNED_URL
       -- ,DETECT_LANGUAGE(PRESIGNED_URL,TRUE) as DETECT_LANGUAGE
       ,TRANSCRIBE('transcribe','',PRESIGNED_URL,True)['text']::string AS EXTRACTED_TEXT
       ,DURATION(PRESIGNED_URL):call_duration_seconds::DOUBLE as CALL_DURATION_SECONDS
FROM DIRECTORY('@AUDIO_FILES')
            
            ''').collect()

[Row(number of rows inserted=6)]
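The two `CASE` expressions in the statement above derive the date and file name from `RELATIVE_PATH`. A plain-Python rendering of the same rule may make it easier to follow; `split_stage_path` is a hypothetical helper, and the `today` parameter stands in for `GETDATE()`.

```python
from datetime import date


def split_stage_path(relative_path, today=None):
    """Mirror the CASE logic: return (date, file name) for a stage path."""
    parts = relative_path.split("/")
    if len(parts) < 2:
        # No subfolder: fall back to the current date, path is the file name.
        return (today or date.today()), parts[0]
    # First segment is the yyyy-mm-dd folder, second is the file name.
    return date.fromisoformat(parts[0]), parts[1]


if __name__ == "__main__":
    print(split_stage_path("2024-01-26/jackhammer.wav"))
    print(split_stage_path("jackhammer.wav"))
```

As in the SQL, a folder name that is not a valid date raises an error rather than being silently accepted.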

In [50]:
session.table('ALL_CLAIMS_RAW').to_pandas()

Unnamed: 0,DATETIME,AUDIOFILE,CONVERSATION,PRESIGNED_URL_PATH,DURATION
0,2024-01-26,Health-Insurance-1.mp3,"Hi, good morning. Looking for Charles Clough?...",https://sfc-prod3-ds1-customer-stage.s3.us-wes...,146.808
1,2024-01-26,Sample_ATT_Inbound_Call-MONO_47sec.mp3,Thanks for calling AT&T. My name is Erica. Ca...,https://sfc-prod3-ds1-customer-stage.s3.us-wes...,47.952
2,2024-01-26,common_voice_de_37888599.mp3,Wie so das große Bestände stehen im Hochland-...,https://sfc-prod3-ds1-customer-stage.s3.us-wes...,10.548
3,2024-01-26,common_voice_de_37942822.mp3,Der Name leitet sich aus den englischen Worte...,https://sfc-prod3-ds1-customer-stage.s3.us-wes...,10.548
4,2024-01-26,harvard (1).wav,The stale smell of old beer lingers. It takes...,https://sfc-prod3-ds1-customer-stage.s3.us-wes...,18.35619
5,2024-01-26,jackhammer.wav,The still smell of old beer lingers.,https://sfc-prod3-ds1-customer-stage.s3.us-wes...,3.346712


### 6. Loading Data into the ALL_CLAIMS_RAW Table from CSV

Since we don't have many audio files from the insurance industry, we will load sample data containing raw insurance conversations into the raw table. This data will be the source for this solution.

In [51]:
_ = session.file.put("./Sample_Audio_Text.csv", "@CSV_FILES", auto_compress=False)

sp_df=session.read.options({"INFER_SCHEMA":True,"PARSE_HEADER":True,"FIELD_OPTIONALLY_ENCLOSED_BY":'"'}).csv('@CSV_FILES/Sample_Audio_Text.csv')

# sp_df = session.read.option("INFER_SCHEMA", True).option("PARSE_HEADER", True).option("FIELD_OPTIONALLY_ENCLOSED_BY",'"').csv("@CSV_FILES/Sample_Audio_Text.csv")

In [52]:
sp_df.write.mode("overwrite").save_as_table("ALL_CLAIMS_RAW")
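Because `mode("overwrite")` replaces the table, including the rows inserted in step 5, it may be worth checking the CSV header before writing. The sketch below compares a header against the columns of `ALL_CLAIMS_RAW`; the helper name `header_mismatch` is hypothetical, and it assumes the CSV uses the table's column names.

```python
import csv
import io

# Columns of ALL_CLAIMS_RAW as created in step 5.
EXPECTED_COLUMNS = ["DATETIME", "AUDIOFILE", "CONVERSATION",
                    "PRESIGNED_URL_PATH", "DURATION"]


def header_mismatch(csv_text, expected=EXPECTED_COLUMNS):
    """Return (missing, extra) column names found in the CSV header row."""
    header = next(csv.reader(io.StringIO(csv_text)), [])
    found = [column.strip().upper() for column in header]
    missing = [column for column in expected if column not in found]
    extra = [column for column in found if column not in expected]
    return missing, extra


if __name__ == "__main__":
    sample = 'DATETIME,AUDIOFILE,CONVERSATION,PRESIGNED_URL_PATH,DURATION\n'
    print(header_mismatch(sample))
```

Reading the first line of the local `Sample_Audio_Text.csv` and passing it in would be enough to run the check.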

In [53]:
session.table('ALL_CLAIMS_RAW').to_pandas()

Unnamed: 0,DATETIME,AUDIOFILE,CONVERSATION,PRESIGNED_URL_PATH,DURATION
0,2023-11-11,audiofile1.mp3,"Hello, this is Emily from AutoAssure Insurance...",https://sfc-prod3-ds1-16-customer-stage.s3.us-...,218.254271
1,2023-11-11,audiofile2.mp3,"Hello, this is James from AutoAssure Insurance...",https://sfc-prod3-ds1-16-customer-stage.s3.us-...,197.705486
2,2023-11-11,audiofile3.mp3,"Hello, this is Sarah from AutoAssure Insurance...",https://sfc-prod3-ds1-16-customer-stage.s3.us-...,75.172382
3,2023-11-11,audiofile4.mp3,"Hello, this is Kevin from AutoAssure Insurance...",https://sfc-prod3-ds1-16-customer-stage.s3.us-...,224.291618
4,2023-11-15,audiofile5.mp3,"Hello, this is Olivia from AutoAssure Insuranc...",https://sfc-prod3-ds1-16-customer-stage.s3.us-...,174.649442
...,...,...,...,...,...
91,2024-01-05,audiofile92.mp3,"Good morning, this is Sara at AutoAssure Insur...",https://sfc-prod3-ds1-16-customer-stage.s3.us-...,102.048162
92,2024-01-05,audiofile93.mp3,"Hello, I'm Michael with AutoAssure Insurance, ...",https://sfc-prod3-ds1-16-customer-stage.s3.us-...,71.842008
93,2024-01-10,audiofile95.wav,"Welcome to AutoAssure Insurance, this is Josh....",https://sfc-prod3-ds1-16-customer-stage.s3.us-...,84.340859
94,2024-01-10,audiofile97.wav,"Hi, this is Ethan from AutoAssure Insurance. W...",https://sfc-prod3-ds1-16-customer-stage.s3.us-...,195.237454
