### 1. Load requirements into conda 

> <b>In a new terminal, set your working directory to "2. Audio2Text" and run the following commands:</b>

``` bash
conda create --name demosnowparkdemo --override-channels -c https://repo.anaconda.com/pkgs/snowflake python=3.8 
conda activate demosnowparkdemo
conda install snowflake-snowpark-python pandas pyarrow streamlit
pip install scikit-learn==1.1.1
```
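
Before moving on, it can help to confirm that the new environment actually exposes the packages installed above. The following is a minimal sketch, assuming the usual import names for those packages (`sklearn` for scikit-learn, `snowflake.snowpark` for snowflake-snowpark-python); the helper name `missing_modules` is hypothetical.

```python
import importlib.util

# Import names assumed for the packages installed above
# (scikit-learn is imported as "sklearn").
REQUIRED_MODULES = ["snowflake.snowpark", "pandas", "pyarrow", "streamlit", "sklearn"]


def missing_modules(module_names):
    """Return the module names that cannot be found in the active environment."""
    missing = []
    for name in module_names:
        try:
            found = importlib.util.find_spec(name) is not None
        except ImportError:
            # Raised when a parent package (e.g. "snowflake") is absent.
            found = False
        if not found:
            missing.append(name)
    return missing


if __name__ == "__main__":
    print("Missing modules:", missing_modules(REQUIRED_MODULES) or "none")
```

Run it with the `demosnowparkdemo` environment activated; an empty result suggests the installs succeeded.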

### 2. Build the Docker image and push it to the image registry (Docker must be running before you proceed)


> <b>Run the following commands from a terminal with the working directory set to "2. Audio2Text". Make sure Docker is running on your laptop. Replace ORGNAME-ACCTNAME in the commands with your Snowflake organization and account names, and supply the correct username (you will be prompted for your password, so have it handy).</b>

``` bash
# Refer to "2. Audio2Text\Dockerfile" for image details

docker build --no-cache --platform linux/amd64 -t ORGNAME-ACCTNAME.registry.snowflakecomputing.com/llmdemo/public/images/whisper-audio2text:latest . 

docker push ORGNAME-ACCTNAME.registry.snowflakecomputing.com/llmdemo/public/images/whisper-audio2text:latest
```
> <b>*** Allow docker push to complete before moving on (takes several minutes, so now is a good time for a bio break) ***</b>
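
The image name passed to `docker build` and `docker push` above is a registry host followed by a repository path, an image name, and a tag. As a rough sketch of how that name is assembled once ORGNAME-ACCTNAME is replaced, the hypothetical helper `image_url` below builds it from its parts. Lowercasing the account part is an assumption made here for safety, not something the commands above require.

```python
def image_url(org_account, repo_path, image, tag="latest"):
    """Build the fully qualified image name used by docker build and docker push."""
    # Assumption: the registry host is written in lowercase.
    host = f"{org_account.lower()}.registry.snowflakecomputing.com"
    return f"{host}/{repo_path.strip('/')}/{image}:{tag}"


# "MYORG-MYACCOUNT" is a placeholder; the path and image name come from the commands above.
url = image_url("MYORG-MYACCOUNT", "llmdemo/public/images", "whisper-audio2text")
print(url)
```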

### 3. Create internal stages to hold our images and files 

In [None]:
import json
from snowflake.snowpark.session import Session
import snowflake.snowpark.functions as F

In [None]:
# The connection.json file should use the SPCS_PSE_ROLE role

with open('../connection.json') as f:
    connection_parameters = json.load(f)
session = Session.builder.configs(connection_parameters).create()
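
The contents of `connection.json` are not shown in this notebook. As a hedged sketch, the check below assumes the file holds the common Snowpark connection keys (`account`, `user`, `password`, `role`, `warehouse`, `database`, `schema`) and verifies that the role is SPCS_PSE_ROLE. The helper name and the example values are hypothetical.

```python
import json

# Keys this sketch assumes connection.json provides; adjust to match your file.
EXPECTED_KEYS = ("account", "user", "password", "role", "warehouse", "database", "schema")


def check_connection_parameters(params, expected_role="SPCS_PSE_ROLE"):
    """Return a list of human-readable problems; an empty list means the check passed."""
    problems = [f"missing key: {key}" for key in EXPECTED_KEYS if not params.get(key)]
    role = str(params.get("role", ""))
    if role and role.upper() != expected_role.upper():
        problems.append(f"role is {role}, expected {expected_role}")
    return problems


# Hypothetical example values, not real credentials.
example = json.loads("""
{"account": "myorg-myaccount", "user": "demo_user", "password": "***",
 "role": "SPCS_PSE_ROLE", "warehouse": "DEMO_WH", "database": "LLMDEMO", "schema": "PUBLIC"}
""")
print(check_connection_parameters(example))
```

In the notebook, the same function could be called on `connection_parameters` before the session is created.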

In [None]:
# Create stages for our files
stages=['WHISPER_APP','AUDIO_FILES','SPECS','CSV_FILES']
for stg in stages:
    session.sql(f'''
                CREATE OR REPLACE STAGE {stg} ENCRYPTION = (TYPE = 'SNOWFLAKE_SSE') 
                DIRECTORY = (ENABLE = TRUE);
                ''').collect()


### 4. Create the SPCS service
Update the image URL line in [whisper_spec.yml](./whisper_spec.yml) with your org/account info before executing the following put command.

ex. image: <i>myorg-myaccount</i>.registry.snowflakecomputing.com/pr_llmdemo/public/image_repo/whisper-audio2text:latest
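
If you prefer to script this edit rather than make it by hand, the sketch below rewrites the account part of any Snowflake registry host found in a piece of spec text. The function name `set_registry_account` is hypothetical, the sample line is modeled on the example above, and the rest of the spec file's layout is not assumed.

```python
import re

REGISTRY_SUFFIX = ".registry.snowflakecomputing.com"


def set_registry_account(spec_text, org_account):
    """Replace the account part of every Snowflake registry host in the spec text."""
    pattern = re.compile(r"[A-Za-z0-9_-]+" + re.escape(REGISTRY_SUFFIX))
    # Assumption: the registry host is written in lowercase.
    return pattern.sub(org_account.lower() + REGISTRY_SUFFIX, spec_text)


# One-line sample modeled on the example image line; not the full spec file.
sample = (
    "image: ORGNAME-ACCTNAME.registry.snowflakecomputing.com"
    "/pr_llmdemo/public/image_repo/whisper-audio2text:latest\n"
)
print(set_registry_account(sample, "myorg-myaccount"), end="")
```

To apply it to the real file, read `whisper_spec.yml` into a string, pass it through the function, and write the result back before running the put command.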

Note: <b>Run the following commands using the SPCS role, not ACCOUNTADMIN (this should already be set in your [connection.json](../connection.json) file).</b>


In [None]:

session.file.put("./whisper_spec.yml", "@specs",auto_compress=False)

In [None]:
# Create the service
session.sql('''
CREATE SERVICE IF NOT EXISTS Whisper_Audio_text_SVC
  IN COMPUTE POOL PR_GPU_S
  FROM @specs
  SPEC='whisper_spec.yml'
  EXTERNAL_ACCESS_INTEGRATIONS = (ALLOW_ALL_EAI)
  MIN_INSTANCES=1
  MAX_INSTANCES=1;
            ''').collect()

> The service must be in the READY state before you proceed. Run the following command to confirm.

In [None]:
# Service activation takes several minutes; rerun this cell until the status is READY (patience is a virtue)
import json
res = session.sql('''
SELECT SYSTEM$GET_SERVICE_STATUS('Whisper_Audio_text_SVC', 1)
''').collect()[0][0]
# The status is returned as a JSON array with one object per container
json.loads(res)[0]
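
Instead of rerunning the cell by hand, the check can be wrapped in a small polling loop. The sketch below is one possible approach, not part of the original walkthrough: `wait_until_ready` is a hypothetical helper that takes any zero-argument callable returning the JSON string from `SYSTEM$GET_SERVICE_STATUS`, so it can be tried without a live session. The timeout and interval defaults are arbitrary choices.

```python
import json
import time


def wait_until_ready(fetch_status, timeout_s=600, interval_s=15, sleep=time.sleep):
    """Poll fetch_status() until every container reports READY or the timeout expires.

    fetch_status is any zero-argument callable that returns the JSON string
    produced by SYSTEM$GET_SERVICE_STATUS.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        containers = json.loads(fetch_status())
        if containers and all(c.get("status") == "READY" for c in containers):
            return containers
        if time.monotonic() >= deadline:
            raise TimeoutError(f"service not READY after {timeout_s}s: {containers}")
        sleep(interval_s)


# In the notebook, the callable could wrap the query shown above, for example:
# wait_until_ready(lambda: session.sql(
#     "SELECT SYSTEM$GET_SERVICE_STATUS('Whisper_Audio_text_SVC', 1)").collect()[0][0])
```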

> Confirm that the service is in the READY state before proceeding.

In [None]:
# Check the service log for any errors.
session.sql('''SELECT value AS log_line
FROM TABLE(
 SPLIT_TO_TABLE(SYSTEM$GET_SERVICE_LOGS('Whisper_Audio_text_SVC', 0, 'audio-whisper-app'), '\n')
  )''').to_pandas()


### 5. Create the service functions

In [None]:
# Function to get the duration of an audio file
session.sql('''CREATE OR REPLACE FUNCTION DURATION(AUDIO_FILE TEXT)
RETURNS VARIANT
SERVICE=Whisper_Audio_text_SVC
ENDPOINT=API
AS '/audio-duration'
            ''').collect()


In [None]:
# Function to transcribe the audio files
session.sql('''CREATE OR REPLACE FUNCTION TRANSCRIBE(TASK TEXT, LANGUAGE TEXT, AUDIO_FILE TEXT, ENCODE BOOLEAN)
RETURNS VARIANT
SERVICE=Whisper_Audio_text_SVC
ENDPOINT=API
AS '/asr'
            ''').collect()

In [None]:
# Function to detect language of the audio file
session.sql('''CREATE OR REPLACE FUNCTION DETECT_LANGUAGE(AUDIO_FILE TEXT, ENCODE BOOLEAN)
RETURNS VARIANT
SERVICE=Whisper_Audio_text_SVC
ENDPOINT=API
AS '/detect-language'
            ''').collect()
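
Each of the functions above forwards its arguments to an HTTP endpoint inside the container. Service functions batch rows into a JSON body of the form `{"data": [[row_index, arg1, arg2, ...], ...]}` and expect `{"data": [[row_index, result], ...]}` back. The container code for this demo is not shown here, so the following is only an illustrative sketch of that exchange: `handle_batch`, `fake_detect_language`, and the returned keys are hypothetical.

```python
def handle_batch(payload, row_fn):
    """Apply row_fn to each row of a service-function request and build the response body."""
    results = []
    for row in payload["data"]:
        # The first element is the row index; the rest are the SQL arguments.
        row_index, *args = row
        results.append([row_index, row_fn(*args)])
    return {"data": results}


# Hypothetical stand-in for the /detect-language handler; its arguments mirror
# DETECT_LANGUAGE(AUDIO_FILE TEXT, ENCODE BOOLEAN). The returned keys are made up.
def fake_detect_language(audio_file, encode):
    return {"detected_language": "english", "source": audio_file}


request_body = {
    "data": [
        [0, "https://example.com/a.wav", True],
        [1, "https://example.com/b.wav", True],
    ]
}
response_body = handle_batch(request_body, fake_detect_language)
print(response_body)
```

Because each result is returned as a single value per row, declaring the functions with `RETURNS VARIANT` lets the SQL side pick fields out of a JSON result.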

In [None]:
# Create the table that holds the raw transcript of each audio file, along with its duration and other attributes

# DURATION is in seconds

session.sql('''
    CREATE or REPLACE TABLE ALL_CLAIMS_RAW (
	DATETIME DATE,
	AUDIOFILE VARCHAR(16777216),
	CONVERSATION VARCHAR(16777216),
	PRESIGNED_URL_PATH VARCHAR(16777216),
	DURATION FLOAT NOT NULL
)''').collect()


In [None]:
# Upload the audio files to the internal stage

_ = session.file.put("./audiofiles/*.*", "@AUDIO_FILES/2024-01-26/", auto_compress=False, overwrite=True)

# Refresh the stage's directory table so the new files are listed
session.sql('ALTER STAGE AUDIO_FILES REFRESH').collect()


In [None]:
session.sql('ls @AUDIO_FILES/2024-01-26').collect()

In [None]:
# Insert records into the raw table
# To get different DATETIME values, store your audio files in subfolders named in yyyy-mm-dd format,
# e.g. 2024-01-10.
session.sql('''
INSERT INTO ALL_CLAIMS_RAW
(
DATETIME,
AUDIOFILE,
PRESIGNED_URL_PATH,
CONVERSATION,
DURATION
)
SELECT CAST(CASE WHEN split(RELATIVE_PATH,'/')[1]::string IS NULL THEN GETDATE() 
            ELSE split(RELATIVE_PATH,'/')[0]::string END AS DATE) as Datetime, 
        CASE WHEN split(RELATIVE_PATH,'/')[1]::string is null then split(RELATIVE_PATH,'/')[0]::string 
            ELSE split(RELATIVE_PATH,'/')[1]::string END as RELATIVE_PATH,
       GET_PRESIGNED_URL('@AUDIO_FILES', RELATIVE_PATH) AS PRESIGNED_URL
       -- ,DETECT_LANGUAGE(PRESIGNED_URL,TRUE) as DETECT_LANGUAGE
       ,TRANSCRIBE('transcribe','',PRESIGNED_URL,True)['text']::string AS EXTRACTED_TEXT
       ,DURATION(PRESIGNED_URL):call_duration_seconds::DOUBLE as CALL_DURATION_SECONDS
FROM DIRECTORY('@AUDIO_FILES')
            
            ''').collect()
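
The two CASE expressions in the query above derive the date and the file name from each file's RELATIVE_PATH. The same rule, restated as a small Python sketch (the function name and the sample file name are hypothetical):

```python
from datetime import date


def split_stage_path(relative_path, today=None):
    """Mirror the CASE expressions in the INSERT: return (call_date, file_name)."""
    parts = relative_path.split("/")
    if len(parts) < 2:
        # No subfolder: fall back to today's date, as GETDATE() does in the SQL.
        return (today or date.today()), parts[0]
    # The first path segment is the date folder, the second is the file name.
    return date.fromisoformat(parts[0]), parts[1]


print(split_stage_path("2024-01-26/call_001.wav"))
```

Like the SQL, this looks only at the first two path segments, so files nested more deeply than one date folder would not be named correctly.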

In [None]:
session.table('ALL_CLAIMS_RAW').to_pandas()

### 6. Loading Data into the ALL_CLAIMS_RAW Table from CSV

Since we don't have many audio files from the insurance industry, we load sample data containing raw insurance conversations into the raw table. This data is the source for the rest of the solution.

In [None]:
_ = session.file.put("./Sample_Audio_Text.csv", "@CSV_FILES", auto_compress=False)

sp_df=session.read.options({"INFER_SCHEMA":True,"PARSE_HEADER":True,"FIELD_OPTIONALLY_ENCLOSED_BY":'"'}).csv('@CSV_FILES/Sample_Audio_Text.csv')

In [None]:
sp_df.write.mode("overwrite").save_as_table("ALL_CLAIMS_RAW")
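
Because the write above uses `overwrite` mode, the table ends up with whatever columns are inferred from the CSV, so it may be worth checking the file's header against the columns defined for ALL_CLAIMS_RAW earlier. The sketch below is one way to do that locally. The helper name and the two-line sample are hypothetical, and the real `Sample_Audio_Text.csv` header may differ.

```python
import csv
import io

# Column names of ALL_CLAIMS_RAW as created earlier in this notebook.
TABLE_COLUMNS = ["DATETIME", "AUDIOFILE", "CONVERSATION", "PRESIGNED_URL_PATH", "DURATION"]


def header_mismatch(csv_text, expected=TABLE_COLUMNS):
    """Return (missing, unexpected) column names, compared case-insensitively."""
    header = next(csv.reader(io.StringIO(csv_text)))
    found = [name.strip().upper() for name in header]
    missing = [c for c in expected if c not in found]
    unexpected = [c for c in found if c not in expected]
    return missing, unexpected


# Hypothetical two-line sample; the real Sample_Audio_Text.csv may differ.
sample_csv = (
    "DATETIME,AUDIOFILE,CONVERSATION,PRESIGNED_URL_PATH,DURATION\n"
    '2024-01-10,a.wav,"Hello, I need to file a claim",,42.5\n'
)
print(header_mismatch(sample_csv))
```

To check the real file, pass the contents of `./Sample_Audio_Text.csv` as `csv_text`.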

In [None]:
session.table('ALL_CLAIMS_RAW').to_pandas()