# Translate multilingual customer reviews using Snowflake Cortex

This template guides you through translating a sample batch of multilingual customer reviews using Snowflake Cortex in Python and SQL. 

### Context
*Tasty Bytes* is a global food truck network operating in 15 countries with a fleet of 450 trucks. Customer reviews of its food trucks come in from multiple sources and span multiple languages.

In this notebook, we use **Cortex Translate** to translate these collated customer reviews so that we can understand what our international customers are saying.


### Import sample data

The next SQL cell creates a file format, a stage, and the `truck_reviews` table, then loads the sample data used in this template.

In [None]:
USE ROLE SNOWFLAKE_LEARNING_ROLE;

-- use the existing database, schema and warehouse
USE DATABASE SNOWFLAKE_LEARNING_DB;

SET schema_name = CONCAT(current_user(), '_TRANSLATE_MULTILINGUAL_CUSTOMER_REVIEWS');
USE SCHEMA IDENTIFIER($schema_name);

  /*--
  • file format and stage creation
  --*/

  CREATE OR REPLACE FILE FORMAT csv_ff 
    TYPE = 'csv';

  CREATE OR REPLACE STAGE s3load
    COMMENT = 'Quickstarts S3 Stage Connection'
    URL = 's3://sfquickstarts/tastybytes-voc/'
    FILE_FORMAT = csv_ff;


  /*--
  • raw zone table build 
  --*/

  -- truck_reviews table
  CREATE OR REPLACE TABLE truck_reviews
  (
      order_id NUMBER(38,0),
      language VARCHAR(16777216),
      source VARCHAR(16777216),
      review VARCHAR(16777216),
      review_id NUMBER(18,0)
  );
  
  /*--
  • raw zone table load 
  --*/
  
  -- truck_reviews table load
  COPY INTO truck_reviews
  FROM @s3load/raw_support/truck_reviews/;

-- setup completion note
SELECT 'Setup is complete' AS note;

**Import Python packages**

Snowflake Notebooks include Streamlit and the third-party packages listed in the Snowflake Anaconda channel.

With the required packages available in the notebook, we import them and get the active Snowpark session.

In [None]:
# Import Python packages
import streamlit as st
import pandas as pd

# Snowpark
from snowflake.snowpark.context import get_active_session
import snowflake.snowpark.functions as F
from snowflake.snowpark.functions import when, date_part, col

# Cortex Functions
import snowflake.cortex as cortex

session = get_active_session()

### Let's preview the reviews
In the next Python cell, we preview the data, keeping only the non-English reviews.

In [None]:
reviews_df = (
    session.table('TRUCK_REVIEWS')
    .filter(col('LANGUAGE') != 'en')
)

reviews_df.select("LANGUAGE","REVIEW").show(20, max_width=125)
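
One detail worth keeping in mind: the filter above becomes a SQL `WHERE` condition, and in SQL a comparison against `NULL` is neither true nor false, so rows with a missing language are dropped along with the English ones. The comparison is also case-sensitive. The sketch below illustrates both points using Python's built-in `sqlite3` module as a local stand-in for Snowflake. The rows are invented for illustration and are not taken from the Tasty Bytes dataset.

```python
import sqlite3

# In-memory SQLite database used only as a local stand-in for Snowflake.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE truck_reviews (language TEXT, review TEXT)")
conn.executemany(
    "INSERT INTO truck_reviews VALUES (?, ?)",
    [
        ("en", "Great tacos"),
        ("fr", "Très bon"),
        ("EN", "Loved it"),    # upper-case code: not equal to 'en'
        (None, "Sin idioma"),  # missing language
    ],
)

# Same condition as the Snowpark filter: language != 'en'.
non_english = conn.execute(
    "SELECT language, review FROM truck_reviews "
    "WHERE language != 'en' ORDER BY review"
).fetchall()
print(non_english)

# Rows with a NULL language need an explicit IS NULL check to be found.
missing_language = conn.execute(
    "SELECT COUNT(*) FROM truck_reviews WHERE language IS NULL"
).fetchone()[0]
print(missing_language)

conn.close()
```

The row with a missing language does not appear in `non_english`, while the upper-case `EN` row does. If the real data could contain either case, it may be worth checking for it before translating.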

### Use Cortex Translate on a Python dataframe

In the next cell, we use **Translate**, one of the **Snowflake Cortex specialized LLM functions** available in Snowpark, to translate the multilingual reviews into English. This makes the reviews easier to analyze for anyone who doesn't speak the language of the original review.

In [None]:
# Conditionally translate reviews that are not English using Cortex Translate
reviews_df = reviews_df.withColumn(
    'TRANSLATED_REVIEW',
    when(
        F.col('LANGUAGE') != F.lit("en"),
        cortex.translate(F.col('REVIEW'), F.col('LANGUAGE'), "en"),
    ).otherwise(F.col('REVIEW')),
)

(
    reviews_df.filter(F.col('LANGUAGE') != F.lit("en"))
    .select(["REVIEW", "LANGUAGE", "TRANSLATED_REVIEW"])
    .show(20, max_width=75)
)
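
The `when(...).otherwise(...)` expression above follows a simple rule: call the translator only for rows whose language differs from the target, and pass all other rows through unchanged. As a minimal plain-Python sketch of that rule, assuming a stand-in translator (`fake_translate` and `add_translated_review` are hypothetical names, `fake_translate` does not call Cortex, and the sample rows are invented):

```python
from typing import Callable, Dict, List

Row = Dict[str, str]


def add_translated_review(
    rows: List[Row],
    translate_fn: Callable[[str, str, str], str],
    target_language: str = "en",
) -> List[Row]:
    """Return copies of rows with a TRANSLATED_REVIEW key added.

    Sketch of when(LANGUAGE != target, translate(...)).otherwise(REVIEW).
    Assumes every row has a non-missing LANGUAGE value.
    """
    result = []
    for row in rows:
        if row["LANGUAGE"] != target_language:
            translated = translate_fn(
                row["REVIEW"], row["LANGUAGE"], target_language
            )
        else:
            # Already in the target language: keep the original text.
            translated = row["REVIEW"]
        result.append({**row, "TRANSLATED_REVIEW": translated})
    return result


calls = []  # records which source languages reached the translator


def fake_translate(text: str, source_language: str, target_language: str) -> str:
    """Hypothetical stand-in for a real translation call."""
    calls.append(source_language)
    return f"[{source_language}->{target_language}] {text}"


sample_rows = [
    {"LANGUAGE": "fr", "REVIEW": "Très bon"},
    {"LANGUAGE": "en", "REVIEW": "Great tacos"},
]
translated_rows = add_translated_review(sample_rows, fake_translate)
for row in translated_rows:
    print(row["LANGUAGE"], "|", row["TRANSLATED_REVIEW"])
```

Only the French row reaches the translator. The conditional form pays off when a dataframe mixes English and non-English reviews, since English text is kept as-is rather than sent for translation.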

## Using Cortex Translate in SQL
Translation can also be done in SQL by calling `SNOWFLAKE.CORTEX.TRANSLATE`. In this query, we translate non-English reviews directly into a new `TRANSLATED_REVIEW` column.

In [None]:
SELECT 
  REVIEW,
  LANGUAGE,
  CASE 
    WHEN LANGUAGE != 'en' THEN SNOWFLAKE.CORTEX.TRANSLATE(REVIEW, LANGUAGE, 'en')
    ELSE REVIEW
  END AS TRANSLATED_REVIEW
FROM TRUCK_REVIEWS
WHERE LANGUAGE != 'en'
LIMIT 20;
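
If the same query is needed for other target languages, the SQL text can be generated from Python. The sketch below is one possible approach, not part of the template: `build_translate_query` is a hypothetical helper, and it assumes unquoted Snowflake identifiers and two-letter lower-case language codes like the ones used in this notebook.

```python
import re

# Assumptions of this sketch: unquoted identifiers and two-letter codes.
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_$]*$")
_LANGUAGE_CODE = re.compile(r"^[a-z]{2}$")


def build_translate_query(
    table: str,
    target_language: str = "en",
    text_col: str = "REVIEW",
    lang_col: str = "LANGUAGE",
    limit: int = 20,
) -> str:
    """Build the conditional-translation query for a given target language."""
    # Validate everything that is interpolated into the SQL text.
    for name in (table, text_col, lang_col):
        if not _IDENTIFIER.match(name):
            raise ValueError(f"unsafe identifier: {name!r}")
    if not _LANGUAGE_CODE.match(target_language):
        raise ValueError(f"unexpected language code: {target_language!r}")
    if not isinstance(limit, int) or limit < 1:
        raise ValueError("limit must be a positive integer")

    return (
        f"SELECT {text_col}, {lang_col},\n"
        f"  CASE\n"
        f"    WHEN {lang_col} != '{target_language}'\n"
        f"      THEN SNOWFLAKE.CORTEX.TRANSLATE("
        f"{text_col}, {lang_col}, '{target_language}')\n"
        f"    ELSE {text_col}\n"
        f"  END AS TRANSLATED_REVIEW\n"
        f"FROM {table}\n"
        f"WHERE {lang_col} != '{target_language}'\n"
        f"LIMIT {limit}"
    )


query = build_translate_query("TRUCK_REVIEWS", target_language="fr")
print(query)
```

In a notebook, the resulting string could then be run with `session.sql(query).show()`. The validation step matters because the values are interpolated directly into the SQL text.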

## Conclusion

In this template, we've demonstrated how to translate non-English customer reviews into English using Snowflake's Cortex Translate function. By following these steps, you now have a replicable process for:

- **Identifying non-English reviews:** Filtering and targeting the right data.
- **Applying conditional translation:** Using Snowpark's `when`/`otherwise` or SQL's `CASE` expression to translate only the non-English text.
- **Generating actionable insights:** Making multilingual data accessible for analysis.

To further enhance your understanding and explore more advanced use cases, continue learning about additional Cortex functions in the [Snowflake Cortex documentation](https://docs.snowflake.com/en/user-guide/snowflake-cortex/llm-functions).
