<div id="singlestore-header" style="display: flex; background-color: rgba(235, 249, 245, 0.25); padding: 5px;">
    <div id="icon-image" style="width: 90px; height: 90px;">
        <img width="100%" height="100%" src="https://raw.githubusercontent.com/singlestore-labs/spaces-notebooks/master/common/images/header-icons/browser.png" />
    </div>
    <div id="text" style="padding: 5px; margin-left: 10px;">
        <div id="badge" style="display: inline-block; background-color: rgba(0, 0, 0, 0.15); border-radius: 4px; padding: 4px 8px; align-items: center; margin-top: 6px; margin-bottom: -2px; font-size: 80%">SingleStore Notebooks</div>
        <h1 style="font-weight: 500; margin: 8px 0 0 4px;">Demonstrate some common AI function use cases</h1>
    </div>
</div>

<div class="alert alert-block alert-warning">
    <b class="fa fa-solid fa-exclamation-circle"></b>
    <div>
        <p><b>Note</b></p>
        <p>You can use your existing Standard or Premium workspace with this Notebook.</p>
    </div>
</div>


This feature is currently in **Private Preview**. Contact support@singlestore.com to confirm whether it can be enabled for your organization.

This Jupyter notebook will help you:
1. Load the Amazon Fine Foods Reviews dataset from Kaggle
2. Store the data in SingleStore
3. Demonstrate powerful AI Functions for text processing and analysis

**Prerequisites**: Ensure AI Functions are installed on your deployment (AI Services > AI & ML Functions).

## Create the reviews table

This setup creates a single reviews table whose columns mirror the Amazon Fine Foods Reviews dataset. Ensure you have selected a database and have permission to create and drop tables.

In [1]:
%%sql
CREATE DATABASE IF NOT EXISTS temp;
USE temp;

In [2]:
%%sql
DROP TABLE IF EXISTS reviews;

CREATE TABLE IF NOT EXISTS reviews (
    Id INT PRIMARY KEY,
    ProductId VARCHAR(20),
    UserId VARCHAR(50),
    ProfileName VARCHAR(255),
    HelpfulnessNumerator INT,
    HelpfulnessDenominator INT,
    Score INT,
    Time BIGINT,
    Summary TEXT,
    Text TEXT
);

## Install the required packages

In [3]:
!pip install -q httplib2 kagglehub pandas

Collecting kagglehub
  Downloading kagglehub-0.3.13-py3-none-any.whl.metadata (38 kB)
Downloading kagglehub-0.3.13-py3-none-any.whl (68 kB)
Installing collected packages: kagglehub
Successfully installed kagglehub-0.3.13


## Download and Load Dataset

In [4]:
import kagglehub
import pandas as pd

# Download the Amazon Fine Foods Reviews dataset from Kaggle
print("Downloading dataset from Kaggle...")
path = kagglehub.dataset_download("snap/amazon-fine-food-reviews")
print(f"Dataset downloaded to: {path}")

# Read the CSV file
df = pd.read_csv(f"{path}/Reviews.csv")

# Display dataset info
print(f"\nDataset shape: {df.shape}")
print(f"Columns: {list(df.columns)}")
print("\nFirst few rows:")
df.head()

Downloading dataset from Kaggle...
Downloading from https://www.kaggle.com/api/v1/datasets/download/snap/amazon-fine-food-reviews?dataset_version_number=2...
100%|██████████| 242M/242M [00:01<00:00, 164MB/s]
Extracting files...
Dataset downloaded to: /home/jovyan/.cache/kagglehub/datasets/snap/amazon-fine-food-reviews/versions/2

Dataset shape: (568454, 10)
Columns: ['Id', 'ProductId', 'UserId', 'ProfileName', 'HelpfulnessNumerator', 'HelpfulnessDenominator', 'Score', 'Time', 'Summary', 'Text']

First few rows:


Unnamed: 0,Id,ProductId,UserId,ProfileName,HelpfulnessNumerator,HelpfulnessDenominator,Score,Time,Summary,Text
0,1,B001E4KFG0,A3SGXH7AUHU8GW,delmartian,1,1,5,1303862400,Good Quality Dog Food,I have bought several of the Vitality canned d...
1,2,B00813GRG4,A1D87F6ZCVE5NK,dll pa,0,0,1,1346976000,Not as Advertised,Product arrived labeled as Jumbo Salted Peanut...
2,3,B000LQOCH0,ABXLMWJIXXAIN,"Natalia Corres ""Natalia Corres""",1,1,4,1219017600,"""Delight"" says it all",This is a confection that has been around a fe...
3,4,B000UA0QIQ,A395BORC6FGVXV,Karl,3,3,2,1307923200,Cough Medicine,If you are looking for the secret ingredient i...
4,5,B006K2ZZ7K,A1UQRSCLF8GW1T,"Michael D. Bigham ""M. Wassir""",0,0,5,1350777600,Great taffy,Great taffy at a great price. There was a wid...
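
The Time and Helpfulness columns in the preview above are easier to read once converted. The sketch below assumes Time holds Unix epoch seconds (the magnitudes suggest this) and that HelpfulnessNumerator and HelpfulnessDenominator count helpful votes and total votes. The helper names `review_date` and `helpfulness_ratio` are hypothetical and not part of the notebook.

```python
from datetime import datetime, timezone


def review_date(epoch_seconds):
    """Convert a Time value (assumed Unix seconds) to an ISO date in UTC."""
    return datetime.fromtimestamp(epoch_seconds, tz=timezone.utc).date().isoformat()


def helpfulness_ratio(numerator, denominator):
    """Return the share of helpful votes, or None when nobody voted."""
    if denominator == 0:
        return None
    return numerator / denominator


# Values copied from the first rows of the preview above.
print(review_date(1303862400))    # Time of review Id 1
print(helpfulness_ratio(1, 1))    # review Id 1
print(helpfulness_ratio(0, 0))    # review Id 2 has no votes
```

Returning None for reviews without votes keeps them distinct from reviews that were voted unhelpful, which would otherwise both appear as 0.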


## Load Data into SingleStore

In [5]:
import singlestoredb as s2

# Create SQLAlchemy engine instead of regular connection
engine = s2.create_engine()

# Take the first 10,000 reviews for demo purposes
sample_df = df.head(10000).copy()

print(f"Loading {len(sample_df)} reviews into SingleStore...")

# Write dataframe to SingleStore table using SQLAlchemy engine
sample_df.to_sql(
    'reviews',
    con=engine,  # Use engine instead of connection
    if_exists='append',
    index=False,
    chunksize=1000
)

print("Data loaded successfully!")

Loading 10000 reviews into SingleStore...
Data loaded successfully!
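
With `chunksize=1000`, pandas writes the rows in batches of that size rather than all at once. The sketch below only illustrates the batch arithmetic for the 10,000 rows loaded above. `chunk_bounds` is a hypothetical helper and does not reflect how pandas implements batching internally.

```python
def chunk_bounds(total_rows, chunksize):
    """Yield (start, stop) row offsets for each batch of at most chunksize rows."""
    for start in range(0, total_rows, chunksize):
        yield start, min(start + chunksize, total_rows)


batches = list(chunk_bounds(10000, 1000))
print(len(batches))               # number of batches written
print(batches[0], batches[-1])    # first and last row ranges
```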


## Verify Data Load

In [6]:
%%sql
-- Check the number of reviews loaded
SELECT COUNT(*) as total_reviews FROM reviews;

total_reviews
10000


## Sample Data Preview

In [7]:
%%sql
-- View sample reviews
SELECT Id, ProductId, Score, Summary, LEFT(Text, 100) as Review_Preview
FROM reviews
LIMIT 10;

Id,ProductId,Score,Summary,Review_Preview
24,B001GVISJM,5,Twizzlers,I love this candy. After weight watchers I had to cut back but still have a craving for it.
302,B001UJEN6C,5,Tested by a trucker,I drive OTR...Over the Road truck and this helps to keep me alert. It has no sugar high & no crash.
415,B003XV5LHK,5,Double the pleasure!,"If you like Oreo's or the Oreo cakester, then you've got to try the Double Stuff cakesters!!! These"
599,B000G6RYNE,5,These chips will make you fat,But you will enjoy ever step. I gained 5 lbs within a month of buying this 12 pack of full sized bag
651,B001EPPCNK,5,Bavarian Creme Flavor Oil,"We make up a coffee creamer with 1/4 part commercial coffee creamer, vanilla, butter creme, almond a"
710,B000G6MBX2,5,"Plocky's Tortilla Chips, Red Beans 'N Rice","We ordered these only because the Black Beans and Rice,which we really. really love, was out of stoc"
856,B0007NG568,2,Funny taste,"I can't eat these oats, they have a funny taste to them. My kids also think they taste funny. My h"
1068,B0017ZBPTW,3,Too pricey,"It's a good idea to have raw sugar type cubes, but this was 3x the cost of granulated raw sugar"
1295,B002WJYCR4,5,Deeee-licious!,OMG these things are delicious!@ I'm not a candy/sweets eater usually.. but ... OhMan! I liked them
1304,B001FA1L7U,5,The Best!,Not always available at our local stores. I love them. They are especially good watching football ga


## AI Functions Demonstrations

Now let's explore SingleStore AI Functions for text analysis and processing.
Ensure that AI Functions are enabled for your organization and that you can list the available functions.

In [8]:
%%sql
USE cluster;
SHOW functions;

Functions_in_cluster,Function Type,Definer,Data Format,Runtime Type,Link,Options
AI_CLASSIFY,External User Defined Function,cluster-admin@%,ROWDAT_1,Managed Service,,
AI_COMPLETE,External User Defined Function,cluster-admin@%,ROWDAT_1,Managed Service,,
AI_EXTRACT,External User Defined Function,cluster-admin@%,ROWDAT_1,Managed Service,,
AI_SENTIMENT,External User Defined Function,cluster-admin@%,ROWDAT_1,Managed Service,,
AI_SUMMARIZE,External User Defined Function,cluster-admin@%,ROWDAT_1,Managed Service,,
AI_TRANSLATE,External User Defined Function,cluster-admin@%,ROWDAT_1,Managed Service,,
EMBED_TEXT,External User Defined Function,cluster-admin@%,ROWDAT_1,Managed Service,,
ML_ANOMALY_DETECT,External User Defined Function,cluster-admin@%,ROWDAT_1,Managed Service,,
ML_CLASSIFY,External User Defined Function,cluster-admin@%,ROWDAT_1,Managed Service,,
VECTOR_SIMILARITY,External User Defined Function,cluster-admin@%,ROWDAT_1,Managed Service,,


In [9]:
%%sql
-- AI_COMPLETE: Ask general questions and get LLM-powered completions
SELECT cluster.AI_COMPLETE(
    'What is SingleStore?'
) AS completion;

completion
"SingleStore is a unified, cloud-native database designed for scalable data processing, analytics, and real-time applications. It combines transactional and analytical capabilities in one platform, allowing businesses to handle large volumes of data with high performance and low latency."


In [10]:
%%sql
-- AI_SENTIMENT: Analyze sentiment of customer reviews for a specific product
-- To analyze a different product, change the ProductId in the WHERE clause
-- Remember to qualify the table with the database name; in this example the database is 'temp'
SELECT
    Id,
    ProductId,
    Score,
    LEFT(Text, 80) as Review_Snippet,
    cluster.AI_SENTIMENT(Text) AS sentiment
FROM temp.reviews
WHERE ProductId = 'B000NY8ODS'
LIMIT 10;

Id,ProductId,Score,Review_Snippet,sentiment
187,B000NY8ODS,5,This packet of glaze is the secret to making those European style fresh fruit ta,"{  ""score"": 0.8,  ""sentiment"": ""positive"" }"
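
The sentiment column above comes back as a JSON string, so it usually needs parsing before it can be filtered or aggregated in Python. This is a minimal sketch, assuming the result always carries the `score` and `sentiment` keys seen in the output. `parse_sentiment` is a hypothetical helper name.

```python
import json


def parse_sentiment(raw):
    """Return (label, score) from an AI_SENTIMENT result string."""
    data = json.loads(raw)
    return data["sentiment"], float(data["score"])


# The value returned for review Id 187 above (CSV quoting removed).
raw = '{  "score": 0.8,  "sentiment": "positive" }'
label, score = parse_sentiment(raw)
print(label, score)
```

Parsing with `json.loads` rather than string matching tolerates the varying whitespace and key order that appear across results.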


In [11]:
%%sql
-- Aggregate sentiment analysis across products
-- The first CTE keeps products with at least 5 reviews, then samples 100 rows,
-- so review_count reflects the sample and can be lower than 5
WITH filtered_reviews AS (
    SELECT
        ProductId,
        Text
    FROM temp.reviews
    WHERE ProductId IN (
        SELECT ProductId
        FROM temp.reviews
        GROUP BY ProductId
        HAVING COUNT(*) >= 5
    )
    LIMIT 100
),
grouped_reviews AS (
    SELECT
        ProductId,
        COUNT(*) as review_count,
        GROUP_CONCAT(Text SEPARATOR '. ') as combined_text
    FROM filtered_reviews
    GROUP BY ProductId
    LIMIT 5
)
SELECT
    ProductId,
    review_count,
    cluster.AI_SENTIMENT(combined_text) as overall_sentiment
FROM grouped_reviews;

ProductId,review_count,overall_sentiment
B001L1DYAA,1,"{""score"": 0.8, ""sentiment"": ""positive""}"
B004391DK0,3,"{""score"": 0.85, ""sentiment"": ""positive""}"
B0016FY6H6,1,"{  ""score"": 0.6,  ""sentiment"": ""positive"" }"
B00139TT72,2,"{  ""sentiment"": ""neutral"",  ""score"": 0.1 }"
B000E7WM0K,2,"{  ""score"": 0.4,  ""sentiment"": ""positive"" }"


In [12]:
%%sql
-- AI_SUMMARIZE: Create concise summaries of lengthy reviews
-- Filter long reviews first using CTE
WITH long_reviews AS (
    SELECT
        Id,
        ProductId,
        Text,
        LEFT(Text, 150) as Original_Review
    FROM temp.reviews
    WHERE LENGTH(Text) > 200
    LIMIT 5
)
SELECT
    Id,
    ProductId,
    Original_Review,
    cluster.AI_SUMMARIZE(
        Text,
        'aifunctions_chat_default',
        15
    ) AS summary
FROM long_reviews;

Id,ProductId,Original_Review,summary
121,B003SE19UK,"I have done a lot of research to find the best food for my cat, and this is an excellent food. That is also according to my holistic veterinarian. T","Excellent probiotic food; cat loved it immediately, approved by holistic veterinarian."
187,B000NY8ODS,"This packet of glaze is the secret to making those European style fresh fruit tarts. I am about to make one for a pie auction at church, after a frien","European-style fruit tarts use glaze for fresh fruit, hold shape, and enhance appearance."
498,B000G6RYNE,"Kettle Chips Spicy Thai potato chips have the perfect amount of sweet, savory and spicy-- everything you'd want in a good meal or better yet...a potat","Kettle Chips Spicy Thai offers sweet, tangy, mildly spicy crunch with no artificial ingredients."
690,B000G6MBX2,My aunt gave me a bag of these and I was immediately addicted. There are actual pieces of beans in the chips and they are not overly salty like other,"Bean chips with real beans, not too salty, no trans fat, delicious and addictive."
795,B00285FF6O,"The best chocolate in the world, in this critic's humble opinion, is made in the United States. And the best chocolate in the United States is made in","Ghirardelli in California makes the best U.S. chocolate, widely available and exquisite."


In [13]:
%%sql
-- AI_CLASSIFY: Classify customer feedback into categories
-- Filter lower-rated reviews (Score <= 3) first using a CTE
WITH negative_reviews AS (
    SELECT
        Id,
        ProductId,
        Text,
        LEFT(Text, 100) as Review_Text
    FROM temp.reviews
    WHERE Score <= 3
    LIMIT 10
)
SELECT
    Id,
    ProductId,
    Review_Text,
    cluster.AI_CLASSIFY(
        Text,
        '[quality, price, shipping, taste]'
    ) AS classification
FROM negative_reviews;

Id,ProductId,Review_Text,classification
1445,B001E50UEQ,"Me and my wife have tried a wide range of the Hormel compleats, and this is by far the worst tasting",taste
1540,B000E7WM0K,We enjoy A Taste of Tai's Peanut Noodles... they rock. That peanutty flavor...mmm... so we decided,taste
1627,B001RVFDOO,I agree with many of these reviews. The product tastes good - not fantastic but good. I don't like,taste
2117,B00061EXBU,you won't BELIEVE how many ways this product has saved my life! I only gave it one star because i'm,price
2890,B000F9Z1WI,It's very nice that the company put 100 calories packages out--great marketing idea. They taste oka,taste
3079,B000FDKQCO,I like that this item has flax seed in it but the whole wheat flower makes it very grainy. It is al,taste
3150,B000FDKQCY,"howdy y'all, this bread is pretty good. the potato flavor is there, recognizable as potat",taste
3415,B005K4Q1VI,With Green Mountain Hot Cocoa being perpet,taste
3456,B005K4Q1VI,this product taste stale and is full of artificial ingredients. NO REAL PEPPERMINT IS IN THIS P,taste
3563,B004G8ZAS4,"These arrived fresh and in a timely manner. However, flavor-wise our family is in agreement that Je",taste
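
Once the classifications are in Python, a quick tally shows how the categories are distributed. The sketch below builds the label argument in the same bracketed form the query uses and counts the results from the output above. `label_argument` and `tally` are hypothetical helpers, and the 'other' bucket is an assumption for responses that fall outside the label list.

```python
from collections import Counter

LABELS = ["quality", "price", "shipping", "taste"]


def label_argument(labels):
    """Format labels the way the query above passes them: '[a, b, c]'."""
    return "[" + ", ".join(labels) + "]"


def tally(classifications, labels):
    """Count results per label; anything outside the list goes under 'other'."""
    allowed = set(labels)
    counts = Counter()
    for value in classifications:
        cleaned = value.strip().lower()
        counts[cleaned if cleaned in allowed else "other"] += 1
    return counts


# The classification column from the output above, in row order.
results = ["taste", "taste", "taste", "price", "taste",
           "taste", "taste", "taste", "taste", "taste"]
print(label_argument(LABELS))
print(tally(results, LABELS))
```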


In [14]:
%%sql
-- AI_EXTRACT: Extract specific information from reviews
-- Filter positive reviews first using CTE
WITH positive_reviews AS (
    SELECT
        Id,
        ProductId,
        Text,
        LEFT(Text, 100) as Review_Text
    FROM temp.reviews
    WHERE Score >= 4
    LIMIT 10
)
SELECT
    Id,
    ProductId,
    Review_Text,
    cluster.AI_EXTRACT(
        Text,
        'Does this customer indicate they will buy this product again? Answer with yes, no, or unclear only'
    ) AS repeat_purchase_intent
FROM positive_reviews;

Id,ProductId,Review_Text,repeat_purchase_intent
121,B003SE19UK,"I have done a lot of research to find the best food for my cat, and this is an excellent food. That",yes
187,B000NY8ODS,This packet of glaze is the secret to making those European style fresh fruit tarts. I am about to m,yes
498,B000G6RYNE,"Kettle Chips Spicy Thai potato chips have the perfect amount of sweet, savory and spicy-- everything",yes
516,B000G6RYNE,"Despite coming in an extremely large box, I found this to be great value. All the bags were preserve",yes
686,B000G6MBX2,A nice case of chips that are quite tasty. I definitely enjoy the Kettle Sea Salt and Black Pepper.,yes yes yes yes yes yes yes yes yes yes
690,B000G6MBX2,My aunt gave me a bag of these and I was immediately addicted. There are actual pieces of beans in t,yes yes yes yes yes yes yes yes yes yes
795,B00285FF6O,"The best chocolate in the world, in this critic's humble opinion, is made in the United States. And",yes
886,B000HDMUQ2,"This brand of granola bars are really good. The very berry are my favorite. However, the bars are",unclear
949,B003KDCJYY,Pretty good product. The taste isn't the best but it's definitely not the worse either. It gives a,unclear
971,B0002XIB2Y,Nothing easier. Nothing better. Even beats grandmother's white gravy recipe. Already peppered for yo,yes yes yes yes yes yes yes yes yes yes
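
Some rows above return the answer repeated ("yes yes yes ...") instead of a single token, so downstream code should not compare the raw string directly. This is a minimal sketch of one way to normalize such responses. `normalize_answer` is a hypothetical helper, and returning None for mixed or unexpected responses is a design assumption.

```python
ALLOWED = ("yes", "no", "unclear")


def normalize_answer(raw, allowed=ALLOWED):
    """Collapse repeated tokens such as 'yes yes yes' to one allowed answer.

    Returns None when the response mixes answers or contains anything else.
    """
    tokens = {token.strip(".,").lower() for token in raw.split()}
    if len(tokens) == 1:
        answer = tokens.pop()
        if answer in allowed:
            return answer
    return None


print(normalize_answer("yes yes yes yes yes yes yes yes yes yes"))
print(normalize_answer("unclear"))
print(normalize_answer("yes no"))
```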


In [15]:
%%sql
-- AI_EXTRACT: Identify reviews with high churn risk
-- Filter low-rated reviews first using CTE
WITH low_rated_reviews AS (
    SELECT
        Id,
        ProductId,
        Score,
        Text,
        LEFT(Text, 120) as Review_Text
    FROM temp.reviews
    WHERE Score <= 2
    LIMIT 10
)
SELECT
    Id,
    ProductId,
    Score,
    Review_Text,
    cluster.AI_EXTRACT(
        Text,
        'Is this customer at high risk of not purchasing again? Answer with high, medium, or low only'
    ) AS churn_risk
FROM low_rated_reviews;

Id,ProductId,Score,Review_Text,churn_risk
1445,B001E50UEQ,1,"Me and my wife have tried a wide range of the Hormel compleats, and this is by far the worst tasting one of the bunch. I",high
1540,B000E7WM0K,2,"We enjoy A Taste of Tai's Peanut Noodles... they rock. That peanutty flavor...mmm... so we decided to try these... um,",medium
2117,B00061EXBU,1,you won't BELIEVE how many ways this product has saved my life! I only gave it one star because i'm so dehydrated from,medium
3415,B005K4Q1VI,2,With Green Mountain Hot Cocoa being perpetually out of stock i,high
3456,B005K4Q1VI,1,this product taste stale and is full of artificial ingredients. NO REAL PEPPERMINT IS IN THIS PRODUCT and to be hon,high
5103,B004157PZI,2,"These are very tasty, but way too sweet & sugary. I felt like I was eating candy instead of meat. If that's okay with",high
5405,B00622CYVS,2,"Amazon normally does a fantastic job getting product out to me in great shape and on-time, but, I guess this is one prod",medium
6359,B001AW9PTO,2,"too salty -- enough said. I order the wood smoked 30-count jar. Remember, the price is for only ONE JAR (not two).",high
6364,B001AW9PTO,1,Wow. The hickory smoked jerkey sticks were incredibly salty. I could not even eat half of the stick. I have had 3 differ,high
74,B0059WXJKM,1,Buyer Beware Please! This sweetener is not for everybody. Maltitol is an alcohol sugar and can be undigestible in the b,high


In [16]:
%%sql
-- AI_TRANSLATE: Translate text between languages
-- Filter reviews with substantial summaries first using CTE
WITH translatable_reviews AS (
    SELECT
        Id,
        Summary as Original_English
    FROM temp.reviews
    WHERE Score = 5
    AND Summary IS NOT NULL
    AND LENGTH(Summary) > 20
    LIMIT 5
)
SELECT
    Id,
    Original_English,
    cluster.AI_TRANSLATE(
        Original_English,
        'english',
        'spanish'
    ) AS spanish_translation
FROM translatable_reviews;

Id,Original_English,spanish_translation
26,Twizzlers - Strawberry,Twizzlers - Fresa
746,Anti-Oxidant Smoothie,Batido antioxidante
986,"Moore's Marinade, Gluten, low sodium and MSG Free!","Marinada de Moore, ¡sin gluten, baja en sodio y sin glutamato monosódico!"
1325,"Equidorian OJio Arriba Carillo ""RAW Oranic"" Cacao","Equidoriano OJio Arriba Carillo ""RAW Orgánico"" Cacao"
1909,Too good to continue for long.,Demasiado bueno para continuar por mucho tiempo.


In [17]:
%%sql
-- Combined AI Functions: Comprehensive product analysis
-- Filter to products with multiple reviews first
WITH popular_products AS (
    SELECT ProductId
    FROM temp.reviews
    GROUP BY ProductId
    HAVING COUNT(*) >= 10
    LIMIT 5
),
product_reviews AS (
    SELECT
        r.ProductId,
        r.Text,
        r.Score,
        LEFT(r.Text, 80) as Review_Sample
    FROM temp.reviews r
    INNER JOIN popular_products p ON r.ProductId = p.ProductId
    LIMIT 10
)
SELECT
    ProductId,
    Score,
    Review_Sample,
    cluster.AI_SENTIMENT(Text) as sentiment,
    cluster.AI_CLASSIFY(Text, '[quality, value, taste, packaging]') as category,
    cluster.AI_SUMMARIZE(Text, 'aifunctions_chat_default', 10) as brief_summary
FROM product_reviews;

ProductId,Score,Review_Sample,sentiment,category,brief_summary
B004391DK0,5,As far as gluten free products this is awesome. I use it as a substitute for an,"{""score"": 0.8, ""sentiment"": ""positive""}",taste,"Gluten-free flour substitute works well, especially for pumpkin waffles."
B004391DK0,3,"I was disappointed when I tried this pancake mix. Before going gluten free, Bisq","{""sentiment"": ""negative"", ""score"": -0.6}",taste,Disappointed with gluten-free pancake mix; preferred Pamela brand.
B004391DK0,1,Thought I had died and gone to heaven when I made the bisquit recipe. I was wro,"{ ""score"": -0.6, ""sentiment"": ""negative"" }",taste,"Loved mix, but caused severe indigestion with gluten."
B0019GVYR2,1,I just called Bob's Red Mill customer service (just do a G search for the compan,"{ ""score"": -0.6, ""sentiment"": ""negative"" }",value,"Bob's Red Mill baking soda is all aluminum-free, identical."
B004391DK0,5,I really love this mix as the pancakes are so soft and fluffy. Could not tell th,"{""score"": 0.85, ""sentiment"": ""positive""}",taste,"Gluten-free pancakes and muffins were soft, fluffy, and delicious."
B004391DK0,5,"Can as be used to make cookies - even egg free! I just used the mix, enough app","{""score"": 0.5, ""sentiment"": ""positive""}",taste,Egg-free cookies made with apple sauce; Walmart offers cheaper mix.
B004391DK0,5,I find it really hard to believe this is GF because it tastes better than any GF,"{ ""score"": 0.8, ""sentiment"": ""positive"" }",taste,Waffles taste amazingly good; doubts about gluten-free authenticity.
B004391DK0,3,I love this mix. My favorite of any I've tried. The only reason I starred it at,"{""score"": 0.6, ""sentiment"": ""positive""}",value,"Loved mix, but found better price elsewhere; value varies."
B004391DK0,3,Alittle bit expensive for the product I think that it should be the same price a,"{  ""score"": 0.1,  ""sentiment"": ""neutral"" }",value,Product good but slightly overpriced compared to non-gluten Bisquick.
B004391DK0,1,I regularly purchase this item in the bulk and loved it because it was also dair,"{  ""sentiment"": ""negative"",  ""score"": -0.8 }",quality,Found dead bugs in product; will not repurchase.
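
The combined output makes it possible to spot reviews where the star rating and the detected sentiment disagree, which are often worth reading by hand. The sketch below uses four (Score, sentiment) pairs copied from the rows above. The star-to-label mapping in `expected_label` is an illustrative convention, not something the AI Functions define.

```python
import json


def expected_label(stars):
    """Map a 1-5 star Score to a coarse label (illustrative convention)."""
    if stars >= 4:
        return "positive"
    if stars <= 2:
        return "negative"
    return "neutral"


def disagreements(rows):
    """Return (stars, sentiment label) pairs where the two signals differ."""
    mismatched = []
    for stars, raw in rows:
        label = json.loads(raw)["sentiment"]
        if label != expected_label(stars):
            mismatched.append((stars, label))
    return mismatched


# (Score, sentiment) pairs taken from the output above.
rows = [
    (5, '{"score": 0.8, "sentiment": "positive"}'),
    (3, '{"sentiment": "negative", "score": -0.6}'),
    (3, '{"score": 0.6, "sentiment": "positive"}'),
    (1, '{  "sentiment": "negative",  "score": -0.8 }'),
]
print(disagreements(rows))
```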


## Cleanup

In [18]:
%%sql
DROP TABLE IF EXISTS reviews;
DROP DATABASE IF EXISTS temp;

<div id="singlestore-footer" style="background-color: rgba(194, 193, 199, 0.25); height:2px; margin-bottom:10px"></div>
<div><img src="https://raw.githubusercontent.com/singlestore-labs/spaces-notebooks/master/common/images/singlestore-logo-grey.png" style="padding: 0px; margin: 0px; height: 24px"/></div>