# Wine Selection for Sales Boost

In this notebook, we explore the wines and vintages database to identify the top wines to highlight in order to increase sales.
The selection is based on average rating, rating count, and affordability.

In [12]:
import sqlite3
import pandas as pd
import numpy as np

import matplotlib.pyplot as plt
import seaborn as sns

In [13]:
# Connect to the database
connection = sqlite3.connect('../data/vivino.db')
cursor = connection.cursor()

In [36]:
# Wines with more than 10,000 ratings, with their vintage count and average vintage price
query = """
    SELECT wines.id AS wine_id, 
           wines.name AS wine_name, 
           wines.ratings_average AS wine_avg_rating,
           wines.ratings_count AS wine_ratings_count,
           COUNT(vintages.year) AS vintages_number, 
           ROUND(AVG(vintages.price_euros), 2) AS average_price
    FROM wines
        JOIN vintages ON wines.id = vintages.wine_id
    WHERE wines.ratings_count > 10000
    GROUP BY wines.id
    ORDER BY wine_avg_rating DESC, 
             wine_ratings_count DESC
    -- LIMIT 10
    ;
"""

# Run the query once and load the result into a DataFrame
df = pd.read_sql_query(query, connection)
print(df.shape)
display(df.head(10))
df.to_csv("../data/csv/question_1_0.csv")

(161, 6)


Unnamed: 0,wine_id,wine_name,wine_avg_rating,wine_ratings_count,vintages_number,average_price
0,77137,Unico,4.7,45140,3,631.87
1,1153863,Sauternes,4.7,44126,9,973.28
2,66294,Special Selection Cabernet Sauvignon,4.7,41236,2,420.25
3,1166837,Pomerol,4.7,32157,6,5948.33
4,1136930,Grange,4.7,24356,4,1066.45
5,2446729,Toscana,4.7,16284,24,2551.35
6,77136,Unico Reserva Especial Edición,4.7,13025,1,644.95
7,66284,Cabernet Sauvignon,4.6,157944,1,177.95
8,86684,Brut Champagne,4.6,146377,5,437.27
9,5078,Sassicaia,4.6,107646,25,990.24


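Our goal is to select wines that have a high average rating, a significant number of ratings, and an affordable price. We focus on vintages priced at 50 euros or less to make the selection accessible to a wider range of customers. Because the price filter is applied to individual vintages, `vintages_number` and `average_price` below cover only the vintages within that limit.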

In [37]:
# Same selection, restricted to vintages priced at 50 euros or less
query = """
    SELECT wines.id AS wine_id, 
           wines.name AS wine_name, 
           wines.ratings_average AS wine_avg_rating,
           wines.ratings_count AS wine_ratings_count,
           COUNT(vintages.year) AS vintages_number, 
           ROUND(AVG(vintages.price_euros), 2) AS average_price
    FROM wines
        JOIN vintages ON wines.id = vintages.wine_id
    WHERE vintages.price_euros <= 50 AND
          wines.ratings_count > 10000 
    GROUP BY wines.id
    ORDER BY wine_avg_rating DESC, 
             wine_ratings_count DESC,
             average_price ASC
    -- LIMIT 10
    ;
"""

# Run the query once and load the result into a DataFrame
df = pd.read_sql_query(query, connection)
print(df.shape)
display(df.head(10))
df.to_csv("../data/csv/question_1_1.csv")

(19, 6)


Unnamed: 0,wine_id,wine_name,wine_avg_rating,wine_ratings_count,vintages_number,average_price
0,11890,60 Sessantanni Old Vines Primitivo di Manduria,4.5,94289,1,24.75
1,1139434,Tinto,4.4,65625,1,42.95
2,12393,Amarone della Valpolicella Classico,4.4,48329,1,46.25
3,1383553,Roda I Reserva Rioja,4.4,22749,1,40.98
4,1148298,Brut Rosé Champagne,4.4,21147,1,49.61
5,1129440,Old Vines Primitivo di Manduria,4.4,20904,1,20.95
6,75978,Quinta da Leda Douro,4.4,14197,1,30.0
7,76476,Belondrade y Lurton,4.4,13567,1,49.75
8,2355320,Fusion V,4.4,12877,1,36.1
9,6331780,Guerriero della Terra,4.4,10185,2,26.68


In [10]:
# Close the cursor and connection
cursor.close()
connection.close()
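If an earlier cell raises an exception, the explicit `close()` calls may never run. One possible alternative, assuming the queries can live in a single cell, is `contextlib.closing`, which closes the cursor and connection when the block exits. The sketch below uses an in-memory database and a trivial query as stand-ins.

```python
import sqlite3
from contextlib import closing

# closing() calls .close() on exit, even if the body raises.
# (Using the connection itself as a context manager only manages transactions.)
with closing(sqlite3.connect(":memory:")) as connection:
    with closing(connection.cursor()) as cursor:
        cursor.execute("SELECT 1 + 1")
        value = cursor.fetchone()[0]

print(value)
```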

Now, let's analyze why we selected these wines and provide recommendations for the marketing strategy.

We have selected these wines based on the following criteria:

#### Average Rating:
We prioritize wines with high average ratings, as they indicate good overall quality that customers are likely to appreciate.
#### Rating Count:
Wines with a large number of ratings (more than 10,000 here) demonstrate popularity and consumer trust, which can encourage potential buyers to try them.
#### Price:
By choosing wines priced at 50 euros or less, we ensure that the highlighted wines are accessible to a broader range of customers.
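The three criteria can be sketched in plain Python as a filter followed by a sort. The rows below are copied from the outputs above; the function name `shortlist` and its parameters are illustrative and not part of the notebook.

```python
# A few rows copied from the outputs above:
# (name, average rating, rating count, average price in euros)
wines = [
    ("Unico", 4.7, 45140, 631.87),
    ("60 Sessantanni Old Vines Primitivo di Manduria", 4.5, 94289, 24.75),
    ("Tinto", 4.4, 65625, 42.95),
    ("Old Vines Primitivo di Manduria", 4.4, 20904, 20.95),
]

def shortlist(rows, min_ratings=10000, max_price=50.0):
    """Keep wines meeting the rating-count and price thresholds, best first."""
    kept = [r for r in rows if r[2] > min_ratings and r[3] <= max_price]
    # Highest rating first, then most ratings, then lowest price
    return sorted(kept, key=lambda r: (-r[1], -r[2], r[3]))

for name, rating, count, price in shortlist(wines):
    print(f"{name}: {rating} ({count} ratings), {price:.2f} EUR")
```

Here Unico is dropped on price despite its higher rating, and the remaining wines are ordered by rating, then rating count, then price, mirroring the `ORDER BY` of the second query.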