# GeoCache: *Wine Spectator*'s Top 100 Wines, 1988-2020
The lists are available online at *Wine Spectator*'s [Top 100 Lists website](https://top100.winespectator.com/lists/).

## File Setup

In [1]:
# import and initialize main Python libraries
import numpy as np
import pandas as pd
import shapefile as shp
import matplotlib.pyplot as plt
import seaborn as sns

# import libraries for file navigation
import os
import shutil
import glob
from pandas_ods_reader import read_ods

# import other packages
from scipy import stats
from sklearn import linear_model

# import geo packages
import geopandas as gpd
import descartes
from shapely.geometry import Point, Polygon

# import Geopy packages
import geopy
from geopy.geocoders import Nominatim

In [2]:
# initialize visualization settings
sns.set(style="whitegrid", palette="colorblind", color_codes=True)
sns.mpl.rc("figure", figsize=(10, 6))

# Jupyter Notebook
%matplotlib inline

## Dataframe Exploration

In [3]:
# Note: save CSV files in UTF-8 format to preserve special characters.
df_Wine = pd.read_csv('./CSV_Wines.csv')
df_GeoCache = pd.read_csv('./CSV_GeoCache.csv')
df_GeoList = pd.read_csv('./CSV_GeoList.csv')

In [4]:
df_Wine.shape

(3301, 18)

In [5]:
df_Wine.dtypes

Review_Year           float64
Rank                   object
Vintage                object
Score                 float64
Price                  object
Winemaker              object
Wine                   object
Wine_Style             object
Grape_Blend            object
Blend_List             object
Geography              object
Cases_Made            float64
Cases_Imported        float64
Reviewer               object
Drink_now             float64
Best_Drink_from       float64
Best_Drink_Through    float64
Review                 object
dtype: object

In [6]:
df_GeoCache.shape

(1226, 3)

In [7]:
df_GeoList.shape

(448, 1)

In [8]:
df_Wine.sample(10)

Unnamed: 0,Review_Year,Rank,Vintage,Score,Price,Winemaker,Wine,Wine_Style,Grape_Blend,Blend_List,Geography,Cases_Made,Cases_Imported,Reviewer,Drink_now,Best_Drink_from,Best_Drink_Through,Review
3057,1990.0,58,1987,95.0,170,Jean Gros,Richebourg,Red,Pinot Noir,,Richebourg,900.0,,,1.0,1990.0,,"The class shows clearly in this wine, sending ..."
218,2018.0,19,2016,94.0,17,San Felice,Chianti Classico,Red,Chianti,,Chianti,25000.0,,BS,,2020.0,2036.0,"Expressive and smooth, this hits all the right..."
161,2019.0,62,2015,90.0,15,Viña Haras de Pirque,Cabernet Sauvignon Maipo Valley Hussonet Gran ...,Red,Cabernet Sauvignon,,Maipo Valley,,2500.0,KM,1.0,2019.0,2024.0,"Big and fresh-tasting, with concentrated dark ..."
555,2015.0,56,2010,93.0,44,Cune,Rioja Imperial Reserva,Red,Rioja,,Rioja,,1500.0,TM,,2016.0,2030.0,Smoky and tarry notes give this rich red an au...
1589,2005.0,90,2003,92.0,35,Gunderloch,Riesling Spätlese Rheinhessen Nackenheim Rothe...,White,Riesling,,Rheinhessen,,500.0,BS,1.0,2005.0,2020.0,"Opulent and concentrated, exhibiting mango, pa..."
2548,1995.0,49,1990,93.0,30,Castello Banfi,Brunello di Montalcino,Red,Brunello di Montalcino,,Brunello di Montalcino,,,,,1999.0,,"A wonderful Brunello boasting mineral, violet,..."
2257,1998.0,24,1995,96.0,110,Harlan Estate,Napa Valley,Red,Blend,Bordeaux Blend Red,Napa Valley,1185.0,,JL,,2001.0,2010.0,A tremendous effort with all kinds of extra fl...
456,2016.0,57,2015,91.0,20,Cave de Roquebrun,St.-Chinian-Roquebrun La Grange des Combes,Red,Blend,"Syrah, Grenache and Mourvèdre",St.-Chinian-Roquebrun,12000.0,,GS,1.0,2016.0,2022.0,"A muscular but polished red, with brooding bla..."
2632,1994.0,33,1992,92.0,14,Domaine du Closel,Savennières Cuvée Spéciale,White,Chenin Blanc,,Savennières,2600.0,,,1.0,1994.0,,Imagine an apple tart glazed with honey and le...
1729,2003.0,30,2000,97.0,60,E. Guigal,Hermitage,Red,Shiraz | Syrah,,Hermitage,4580.0,,PM,1.0,2003.0,2012.0,"Ultrarich and ultrathick. A full-bodied, stunn..."


In [9]:
df_GeoCache.sample(10)

Unnamed: 0,Geography,Hierarchy,Address
684,New Mexico,Hierarchy_01,"New Mexico, USA"
339,Willamette Valley,Hierarchy_00,USA
525,Rheingau,Hierarchy_01,"Rheingau, Germany"
850,Rhodes,Hierarchy_02,"Rhodes, Aegean Islands, Greece"
954,Howell Mountain,Hierarchy_02,"North Coast, California, USA"
1155,Morey-St.-Denis Monts Luisants,Hierarchy_04,"Morey-St.-Denis Premier Cru, Morey-St.-Denis, ..."
1048,Romanée St.-Vivant,Hierarchy_03,"Vosne-Romanée, Côte de Nuits, Burgundy, France"
1202,Pommard Les Boucherottes,Hierarchy_05,"Pommard Les Boucherottes, Pommard Premier Cru,..."
297,Edna Valley,Hierarchy_00,USA
693,Yamhill-Carlton District,Hierarchy_01,"Oregon, USA"


In [10]:
df_GeoList.sample(10)

Unnamed: 0,Address
237,"Meursault, Côte de Beaune, Burgundy, France"
110,"Contra Costa County, San Francisco Bay, Centra..."
137,"Dão, Portugal"
422,"Vin de Corse, Corsica, France"
329,"Riverina, New South Wales, Australia"
343,"San Luis Obispo County, Central Coast, Califor..."
418,"Verdicchio dei Castelli di Jesi, Marche, Italy"
96,"Chinon, Touraine, Loire, France"
339,"San Benito County, Central Coast, California, USA"
325,"Ribera del Guadiana, Extremadura, Spain"


### Geocode the Address dataframe
Reference: [Python’s geocoding — Convert a list of addresses into a map](https://towardsdatascience.com/pythons-geocoding-convert-a-list-of-addresses-into-a-map-f522ef513fd6)

In [11]:
# Initialize the Nominatim geocoder and store it in the geolocator variable.
geolocator = Nominatim(user_agent='wine app')

In [12]:
geolocator.geocode('Castilla y León, Spain').raw

{'place_id': 258252333,
 'licence': 'Data © OpenStreetMap contributors, ODbL 1.0. https://osm.org/copyright',
 'osm_type': 'relation',
 'osm_id': 349041,
 'boundingbox': ['40.0824504', '43.2382034', '-7.077073', '-1.7753716'],
 'lat': '41.8037172',
 'lon': '-4.7471726',
 'display_name': 'Castilla y León, España',
 'class': 'boundary',
 'type': 'administrative',
 'importance': 0.9625997816800999,
 'icon': 'https://nominatim.openstreetmap.org/ui/mapicons//poi_boundary_administrative.p.20.png'}

In [13]:
geolocator.geocode('Castilla y León, Spain').point

Point(41.8037172, -4.7471726, 0.0)

In [14]:
# Apply geolocator to the Address column in the GeoList dataframe.
df_GeoList['loc'] = df_GeoList['Address'].apply(geolocator.geocode)
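
The cell above sends one request per address in quick succession. The public Nominatim service asks clients to stay at or below one request per second, so a throttled lookup may be safer. The following is a minimal sketch, not the notebook's method: `geocode_throttled` and `fake_geocode` are hypothetical names, and the stand-in geocoder exists only so the example runs offline.

```python
import time

def geocode_throttled(addresses, geocode, delay_seconds=1.0):
    """Geocode each distinct address once, pausing between requests."""
    results = {}
    for address in addresses:
        if address in results:
            continue  # reuse the cached answer for repeated addresses
        if results:
            time.sleep(delay_seconds)  # pause before every request after the first
        results[address] = geocode(address)
    return results

# Hypothetical stand-in geocoder so the sketch runs without network access.
calls = []
def fake_geocode(address):
    calls.append(address)
    known = {"Abruzzo, Italy": (42.227681, 13.854983)}
    return known.get(address)  # None mimics an address that fails to geocode

lookup = geocode_throttled(
    ["Abruzzo, Italy", "Alba, Piedmont | Piemonte, Italy", "Abruzzo, Italy"],
    fake_geocode,
    delay_seconds=0,  # no pause needed for the offline stand-in
)
print(lookup)
```

In the notebook, the same helper could be wired in as `df_GeoList['loc'] = df_GeoList['Address'].map(geocode_throttled(df_GeoList['Address'], geolocator.geocode))`. geopy also ships a `RateLimiter` helper in `geopy.extra.rate_limiter` that serves a similar purpose.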

In [15]:
# Extract .point (lat, long, altitude) from each geocode response; keep None where geocoding failed.
df_GeoList['point'] = df_GeoList['loc'].apply(lambda loc: tuple(loc.point) if loc else None)

In [16]:
# Split the .point column into separate columns for lat, long, and altitude
df_GeoList[['lat', 'long', 'altitude']] = pd.DataFrame(df_GeoList['point'].to_list(), index=df_GeoList.index)

In [17]:
df_GeoList

Unnamed: 0,Address,loc,point,lat,long,altitude
0,"Abruzzo, Italy","(Abruzzo, Italia, (42.227681, 13.854983))","(42.227681, 13.854983, 0.0)",42.227681,13.854983,0.0
1,"Adelaide Hills, South Australia, Australia","(Adelaide Hills Council, South Australia, Aust...","(-34.901351649999995, 138.8293202817461, 0.0)",-34.901352,138.829320,0.0
2,"Aegean Islands, Greece","(Aegean, Σάμη - Αγία Ευφημία, Καραβόμυλος, Δήμ...","(38.2504094, 20.6304217, 0.0)",38.250409,20.630422,0.0
3,"Aglianico del Vulture, Basilicata, Italy",,,,,
4,"Agrelo, Mendoza, Argentina","(Agrelo, Distrito Agrelo, Departamento Luján d...","(-33.1184629, -68.8859261, 0.0)",-33.118463,-68.885926,0.0
5,"Alba, Piedmont | Piemonte, Italy",,,,,
6,"Alentejo, Portugal","(Alentejo, Portugal, (38.0551003, -7.8605799))","(38.0551003, -7.8605799, 0.0)",38.055100,-7.860580,0.0
7,"Alexander Valley, Sonoma County, North Coast, ...",,,,,
8,"Alicante, Valencia, Spain","(Alacant / Alicante, l'Alacantí, Alacant / Ali...","(38.353738, -0.4901846, 0.0)",38.353738,-0.490185,0.0
9,"Almansa, Castilla La Mancha, Spain","(Almansa, Albacete, Castilla-La Mancha, 02640,...","(38.8682065, -1.0978627, 0.0)",38.868206,-1.097863,0.0


### Append geography details to the GeoCache dataframe
Left-join the geocoded addresses onto the GeoCache dataframe to see how well geography is populated at each hierarchy level.

In [18]:
df_GeoCache = pd.merge(df_GeoCache, df_GeoList, on = 'Address', how = 'left' )

In [19]:
df_GeoCache.to_csv(path_or_buf = './GeoCache.csv', index = False)
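
One way to quantify how well each hierarchy level is populated is to compute the share of rows with a non-null latitude per level. The sketch below assumes the merged dataframe has `Hierarchy` and `lat` columns, as in this notebook. The toy frame `geo` and its region names and values are made up for illustration and are not taken from the real GeoCache.

```python
import pandas as pd

# Toy stand-in for the merged df_GeoCache; names and values are illustrative only.
geo = pd.DataFrame({
    "Geography": ["Region A", "Region B", "Region C", "Region D", "Region E"],
    "Hierarchy": ["Hierarchy_00", "Hierarchy_00", "Hierarchy_01",
                  "Hierarchy_01", "Hierarchy_02"],
    "lat": [42.6, 46.6, None, 36.7, None],
})

# Flag rows that received coordinates, then summarize per hierarchy level.
coverage = (
    geo.assign(geocoded=geo["lat"].notna())
       .groupby("Hierarchy")["geocoded"]
       .agg(rows="size", geocoded="sum", share="mean")
)
print(coverage)
```

Running the same aggregation on `df_GeoCache` would show which hierarchy levels have the most addresses that Nominatim could not resolve.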

### Append Hierarchy 00 details to the df_Wine dataset

In [20]:
# filter df_GeoCache to Hierarchy_00

df_GeoCache00 = df_GeoCache[
    (df_GeoCache.Hierarchy == 'Hierarchy_00')
]

df_GeoCache00.sample(10)

Unnamed: 0,Geography,Hierarchy,Address,loc,point,lat,long,altitude
235,Tuscany,Hierarchy_00,Italy,"(Italia, (42.6384261, 12.674297))","(42.6384261, 12.674297, 0.0)",42.638426,12.674297,0.0
257,Hemel-en-Aarde,Hierarchy_00,South Africa,"(South Africa, (-28.8166236, 24.991639))","(-28.8166236, 24.991639, 0.0)",-28.816624,24.991639,0.0
344,Walla Walla Valley,Hierarchy_00,USA,"(United States, (39.7837304, -100.4458825))","(39.7837304, -100.4458825, 0.0)",39.78373,-100.445882,0.0
57,Graves,Hierarchy_00,France,"(France, (46.603354, 1.8883335))","(46.603354, 1.8883335, 0.0)",46.603354,1.888334,0.0
228,Morellino di Scansano,Hierarchy_00,Italy,"(Italia, (42.6384261, 12.674297))","(42.6384261, 12.674297, 0.0)",42.638426,12.674297,0.0
239,Prosecco,Hierarchy_00,Italy,"(Italia, (42.6384261, 12.674297))","(42.6384261, 12.674297, 0.0)",42.638426,12.674297,0.0
241,Valpolicella Ripasso,Hierarchy_00,Italy,"(Italia, (42.6384261, 12.674297))","(42.6384261, 12.674297, 0.0)",42.638426,12.674297,0.0
80,Corton Les Renardes,Hierarchy_00,France,"(France, (46.603354, 1.8883335))","(46.603354, 1.8883335, 0.0)",46.603354,1.888334,0.0
184,Cirò,Hierarchy_00,Italy,"(Italia, (42.6384261, 12.674297))","(42.6384261, 12.674297, 0.0)",42.638426,12.674297,0.0
17,Coonawarra,Hierarchy_00,Australia,"(Australia, (-24.7761086, 134.755))","(-24.7761086, 134.755, 0.0)",-24.776109,134.755,0.0


In [21]:
df_Wine00 = pd.merge(df_Wine, df_GeoCache00, on = 'Geography', how = 'left')

df_Wine00.sample(10)

Unnamed: 0,Review_Year,Rank,Vintage,Score,Price,Winemaker,Wine,Wine_Style,Grape_Blend,Blend_List,...,Best_Drink_from,Best_Drink_Through,Review,Hierarchy,Address,loc,point,lat,long,altitude
2101,2000.0,100,1998,91.0,29,Freie Weingärtner Wachau,Riesling Smaragd Trocken Wachau Spitzer Singer...,White,Riesling,,...,2001.0,2006.0,Multidimensional. Intense aromas of licorice a...,Hierarchy_00,Austria,"(Österreich, (47.2000338, 13.199959))","(47.2000338, 13.199959, 0.0)",47.200034,13.199959,0.0
3268,1988.0,66,1985,93.0,25,Château La Croix,Pomerol,Red,Pomerol,,...,1991.0,1993.0,Rich and powerful yet elegant and decadent wit...,Hierarchy_00,France,"(France, (46.603354, 1.8883335))","(46.603354, 1.8883335, 0.0)",46.603354,1.888334,0.0
890,2012.0,89,2009,92.0,40,Honig,Cabernet Sauvignon Napa Valley,Red,Cabernet Sauvignon,,...,2013.0,2023.0,"Dense and tight, exhibiting a firm core of loa...",Hierarchy_00,USA,"(United States, (39.7837304, -100.4458825))","(39.7837304, -100.4458825, 0.0)",39.78373,-100.445882,0.0
3054,1990.0,52,1987,93.0,24,Clos du Bois,Chardonnay Alexander Valley Winemaker's Reserve,White,Chardonnay,,...,1990.0,1993.0,"A rich, smooth, creamy style that offers a bro...",Hierarchy_00,USA,"(United States, (39.7837304, -100.4458825))","(39.7837304, -100.4458825, 0.0)",39.78373,-100.445882,0.0
897,2012.0,96,2009,94.0,75,Pahlmeyer,Merlot Napa Valley,Red,Merlot,,...,2012.0,2018.0,"Plush and richly structured, but with a dense,...",Hierarchy_00,USA,"(United States, (39.7837304, -100.4458825))","(39.7837304, -100.4458825, 0.0)",39.78373,-100.445882,0.0
763,2013.0,62,2010,91.0,22,Concha y Toro,Syrah Buin Marqués de Casa Concha,Red,Shiraz | Syrah,,...,2014.0,2018.0,"This compact red needs aeration, featuring lay...",Hierarchy_00,Chile,"(Chile, (-31.7613365, -71.3187697))","(-31.7613365, -71.3187697, 0.0)",-31.761336,-71.31877,0.0
2222,1998.0,7,1997,95.0,40,Château de Beaucastel,Châteauneuf-du-Pape White,White,Châteauneuf-du-Pape,,...,1998.0,2010.0,"This classy, distinctive white shows wonderful...",Hierarchy_00,France,"(France, (46.603354, 1.8883335))","(46.603354, 1.8883335, 0.0)",46.603354,1.888334,0.0
3249,1988.0,47,1984,94.0,20,Chateau Montelena,Cabernet Sauvignon Napa Valley,Red,Cabernet Sauvignon,,...,1998.0,,"Deep, dark, concentrated and ripe, showing mas...",Hierarchy_00,USA,"(United States, (39.7837304, -100.4458825))","(39.7837304, -100.4458825, 0.0)",39.78373,-100.445882,0.0
610,2014.0,9,2010,95.0,125,Concha y Toro,Cabernet Sauvignon Puente Alto Don Melchor,Red,Cabernet Sauvignon,,...,2014.0,2020.0,"Refined and elegant, with silky tannins behind...",Hierarchy_00,Chile,"(Chile, (-31.7613365, -71.3187697))","(-31.7613365, -71.3187697, 0.0)",-31.761336,-71.31877,0.0
1644,2004.0,43,2001,93.0,26,Pago de los Capellanes,Ribera del Duero Crianza,Red,Blend,Cabernet – Tempranillo,...,2004.0,2015.0,This distinctive red is graceful on the palate...,Hierarchy_00,Spain,"(España, (39.3260685, -4.8379791))","(39.3260685, -4.8379791, 0.0)",39.326068,-4.837979,0.0
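
A left join on `Geography` silently duplicates wine rows if a geography appears more than once on the right side, and it leaves missing values where a geography has no match. As a hedged sketch of how both cases could be checked, the example below uses the `validate` and `indicator` parameters of `pd.merge`. The frames `wines` and `geo00` are toy stand-ins, not the notebook's data.

```python
import pandas as pd

# Toy stand-ins for df_Wine and df_GeoCache00; rows are illustrative only.
wines = pd.DataFrame({
    "Wine": ["Wine A", "Wine B", "Wine C"],
    "Geography": ["Napa Valley", "Rioja", "Unlisted Region"],
})
geo00 = pd.DataFrame({
    "Geography": ["Napa Valley", "Rioja"],
    "Address": ["USA", "Spain"],
})

# validate raises MergeError if a Geography is duplicated on the right side;
# indicator adds a _merge column marking rows that found no match.
merged = pd.merge(
    wines, geo00, on="Geography", how="left",
    validate="many_to_one", indicator=True,
)

# The left join should preserve the number of wine rows.
assert len(merged) == len(wines)

unmatched = merged.loc[merged["_merge"] == "left_only", "Geography"].tolist()
print(unmatched)
```

Applied to the real frames, an unchanged row count of 3301 and an empty `unmatched` list would indicate a clean join.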


### Append Hierarchy 01 details to the df_Wine dataset

In [22]:
# filter df_GeoCache to Hierarchy_01

df_GeoCache01 = df_GeoCache[
    (df_GeoCache.Hierarchy == 'Hierarchy_01')
]

df_GeoCache01.sample(10)

Unnamed: 0,Geography,Hierarchy,Address,loc,point,lat,long,altitude
615,Groenekloof,Hierarchy_01,"Western Cape, South Africa","(Western Cape, South Africa, (-33.546977, 20.7...","(-33.546977, 20.72753, 0.0)",-33.546977,20.72753,0.0
564,Langhe,Hierarchy_01,"Piedmont | Piemonte, Italy","(Piedmont Properties, 78, SP50, San Marzano Ol...","(44.7605629, 8.2998538, 0.0)",44.760563,8.299854,0.0
374,Limestone Coast,Hierarchy_01,"South Australia, Australia","(South Australia, Australia, (-30.5343665, 135...","(-30.5343665, 135.6301212, 0.0)",-30.534367,135.630121,0.0
698,Red Mountain,Hierarchy_01,"Washington, USA","(Washington, District of Columbia, United Stat...","(38.8949924, -77.0365581, 0.0)",38.894992,-77.036558,0.0
566,Nardò,Hierarchy_01,"Puglia, Italy","(Puglia, Italia, (40.9842539, 16.6210027))","(40.9842539, 16.6210027, 0.0)",40.984254,16.621003,0.0
396,Apalta,Hierarchy_01,"Colchagua Valley, Chile","(Colchagua, Palmilla, Provincia de Colchagua, ...","(-34.548228, -71.4013194, 0.0)",-34.548228,-71.401319,0.0
456,Chambolle-Musigny,Hierarchy_01,"Burgundy, France","(Bourgogne, France métropolitaine, France, (47...","(47.27808725, 4.222486304306048, 0.0)",47.278087,4.222486,0.0
538,Val di Neto IGT,Hierarchy_01,"Calabria, Italy","(Calabria, Italia, (39.0565974, 16.5249864))","(39.0565974, 16.5249864, 0.0)",39.056597,16.524986,0.0
451,Santenay,Hierarchy_01,"Burgundy, France","(Bourgogne, France métropolitaine, France, (47...","(47.27808725, 4.222486304306048, 0.0)",47.278087,4.222486,0.0
673,Napa Valley,Hierarchy_01,"California, USA","(California, United States, (36.7014631, -118....","(36.7014631, -118.755997, 0.0)",36.701463,-118.755997,0.0


In [23]:
df_Wine01 = pd.merge(df_Wine, df_GeoCache01, on = 'Geography', how = 'left')

df_Wine01.sample(10)

Unnamed: 0,Review_Year,Rank,Vintage,Score,Price,Winemaker,Wine,Wine_Style,Grape_Blend,Blend_List,...,Best_Drink_from,Best_Drink_Through,Review,Hierarchy,Address,loc,point,lat,long,altitude
63,2020.0,64,2018,92.0,35,La Crema,Pinot Noir Willamette Valley,Red,Pinot Noir,,...,2020.0,2027.0,"Sleek and vibrant, with snappy raspberry and r...",Hierarchy_01,"Oregon, USA","(Oregon, United States, (43.9792797, -120.7372...","(43.9792797, -120.737257, 0.0)",43.97928,-120.737257,0.0
1522,2005.0,21,2002,97.0,85,Barossa Valley Estate,Shiraz Barossa Valley E&E Black Pepper,Red,Shiraz | Syrah,,...,2010.0,2022.0,"Dark, juicy and profound, with layer upon laye...",Hierarchy_01,"South Australia, Australia","(South Australia, Australia, (-30.5343665, 135...","(-30.5343665, 135.6301212, 0.0)",-30.534367,135.630121,0.0
760,2013.0,59,2010,95.0,72,Donum,Pinot Noir Carneros,Red,Pinot Noir,,...,2013.0,2023.0,Offers ebullient raspberry and black cherry fl...,Hierarchy_01,"California, USA","(California, United States, (36.7014631, -118....","(36.7014631, -118.755997, 0.0)",36.701463,-118.755997,0.0
83,2020.0,84,2019,94.0,35,Delas,St.-Joseph White Les Challeys,White,Blend,Marsanne – Roussanne,...,2020.0,2027.0,"Very bright and engaging style, with a flurry ...",Hierarchy_01,"Rhône, France","(Rhône, Circonscription départementale du Rhôn...","(45.8802348, 4.564533629559522, 0.0)",45.880235,4.564534,0.0
1446,2006.0,45,2004,95.0,85,Tenuta Sette Ponti,Toscana Oreno,Red,Blend,"Merlot, Sangiovese and Cabernet Sauvignon",...,2009.0,,"Fabulous aromas of ripe blackberry, cappuccino...",Hierarchy_01,"Tuscany, Italy","(Toscana, Italia, (43.4586541, 11.1389204))","(43.4586541, 11.1389204, 0.0)",43.458654,11.13892,0.0
775,2013.0,74,2011,93.0,55,Lucia,Pinot Noir Santa Lucia Highlands Garys' Vineyard,Red,Pinot Noir,,...,2013.0,2021.0,"Offers a rich, potent, vivid mix of dark berry...",Hierarchy_01,"California, USA","(California, United States, (36.7014631, -118....","(36.7014631, -118.755997, 0.0)",36.701463,-118.755997,0.0
3104,1989.0,2,1986,97.0,32,Château Clerc Milon,Pauillac,Red,Blend,Bordeaux Blend Red,...,1998.0,2009.0,Seductively rich and supple with layers of ele...,Hierarchy_01,"Bordeaux, France","(Bordeaux, Gironde, Nouvelle-Aquitaine, France...","(44.841225, -0.5800364, 0.0)",44.841225,-0.580036,0.0
3106,1989.0,4,1986,97.0,58,Château Pichon-Longueville Baron,Pauillac,Red,Blend,Bordeaux Blend Red,...,,,Amazingly elegant and complex featuring rich c...,Hierarchy_01,"Bordeaux, France","(Bordeaux, Gironde, Nouvelle-Aquitaine, France...","(44.841225, -0.5800364, 0.0)",44.841225,-0.580036,0.0
1652,2004.0,51,2003,92.0,24,Two Hands,Shiraz McLaren Vale Angel's Share,Red,Shiraz | Syrah,,...,2007.0,2015.0,"Not a huge mouthful, but it unfolds its flavor...",Hierarchy_01,"South Australia, Australia","(South Australia, Australia, (-30.5343665, 135...","(-30.5343665, 135.6301212, 0.0)",-30.534367,135.630121,0.0
227,2018.0,28,2016,93.0,20,Penley,Cabernet Sauvignon Coonawarra Phoenix,Red,Cabernet Sauvignon,,...,2018.0,2030.0,"Dense, with sink-your-teeth-into-them tannins ...",Hierarchy_01,"South Australia, Australia","(South Australia, Australia, (-30.5343665, 135...","(-30.5343665, 135.6301212, 0.0)",-30.534367,135.630121,0.0
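
The Hierarchy 00 and Hierarchy 01 steps follow the same filter-then-merge pattern, so they could be folded into one helper if more levels are needed later. This is a sketch under that assumption: `merge_hierarchy` is a hypothetical name, and the toy frames stand in for `df_Wine` and `df_GeoCache`.

```python
import pandas as pd

def merge_hierarchy(wine, geocache, level):
    """Left-join wines to the GeoCache rows for a single hierarchy level."""
    subset = geocache[geocache["Hierarchy"] == level]
    return pd.merge(wine, subset, on="Geography", how="left")

# Toy stand-ins for df_Wine and df_GeoCache; rows are illustrative only.
wine = pd.DataFrame({"Wine": ["Wine A"], "Geography": ["Napa Valley"]})
geocache = pd.DataFrame({
    "Geography": ["Napa Valley", "Napa Valley"],
    "Hierarchy": ["Hierarchy_00", "Hierarchy_01"],
    "Address": ["USA", "California, USA"],
})

# Build one merged frame per hierarchy level.
by_level = {
    level: merge_hierarchy(wine, geocache, level)
    for level in ["Hierarchy_00", "Hierarchy_01"]
}
print(by_level["Hierarchy_01"])
```

With the real frames, `by_level["Hierarchy_00"]` and `by_level["Hierarchy_01"]` would correspond to `df_Wine00` and `df_Wine01`.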


### Save files for use in other notebooks

In [24]:
df_Wine00.to_csv(path_or_buf = './Wine_Hier00.csv', index = False)
df_Wine01.to_csv(path_or_buf = './Wine_Hier01.csv', index = False)