# GeoCache: *Wine Spectator*'s Top 100 Wines, 1988-2020
The lists are available online at *Wine Spectator*'s [Top 100 Lists website](https://top100.winespectator.com/lists/).

## File Setup

In [12]:
# import and initialize main python libraries
import numpy as np
import pandas as pd
import shapefile as shp
import matplotlib.pyplot as plt
import seaborn as sns

# import libraries for file navigation
import os
import shutil
import glob
from pandas_ods_reader import read_ods

# import other packages
from scipy import stats
from sklearn import linear_model

# import geo packages
import geopandas as gpd
import descartes
from shapely.geometry import Point, Polygon

# import Geopy packages
import geopy
from geopy.geocoders import Nominatim

In [13]:
# initialize visualization settings
sns.set(style="whitegrid", palette="colorblind", color_codes=True)
sns.mpl.rc("figure", figsize=(10, 6))

# Jupyter Notebook
%matplotlib inline

## Dataframe Exploration

In [14]:
# Note: save CSV files in UTF-8 format to preserve special characters.
df_Wine = pd.read_csv('./CSV_Wines.csv')
df_GeoCache = pd.read_csv('./CSV_GeoCache.csv')
df_GeoList = pd.read_csv('./CSV_GeoList.csv')
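
The note about UTF-8 can be checked with a small round trip. The sketch below is illustrative only: the file name `sample_wines.csv` is hypothetical, and the two rows reuse winemaker names and scores from the sample output in this notebook. It passes `encoding="utf-8"` explicitly on both write and read, so the accented characters do not depend on a platform default.

```python
import os
import tempfile

import pandas as pd

# Two rows with accented characters (names and scores as shown in the sample output).
sample = pd.DataFrame({
    "Winemaker": ["Michel Colin-Deléger", "Château de la Greffière"],
    "Score": [93.0, 91.0],
})

with tempfile.TemporaryDirectory() as tmp_dir:
    # Hypothetical file name, used only for this round-trip check.
    path = os.path.join(tmp_dir, "sample_wines.csv")
    sample.to_csv(path, index=False, encoding="utf-8")
    round_trip = pd.read_csv(path, encoding="utf-8")

print(round_trip["Winemaker"].tolist())
```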

In [15]:
df_Wine.shape

(3301, 18)

In [36]:
df_Wine.dtypes

Review_Year           float64
Rank                   object
Vintage                object
Score                 float64
Price                  object
Winemaker              object
Wine                   object
Wine_Style             object
Grape_Blend            object
Blend_List             object
Geography              object
Cases_Made            float64
Cases_Imported        float64
Reviewer               object
Drink_now             float64
Best_Drink_from       float64
Best_Drink_Through    float64
Review                 object
dtype: object
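
`Rank`, `Vintage`, and `Price` load as `object` rather than numeric. For `Vintage`, the sample output further down shows at least one cause: non-vintage wines carry the string `NV`. If a numeric year is needed later, one possible approach (an assumption, not something this notebook does) is to coerce with `pd.to_numeric` and keep a separate flag for the non-vintage rows.

```python
import pandas as pd

# Vintage values as they appear in the sample output, including the non-vintage marker "NV".
vintage = pd.Series(["2016", "1994", "NV", "1988"], name="Vintage")

# errors="coerce" turns anything that is not a number into NaN instead of raising.
vintage_year = pd.to_numeric(vintage, errors="coerce")

# Keep track of which rows were non-vintage before the coercion.
is_non_vintage = vintage.eq("NV")

print(pd.DataFrame({"Vintage": vintage, "Vintage_Year": vintage_year, "Non_Vintage": is_non_vintage}))
```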

In [16]:
df_GeoCache.shape

(1226, 3)

In [17]:
df_GeoList.shape

(448, 1)

In [18]:
df_Wine.sample(10)

Unnamed: 0,Review_Year,Rank,Vintage,Score,Price,Winemaker,Wine,Wine_Style,Grape_Blend,Blend_List,Geography,Cases_Made,Cases_Imported,Reviewer,Drink_now,Best_Drink_from,Best_Drink_Through,Review
81,2020.0,82,2016,90.0,20,Tormaresca,Primitivo Salento Torcicoda,Red,Primitivo,,Salento IGT,,4000.0,AN,1.0,2020.0,2025.0,"Well-balanced and fresh, this medium-bodied re..."
2449,1996.0,50,1994,93.0,40,Michel Colin-Deléger,Chassagne-Montrachet En Remilly,White,Chardonnay,,Chassagne-Montrachet En Remilly,408.0,,,,1999.0,,"Seductive and charming, boasting tons of ripe ..."
2622,1994.0,23,NV,93.0,30,Piper-Heidsieck,Brut Rosé Champagne,Sparkling,Champagne,,Champagne,9000.0,,,,,,"Has the taste of authenticity, with a provocat..."
3038,1990.0,39,1988,90.0,8,Rosemount,Shiraz Hunter Valley,Red,Shiraz | Syrah,,Hunter Valley,6000.0,,,,1993.0,,"An outstanding red wine. Very ripe and rich, w..."
416,2016.0,17,2014,93.0,32,Merry Edwards,Sauvignon Blanc Russian River Valley,White,Sauvignon Blanc,,Russian River Valley,9500.0,,MW,1.0,2016.0,,"Succulent, lush and rich, with fleshy mango, m..."
35,2020.0,36,2015,92.0,24,Marchesi de' Frescobaldi,Chianti Classico Tenuta Perano,Red,Chianti,,Chianti,,2000.0,BS,,2021.0,2033.0,The core black cherry and blackberry flavors a...
412,2016.0,13,2013,97.0,70,Reynvaan,Syrah Walla Walla Valley In The Rocks,Red,Shiraz | Syrah,,Walla Walla Valley,602.0,,HS,1.0,2016.0,2025.0,"Supple and expressive, this opens up like a gi..."
212,2018.0,13,2013,96.0,66,Produttori del Barbaresco,Barbaresco Rabajà Riserva,Red,Blend,Nebbiolo,Barbaresco,1420.0,,BS,,2023.0,2038.0,"This fruity version features floral, cherry an..."
842,2012.0,43,2010,91.0,18,Château de la Greffière,Mâcon-La Roche Vineuse Vieilles Vignes,White,Chardonnay,,Mâcon-La Roche Vineuse,,2000.0,BS,1.0,2012.0,2016.0,A light oak influence adds roundness and a hin...
2706,1993.0,7,1990,94.0,15,Mount Veeder,Cabernet Sauvignon Napa Valley,Red,Cabernet Sauvignon,,Napa Valley,2400.0,,,,,,Opulent and chewy with lots of ripe fruit flav...


In [19]:
df_GeoCache.sample(10)

Unnamed: 0,Geography,Hierarchy,Address
966,Sonoma County,Hierarchy_02,"North Coast, California, USA"
186,Greco di Tufo,Hierarchy_00,Italy
1205,Puligny-Montrachet Les Combettes,Hierarchy_05,"Puligny-Montrachet Les Combettes, Puligny-Mont..."
823,Montlouis,Hierarchy_02,"Touraine, Loire, France"
249,Hawkes Bay,Hierarchy_00,New Zealand
218,Alto Adige Terlano,Hierarchy_00,Italy
970,Santa Cruz Mountains,Hierarchy_02,"San Francisco Bay, California Central Coast, USA"
1199,Chassagne-Montrachet Morgeot,Hierarchy_05,"Chassagne-Montrachet Morgeot, Chassagne-Montra..."
388,Kremstal,Hierarchy_01,"Kremstal, Austria"
514,Vacqueyras,Hierarchy_01,"Rhône, France"


In [20]:
df_GeoList.sample(10)

Unnamed: 0,Address
92,"Chehalem Mountains, Willamette Valley, Oregon,..."
266,"New Mexico, USA"
363,"Sicilia, Italy"
140,"Douro, Portugal"
254,"Moulis-en-Médoc, Médoc, Bordeaux, France"
3,"Aglianico del Vulture, Basilicata, Italy"
144,"Eden Valley, Barossa, South Australia, Australia"
172,"Haut-Médoc, Médoc, Bordeaux, France"
253,"Moulin-à-Vent, Beaujolais, France"
328,"Rioja, Spain"


### Geocode the Address dataframe
Reference: [Python’s geocoding — Convert a list of addresses into a map](https://towardsdatascience.com/pythons-geocoding-convert-a-list-of-addresses-into-a-map-f522ef513fd6)

In [21]:
# Initialize Nominatim into geolocator variable.
geolocator = Nominatim(user_agent='wine app')

In [22]:
geolocator.geocode('Castilla y León, Spain').raw

{'place_id': 258252333,
 'licence': 'Data © OpenStreetMap contributors, ODbL 1.0. https://osm.org/copyright',
 'osm_type': 'relation',
 'osm_id': 349041,
 'boundingbox': ['40.0824504', '43.2382034', '-7.077073', '-1.7753716'],
 'lat': '41.8037172',
 'lon': '-4.7471726',
 'display_name': 'Castilla y León, España',
 'class': 'boundary',
 'type': 'administrative',
 'importance': 0.9625997816800999,
 'icon': 'https://nominatim.openstreetmap.org/ui/mapicons//poi_boundary_administrative.p.20.png'}

In [23]:
geolocator.geocode('Castilla y León, Spain').point

Point(41.8037172, -4.7471726, 0.0)

In [24]:
# Apply geolocator to the Address column in the GeoList dataframe.
df_GeoList['loc'] = df_GeoList['Address'].apply(geolocator.geocode)
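
Applying `geolocator.geocode` directly sends one request per row with no pause in between, while the public Nominatim server asks for at most one request per second. geopy provides `RateLimiter` in `geopy.extra.rate_limiter` for this purpose. The sketch below shows the same idea in plain Python so that it runs without network access: `make_polite_geocoder` and `fake_geocode` are hypothetical names, and the wrapper also caches results so a repeated address is only looked up once.

```python
import time


def make_polite_geocoder(geocode_fn, min_delay_seconds=1.0):
    """Wrap geocode_fn so repeated addresses hit a cache and new lookups are spaced out."""
    cache = {}
    last_call = None

    def geocode(address):
        nonlocal last_call
        if address in cache:
            return cache[address]
        if last_call is not None:
            # Sleep only for the part of the delay that has not already elapsed.
            wait = min_delay_seconds - (time.monotonic() - last_call)
            if wait > 0:
                time.sleep(wait)
        result = geocode_fn(address)
        last_call = time.monotonic()
        cache[address] = result  # failed lookups (None) are cached as well
        return result

    return geocode


# Stand-in for geolocator.geocode so the sketch runs offline.
calls = []


def fake_geocode(address):
    calls.append(address)
    return None if "Vulture" in address else (address, (0.0, 0.0))


polite_geocode = make_polite_geocoder(fake_geocode, min_delay_seconds=0.0)
addresses = ["Rioja, Spain", "Rioja, Spain", "Aglianico del Vulture, Basilicata, Italy"]
results = [polite_geocode(address) for address in addresses]

print(len(calls))  # number of lookups actually sent to fake_geocode
```

In the notebook, the wrapped function would take the place of `geolocator.geocode` in the `.apply` call, with the default delay of one second.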

In [25]:
# Extract .point (latitude, longitude, altitude) from each geocode result; keep None where geocoding failed.
df_GeoList['point'] = df_GeoList['loc'].apply(lambda loc: tuple(loc.point) if loc else None)

In [26]:
# Split the .point column into separate columns for lat, long, and altitude
df_GeoList[['lat', 'long', 'altitude']] = pd.DataFrame(df_GeoList['point'].to_list(), index=df_GeoList.index)

In [27]:
df_GeoList

Unnamed: 0,Address,loc,point,lat,long,altitude
0,"Abruzzo, Italy","(Abruzzo, Italia, (42.227681, 13.854983))","(42.227681, 13.854983, 0.0)",42.227681,13.854983,0.0
1,"Adelaide Hills, South Australia, Australia","(Adelaide Hills Council, South Australia, Aust...","(-34.901351649999995, 138.8293202817461, 0.0)",-34.901352,138.829320,0.0
2,"Aegean Islands, Greece","(Aegean, Σάμη - Αγία Ευφημία, Καραβόμυλος, Δήμ...","(38.2504094, 20.6304217, 0.0)",38.250409,20.630422,0.0
3,"Aglianico del Vulture, Basilicata, Italy",,,,,
4,"Agrelo, Mendoza, Argentina","(Agrelo, Distrito Agrelo, Departamento Luján d...","(-33.1184629, -68.8859261, 0.0)",-33.118463,-68.885926,0.0
5,"Alba, Piedmont | Piemonte, Italy",,,,,
6,"Alentejo, Portugal","(Alentejo, Portugal, (38.0551003, -7.8605799))","(38.0551003, -7.8605799, 0.0)",38.055100,-7.860580,0.0
7,"Alexander Valley, Sonoma County, North Coast, ...",,,,,
8,"Alicante, Valencia, Spain","(Alacant / Alicante, l'Alacantí, Alacant / Ali...","(38.353738, -0.4901846, 0.0)",38.353738,-0.490185,0.0
9,"Almansa, Castilla La Mancha, Spain","(Almansa, Albacete, Castilla-La Mancha, 02640,...","(38.8682065, -1.0978627, 0.0)",38.868206,-1.097863,0.0
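
Several addresses in the output above came back empty (rows 3, 5, and 7). A quick way to list the unresolved addresses and the share that did resolve is sketched below. `geo_sample` is a four-row stand-in built from the rows shown above; in the notebook the same expressions would apply to `df_GeoList`.

```python
import pandas as pd

# Four rows copied from the output above; None marks a failed lookup.
geo_sample = pd.DataFrame({
    "Address": [
        "Abruzzo, Italy",
        "Aglianico del Vulture, Basilicata, Italy",
        "Agrelo, Mendoza, Argentina",
        "Alba, Piedmont | Piemonte, Italy",
    ],
    "lat": [42.227681, None, -33.118463, None],
})

# Addresses that Nominatim could not resolve.
unresolved = geo_sample.loc[geo_sample["lat"].isna(), "Address"].tolist()

# Fraction of addresses that received coordinates.
hit_rate = geo_sample["lat"].notna().mean()

print(unresolved)
print(hit_rate)
```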


### Append geography details to the GeoCache dataframe
Merge the geocoded addresses into df_GeoCache to see how well geography is populated at each hierarchy level.

In [28]:
df_GeoCache = pd.merge(df_GeoCache, df_GeoList, on = 'Address', how = 'left' )
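
The merge itself does not report how well each level is populated. One way to measure it is to count, per `Hierarchy` value, how many rows received a latitude. The sketch below uses `cache_sample`, a hypothetical four-row frame shaped like the merged `df_GeoCache`; its rows are illustrative, not taken from the real data.

```python
import pandas as pd

# Hypothetical rows shaped like the merged df_GeoCache (illustrative values only).
cache_sample = pd.DataFrame({
    "Geography": ["Tuscany", "Oregon", "Agrelo", "Alexander Valley"],
    "Hierarchy": ["Hierarchy_00", "Hierarchy_00", "Hierarchy_01", "Hierarchy_02"],
    "lat": [42.638426, 39.78373, -34.787093, None],
})

# "size" counts all rows per level; "count" counts only rows with a non-null latitude.
coverage = cache_sample.groupby("Hierarchy")["lat"].agg(total="size", geocoded="count")
coverage["share"] = coverage["geocoded"] / coverage["total"]

print(coverage)
```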

In [29]:
df_GeoCache.to_csv(path_or_buf = './GeoCache.csv', index = False)

### Append Hierarchy 00 details to the df_Wine dataset

In [30]:
# filter df_GeoCache to Hierarchy_00

df_GeoCache00 = df_GeoCache[
    (df_GeoCache.Hierarchy == 'Hierarchy_00')
]

df_GeoCache00.sample(10)

Unnamed: 0,Geography,Hierarchy,Address,loc,point,lat,long,altitude
340,Oregon,Hierarchy_00,USA,"(United States, (39.7837304, -100.4458825))","(39.7837304, -100.4458825, 0.0)",39.78373,-100.445882,0.0
222,Bolgheri,Hierarchy_00,Italy,"(Italia, (42.6384261, 12.674297))","(42.6384261, 12.674297, 0.0)",42.638426,12.674297,0.0
295,Contra Costa County,Hierarchy_00,USA,"(United States, (39.7837304, -100.4458825))","(39.7837304, -100.4458825, 0.0)",39.78373,-100.445882,0.0
115,Vosne-Romanée Clos des Réas,Hierarchy_00,France,"(France, (46.603354, 1.8883335))","(46.603354, 1.8883335, 0.0)",46.603354,1.888334,0.0
200,Marche,Hierarchy_00,Italy,"(Italia, (42.6384261, 12.674297))","(42.6384261, 12.674297, 0.0)",42.638426,12.674297,0.0
165,IGP Var,Hierarchy_00,France,"(France, (46.603354, 1.8883335))","(46.603354, 1.8883335, 0.0)",46.603354,1.888334,0.0
92,Puligny-Montrachet Les Combettes,Hierarchy_00,France,"(France, (46.603354, 1.8883335))","(46.603354, 1.8883335, 0.0)",46.603354,1.888334,0.0
242,Verona IGT,Hierarchy_00,Italy,"(Italia, (42.6384261, 12.674297))","(42.6384261, 12.674297, 0.0)",42.638426,12.674297,0.0
224,Carmignano,Hierarchy_00,Italy,"(Italia, (42.6384261, 12.674297))","(42.6384261, 12.674297, 0.0)",42.638426,12.674297,0.0
235,Tuscany,Hierarchy_00,Italy,"(Italia, (42.6384261, 12.674297))","(42.6384261, 12.674297, 0.0)",42.638426,12.674297,0.0


In [31]:
df_Wine00 = pd.merge(df_Wine, df_GeoCache00, on = 'Geography', how = 'left')

df_Wine00.sample(10)

Unnamed: 0,Review_Year,Rank,Vintage,Score,Price,Winemaker,Wine,Wine_Style,Grape_Blend,Blend_List,...,Best_Drink_from,Best_Drink_Through,Review,Hierarchy,Address,loc,point,lat,long,altitude
2636,1994.0,34,1992,92.0,14,Robert Mondavi,Zinfandel Napa Valley,Red,Zinfandel,,...,1994.0,,"Ripe, rich and supple, with plush cherry, rasp...",Hierarchy_00,USA,"(United States, (39.7837304, -100.4458825))","(39.7837304, -100.4458825, 0.0)",39.78373,-100.445882,0.0
1705,2003.0,4,2001,95.0,60,Clos Mogador,Priorat,Red,Blend,Rare Red Blend,...,2005.0,,"Suave, sophisticated, rich and concentrated, b...",Hierarchy_00,Spain,"(España, (39.3262345, -4.8380649))","(39.3262345, -4.8380649, 0.0)",39.326234,-4.838065,0.0
1136,2009.0,35,2006,93.0,32,Viticcio,Chianti Classico Riserva,Red,Chianti,,...,2010.0,2015.0,"Fabulous aromas of blackberry, dark chocolate ...",Hierarchy_00,Italy,"(Italia, (42.6384261, 12.674297))","(42.6384261, 12.674297, 0.0)",42.638426,12.674297,0.0
252,2018.0,53,2015,93.0,42,Sequoia Grove,Cabernet Sauvignon Napa Valley,Red,Cabernet Sauvignon,,...,2018.0,2030.0,"Rich, smoky oak and plump dark berry, mocha, c...",Hierarchy_00,USA,"(United States, (39.7837304, -100.4458825))","(39.7837304, -100.4458825, 0.0)",39.78373,-100.445882,0.0
971,2011.0,70,2009,91.0,20,Morgan,Chardonnay Monterey Metallico Un-Oaked,White,Chardonnay,,...,2011.0,2016.0,"Fresh, intense and vibrant, with an aromatic h...",Hierarchy_00,USA,"(United States, (39.7837304, -100.4458825))","(39.7837304, -100.4458825, 0.0)",39.78373,-100.445882,0.0
2326,1997.0,25,1993,96.0,100,Castello di Ama,Vigna l'Apparita,Red,Merlot,,...,2000.0,,"Powerful and exotic, this is the wine of the y...",Hierarchy_00,Italy,"(Italia, (42.6384261, 12.674297))","(42.6384261, 12.674297, 0.0)",42.638426,12.674297,0.0
749,2013.0,48,2011,93.0,39,Greywacke,Pinot Noir Marlborough,Red,Pinot Noir,,...,2013.0,2024.0,"Elegant, with supple, fresh and lively flavors...",Hierarchy_00,New Zealand,"(New Zealand / Aotearoa, (-41.5000831, 172.834...","(-41.5000831, 172.8344077, 0.0)",-41.500083,172.834408,0.0
2455,1996.0,54,1994,92.0,20,Rosenblum,Zinfandel Mount Veeder Brandlin Ranch,Red,Zinfandel,,...,,,This wine has a great sense of harmony and fin...,Hierarchy_00,USA,"(United States, (39.7837304, -100.4458825))","(39.7837304, -100.4458825, 0.0)",39.78373,-100.445882,0.0
1990,2001.0,89,1999,90.0,24,Rex Hill,Pinot Noir Willamette Valley,Red,Pinot Noir,,...,2001.0,2006.0,"Ripe, round and distinctive for its layers of ...",Hierarchy_00,USA,"(United States, (39.7837304, -100.4458825))","(39.7837304, -100.4458825, 0.0)",39.78373,-100.445882,0.0
861,2012.0,60,2009,96.0,135,Quilceda Creek,Cabernet Sauvignon Columbia Valley,Red,Cabernet Sauvignon,,...,2015.0,2024.0,"Pure and impressively expressive, with focused...",Hierarchy_00,USA,"(United States, (39.7837304, -100.4458825))","(39.7837304, -100.4458825, 0.0)",39.78373,-100.445882,0.0
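
A left merge on `Geography` keeps every wine, but it would duplicate wines if the cache held the same geography twice at one level, and it silently leaves `NaN` where a geography has no cache entry. Both cases can be checked with the `validate` and `indicator` parameters of `pd.merge`. The frames below are hypothetical stand-ins for `df_Wine` and `df_GeoCache00`.

```python
import pandas as pd

# Hypothetical stand-ins for df_Wine and df_GeoCache00.
wines = pd.DataFrame({
    "Wine": ["Wine A", "Wine B", "Wine C"],
    "Geography": ["Napa Valley", "Priorat", "Marlborough"],
})
cache00 = pd.DataFrame({
    "Geography": ["Napa Valley", "Priorat"],
    "Address": ["USA", "Spain"],
})

# validate="many_to_one" raises MergeError if Geography is duplicated in cache00.
# indicator=True adds a _merge column marking rows found only in the left frame.
merged = pd.merge(
    wines, cache00, on="Geography", how="left",
    validate="many_to_one", indicator=True,
)

# Geographies that have no entry in the cache at this level.
unmatched = merged.loc[merged["_merge"] == "left_only", "Geography"].tolist()

print(len(merged) == len(wines))
print(unmatched)
```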


### Append Hierarchy 01 details to the df_Wine dataset

In [32]:
# filter df_GeoCache to Hierarchy_01

df_GeoCache01 = df_GeoCache[
    (df_GeoCache.Hierarchy == 'Hierarchy_01')
]

df_GeoCache01.sample(10)

Unnamed: 0,Geography,Hierarchy,Address,loc,point,lat,long,altitude
663,Mendocino County,Hierarchy_01,"California, USA","(California, United States, (36.7014631, -118....","(36.7014631, -118.755997, 0.0)",36.701463,-118.755997,0.0
356,Agrelo,Hierarchy_01,"Mendoza, Argentina","(Mendoza, Argentina, (-34.787093049999996, -68...","(-34.787093049999996, -68.43818677312292, 0.0)",-34.787093,-68.438187,0.0
456,Chambolle-Musigny,Hierarchy_01,"Burgundy, France","(Bourgogne, France métropolitaine, France, (47...","(47.27808725, 4.222486304306048, 0.0)",47.278087,4.222486,0.0
652,Edna Valley,Hierarchy_01,"California, USA","(California, United States, (36.7014631, -118....","(36.7014631, -118.755997, 0.0)",36.701463,-118.755997,0.0
697,Horse Heaven Hills,Hierarchy_01,"Washington, USA","(Washington, District of Columbia, United Stat...","(38.8949924, -77.0365581, 0.0)",38.894992,-77.036558,0.0
490,Savennières,Hierarchy_01,"Loire, France","(Loire, Auvergne-Rhône-Alpes, France métropoli...","(45.75385355, 4.045473682551104, 0.0)",45.753854,4.045474,0.0
430,Chassagne-Montrachet En Remilly,Hierarchy_01,"Burgundy, France","(Bourgogne, France métropolitaine, France, (47...","(47.27808725, 4.222486304306048, 0.0)",47.278087,4.222486,0.0
454,Bonnes Mares,Hierarchy_01,"Burgundy, France","(Bourgogne, France métropolitaine, France, (47...","(47.27808725, 4.222486304306048, 0.0)",47.278087,4.222486,0.0
461,Marsannay,Hierarchy_01,"Burgundy, France","(Bourgogne, France métropolitaine, France, (47...","(47.27808725, 4.222486304306048, 0.0)",47.278087,4.222486,0.0
614,Constantia,Hierarchy_01,"Western Cape, South Africa","(Western Cape, South Africa, (-33.546977, 20.7...","(-33.546977, 20.72753, 0.0)",-33.546977,20.72753,0.0
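
Some free-text lookups in the sample above resolve to the wrong place: `Washington, USA` comes back as Washington, District of Columbia (38.89, -77.04) rather than Washington state. geopy's `Nominatim.geocode` also accepts a structured query, a dictionary with keys such as `state` and `country`, which may help with this kind of ambiguity. The helper below is a hypothetical sketch: it assumes that two-part addresses follow a "region, country" pattern, and whether the region maps well to Nominatim's `state` field for every country has not been verified here.

```python
def to_structured_query(address):
    """Turn 'Region, Country' into a structured query dict; other shapes pass through unchanged."""
    parts = [part.strip() for part in address.split(",")]
    if len(parts) == 2:
        # Assumption: the first part is a state-level region of the second.
        return {"state": parts[0], "country": parts[1]}
    return address


query = to_structured_query("Washington, USA")
untouched = to_structured_query("Eden Valley, Barossa, South Australia, Australia")

print(query)

# Not run here because it needs network access:
# loc = geolocator.geocode(query)
```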


In [33]:
df_Wine01 = pd.merge(df_Wine, df_GeoCache01, on = 'Geography', how = 'left')

df_Wine01.sample(10)

Unnamed: 0,Review_Year,Rank,Vintage,Score,Price,Winemaker,Wine,Wine_Style,Grape_Blend,Blend_List,...,Best_Drink_from,Best_Drink_Through,Review,Hierarchy,Address,loc,point,lat,long,altitude
1119,2009.0,18,1999,95.0,60,Argyle,Extended Tirage Willamette Valley,Sparkling,Chardonnay - Pinot Noir,,...,2009.0,,"Elegant, with very fine bubbles and complex sp...",Hierarchy_01,"Oregon, USA","(Oregon, United States, (43.9792797, -120.7372...","(43.9792797, -120.737257, 0.0)",43.97928,-120.737257,0.0
494,2016.0,94,2014,90.0,32,William Fèvre,Chablis Domaine,White,Chardonnay,,...,2016.0,2018.0,"This is delicate, featuring lemon, herb and st...",Hierarchy_01,"Burgundy, France","(Bourgogne, France métropolitaine, France, (47...","(47.27808725, 4.222486304306048, 0.0)",47.278087,4.222486,0.0
1176,2009.0,75,2008,90.0,14,M. Chapoutier,Côtes du Roussillon-Villages Les Vignes de Bil...,Red,Blend,Carignan - Grenache – Syrah,...,2010.0,2014.0,This muscular red shows concentrated flavors o...,Hierarchy_01,"Languedoc-Roussillon, France","(Languedoc-Roussillon, France métropolitaine, ...","(43.65420305, 3.674669940206605, 0.0)",43.654203,3.67467,0.0
1823,2002.0,22,1997,98.0,89,Altesino,Brunello di Montalcino Montosoli,Red,Brunello di Montalcino,,...,2004.0,,"Complex aromas of grilled meat and cherry, wit...",Hierarchy_01,"Tuscany, Italy","(Toscana, Italia, (43.4586541, 11.1389204))","(43.4586541, 11.1389204, 0.0)",43.458654,11.13892,0.0
1544,2005.0,43,2003,93.0,16,Finca Luzón,Jumilla Altos de Luzón,Red,Blend,"Monastrell (Mourvèdre), Cabernet Sauvignon and...",...,2005.0,2010.0,Ripe and luscious. This rich red is bursting w...,Hierarchy_01,"Murcia, Spain","(Murcia, Área Metropolitana de Murcia, Región ...","(37.9923795, -1.1305431, 0.0)",37.992379,-1.130543,0.0
147,2019.0,48,2015,93.0,40,G.D. Vajra,Barolo Albe,Red,Blend,Nebbiolo,...,2021.0,2036.0,"Pretty, featuring rose, cherry, raspberry and ...",Hierarchy_01,"Piedmont | Piemonte, Italy","(Piedmont Properties, 78, SP50, San Marzano Ol...","(44.7605629, 8.2998538, 0.0)",44.760563,8.299854,0.0
1151,2009.0,50,2006,93.0,45,Tablas Creek,Esprit de Beaucastel Paso Robles,Red,Blend,"Mourvèdre, Grenache, Syrah and Counoise",...,2009.0,2014.0,"Well-balanced, intense yet elegant. Full-bodie...",Hierarchy_01,"California, USA","(California, United States, (36.7014631, -118....","(36.7014631, -118.755997, 0.0)",36.701463,-118.755997,0.0
938,2011.0,37,2009,93.0,30,Tablas Creek,Côtes de Tablas Paso Robles,Red,Blend,"Grenache, Syrah, Counoise and Mourvèdre",...,2011.0,2017.0,"Charmingly fruity, supple and fun to drink, ex...",Hierarchy_01,"California, USA","(California, United States, (36.7014631, -118....","(36.7014631, -118.755997, 0.0)",36.701463,-118.755997,0.0
835,2012.0,34,2010,93.0,29,Bertrand Stehelin,Gigondas,Red,Blend,Southern Rhone Red Blend,...,2014.0,2022.0,"Powerfully rendered, with thickly layered blac...",Hierarchy_01,"Rhône, France","(Rhône, Circonscription départementale du Rhôn...","(45.8802348, 4.564533629559522, 0.0)",45.880235,4.564534,0.0
783,2013.0,82,2009,91.0,30,Mamete Prevostini,Valtellina Superiore Sassella,Red,Nebbiolo,,...,2013.0,2023.0,"Expressive, with a floral note, hints of aroma...",Hierarchy_01,"Lombardy, Italy","(Lombardia, Italia, (45.5703694, 9.7732524))","(45.5703694, 9.7732524, 0.0)",45.570369,9.773252,0.0


### Save files for use in other notebooks

In [34]:
df_Wine00.to_csv(path_or_buf = './Wine_Hier00.csv', index = False)
df_Wine01.to_csv(path_or_buf = './Wine_Hier01.csv', index = False)
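
The filter, merge, and save steps are repeated for each hierarchy level, and the copy-paste is easy to get out of sync. A loop over the level names is one way to avoid that. In the sketch below, `merge_hierarchy_levels` is a hypothetical helper, and the two small frames stand in for `df_Wine` and `df_GeoCache`.

```python
import pandas as pd


def merge_hierarchy_levels(wines, geo_cache, levels):
    """Return a dict mapping each hierarchy level to wines left-merged with that level's cache rows."""
    merged = {}
    for level in levels:
        level_cache = geo_cache[geo_cache["Hierarchy"] == level]
        merged[level] = pd.merge(wines, level_cache, on="Geography", how="left")
    return merged


# Hypothetical stand-ins for df_Wine and df_GeoCache.
wines = pd.DataFrame({"Wine": ["Wine A", "Wine B"], "Geography": ["Chianti", "Gigondas"]})
geo_cache = pd.DataFrame({
    "Geography": ["Chianti", "Gigondas", "Chianti", "Gigondas"],
    "Hierarchy": ["Hierarchy_00", "Hierarchy_00", "Hierarchy_01", "Hierarchy_01"],
    "Address": ["Italy", "France", "Tuscany, Italy", "Rhône, France"],
})

by_level = merge_hierarchy_levels(wines, geo_cache, ["Hierarchy_00", "Hierarchy_01"])

for level, frame in by_level.items():
    # level[-2:] gives "00" or "01", matching the file names used above, e.g.:
    # frame.to_csv(f"./Wine_Hier{level[-2:]}.csv", index=False)
    print(level, frame["Address"].tolist())
```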