# Get Weather Data

In [1]:
import sys
from datetime import datetime
import pandas as pd
sys.path.append('..')
from WeatherFetcher import WeatherFetcher

The WeatherFetcher class requires a pandas DataFrame with the columns
'Latitude', 'Longitude', and 'Vintage Year', giving the coordinates where the
grapes were grown and the year in which they were harvested.

In this example, we are going to request historical weather data for a 2019 Napa
Valley wine (located at 38.4274 latitude, -122.3943 longitude).
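
As a minimal sketch, the single-wine input described above could be built by hand as follows. The column names are taken from the description above, and the longitude is written as a negative number on the assumption that the class expects signed decimal degrees (Napa Valley lies west of the prime meridian). Neither has been verified against the `WeatherFetcher` source, and the cell that follows uses lowercase column names, so check the class for the exact names it expects.

```python
import pandas as pd

# One row per wine: where the grapes were grown and the harvest year.
# Column names follow the prose above; they are an assumption about what
# WeatherFetcher expects, not something verified against its source.
napa_2019 = pd.DataFrame(
    {
        "Latitude": [38.4274],
        "Longitude": [-122.3943],  # assumed signed decimal degrees (west is negative)
        "Vintage Year": [2019],
    }
)

# Fail early if a required column is missing.
required = {"Latitude", "Longitude", "Vintage Year"}
missing = required - set(napa_2019.columns)
if missing:
    raise ValueError(f"missing required columns: {sorted(missing)}")
```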

In [2]:
# Combine coordinate data with processed data
coord_df = pd.read_csv('../data/AVA_coordinates.csv')
df = pd.read_csv('../data/wine_processed.csv').drop(columns=['Unnamed: 0'])
# Inner join: keeps only wines whose region has a matching AVA entry
df = coord_df.merge(df, left_on='AVA', right_on='region')
# One row per unique location and vintage
df = df[['latitude', 'longitude', 'region', 'vintage']].drop_duplicates().reset_index(drop=True)
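
Because `merge` defaults to an inner join, any region in the wine data without a matching `AVA` row is dropped silently. Here is a minimal sketch of one way to check for that, using small made-up frames in place of the two CSV files (the region names, coordinates, and the `unmatched` variable are placeholders for illustration only):

```python
import pandas as pd

# Toy stand-ins for AVA_coordinates.csv and wine_processed.csv.
coords = pd.DataFrame(
    {
        "AVA": ["region_a", "region_b"],
        "latitude": [10.0, 20.0],
        "longitude": [-100.0, -110.0],
    }
)
wines = pd.DataFrame(
    {
        "region": ["region_a", "region_a", "region_c"],
        "vintage": [2018, 2019, 2019],
    }
)

# A left join from the wine side keeps every wine row; indicator=True adds a
# "_merge" column that marks rows with no matching coordinates as "left_only".
checked = wines.merge(
    coords, left_on="region", right_on="AVA", how="left", indicator=True
)
unmatched = sorted(checked.loc[checked["_merge"] == "left_only", "region"].unique())
print(unmatched)
```

An empty list means every region found its coordinates. Anything listed would be missing from `df` after the inner join.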

Now we will initialize a WeatherFetcher object that will organize the requested
data.

In [4]:
wf = WeatherFetcher(data = df)
wf.query_weather()
wf.clean_weather_data()
wf.filter_weather_data()

Data has been pulled for napa (1996 - 2020)
Data has been pulled for RRV (1996 - 2020)
Data has been pulled for sonoma (1996 - 2020)
Data has been pulled for carneros (1996 - 2020)
Data has been pulled for anderson valley (2002 - 2020)
Data has been pulled for santa barbara (1996 - 2020)
Data has been pulled for central coast (1996 - 2020)
Data has been pulled for north coast (1998 - 2020)
Data has been pulled for south coast (1998 - 2020)
Data has been pulled for sierra foothills (1996 - 2020)
Data has been pulled for willamette valley (1996 - 2020)
Data has been pulled for columbia valley (1996 - 2020)



Let's look at the output for one region and vintage:

In [None]:
wf.output['napa-2005']

Unnamed: 0,time,tavg,tmin,tmax,prcp,snow,wspd,pres
1782,2005-03-01,9.6,3.9,15.0,7.900000,0.0,9.3,1.003504
1783,2005-03-02,10.8,6.7,16.7,1.500000,0.0,6.1,1.004194
1784,2005-03-03,10.2,3.9,15.6,0.144702,0.0,9.2,1.003405
1785,2005-03-04,11.4,7.2,16.1,0.144702,0.0,5.0,1.003306
1786,2005-03-05,10.9,3.3,18.9,0.300000,0.0,6.5,1.004013
...,...,...,...,...,...,...,...,...
2019,2005-10-27,12.4,7.8,17.8,0.000000,0.0,9.1,1.001332
2020,2005-10-28,14.1,11.7,17.8,0.032930,0.0,10.0,1.001199
2021,2005-10-29,13.8,7.2,19.4,0.032930,0.0,6.5,1.001199
2022,2005-10-30,10.8,1.7,20.6,0.000000,0.0,5.8,1.011695


Perfect, now write everything out to CSV files.

In [None]:
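for k, v in wf.output.items():
    # pd.notna(v) returns a DataFrame when v is a DataFrame, which raises a
    # ValueError inside `if`; test the type instead to skip empty entries
    if isinstance(v, pd.DataFrame):
        v.to_csv(f'../data/weather1/{k}.csv', index=False)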
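
`to_csv` does not create missing directories, so the loop above fails if `../data/weather1/` does not exist yet. Here is a minimal sketch of the same write-out with the directory created first. It uses a toy dictionary and a temporary directory as stand-ins for `wf.output` and the real data folder. Both are assumptions for illustration, as is the idea that region-years without data are stored as `None`.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Toy stand-in for wf.output: one populated entry, one empty entry.
output = {
    "region_a-2019": pd.DataFrame({"time": ["2019-03-01"], "tavg": [9.6]}),
    "region_b-2019": None,
}

# Temporary folder used here in place of ../data/weather1
out_dir = Path(tempfile.mkdtemp()) / "weather1"
out_dir.mkdir(parents=True, exist_ok=True)  # create the folder before writing

for key, frame in output.items():
    # Only write entries that actually hold a DataFrame.
    if isinstance(frame, pd.DataFrame):
        frame.to_csv(out_dir / f"{key}.csv", index=False)

written = sorted(p.name for p in out_dir.glob("*.csv"))
print(written)
```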