## Import CSV of California vineyard locations
The Discover California Wines winery directory (http://www.discovercaliforniawines.com/discover-california/wine-map-winery-directory) lists vineyards across California. I used this directory to collect vineyard addresses from around the state.

<img src="wine-directory.png">

Some of the addresses listed were showrooms, so I filtered those out of the list since we only want to collect information on where the grapes are grown.

## Adding in non-vineyards
I then added in addresses of locations in California which aren't vineyards. I realize that this isn't perfect because a **certain location might be great for a vineyard, but just doesn't happen to have one located there.** Because of this, I don't expect my model to reach extremely high accuracy, since the data will be a bit noisy. However, I'm still hopeful that it will give some indication of whether or not my land can grow grapes. :)

Here's what the CSV data looks like:

In [2]:
import pandas as pd
from tabulate import tabulate
import time
from datetime import timedelta
address_df = pd.read_csv('CaliforniaVineyards.csv', encoding='cp1252')

#Pretty print the address dataframe
print(tabulate(address_df.head(10), headers=['', 'IsVineyard', 'Name', 'Address'], tablefmt= 'grid'))

#Split dataframe into a training & test set
train_df = address_df[0:-30]
test_df = address_df[-30:]

+----+--------------+--------+---------------------------------------------------+
|    |   IsVineyard |   Name | Address                                           |
+====+==============+========+===================================================+
|  0 |            0 |    nan | 0 Batiquitos Dr, Carlsbad, CA                     |
+----+--------------+--------+---------------------------------------------------+
|  1 |            0 |    nan | 0 California Ave, Hemet, CA 92545                 |
+----+--------------+--------+---------------------------------------------------+
|  2 |            0 |    nan | 0 Chevy Chase, Glendale, CA 91206                 |
+----+--------------+--------+---------------------------------------------------+
|  3 |            0 |    nan | 0 Cinnamon Rock, Ramona, CA 92065                 |
+----+--------------+--------+---------------------------------------------------+
|  4 |            0 |    nan | 0 Cloverdale Heights Rd, Cloverdale, CA 95425     |
+----+--------------+--------+---------------------------------------------------+
|  5
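One caveat on the split above: the rows appear to be sorted by address, so slicing off the last 30 rows gives a test set drawn from one end of that ordering. A minimal sketch of a shuffled split follows, assuming a fixed seed for reproducibility. `split_indices` and its parameters are hypothetical names, not part of the original notebook.

```python
import random

def split_indices(n_rows, n_test=30, seed=0):
    """Return (train_idx, test_idx) as shuffled, non-overlapping row positions."""
    if not 0 <= n_test <= n_rows:
        raise ValueError('n_test must be between 0 and n_rows')
    idx = list(range(n_rows))
    # A dedicated Random instance keeps the shuffle reproducible
    random.Random(seed).shuffle(idx)
    # The last n_test shuffled positions become the test set
    return idx[:n_rows - n_test], idx[n_rows - n_test:]

train_idx, test_idx = split_indices(100, n_test=30, seed=42)
```

With pandas, these positions could then be applied as `address_df.iloc[train_idx]` and `address_df.iloc[test_idx]`.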

# Use Google Maps & WeatherBit APIs to gather data on addresses
Import required packages and API keys

In [3]:
import googlemaps
from googlemaps import convert
from datetime import datetime
import numpy as np
import requests

#Update the config.py file with your own Google Maps & WeatherBit API keys to run
from config import gmap_key, wbit_key

gmaps = googlemaps.Client(key=gmap_key)
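The notebook expects a `config.py` next to it that defines `gmap_key` and `wbit_key`. Its contents aren't shown, so the following is only a sketch of one way to write it, reading the keys from environment variables. The variable names `GMAP_KEY` and `WBIT_KEY` and the `load_key` helper are assumptions.

```python
# config.py (hypothetical layout)
import os

def load_key(env_name):
    """Read an API key from the environment, failing loudly if it is missing."""
    value = os.environ.get(env_name, '').strip()
    if not value:
        raise RuntimeError('Set the ' + env_name + ' environment variable first')
    return value

# In a real config.py these two assignments would run at import time:
# gmap_key = load_key('GMAP_KEY')
# wbit_key = load_key('WBIT_KEY')
```

Keeping the keys out of the file itself makes it harder to commit them to version control by accident.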

Helper function which spits out latitude & longitude given a written address

In [4]:
def lat_lng(address, max_retries=3):
    for attempt in range(max_retries):
        try:
            # Geocode an address
            geocode_result = gmaps.geocode(address)

            # Grab the location values from the returned dictionary
            location = geocode_result[0].get('geometry').get('location')

            # Split into lat & long coordinates
            lat = location.get('lat')
            lng = location.get('lng')

            return (lat, lng)

        except Exception:
            print('Google Maps geocode failed - retrying')
            # Sleep for a bit in case that helps, then try again
            time.sleep(5)

    # Give up instead of retrying forever (e.g. when an address can't be geocoded)
    raise RuntimeError('Could not geocode address: ' + str(address))

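Create a helper function which takes latitude and longitude coordinates and returns a 5x5 matrix of elevation points around them. With the 0.0002° increment used below, neighboring points are only about 20 m apart, so the grid covers an area less than 100 m across rather than a full square kilometer.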

In [5]:
def elevation_matrix(lat, lng):
    #Create a 5 x 5 matrix of elevation points spaced lat_lng_increment degrees apart
    cols = 5
    rows = 5
    lat_lng_increment = .0002

    #Shift the lat & long starting coordinates by half the grid width (.0005) to roughly center the grid on the address
    lat = lat - (lat_lng_increment * cols / 2)
    lng = lng - (lat_lng_increment * rows / 2)

    array1 = []
    array2 = []
    for j in range(cols):
        #if we're on the first column, set the longitude back to its initial value
        if (j == 0):
            lng_j = lng

        for i in range(rows):
            #if we're on the first row, set the latitude back to its initial value
            if (i == 0):
                lat_i = lat

            #get elevation for incremented latitude & longitude point
            elevation = gmaps.elevation((lat_i, lng_j))[0].get('elevation')
            array1.append(elevation)
            lat_i = lat_i + lat_lng_increment

        lng_j = lng_j + lat_lng_increment
        array2.append(array1)
        array1 = []

    altitude_matrix = np.array(array2)
    return(altitude_matrix)
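To make the grid geometry concrete, here is a sketch that lists the sample coordinates and estimates how much ground they cover. It assumes roughly 111.32 km per degree of latitude (a common approximation) and centers the grid exactly by offsetting with `(n - 1) / 2` steps. `grid_points` and `grid_span_m` are hypothetical helpers, not part of the notebook.

```python
import math

METERS_PER_DEG_LAT = 111_320  # approximate value; an assumption for this sketch

def grid_points(lat, lng, n=5, step=0.0002):
    """Return an n x n nested list of (lat, lng) tuples centered on (lat, lng)."""
    half = (n - 1) / 2
    # Outer index walks longitude, inner index walks latitude,
    # matching the loop order in elevation_matrix
    return [[(lat + (i - half) * step, lng + (j - half) * step)
             for i in range(n)]
            for j in range(n)]

def grid_span_m(lat, n=5, step=0.0002):
    """Approximate (north-south, east-west) extent of the grid in meters."""
    span_deg = (n - 1) * step
    north_south = span_deg * METERS_PER_DEG_LAT
    # A degree of longitude shrinks with the cosine of the latitude
    east_west = north_south * math.cos(math.radians(lat))
    return north_south, east_west

points = grid_points(38.5, -122.5)
ns, ew = grid_span_m(38.5)
print(round(ns), round(ew))
```

At a latitude of 38.5° this works out to roughly 89 m north to south and 70 m east to west. Covering about 1 km with five points per side would need a step closer to 0.002°.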

# Grab historical weather from WeatherBit
The helper function below returns historical weather data for the dates passed to it, including:
* Wind direction
* Wind speed
* Precipitation
* Average temperature
* Minimum temperature
* Max temperature
* Cloud coverage
* GHI (Global Horizontal Irradiance) - aka solar radiation
* RH (Relative humidity)

In [6]:
def weather_hist(start_date, end_date, lat, lng, max_retries=3):
    wbit_url = 'http://api.weatherbit.io/v1.0/history/daily?key=' + wbit_key + '&lat=' + str(lat) + '&lon=' + str(lng) + '&start_date=' + start_date + '&end_date=' + end_date

    for attempt in range(max_retries):
        try:
            # Keep the first record from the 'data' list in the response
            r = requests.get(wbit_url).json().get('data')[0]
            return r

        except Exception:
            print(wbit_url)
            print('Weather history failed - retrying')
            # Sleep for a bit in case that helps, then try again
            time.sleep(5)

    # Give up instead of retrying forever
    raise RuntimeError('Weather history request kept failing for ' + start_date + ' to ' + end_date)
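The query string above is concatenated by hand. As an alternative sketch, the standard library's `urlencode` can assemble the same parameters and escape them. `build_wbit_url` is a hypothetical helper, and the endpoint and parameter names are simply the ones used in the function above.

```python
from urllib.parse import urlencode

WBIT_HISTORY_URL = 'http://api.weatherbit.io/v1.0/history/daily'

def build_wbit_url(key, lat, lng, start_date, end_date):
    """Build the daily-history request URL from its query parameters."""
    params = {
        'key': key,
        'lat': lat,
        'lon': lng,
        'start_date': start_date,
        'end_date': end_date,
    }
    return WBIT_HISTORY_URL + '?' + urlencode(params)

# 'MY_KEY' is a placeholder, not a real API key
url = build_wbit_url('MY_KEY', 38.5, -122.5, '2018-01-04', '2018-01-05')
```

`requests.get` also accepts a `params` dictionary directly, which avoids building the URL string at all.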

In [7]:
def land_data(df):
    year_offset = timedelta(days=364) #I know, I know, there aren't 364 days in a year, but this accounts for leap years + my limited 1 year of historical data
    end_date = pd.to_datetime('today')
    start_date = end_date - year_offset

    #1-year-ish of dates
    d = pd.date_range(start=start_date, end=end_date, freq='D')

    #A variable to store the last date to use in the range of the weather api data
    last_date_str = 0

    #Initialize np.arrays variables which will eventually be fed into our keras model
    is_vineyard = np.array([])
    elevation = np.array([])
    map_coords = np.array([])
    wind_dir = np.array([])
    wind_spd = np.array([])
    precip = np.array([])
    temp = np.array([])
    min_temp = np.array([])
    max_temp = np.array([])
    clouds = np.array([])
    ghi = np.array([])
    rh = np.array([])

    for index, row in df.iterrows():
        address = row['Address']
        print('Collecting data for address: ' + str(address))

        #Get numerical latitude and longitude values
        lat, lng = lat_lng(address)

        #Create blank arrays to store weather data for each address
        address_wind_dir = np.array([])
        address_wind_spd = np.array([])
        address_precip = np.array([])
        address_temp = np.array([])
        address_min_temp = np.array([])
        address_max_temp = np.array([])
        address_clouds = np.array([])
        address_ghi = np.array([])
        address_rh = np.array([])

        #Collect weather data for all dates over the last year
        for date in d[:]:

            #format the date as a string - truncate to the first 10 characters
            date_str = str(date)[:10]  

            #Get day as int
            day = int(date_str[-2:])

            #Grab data every 5 days
            skip_days = 5

            #Only call the API once we have a previous date and the day of the month is a multiple of skip_days
            if ((last_date_str != 0) and (day % skip_days == 0)):

                #Get a dictionary of weather data based off a day
                weather_data = weather_hist(last_date_str, date_str, lat, lng)

                #Grab elements from the weather_data dictionary
                address_wind_dir = np.append(address_wind_dir, weather_data.get('wind_dir'))
                address_wind_spd = np.append(address_wind_spd, weather_data.get('wind_spd'))
                address_precip = np.append(address_precip, weather_data.get('precip'))
                address_temp = np.append(address_temp, weather_data.get('temp'))
                address_min_temp = np.append(address_min_temp, weather_data.get('min_temp'))
                address_max_temp = np.append(address_max_temp, weather_data.get('max_temp'))
                address_clouds = np.append(address_clouds, weather_data.get('clouds'))
                address_ghi = np.append(address_ghi, weather_data.get('ghi'))
                address_rh = np.append(address_rh, weather_data.get('rh'))

            #Save this date to be used as the start date for the next API call
            last_date_str = date_str

        last_date_str = 0

        #Append boolean is_vineyard value to an array which will be the dependent variable of our model
        is_vineyard = np.append(is_vineyard, row['Vineyard'])   

        #Append matrix of elevation points for lat long values
        if(len(elevation) == 0):
            elevation = np.array([elevation_matrix(lat, lng)])
        else:
            elevation = np.concatenate([elevation, np.array([elevation_matrix(lat, lng)])], axis=0)

        #Append latitude & longitude values to an array which we'll feed into our model 
        if(len(map_coords) == 0):
            map_coords = (np.array([lat,lng]))
        else:
            map_coords = np.vstack([map_coords,np.array([lat,lng])])

        #Append each address's weather data to arrays which we'll feed into our model
        if(len(wind_dir) == 0):
            wind_dir = ([address_wind_dir])
        else:
            wind_dir = np.vstack([wind_dir,address_wind_dir])

        if(len(wind_spd) == 0):
            wind_spd = ([address_wind_spd])
        else:
            wind_spd = np.vstack([wind_spd,address_wind_spd]) 

        if(len(precip) == 0):
            precip = ([address_precip])        
        else:
            precip = np.vstack([precip,address_precip]) 

        if(len(temp) == 0):
            temp = ([address_temp])
        else:
            temp = np.vstack([temp,address_temp]) 

        if(len(min_temp) == 0):
            min_temp = ([address_min_temp])
        else:
            min_temp = np.vstack([min_temp,address_min_temp])

        if(len(max_temp) == 0):
            max_temp = ([address_max_temp])
        else:
            max_temp = np.vstack([max_temp,address_max_temp]) 

        if(len(clouds) == 0):
            clouds = ([address_clouds])
        else:
            clouds = np.vstack([clouds,address_clouds]) 

        if(len(ghi) == 0):
            ghi = ([address_ghi])
        else:
            ghi = np.vstack([ghi,address_ghi])

        if(len(rh) == 0):
            rh = ([address_rh])
        else:
            rh = np.vstack([rh,address_rh])
    
    #Return variables
    return is_vineyard, map_coords, elevation, wind_dir, wind_spd, precip, temp, min_temp, max_temp, clouds, ghi, rh
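The repeated `if len(...) == 0` / `np.vstack(...)` blocks above could be expressed more compactly. The sketch below is not the notebook's actual code: it collects each address's series in plain Python lists and converts them to arrays once at the end. `stack_features` and the sample records are hypothetical.

```python
import numpy as np

FIELDS = ['wind_dir', 'wind_spd', 'precip', 'temp', 'min_temp',
          'max_temp', 'clouds', 'ghi', 'rh']

def stack_features(per_address_records):
    """Turn a list (one entry per address) of lists of daily weather dicts
    into a dict of 2-D arrays shaped (n_addresses, n_days)."""
    columns = {field: [] for field in FIELDS}
    for daily_records in per_address_records:
        for field in FIELDS:
            # One row per address, one value per sampled day;
            # fields missing from a record become NaN
            columns[field].append([day.get(field, np.nan) for day in daily_records])
    return {field: np.array(rows, dtype=float) for field, rows in columns.items()}

# Two made-up addresses with two sampled days each
sample = [
    [{'temp': 10.0, 'rh': 60}, {'temp': 12.0, 'rh': 55}],
    [{'temp': 20.0, 'rh': 30}, {'temp': 22.0, 'rh': 35}],
]
features = stack_features(sample)
print(features['temp'].shape)
```

Appending to lists and converting once also avoids re-copying the growing arrays on every `np.vstack` call.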

In [8]:
is_vineyard_train, map_coords_train, elevation_train, wind_dir_train, wind_spd_train, precip_train, temp_train, min_temp_train, max_temp_train, clouds_train, ghi_train, rh_train = land_data(train_df)

Collecting data for address: 0 Batiquitos Dr, Carlsbad, CA
Collecting data for address: 0 California Ave, Hemet, CA 92545
Collecting data for address: 0 Chevy Chase, Glendale, CA 91206
Collecting data for address: 0 Cinnamon Rock, Ramona, CA 92065
Collecting data for address: 0 Cloverdale Heights Rd, Cloverdale, CA 95425
Collecting data for address: 0 Cris Rd, Hemet, CA 92544
Collecting data for address: 0 Forest Service Road 8n15 Rd, Kirkwood, CA 95646
Collecting data for address: 0 Helen, Shadow Hills, CA 91040
Collecting data for address: 0 La Vella Rd, Temecula, CA 92590
Collecting data for address: 0 Larkin Vly, Watsonville, CA 95076
Collecting data for address: 0 Leonard, Silverado, CA 92676
Collecting data for address: 0 Lookout Mountain Rd, Mariposa, CA 95338
Collecting data for address: 0 Mill Creek, Talmage, CA 95481
Collecting data for address: 0 Mulholland Hwy, Malibu, CA 90265
Collecting data for address: 0 Old Coach Rd,Temecula, CA 92592
Collecting data for address: 0 Pon

Collecting data for address: 15550 New Peoria Flat Rd, Jamestown, CA 95327
Collecting data for address: 15600 Highway 1, Jenner, CA 95450
Collecting data for address: 15725 Meyers Grade Road, Jenner, CA 95450
Collecting data for address: 15750 Gary Way, Grass Valley, CA 95949
Collecting data for address: 1585 Live Oak Road, Paso Robles, CA 93446
Collecting data for address: 15887 N. Alpine Road, Lodi, CA 95240
Collecting data for address: 16030 Highway 49, Drytown, CA 95699
Collecting data for address: 16186 Candace Ln, Nevada City, CA 95959
Collecting data for address: 16330 Old Ranch Rd, Los Gatos, CA 95033
Collecting data for address: 16405 Grizzly Ridge Rd, Nevada City, CA 95959
Collecting data for address: 16480 Walker Lake Rd, Willits, CA 95490
Collecting data for address: 16655 Finley Ridge Ct, Morgan Hill, CA 95037
Collecting data for address: 16887 Skyline Truck Trl, Jamul, CA 91935
Collecting data for address: 1690 Spyrock Rd, Laytonville, CA 95454
Collecting data for address

Collecting data for address: 30988 Garden Rd, Manteca, CA 95337
Collecting data for address: 3125 E. Orange St, Acampo, CA 95220
Collecting data for address: 31351 Greenwood Road, Elk, CA 95432
Collecting data for address: 3151 Hwy. 128, Philo, CA 95466
Collecting data for address: 3222 Ehlers Lane, St. Helena, CA 94574
Collecting data for address: 3225 Township Road, Paso Robles, CA 93446
Collecting data for address: 323 Ward Blvd, Oroville, CA 95966
Collecting data for address: 32850 Ponderosa Way, Paynes Creek, CA 96075
Collecting data for address: 330 Stone Ridge Road, Angwin, CA 94508
Collecting data for address: 3300 Holiday Ln, Placerville, CA 95667
Collecting data for address: 3322 Old Lawley Toll Rd. Calistoga, CA 94515
Collecting data for address: 3323 Vine Hill Ln, Paso Robles, CA 93446
Collecting data for address: 33230 Wright Rd, Menifee, CA 92584
Collecting data for address: 333 Silverado Trail, Calistoga, CA 94515
Collecting data for address: 33440 La Serena Way, Temecul

Collecting data for address: 5675 Clear Creek Rd, Placerville, CA 95667
Collecting data for address: 5700 Occidental Road Santa Rosa, CA 95401
Collecting data for address: 57332 Joshua Ln, Yucca Valley, CA 92284
Collecting data for address: 5795 Silverado Trail, Napa, CA 94558
Collecting data for address: 5800 Adelaida Road, Paso Robles, CA 93447
Collecting data for address: 5805 Adelaida Road, Paso Robles, CA 93465
Collecting data for address: 5828 Orcutt Road, San Luis Obispo, CA 93401
Collecting data for address: 58820 80th, Anza, CA 92539
Collecting data for address: 58942 Registered Guest Rd, Laytonville, CA 95454
Collecting data for address: 5959 Briceland-Shelter Rd, Redway, CA 95560
Collecting data for address: 5960 Wise Road, Newcastle, CA 95658
Collecting data for address: 5980 Meyers Lane Somerset, CA 95684
Collecting data for address: 605 Big Lagoon Ranch Rd, Trinidad, CA 95570
Collecting data for address: 6050 Westside Rd, Healdsburg, CA 95448
Collecting data for address: 

## Save train & test variables

In [9]:
import pickle

# Save variable objects:
with open('vineyard_train.pkl', 'wb') as f: 
    pickle.dump([is_vineyard_train, map_coords_train, elevation_train, wind_dir_train, wind_spd_train, precip_train, temp_train, min_temp_train, max_temp_train, clouds_train, ghi_train, rh_train], f)

In [10]:
is_vineyard_test, map_coords_test, elevation_test, wind_dir_test, wind_spd_test, precip_test, temp_test, min_temp_test, max_temp_test, clouds_test, ghi_test, rh_test = land_data(test_df)

Collecting data for address: 8419 Airola, Vallecito, CA 95251
Collecting data for address: 850 Rutherford Road, Rutherford, CA 94574
Collecting data for address: 8500 Dry Creek Road, Geyserville, CA 95441
Collecting data for address: 8533 Dry Creek Road, Healdsburg, CA 95448
Collecting data for address: 8585 Cross Canyons Road, San Miguel, CA 93451
Collecting data for address: 8599 Ocean View Rd, Ventura, CA 93001
Collecting data for address: 8605 State Highway 16, Brooks, CA 95606
Collecting data for address: 8644 Highway 128, Healdsburg, CA 95448
Collecting data for address: 8711 Silverado Trail, St. Helena, CA 94574
Collecting data for address: 8761 Dry Creek Road, Healdsburg, CA 96448
Collecting data for address: 8900 Sunset Rd, Joshua Tree, CA 92252
Collecting data for address: 8910 Adelaida Road, Paso Robles, CA 93446
Collecting data for address: 90 Grey Fox Lane, Oroville, CA 95966
Collecting data for address: 9010 E. Harney Lane Lodi, CA 95240
Collecting data for address: 91 Ed

In [11]:
# Save variable objects:
with open('vineyard_test.pkl', 'wb') as f: 
    pickle.dump([is_vineyard_test, map_coords_test, elevation_test, wind_dir_test, wind_spd_test, precip_test, temp_test, min_temp_test, max_temp_test, clouds_test, ghi_test, rh_test], f)
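For completeness, here is a sketch of reading the saved arrays back. It assumes the pickles were written as shown above, as a list in a fixed order. `FEATURE_NAMES` and `load_features` are hypothetical names that attach labels to the positions, so that later code doesn't rely on remembering the order.

```python
import pickle

FEATURE_NAMES = ['is_vineyard', 'map_coords', 'elevation', 'wind_dir',
                 'wind_spd', 'precip', 'temp', 'min_temp', 'max_temp',
                 'clouds', 'ghi', 'rh']

def load_features(path):
    """Load a pickled list of arrays and return it as a name -> array dict."""
    with open(path, 'rb') as f:
        values = pickle.load(f)
    # Guard against files written with a different set of variables
    if len(values) != len(FEATURE_NAMES):
        raise ValueError('Expected %d objects, found %d'
                         % (len(FEATURE_NAMES), len(values)))
    return dict(zip(FEATURE_NAMES, values))

# Example: train = load_features('vineyard_train.pkl')
```

Only unpickle files you created yourself, since `pickle.load` can execute arbitrary code from an untrusted file.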