<img src="https://upload.wikimedia.org/wikipedia/commons/4/4e/North_American_taximeter_%28cropped%29.png">


# **Goal?**
 Predict taxi fares in NYC.

# **Training Data?**

 * Datetime of pickup
 * Longitude and Latitude of pickup and dropoff points
 * Number of passengers
 * Fare

# **How do taxi meters work in NYC?**

According to www.nyc.gov/html/tlc/html/passenger/taxicab_rate.shtml, as of 2018 the fare varies with:
* Distance (50 cents / 0.2 miles)
* Duration (50 cents / min while moving < 12 mph)
* Zone based charges dependent on time of day/day of week
* Tolls when crossing certain bridges or tunnels
* Flat rates to/from certain airports
* Two cases of flat per passenger charges between districts during certain hours
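
The first two components above can be combined into a rough metered-fare estimate. This is a minimal sketch, not the official meter logic: the `initial_charge` default is an assumption that does not come from the list above, and zone charges, tolls and flat rates are ignored.

```python
def metered_fare(distance_miles, slow_minutes, initial_charge=2.50):
    """Rough metered fare in dollars.

    initial_charge is an assumed value, not taken from the list above.
    Zone charges, tolls and flat rates are not modeled.
    """
    distance_units = distance_miles / 0.2      # 50 cents per 0.2 miles
    return initial_charge + 0.50 * distance_units + 0.50 * slow_minutes

# A 2 mile trip with 5 minutes spent below 12 mph
print(metered_fare(2.0, 5.0))
```

The intercept of such a model corresponds to the initial charge, which is worth keeping in mind when looking at fare vs distance later.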

# **Ideas before we look at the data?**

Based on the data we are given and our understanding of the way meters work in NYC, we probably want to:
* Estimate travel distance.
* Estimate duration of travel. How can we do this without it being just a linear function of distance?
* Cluster the map into zones.
* Split the datetime data into several measures.

# Gameplan:
* Briefly look at distance and duration measures and decide on one
* Look at some density maps for pickup and dropoff points
* Cluster the data into zones to aid in EDA: a single cluster label stands in for a (longitude, latitude) pair, with a marginal loss of clarity
* Make some fare heatmaps when traveling from a couple of different clusters
* Look at dependence of fare on distance
* Make some time variables
* Look at dependence of fare on different time variables

## Imports

In [None]:
import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)
import seaborn as sns
import matplotlib.pyplot as plt
import matplotlib.image as mpimg
from matplotlib.colors import ListedColormap
import matplotlib
from sklearn.cluster import MiniBatchKMeans, KMeans
#% matplotlib inline
import os
import time
import calendar

print(os.listdir("../input"))
print(os.listdir("../input/nyc-taxi-fare-osrm"))

## Read Training Data

In [None]:
train_df =  pd.read_csv('../input/nyc-taxi-fare-osrm/train_osrm_50M.csv')

## Helper Functions

In [None]:
#excludes rows outside of bounding box latitudes and longitudes
#from https://www.kaggle.com/breemen/nyc-taxi-fare-data-exploration

#long_min, long_max, lat_min, lat_max = (-74.5, -72.8, 40.5, 41.8) #loose constraints
#map_img = mpimg.imread('https://aiblog.nl/download/nyc_-74.5_-72.8_40.5_41.8.png')
long_min, long_max, lat_min, lat_max = (-74.3, -73.7, 40.5, 40.9) #tight constraints
#map_img = mpimg.imread('https://aiblog.nl/download/nyc_-74.3_-73.7_40.5_40.9.png')
#from https://imgur.com/a/nMkieh6
map_img = mpimg.imread('https://i.imgur.com/hXaTTqp.png')

def bounding_box(df):
    return df[(df.pickup_longitude >= long_min) & (df.pickup_longitude <= long_max) & \
           (df.pickup_latitude >= lat_min) & (df.pickup_latitude <= lat_max) & \
           (df.dropoff_longitude >= long_min) & (df.dropoff_longitude <= long_max) & \
           (df.dropoff_latitude >= lat_min) & (df.dropoff_latitude <= lat_max)]

min_distance, max_distance, min_duration, max_duration = (100.,60000.,30.,4000.)
def distance_duration_box(df):
    return df[(df.distance >= min_distance) & (df.distance <= max_distance) & \
           (df.duration >= min_duration) & (df.duration <= max_duration)]

min_fare, max_fare = (3.,200.)
def fare_box(df):
    return df[(df.fare_amount >= min_fare) & (df.fare_amount <= max_fare)]

#runs all data cleaning methods
def clean_data(df):
    df = bounding_box(df)
    df = distance_duration_box(df)
    return df

"""
#This is what a function that uses OSRM to calculate distance looks like interacting with a local server.
#See here to install the backend: https://github.com/Project-OSRM/osrm-backend/wiki
#Here to install frontend: https://pypi.org/project/osrm-py/

import osrm
client = osrm.Client(host='http://localhost:5000')
def osrm_calc(long_in,lat_in,long_out,lat_out):
    coordinates = [[long_in,lat_in],[long_out,lat_out]]
    response = client.route(coordinates=coordinates)
    return response['routes'][0]['distance'],response['routes'][0]['duration']
"""

#bins the latitude and longitude variables to make it possible to create heatmaps
def latitude_longitude_binning(df):
    df['dropoff_longitude_bin'] = pd.cut(df.dropoff_longitude, bins=50)
    df['dropoff_latitude_bin'] = pd.cut(df.dropoff_latitude, bins=50)
    df['pickup_longitude_bin'] = pd.cut(df.pickup_longitude, bins=50)
    df['pickup_latitude_bin'] = pd.cut(df.pickup_latitude, bins=50)
    return df

def distance_duration_binning(df):
    df['distance_bin'] = pd.cut(df.distance, bins=50)
    df['duration_bin'] = pd.cut(df.duration, bins=50)
    return df

fare_bins = [3.,5.,7.,9.,11.,13.,15.,17.,19.,21.,23.,25.,27.,29.,31.,33.,35.,37.,39.,41.,43.,45.,47.,49.,51.,53.,55.,60.,65.,70.,75.,80.,90.,100.,125.,150.,200.]
def fare_binning(df):
    df['fare_amount_bin'] = pd.cut(df.fare_amount, bins = fare_bins)
    return df

def apply_clusters(df):
    df['pickup_cluster'] = clusters.predict(df[['pickup_longitude','pickup_latitude']])
    df['dropoff_cluster'] = clusters.predict(df[['dropoff_longitude','dropoff_latitude']])
    return df

#largely from https://www.kaggle.com/aiswaryaramachandran/eda-and-feature-engineering
def time_columns(df):
    df['pickup_datetime']=pd.to_datetime(df['pickup_datetime'],format='%Y-%m-%d %H:%M:%S UTC')
    df['pickup_date']= df['pickup_datetime'].dt.date
    #vectorized .dt accessors give the same values as row-wise apply, but faster
    df['pickup_day']=df['pickup_datetime'].dt.day
    df['pickup_hour']=df['pickup_datetime'].dt.hour
    df['pickup_day_of_week']=df['pickup_datetime'].dt.weekday  #Monday=0
    df['pickup_month']=df['pickup_datetime'].dt.month
    df['pickup_year']=df['pickup_datetime'].dt.year
    return df

#runs all data creating methods (do not add training only columns)
def create_columns(df):
    df = latitude_longitude_binning(df)
    df = apply_clusters(df)
    df = distance_duration_binning(df)
    df = time_columns(df)
    return df

def heatmap_on_pic(pv,vmax=None, cmap=matplotlib.cm.YlGn):
    fig, ax = plt.subplots(figsize=(18,14))
    #optional kwargs
    kwargs = {}
    if vmax is not None: kwargs['vmax'] = vmax
    kwargs['cmap'] = cmap
    
    ax = sns.heatmap(pv, ax=ax, alpha = 0.8, zorder = 2, **kwargs)
    ax.invert_yaxis()
    ax.set_yticklabels([])
    ax.set_xticklabels([])
    _ = ax.imshow(map_img,
                   aspect = ax.get_aspect(),
                   extent = ax.get_xlim() + ax.get_ylim(),
                   zorder = 1)
    return fig,ax

## Longitude and Latitude Histograms (Note: NYC is at around -74, 40)

In [None]:
fig1, ax1 = plt.subplots()
train_df.pickup_longitude.hist(ax=ax1, bins=100, bottom=0.1, figsize=(14,3))
ax1.set_yscale('log')
_ = ax1.set_xlabel('longitude')

fig2, ax2 = plt.subplots()
train_df.pickup_latitude.hist(ax=ax2, bins=100, bottom=0.1, figsize=(14,3))
ax2.set_yscale('log')
_ = ax2.set_xlabel('latitude')

## Apply bounding box and bin latitude and longitude variables
* Longitude in (-74.3, -73.7)
* Latitude in (40.5, 40.9)
* This does exclude some data
* We also bin these variables to make it quicker to plot and make pivot tables
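
To illustrate why binning helps, here is a small sketch of how `pd.cut` turns coordinates into equal-width bins that can then be counted per cell. The `toy` frame and the `lon_bin`/`lat_bin` column names are made up for this example, and it uses 2 bins per axis instead of 50.

```python
import pandas as pd

# Tiny hypothetical sample of pickup coordinates inside the bounding box
toy = pd.DataFrame({
    'pickup_longitude': [-74.00, -73.99, -73.98, -73.80],
    'pickup_latitude':  [40.66, 40.68, 40.75, 40.65],
})

# Two equal-width bins per axis (the notebook uses 50)
toy['lon_bin'] = pd.cut(toy.pickup_longitude, bins=2)
toy['lat_bin'] = pd.cut(toy.pickup_latitude, bins=2)

# Count trips per (latitude bin, longitude bin) cell, keeping empty cells
counts = (toy.groupby(['lat_bin', 'lon_bin'], observed=False)
             .size()
             .unstack(fill_value=0))
print(counts)
```

Once the coordinates are categorical, counting or averaging per cell is a cheap group-by rather than a 2D histogram over millions of floats.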

In [None]:
train_df = bounding_box(train_df)
train_df = latitude_longitude_binning(train_df)
train_df.head()

## Longitude and Latitude Histograms after bounding box

In [None]:
fig1, ax1 = plt.subplots()
train_df.pickup_longitude.hist(ax=ax1, bins=100, bottom=0.1, figsize=(14,3))
#ax1.set_yscale('log')
_ = ax1.set_xlabel('longitude')

fig2, ax2 = plt.subplots()
train_df.pickup_latitude.hist(ax=ax2, bins=100, bottom=0.1, figsize=(14,3))
#ax2.set_yscale('log')
_ = ax2.set_xlabel('latitude')

## Determining distance/duration

1. Manhattan distance: roughly as the cab drives on a grid, but not aware of the actual streets. How would we correct for different street orientations?
2. Euclidean distance: as the crow flies, not the actual path a street-dwelling cab would take.
3. **OSRM** (Open Source Routing Machine): uses open source maps and a routing engine to determine the shortest path and estimate a typical duration.
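
For comparison with OSRM, the first two measures can be sketched directly from coordinates. This is an illustrative sketch, not what the notebook uses: `haversine_m` is the great-circle ("crow flies") distance, and `rotated_manhattan_m` is one possible answer to the street-orientation question, using a flat-earth approximation and a hypothetical `grid_angle_deg` parameter for the rotation of the street grid.

```python
import numpy as np

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius

def haversine_m(lon1, lat1, lon2, lat2):
    """Great-circle distance in meters."""
    lon1, lat1, lon2, lat2 = map(np.radians, (lon1, lat1, lon2, lat2))
    a = (np.sin((lat2 - lat1) / 2) ** 2
         + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_M * np.arcsin(np.sqrt(a))

def rotated_manhattan_m(lon1, lat1, lon2, lat2, grid_angle_deg=0.0):
    """Manhattan distance in meters along a grid rotated by grid_angle_deg.

    Uses a local flat approximation, adequate over city-scale distances.
    """
    mean_lat = np.radians((lat1 + lat2) / 2)
    east = np.radians(lon2 - lon1) * np.cos(mean_lat) * EARTH_RADIUS_M
    north = np.radians(lat2 - lat1) * EARTH_RADIUS_M
    theta = np.radians(grid_angle_deg)
    along = east * np.cos(theta) + north * np.sin(theta)
    across = -east * np.sin(theta) + north * np.cos(theta)
    return np.abs(along) + np.abs(across)

# A trip due north by 0.1 degrees of latitude
print(haversine_m(-74.0, 40.7, -74.0, 40.8))
print(rotated_manhattan_m(-74.0, 40.7, -74.0, 40.8, grid_angle_deg=45.0))
```

Both functions accept scalars or NumPy/pandas columns, so they could be applied to the whole frame at once.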

### The data used in this notebook is amended with OSRM-estimated distance and duration values.  
### It would take quite some time to do this in the notebook!!

* Estimated distance is in meters
* Estimated duration is in seconds.  

### Duration is largely correlated with the distance estimate, but paths with the same distance can have different durations, so it does provide some value.
### Note that this is only a duration estimate: it does not depend on other time variables (e.g. day of week, hour of day).
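
One way to see what duration adds beyond distance is to combine the two into an implied speed, in the same units as the meter rules (miles and mph). A minimal sketch on made-up rows follows. The column names `distance_miles`, `duration_min` and `osrm_speed_mph` are hypothetical, and the speed is OSRM's typical routed speed, not observed traffic.

```python
import pandas as pd

METERS_PER_MILE = 1609.344

# Two made-up trips: distance in meters, duration in seconds
toy = pd.DataFrame({'distance': [1609.344, 8046.72],
                    'duration': [600.0, 900.0]})

toy['distance_miles'] = toy.distance / METERS_PER_MILE
toy['duration_min'] = toy.duration / 60.0
# Average speed implied by the OSRM estimates
toy['osrm_speed_mph'] = toy.distance_miles / (toy.duration / 3600.0)
print(toy)
```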


## Some histograms

In [None]:
fig1, ax1 = plt.subplots()
train_df.distance.hist(ax=ax1, bins=100, bottom=0.1, figsize=(14,3))
ax1.set_xlim(0.,60000.)
ax1.set_yscale('log')
_ = ax1.set_xlabel('distance (meters)')

fig2, ax2 = plt.subplots()
train_df.duration.hist(ax=ax2, bins=100, bottom=0.1, figsize=(14,3))
ax2.set_xlim(0.,4000.)
ax2.set_yscale('log')
_ = ax2.set_xlabel('duration (seconds)')

fig3, ax3 = plt.subplots()
train_df.fare_amount.hist(ax=ax3, bins=100, bottom=0.1, figsize=(14,3))
ax3.set_xlim(-10.,200.)
ax3.set_yscale('log')
_ = ax3.set_xlabel('fare (dollars)')

## Apply constraints on distance, duration, and fare, then bin these quantities
* We apply constraints to make it simpler to bin these values. I could also keep these data and put them into a "last" bin.
* Distance must be at least 100 meters and at most 60 kilometers.
* Duration must be at least 30 seconds and at most 4000 seconds.
* Training fares must be at least 3 dollars and at most 200 dollars. There were a handful of rows with fares above 200 dollars. Negative fares and those below 3 dollars are too low to be "real".

### I think about this in the context of a phone app trying to make these predictions.  Who takes 1 min to plug coordinates into an app to figure out the cost of a 30 second cab ride?
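
The "last bin" alternative mentioned above could look like the following sketch: open-ended edges at `-inf` and `+inf` give `pd.cut` an underflow and an overflow bin, so no rows need to be dropped before binning. The edges here are a shortened, made-up version of `fare_bins`.

```python
import numpy as np
import pandas as pd

fares = pd.Series([2.0, 4.5, 150.0, 450.0])

# Open-ended first and last bins catch values outside [3, 200]
edges = [-np.inf, 3., 5., 200., np.inf]
binned = pd.cut(fares, bins=edges)
print(binned.value_counts(sort=False))
```

The trade-off is that the outer bins have no finite width, so they need special handling on any axis that is meant to be to scale.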

In [None]:
train_df = distance_duration_box(train_df)
train_df = fare_box(train_df)
train_df = distance_duration_binning(train_df)
train_df = fare_binning(train_df)
train_df.head()

## Pickup/Dropoff Heatmaps
* Each bin shows the number of pickups/dropoffs from that area in our training data
* Counts per bin are capped at 5000 to condense the color map. A log plot would be ideal, but I didn't have a great solution.
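
One simple option for a log-scaled version, sketched here on a made-up 2x2 table of counts, is to transform the pivot table before plotting. Adding 1 keeps empty bins finite (they map to 0). The transformed table could then be passed to `heatmap_on_pic` in place of the raw counts, without the `vmax` cap.

```python
import numpy as np
import pandas as pd

# Made-up counts spanning several orders of magnitude
counts = pd.DataFrame([[0, 9], [99, 99999]])

# log10(count + 1): empty bins stay at 0, busy bins are compressed
log_counts = np.log10(counts + 1)
print(log_counts)
```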


In [None]:
pv1 = pd.pivot_table(train_df,aggfunc='size',columns='pickup_longitude_bin',index='pickup_latitude_bin',fill_value=0.0,dropna=False)
fig1,ax1 = heatmap_on_pic(pv1,vmax=5000.,cmap=matplotlib.cm.Reds)
plt.title('Pickup',fontsize=20)
plt.show()
pv2 = pd.pivot_table(train_df,aggfunc='size',columns='dropoff_longitude_bin',index='dropoff_latitude_bin',fill_value=0.0,dropna=False)
fig2,ax2 = heatmap_on_pic(pv2,vmax=5000.,cmap=matplotlib.cm.Blues)
plt.title('Dropoff',fontsize=20)
plt.show()

# Why Clustering
We cluster not to help with prediction but as a plotting aid: it lets us ask what the fares look like when traveling from one cluster to another, e.g. from cluster 2 to cluster 3.
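
Labeling a point with a fitted k-means model amounts to finding its nearest centroid. The sketch below shows that step in plain NumPy. The centroids and points are made up, and distances are computed in raw degrees, as the notebook's clustering does.

```python
import numpy as np

def nearest_centroid(points, centroids):
    """Index of the closest centroid for each (longitude, latitude) point."""
    points = np.asarray(points, dtype=float)
    centroids = np.asarray(centroids, dtype=float)
    # Squared distances, shape (n_points, n_centroids)
    d2 = ((points[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)

centroids = [[-73.98, 40.75], [-73.78, 40.64]]   # hypothetical zone centers
points = [[-73.99, 40.74], [-73.79, 40.65], [-73.97, 40.76]]
labels = nearest_centroid(points, centroids)
print(labels)
```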

## Cluster with KMeans on 100000 rows of pickup coordinates

In [None]:
clusters = KMeans(n_clusters=15, random_state=0).fit(train_df[:100000][['pickup_longitude','pickup_latitude']])

## Apply clustering to label pickup and dropoff clusters

In [None]:
train_df = apply_clusters(train_df)
train_df.head()

## Show Clusters on map
* We will focus on clusters 12 (Upper Manhattan), 3 (Lower Manhattan), 2 (JFK) and 4 (LaGuardia) in the coming plots.

In [None]:
h = .0005
xx, yy = np.meshgrid(np.arange(long_min,long_max,h),np.arange(lat_min,lat_max,h))
Z = clusters.predict(np.c_[xx.ravel(), yy.ravel()])
Z = Z.reshape(xx.shape)
fig, ax = plt.subplots(figsize=(18,14))
plt.imshow(Z, interpolation='nearest',
           extent=(xx.min(), xx.max(), yy.min(), yy.max()),
           cmap=ListedColormap(sns.diverging_palette(220, 70, n=15).as_hex()),#plt.cm.tab20b,
           aspect='auto', origin='lower', alpha=0.7, zorder=2)
centroids = clusters.cluster_centers_
labels = clusters.predict(centroids)
for label, x, y in zip(labels,centroids[:,0],centroids[:,1]):
    plt.annotate(label,xy=(x,y),fontsize='15')
plt.ylim(lat_min,lat_max)
plt.xlim(long_min,long_max)
_ = ax.imshow(map_img,
          aspect = ax.get_aspect(),
          extent = ax.get_xlim() + ax.get_ylim(),
          zorder = 1)
plt.show()

## Fare heatmap from selected clusters
* Each bin shows the average fare when traveling from the selected cluster.
* To condense the color map we peg the max visible value of fare to 70 dollars.

In [None]:
for i,district in zip([2,3,4,12],['JFK','Lower Manhattan','LaGuardia','Upper Manhattan']):
    pv = pd.pivot_table(train_df[(train_df.pickup_cluster==i)],values='fare_amount',columns='dropoff_longitude_bin',index='dropoff_latitude_bin',fill_value=0.0,dropna=False)
    fig,ax = heatmap_on_pic(pv,vmax=70.)
    plt.title("Pickup from {0} (cluster {1})".format(district,i), fontsize=20)
    plt.show()

## Fare Heatmap Duration vs Distance
* Looking at trip distances between 1 km and 20 km
* Bins show mean fare for that distance/duration bin
* Fare is capped at 40 dollars to conserve color map
* Clearly the OSRM-estimated duration is largely correlated with the distance estimate.
* It does look like the fare generally increases as duration increases within a distance bin.


In [None]:
pv = pd.pivot_table(train_df[(train_df.distance>1000.) & (train_df.distance<20000.)],columns='distance_bin',index='duration_bin',values='fare_amount')
fig, ax = plt.subplots(figsize=(18,14))
ax = sns.heatmap(pv, ax=ax, vmax=40, cmap=matplotlib.cm.Wistia)
ax.invert_yaxis()
plt.show()

## Density Heatmap Fare vs Distance
* Plot shows number of trips in a given distance/fare range
* Max visible density per bin is capped at 20000 to conserve the colormap. A log-scaled plot would be better here.
* We have a pretty linear trend here with an intercept of maybe 3-5 dollars.
* What are those artifacts around 50 dollars?  Flat rates?
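
The slope and intercept of the linear trend can be read off with a least-squares line fit. This sketch uses synthetic, noise-free fares built from the 50 cents per 0.2 miles rate plus a made-up 3 dollar intercept, so it only demonstrates the fitting step, not a result from the training data.

```python
import numpy as np

METERS_PER_MILE = 1609.344
RATE_PER_M = 0.50 / (0.2 * METERS_PER_MILE)   # 50 cents per 0.2 miles, in $/m

# Synthetic trips: fare = 3 dollars + distance rate (intercept is made up)
distance_m = np.array([1000., 2000., 5000., 10000.])
fare = 3.0 + RATE_PER_M * distance_m

# Degree-1 polyfit returns (slope, intercept)
slope, intercept = np.polyfit(distance_m, fare, deg=1)
print(slope * METERS_PER_MILE, intercept)   # dollars per mile, dollars
```

On the real data the flat-rate trips would pull the fit, so it may be worth excluding airport clusters before fitting.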

In [None]:
pv = pd.pivot_table(train_df,aggfunc='size',columns='distance_bin',index='fare_amount_bin',fill_value=0.0)
fig, ax = plt.subplots(figsize=(18,14))
ax = sns.heatmap(pv, ax=ax, vmax=20000., cmap=matplotlib.cm.YlOrRd)
ax.invert_yaxis()
plt.show()

## Density Heatmap Fare vs Distance from JFK to Lower Manhattan
* I guessed that the artifacts in the previous plot were the flat rate fares from JFK to Manhattan, but what is with the striping?
* Perhaps there is a rate change dependent on time?

In [None]:
pv = pd.pivot_table(train_df[(train_df.pickup_cluster==2) & (train_df.dropoff_cluster==3)],aggfunc='size',columns='distance_bin',index='fare_amount_bin',fill_value=0.0)
fig, ax = plt.subplots(figsize=(18,14))
ax = sns.heatmap(pv, ax=ax, alpha = 0.8, vmax= 200, cmap=matplotlib.cm.Blues)
ax.invert_yaxis()
plt.show()

## Create time columns (year, month, day, day of week, hour of day etc..)

In [None]:
train_df = time_columns(train_df)
train_df.head()
train_df.describe()

## Let's look at some histograms in time
* Looks like we started collecting data in 2009 and stopped sometime in 2015
* The days count up from Monday.  Looks like peak travel is on Friday.
* Taxis start picking up around 6-7 am, flattening then peaking up at 7 pm and rapidly dropping after 11 pm.

In [None]:
fig1, ax1 = plt.subplots()
bins = range(2009,2017)
train_df.pickup_year.hist(ax=ax1, bins=bins, bottom=0.1, figsize=(14,3), align='left')
_ = ax1.set_xlabel('Year (AD)')

fig2, ax2 = plt.subplots()
bins = range(8)
train_df.pickup_day_of_week.hist(ax=ax2, bins=bins, bottom=0.1, figsize=(14,3), align='left')
_ = ax2.set_xlabel('Day of week (0 = Monday)')

fig3, ax3 = plt.subplots()
bins = range(0,25)
train_df.pickup_hour.hist(ax=ax3, bins=bins, bottom=0.1, figsize=(14,3),align='left')
_ = ax3.set_xlabel('Hour since Midnight')

## Boxplot Fare from JFK to Lower Manhattan vs year (outliers not shown)
* This looks like an explanation for the striping in the fare vs distance plot.
* The flat rate changed sometime in 2012.
* So year is definitely of some use to a machine learning algorithm.
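
A step like this can also be quantified without a plot by comparing the median fare per year. The sketch below uses made-up fares purely to show the group-by. On the notebook's data the same call would be applied to the JFK to Lower Manhattan subset.

```python
import pandas as pd

# Made-up fares for one route in two different years
toy = pd.DataFrame({
    'pickup_year': [2011, 2011, 2011, 2013, 2013, 2013],
    'fare_amount': [40.0, 40.0, 44.0, 50.0, 50.0, 55.0],
})

median_by_year = toy.groupby('pickup_year').fare_amount.median()
step = median_by_year.diff()   # change in median relative to the previous row
print(median_by_year)
print(step)
```

The median is used rather than the mean because a flat rate produces a spike at one value, with tolls and surcharges forming a tail above it.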

In [None]:
fig, ax = plt.subplots(figsize=(14,10))
ax = sns.boxplot(data = train_df[(train_df.pickup_cluster==2) & (train_df.dropoff_cluster==3) & (train_df.pickup_hour>5)], x='pickup_year',y='fare_amount', ax=ax, showfliers=False)

## So fares change with year, what about with hour?
* We will look at trips from Upper to Lower Manhattan.
* We will control for distance and duration. (Remember duration was estimated without any thought given to time of day or day in week)

## Distance and constrained duration from Upper to Lower Manhattan
* We will constrain the following plots to: distance in (7km, 9km) and duration in (450s, 650s)

In [None]:
fig1, ax1 = plt.subplots()
train_df[(train_df.pickup_cluster==12) & (train_df.dropoff_cluster==3) & (train_df.pickup_year<2013)].distance.hist(ax=ax1, bins=20, figsize=(14,3))
_ = ax1.set_xlabel('Distance (meters)')

fig2, ax2 = plt.subplots()
train_df[(train_df.pickup_cluster==12) & (train_df.dropoff_cluster==3) & (train_df.pickup_year<2013) & (train_df.distance >7000.) & (train_df.distance < 9000.)].duration.hist(ax=ax2, bins=20, figsize=(14,3))
_ = ax2.set_xlabel('Duration (seconds)')
t = plt.title('7km < distance < 9km', fontsize=12)

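## Boxplot Fare from Upper to Lower Manhattan vs year (outliers not shown)
* Duration is constrained to (450s, 650s) as described. Note that the code here uses distance in (6km, 9km), slightly wider than the (7km, 9km) range used for the duration histogram.
* Again we see that a change in fare happened sometime around the end of 2012.
* We will look at fare vs hour with the additional constraint that the year is earlier than 2013, to avoid the fare change.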

In [None]:
fig, ax = plt.subplots(figsize=(14,10))
ax = sns.boxplot(
    data = train_df[(train_df.pickup_cluster==12) & (train_df.dropoff_cluster==3) & (train_df.duration >450.) & (train_df.duration < 650.) & (train_df.distance >6000.) & (train_df.distance < 9000.)], 
    x='pickup_year',y='fare_amount', ax=ax, showfliers=False)

## Boxplot Fare from Upper to Lower Manhattan vs hour (outliers not shown)
* Constraints as described above
* Looks like there is some dependence on time of day.  We will use this to control further.

In [None]:
fig, ax = plt.subplots(figsize=(14,10))
ax = sns.boxplot(
    data = train_df[(train_df.pickup_cluster==12) & (train_df.dropoff_cluster==3) & (train_df.pickup_year<2013) & (train_df.duration >450.) & (train_df.duration < 650.) & (train_df.distance >6000.) & (train_df.distance < 9000.)], 
    x='pickup_hour',y='fare_amount', ax=ax, showfliers=False)

## Boxplot Fare from Upper to Lower Manhattan vs Day of Week (outliers not shown)
* Using constraints from previous plot and requiring travel during "daylight" hours 
* Only slight dependence on day of week.
* We are running low on data after all these constraints!

In [None]:
fig, ax = plt.subplots(figsize=(14,10))
ax = sns.boxplot(
    data = train_df[(train_df.pickup_cluster==12) & (train_df.dropoff_cluster==3) & (train_df.pickup_year<2013) & (train_df.duration >450.) & (train_df.duration < 650.) & (train_df.distance >6000.) & (train_df.distance < 9000.) & (train_df.pickup_hour>8) & (train_df.pickup_hour<20)], 
    x='pickup_day_of_week',y='fare_amount', ax=ax, showfliers=False)

## Boxplot Fare from Upper to Lower Manhattan vs Passenger Count (outliers not shown)
* We do not see any dependence of fare on number of passengers in these trips
* We apply the same constraints as above but exclude daylight travel constraint

In [None]:
fig, ax = plt.subplots(figsize=(14,10))
ax = sns.boxplot(data = train_df[(train_df.pickup_cluster==12) & (train_df.dropoff_cluster==3)& (train_df.pickup_year<2013) & (train_df.duration >450.) & (train_df.duration < 650.) & (train_df.distance >6000.) & (train_df.distance < 9000.)], x='passenger_count',y='fare_amount', ax=ax, showfliers=False)

## Summary
* So we've looked at a number of obvious things that might affect fare:
1. Distance
2. Duration
3. Year
4. Hour
5. Day of Week
6. Number of passengers

### What else might be interesting to look at?

# Do some Models

## Imports

In [None]:
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error
import lightgbm as lgb
from math import sqrt

from keras.models import Sequential
from keras.layers import Dense
from keras.wrappers.scikit_learn import KerasRegressor
from sklearn.model_selection import cross_val_score
from sklearn.model_selection import KFold
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import Pipeline
import tensorflow as tf

## Train test split
* We also drop the binned columns, which were only for plotting heatmaps, and the datetime column, since we've expanded it.
* We keep the clusters, since they may let us use shallower trees. They have to be one-hot encoded to convert them from categorical labels into something usable by a machine learning algorithm.
* We drop passenger count because it did not seem to have an effect in the small area we looked at.
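
As a small illustration of the one-hot step, here is `pd.get_dummies` applied to a made-up frame with a single cluster column. Each distinct label becomes its own indicator column, and other columns pass through unchanged.

```python
import pandas as pd

# Made-up rows: a cluster label plus one numeric feature
toy = pd.DataFrame({'pickup_cluster': [2, 3, 2],
                    'distance': [18000., 7500., 21000.]})

encoded = pd.get_dummies(toy, columns=['pickup_cluster'])
print(encoded)
```

Only labels that actually occur get a column, so a frame encoded separately (for example a test set missing some cluster) may need its columns reindexed to match the training frame.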

In [None]:
drop = ['key','passenger_count','dropoff_longitude_bin', 'dropoff_latitude_bin', 'pickup_longitude_bin', 'pickup_latitude_bin', 'distance_bin', 'duration_bin', 'fare_amount_bin', 'pickup_date', 'pickup_datetime']
y = train_df['fare_amount']
X = train_df.drop(columns=['fare_amount'])
X = X.drop(columns=drop)
X = pd.get_dummies(X, columns=['pickup_cluster','dropoff_cluster'])
train_X, test_X, train_y, test_y = train_test_split(X, y, test_size=0.3, random_state=42)
train_X.head()

## LightGBM (Gradient boosted decision trees)
* Adapted from https://github.com/susanli2016/Machine-Learning-with-Python/blob/master/NYC%20taxi%20fare.ipynb
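
To put a model's RMSE in context, it helps to compare it with a trivial baseline that always predicts the mean training fare. The sketch below uses made-up fares. The `rmse` helper is hypothetical and is equivalent to the `sqrt(mean_squared_error(...))` expression used in the next cell.

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean squared error."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

# Made-up fares standing in for the real train/test targets
toy_train_y = np.array([6.0, 8.0, 10.0])
toy_test_y = np.array([7.0, 9.0])

# Baseline: predict the mean training fare for every test row
baseline_pred = np.full_like(toy_test_y, toy_train_y.mean())
print('Baseline RMSE: {0}'.format(rmse(toy_test_y, baseline_pred)))
```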

In [None]:
params = {
        'learning_rate': 0.75,
        'application': 'regression',
        'max_depth': 3,
        'num_leaves': 100,
        'verbosity': -1,
        'metric': 'RMSE',
    }
train_lgb = lgb.Dataset(train_X, train_y)
trained_lgb = lgb.train(params, train_set = train_lgb, num_boost_round=300)
predicted_y = trained_lgb.predict(test_X, num_iteration = trained_lgb.best_iteration)
print('LGBM RMSE: {0}'.format(sqrt(mean_squared_error(test_y,predicted_y))))

## Keras Sequential Neural Network
* Adapted from the same source as the LightGBM example above.
* Did not get a chance to run this.

In [None]:
def baseline_model():
    model = Sequential()
    model.add(Dense(12, input_dim=41, kernel_initializer='normal', activation='relu'))
    model.add(Dense(1, kernel_initializer='normal'))
    model.compile(loss='mean_squared_error', optimizer='adam')
    return model
"""
seed = 7
np.random.seed(seed)
trained_snn = KerasRegressor(build_fn=baseline_model, nb_epoch=100, batch_size=5, verbose=0)
with tf.device('/gpu:0'):
    trained_snn.fit(train_X.values,train_y.values, epochs=100, batch_size=5, verbose =2)
predicted_y = trained_snn.predict(test_X)
print('SNN RMSE: {0}'.format(sqrt(mean_squared_error(test_y,predicted_y))))
"""