<a href="https://colab.research.google.com/github/stephyi/10Academy/blob/master/AfterWork_Data_Science_Feature_Engineering_with_Python.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

<font color="blue">To use this notebook on Google Colaboratory, you will need to make a copy of it. Go to **File** > **Save a Copy in Drive**. You can then use the new copy that will appear in the new tab.</font>

# AfterWork Data Science: Feature Engineering

### Prerequisites

In [None]:
# Let's first import the libraries that we will need
# ----
#
import pandas as pd               # pandas for performing data manipulation
import numpy as np                # numpy for performing scientific computations
import matplotlib.pyplot as plt   # matplotlib for performing visualisation 

## Feature Improvement Techniques

#### <font color="blue">Example: Standardisation & Normalisation</font>

In [None]:
# Example
# --- 
# Question: Using the Support Vector Regressor, create a regression model using the clean dataset below.
# ---
# Dataset url = http://bit.ly/FishDatasetClean
# ---
# OUR CODE GOES BELOW
# 

##### Step 1. Loading our Dataset

In [None]:
# Loading our dataset
# ---
# 
df = pd.read_csv('http://bit.ly/FishDatasetClean')
df.head()

In [None]:
# Getting a statistical summary of our dataset
# ---
# 
df.describe()

##### Steps 2, 3, 4: Checking, Cleaning and Exploratory Analysis have already been performed on our dataset.

##### Step 5. Implementation and Evaluation

We will now apply normalisation and standardisation to our dataset, then fit the data to various models. We can then compare the RMSE of each model across the different instances. Go through each of the given models to understand the effect of these two techniques. Remember to uncomment the relevant cells.
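
Before running the full models, it may help to see what the two scalers actually compute. The sketch below uses a made-up single column (not the fish dataset) and assumes the usual definitions: normalisation rescales with the training minimum and maximum, while standardisation rescales with the training mean and standard deviation.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler, StandardScaler

# Toy training and test columns (made-up values, not from the fish dataset)
train = np.array([[10.0], [20.0], [30.0]])
test = np.array([[40.0]])

# Normalisation: (x - min) / (max - min), using the training min and max
norm = MinMaxScaler().fit(train)
train_norm = norm.transform(train)
test_norm = norm.transform(test)

# Standardisation: (x - mean) / std, using the training mean and std
sc = StandardScaler().fit(train)
train_std = sc.transform(train)
test_std = sc.transform(test)

print('Normalised train:  ', train_norm.ravel())
print('Normalised test:   ', test_norm.ravel())
print('Standardised train:', train_std.ravel())
print('Standardised test: ', test_std.ravel())
```

Note that the test value lands outside the 0 to 1 range after normalisation, because the scaler only knows the minimum and maximum of the training data. This is expected: the scaler is fitted on the training set and merely applied to the test set.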

In [None]:
# First we check for modeling without normalisation or standardisation
# ---

# We select our features
X = df[['Length1', 'Length2', 'Length3', 'Height', 'Width']]
y = df['Weight']

# Splitting our dataset 
# ---
# NB: We use random_state to get the same results every time,
# else we'd be working with different test and train datasets on each run.
# ---
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=.3, random_state = 42) 

# Fitting in our models 
# ---
from sklearn.svm import SVR 
from sklearn.neighbors import KNeighborsRegressor
from sklearn.tree import DecisionTreeRegressor  

# Don't worry about the model parameters, we will learn about 
# them in a separate workshop
svm_regressor = SVR(kernel='rbf', C=10)
knn_regressor = KNeighborsRegressor()
dec_regressor = DecisionTreeRegressor(random_state=27)

svm_regressor.fit(X_train, y_train)
knn_regressor.fit(X_train, y_train)
dec_regressor.fit(X_train, y_train)

# Making Predictions  
svm_y_pred = svm_regressor.predict(X_test)
knn_y_pred = knn_regressor.predict(X_test)
dec_y_pred = dec_regressor.predict(X_test)

# Finally, evaluate our model 
from sklearn import metrics 
print('SVM RMSE:', np.sqrt(metrics.mean_squared_error(y_test, svm_y_pred)))
print('KNN RMSE:', np.sqrt(metrics.mean_squared_error(y_test, knn_y_pred)))
print('Decision Tree RMSE:', np.sqrt(metrics.mean_squared_error(y_test, dec_y_pred)))

In [None]:
# We then check for modeling with only normalisation
# ---

# We select our features
X = df[['Length1', 'Length2', 'Length3', 'Height', 'Width']]
y = df['Weight']

# Splitting our dataset  
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=.3, random_state = 42)

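# Performing normalisation 
# NB: We fit the scaler on the training set only, then apply it to both sets.
from sklearn.preprocessing import MinMaxScaler
norm = MinMaxScaler().fit(X_train) 
X_train = norm.transform(X_train) 
X_test = norm.transform(X_test)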

# Fitting in our models  
svm_regressor = SVR(kernel='rbf', C=10)
knn_regressor = KNeighborsRegressor()
dec_regressor = DecisionTreeRegressor(random_state=27)

svm_regressor.fit(X_train, y_train)
knn_regressor.fit(X_train, y_train)
dec_regressor.fit(X_train, y_train)

# Making Predictions  
svm_y_pred = svm_regressor.predict(X_test)
knn_y_pred = knn_regressor.predict(X_test)
dec_y_pred = dec_regressor.predict(X_test)

# Finally, evaluating our models 
print('SVM RMSE:', np.sqrt(metrics.mean_squared_error(y_test, svm_y_pred)))
print('KNN RMSE:', np.sqrt(metrics.mean_squared_error(y_test, knn_y_pred)))
print('Decision Tree RMSE:', np.sqrt(metrics.mean_squared_error(y_test, dec_y_pred)))

In [None]:
# Finally, we check for modeling with only standardisation
# ---

# We select our features
X = df[['Length1', 'Length2', 'Length3', 'Height', 'Width']]
y = df['Weight']

# Splitting our dataset  
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=.3, random_state = 42)

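# Performing standardisation
# NB: We fit the scaler on the training set only and reuse it on the test set,
# so that both sets are scaled with the same mean and standard deviation.
from sklearn.preprocessing import StandardScaler
sc = StandardScaler() 
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)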

# Fitting in our models  
svm_regressor = SVR(kernel='rbf', C=10)
knn_regressor = KNeighborsRegressor()
dec_regressor = DecisionTreeRegressor(random_state=27)

svm_regressor.fit(X_train, y_train)
knn_regressor.fit(X_train, y_train)
dec_regressor.fit(X_train, y_train)

# Making Predictions  
svm_y_pred = svm_regressor.predict(X_test)
knn_y_pred = knn_regressor.predict(X_test)
dec_y_pred = dec_regressor.predict(X_test)

# Finally, evaluating our models 
print('SVM RMSE:', np.sqrt(metrics.mean_squared_error(y_test, svm_y_pred)))
print('KNN RMSE:', np.sqrt(metrics.mean_squared_error(y_test, knn_y_pred)))
print('Decision Tree RMSE:', np.sqrt(metrics.mean_squared_error(y_test, dec_y_pred)))

#### <font color="green">Challenge 1</font>

In [None]:
# Challenge 1
# ---
# You can now work on the following dataset, which you've used in the past, and apply 
# the scaling techniques in an effort to improve accuracy.
# Create a regression model to predict price using the given dataset, examining
# the two scaling techniques for the different regressors. 
# NB: You can also apply the other regression technique that we did not use, i.e.
# multiple linear regression.
# ---
# Dataset url = http://bit.ly/RealEstateDataset2
# ---
# OUR CODE GOES BELOW
#

#### <font color="green">Challenge 2</font>

In [None]:
# Challenge 2
# ---
# Again, you've already gone through this classification problem. 
# Build a classifier to predict car sales, check the accuracy of the prediction, then challenge 
# your solution by applying feature improvement techniques to the following dataset.
# ---
# Dataset url = https://bit.ly/3dvU2BB
# ---
# OUR CODE GOES BELOW
#

## Feature Selection Techniques

#### Filter Method: <font color="blue">Example: Pearson's Correlation Coefficient</font>

In [None]:
# Example
# --- 
# Question: Let's use the same dataset that we used above.
# We will use Pearson's correlation coefficient as our filtering method.
# ---
# Dataset url = http://bit.ly/FishDatasetClean
# ---
# OUR CODE GOES BELOW
# 

##### Step 1. Loading our Dataset

In [None]:
# Loading our dataset
# ---
# 
df = pd.read_csv('http://bit.ly/FishDatasetClean')
df.head()

In [None]:
# Describing our dataset
# ---
# 
df.describe()

##### Steps 2, 3, 4: Checking, Cleaning and Exploratory Analysis have already been performed on our dataset.

##### Step 5. Implementation and Evaluation

In this example we will use only Pearson's correlation coefficient to identify the most important features in our dataset. In our case we will drop features that are not highly correlated with our response variable.
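
As a rough illustration of the idea, the sketch below builds a small synthetic dataset (the column names `strong`, `weak` and `target`, and the 0.5 cut-off, are made up for this example) and keeps only the features whose absolute correlation with the response reaches the cut-off. It is one possible way to automate the filter, not the only one.

```python
import numpy as np
import pandas as pd

# Synthetic data: 'strong' tracks the response, 'weak' is pure noise
rng = np.random.default_rng(0)
n = 200
signal = rng.normal(size=n)
toy = pd.DataFrame({
    'strong': signal + rng.normal(scale=0.1, size=n),
    'weak': rng.normal(size=n),
    'target': 2 * signal + rng.normal(scale=0.1, size=n),
})

# Absolute Pearson correlation of every feature with the response
corr_with_target = toy.corr()['target'].drop('target').abs()

# Hypothetical cut-off; a sensible value depends on the dataset
threshold = 0.5
selected = corr_with_target[corr_with_target >= threshold].index.tolist()

print(corr_with_target.sort_values(ascending=False))
print('Selected features:', selected)
```

In practice we would usually inspect the full correlation matrix as well, since two features that are both strongly correlated with the response may also be strongly correlated with each other.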


In [None]:
# First, we perform modeling with normalisation.
# We will use this as the base for our solution, then perform feature selection 
# by filter methods.
# ---

# We select our features
X = df[['Length1', 'Length2', 'Length3', 'Height', 'Width']]
y = df['Weight']

# Splitting our dataset  
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=.3, random_state = 42)

# Performing normalisation 
norm = MinMaxScaler().fit(X_train) 
X_train = norm.transform(X_train) 
X_test = norm.transform(X_test)

# Fitting in our models   
svm_regressor = SVR(kernel='rbf', C=10)
knn_regressor = KNeighborsRegressor()
dec_regressor = DecisionTreeRegressor(random_state=27)

svm_regressor.fit(X_train, y_train)
knn_regressor.fit(X_train, y_train)
dec_regressor.fit(X_train, y_train)

# Making Predictions  
svm_y_pred = svm_regressor.predict(X_test)
knn_y_pred = knn_regressor.predict(X_test)
dec_y_pred = dec_regressor.predict(X_test)

# Finally, evaluating our models  
print('SVM RMSE:', np.sqrt(metrics.mean_squared_error(y_test, svm_y_pred)))
print('KNN RMSE:', np.sqrt(metrics.mean_squared_error(y_test, knn_y_pred)))
print('Decision Tree RMSE:', np.sqrt(metrics.mean_squared_error(y_test, dec_y_pred)))

In [None]:
# Then apply filter methods by plotting a correlation matrix
# ---
# NB: numeric_only=True (pandas 1.5+) skips any non-numeric columns.
#
df_corr = df.corr(numeric_only=True)
plt.figure(figsize=(5,4))

# We then plot our heatmap visualisation
# 
import seaborn as sns
sns.heatmap(df_corr, annot=True, linewidths=0.5, cmap='coolwarm');

We resolve to drop Height since it has a weaker correlation with Weight, our response variable, than the other features.

In [None]:
# Then perform our modeling, comparing the resulting accuracy to the previous base solution.
# ---
# We select our features
X = df[['Length1', 'Length2', 'Length3', 'Width']]
y = df['Weight']

# Splitting our dataset  
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=.3, random_state = 42)

# Performing normalisation 
norm = MinMaxScaler().fit(X_train) 
X_train = norm.transform(X_train) 
X_test = norm.transform(X_test)

# Fitting in our models   
svm_regressor = SVR(kernel='rbf', C=10)
knn_regressor = KNeighborsRegressor()
dec_regressor = DecisionTreeRegressor(random_state=27)

svm_regressor.fit(X_train, y_train)
knn_regressor.fit(X_train, y_train)
dec_regressor.fit(X_train, y_train)

# Making Predictions  
svm_y_pred = svm_regressor.predict(X_test)
knn_y_pred = knn_regressor.predict(X_test)
dec_y_pred = dec_regressor.predict(X_test)

# Finally, evaluate our model 
print('SVM RMSE:', np.sqrt(metrics.mean_squared_error(y_test, svm_y_pred)))
print('KNN RMSE:', np.sqrt(metrics.mean_squared_error(y_test, knn_y_pred)))
print('Decision Tree RMSE:', np.sqrt(metrics.mean_squared_error(y_test, dec_y_pred)))

#### Wrapper Method: <font color="blue">Example: Step Forward Feature Selection</font>

In [None]:
# Example
# --- 
# Question: Let's use the following dataset that we used above in our example 
# and perform the step forward feature selection method.
# --
# During step forward feature selection you start with no variables in the model, 
# testing the addition of each variable using a chosen model fit criterion, 
# adding the variable (if any) whose inclusion gives the most statistically 
# significant improvement of the fit, and repeating this process until none 
# improves the model to a statistically significant extent.
# ---
# Dataset url = http://bit.ly/FishDatasetClean
# ---
# OUR CODE GOES BELOW
# 

##### Step 1. Loading our Dataset

In [None]:
# Loading our dataset
# ---
# 
df = pd.read_csv('http://bit.ly/FishDatasetClean')
df.head()

In [None]:
# Describing our dataset
# ---
# 
df.describe()

##### Steps 2, 3, 4: Checking, Cleaning and Exploratory Analysis have already been performed on our dataset.

##### Step 5. Implementation and Evaluation
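To make the step forward procedure concrete, here is a minimal hand-written sketch on synthetic data. It assumes a fixed number of features `k` as the stopping rule (much like the `k_features` parameter) rather than a statistical significance test, and it uses linear regression with cross-validated R² as the fit criterion. The helper name `forward_select` and the toy data are made up for illustration.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

# Synthetic data: only columns 0 and 2 drive the response
rng = np.random.default_rng(0)
X_toy = rng.normal(size=(200, 4))
y_toy = 3 * X_toy[:, 0] + 2 * X_toy[:, 2] + rng.normal(scale=0.1, size=200)

def forward_select(X, y, k, cv=4):
    """Greedily add the column that gives the best mean cross-validated R^2."""
    selected, remaining = [], list(range(X.shape[1]))
    while len(selected) < k:
        scores = {}
        for j in remaining:
            cols = selected + [j]
            scores[j] = cross_val_score(
                LinearRegression(), X[:, cols], y, scoring='r2', cv=cv
            ).mean()
        best = max(scores, key=scores.get)  # candidate with the highest score
        selected.append(best)
        remaining.remove(best)
    return selected

chosen = forward_select(X_toy, y_toy, k=2)
print('Selected column indexes:', chosen)
```

Each round refits the model once per remaining candidate, so the cost grows quickly with the number of features. That cost is the usual trade-off of wrapper methods compared with filter methods.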

In [None]:
# First, we prepare our data with normalisation.
# We will use this as the base for our solution, then perform feature selection 
# by a wrapper method (step forward feature selection).
# ---

# We select our features
X = df[['Length1', 'Length2', 'Length3', 'Height', 'Width']]
y = df['Weight']

# Splitting our dataset  
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=.3, random_state = 42)

# Performing normalisation 
norm = MinMaxScaler().fit(X_train) 
X_train = norm.transform(X_train) 
X_test = norm.transform(X_test)

# Selecting the ML algorithm to use   
dec_regressor = DecisionTreeRegressor(random_state=27)

# We pass dec_regressor as the estimator to the SequentialFeatureSelector function. 
# The k_features specifies the number of features to select. 
# We can set any number of features here. The forward parameter, if set to True, 
# performs step forward feature selection. The verbose parameter is used for logging 
# the progress of the feature selector, the scoring parameter defines the performance 
# evaluation criteria and finally, cv refers to cross-validation folds.
# ---
# Hint: Hover cursor on SequentialFeatureSelector to get a list of more parameter values.
# ---
#
from mlxtend.feature_selection import SequentialFeatureSelector
feature_selector = SequentialFeatureSelector(dec_regressor,
           k_features=4,
           forward=True,
           verbose=2,
           scoring='r2',
           cv=4)
 
# Perform step forward feature selection
feature_selector = feature_selector.fit(X_train, y_train) 

In [None]:
# Which are the selected features?
# The columns at these indexes are those which were selected
# ---
#
feat_cols = list(feature_selector.k_feature_idx_)
print(feat_cols)

In [None]:
# We can now use those features to build our model
# ---
# 

# Without step forward feature selection (sffs)
dec_regressor = DecisionTreeRegressor(random_state=27)
dec_regressor.fit(X_train, y_train)

# With step forward feature selection
dec_regressor2 = DecisionTreeRegressor(random_state=27)
dec_regressor2.fit(X_train[:, feat_cols], y_train)

# Making Predictions and determining the accuracies
y_test_pred = dec_regressor.predict(X_test)
print('Decision Tree RMSE Without sffs:', np.sqrt(metrics.mean_squared_error(y_test, y_test_pred)))

y_test_pred2 = dec_regressor2.predict(X_test[:, feat_cols])
print('Decision Tree RMSE with sffs:', np.sqrt(metrics.mean_squared_error(y_test, y_test_pred2)))

#### Wrapper Method: <font color="blue">Example: Step Backward Feature Selection</font>

In [None]:
# Example
# --- 
# Question: Let's use the following dataset that we used above in our example 
# and perform the step backward feature selection method.
# --
# Step backward feature selection involves starting with all candidate variables, 
# testing the deletion of each variable using a chosen model fit criterion, 
# deleting the variable (if any) whose loss gives the most statistically 
# insignificant deterioration of the model fit, 
# and repeating this process until no further variables can be deleted without 
# a statistically significant loss of fit.
# ---
# Dataset url = http://bit.ly/FishDatasetClean
# ---
# OUR CODE GOES BELOW
# 

##### Step 1. Loading our Dataset

In [None]:
# Loading our dataset
# ---
# 
df = pd.read_csv('http://bit.ly/FishDatasetClean')
df.head()

In [None]:
# Describing our dataset
# ---
# 
df.describe()

##### Steps 2, 3, 4: Checking, Cleaning and Exploratory Analysis have already been performed on our dataset.

##### Step 5. Implementation and Evaluation
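For comparison, the same backward idea can be sketched with scikit-learn's own `SequentialFeatureSelector` (a different class from the mlxtend one used in this notebook, imported here under the alias `SklearnSFS` to avoid a name clash). The toy data and the choice of linear regression are assumptions made for this illustration only.

```python
import numpy as np
from sklearn.feature_selection import SequentialFeatureSelector as SklearnSFS
from sklearn.linear_model import LinearRegression

# Synthetic data: only columns 1 and 3 drive the response
rng = np.random.default_rng(1)
X_toy = rng.normal(size=(200, 4))
y_toy = 3 * X_toy[:, 1] - 2 * X_toy[:, 3] + rng.normal(scale=0.1, size=200)

# Start from all four columns and drop one at a time until two remain
selector = SklearnSFS(
    LinearRegression(),
    n_features_to_select=2,
    direction='backward',
    scoring='r2',
    cv=4,
)
selector.fit(X_toy, y_toy)

# get_support() returns a boolean mask over the original columns
kept = np.flatnonzero(selector.get_support()).tolist()
print('Kept column indexes:', kept)
```

Here the two noise columns are removed first because dropping them barely changes the cross-validated score, whereas dropping either informative column would hurt it badly.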

In [None]:
# First, we prepare our data with normalisation.
# We will use this as the base for our solution, then perform feature selection 
# by a wrapper method (step backward feature selection).
# ---

# We select our features
X = df[['Length1', 'Length2', 'Length3', 'Height', 'Width']]
y = df['Weight']

# Splitting our dataset  
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=.3, random_state = 42)

# Performing normalisation 
norm = MinMaxScaler().fit(X_train) 
X_train = norm.transform(X_train) 
X_test = norm.transform(X_test)

# Selecting the ML algorithm to use   
dec_regressor = DecisionTreeRegressor(random_state=27)

# We pass dec_regressor as the estimator to the SequentialFeatureSelector function. 
# The k_features specifies the number of features to select. 
# We can set any number of features here. The forward parameter, if set to False, 
# performs step backward feature selection. The verbose parameter is used for logging 
# the progress of the feature selector, the scoring parameter defines the performance 
# evaluation criteria and finally, cv refers to cross-validation folds.
# ---
# Hint: Hover cursor on SequentialFeatureSelector to get a list of more parameter values.
# ---
#
from mlxtend.feature_selection import SequentialFeatureSelector
feature_selector = SequentialFeatureSelector(dec_regressor,
           k_features=4,
           forward=False,
           verbose=2,
           scoring='r2',
           cv=4)
 
# Perform step backward feature selection
feature_selector = feature_selector.fit(X_train, y_train) 

In [None]:
# Which are the selected features?
# The columns at these indexes are those which were selected
# ---
#
feat_cols = list(feature_selector.k_feature_idx_)
print(feat_cols)

In [None]:
# We can now use those features to build a full model
# ---
# 

# Without step backward feature selection (sbfs)
dec_regressor = DecisionTreeRegressor(random_state=27)
dec_regressor.fit(X_train, y_train)

# With step backward feature selection
dec_regressor2 = DecisionTreeRegressor(random_state=27)
dec_regressor2.fit(X_train[:, feat_cols], y_train)

# Making Predictions and determining the accuracies  
y_test_pred = dec_regressor.predict(X_test)
print('Decision Tree RMSE Without sbfs:', np.sqrt(metrics.mean_squared_error(y_test, y_test_pred)))

y_test_pred2 = dec_regressor2.predict(X_test[:, feat_cols])
print('Decision Tree RMSE with sbfs:', np.sqrt(metrics.mean_squared_error(y_test, y_test_pred2)))

#### Wrapper Method: <font color="blue">Example: Recursive Feature Elimination</font>

In [None]:
# Example
# --- 
# Question: Let's use the following dataset that we used above in our example to
# use recursive feature elimination as our feature selection method.
# --
# The Recursive Feature Elimination (RFE) method is a feature selection approach which 
# works by recursively removing attributes and building a model on those attributes that remain. 
# It uses the fitted model's coefficients or feature importances to identify which attributes
# contribute the most to predicting the target attribute.
# ---
# Dataset url = http://bit.ly/FishDatasetClean
# ---
# OUR CODE GOES BELOW
# 

##### Step 1. Loading our Dataset

In [None]:
# Loading our dataset
# ---
# 
df = pd.read_csv('http://bit.ly/FishDatasetClean')
df.head()

In [None]:
# Describing our dataset
# ---
# 
df.describe()

##### Steps 2, 3, 4: Checking, Cleaning and Exploratory Analysis have already been performed on our dataset.

##### Step 5. Implementation and Evaluation
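Before applying RFE to the fish dataset, the small sketch below shows what the fitted selector exposes. It uses synthetic data and linear regression (both assumptions made for illustration): `support_` is a boolean mask of the kept columns and `ranking_` gives rank 1 to every kept column and higher ranks to columns eliminated earlier.

```python
import numpy as np
from sklearn.feature_selection import RFE
from sklearn.linear_model import LinearRegression

# Synthetic data: only columns 0 and 1 drive the response
rng = np.random.default_rng(2)
X_toy = rng.normal(size=(200, 4))
y_toy = 5 * X_toy[:, 0] + 4 * X_toy[:, 1] + rng.normal(scale=0.1, size=200)

# Remove one feature per step until two remain
rfe = RFE(LinearRegression(), n_features_to_select=2, step=1)
rfe.fit(X_toy, y_toy)

print('Kept mask:', rfe.support_)
print('Ranking:  ', rfe.ranking_)
```

The estimator passed to RFE must expose either `coef_` or `feature_importances_` after fitting, which is why the example in this notebook uses a linear-kernel SVR rather than the RBF kernel used earlier.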

In [None]:
# First, we prepare our data with normalisation.
# We will use this as the base for our solution, then perform feature selection 
# by recursive feature elimination.
# ---

# We select our features
X = df[['Length1', 'Length2', 'Length3', 'Height', 'Width']]
y = df['Weight']

# Splitting our dataset  
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=.3, random_state = 42)
 
# Performing normalisation 
norm = MinMaxScaler().fit(X_train) 
X_train = norm.transform(X_train) 
X_test = norm.transform(X_test)

# Fitting in our models   

svm_regressor = SVR(kernel="linear")   
dec_regressor = DecisionTreeRegressor(random_state=27)

# We want to select the best 3 features for our model. 
# NB: n_features_to_select counts only the columns of X; the response variable is not included.
# ---
#  
from sklearn.feature_selection import RFE
svm_regressor = RFE(svm_regressor, n_features_to_select = 3, step=1)
dec_regressor = RFE(dec_regressor, n_features_to_select = 3, step=1)

svm_regressor.fit(X_train, y_train) 
dec_regressor.fit(X_train, y_train)

# Making Predictions  
svm_y_pred = svm_regressor.predict(X_test) 
dec_y_pred = dec_regressor.predict(X_test)

# Finally, evaluate our model  
print('SVM RMSE:', np.sqrt(metrics.mean_squared_error(y_test, svm_y_pred))) 
print('Decision Tree RMSE:', np.sqrt(metrics.mean_squared_error(y_test, dec_y_pred)))
 
# Displaying our best features
print('SVM Selected features: %s' % list(X.columns[svm_regressor.support_]))
print('Decision Tree Selected features: %s' % list(X.columns[dec_regressor.support_]))

#### Feature Transformation: <font color="blue">Example: Principal Component Analysis</font>

In [None]:
# Example
# --- 
# Question: Let's use the following dataset that we used above in our example to
# use the principal component analysis (PCA) to reduce our features into components.
# ---

# ---
# Dataset url = http://bit.ly/FishDatasetClean
# ---
# OUR CODE GOES BELOW
# 

##### Step 1. Loading our Dataset

In [None]:
# Loading our dataset
# ---
# 
df = pd.read_csv('http://bit.ly/FishDatasetClean')
df.head()

In [None]:
# Describing our dataset
# ---
# 
df.describe()

##### Steps 2, 3, 4: Checking, Cleaning and Exploratory Analysis have already been performed on our dataset.

##### Step 5. Implementation and Evaluation
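One thing worth checking with PCA is how much variance each component explains, since that guides how many components to keep. The sketch below uses synthetic data with three nearly collinear columns and one independent column (an assumption made for illustration), and asks scikit-learn to keep just enough components to explain 95% of the variance.

```python
import numpy as np
from sklearn.decomposition import PCA

# Synthetic data: three nearly collinear columns plus one independent column
rng = np.random.default_rng(3)
base = rng.normal(size=(200, 1))
X_toy = np.hstack([
    base + rng.normal(scale=0.05, size=(200, 1)),
    base + rng.normal(scale=0.05, size=(200, 1)),
    base + rng.normal(scale=0.05, size=(200, 1)),
    rng.normal(size=(200, 1)),
])

# Fit with all components to inspect the share of variance each one explains
pca_full = PCA().fit(X_toy)
print('Explained variance ratio:', pca_full.explained_variance_ratio_)

# A float between 0 and 1 keeps the fewest components reaching that share
pca_95 = PCA(n_components=0.95).fit(X_toy)
X_reduced = pca_95.transform(X_toy)
print('Reduced shape:', X_reduced.shape)
```

Calling `PCA()` with no arguments, as in the cell further down, keeps every component, so it rotates the features without reducing their number. Passing `n_components` is what actually reduces the dimensionality.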

In [None]:
# Again, we create our base models and check their accuracy so that we can
# later compare it with our PCA implementation.
# ---

# We select our features
X = df[['Length1', 'Length2', 'Length3', 'Height', 'Width']]
y = df['Weight']

# Splitting our dataset  
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=.3, random_state = 42)

# Performing normalisation  
norm = MinMaxScaler().fit(X_train) 
X_train = norm.transform(X_train) 
X_test = norm.transform(X_test)

# Fitting in our models   
svm_regressor = SVR(kernel='rbf', C=10)
knn_regressor = KNeighborsRegressor()
dec_regressor = DecisionTreeRegressor(random_state=27)

svm_regressor.fit(X_train, y_train)
knn_regressor.fit(X_train, y_train)
dec_regressor.fit(X_train, y_train)

# Making Predictions  
svm_y_pred = svm_regressor.predict(X_test)
knn_y_pred = knn_regressor.predict(X_test)
dec_y_pred = dec_regressor.predict(X_test)

# Finally, evaluating our models  
print('SVM RMSE:', np.sqrt(metrics.mean_squared_error(y_test, svm_y_pred)))
print('KNN RMSE:', np.sqrt(metrics.mean_squared_error(y_test, knn_y_pred)))
print('Decision Tree RMSE:', np.sqrt(metrics.mean_squared_error(y_test, dec_y_pred)))

In [None]:
# We then apply PCA to our dataset
# ---

# We select our features
X = df[['Length1', 'Length2', 'Length3', 'Height', 'Width']]
y = df['Weight']

# Splitting our dataset  
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=.3, random_state = 42)

# Performing normalisation 
norm = MinMaxScaler().fit(X_train) 
X_train = norm.transform(X_train) 
X_test = norm.transform(X_test)

# Applying PCA
# ---
# NB: PCA relies on the feature set and not the label data.
# ---
# 
from sklearn.decomposition import PCA
pca = PCA()
X_train = pca.fit_transform(X_train)
X_test = pca.transform(X_test)

# Fitting in our models   
svm_regressor = SVR(kernel='rbf', C=10)
knn_regressor = KNeighborsRegressor()
dec_regressor = DecisionTreeRegressor(random_state=27)

svm_regressor.fit(X_train, y_train)
knn_regressor.fit(X_train, y_train)
dec_regressor.fit(X_train, y_train)

# Making Predictions  
svm_y_pred = svm_regressor.predict(X_test)
knn_y_pred = knn_regressor.predict(X_test)
dec_y_pred = dec_regressor.predict(X_test)

# Finally, evaluating our models 
print('SVM RMSE:', np.sqrt(metrics.mean_squared_error(y_test, svm_y_pred)))
print('KNN RMSE:', np.sqrt(metrics.mean_squared_error(y_test, knn_y_pred)))
print('Decision Tree RMSE:', np.sqrt(metrics.mean_squared_error(y_test, dec_y_pred)))

#### <font color="green">Challenge 1</font>

In [None]:
# Challenge 1
# ---
# Perform the above feature selection techniques to improve the accuracy of 
# your model that you use to predict prices in the previously used real estate dataset.
# ---
# Dataset url = http://bit.ly/RealEstateDataset2
# ---
# OUR CODE GOES BELOW
#

#### Feature Transformation: <font color="green">Challenge: Linear Discriminant Analysis</font>

In [None]:
# Challenge 1
# ---
# From day 1, we have held your hand by providing examples that you could learn from 
# in order to work on the challenges. This time we would like you to refer to the 
# LDA sklearn documentation (Google) and then perform LDA on the following dataset 
# with the main goal of improving the accuracy of your model.   
# ---
# Dataset url = http://bit.ly/2So1eGk
# ---
# OUR CODE GOES BELOW
#