# <img src="./resources/GA.png" width="25" height="25" /> <span style="color:Blue">DSI Capstone:  MTB Trail Recommender Engine</span> 
---
## <span style="color:Green">Preprocessing - Recommenders (Content-Based and User-Based (Binary))</span>      

#### Ryan McDonald - General Assembly

---

### Notebook Contents:

- [Content-Based Recommender Prep](#intro)    
    - [Arizona Content Recommender](#recaz)
    - [Utah Content Recommender](#recut) 
- [User-Based Binary Recommender Prep](#user_rec)
    - [Arizona User Recommender](#azuser)
    - [Utah User Recommender](#utuser) 

**Imports**

In [59]:
# basic imports
import numpy as np
import pandas as pd
import sys

# sparse matrices, pairwise distance metrics, and scaling
from scipy import sparse
from sklearn.metrics.pairwise import pairwise_distances, cosine_similarity
from sklearn.preprocessing import MinMaxScaler

# # Spatial distance module
# import geopandas as gpd
# from shapely.geometry import Point
# from shapely.ops import nearest_points

<a id='intro'></a>
## 1. Content-Based Recommender

Now that the data is cleaned and formatted, only a few preprocessing steps remain before we can build a reliable recommender system. We'll start with a content-based recommender, which uses the cleaned trail statistics to show the ten trails most similar to a trail of the user's choosing. The Streamlit app lets a user search for a 'starter' trail by filtering on the characteristics they enjoy most. That trail is then passed to the Streamlit-based content recommender, which displays the top 10 most similar trails. The user can then investigate those trails and plan a great mountain bike ride!

## Read Data

### Arizona Trail Data

In [60]:
# reading in the scaled, one_hot_encoded dataset for the recommender system
az_trails = pd.read_csv('./data/recommender_data/az_trail_data.csv')
az_trails = az_trails.set_index('trail_name')
az_trails.head()

trail_name,length,longitude,latitude,popularity,rating,tot_climb,tot_descent,ave_grade,max_grade,max_elevation,...,difficulty_intermediate,difficulty_intermediate/difficult,difficulty_very difficult,dog_policy_leashed,dog_policy_no dogs,dog_policy_off-leash,dog_policy_unknown,e_bike_policy_allowed,e_bike_policy_not allowed,e_bike_policy_unknown
hiline trail,0.022399,0.619678,0.507727,1.0,0.94,0.022963,0.057739,0.315789,0.357143,0.429345,...,0,0,1,0,0,0,1,0,0,1
slim shady trail,0.018786,0.6171,0.508725,0.998953,0.88,0.018666,0.021932,0.210526,0.112245,0.412061,...,0,1,0,0,0,0,1,0,0,1
mescal,0.017341,0.637923,0.498336,0.997906,0.92,0.01451,0.013791,0.157895,0.112245,0.435423,...,0,1,0,0,0,0,1,0,0,1
chuckwagon,0.039017,0.637893,0.498445,0.996859,0.9,0.039375,0.040625,0.210526,0.132653,0.432479,...,1,0,0,0,0,0,1,0,0,1
tortolita preserve loop,0.070087,0.197366,0.626201,0.995812,0.84,0.036627,0.0432,0.105263,0.040816,0.254416,...,1,0,0,0,0,0,1,0,0,1


In [61]:
az_trails.shape, az_trails.isnull().sum().sort_values(ascending = False).head()

((929, 24),
 tot_climb        23
 tot_descent      23
 ave_grade        23
 max_grade        23
 max_elevation    23
 dtype: int64)

### Utah Trail Data

In [62]:
# reading in the scaled, one_hot_encoded dataset for the recommender system
ut_trails = pd.read_csv('./data/recommender_data/ut_trail_data.csv')
ut_trails = ut_trails.set_index('trail_name')
ut_trails.head()

trail_name,length,longitude,latitude,popularity,rating,tot_climb,tot_descent,ave_grade,max_grade,max_elevation,...,difficulty_intermediate,difficulty_intermediate/difficult,difficulty_very difficult,dog_policy_leashed,dog_policy_no dogs,dog_policy_off-leash,dog_policy_unknown,e_bike_policy_allowed,e_bike_policy_not allowed,e_bike_policy_unknown
thunder mountain trail #33098,0.065165,0.140063,0.312952,1.0,0.94,0.052217,0.14821,0.3,0.409091,0.632152,...,0,1,0,0,0,1,0,0,0,1
wasatch crest,0.100563,0.726322,0.462306,0.998922,0.96,0.082152,0.234174,0.3,0.393939,0.817796,...,0,1,0,0,1,0,0,0,1,0
captain ahab,0.033789,0.304305,0.87393,0.997845,0.94,0.024706,0.086493,0.3,0.348485,0.246302,...,0,0,0,1,0,0,0,0,1,0
wire mesa loop,0.059533,0.025333,0.145997,0.996767,0.92,0.032437,0.03659,0.1,0.181818,0.200894,...,0,1,0,0,0,0,1,1,0,0
ramblin',0.026549,0.328357,0.838821,0.99569,0.94,0.014778,0.035091,0.15,0.181818,0.28999,...,0,1,0,1,0,0,0,0,1,0


#### Creating a Content-Based Recommender

In [63]:
def content_recommend(df):
    """Return a trail-by-trail DataFrame of pairwise cosine distances."""

    # build a sparse matrix of the features, filling missing values with 0
    sparse_matrix = sparse.csr_matrix(df.fillna(0))

    # calculate pairwise cosine distances between all trails
    rec = pairwise_distances(sparse_matrix, metric = 'cosine')

    # wrap the distance matrix in a dataframe labeled by trail name
    rec = pd.DataFrame(rec, index = df.index, columns = df.index)

    return rec
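
To make the distance values easier to read, here is a minimal sketch of the same steps on a toy table. The trail names and feature values are hypothetical and are not taken from the real dataset. It illustrates that cosine distance compares the direction of two feature vectors, not their magnitude.

```python
import pandas as pd
from scipy import sparse
from sklearn.metrics.pairwise import pairwise_distances

# Hypothetical toy features (illustration only, not real trail data)
toy = pd.DataFrame(
    {"length": [1.0, 2.0, 0.0], "rating": [0.0, 0.0, 3.0]},
    index=pd.Index(["trail a", "trail b", "trail c"], name="trail_name"),
)

# Same steps as content_recommend: sparse matrix, cosine distance, labeled frame
dist = pairwise_distances(sparse.csr_matrix(toy.fillna(0)), metric="cosine")
toy_rec = pd.DataFrame(dist, index=toy.index, columns=toy.index)

# "trail a" and "trail b" point in the same direction, so their distance is 0
# even though their lengths differ; "trail c" shares no features with either,
# so its distance to both is 1.
print(toy_rec.round(3))
```

Because all features here were scaled to be non-negative, distances fall between 0 and 1.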

### Arizona Trails Pairwise_Distance DF

In [64]:
az_rec = content_recommend(az_trails)
az_rec

trail_name,hiline trail,slim shady trail,mescal,chuckwagon,tortolita preserve loop,lone cactus loop,apache wash loop,desperado loop,north loop,bug springs,...,ridge trail connector,monument trail,spine trail to ridge trail connector,far west trail,alamo springs spur trail,trail c,trail g,trail h,trail d,kain trail
hiline trail,0.000000,0.174012,0.173635,0.171861,0.210020,0.209322,0.385569,0.382047,0.373010,0.185404,...,0.460347,0.665606,0.460785,0.470300,0.464616,0.451589,0.451931,0.452244,0.452423,0.471334
slim shady trail,0.174012,0.000000,0.000515,0.170912,0.204185,0.205499,0.382805,0.382360,0.374641,0.199690,...,0.447000,0.657452,0.447449,0.457114,0.451318,0.438146,0.438498,0.212523,0.438998,0.458168
mescal,0.173635,0.000515,0.000000,0.169433,0.205173,0.206346,0.381109,0.381116,0.373362,0.201811,...,0.452309,0.659920,0.452746,0.463050,0.457094,0.442543,0.442897,0.219100,0.443400,0.464133
chuckwagon,0.171861,0.170912,0.169433,0.000000,0.026550,0.029780,0.382711,0.382585,0.373995,0.198061,...,0.226475,0.659476,0.226999,0.462371,0.234654,0.217506,0.442156,0.442496,0.218318,0.463456
tortolita preserve loop,0.210020,0.204185,0.205173,0.026550,0.000000,0.001065,0.395093,0.375853,0.370878,0.209881,...,0.191936,0.668620,0.192679,0.427624,0.191403,0.202672,0.440565,0.440806,0.203226,0.428254
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
trail c,0.451589,0.438146,0.442543,0.217506,0.202672,0.210741,0.666132,0.685327,0.687891,0.480942,...,0.005495,0.589004,0.005419,0.312180,0.013430,0.000000,0.297472,0.297507,0.000003,0.312153
trail g,0.451931,0.438498,0.442897,0.442156,0.440565,0.446111,0.425372,0.451260,0.458940,0.481156,...,0.304024,0.589164,0.303833,0.312170,0.307439,0.297472,0.000000,0.297580,0.297538,0.312144
trail h,0.452244,0.212523,0.219100,0.442496,0.440806,0.446303,0.666822,0.685819,0.688347,0.481330,...,0.303974,0.589217,0.303782,0.312062,0.307351,0.297507,0.297580,0.000000,0.297570,0.312030
trail d,0.452423,0.438998,0.443400,0.218318,0.203226,0.211239,0.666991,0.685947,0.688449,0.481464,...,0.005339,0.589115,0.005261,0.311969,0.013191,0.000003,0.297538,0.297570,0.000000,0.311930


### Utah Trails Pairwise_Distance DF

In [65]:
ut_rec = content_recommend(ut_trails)
ut_rec

trail_name,thunder mountain trail #33098,wasatch crest,captain ahab,wire mesa loop,ramblin',rush,bull run,big mesa,getaway,dino-flow,...,foresr service road 377,meadow loop,jones ranch trail #123 alternate access,sovereign connect,whales connect,humpback,bst access trail,hi line,carin-age,lasso
thunder mountain trail #33098,0.000000,0.337835,0.551475,0.402220,0.385433,0.373467,0.375111,0.548581,0.553409,0.405215,...,0.729667,0.739342,0.732277,0.723858,0.749657,0.749919,0.741951,0.736569,0.736558,0.935932
wasatch crest,0.337835,0.000000,0.372677,0.429697,0.213528,0.178302,0.204471,0.364152,0.364672,0.543282,...,0.851406,0.844242,0.871185,0.873337,0.962400,0.962730,0.844444,0.911578,0.911266,0.686632
captain ahab,0.551475,0.372677,0.000000,0.603297,0.172981,0.531512,0.170624,0.172176,0.177188,0.350722,...,0.641620,0.872056,0.874924,0.820084,0.962778,0.963095,0.875798,0.906724,0.906549,0.619191
wire mesa loop,0.402220,0.429697,0.603297,0.000000,0.418805,0.237567,0.418482,0.602992,0.610502,0.620838,...,0.743898,0.752200,0.740866,0.739823,0.735807,0.736069,0.754791,0.736360,0.736444,0.740494
ramblin',0.385433,0.213528,0.172981,0.418805,0.000000,0.369284,0.001984,0.169689,0.172197,0.347940,...,0.849970,0.869276,0.873853,0.822636,0.962503,0.962826,0.872821,0.906466,0.906316,0.619359
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
humpback,0.749919,0.962730,0.963095,0.736069,0.962826,0.962139,0.963394,0.963010,0.963010,0.719055,...,0.359171,0.074761,0.042053,0.370802,0.329222,0.000000,0.077166,0.014307,0.014486,0.367005
bst access trail,0.741951,0.844444,0.875798,0.754791,0.872821,0.840659,0.875454,0.873384,0.874181,0.649396,...,0.269793,0.000118,0.010305,0.322834,0.367470,0.077166,0.000000,0.032835,0.032709,0.325619
hi line,0.736569,0.911578,0.906724,0.736360,0.906466,0.910615,0.907802,0.906946,0.906854,0.667282,...,0.304045,0.030358,0.008221,0.316428,0.332654,0.014307,0.032835,0.000000,0.000029,0.315760
carin-age,0.736558,0.911266,0.906549,0.736444,0.906316,0.910539,0.907599,0.906794,0.906589,0.667128,...,0.303806,0.030259,0.008148,0.316191,0.332744,0.014486,0.032709,0.000029,0.000000,0.315532


<a id='recaz'></a>
### Arizona Trail Content Recommender

Lower values mean more similar trails: **'0'** means a trail is identical to the one it is compared with (as when compared with itself), and **'1'** means not similar at all.

In [87]:
# Which 10 trails are most similar to Hangover Trail?
# This field is a user input within the streamlit app!

az_rec['hangover trail'].sort_values().head(11)[1:]

trail_name
hiline trail                       0.000617
kellog/incinerator ridge           0.028427
western loop trail                 0.039195
tabletop                           0.046170
green mountain                     0.081501
baby jesus trail                   0.104976
hog heaven                         0.164656
sunset                             0.165837
little yeager canyon trail #533    0.165987
cathedral rock connector trail     0.166559
Name: hangover trail, dtype: float64

'Hiline Trail' is the closest match to 'Hangover Trail', with a cosine distance of 0.000617. All ten results have distances below 0.17, so they share many characteristics with it.
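
The `sort_values().head(11)[1:]` pattern above assumes the query trail always sorts first. When another trail ties at a distance of 0, that may not hold. As a hedged sketch, a helper can drop the query by label instead. The name `top_n_similar` is hypothetical, and the sketch assumes trail names are unique in the distance matrix.

```python
import pandas as pd

def top_n_similar(rec, name, n=10):
    """Return the n smallest distances to `name`, excluding `name` itself."""
    # Drop the query by label rather than by position, then take the n smallest
    return rec[name].drop(labels=name).nsmallest(n)

# Hypothetical toy distance matrix in which "a" and "b" are identical
labels = ["a", "b", "c"]
toy_rec = pd.DataFrame(
    [[0.0, 0.0, 0.4],
     [0.0, 0.0, 0.4],
     [0.4, 0.4, 0.0]],
    index=labels, columns=labels,
)

closest = top_n_similar(toy_rec, "b", n=2)
print(closest)
```

With the real data this would be called as, for example, `top_n_similar(az_rec, 'hangover trail')`.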

In [89]:
# Creating a trail search term for Arizona trails:
# This will bring up trails whose names contain the search term.
search = "hangover"
trails = az_trails[az_trails.index.str.contains(search)].index
for trail in trails:
    print(trail)
    print("Popularity: ", az_trails.loc[trail, 'popularity'])
    # counts the non-null feature columns for this trail (not user ratings)
    print("Non-null Features: ", az_trails.T[trail].count())
    print("")
    print("10 Closest Trails")
    print(az_rec[trail].sort_values()[1:11])
    print("")
    print("*"*35)
    print("")

hangover trail
Popularity:  0.9821989528795813
Non-null Features:  24

10 Closest Trails
trail_name
hiline trail                       0.000617
kellog/incinerator ridge           0.028427
western loop trail                 0.039195
tabletop                           0.046170
green mountain                     0.081501
baby jesus trail                   0.104976
hog heaven                         0.164656
sunset                             0.165837
little yeager canyon trail #533    0.165987
cathedral rock connector trail     0.166559
Name: hangover trail, dtype: float64

***********************************



<a id='recut'></a>
### Utah Trail Content Recommender

Lower values mean more similar trails: **'0'** means a trail is identical to the one it is compared with (as when compared with itself), and **'1'** means not similar at all.

In [90]:
# Which 10 trails are most similar to Portal?
# This field is a user input within the streamlit app!

ut_rec['portal'].sort_values().head(11)[1:]

trail_name
gold bar rim                  0.061473
jacob's (jackson's) ladder    0.084635
four loko                     0.116763
la dee duh                    0.134960
mt. van cott trail            0.151111
homer                         0.188846
agate loop                    0.196337
mega steps                    0.198167
moose puddle                  0.199861
moab rim trail                0.203932
Name: portal, dtype: float64

'Gold Bar Rim' is the closest match to 'Portal', with a cosine distance of 0.061473. All ten results have distances below 0.21, so they share many characteristics with it.

In [73]:
# Creating a trail search term for Utah trails:
# This will bring up trails whose names contain the search term.
search = "portal"
trails = ut_trails[ut_trails.index.str.contains(search)].index
for trail in trails:
    print(trail)
    print("Popularity: ", ut_trails.loc[trail, 'popularity'])
    # counts the non-null feature columns for this trail (not user ratings)
    print("Non-null Features: ", ut_trails.T[trail].count())
    print("")
    print("10 Closest Trails")
    print(ut_rec[trail].sort_values()[1:11])
    print("")
    print("*"*35)
    print("")

portal
Popularity:  0.9773706896551724
Non-null Features:  24

10 Closest Trails
trail_name
gold bar rim                  0.061473
jacob's (jackson's) ladder    0.084635
four loko                     0.116763
la dee duh                    0.134960
mt. van cott trail            0.151111
homer                         0.188846
agate loop                    0.196337
mega steps                    0.198167
moose puddle                  0.199861
moab rim trail                0.203932
Name: portal, dtype: float64

***********************************

poison spider - portal connector
Popularity:  0.03771551724137934
Non-null Features:  24

10 Closest Trails
trail_name
kane creek canyon trail          0.051474
baby steps singletrack loop 2    0.063542
jedi slickrock                   0.068459
7-up to rocky tops connector     0.082026
fast pitch                       0.084120
inside passage                   0.085293
overlook                         0.091614
baby steps singletrack loop 1    0.09580

<a id='user_rec'></a>
## 2. User-Based (Binary) Recommender
## Read Data - Arizona and Utah User Data

In [74]:
# reading in the cleaned, sorted Arizona user dataset for the recommender system
az_users = pd.read_csv('./data/all_arizona_users.csv')
az_users.head()

,user_name,trail_name
0,Maxx Byerly,Hiline Trail
1,Cameron McFarland,Hiline Trail
2,Ascanio Pignatelli,Hiline Trail
3,Sabrina Katharina,Hiline Trail
4,Clayton Burtsfield,Hiline Trail


In [75]:
# shape of df and verifying no nulls!
az_users.shape, az_users.isnull().sum().sort_values(ascending = False).head()

((5192, 2),
 trail_name    0
 user_name     0
 dtype: int64)

In [76]:
# reading in the cleaned, sorted Utah user dataset for the recommender system
ut_users = pd.read_csv('./data/all_utah_users.csv')
ut_users.head()

,user_name,trail_name
0,MadHamish H,Thunder Mountain Trail #33098
1,Matt Lane,Thunder Mountain Trail #33098
2,Phil Broadbent,Thunder Mountain Trail #33098
3,Jacob Crockett,Thunder Mountain Trail #33098
4,Heather Bond,Thunder Mountain Trail #33098


In [77]:
# shape of df and verifying no nulls!
ut_users.shape, ut_users.isnull().sum().sort_values(ascending = False).head()

((7346, 2),
 trail_name    0
 user_name     0
 dtype: int64)

#### Creating a User-Based Binary Recommender

In [78]:
def user_recommend(df):
    
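    # adding binary rating column for trails that users rated
    # (note: this modifies the passed-in dataframe in place; the search cells below rely on 'binary_rate')
    df['binary_rate'] = 1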
    
    # transforming to a pivot table
    pivot = df.pivot_table(index='user_name', columns= 'trail_name', values = 'binary_rate')
    
    # creating the sparse matrix
    sparse_users = sparse.csc_matrix(pivot.fillna(0))
       
    # calculating pairwise distances and building into a dataframe
    user_rec = pairwise_distances(sparse_users, metric = 'cosine')
   
    # saving pairwise matrix as a dataframe
    rec = pd.DataFrame(user_rec, index = pivot.index, columns = pivot.index)
    
    # return the dataframe
    return rec
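
As an alternative sketch, the same user-by-trail binary matrix can be built without adding a column to the input DataFrame. This is not the notebook's method. It swaps in `pd.crosstab`, the function name `user_distance_matrix` is hypothetical, and the sample users and trails are made up for illustration.

```python
import pandas as pd
from scipy import sparse
from sklearn.metrics.pairwise import pairwise_distances

def user_distance_matrix(df):
    """Cosine distances between users from (user_name, trail_name) pairs."""
    # crosstab counts each (user, trail) pair; clip to 1 so repeats stay binary
    binary = pd.crosstab(df["user_name"], df["trail_name"]).clip(upper=1)
    matrix = sparse.csr_matrix(binary.to_numpy(dtype=float))
    dist = pairwise_distances(matrix, metric="cosine")
    return pd.DataFrame(dist, index=binary.index, columns=binary.index)

# Hypothetical toy data: u1 rated t1, u2 rated t1 and t2, u3 rated t3
toy_users = pd.DataFrame(
    {"user_name": ["u1", "u2", "u2", "u3"],
     "trail_name": ["t1", "t1", "t2", "t3"]}
)
toy_user_rec = user_distance_matrix(toy_users)
print(toy_user_rec.round(6))
```

Note that the search cells later in this notebook expect the `binary_rate` column that `user_recommend` adds, so this variant would not be a drop-in replacement without adjusting them.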

### Arizona Users Pairwise_Distance DF

In [79]:
az_user_rec = user_recommend(az_users)
az_user_rec

user_name,A H,AJ Wanta,Aaron Cholewa,Aaron Davies,Aaron Frank,Aaron Hickson,Aaron Johnson,Aaron Lovato,Abe Ferraro,Abe Gold,...,sal serrano,sam schwann,skelldify,stuart schwartz,theiner Heiner,trevjens,victor thompson,yannick,Þorvarður Hálfdanarson,❤️
A H,0.0,1.0,1.000000,1.0,1.000000,1.0,1.0,1.0,1.0,1.0,...,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0
AJ Wanta,1.0,0.0,1.000000,1.0,1.000000,1.0,1.0,1.0,1.0,1.0,...,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0
Aaron Cholewa,1.0,1.0,0.000000,1.0,0.666667,1.0,1.0,1.0,1.0,1.0,...,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0
Aaron Davies,1.0,1.0,1.000000,0.0,1.000000,1.0,1.0,1.0,1.0,1.0,...,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0
Aaron Frank,1.0,1.0,0.666667,1.0,0.000000,1.0,1.0,1.0,1.0,1.0,...,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
trevjens,1.0,1.0,1.000000,1.0,1.000000,1.0,1.0,1.0,1.0,1.0,...,1.0,1.0,1.0,1.0,1.0,0.0,1.0,1.0,1.0,1.0
victor thompson,1.0,1.0,1.000000,1.0,1.000000,1.0,1.0,1.0,1.0,1.0,...,1.0,1.0,1.0,1.0,1.0,1.0,0.0,1.0,1.0,1.0
yannick,1.0,1.0,1.000000,1.0,1.000000,1.0,1.0,1.0,1.0,1.0,...,1.0,1.0,1.0,1.0,1.0,1.0,1.0,0.0,1.0,1.0
Þorvarður Hálfdanarson,1.0,1.0,1.000000,1.0,1.000000,1.0,1.0,1.0,1.0,1.0,...,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,0.0,1.0


### Utah Users Pairwise_Distance DF

In [80]:
ut_user_rec = user_recommend(ut_users)
ut_user_rec

user_name,#deeznutzfosho,46and2,A B,A Estrada,A MG,A Rodriguez,AKA Surfer,AMANDA MELESSA,AOSR,Aaron Anderstrom,...,tharlow harlow,thehiker 2000,theiner Heiner,tourjee Tourjee,tracy bilhorn,tyler bostwick,tyte 754,wimolrat Tangtiphongkul,zachnielsen999 Nielsen,สีดำ ภูเขา
#deeznutzfosho,0.0,1.0,0.0,1.0,1.0,0.0,1.000000,1.0,1.000000,1.0,...,1.0,1.0,1.0,1.0,0.42265,1.0,1.0,1.0,1.0,1.0
46and2,1.0,0.0,1.0,1.0,1.0,1.0,1.000000,1.0,1.000000,1.0,...,1.0,1.0,1.0,1.0,1.00000,1.0,1.0,1.0,1.0,1.0
A B,0.0,1.0,0.0,1.0,1.0,0.0,1.000000,1.0,1.000000,1.0,...,1.0,1.0,1.0,1.0,0.42265,1.0,1.0,1.0,1.0,1.0
A Estrada,1.0,1.0,1.0,0.0,1.0,1.0,0.833333,1.0,0.711325,1.0,...,1.0,1.0,1.0,1.0,1.00000,1.0,1.0,1.0,1.0,1.0
A MG,1.0,1.0,1.0,1.0,0.0,1.0,1.000000,1.0,1.000000,1.0,...,1.0,1.0,1.0,1.0,1.00000,1.0,1.0,1.0,1.0,1.0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
tyler bostwick,1.0,1.0,1.0,1.0,1.0,1.0,1.000000,1.0,1.000000,1.0,...,1.0,1.0,1.0,1.0,1.00000,0.0,1.0,1.0,1.0,1.0
tyte 754,1.0,1.0,1.0,1.0,1.0,1.0,1.000000,1.0,1.000000,1.0,...,1.0,1.0,1.0,1.0,1.00000,1.0,0.0,1.0,1.0,1.0
wimolrat Tangtiphongkul,1.0,1.0,1.0,1.0,1.0,1.0,1.000000,1.0,1.000000,1.0,...,1.0,1.0,1.0,1.0,1.00000,1.0,1.0,0.0,1.0,1.0
zachnielsen999 Nielsen,1.0,1.0,1.0,1.0,1.0,1.0,1.000000,1.0,1.000000,1.0,...,1.0,1.0,1.0,1.0,1.00000,1.0,1.0,1.0,0.0,1.0


<a id='azuser'></a>
### Arizona User-Based Binary Recommender
Lower values mean more similar users: **'0'** means a user's ratings are identical to those of the user compared (as when compared with themselves), and **'1'** means not similar at all.

In [81]:
# Which 10 users are most similar to A H?

az_user_rec['A H'].sort_values().head(11)[1:]

user_name
Soloman Picoult          0.000000
Josh Richart             0.000000
Brian Derrick            0.292893
Brandon Sudeith          0.422650
Bob Spak                 0.500000
Mark Smith               0.552786
Nikki McIntyre           0.666667
Pablo Cortez             0.750000
Happy Cycling            0.781782
Michael Bartholomeusz    1.000000
Name: A H, dtype: float64

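'Soloman Picoult' and 'Josh Richart' have a distance of 0 from 'A H', meaning their binary rating vectors are identical. Two other users are very close (less than 0.5) to 'A H', after which users become quite dissimilar; the tenth user, at 1.0, shares no rated trails with 'A H'.
Note that the search output below lists only one rating for 'A H', so there is too little data to draw conclusions about riding ability or riding partners.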
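
The recurring values in the output above (0.292893, 0.422650, 0.5, and so on) follow directly from cosine distance on binary vectors. As a sketch, assuming the 0/1 vectors built by `user_recommend`: if one user has rated a single trail and another user has rated `n` trails including that one, the dot product is 1 and the norms are 1 and `sqrt(n)`, so the distance is `1 - 1/sqrt(n)`. The helper name `one_rating_distance` is hypothetical.

```python
import math

def one_rating_distance(n_other):
    """Cosine distance between a one-trail user and a user who rated
    n_other trails, one of which is that same trail."""
    return 1 - 1 / math.sqrt(n_other)

# Distances from the 'A H' output, keyed by the trail count that produces them
observed = {2: 0.292893, 3: 0.422650, 4: 0.5, 5: 0.552786,
            9: 0.666667, 16: 0.75, 21: 0.781782}

for n, d in observed.items():
    print(n, round(one_rating_distance(n), 6), d)
```

Read this way, a larger distance in this list reflects a neighbor who has rated more trails overall, not necessarily one with different taste.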

In [82]:
# Creating a user search term:
# This will bring up users containing any part of the search term. 
az_pivot = az_users.pivot_table(index='user_name', columns= 'trail_name', values = 'binary_rate')

search = "A H"
users = az_pivot[az_pivot.index.str.contains(search)].index
for user in users:
    print(user)
    print("Average Rating: ", az_pivot.loc[user, :].mean())
    print("Number of Ratings: ", az_pivot.T[user].count())
    print("")
    print("10 Closest Users")
    print(az_user_rec[user].sort_values()[1:11])
    print("")
    print("*"*35)


A H
Average Rating:  1.0
Number of Ratings:  1

10 Closest Users
user_name
Soloman Picoult          0.000000
Josh Richart             0.000000
Brian Derrick            0.292893
Brandon Sudeith          0.422650
Bob Spak                 0.500000
Mark Smith               0.552786
Nikki McIntyre           0.666667
Pablo Cortez             0.750000
Happy Cycling            0.781782
Michael Bartholomeusz    1.000000
Name: A H, dtype: float64

***********************************


<a id='utuser'></a>
### Utah User-Based Binary Recommender
Lower values mean more similar users: **'0'** means a user's ratings are identical to those of the user compared (as when compared with themselves), and **'1'** means not similar at all.

In [83]:
# Which 10 users are most similar to AKA Surfer?

ut_user_rec['AKA Surfer'].sort_values().head(11)[1:]

user_name
Justin Pingatore    0.183503
Evan Christensen    0.183503
Matt Davis          0.422650
Joshua Shockley     0.422650
Christi Worstell    0.422650
Chris Stewart       0.422650
Chris Marsh         0.422650
Igor K              0.422650
Mark Tjaden         0.422650
Luke Perkerwicz     0.422650
Name: AKA Surfer, dtype: float64

'Justin Pingatore' and 'Evan Christensen' match 'AKA Surfer' most closely, each at a distance of 0.183503. The remaining eight users are tied at 0.422650, so all of the top 10 have rated trails in common with 'AKA Surfer'.

In [84]:
# Creating a user search term:
# This will bring up users containing any part of the search term. 
ut_pivot = ut_users.pivot_table(index='user_name', columns= 'trail_name', values = 'binary_rate')

search = "Fred"
users = ut_pivot[ut_pivot.index.str.contains(search)].index
for user in users:
    print(user)
    print("Average Rating: ", ut_pivot.loc[user, :].mean())
    print("Number of Ratings: ", ut_pivot.T[user].count())
    print("")
    print("10 Closest Users")
    print(ut_user_rec[user].sort_values()[1:11])
    print("")
    print("*"*35)
  

Brian Fredricksen
Average Rating:  1.0
Number of Ratings:  2

10 Closest Users
user_name
Donny O'Neill     0.292893
Andrew Ozmun      0.292893
Wesley LeFevre    0.292893
Russell Ochoa     0.292893
Chris Sarot       0.292893
eric clark        0.292893
Brandon Tuttle    0.292893
Cat Sales         0.500000
Hayley Kemp       0.500000
Lloyd McFarlin    0.500000
Name: Brian Fredricksen, dtype: float64

***********************************
Fred Hudso
Average Rating:  1.0
Number of Ratings:  1

10 Closest Users
user_name
Alex Leibold            0.422650
Jon Zanone              0.422650
Justin Steele           0.422650
Chad Hackley            0.500000
John Connolly           0.905084
Michael Martori         1.000000
Michelle Hoffer         1.000000
Michelle Manke-Horat    1.000000
Miguel Suarez           1.000000
Mike Anderson           1.000000
Name: Fred Hudso, dtype: float64

***********************************
Freddy Calk
Average Rating:  1.0
Number of Ratings:  1

10 Closest Users
user_name