___

<a href='https://github.com/eliasmelul/'> <img src='https://s3.us-east-2.amazonaws.com/wordontheamazon.com/NoMargin_NewLogo.png' style='width: 15em;' align='right' /></a>
# Finding my Schitt's Creek
#### Recommender System
___
<h3 align="right">by Elias Melul, Data Scientist </h3> 

___


In [1]:
import pandas as pd
from sklearn.metrics.pairwise import cosine_similarity

%matplotlib inline

#### Import Dataframe

In [2]:
norm_df = pd.read_csv('https://s3.us-east-2.amazonaws.com/www.findingmyschittscreek.com/Data/normalized_df_sub.csv', index_col=0)

#### Variables Considered

This recommendation system draws on variables that were scraped, selected, processed and modeled in the following notebooks:
1. <a href="https://github.com/eliasmelul/finding_schitts/blob/master/weather_data_FMSC.ipynb">Weather Data</a> --- <a href="https://s3.us-east-2.amazonaws.com/www.findingmyschittscreek.com/Data/final_weather_data.csv">Data Direct Download</a>
2. <a href="https://github.com/eliasmelul/finding_schitts/blob/master/general_data_FMSC.ipynb">General Socioeconomic Data</a> --- <a href="https://s3.us-east-2.amazonaws.com/www.findingmyschittscreek.com/Data/scraped_datausa.csv">Data Direct Download</a>
3. <a href="https://github.com/eliasmelul/finding_schitts/blob/master/Foursquare_data_FMSC.ipynb">Venue Data: Foursquare</a> --- <a href="https://s3.us-east-2.amazonaws.com/www.findingmyschittscreek.com/Data/one_hot_encoded_usa_FS.csv">Data Direct Download</a>
4. <a href="https://github.com/eliasmelul/finding_schitts/blob/master/EDA_FMSC.ipynb">Exploratory Data Analysis and Preprocessing</a> --- <a href="https://s3.us-east-2.amazonaws.com/www.findingmyschittscreek.com/Data/allDataCombined.csv">Data Direct Download</a>
5. <a href="https://github.com/eliasmelul/finding_schitts/blob/master/modeling_LM_FMSC.ipynb">Modeling: Weights</a> --- <a href="https://s3.us-east-2.amazonaws.com/www.findingmyschittscreek.com/Data/normalized_df_sub.csv">Data Direct Download</a>

---

# Recommendation System

### Function Definition


In [3]:
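def from_city_cosSim(data, name):
    """Cosine similarity between the city `name` and every other city in `data`.

    Returns a DataFrame with columns City and Similarity (the queried city is
    appended with a similarity of 1), or None if `name` is not in `data`.
    """
    is_target = data.City == name
    if not is_target.any():
        print("Wrong input: this entry will be ignored")
        return None

    # drop(columns=...) replaces the positional axis argument removed in pandas 2.0
    Xs = data[is_target].drop(columns='City')
    Ys = data[~is_target].drop(columns='City')

    # Row 0 holds the similarity of the queried city to each remaining city
    cosSim = cosine_similarity(X=Xs, Y=Ys)
    dfdf = pd.DataFrame({"City": list(data[~is_target].City), "Similarity": list(cosSim[0])})

    # Add a row for the queried city itself
    curr = pd.DataFrame({"City": name, "Similarity": 1}, index=[0])

    # Concatenate to finalize DF
    return pd.concat([dfdf, curr], sort=False).reset_index(drop=True)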
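
As a rough illustration of what this function computes, the sketch below applies `cosine_similarity` to a toy table. The city names and feature values are made up for the example and are not taken from `norm_df`.

```python
import pandas as pd
from sklearn.metrics.pairwise import cosine_similarity

# Hypothetical toy data: one row per city, numeric features only
toy = pd.DataFrame({
    "City": ["A", "B", "C"],
    "f1": [1.0, 2.0, 0.0],
    "f2": [0.0, 0.0, 1.0],
})

query = toy[toy.City == "A"].drop(columns="City")
others = toy[toy.City != "A"].drop(columns="City")

# One row (the query city) by one column per remaining city
sims = cosine_similarity(X=query, Y=others)[0]
toy_result = pd.DataFrame({"City": list(toy[toy.City != "A"].City), "Similarity": sims})
print(toy_result)
```

City B points in the same direction as A, so its similarity is 1 even though its values are twice as large; C is orthogonal to A, so its similarity is 0. Cosine similarity compares the shape of the feature profile, not its magnitude.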

In [4]:
def input_cities(numShow=10):
    # Collect one similarity table per valid city entered by the user
    userInput = []
    cosSim = None
    add_city = True
    while add_city:
        city_name = input("City (Include state - Ex. New York, NY): ")
        simSim = from_city_cosSim(data=norm_df, name=city_name)
        if simSim is not None:
            # Give each city its own similarity column so repeated merges never collide
            simSim = simSim.rename(columns={"Similarity": f"Similarity_{len(userInput)}"})
            cosSim = simSim if cosSim is None else cosSim.merge(simSim, how='inner', on='City')
            userInput.append(city_name)

        cont = input("Do you want to include another city? ")
        add_city = cont.strip().lower() in ['yes','true','of course','y','si','1']

    if cosSim is None:
        print("No valid cities were entered")
        return None

    # Score = mean similarity to the valid input cities
    simi = cosSim.drop(columns="City").mean(axis=1)

    # Remove the input cities themselves and rank the rest
    out = pd.DataFrame({"City": cosSim.City, "Score": simi})
    out = out.set_index("City").drop(userInput)
    out = out.sort_values('Score', ascending=False)

    return out.head(numShow)
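
The combining step can be sketched on its own. Assuming two per-city similarity tables with made-up values (the names `sim_a` and `sim_b` and the cities are hypothetical), an inner merge on `City` keeps only the cities present in both tables, and the row-wise mean gives the score:

```python
import pandas as pd

# Hypothetical similarity tables for two input cities, "A" and "B"
sim_a = pd.DataFrame({"City": ["B", "C", "D", "A"], "Similarity": [0.9, 0.2, 0.6, 1.0]})
sim_b = pd.DataFrame({"City": ["A", "C", "D", "B"], "Similarity": [0.9, 0.4, 0.8, 1.0]})

# Explicit suffixes keep the two clashing Similarity columns apart
merged = sim_a.merge(sim_b, how="inner", on="City", suffixes=("_A", "_B"))

# Average across the input cities, drop the inputs, rank the rest
scores = merged.set_index("City").mean(axis=1).drop(["A", "B"])
scores = scores.sort_values(ascending=False)
print(scores)
```

Here D averages 0.7 and C averages 0.3, so D is ranked first.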

## Recommendations

---
**Use:** run the _input_cities()_ function and enter one or more cities when prompted. It returns the cities most similar to your input, ranked by their average cosine similarity: these are the recommendations!

In [None]:
input_cities()
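
Because `input_cities()` relies on `input()`, it is awkward to call from scripts or tests. A non-interactive variant might look like the sketch below. The function name `recommend_for`, its parameters, and the toy data are all assumptions for illustration and are not part of the original notebook. It also computes every similarity in a single `cosine_similarity` call rather than merging per-city tables.

```python
import pandas as pd
from sklearn.metrics.pairwise import cosine_similarity

def recommend_for(data, cities, num_show=10):
    """Hypothetical helper: rank the other cities by mean cosine similarity to `cities`."""
    known = set(data.City)
    missing = [c for c in cities if c not in known]
    if missing:
        raise ValueError(f"Cities not found: {missing}")

    chosen = data.City.isin(cities)
    features = data.drop(columns="City")

    # Rows: chosen cities, columns: candidate cities
    sims = cosine_similarity(X=features[chosen], Y=features[~chosen])
    scores = pd.Series(sims.mean(axis=0), index=data.City[~chosen].to_numpy(), name="Score")
    return scores.sort_values(ascending=False).head(num_show)

# Toy data with made-up values, laid out like norm_df: a City column plus numeric features
toy = pd.DataFrame({
    "City": ["A", "B", "C", "D"],
    "f1": [1.0, 1.0, 0.0, 1.0],
    "f2": [0.0, 1.0, 1.0, 0.1],
})
ranking = recommend_for(toy, ["A"])
print(ranking)
```

With the real data, the call would presumably be `recommend_for(norm_df, ["New York, NY"])`, assuming the city string matches an entry in the `City` column exactly.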