# Exchange Student Decision Assistant - Notebook

### Initializaiton -  Package Load & Fixed Parameters
Presented here is the coordinates for each of the universities supported by the Exchange Student Decision Assistant

- GPS coordinates of Denmark. Latitude: 56.2639 Longitude: 9.5017
- GPS coordinates of University of Copenhagen (UoC), Denmark. Latitude: 55.6745 Longitude: 12.5702.
- GPS cpordinates of Aarhus University (AAU), Denmark. Latitude: 56.1666 Longitude: 10.1999
- GPS coordinates of University of Southern Denmark (USD), Denmark.Latitude: 55.3678 Longitude: 10.4234

In [4]:
#Load necessary packages
import numpy as np
import pandas as pd
import json
import requests

!conda install -c conda-forge folium=0.5.0 --yes
import folium # map rendering library

print('Libraries imported.')

Solving environment: done

## Package Plan ##

  environment location: /opt/conda/envs/Python36

  added / updated specs: 
    - folium=0.5.0


The following packages will be downloaded:

    package                    |            build
    ---------------------------|-----------------
    altair-4.1.0               |             py_1         614 KB  conda-forge
    python_abi-3.6             |          1_cp36m           4 KB  conda-forge
    ca-certificates-2020.6.20  |       hecda079_0         145 KB  conda-forge
    certifi-2020.6.20          |   py36h9f0ad1d_0         151 KB  conda-forge
    openssl-1.1.1g             |       h516909a_1         2.1 MB  conda-forge
    vincent-0.4.4              |             py_1          28 KB  conda-forge
    folium-0.5.0               |             py_0          45 KB  conda-forge
    branca-0.4.1               |             py_0          26 KB  conda-forge
    ------------------------------------------------------------
                       

In [5]:
# Setting the fixed parameters - namely the longitude and latitude of each university
DK_Lat = 56.2639
DK_Long = 9.5017
UOC_Lat = 55.6745
UOC_Long = 12.5702
AAU_Lat = 56.1666
AAU_Long = 10.1999
USD_Lat = 55.3678
USD_Long = 10.4234

### Visualization of Universities in Denmark 

In [6]:
# creates a map of Denmark using the DK_Lat and DK_Long values
map_denmark = folium.Map(location=[DK_Lat, DK_Long], zoom_start=8)

# add markers to map
#Initial adding, University of Copenhagen
folium.CircleMarker(
    [UOC_Lat, UOC_Long],
    radius=5,
    color='blue',
    popup = 'University of Copenhagen',
    fill=True,
    fill_color='#3186cc',
    fill_opacity=0.7,
    parse_html=False).add_to(map_denmark)
#Adds Aarhus University to the map
folium.CircleMarker(
    [AAU_Lat, AAU_Long],
    radius=5,
    color='blue',
    popup = 'Aarhus University',
    fill=True,
    fill_color='#3186cc',
    fill_opacity=0.7,
    parse_html=False).add_to(map_denmark) 
#Adds University of Southern Denmark to the map
folium.CircleMarker(
    [USD_Lat, USD_Long],
    radius=5,
    color='blue',
    popup = 'University of Southern Denmark',
    fill=True,
    fill_color='#3186cc',
    fill_opacity=0.7,
    parse_html=False).add_to(map_denmark) 
    
map_denmark

### User Input - Distance, Venues and Weight

A Map of Denmark including the location of the three different universities have been plotted using Folium.

Next up - the notebook will take userinput to set the following parameters

- Distance (DIST) from each university in which you want to search the area for your favourite venues. DIST is measured in the unit of 'meters'
- Favourite Venue Type 1 (FAV1) 
- Favourite Venue Type 2 (FAV2)
- Favourite Venue Type 3 (FAV3)
- Weighting of FAV1 (WF1)
- Weighting of FAV2 (WF2)
- Weighting of FAV3 (WF3)

The 'Favourite Venue Type X', should be selected from the foursquare homepage over venues, and must include the 'categoryId' number. Follow this [link](https://developer.foursquare.com/docs/build-with-foursquare/categories/) and copy/paste the name and number under each of your favourite three venues.

In [9]:
DIST = input('Please select the distance from the university you want to explore [m]: ')

Please select the distance from the university you want to explore [m]: 8000


Now select your three favourite types of venues

In [10]:
FAV1_Name = input('Please Enter the name of the venue type: ') #let's the user enter Venue name
FAV1 = input('Please enter the matching CategoryId for your Favourite Venue Type 1: ' ) #Let's the user paste categoryId of Venue

Please Enter the name of the venue type: Music Venue
Please enter the matching CategoryId for your Favourite Venue Type 1: 4bf58dd8d48988d1e5931735


In [11]:
FAV2_Name = input('Please Enter the name of the venue type: ')
FAV2 = input('Please enter the matching CategoryId for your Favourite Venue Type 2: ' )

Please Enter the name of the venue type: Basketball Stadium
Please enter the matching CategoryId for your Favourite Venue Type 2: 4bf58dd8d48988d18b941735


In [12]:
FAV3_Name = input('Please Enter the name of the venue type: ')
FAV3 = input('Please enter the matching CategoryId for your Favourite Venue Type 3: ' )

Please Enter the name of the venue type: Beer Bar
Please enter the matching CategoryId for your Favourite Venue Type 3: 56aa371ce4b08b9a8d57356c


Next select the weighting (integer) of each Favourite Venue Type, the sum of the weighting must equal 100. Split the 100 between each of the three venues to ensure

In [13]:
WF1 = 0 #sets the initial weight factor to 0 of all weights
WF2 = 0
WF3 = 0
twf = [WF1, WF2, WF3] #array with different weight factors, used for assessing the total value of weight factors and used in the try/except functions below

In [14]:
while True:
    try:
        WF1 = int(input('Enter the weight you wish to assign Venue Type 1: ')) #Let's the user assign a value between 0-100 to rate the value of each Venue for them
        if sum(twf) < 100:
            break
        else:
            print('Total weight exceeds 100')
    except ValueError:
        print('Please enter a whole number')
        
rem1 = 100 - WF1 - WF2 - WF3
print('You have ', rem1, ' points left to distribute')

Enter the weight you wish to assign Venue Type 1: 50
You have  50  points left to distribute


In [15]:
while True:
    try:
        WF2 = int(input('Enter the weight you wish to assign Venue Type 1: ')) #Let's the user assign a value between 0-100 to rate the value of each Venue for them
        if sum(twf) < 100:
            break
        else:
            print('Total weight exceeds 100')
    except ValueError:
        print('Please enter a whole number')
        
rem1 = 100 - WF1 - WF2 - WF3
print('You have ', rem1, ' points left to distribute')

Enter the weight you wish to assign Venue Type 1: 30
You have  20  points left to distribute


In [16]:
while True:
    try:
        WF3 = int(input('Enter the weight you wish to assign Venue Type 1: ')) #Let's the user assign a value between 0-100 to rate the value of each Venue for them
        if sum(twf) < 100:
            break
        else:
            print('Total weight exceeds 100')
    except ValueError:
        print('Please enter a whole number')
        
rem1 = 100 - WF1 - WF2 - WF3
print('You have ', rem1, ' points left to distribute')

Enter the weight you wish to assign Venue Type 1: 20
You have  0  points left to distribute


In [17]:
# The code was removed by Watson Studio for sharing.

### Foursquare API calls - urls

In [18]:
#University of Copenhagen URL
UOC_url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&categoryId={},{},{}'.format(
    CLIENT_ID, 
    CLIENT_SECRET, 
    VERSION, 
    UOC_Lat, 
    UOC_Long, 
    DIST, 
    FAV1,
    FAV2,
    FAV3)

#Aarhus University URL
AAU_url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&categoryId={},{},{}'.format(
    CLIENT_ID, 
    CLIENT_SECRET, 
    VERSION, 
    AAU_Lat, 
    AAU_Long, 
    DIST, 
    FAV1,
    FAV2,
    FAV3)

#University of Southern Denmark URL
USD_url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&categoryId={},{},{}'.format(
    CLIENT_ID, 
    CLIENT_SECRET, 
    VERSION, 
    USD_Lat, 
    USD_Long, 
    DIST, 
    FAV1,
    FAV2,
    FAV3)

### Foursquare API calls - Request and create dataframe for each University

In [19]:
##CREATING DATAFRAMES
#UOC dataframe
UOC_results = requests.get(UOC_url).json()["response"]['groups'][0]['items']
UOC_venues_list=[]

UOC_venues_list.append([(
    v['venue']['name'],
    v['venue']['location']['lat'],
    v['venue']['location']['lng'],
    v['venue']['categories'][0]['name'],
    v['venue']['categories'][0]['id']) for v in UOC_results])

UOC_Venues = pd.DataFrame([item for UOC_venues_list in UOC_venues_list for item in UOC_venues_list])
UOC_Venues.columns = ['Venue', 'Venue Latitude', 
                      'Venue Longitude', 
                      'Venue Category', 'Venue Categories ID'] 
#AAU dataframe
AAU_results = requests.get(AAU_url).json()["response"]['groups'][0]['items']
AAU_venues_list=[]

AAU_venues_list.append([(
    v['venue']['name'],
    v['venue']['location']['lat'],
    v['venue']['location']['lng'],
    v['venue']['categories'][0]['name'],
    v['venue']['categories'][0]['id']) for v in AAU_results])

AAU_Venues = pd.DataFrame([item for AAU_venues_list in AAU_venues_list for item in AAU_venues_list])
AAU_Venues.columns = ['Venue', 'Venue Latitude', 
                      'Venue Longitude', 
                      'Venue Category', 'Venue Categories ID'] 
#USD dataframe
USD_results = requests.get(USD_url).json()["response"]['groups'][0]['items']
USD_venues_list=[]

USD_venues_list.append([(
    v['venue']['name'],
    v['venue']['location']['lat'],
    v['venue']['location']['lng'],
    v['venue']['categories'][0]['name'],
    v['venue']['categories'][0]['id']) for v in USD_results])

USD_Venues = pd.DataFrame([item for USD_venues_list in USD_venues_list for item in USD_venues_list])
USD_Venues.columns = ['Venue', 'Venue Latitude', 
                      'Venue Longitude', 
                      'Venue Category', 'Venue Categories ID'] 


### Count (through sum) number of Venues within Each Category

In [20]:
#Now we have created a dataframe for the venues in a selected distance to the University of Copenhagen, based on our favourite venue types
UOC_CFV1 = np.sum(UOC_Venues['Venue Categories ID'] == FAV1)
UOC_CFV2 = np.sum(UOC_Venues['Venue Categories ID'] == FAV2)
UOC_CFV3 = np.sum(UOC_Venues['Venue Categories ID'] == FAV3)


In [21]:
#Now we have created a dataframe for the venues in a selected distance to Aarhus University, based on our favourite venue types
AAU_CFV1 = np.sum(AAU_Venues['Venue Categories ID'] == FAV1)
AAU_CFV2 = np.sum(AAU_Venues['Venue Categories ID'] == FAV2)
AAU_CFV3 = np.sum(AAU_Venues['Venue Categories ID'] == FAV3)

In [22]:
#Now we have created a dataframe for the venues in a selected distance to the University of Southern Denmark, based on our favourite venue types
USD_CFV1 = np.sum(USD_Venues['Venue Categories ID'] == FAV1)
USD_CFV2 = np.sum(USD_Venues['Venue Categories ID'] == FAV2)
USD_CFV3 = np.sum(USD_Venues['Venue Categories ID'] == FAV3)

### Results
Now we have an overview of the number of FaVorite Venues, within a selected distance (DIST) from each University. Based on the Weightning of each Favourite venue, each university are assigned a "Match Score". The "Match Score" indicates which university fulfils the Exchange Students interest the most, based on their type of venue and the weighting. 

To Calculate the Match Score, we will utilize pandas inbuilt 'ranking' function. As we have three universities, this means the university with the most entries within a category will receive a '3', while the second most hits '2' and the least hits '1'. In case the universities are tied on a score, they will receive the same ranking.

The ranking (as it can be 3 for most and 1 for least hits), will be multipled with '1/3' of each of the favorite weightings. Meaning if a university gets the most hits, it will receive a 'full weighting '1/3 * rank * weight'. While the lowest rank, will score only '1/3' of the rating. Simple formula.

In [23]:
#Create dataframe of 'counted' values
universities = {FAV1_Name: [UOC_CFV1, AAU_CFV1, USD_CFV1], 
                FAV2_Name: [UOC_CFV2, AAU_CFV2, USD_CFV2],
                FAV3_Name: [UOC_CFV3, AAU_CFV3, USD_CFV3]}

University = {'University of Copenhagen','Aarhus University', 'University of Southern Denmark'}

University_Count = pd.DataFrame(universities, columns = [FAV1_Name, FAV2_Name, FAV3_Name])
University_Count.index = University
University_Count

Unnamed: 0,Music Venue,Basketball Stadium,Beer Bar
Aarhus University,12,0,18
University of Copenhagen,17,1,4
University of Southern Denmark,5,0,0


In [24]:
#Ranks each university with regards to Venue (High is better)
Uni_Rank = University_Count.rank()
Uni_Rank

Unnamed: 0,Music Venue,Basketball Stadium,Beer Bar
Aarhus University,2.0,1.5,3.0
University of Copenhagen,3.0,3.0,2.0
University of Southern Denmark,1.0,1.5,1.0


In [25]:
#This section calculates the weighted score for each Venue type, based in user input
Match_Score = Uni_Rank.rename(columns={FAV1_Name: FAV1_Name  + " Score", FAV2_Name: FAV2_Name  + " Score", FAV3_Name: FAV3_Name  + " Score"})
Match_Score[FAV1_Name  + " Score"] = (Match_Score[FAV1_Name  + " Score"]*WF1*(1/3))/100
Match_Score[FAV2_Name  + " Score"] = (Match_Score[FAV2_Name  + " Score"]*WF2*(1/3))/100
Match_Score[FAV3_Name  + " Score"] = (Match_Score[FAV3_Name  + " Score"]*WF3*(1/3))/100
Match_Score['Total Match'] = Match_Score.sum(axis=1)
Match_Score = Match_Score.sort_values('Total Match', ascending=False)
Match_Score

Unnamed: 0,Music Venue Score,Basketball Stadium Score,Beer Bar Score,Total Match
University of Copenhagen,0.5,0.3,0.133333,0.933333
Aarhus University,0.333333,0.15,0.2,0.683333
University of Southern Denmark,0.166667,0.15,0.066667,0.383333


### Recommendation

In [26]:
print('Your best Match is with:', '\033[1m' + Match_Score.index[0] + '\033[0m')
print('Your match:', '\033[1m' + Match_Score['Total Match'].iloc[0].round(2).astype(str) + '%' + '\033[0m')
print('This is based on your three chosen favourite venues', FAV1_Name + ", " + FAV2_Name + ", and " + FAV3_Name)

Your best Match is with: [1mUniversity of Copenhagen[0m
Your match: [1m0.93%[0m
This is based on your three chosen favourite venues Music Venue, Basketball Stadium, and Beer Bar
