# 381 Project Notebook

### Imports

First we import the package we will be using, then we load our CSV into an SFrame (Turi Create's tabular data structure).

In [1]:
import turicreate as tc

In [2]:
activity_data = tc.SFrame('activity.csv')
activity_data

------------------------------------------------------
Inferred types from first 100 line(s) of file as 
column_type_hints=[str,str,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int]
If parsing fails due to incorrect types, you can correct
the inferred type list above and pass it to read_csv in
the column_type_hints argument
------------------------------------------------------


Activity,Group,<15 min,15 to 30,30 to 60,hour+,Socialization,Outside home,Weather Permitting
Jog,Exercise,1,1,1,1,0,1,1
Lifting,Exercise,0,1,1,1,0,1,0
Swimming,Exercise,0,0,1,1,0,1,0
Biking,Exercise,1,1,1,1,0,1,1
Basketball,Exercise,0,1,1,1,1,1,0
Walk,Exercise,1,1,1,1,0,1,1
Hot Yoga,Exercise,0,1,1,0,1,1,0
Treadmill/Bike,Exercise,1,1,1,1,0,1,0
Hike,Exercise,0,0,0,1,1,1,1
Go for a drive,Relaxation,0,1,1,1,0,1,0

People Interaction,Nature,Healthy Physically,Healthy Intellectually,Healthy Mentally,Inexpensive to Enter
1,1,1,0,1,1
1,0,1,0,1,0
1,0,1,0,1,0
1,1,1,0,1,0
1,0,1,0,1,0
1,1,1,0,1,1
1,0,1,0,1,0
0,0,1,0,1,0
1,1,1,0,1,0
0,0,0,0,1,0

Inexpensive each time,Accessible,Impromptu,Relaxing,Active Mind,Screen,Coordination,Requires nonbasic Resources ...
1,1,1,0,0,0,0,0
0,1,1,0,0,0,0,1
0,0,0,0,0,0,0,1
1,1,1,0,0,0,0,1
0,0,1,0,0,0,1,1
1,1,1,0,0,0,0,0
0,1,0,0,0,0,1,1
1,1,1,0,0,0,0,1
0,0,0,1,0,0,0,1
0,0,0,1,1,0,0,1

Audio,Visual,Tactile,Taylor,Justin,Gabriel
0,0,0,0,1,1
0,0,1,1,0,1
0,0,0,0,0,0
0,0,1,0,0,1
1,1,1,1,0,0
0,0,0,0,0,1
1,0,0,0,0,0
0,0,0,1,0,1
0,1,0,0,0,1
1,1,1,0,0,0


### Modelling
Here we specify the type of model we want, create it, and use it to determine the most similar activities.

In [3]:
content_similarity_recommender = tc.recommender.item_content_recommender.create(item_data=activity_data, item_id='Activity')

Applying transform:
Class             : AutoVectorizer

Model Fields
------------
Features          : ['Group', '<15 min', '15 to 30', '30 to 60', 'hour+', 'Socialization', 'Outside home', 'Weather Permitting', 'People Interaction', 'Nature', 'Healthy Physically', 'Healthy Intellectually', 'Healthy Mentally', 'Inexpensive to Enter', 'Inexpensive each time', 'Accessible', 'Impromptu', 'Relaxing', 'Active Mind', 'Screen', 'Coordination', 'Requires nonbasic Resources', 'Audio', 'Visual', 'Tactile', 'Taylor', 'Justin', 'Gabriel']
Excluded Features : ['Activity']

Column                       Type  Interpretation  Transforms   Output Type
---------------------------  ----  --------------  -----------  -----------
Group                        str   categorical     None         str        
<15 min                      int   categorical     astype(str)  str        
15 to 30                     int   categorical     astype(str)  str        
30 to 60                     int   categorical     ast

Defaulting to brute force instead of ball tree because there are multiple distance components.


Below are the ten most similar activities for each activity in our CSV.

In [4]:
top_ten = content_similarity_recommender.get_similar_items()
top_ten.print_rows(420,3)

+-----------------------------+-----------------------------+--------------------+-----+
|           Activity          |           similar           |       score        | ... |
+-----------------------------+-----------------------------+--------------------+-----+
|             Jog             |             Walk            | 0.9642857313156128 | ... |
|             Jog             |            Biking           | 0.8571428656578064 | ... |
|             Jog             |        Treadmill/Bike       |        0.75        | ... |
|             Jog             |           Lifting           | 0.6785714030265808 | ... |
|             Jog             |        Music Therapy        | 0.6428571343421936 | ... |
|             Jog             |     Eat a healthy snack     | 0.6428571343421936 | ... |
|             Jog             |          Meditation         | 0.6428571343421936 | ... |
|             Jog             |      Muscle Relaxation      | 0.6071428656578064 | ... |
|             Jog    
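
The scores in the table above are consistent with a simple reading: every column is treated as categorical, two activities score 1 on a column when their values match and 0 otherwise, and the final score is the average over the 28 feature columns. The following is a minimal sketch under that assumption, written in plain Python rather than Turi Create. `match_fraction` is a hypothetical helper and the rows are copied from the data preview in the Imports section; it illustrates the arithmetic and is not the library's implementation.

```python
# Feature rows copied from the activity.csv preview: 'Group' followed by the
# 27 flag columns, in the same column order as the file (28 features in total).
ROWS = {
    "Jog":            ["Exercise", 1, 1, 1, 1, 0, 1, 1,  1, 1, 1, 0, 1, 1,
                       1, 1, 1, 0, 0, 0, 0, 0,  0, 0, 0, 0, 1, 1],
    "Walk":           ["Exercise", 1, 1, 1, 1, 0, 1, 1,  1, 1, 1, 0, 1, 1,
                       1, 1, 1, 0, 0, 0, 0, 0,  0, 0, 0, 0, 0, 1],
    "Biking":         ["Exercise", 1, 1, 1, 1, 0, 1, 1,  1, 1, 1, 0, 1, 0,
                       1, 1, 1, 0, 0, 0, 0, 1,  0, 0, 1, 0, 0, 1],
    "Treadmill/Bike": ["Exercise", 1, 1, 1, 1, 0, 1, 0,  0, 0, 1, 0, 1, 0,
                       1, 1, 1, 0, 0, 0, 0, 1,  0, 0, 0, 1, 0, 1],
    "Lifting":        ["Exercise", 0, 1, 1, 1, 0, 1, 0,  1, 0, 1, 0, 1, 0,
                       0, 1, 1, 0, 0, 0, 0, 1,  0, 0, 1, 1, 0, 1],
}


def match_fraction(a, b):
    """Fraction of columns on which two feature rows hold the same value."""
    if len(a) != len(b):
        raise ValueError("rows must have the same number of columns")
    return sum(x == y for x, y in zip(a, b)) / len(a)


# Score every other activity against Jog, highest first.
scores = {name: match_fraction(ROWS["Jog"], row)
          for name, row in ROWS.items() if name != "Jog"}
for name, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"Jog vs {name}: {score:.4f}")
```

Jog and Walk differ only in the `Justin` column, which gives 27/28 ≈ 0.9643; Biking, Treadmill/Bike and Lifting come out at 24/28, 21/28 and 19/28, in line with the first four rows of the table.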

### Making suggestions to a user

Now that the model is working, we test it against a sample user who might be using our system.

They tell us which activities they enjoy and how many periods of free time they have throughout the day, along with the duration category each period falls under.

In [5]:
user_activities = ['Draw', 'Watch Videos', 'Yoga']
user_timings = {
    "<15 min": 1,
    "15 to 30": 3,
    "30 to 60": 2,
    "hour+": 1
}
user_activities = set(user_activities)  # de-duplicate the user's list

Here we collect the set of activities that are similar to the ones the user entered.

In [6]:
# Collect every neighbour of the user's activities that they did not already list.
similar_activities = set()
for user_activity in user_activities:
    for row in top_ten[top_ten['Activity'] == user_activity]:
        # Skip the user's own activities and neighbours with a zero score.
        if row['similar'] not in user_activities and row['score'] > 0:
            similar_activities.add(row['similar'])
print(similar_activities)

{'Watch tv', 'Plan your day/week', 'Bars', 'Search for jobs', 'Music Therapy', 'Museum', 'Puzzles', 'Hot Yoga', 'Watch a movie', 'Basketball', 'Clubs', 'Learn something new', 'Read Textbook', 'Meditation', 'Self project', 'Spa', 'Video Games', 'Look into your future plans', 'Nap', 'Read', 'Social Media', 'Massage', 'Journaling', 'Swimming'}
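
The filter above keeps any neighbour with a non-zero score. One possible refinement is to require a minimum score, sketched below in plain Python so that it runs without Turi Create. `collect_similar` and `min_score` are hypothetical names, and `SIMILAR_ROWS` is a stand-in for `top_ten` built from the Jog rows printed earlier, with scores rounded to four decimals.

```python
# Stand-in for the rows of `top_ten`: (Activity, similar, score).
SIMILAR_ROWS = [
    ("Jog", "Walk", 0.9643),
    ("Jog", "Biking", 0.8571),
    ("Jog", "Treadmill/Bike", 0.75),
    ("Jog", "Lifting", 0.6786),
    ("Jog", "Music Therapy", 0.6429),
    ("Jog", "Eat a healthy snack", 0.6429),
    ("Jog", "Meditation", 0.6429),
    ("Jog", "Muscle Relaxation", 0.6071),
]


def collect_similar(rows, user_activities, min_score=0.0):
    """Neighbours of the user's activities, excluding the activities themselves.

    `min_score` is an assumed tuning knob; the default of 0.0 reproduces
    the "any non-zero score" behaviour of the loop above.
    """
    liked = set(user_activities)
    return {
        similar
        for activity, similar, score in rows
        if activity in liked and similar not in liked and score > min_score
    }


print(sorted(collect_similar(SIMILAR_ROWS, ["Jog", "Walk"], min_score=0.65)))
# ['Biking', 'Lifting', 'Treadmill/Bike']
```

Walk is dropped because the user already listed it, and the four neighbours scoring below 0.65 are filtered out. Where to set the threshold is a judgment call that depends on how adventurous the suggestions should be.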


Using the set of similar activities, we make suggestions for each of the time periods the user has available throughout the day, offering up to one more activity than the number of periods of that length.

In [7]:
# One list of suggestions per duration category.
activity_per_period = {
    "<15 min": [],
    "15 to 30": [],
    "30 to 60": [],
    "hour+": []
}

suggestions_made = set()  # every activity suggested for at least one period
for period in user_timings:
    # Only consider activities flagged as fitting this duration.
    for row in activity_data[activity_data[period] > 0]:
        # Allow one spare suggestion beyond the number of free periods.
        if row['Activity'] in similar_activities and len(activity_per_period[period]) < user_timings[period] + 1:
            activity_per_period[period].append(row['Activity'])
            suggestions_made.add(row['Activity'])

print(activity_per_period)

{'<15 min': ['Nap', 'Read'], '15 to 30': ['Basketball', 'Hot Yoga', 'Watch tv', 'Nap'], '30 to 60': ['Swimming', 'Basketball', 'Hot Yoga'], 'hour+': ['Swimming', 'Basketball']}
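
The output above repeats some activities across periods (Basketball appears three times), because `suggestions_made` is filled but never consulted. If each activity should be suggested at most once, one option is to skip anything already in that set. The sketch below is plain Python with a hypothetical `allocate` helper. The three rows and their timing flags come from the data preview, and the spare suggestion per period mirrors the loop above.

```python
# Timing flags for three of the similar activities, from the activity.csv preview.
ACTIVITY_ROWS = [
    {"Activity": "Swimming",   "<15 min": 0, "15 to 30": 0, "30 to 60": 1, "hour+": 1},
    {"Activity": "Basketball", "<15 min": 0, "15 to 30": 1, "30 to 60": 1, "hour+": 1},
    {"Activity": "Hot Yoga",   "<15 min": 0, "15 to 30": 1, "30 to 60": 1, "hour+": 0},
]


def allocate(rows, similar, timings):
    """Assign each similar activity to at most one period, in period order."""
    per_period = {period: [] for period in timings}
    used = set()
    for period, free_slots in timings.items():
        for row in rows:
            # Stop once the period has its slots plus one spare suggestion.
            if len(per_period[period]) >= free_slots + 1:
                break
            name = row["Activity"]
            if row[period] > 0 and name in similar and name not in used:
                per_period[period].append(name)
                used.add(name)
    return per_period


user_timings = {"<15 min": 1, "15 to 30": 3, "30 to 60": 2, "hour+": 1}
result = allocate(ACTIVITY_ROWS, {"Swimming", "Basketball", "Hot Yoga"}, user_timings)
print(result)
# {'<15 min': [], '15 to 30': ['Basketball', 'Hot Yoga'], '30 to 60': ['Swimming'], 'hour+': []}
```

The trade-off is that earlier periods claim activities first, so later periods can end up empty when the candidate pool is small, as `hour+` does here. Whether repeats are acceptable depends on how the suggestions are presented to the user.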
