## Overview

In August of 2021, I created an activity log, noting when I started a task or activity and how long I spent on it. The purpose, in a sentence: to see if my actions align with my goals. The intent was not to describe every detail of my day but rather to see what I __tried__ to do.

In [None]:
import numpy
import pandas as pd
import plotly.express as px # quickly create graphs from dataframes for exploration

df = pd.read_csv('Activity-Daily_Log.csv') # create a pandas DataFrame from a csv file
df.head(10) # day one and the start of day two (August 24 and 25)

"Perhaps", I thought to myself, "if I could see what I did with my days, maybe I'd start using my time more effectively."

Time management is not the problem that I aimed to investigate. I am a mostly functional adult human, and high-quality tools already exist for that purpose. My issue is with life itself, in a good way. Let me explain: there is a buzzing whirlwind of stimulation around us every day, and meanwhile I have overarching goals related to family, hobbies, and health. I have found that these larger areas of life can shrink as other areas grow. This is a natural part of life. Anyway, this dataset is a 30-day sample from my __Daily Activity Log__.

## How much TV am I watching?

This was the first question I had to answer. After the fourth day of tracking, I felt a little guilty entering the data. I knew that I had watched TV for at least one hour each day, so that was where I started.

In [None]:
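# Locate only the sessions that lasted at least an hour and whose Action was logged more than once
# (transform('count') gives each row the number of rows sharing its Action)
dff = df.loc[(df['Duration'] >= 1) & (df.groupby('Action')['Action'].transform('count') > 1)]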

df.loc[(df.Event == 'Dance'),'Event']='Hip-Hop'

Pandas is a powerful data analysis library, but you can also use it for tiny spreadsheets and philosophical questions. In the previous cell, we use the .loc indexer to keep only the entries where the __Duration__ value is at least one; everything shorter is filtered out. Plotly Express allows data visualization to be an iterative and interactive process. In the cell below, we create a pie chart using the __Action__ and __Duration__ columns from the filtered dataframe.

In [None]:
fig = px.pie(dff, names='Action', values='Duration', title='What I spent my time on')
fig.show() # if you are reading this and no graph is below you should watch this video of me explaining it 
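
A filter that combines "at least an hour" with "logged more than once" can be built from two boolean masks. The following is a minimal sketch, assuming a tiny made-up log; the `log` frame, its values, and the names `long_enough` and `repeated` are illustrative and are not part of the real dataset.

```python
import pandas as pd

# Hypothetical stand-in for the activity log; the real CSV is not shown here.
log = pd.DataFrame({
    "Action": ["Watch TV", "Watch TV", "Walk", "Learn Python", "Learn Python", "Reading"],
    "Duration": [2.0, 1.5, 0.5, 1.0, 0.25, 3.0],
})

# Condition 1: the session lasted at least one hour.
long_enough = log["Duration"] >= 1

# Condition 2: the Action appears more than once in the whole log.
# transform("count") gives every row the size of its own Action group.
repeated = log.groupby("Action")["Action"].transform("count") > 1

# Keep only the rows where both conditions hold.
filtered = log.loc[long_enough & repeated]
print(filtered)
```

Because both masks are row-aligned with the frame, `&` combines them element by element, which a single scalar comparison cannot do.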

It's clear that I spent a lot of time watching TV. I had a goal to learn more about Python during this time, so let's focus on that now. Let's look at just one __Action__:

In [None]:
# return all sessions where I did Learn Python
# (group by the column name, not a one-item list, so get_group accepts a plain string)
learn_python_sessions_df = df.groupby('Action').get_group('Learn Python')
learn_python_sessions_df.head()

This was helpful in proving to myself that I was doing things during the day, but it would be more helpful if I could see __what__ I was doing. Tracking time in this way presents multiple problems: the data entry is arduous, accuracy is low, and unaccounted-for time is a mystery. I learned about these problems as I tracked my activities for 30 days. The thought of entering bad data into a study where I am both researcher and subject helped to keep me honest.

In [None]:
# simple bar chart for the grouped dataframe 
sessions_duration_fig = px.bar(learn_python_sessions_df, x='Date', y='Duration', title='Learn Python Session Durations')
sessions_duration_fig.show()

Let's ask a few more questions about this specific activity:

In [None]:
# How many times did I sit down to learn python?
number_of_sessions = learn_python_sessions_df['Duration'].count()
# How much time did I spend in total?
total_time = learn_python_sessions_df['Duration'].sum()
# How long did I usually do it for?
mean_time = learn_python_sessions_df['Duration'].mean()

formatted_string = "" # use this later
print(number_of_sessions, total_time, mean_time)
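
The three questions above can also be answered in a single call. This is a minimal sketch, assuming a small made-up set of sessions and assuming `Duration` is measured in hours; the `sessions` frame and its values are hypothetical.

```python
import pandas as pd

# Hypothetical sample of sessions for a single Action.
sessions = pd.DataFrame({
    "Date": ["2021-08-24", "2021-08-25", "2021-08-27"],
    "Duration": [1.0, 0.5, 1.5],
})

# One call returns the session count, total time, and mean time together.
summary = sessions["Duration"].agg(["count", "sum", "mean"])

# Label each number so the printed line explains itself.
print(
    f"sessions: {int(summary['count'])}, "
    f"total: {summary['sum']:.2f} h, "
    f"mean: {summary['mean']:.2f} h"
)
```

The result is a Series indexed by the statistic names, so each value can be looked up by label instead of being stored in a separate variable.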

Or, we can ask these same questions of every __Action__:

In [None]:
#df[["Action", "Duration"]].describe()
# number of logged sessions per Action, most frequent first
number_sessions_per_action_group = df.groupby("Action")["Duration"].count().sort_values(ascending=False)
number_sessions_per_action_group

#mean_durations = df.groupby("Action")["Duration"].mean().sort_values(ascending=False)
#mean_durations.head()


How often did I add a follow-up task?

In [None]:
df['Action'].value_counts().head(10)

In [None]:
action_group = df.groupby('Action')  # group by column name so get_group takes a plain string

# MANUALLY GRAB THE TOP TEN THINGS I DO
tidy_kitchen = action_group.get_group('Tidy Kitchen')
make_food = action_group.get_group('Make Food')
learn_python = action_group.get_group('Learn Python')
watch_tv = action_group.get_group('Watch TV')
philosophy = action_group.get_group('Philosophy')
ride_bike = action_group.get_group('Ride bike')
work_out = action_group.get_group('Work out')
groceries = action_group.get_group('Groceries')
reading = action_group.get_group('Reading')
walk = action_group.get_group('Walk')


In [None]:
# this seems like probably the least efficient way to grab my data lol
tidy_kitchen_total = tidy_kitchen['Duration'].sum()
learn_python_total = learn_python['Duration'].sum()
make_food_total = make_food['Duration'].sum()
watch_tv_total = watch_tv['Duration'].sum()
philosophy_total = philosophy['Duration'].sum()
ride_bike_total = ride_bike['Duration'].sum()
work_out_total = work_out['Duration'].sum()
groceries_total = groceries['Duration'].sum()
reading_total = reading['Duration'].sum()
walk_total = walk['Duration'].sum()

all_totals_list = [
    ['Tidy Kitchen', tidy_kitchen_total], ['Learn Python', learn_python_total],
    ['Make Food', make_food_total], ['Watch TV', watch_tv_total],
    ['Philosophy', philosophy_total], ['Ride Bike', ride_bike_total],
    ['Work Out', work_out_total], ['Groceries', groceries_total],
    ['Reading', reading_total], ['Walk', walk_total],
]

all_totals_df = pd.DataFrame(all_totals_list, columns=['Action', 'Duration'])

In [None]:
# the below should do just fine
activities = df['Action'].unique()
activity_means = []
activity_groups = df.groupby('Action')
for activity in activities:
    tmp_group = activity_groups.get_group(activity)
    group_mean = tmp_group['Duration'].mean()  # average only the Duration column
    activity_means.append(group_mean)
    print(f"{activity},{group_mean}")

# although I am missing some understanding of pandas and could probably do this with one line
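
For what it's worth, a one-line version does exist. Here is a minimal sketch, assuming a made-up log with the same `Action` and `Duration` column names; the data values are hypothetical.

```python
import pandas as pd

# Hypothetical stand-in for the activity log.
log = pd.DataFrame({
    "Action": ["Watch TV", "Walk", "Watch TV", "Walk", "Reading"],
    "Duration": [2.0, 0.5, 1.0, 1.5, 3.0],
})

# Mean Duration per Action in one line, longest average first.
activity_means = log.groupby("Action")["Duration"].mean().sort_values(ascending=False)
print(activity_means)
```

Selecting `Duration` before calling `mean()` keeps the calculation on the numeric column only, which avoids errors from text columns such as dates or notes.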

In [None]:
all_totals_df.head(12)

In [None]:
fig = px.pie(all_totals_df, names='Action', values='Duration', title='Top Ten Aggregated Activities')
fig.show()

In [None]:
# attempt a smarter way about doing this 

# start with the top ten activities by count

# use this as an iterative list to get a sum of each activity (instead of doing each one by one)
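
One possible way to carry out the plan in these comments is sketched below. It assumes a small made-up log, and it uses `TOP_N = 2` instead of ten only to keep the sample readable; `log`, `TOP_N`, `top_actions`, and `all_totals` are illustrative names.

```python
import pandas as pd

# Hypothetical stand-in for the activity log.
log = pd.DataFrame({
    "Action": ["Watch TV", "Walk", "Watch TV", "Reading", "Walk", "Watch TV", "Groceries"],
    "Duration": [2.0, 0.5, 1.0, 3.0, 1.5, 1.0, 0.75],
})

TOP_N = 2  # the notebook uses ten; two keeps this sample readable

# Names of the most frequently logged activities.
top_actions = log["Action"].value_counts().head(TOP_N).index

# Keep only those rows, then total the Duration for each Action.
all_totals = (
    log[log["Action"].isin(top_actions)]
    .groupby("Action", as_index=False)["Duration"]
    .sum()
)
print(all_totals)
```

The result has the same `Action` and `Duration` columns as `all_totals_df`, so it could be passed to `px.pie` in the same way without grabbing each group by hand.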

In [None]:
# now regenerate our graph 