What is persanalytics?
persanalytics is a repo in which I collect and visualize personal analytics.
persanalytics contains (for now):
- keystroke frequency information collected using minute-agent (my modified fork).
- todos (current and completed) managed using t.
- cycling data collected using a Garmin Edge 500.
- music tracks scrobbled on last.fm.
- instant messaging chat logs (xml logs collected using Adium).
This is all so I can play around with data and practice plotting and analyzing it, and get some insight into changes over time in the process. My goal is to collect and visualize data that goes back years: keystrokes, emails, messages/SMS, and any physical activities I can record.
After a while, the above plot loses its usefulness. The lines are pushed against each other and, aside from the loess smoothing overlay, the viewer doesn't get any information from the line plot itself.
This is a good time to use a simple rolling mean.
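As a minimal sketch of that idea, here is a trailing rolling mean in base R. The function name `rolling_mean`, the window size, and the sample counts are all made up for illustration; the actual script may well use a different function or package.

```r
# Trailing rolling mean over a window of k observations (base R only).
# `rolling_mean` is a hypothetical helper, not a function from this repo.
rolling_mean <- function(x, k = 7) {
  # With equal weights and sides = 1, stats::filter averages the current
  # value and the k - 1 values before it. The first k - 1 results are NA.
  as.numeric(stats::filter(x, rep(1 / k, k), sides = 1))
}

# Made-up daily keystroke counts, for illustration only.
daily <- c(100, 200, 300, 400, 500, 600)
smoothed <- rolling_mean(daily, k = 3)
print(smoothed)
```

A wider window gives a smoother line at the cost of hiding short-lived spikes, so the window size is worth experimenting with.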
My sensors don't measure which gear I'm in, so I created a pseudo "average gear" score:
gear score = total strokes / distance
The higher the value, the smaller the gear.
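A small sketch of the calculation, assuming one row per ride. The column names, the kilometre unit, and the numbers are assumptions for illustration, not values from my rides.

```r
# Hypothetical per-ride summary; column names and values are invented.
rides <- data.frame(
  total_strokes = c(5400, 7200),  # pedal strokes recorded over the ride
  distance_km   = c(30, 45)       # ride distance (unit assumed)
)

# gear score = total strokes / distance
# More strokes per unit of distance means a smaller average gear.
rides$gear_score <- rides$total_strokes / rides$distance_km
print(rides)
```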
- The larger the point, the higher the gear.
- The redder the point, the higher the average heart rate.
Average heart rate is a good metric for how intense a training session was.
I am especially happy that I can finally get my music data. There is a clear pattern that I long knew/suspected, but am still impressed I can see in the plots: I love music, but I've been listening to less of it lately. The main reason is that I'm listening to more and more podcasts.
Even though the following plot implies that I am "loving" fewer and fewer tracks as time goes on, I think that's misleading. I am always finding tracks that I can't stop listening to; I just don't use the "Love this track" feature of last.fm as much as I used to.
I love regular expressions.
I got around to merging and parsing the logs from my most recent IM account. According to the final tally, there are 79715 messages that I've sent and received using that account since September 2012.
persanalytics is a collection of R (.R) scripts written in RStudio. Each script, when run, performs all the merging, crunching, nip 'n tucking, and plotting needed to arrive at the final plots, and saves them to
All plots are made using the ggplot2 library/package because it's awesome.
As mentioned above, keystroke frequencies per minute are collected using minute-agent.
keystrokes.R reads the .log file and handles the rest.
I use t to manage my todos, and I love it. It is the only task-management system that has ever worked for me.
t saves current todos in a file called tasks.txt and all completed todos in
todos.R reads both text files, counts the number of current and completed tasks, and appends the counts to data/todos.csv. It also produces the above plot.
/usr/bin/rscript --vanilla /Users/sherif/persanalytics/todos.R
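The count-and-append step can be sketched roughly like this. It assumes each task sits on its own line, and the helper names (`count_tasks`, `log_todo_counts`), the CSV column names, and the `done.txt` file name in the demo are hypothetical stand-ins rather than what todos.R actually uses.

```r
# Count non-empty lines in a task file (assumes one task per line).
count_tasks <- function(path) {
  if (!file.exists(path)) return(0L)
  lines <- readLines(path, warn = FALSE)
  sum(nzchar(trimws(lines)))
}

# Append one dated row of counts to a CSV, writing the header only once.
log_todo_counts <- function(current_path, done_path, csv_path,
                            date = Sys.Date()) {
  row <- data.frame(
    date      = format(date),
    current   = count_tasks(current_path),
    completed = count_tasks(done_path)
  )
  new_file <- !file.exists(csv_path)
  write.table(row, csv_path, sep = ",", append = !new_file,
              col.names = new_file, row.names = FALSE, quote = FALSE)
  invisible(row)
}

# Demo in a temporary directory with made-up tasks.
demo_dir <- tempfile("todos_demo")
dir.create(demo_dir)
current_file <- file.path(demo_dir, "tasks.txt")
done_file    <- file.path(demo_dir, "done.txt")  # hypothetical file name
csv_file     <- file.path(demo_dir, "todos.csv")
writeLines(c("buy milk", "fix bike"), current_file)
writeLines("call mom", done_file)

log_todo_counts(current_file, done_file, csv_file, as.Date("2013-01-01"))
log_todo_counts(current_file, done_file, csv_file, as.Date("2013-01-02"))
history <- read.csv(csv_file)
print(history)
```

Appending one row per run is what makes a scheduled command like the one above useful: the CSV grows into a time series that can be plotted later.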
I scrobble all my music. last.fm allows you to request your listening archive. You receive .json files of your total, loved, banned, bootstrapped, and skipped tracks.
music.R reads in the .tsv file and does what it does.
I've been using Adium since I first started using a Mac. This was a great decision because it means that even back when I was using Google Chat, I had all my chat logs stored locally. Adium saves chat logs in
~/Library/Application Support/Adium 2.0/Users/Default/Logs
However, I keep my Adium 2.0 folder symlinked on Dropbox, which means that all my settings and chat logs are kept in sync between my two computers. It works surprisingly well.
Within Logs, there is a folder per account. Within those there is a folder per contact. Within those there is a folder (with the extension .chatlog) per conversation, and within those are .xml files with the content of the conversations.
IM.R executes a bash command to cat all of those scattered .xml files into one mergedIM.xml file, which it then reads in and does some regex stuff to split it into the interesting components.
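To give a flavour of the regex step, here is a rough base R sketch. It assumes each message sits on one line as a `<message>` element with `sender` and `time` attributes in that order; the sample lines are invented, and `parse_messages` is a hypothetical helper, not the code in IM.R.

```r
# Pull sender, timestamp, and plain text out of <message> lines.
# Assumes one message per line, with sender appearing before time.
parse_messages <- function(xml_lines) {
  pattern <- '<message[^>]*sender="([^"]*)"[^>]*time="([^"]*)"[^>]*>(.*)</message>'
  hits <- regmatches(xml_lines, regexec(pattern, xml_lines))
  # Keep only lines that matched: full match plus three capture groups.
  hits <- hits[lengths(hits) == 4]
  data.frame(
    sender = vapply(hits, `[`, character(1), 2),
    time   = vapply(hits, `[`, character(1), 3),
    # Strip the nested formatting tags to leave only the message text.
    text   = gsub("<[^>]+>", "", vapply(hits, `[`, character(1), 4)),
    stringsAsFactors = FALSE
  )
}

# Invented sample lines in the assumed format.
sample_lines <- c(
  '<message sender="alice@example.com" time="2012-09-14T21:03:11-04:00" alias="Alice"><div><span style="font-size: 12pt;">hello there</span></div></message>',
  '<status type="offline" sender="bob@example.com" time="2012-09-14T21:05:00-04:00"/>',
  '<message sender="bob@example.com" time="2012-09-14T21:04:02-04:00"><div><span>hi!</span></div></message>'
)
msgs <- parse_messages(sample_lines)
print(msgs)
```

Regex parsing like this is brittle if attributes change order or a message spans several lines, which is part of why it only counts as "regex stuff".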
If there is an easier way to parse xml using R, I don't know it.
Data on the todo list
The following data are being collected; I just need to figure out how to obtain and parse them.
- Email (incoming and outgoing). (See Visualize.mmBundle).