Zsh History Analysis
zsh logs commands and timestamps to .zsh_history, which enables
shell features such as reverse history search.
This repository is a fun project that provides shell, Python, and R
scripts to parse, analyze, and visualize zsh history.
These scripts can be extended to support Bash's history as well.
You can run this on your .zsh_history files by cloning this repository
git clone https://github.com/bamos/zsh-history-analysis.git
and installing the following prerequisites.
Ensure you have increased the history file size so commands aren't removed.
Then, follow the steps in
Control Flow to generate the plots.
Python and Rscript, which can be installed from your package manager. On Arch Linux, the required packages are python and r.
R: ggplot2 and reshape are installed from an R shell with install.packages(c("ggplot2", "reshape")).
Increasing the History File Size
Unfortunately, zsh limits the history file to 10000 lines by default and truncates the history to this length by deduplicating entries and removing old data.
Adding the following lines to
.zshrc will remove the limits and
deduplication of the history file.
export HISTSIZE=1000000000
export SAVEHIST=$HISTSIZE
setopt EXTENDED_HISTORY
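With EXTENDED_HISTORY set, zsh writes each history entry as ": <start time>:<elapsed seconds>;<command>", which is what makes the time-based analysis possible. The following is a minimal sketch of parsing one such line in Python. The helper name parse_history_line is hypothetical and is not taken from analyze.py, and commands that span several lines are not handled.

```python
import re

# zsh EXTENDED_HISTORY entry: ": <start epoch>:<elapsed seconds>;<command>"
ENTRY_RE = re.compile(r"^: (\d+):(\d+);(.*)$", re.DOTALL)


def parse_history_line(line):
    """Return (start, elapsed, command), or None if the line is not an
    extended-history entry (hypothetical helper for illustration)."""
    match = ENTRY_RE.match(line.rstrip("\n"))
    if match is None:
        return None
    start, elapsed, command = match.groups()
    return int(start), int(elapsed), command
```

Lines written before EXTENDED_HISTORY was enabled carry no timestamp, so the sketch returns None for them rather than guessing a time.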
Control Flow
The following is the control flow for generating plots.
- Archive all of your .zsh_history files. ./pull-history-data.sh is a script that partially helps with archiving by pulling the files from a list of servers, separated by newlines, in a text file.
- Run ./analyze.py to analyze the raw data files. ./analyze.py --help will provide a help menu with the supported options.
- Run ./plot.r to generate plots from the analyzed data.
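The archiving step can be pictured with a short Python sketch that turns a newline-separated server list into copy commands. This is not the logic of pull-history-data.sh: the use of scp, the data directory, the per-server file naming, and the function name build_pull_commands are all assumptions made for illustration, and nothing is executed.

```python
def build_pull_commands(servers_text, dest_dir="data"):
    """Build one scp argument list per non-empty server line.

    Assumptions (not from the repository): scp is the transfer tool,
    and each history file is saved as <dest_dir>/<server>.zsh_history.
    """
    commands = []
    for line in servers_text.splitlines():
        server = line.strip()
        if not server:
            continue  # skip blank lines in the server list
        commands.append(
            ["scp", f"{server}:.zsh_history", f"{dest_dir}/{server}.zsh_history"]
        )
    return commands
```

Each returned list could be passed to subprocess.run once the destination directory exists.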
At a given hour or weekday, how frequently do I run commands? The following shows the average number of commands executed for each hour and weekday. I average 10 commands per hour overnight and a little more during the day, and Wednesdays seem to be my least productive days.
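One way to compute such per-hour averages is sketched below in Python. It is an illustration under stated assumptions, not the code in analyze.py: timestamps are epoch seconds interpreted in UTC, idle clock hours between the first and last command count as 0, and hourly_stats is a hypothetical name. The same idea applies to weekdays by grouping on the weekday instead of the hour.

```python
from collections import Counter
from datetime import datetime, timedelta, timezone
from statistics import mean, pstdev


def hourly_stats(timestamps, tz=timezone.utc):
    """Map hour of day (0-23) to (mean, population std) of commands per clock hour."""
    if not timestamps:
        return {}
    # Count commands in each concrete clock hour, e.g. 2014-05-13 17:00.
    buckets = Counter(
        datetime.fromtimestamp(ts, tz).replace(minute=0, second=0, microsecond=0)
        for ts in timestamps
    )
    # Walk every clock hour in the observed range so idle hours count as 0.
    per_hour = {hour: [] for hour in range(24)}
    current, last = min(buckets), max(buckets)
    while current <= last:
        per_hour[current.hour].append(buckets.get(current, 0))
        current += timedelta(hours=1)
    return {h: (mean(c), pstdev(c)) for h, c in per_hour.items() if c}
```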
Many hours have 0 commands executed since I'm not typing commands every hour of every day, so these points have a high standard deviation. Empirical cumulative distribution functions (ECDFs) give a more detailed view of the distributions.
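For readers unfamiliar with ECDFs, the sketch below shows the computation in plain Python: for each distinct value, the fraction of samples less than or equal to it. The function name ecdf is chosen for illustration, and the repository's plots may be produced differently.

```python
def ecdf(values):
    """Return sorted (value, fraction of samples <= value) pairs."""
    ordered = sorted(values)
    n = len(ordered)
    points = []
    for rank, value in enumerate(ordered, start=1):
        if points and points[-1][0] == value:
            points[-1] = (value, rank / n)  # ties: keep the highest rank
        else:
            points.append((value, rank / n))
    return points
```

With many zero-command hours, the first point of the ECDF already sits high on the vertical axis, which is the information the mean and standard deviation hide.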
Average command length
What command was over 100 characters!?
analyze.py outputs the top five commands by length. These
long commands come from using the full path to an executable,
such as the Android ARM cross compiler, as shown in the following output.
$ ./analyze.py commandLengths
105: /opt/android-ndk-r9/toolchains/arm-linux-androideabi-4.8/prebuilt/linux-x86/bin/arm-linux-androideabi-gcc
Zooming in on the bulk of the data shows that almost 50% of my commands are one or two characters long.
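Both length measurements can be sketched in a few lines of Python. The sketch assumes that a "command" is the command name alone (the sample output above suggests this, but it is a guess), and the function names longest_commands and short_fraction are hypothetical rather than taken from analyze.py.

```python
import heapq


def longest_commands(commands, n=5):
    """Return the n longest distinct commands as (length, command), longest first."""
    return heapq.nlargest(n, ((len(c), c) for c in set(commands)))


def short_fraction(commands, max_len=2):
    """Return the fraction of commands with at most max_len characters."""
    if not commands:
        return 0.0
    return sum(len(c) <= max_len for c in commands) / len(commands)
```

Deduplicating with set before ranking keeps a frequently repeated long command from filling every slot of the top five.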
Since almost 50% of my commands are one or two characters, what are the top commands? The following plot shows the top commands are Linux utilities and oh-my-zsh aliases.
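A ranking like this can be approximated by counting the first word of every history entry. The Python sketch below is illustrative only: it assumes the command name is the first whitespace-separated token, and top_commands is a hypothetical name rather than a function from analyze.py.

```python
from collections import Counter


def top_commands(history_lines, n=10):
    """Count the first word of each non-empty line and return the n most common."""
    names = (line.split()[0] for line in history_lines if line.strip())
    return Counter(names).most_common(n)
```

Aliases are counted under the alias name, not the command they expand to, which is why short oh-my-zsh aliases can show up near the top.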