Wastat is a toolkit for analysing WhatsApp chats, creating statistics and plotting pretty graphs.
It is assumed that you are using a *nix system, such as Linux, a BSD or macOS. You will need a POSIX-compatible shell, Perl, AWK and Gnuplot, although Gnuplot can be omitted if you don't wish to create plots. Basic acquaintance with shell operations is also expected.
To work with a chat, one first has to receive it in email format. Later on, it might become possible to extract the necessary information from a SQLite database, which one can access when one's phone is rooted. Refer to the official WhatsApp FAQ to find out how to email a chat.
One should note that WhatsApp doesn't always allow exporting the full chat, due to limits on the export size. This is an external limitation that this project can't do anything about.
Since different WhatsApp versions using different languages export chats in different ways, and in a generally inconvenient format, the `waconv` script standardizes these formats into a simple-to-parse TSV structure. This means that tools like AWK (`waextr`, for example) can easily process the chat from then on.
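As a rough illustration of why a TSV layout is convenient, the following sketch counts messages per user with a few lines of AWK. It is not part of wastat. It assumes that the converted chat has four tab-separated columns (date, time, user, message), and the `sample` and `count_by_user` helpers are hypothetical names introduced only for this example.

```shell
# Hypothetical sample of converted output. The column layout
# (date, time, user, message, separated by tabs) is an assumption.
sample() {
    printf '24/01/2018\t21:49\tfaust\twhat meaning to these riddling words applies\n'
    printf '24/01/2016\t22:20\tmephisto\ti am the spirit ever that denies\n'
}

# Read converted lines from stdin and print "user<TAB>count",
# sorted by user name.
count_by_user() {
    awk -F '\t' '{ n[$3]++ } END { for (u in n) printf "%s\t%d\n", u, n[u] }' | sort
}

sample | count_by_user
```

With a real chat, one would pipe the redirected output of `waconv` into `count_by_user` instead of the sample.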
Currently, three different formats are recognized, with the following associated codes:
Format | Code | Date | Time |
---|---|---|---|
US | us | MM/DD/YYYY | AM/PM |
UK | uk | DD/MM/YYYY | A.M./P.M. |
German | de | DD/MM/YYYY | vorm./nachm. |
Some of these might be out of date with newer versions of WhatsApp, and will be updated as soon as possible.
To actually process a file made up of lines like these (i.e. the `uk` format):
24/01/2018, 9:49 p.m. - Faust: What meaning to these riddling words applies?
24/01/2016, 10:20 p.m. - Mephisto: I am the spirit, ever, that denies!
And rightly so: since everything created
In turn deserves to be annihilated.
one would write `./waconv uk [chatfile]` and redirect the output. The above example would thus become:
24/01/2018 21:49 faust what meaning to these riddling words applies
24/01/2016 22:20 mephisto i am the spirit ever that denies and rightly so since everything created in turn deserves to be annihilated
This step is necessary if one wants to work with the following two tools.
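One part of this normalization is turning 12-hour times such as `9:49 p.m.` into 24-hour times such as `21:49`. The sketch below shows one way this could be done in AWK. It is not the actual `waconv` implementation. The helper name `to24` is hypothetical, and zero-padding single-digit hours is an assumption.

```shell
# Sketch only: convert a 12-hour clock time such as "9:49 p.m." to
# 24-hour form. Reads one "H:MM suffix" pair per line from stdin.
to24() {
    awk '{
        split($1, hm, ":")            # hm[1] = hour, hm[2] = minutes
        h = hm[1] + 0
        pm = ($2 ~ /^[Pp]/)           # "p.m.", "PM", ... mean afternoon
        if (pm && h < 12) h += 12     # 9 p.m.  -> 21
        if (!pm && h == 12) h = 0     # 12 a.m. -> 0
        printf "%02d:%s\n", h, hm[2]
    }'
}

echo '9:49 p.m.' | to24     # prints 21:49
echo '10:20 p.m.' | to24    # prints 22:20
```

The two special cases are the hours around noon and midnight: 12 p.m. stays 12, while 12 a.m. becomes 0.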
`waextr` is basically just a helper script for `wastat`. It requires one argument, which may contain one or more of the following letters to enable the output of certain columns: `d` (to output the date), `t` (to output the time), `u` (to output the user) and `m` (to output the message). So, for example, `waextr du [chatfile]`, processing the example from above, would output:
24/01/2018 faust
24/01/2016 mephisto
If one executes `waextr` using awk and sets the `usern` variable, only those lines whose user matches that name are printed. Hence, to output
24/01/2018 what meaning to these riddling words applies
one would run `awk -v usern=faust -f waextr dm`.
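The effect of such a user filter can be sketched in plain AWK. This is an illustration under stated assumptions, not the code of `waextr`: it assumes four tab-separated columns (date, time, user, message), compares the user column for exact equality, and uses the hypothetical helper names `sample` and `filter_user`.

```shell
# Hypothetical sample of converted output (tab-separated columns:
# date, time, user, message).
sample() {
    printf '24/01/2018\t21:49\tfaust\twhat meaning to these riddling words applies\n'
    printf '24/01/2016\t22:20\tmephisto\ti am the spirit ever that denies\n'
}

# Print date and message of every line whose user column equals $1.
filter_user() {
    awk -F '\t' -v usern="$1" '$3 == usern { print $1 "\t" $4 }'
}

sample | filter_user faust
```

Passing the name with `-v` keeps it out of the AWK program text, so the name needs no quoting inside the program.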
The main script, `wastat`, has multiple commands. An overview is printed if it is called without any arguments or with the argument `help`. The same list is also presented here:
- `wastat wc`: counts how often words have been used in messages
- `wastat wo`: prints all words used in messages, each on one line
- `wastat uc`: counts how often a user (i.e. number) has sent a message
- `wastat uwc`: counts how many "words" each user has used
- `wastat pt`: plots how many messages have been sent per minute
- `wastat put`: plots how many messages selected users have sent per minute
- `wastat pd`: plots how many messages have been sent each day
- `wastat pud`: plots how many messages have been sent by selected users each day
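To give an idea of what a word count over a converted chat involves, here is a minimal sketch in AWK. It does not reproduce how `wastat wc` is implemented or what its output looks like. The column layout (message text in the fourth tab-separated column), the sample data and the helper names `sample` and `word_count` are all assumptions made for this example.

```shell
# Hypothetical converted chat (tab-separated columns: date, time,
# user, message).
sample() {
    printf '01/01/2020\t10:00\talice\tthe cat and the dog\n'
    printf '01/01/2020\t10:01\tbob\tthe dog\n'
}

# Count every word in the message column and print "count<TAB>word",
# most frequent first, ties in alphabetical order.
word_count() {
    awk -F '\t' '{
        n = split($4, w, " ")
        for (i = 1; i <= n; i++) c[w[i]]++
    } END {
        for (x in c) printf "%d\t%s\n", c[x], x
    }' | sort -k1,1nr -k2,2
}

sample | word_count
```

Counting per user instead would only require using the user column as part of the array key.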
The auxiliary command `wastat clean` deletes all files and images generated by wastat.
This software has been placed into the public domain, or an approximation of it, under CC0. If there are any issues with the software, contact the author or visit the GitHub repository.
The chat extract from this document has been taken from A. S. Kline's English Translation of J. W. Goethe's Faust.