Scripts to parse and analyze Google Voice data exports.

.gitignore
LICENSE
README.md
emoticons.py
gv_to_db.py
models.py
requirements.txt
settings_example.py
twokenize.py
who_from.py
Google Voice Analysis

A collection of scripts to parse and analyze exported Google Voice data.

So far it only parses the data and runs a few Naive Bayes classifiers, but I plan to expand it.

Usage

  1. Download and extract your Google Voice data.

  2. Install the dependencies with something like pip install -r requirements.txt.

  3. Copy settings_example.py to settings.py and configure appropriately.

  4. Run gv_to_db.py to load everything into a SQLite database.

  5. Run who_from.py to run a few simple analyzers using Naive Bayes.
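
To make step 4 concrete, here is a minimal sketch of loading parsed texts into SQLite with Python's standard sqlite3 module. The table layout and the helper names load_messages and count_by_sender are assumptions for illustration only; the real schema lives in models.py and may look quite different.

```python
import sqlite3


def load_messages(conn, messages):
    """Insert (sender, recipient, body) tuples into a 'messages' table.

    Hypothetical schema: the tables actually used by gv_to_db.py are
    defined in models.py and are not shown in this README.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS messages ("
        "id INTEGER PRIMARY KEY, sender TEXT, recipient TEXT, body TEXT)"
    )
    conn.executemany(
        "INSERT INTO messages (sender, recipient, body) VALUES (?, ?, ?)",
        messages,
    )
    conn.commit()


def count_by_sender(conn):
    """Return a dict mapping each sender to the number of texts they sent."""
    rows = conn.execute("SELECT sender, COUNT(*) FROM messages GROUP BY sender")
    return dict(rows.fetchall())


# An in-memory database keeps the example self-contained.
conn = sqlite3.connect(":memory:")
load_messages(conn, [
    ("me", "alice", "on my way"),
    ("alice", "me", "ok see you soon"),
    ("me", "bob", "lunch tomorrow?"),
])
counts = count_by_sender(conn)
```

A per-sender count like this is the kind of query a filter such as "more than n texts" could be built on.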

Classifiers

  • people_with_many_texts(n) classifies the texts of people who have sent more than n texts.

  • recipient_is(name) classifies texts into either name or not_name.

  • split_me_not_me() classifies texts into either me or not_me depending on the sender.
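
As a rough illustration of the me / not_me idea, the sketch below labels texts by sender and trains a small multinomial Naive Bayes model with add-one smoothing in pure Python. It is not the code in who_from.py: the function names label_me_not_me, train_naive_bayes, and classify, the whitespace tokenizer, and the sample texts are all assumptions made for this example.

```python
import math
from collections import Counter, defaultdict


def label_me_not_me(texts, me="me"):
    """Turn (sender, body) pairs into (body, label) pairs."""
    return [(body, "me" if sender == me else "not_me") for sender, body in texts]


def train_naive_bayes(labeled):
    """Collect per-label word counts and document counts."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    vocab = set()
    for body, label in labeled:
        words = body.lower().split()  # naive whitespace tokenizer (assumption)
        word_counts[label].update(words)
        label_counts[label] += 1
        vocab.update(words)
    return word_counts, label_counts, vocab


def classify(model, body):
    """Return the label with the highest log-probability for this text."""
    word_counts, label_counts, vocab = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, float("-inf")
    for label, n_docs in label_counts.items():
        score = math.log(n_docs / total_docs)  # log prior
        total_words = sum(word_counts[label].values())
        for word in body.lower().split():
            # Add-one (Laplace) smoothing so unseen words do not zero out a label.
            count = word_counts[label][word] + 1
            score += math.log(count / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Made-up sample data, for illustration only.
texts = [
    ("me", "running late sorry"),
    ("me", "sorry be there soon"),
    ("alice", "dinner at eight"),
    ("bob", "dinner plans tonight"),
]
model = train_naive_bayes(label_me_not_me(texts))
prediction = classify(model, "sorry running late")
```

The same training step would work for a name / not_name split by changing only the labeling function.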

Interactive Mode

  1. Get your training and test sets from one of the classifiers.

  2. Run build_classifier(training_set).

  3. Run interactive on your classifier.

  4. Enter some messages.
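
The interactive steps above amount to a read, classify, print loop. The sketch below shows one way such a loop could look. The signatures of build_classifier and interactive are not documented here, so interactive_sketch and toy_classifier are hypothetical stand-ins rather than the repository's functions.

```python
def interactive_sketch(classify_fn, read_line=input, write=print):
    """Classify each entered message until a blank line or end of input.

    classify_fn is any callable that maps a message string to a label.
    Returns the list of (message, label) pairs seen during the session.
    """
    results = []
    while True:
        try:
            message = read_line("message> ")
        except EOFError:
            break
        if not message.strip():
            break
        label = classify_fn(message)
        write(f"{message!r} -> {label}")
        results.append((message, label))
    return results


def toy_classifier(message):
    """Stand-in for a trained classifier (assumption, for illustration)."""
    return "me" if "sorry" in message.lower() else "not_me"


# Scripted input replaces the keyboard so the example runs unattended.
scripted = iter(["sorry running late", "dinner at eight", ""])
session = interactive_sketch(
    toy_classifier,
    read_line=lambda prompt: next(scripted),
    write=lambda line: None,
)
```

Calling interactive_sketch(toy_classifier) with the default arguments would read from the keyboard and print each prediction.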

License

MIT/X11 licensed, except for emoticons.py and twokenize.py.

emoticons.py and twokenize.py are copyright Brendan O'Connor, Michel Krieger, and David Ahn and licensed under the Apache License 2.0.
