This repository has been archived by the owner on Nov 22, 2020. It is now read-only.

README!
rouge8 committed Apr 26, 2012
1 parent 5d79657 commit ecc57d8
Showing 1 changed file with 45 additions and 0 deletions.
45 changes: 45 additions & 0 deletions README.md
@@ -0,0 +1,45 @@
Google Voice Analysis
=====================

A collection of scripts to parse and analyze exported Google Voice data.

So far it only parses the data and runs a few [Naive Bayes classifiers](http://en.wikipedia.org/wiki/Naive_Bayes_classifier), but
I plan to expand it.

Usage
-----

0. [Download](https://www.google.com/takeout/?pli=1#custom:voice) and extract your Google Voice data.

1. Copy `settings_example.py` to `settings.py` and configure appropriately.

2. Run `gv_to_db.py` to load everything into a SQLite database.

3. Run `who_from.py` to apply a few simple Naive Bayes analyzers.
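
As a rough illustration of step 2, the sketch below loads a few messages into SQLite and counts texts per sender. The table name, column names, and helper functions (`texts`, `load_texts`, `texts_per_sender`) are assumptions made for this example; the schema that `gv_to_db.py` actually creates may differ.

```python
import sqlite3

# Hypothetical schema: the real tables created by gv_to_db.py may differ.
SCHEMA = """
CREATE TABLE IF NOT EXISTS texts (
    id INTEGER PRIMARY KEY,
    sender TEXT NOT NULL,
    recipient TEXT NOT NULL,
    sent_at TEXT NOT NULL,
    body TEXT NOT NULL
)
"""


def load_texts(conn, rows):
    """Create the assumed table and insert (sender, recipient, sent_at, body) rows."""
    conn.execute(SCHEMA)
    conn.executemany(
        "INSERT INTO texts (sender, recipient, sent_at, body) VALUES (?, ?, ?, ?)",
        rows,
    )
    conn.commit()


def texts_per_sender(conn):
    """Return a dict mapping each sender to the number of texts they sent."""
    cursor = conn.execute("SELECT sender, COUNT(*) FROM texts GROUP BY sender")
    return dict(cursor.fetchall())


# In-memory database with made-up messages, for illustration only.
conn = sqlite3.connect(":memory:")
load_texts(conn, [
    ("Me", "Alice", "2012-04-01T10:00:00", "lunch today?"),
    ("Alice", "Me", "2012-04-01T10:02:00", "sure, noon works"),
    ("Me", "Bob", "2012-04-02T09:30:00", "running late"),
])
counts = texts_per_sender(conn)
```

A per-sender count like this is the kind of query that could back a filter such as "people who have sent more than `n` texts".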

Classifiers
-----------

- `people_with_many_texts(n)` classifies the texts of people who have sent more than `n` texts.

- `recipient_is(name)` classifies texts into either `name` or `not_name`.

- `split_me_not_me()` classifies texts into either `me` or `not_me` depending on the sender.
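
To make the `me` / `not_me` idea concrete, here is a minimal, self-contained Naive Bayes sketch with add-one smoothing. It is not the code in `who_from.py`: the function names (`split_by_sender`, `train`, `classify`), the whitespace tokenizer, and the sample messages are all assumptions for illustration.

```python
import math
from collections import Counter, defaultdict


def split_by_sender(texts, my_name):
    """Label (sender, body) pairs as 'me' or 'not_me' based on the sender."""
    return [(body, "me" if sender == my_name else "not_me") for sender, body in texts]


def train(labeled):
    """Count messages per label and word occurrences per label."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for body, label in labeled:
        label_counts[label] += 1
        for word in body.lower().split():
            word_counts[label][word] += 1
            vocabulary.add(word)
    return label_counts, word_counts, vocabulary


def classify(model, body):
    """Return the label with the highest log-probability for the message."""
    label_counts, word_counts, vocabulary = model
    total_messages = sum(label_counts.values())
    scores = {}
    for label, message_count in label_counts.items():
        # Log prior: fraction of training messages carrying this label.
        score = math.log(message_count / total_messages)
        total_words = sum(word_counts[label].values())
        for word in body.lower().split():
            # Add-one (Laplace) smoothing so unseen words never give log(0).
            numerator = word_counts[label][word] + 1
            score += math.log(numerator / (total_words + len(vocabulary)))
        scores[label] = score
    return max(scores, key=scores.get)


# Made-up sample messages as (sender, body) pairs.
texts = [
    ("Me", "running late sorry"),
    ("Me", "sorry on my way"),
    ("Alice", "where are you"),
    ("Bob", "are you coming tonight"),
]
model = train(split_by_sender(texts, my_name="Me"))
prediction = classify(model, "sorry running late")
```

Working in log space avoids numeric underflow when many small word probabilities are multiplied together.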

Interactive Mode
----------------

1. Get your training and test sets from one of the classifiers.

2. Run `build_classifier(training_set)`.

3. Run `interactive` on your classifier.

4. Enter some messages.
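
The steps above amount to a read, classify, print loop. The sketch below shows one way such a loop could look. `interactive_loop` and `toy_classifier` are hypothetical names for this example, and the real `interactive` and `build_classifier` functions may take different arguments.

```python
def interactive_loop(classify, read_line=input, write=print):
    """Classify each entered message until a blank line or end of input."""
    results = []
    while True:
        try:
            message = read_line("message> ")
        except EOFError:
            break
        if not message.strip():
            break
        label = classify(message)
        write(f"{message!r} -> {label}")
        results.append((message, label))
    return results


def toy_classifier(message):
    """Stand-in for a trained classifier: a trivial keyword rule."""
    return "me" if "sorry" in message.lower() else "not_me"


# Scripted input instead of a live prompt, so the example runs unattended.
scripted = iter(["sorry, running late", "where are you?", ""])
session = interactive_loop(toy_classifier, read_line=lambda prompt: next(scripted))
```

Leaving `read_line` at its default of `input` turns this into a live prompt that ends on an empty line.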

License
-------

MIT/X11 licensed, except for `emoticons.py` and `twokenize.py`.

`emoticons.py` and `twokenize.py` are copyright Brendan O'Connor, Michel Krieger, and David Ahn and licensed under the Apache License 2.0.
