Latest commit fa77737 (Dec 19, 2016) by @boogheta: move code around
Gazouilloire

A Twitter stream + search API grabber handling various configuration options, such as collecting only during specific time periods or limiting the collection to certain locations.

HowTo

  • Install dependencies:
    sudo apt-get install mongodb-10gen
    pip install -r requirements.txt
  • Copy config.json.example to config.json

  • Set your Twitter API key and generate the related Access Token

"twitter": {
   "key": "<Consumer Key (API Key)>",
   "secret": "<Consumer Secret (API Secret)>",
   "oauth_token": "<Access Token>",
   "oauth_secret": "<Access Token Secret>"
}
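As a quick sanity check before launching the collection, a small stand-alone snippet like the following (hypothetical helper, not part of the repository; field names follow `config.json.example`) can verify that all four credential fields have been filled in:

```python
import json

# Credential fields expected in the "twitter" section of config.json
REQUIRED_FIELDS = ("key", "secret", "oauth_token", "oauth_secret")

def missing_credentials(conf):
    """Return the credential fields still missing or left as "<...>" placeholders."""
    twitter = conf.get("twitter", {})
    return [field for field in REQUIRED_FIELDS
            if not twitter.get(field) or twitter[field].startswith("<")]

# Example: a config where the Access Token has not been generated yet
conf = json.loads("""{
  "twitter": {
    "key": "xxxxxxxxxxxxxxxxxxxxx",
    "secret": "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx",
    "oauth_token": "<Access Token>",
    "oauth_secret": "<Access Token Secret>"
  }
}""")
print(missing_credentials(conf))  # → ['oauth_token', 'oauth_secret']
```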
  • Write down the list of desired keywords as a JSON array:

      "keywords": [
          "amour"
      ],

    Note that there are two possibilities to filter further:

    • geolocalisation mode: just add a `"geolocalisation": "Paris, France"` field to the config with the desired location name, or give the coordinates of the desired bounding box as shown in the config example file
    • time-limited keywords mode, to filter on specific keywords only during planned time periods:
    "time_limited_keywords": {
          "#m6": [
              ["2014-05-01 16:00", "2014-05-01 16:05"],
              ["2014-05-08 16:00", "2014-05-08 16:05"],
              ["2014-05-15 16:00", "2014-05-15 16:05"],
              ["2014-05-22 16:00", "2014-05-22 16:05"]
          ],
          "bieber": [
              ["2014-05-08 16:00", "2014-05-08 16:05"]
          ]
      },
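    The logic behind these time windows can be sketched as follows (a stand-alone illustration under assumed names, not the repository's actual implementation): a keyword is only collected when the current time falls inside one of its planned periods.

    ```python
    from datetime import datetime

    # Same layout as the "time_limited_keywords" field in config.json
    time_limited_keywords = {
        "#m6": [
            ["2014-05-01 16:00", "2014-05-01 16:05"],
            ["2014-05-08 16:00", "2014-05-08 16:05"],
        ],
        "bieber": [
            ["2014-05-08 16:00", "2014-05-08 16:05"],
        ],
    }

    def parse(ts):
        """Parse the "YYYY-MM-DD HH:MM" timestamps used in the config."""
        return datetime.strptime(ts, "%Y-%m-%d %H:%M")

    def active_keywords(config, now):
        """Return the keywords whose planned periods include the given time."""
        return [kw for kw, periods in config.items()
                if any(parse(start) <= now <= parse(end) for start, end in periods)]

    print(active_keywords(time_limited_keywords, parse("2014-05-08 16:02")))
    # → ['#m6', 'bieber']
    print(active_keywords(time_limited_keywords, parse("2014-05-01 16:02")))
    # → ['#m6']
    ```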
  • Run with:

    ./gazouilloire/run.py
  • Data is stored in your MongoDB database; you can also export it easily with simple scripts such as those in the bin directory:
# To export a csv with the most useful fields:
bin/export_csv.py
# To export the whole text content of the tweets:
bin/export_all_text.py
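These export scripts essentially flatten the tweet documents stored in MongoDB into flat files. A minimal sketch of the CSV export idea, shown here on in-memory dicts instead of a MongoDB query (the field list is hypothetical; the actual bin/export_csv.py may select different columns):

```python
import csv
import io

# Hypothetical subset of "most useful" fields; the real script may differ
FIELDS = ["_id", "created_at", "user_screen_name", "text"]

def export_csv(tweets, out):
    """Write the selected fields of each tweet dict as one CSV row,
    silently ignoring any extra fields present in the documents."""
    writer = csv.DictWriter(out, fieldnames=FIELDS, extrasaction="ignore")
    writer.writeheader()
    for tweet in tweets:
        writer.writerow(tweet)

# Two sample tweet documents standing in for a MongoDB cursor
tweets = [
    {"_id": "1", "created_at": "2014-05-08 16:01",
     "user_screen_name": "alice", "text": "amour", "retweet_count": 3},
    {"_id": "2", "created_at": "2014-05-08 16:02",
     "user_screen_name": "bob", "text": "#m6", "retweet_count": 0},
]
out = io.StringIO()
export_csv(tweets, out)
print(out.getvalue())
```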