pigeo: A Python Geotagging Tool

New Features

  1. An accurate US model based on the NA dataset has been added to pigeo; it can be loaded with the --model ./models/lrna option. Run ./ again or manually download the models tar file and extract it into the models directory. The model gives very accurate geolocation within the US and can be used, for example, in US sentiment analysis toward presidential candidates.

If you need a feature that doesn't exist yet, you can send me a request on GitHub or email me.


pigeo is a document and Twitter user geolocation tool. Given a piece of text or a Twitter user, it predicts the corresponding location using pre-trained models.

The design principles are as follows:

  1. Lightweight and fast
  2. Comes with text-based classification and network-based regression pre-trained models.
  3. It is possible to train new text-based classification models.
  4. It can be used in shell mode, web mode powered by Python Flask and as a library.
  5. It supports informal text.
  6. Its performance is evaluated over a standard Twitter geolocation dataset.

I try to keep the web-based app online here at . If it is not online and you need it for testing the system, you can contact me or easily bring up the server on your own machine using the instructions below.

Quick Start

pigeo's installation is straightforward:

  1. Download the zip file from GitHub or run: git clone

  2. cd pigeo then chmod +x and then run ./

    This downloads the pre-trained models and extracts them into the models directory. Alternatively (e.g. if you are using Windows), you can download the models directory from and extract it with an archive program.

  3. Requirements:

    3.0 Run sudo pip install -r requirements.txt and go to step 4, or install the libraries in requirements.txt one by one.

    Note: if you don't have root permission and cannot run pip with sudo, you can use pip with the --user argument.

  4. Set the Twitter keys and tokens in . If you don't have your own Twitter credentials (keys and tokens), you can create them from . pigeo needs the Twitter credentials in order to geolocate Twitter users (e.g. @potus); without them it cannot download the user's tweets and therefore cannot geolocate the user. Text input, though, works without the Twitter credentials.

  5. pigeo is ready to use. Go to the usage section.

The library versions already installed may conflict with the versions pigeo uses, so if the steps above do not work, run the following commands in order:

conda create -n pigeo python=2.7

conda activate pigeo

git clone

cd pigeo

chmod +x


pip install -r requirements.txt --ignore-installed

python --mode shell

Directory Structure

The directory structure after installation and downloading the models should be:

├── models
│   ├── lpworld
│   │   └── userhash_coordinate.pkl.gz
│   └── lrworld
│       ├── clf.pkl.gz
│       ├── coordinate_address.pkl.gz
│       ├── label_coordinate.pkl.gz
│       └── vectorizer.pkl.gz
├── static
│   └── styles
│       ├── bootstrap-3.3.6-dist
│       │   ├── css
│       │   │   ├── bootstrap.css
│       │   │   ├──
│       │   │   ├── bootstrap.min.css
│       │   │   ├──
│       │   │   ├── bootstrap-theme.css
│       │   │   ├──
│       │   │   ├── bootstrap-theme.min.css
│       │   │   └──
│       │   ├── fonts
│       │   │   ├── glyphicons-halflings-regular.eot
│       │   │   ├── glyphicons-halflings-regular.svg
│       │   │   ├── glyphicons-halflings-regular.ttf
│       │   │   ├── glyphicons-halflings-regular.woff
│       │   │   └── glyphicons-halflings-regular.woff2
│       │   └── js
│       │       ├── bootstrap.js
│       │       ├── bootstrap.min.js
│       │       └── npm.js
│       └── main.css
├── templates
│   ├── index.html
│   └── index-simple.html


Usage

usage: [-h] [--model MODEL] [--dump_dir DUMP_DIR] [--host HOST]
                [--port PORT] [--mode MODE]

optional arguments:
  -h, --help            show this help message and exit
  --model MODEL, -d MODEL
                        text-based classification model directory to be used.
  --dump_dir DUMP_DIR, -o DUMP_DIR
                        directory to which a newly trained model is saved.
  --host HOST           host name/IP address where Flask web server in web
                        mode will be running on. Set to to make it
                        externally available. default (
  --port PORT, -p PORT  port number where Flask web server will bind to in web
                        mode. default (5000).
  --mode MODE, -m MODE  mode (web, shell) in which pigeo will be used. default
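
To make the options above concrete, here is a minimal argparse sketch that mirrors the listed flags. It is an illustration, not pigeo's actual source: the function name build_parser is hypothetical, the host default is left unset because it is not given in the text above, and shell is assumed to be the default mode because running without --mode opens the shell.

```python
import argparse


def build_parser():
    """Build a parser mirroring the options listed in the usage text (sketch only)."""
    parser = argparse.ArgumentParser(description="pigeo-style command line (illustrative)")
    parser.add_argument("--model", "-d", default=None,
                        help="text-based classification model directory to be used")
    parser.add_argument("--dump_dir", "-o", default=None,
                        help="directory to which a newly trained model is saved")
    # The default host is not given in the usage text, so it is left unset here.
    parser.add_argument("--host", default=None,
                        help="host name/IP address for the Flask web server in web mode")
    parser.add_argument("--port", "-p", type=int, default=5000,
                        help="port the Flask web server binds to in web mode")
    # Assumption: shell is the default, since running without --mode opens the shell.
    parser.add_argument("--mode", "-m", choices=["web", "shell"], default="shell",
                        help="mode in which the tool is used")
    return parser


args = build_parser().parse_args(["--mode", "web", "--port", "8080"])
print(args.mode, args.port)
```

The short and long forms are interchangeable, so -m web -p 8080 parses to the same values.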

Shell Mode

This mode is well suited for initial testing of pigeo. Simply type python or python --mode shell to open the shell mode. You can type a string or a single Twitter user to be geolocated.

text to geolocate: yall and the result is:

{'city': u'Atlanta', 'state': u'Georgia', 'lat': 33.749000000000002, 'country': u'United States of America', 'lon': -84.387979999999999, 'label_distribution': {180: 0.063345477875493147}, 'top50': u'atlanta, newnan, atl, _madeinchyna, auc, spelman, emory, nisha_pooh_, mcdonough, lenox, redan, glambarsalon, morehouse, stockbridge, llh, riverdale, scoutmob, llf, marta, ladycaliibaybee, buckhead, georgia, ga, culc, decatur, cl_atlanta, coweta, peachtree, piedmont, colemankjohnson, cau, obsessions, followmeh, \uc9c0\uc6b0\uac1c, frfr, \uba38\ub9ac\uc18d\uc5d0, stonecrest, creekside, welcometoatlanta, lithonia, octane, duress, midtown, jortstorture, falcons, wpatl, a3c, criminalrecords, fairburn, frankski'}

The result is a JSON string that contains the city, state, country and coordinates of the predicted location. It also contains the predicted class and its confidence under label_distribution (here class 180 with a confidence of about 0.063). Note that the LR-WORLD model has 930 classes/regions. The top 50 most important features of the predicted class are also returned under top50.
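
As a rough illustration of how such a result might be consumed, the sketch below pulls out the place name, the coordinates, and the most probable class. The dictionary is an abbreviated copy of the output shown above, the helper name summarize is hypothetical, and it assumes that label_distribution maps class ids to probabilities, as the sample suggests.

```python
# Abbreviated copy of the shell-mode result shown above (top50 shortened).
result = {
    "city": "Atlanta",
    "state": "Georgia",
    "country": "United States of America",
    "lat": 33.749,
    "lon": -84.38798,
    "label_distribution": {180: 0.063345477875493147},
    "top50": "atlanta, newnan, atl",
}


def summarize(result):
    """Return (place, (lat, lon), predicted_class, confidence) from a result dict."""
    place = ", ".join([result["city"], result["state"], result["country"]])
    # Assumption: label_distribution maps class ids to probabilities,
    # so the predicted class is the entry with the highest value.
    label, confidence = max(result["label_distribution"].items(), key=lambda kv: kv[1])
    return place, (result["lat"], result["lon"]), label, confidence


place, coords, label, confidence = summarize(result)
print("%s at %s (class %d, p=%.3f)" % (place, coords, label, confidence))
```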

Web Mode

The web mode is powered by Flask, a lightweight Python web framework. To start the web service, simply run:

python --mode web --host --port 5000

If you want the web service to be reachable at the machine's valid IP address, run:

python --mode web --host --port 5000

Use or http://valid-ip-address:5000 in the browser to use the service. The service is able to geolocate a piece of text or a single Twitter user (e.g. @potus).

Library Mode

pigeo is well suited to use from other Python programs. In library mode it can geolocate a single piece of text or a list of text documents. A simple use case:

import pigeo
# loads the world model (default)
# geolocate a sentence
pigeo.geo("gamble casino city")
# geolocate a Twitter user
# geolocate a list of texts
pigeo.geo(['city centre', 'city center'])

Note that calling pigeo.geo repeatedly is inefficient; to geolocate multiple documents, pass them as a list in a single pigeo.geo call. It is also possible to return the label distribution over all classes, rather than only the predicted class, by calling pigeo.geo('a text', True).
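
The batching advice can be sketched as a small helper that groups texts into lists before handing them to the geolocation function. Everything here is illustrative: geolocate_all and fake_geo are hypothetical names, fake_geo stands in for pigeo.geo so the sketch runs without the models, and it is an assumption that a list input yields one result per text.

```python
def geolocate_all(texts, geo_fn, batch_size=1000):
    """Geolocate texts in list-sized batches instead of one call per text.

    geo_fn is any callable that accepts a list of texts and returns one
    result per text (the role pigeo.geo is assumed to play for list input).
    """
    results = []
    for start in range(0, len(texts), batch_size):
        batch = texts[start:start + batch_size]
        results.extend(geo_fn(batch))
    return results


# Stand-in for pigeo.geo (hypothetical); records how many texts each call received.
calls = []


def fake_geo(batch):
    calls.append(len(batch))
    return [{"text": text} for text in batch]


out = geolocate_all(["city centre", "city center", "yall"], fake_geo, batch_size=2)
print(len(out), calls)
```

With a real model the batch size would be chosen to fit memory; three texts with a batch size of two lead to two calls rather than three.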

Training Mode

To train a new geolocation model, one needs a list of texts and a list of corresponding coordinates. pigeo can then train a new model and save it as follows:

import pigeo
# train the model and save it in 'toy_model'
pigeo.train_model(['text1', 'text2'], 
[(lat1, lon1), (lat2, lon2)], num_classes=2, 
# the new model can be loaded and be used
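
Because training expects one coordinate pair per text, it may help to check the two lists before calling the training function. The helper below is a sketch of such a check and is not part of pigeo; the name check_training_data and the sample coordinates are made up for illustration.

```python
def check_training_data(texts, coordinates):
    """Raise ValueError unless texts and (lat, lon) pairs line up and are in range."""
    if len(texts) != len(coordinates):
        raise ValueError("got %d texts but %d coordinates" % (len(texts), len(coordinates)))
    for i, (lat, lon) in enumerate(coordinates):
        if not -90.0 <= lat <= 90.0:
            raise ValueError("latitude out of range at index %d: %r" % (i, lat))
        if not -180.0 <= lon <= 180.0:
            raise ValueError("longitude out of range at index %d: %r" % (i, lon))
    return True


# Toy data in the same shape as the training call above (values are illustrative).
texts = ["text1", "text2"]
coordinates = [(33.749, -84.388), (52.52, 13.405)]
check_training_data(texts, coordinates)
```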

LP Network-based Regression Mode

pigeo has a trained network-based regression model that can geolocate only Twitter users; it requires both an Internet connection and the tweepy library.

import pigeo
# load lpworld
# geolocate a Twitter user (Internet needed).


Python version 2.7

scikit-learn version 0.17

pickle Revision: 72223

Numpy version 1.10.4

Note that the models might not be easily loadable by other pickle/scikit-learn versions.
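
For readers who want to inspect a model file directly, the sketch below shows one way to read a gzip-compressed pickle using only the standard library. It assumes the .pkl.gz files are gzip-compressed pickles, as their extension suggests; load_gzipped_pickle is a hypothetical helper, and the round trip uses toy data rather than a real model. Only unpickle files from a source you trust, and expect failures when the scikit-learn or Python version differs from the ones listed above.

```python
import gzip
import os
import pickle
import tempfile


def load_gzipped_pickle(path):
    """Load one object from a gzip-compressed pickle file such as a *.pkl.gz file."""
    with gzip.open(path, "rb") as handle:
        return pickle.load(handle)


# Round trip with toy data so the sketch runs without the real models.
toy = {180: (33.749, -84.388)}
path = os.path.join(tempfile.mkdtemp(), "label_coordinate.pkl.gz")
with gzip.open(path, "wb") as handle:
    # Protocol 2 is the highest protocol that Python 2.7 can read.
    pickle.dump(toy, handle, protocol=2)

loaded = load_gzipped_pickle(path)
print(loaded)
```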


  author    = {Rahimi, Afshin  and  Cohn, Trevor  and  Baldwin, Timothy},
  title     = {pigeo: A Python Geotagging Tool},
  booktitle = {Proceedings of ACL-2016 System Demonstrations},
  month     = {August},
  year      = {2016},
  address   = {Berlin, Germany},
  publisher = {Association for Computational Linguistics},
  pages     = {127--132},
  url       = {}


Afshin Rahimi


An Easy to Use, Accurate Python Geolocation Library





