Location Prediction using Language Variation
Baseline_Models.ipynb
CNN.ipynb
GRU.ipynb
LICENSE
LSTM.ipynb
MLP.ipynb
README.md
Report.pdf

TwiLoc - Location Prediction using Language Variation

TwiLoc investigates the feasibility of geographically locating Twitter users based solely on tweet content. It uses deep learning techniques to learn dialect differences across regions from the text of a user's tweets, and it relies on no other external information about the user. The approach is intended to augment existing systems that locate users.

Prerequisites

Requires Python 3.x.

Here is the list of libraries required for this project:

GloVe is used for obtaining vector representations for words.
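
As a rough illustration of the GloVe step, the sketch below parses vectors in the GloVe plain-text format (one word per line, followed by its space-separated components) into a dictionary. The function name `load_glove` and the tiny in-memory sample are hypothetical and used only for illustration; the notebooks may load and use the embeddings differently.

```python
import io

import numpy as np


def load_glove(lines):
    """Parse GloVe text-format lines into a {word: vector} dictionary.

    `lines` is any iterable of strings, for example an open file handle.
    (Hypothetical helper, not taken from the project notebooks.)
    """
    embeddings = {}
    for line in lines:
        parts = line.rstrip().split(" ")
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        word, values = parts[0], parts[1:]
        embeddings[word] = np.asarray(values, dtype="float32")
    return embeddings


# Tiny made-up sample in the same layout as a GloVe .txt file.
sample = io.StringIO("hello 0.1 0.2 0.3\nyall 0.4 0.5 0.6\n")
glove = load_glove(sample)
vector = glove.get("yall")  # None if the word is out of vocabulary
```

With a real GloVe file, the same function can be called on an open file handle, for example `with open(path, encoding="utf-8") as f: glove = load_glove(f)`, where `path` points to a downloaded vectors file.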

Dataset

  • GeoText - Geo-tagged Microblog Corpus is the primary dataset for TwiLoc. All results and hyperparameter tuning are based on this dataset.

  • Accuracy may be improved further by training on larger datasets such as UTGeo2011.

Reverse geocoding can be done using services provided by MapQuest.
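
A minimal sketch of how a reverse geocoding request might be prepared is shown below. The endpoint URL, the `key` and `location` query parameters, and the response layout (`results[0].locations[0].adminArea3` for the state) are assumptions about the MapQuest Geocoding API and should be checked against the current MapQuest documentation. The helper names and the sample response are hypothetical.

```python
from urllib.parse import urlencode

# Assumed endpoint for MapQuest reverse geocoding; verify before use.
MAPQUEST_REVERSE_URL = "https://www.mapquestapi.com/geocoding/v1/reverse"


def build_reverse_geocode_url(lat, lng, api_key):
    """Build the request URL for one latitude/longitude pair."""
    query = urlencode({"key": api_key, "location": f"{lat},{lng}"})
    return f"{MAPQUEST_REVERSE_URL}?{query}"


def extract_state(response):
    """Pull the state from a decoded JSON response, or return None.

    Assumes the state is stored under results[0].locations[0].adminArea3.
    """
    try:
        return response["results"][0]["locations"][0].get("adminArea3")
    except (KeyError, IndexError):
        return None


url = build_reverse_geocode_url(40.7128, -74.0060, "YOUR_KEY")

# Hand-written sample that mimics the assumed response layout.
sample_response = {"results": [{"locations": [{"adminArea3": "NY"}]}]}
state = extract_state(sample_response)
```

The URL can then be fetched with any HTTP client, and the decoded JSON passed to `extract_state` to obtain a region label for each geo-tagged tweet.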

Pre-trained models

Model Accuracy (%)
CNN 57.43
GRU 56.35
LSTM 55.54
MLP 50.59

Note: See the report (Report.pdf) for more detailed information on the experimental results.

References

Authors
