Predicting and Interpreting Smell Data Obtained from Smell Pittsburgh


A tool for predicting and interpreting smell data obtained from Smell Pittsburgh. The design and evaluation are documented in the paper, Smell Pittsburgh: Community-Empowered Mobile Smell Reporting System. If you find this useful, please consider citing:

Yen-Chia Hsu, Jennifer Cross, Paul Dille, Michael Tasota, Beatrice Dias, Randy Sargent, Ting-Hao (Kenneth) Huang, and Illah Nourbakhsh. 2018. Smell Pittsburgh: Community-Empowered Mobile Smell Reporting System. arXiv preprint arXiv:1810.11143.

  title={Smell Pittsburgh: Community-Empowered Mobile Smell Reporting System},
  author={Hsu, Yen-Chia and Cross, Jennifer and Dille, Paul and Tasota, Michael and Dias, Beatrice and Sargent, Randy and Huang, Ting-Hao'Kenneth' and Nourbakhsh, Illah},
  journal={arXiv preprint arXiv:1810.11143},


Install conda. These instructions assume Ubuntu. Detailed installation documentation is available on the conda website. First, visit the Miniconda download page to obtain the download path of the installer. The following script installs conda for all users:

sudo sh -b -p /opt/miniconda3

sudo vim /etc/bash.bashrc
# Add the following lines to this file
export PATH="/opt/miniconda3/bin:$PATH"
. /opt/miniconda3/etc/profile.d/

source /etc/bash.bashrc

For macOS, I recommend installing conda with Homebrew.

brew install --cask miniconda
echo 'export PATH="/usr/local/miniconda3/bin:$PATH"' >> ~/.bash_profile
echo '. /usr/local/miniconda3/etc/profile.d/' >> ~/.bash_profile
source ~/.bash_profile

Clone this repository.

git clone
sudo chown -R $USER smell-pittsburgh-prediction

Create a conda environment and install the packages. It is important to install Python 2.7 and pip inside the newly created conda environment first.

conda create -n smell-pittsburgh-prediction
conda activate smell-pittsburgh-prediction
conda install python=2.7
conda install pip
which pip # make sure this is the pip inside the smell-pittsburgh-prediction environment
sh smell-pittsburgh-prediction/
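
The `which pip` check above can also be done from inside Python. The following is a minimal sketch, not part of this repository: it relies on the `CONDA_DEFAULT_ENV` environment variable that conda sets when an environment is activated, and on the interpreter prefix ending with the environment name. The helper names `active_conda_env` and `interpreter_in_env` are made up for illustration.

```python
import os
import sys

# Environment name used in the commands above.
ENV_NAME = "smell-pittsburgh-prediction"


def active_conda_env(environ=None):
    """Return the name of the activated conda environment, or None."""
    environ = os.environ if environ is None else environ
    return environ.get("CONDA_DEFAULT_ENV")


def interpreter_in_env(prefix, env_name=ENV_NAME):
    """Check whether an interpreter prefix path ends with the environment name."""
    return os.path.basename(os.path.normpath(prefix)) == env_name


if __name__ == "__main__":
    print("active conda environment:", active_conda_env())
    print("interpreter prefix:", sys.prefix)
    print("running inside %s: %s" % (ENV_NAME, interpreter_in_env(sys.prefix)))
```

If the last line prints `False`, the `python` and `pip` on your path most likely belong to a different environment.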

Get data, preprocess data, extract features, train the classifier, perform cross validation, analyze data, and interpret the model. This will create a directory (py/prediction/data_main/) to store all downloaded and processed data.

cd smell-pittsburgh-prediction/py/prediction/
python pipeline

# For each step in the pipeline
python data # get data
python preprocess # preprocess data
python feature # extract features
python validation # perform cross validation
python analyze # analyze data and interpret model

# For deployment, train the classifier and perform prediction
# Use crontab to call the following two commands periodically
python train
python predict
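
If you prefer to run the individual steps from one place and stop as soon as a step fails, a small driver along the following lines may help. This is a sketch under assumptions, not code from this repository: the step names are the ones listed above, while `ENTRY_SCRIPT` is a hypothetical placeholder that you would replace with the script you normally pass the step name to. The functions `build_commands` and `run_steps` are illustrative names.

```python
import subprocess
import sys

# Hypothetical placeholder: replace with the script that accepts a step name.
ENTRY_SCRIPT = "ENTRY_SCRIPT.py"

# Step names taken from the commands listed above.
PIPELINE_STEPS = ["data", "preprocess", "feature", "validation", "analyze"]
DEPLOY_STEPS = ["train", "predict"]


def build_commands(steps, script=ENTRY_SCRIPT, python=sys.executable):
    """Build one command (as an argument list) per step."""
    return [[python, script, step] for step in steps]


def run_steps(commands, runner=subprocess.call):
    """Run commands in order and stop at the first non-zero exit code.

    Returns a list of (command, exit_code) pairs for the commands that ran.
    """
    results = []
    for cmd in commands:
        code = runner(cmd)
        results.append((cmd, code))
        if code != 0:
            break
    return results
```

Calling `run_steps(build_commands(PIPELINE_STEPS))` from `py/prediction/` would run the steps one after another. Passing `DEPLOY_STEPS` instead gives a single entry point for the train-then-predict sequence, which can be convenient when scheduling it with crontab.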


The web/GeoHeatmap.html page visualizes the distribution of smell reports by zip code. You can open it in a browser, such as Google Chrome.


A pre-downloaded dataset covering 10/31/2016 to 9/30/2018 is included in this repository. To get more recent data, change the end_dt (ending date-time) variable in the file and then run the following:

python data

This will download smell data (py/prediction/data_main/smell_raw.csv) and sensor data (py/prediction/data_main/esdr_raw/). The smell data is obtained from Smell Pittsburgh. The sensor data is obtained from ESDR.
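
To confirm that the download worked, you can inspect the smell data file without assuming anything about its columns. The sketch below is illustrative only and is written against the Python 3 standard library, so it may need adjusting for the Python 2.7 environment. It reports the header row and the number of data rows of a CSV file; `summarize_csv` is a made-up helper name, and the path is the one quoted above, relative to the repository root.

```python
import csv

# Path quoted above, relative to the repository root.
SMELL_CSV = "py/prediction/data_main/smell_raw.csv"


def summarize_csv(path):
    """Return (column_names, number_of_data_rows) for a CSV file."""
    with open(path, newline="", encoding="utf-8") as f:
        reader = csv.reader(f)
        header = next(reader, [])        # first row, or [] for an empty file
        n_rows = sum(1 for _ in reader)  # remaining rows are data rows
    return header, n_rows


if __name__ == "__main__":
    try:
        columns, n_rows = summarize_csv(SMELL_CSV)
        print("columns:", columns)
        print("data rows:", n_rows)
    except IOError as err:
        print("could not read %s: %s" % (SMELL_CSV, err))
```

A row count of zero, or a missing file, suggests that the data step did not complete.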