Deep Neural Network model to predict security perception from Google Street View images. Model based on AlexNet CNNs
Switch branches/tags
Nothing to show
Clone or download
Fetching latest commit…
Cannot retrieve the latest commit at this time.
Permalink
Type Name Latest commit message Commit time
Failed to load latest commit information.
caffe
data
figures
generated_files
.gitignore
ACMMM.ipynb
LICENSE
README.md
security_perception_prediction.png
streetview_image.jpg

README.md

Are Safer Looking Neighborhoods More Lively? A Multimodal Investigation into Urban Life

Security perception

This repository shows the code to apply the concept of "Are Safer Looking Neighborhoods More Lively? A Multimodal Investigation into Urban Life". Through the code it is possible to reproduce some results, and to see how security perception can be predicted from Google Street View images, automatically.

See the paper and the slides for more details.

Overview

  • data/ contains the list of the coordinates of the Google Street View images we used to train our network.
  • figures/ is the output directory of the images generated by the scripts.
  • generated_files/ contains the trained models of pytorch.
  • caffe/ contains the original caffe model we used for the paper.

Please consider citing our paper if you use our model or code (see below for citation). We live thanks to this small action you can take!

Installation

We assume that you're using Python 3.6.

Then we assume these Python package dependencies:

Data

Street View images

Unfortunately, we can not share the original Street View images we used to train and test the model. However, we extracted all the images and we predicted the score for all the images. The latitude, longitude and the prediction for each image can be found in data/list_files.csv

Model weights and definition

We shared the model of Caffe and PyTorch. The former can be found in caffe/, while the latter can be loaded from the state dictionary (pytorch_state.npy) or the full model (pytorch_model.pt).

Download them here:

The pre-processing of the images involves the subtraction of the mean. Thus, we included the means in format (3xWxH) in generated_files/places205CNN_mean_filtered.npy.

Mobile phone data

Sadly, we can't share the mobile phone dataset we used. However, there are similar dataset released in Open Data license.

Disclaimer

This code has been published after the peer-reviewed publication (1 year after it), to publish the code for new developers and researchers. Thus, we chose to share a PyTorch model. This allows to have an updated version of the code and weights, which can be used by today's researchers.

The original Caffe model was converted through MMdnn with these steps:

python -m mmdnn.conversion._script.convertToIR -f caffe -d kit_imagenet -n places205CNN_finetune.prototxt -w places205CNN_finetune_snap_iter_10000.caffemodel_save

python -m mmdnn.conversion._script.IRToCode --dstFramework pytorch --IRModelPath pytorch_places2_IR.pb --IRWeightPath pytorch_places2_IR.npy --dstModelPath pytorch_places.py -dw pytorch_places2_IR.npy

python -m mmdnn.conversion.examples.pytorch.imagenet_test -n pytorch_places.py -w pytorch_state.npy --dump pytorch_model.pt

I will soon share the code to make the "attention" images (converting the original Matlab code), and the code to produce the plots of the paper. Although I improved the original code A LOT with new software and scripts that have been released in this year, it has not been optimized for efficiency, but should be fast enough for most purposes. We do not give any guarantees that there are no bugs - use the code on your own responsibility!

License

This code is licensed under the MIT license.

Citation

@inproceedings{DeNadai:2016:SLN:2964284.2964312,
 author = {De Nadai, Marco and Vieriu, Radu Laurentiu and Zen, Gloria and Dragicevic, Stefan and Naik, Nikhil and Caraviello, Michele and Hidalgo, Cesar Augusto and Sebe, Nicu and Lepri, Bruno},
 title = {Are Safer Looking Neighborhoods More Lively?: A Multimodal Investigation into Urban Life},
 booktitle = {Proceedings of the 2016 ACM on Multimedia Conference},
 series = {MM '16},
 year = {2016},
 isbn = {978-1-4503-3603-1},
 location = {Amsterdam, The Netherlands},
 pages = {1127--1135},
 numpages = {9},
 url = {http://doi.acm.org/10.1145/2964284.2964312},
 doi = {10.1145/2964284.2964312},
 acmid = {2964312},
 publisher = {ACM},
 address = {New York, NY, USA},
 keywords = {computer vision, mobile phone data, social studies, urban perception, urban planning},
}