The following IPython notebooks contain the code used for the data analysis in the paper "Unveiling patterns of international communities in a global city using mobile phone data" and allow its main results to be reproduced.
The notebooks are included with their output; clicking on a link opens the corresponding notebook in the nbviewer service.
Parsing of the original dataset into an HDF5 store.
Aggregation of the dataset on wider time intervals.
Extraction of daily, monthly and yearly entropy time series.
Correlation between mobile call volume and the foreign resident population reported by the census.
Classifier of points of interest based on entropy and phone activity.
Census data for the city of Milan aggregated by NIL (Nuclei di Identità Locale), source: http://dati.comune.milano.it.
List of points of interest in Milan as defined by TripAdvisor, source: http://www.tripadvisor.com/Attractions-g187849-Activities-Milan_Lombardy.html
Remittance data used to compare clusters of countries based on persistent homology, source: http://data.worldbank.org/indicator/BX.TRF.PWKR.DT.GD.ZS.
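As a rough illustration of the aggregation and entropy steps listed above, the sketch below groups call records by day and country and computes the Shannon entropy of each day's per-country volumes. It is not the code from the notebooks: the record layout `(timestamp, country, calls)`, the function names, and the sample values are all assumptions made for this example, and the notebooks may define entropy over a different distribution.

```python
import math
from collections import defaultdict
from datetime import datetime


def shannon_entropy(volumes):
    """Shannon entropy, in bits, of a collection of non-negative volumes."""
    positive = [v for v in volumes if v > 0]
    total = sum(positive)
    if total == 0:
        return 0.0
    # sum of p * log2(1 / p), with p = v / total
    return sum((v / total) * math.log2(total / v) for v in positive)


def daily_entropy(records):
    """Aggregate (timestamp, country, calls) records by day and country,
    then return {date: entropy of that day's per-country volumes}."""
    per_day = defaultdict(lambda: defaultdict(float))
    for timestamp, country, calls in records:
        per_day[timestamp.date()][country] += calls
    return {day: shannon_entropy(by_country.values())
            for day, by_country in sorted(per_day.items())}


# Made-up records (hypothetical layout and values), for illustration only.
records = [
    (datetime(2013, 11, 1, 0, 10), "A", 5),
    (datetime(2013, 11, 1, 12, 0), "B", 5),
    (datetime(2013, 11, 2, 8, 0), "A", 3),
    (datetime(2013, 11, 2, 9, 0), "A", 7),
]

for day, entropy in daily_entropy(records).items():
    print(day, round(entropy, 3))
```

In this toy input the first day splits its volume evenly between two countries, giving an entropy of 1 bit, while the second day has a single country and therefore an entropy of 0. Monthly or yearly series would follow the same pattern with a coarser grouping key.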
Running the IPython notebooks
To execute the notebooks you will need additional data and a working Python environment containing the required dependencies.
Install the required submodules by running:
    git submodule init
    git submodule update
Python dependencies can be installed in a virtual environment by running the following commands inside the project root directory:
    virtualenv virtualenv
    . virtualenv/bin/activate
    pip install -r requirements.txt && pip install tables
To start an IPython notebook server using the newly created virtual environment:
    . virtualenv/bin/activate
    ipython notebook
Running the notebooks
Select a notebook using the IPython notebook interface and execute its cells. To reproduce the analysis pipeline correctly, the notebooks must be executed in the order given by their numbering.
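If it helps to double-check the execution order, the following minimal sketch sorts notebook file names by their leading number. It assumes that each notebook's file name starts with its progressive number; the function name and the sample file names are hypothetical and do not refer to the actual files in this repository.

```python
import re


def order_by_number(names):
    """Return the names that start with a number, sorted by that number."""
    numbered = []
    for name in names:
        match = re.match(r"(\d+)", name)
        if match:
            numbered.append((int(match.group(1)), name))
    return [name for _, name in sorted(numbered)]


# Hypothetical file names, for illustration only.
example = ["10_classifier.ipynb", "2_aggregation.ipynb", "1_parsing.ipynb", "notes.ipynb"]
print(order_by_number(example))
```

Sorting on the parsed integer rather than on the raw string keeps a notebook numbered 10 after one numbered 2. To apply it to a real directory, pass in the names returned by `pathlib.Path(".").glob("*.ipynb")`.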
You can report any bugs or issues using the GitHub issue tracker.