algrince/milan_communications

Communications in Milan

This is an educational project completed for a university Statistics course. The report is written in Spanish and the code is in R. The repository contains the R Markdown source file (.rmd), which generates an .html file with the output.

Main objectives

The main objective of the project was to produce a descriptive statistical report. The source file contained mobile communications data from Milan for one day in 2013 (4 842 624 interactions, including incoming and outgoing SMSs, calls and web traffic). The analysis includes these steps:

  1. Build a new data sheet by importing the raw traffic data per cell. (The map of Milan, like that of any other region, is divided into cells to register traffic.) The data sheet is to include:
    • The total traffic for every cell
    • The average traffic per registered interaction
  2. Add frequencies for the variable Country code for incoming and outgoing SMSs, calls and web traffic. Interpret the results by finding the countries that generate the maximum and minimum traffic for every type of interaction.
  3. Add frequencies for the variable Square id to visualise how interactions are generated across Milan.
  4. Classify the data as SMS, call or Internet. Add frequencies for every type of interaction.
  5. In the same data sheet, carry out a descriptive analysis of the frequencies of the following variables:
    • Total traffic of incoming SMSs and its average per interaction.
    • Total traffic of outgoing SMSs and its average per interaction.
    • Total traffic of incoming calls and its average per interaction.
    • Total traffic of outgoing calls and its average per interaction.
    • Total Internet traffic and its average per interaction.
  6. For all the variables from the previous step, add a descriptive analysis that includes measures of position, dispersion and shape, and identifies atypical cells (outliers).
  7. Apply a log transformation to the same variables and repeat the analysis on the transformed ones. Compare the results on the original and the log scale.
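
Steps 1, 6 and 7 can be illustrated with a small sketch. The project's own code is in R and is not reproduced here; the Python below is only an illustration under stated assumptions. The record layout, the toy values, the function names `aggregate_by_cell` and `describe`, and the 1.5 × IQR rule for atypical cells are all assumptions, not taken from the report.

```python
import math
import statistics
from collections import defaultdict

# Hypothetical toy records (square_id, traffic value), invented for illustration.
records = [
    (1, 0.5), (1, 1.5), (1, 1.0),
    (2, 4.0), (2, 2.0),
    (3, 8.0),
    (4, 16.0),
    (5, 64.0),
]

def aggregate_by_cell(rows):
    """Step 1: return {square_id: (total traffic, mean traffic per interaction)}."""
    groups = defaultdict(list)
    for square_id, value in rows:
        groups[square_id].append(value)
    return {cell: (sum(v), sum(v) / len(v)) for cell, v in groups.items()}

def describe(values):
    """Step 6: position, dispersion, and outliers by the 1.5 * IQR rule (an assumed rule)."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return {
        "mean": statistics.mean(values),
        "median": median,
        "stdev": statistics.stdev(values),
        "iqr": iqr,
        "outliers": [v for v in values if v < low or v > high],
    }

per_cell = aggregate_by_cell(records)
totals = [total for total, _mean in per_cell.values()]

# Step 7: log transformation (assumes strictly positive totals).
log_totals = [math.log(t) for t in totals]

print("original scale:", describe(totals))
print("log scale:     ", describe(log_totals))
```

In this toy data the cell with total 64 is flagged as atypical on the original scale but not on the log scale, which is the kind of difference the comparison in step 7 is meant to expose.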

Optional objectives

The second part of the project focuses on graphical representation of the data. It includes downloading a map of Europe (Eurostat) and the traffic grid of Milan. Additionally, country codes were assigned to countries.

  1. Associate country codes with countries (additional data).
  2. Visualise the variables from the main objectives on a map of Europe. For every variable add two maps: absolute frequency, and frequency relative to the population of every country (additional data, source: "World Bank Group" for 2013). Select the countries with maximum and minimum traffic.
  3. Visualise the raw data on the grid map of Milan. Select the regions of the city with the most interactions.
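
The difference between absolute and population-relative traffic in steps 1 and 2 can be sketched as follows. This is Python rather than the project's R, and everything in it is an assumption made for illustration: that Country code is an international calling code, the mapping `CODE_TO_COUNTRY`, the rounded placeholder populations (not World Bank figures), the toy traffic values, and the function names.

```python
from collections import defaultdict

# Assumption: "Country code" is an international calling code.
CODE_TO_COUNTRY = {39: "Italy", 49: "Germany", 41: "Switzerland"}

# Rounded placeholder populations for illustration only, not World Bank data.
POPULATION = {"Italy": 60_000_000, "Germany": 80_000_000, "Switzerland": 8_000_000}

# Hypothetical toy records: (country_code, traffic value).
rows = [(39, 3000.0), (39, 2000.0), (49, 800.0), (41, 400.0)]

def traffic_by_country(rows, code_to_country):
    """Sum traffic per country after mapping each code to a country name."""
    totals = defaultdict(float)
    for code, value in rows:
        totals[code_to_country[code]] += value
    return dict(totals)

def per_million(totals, population):
    """Traffic per one million inhabitants of each country."""
    return {c: t * 1_000_000 / population[c] for c, t in totals.items()}

absolute = traffic_by_country(rows, CODE_TO_COUNTRY)
relative = per_million(absolute, POPULATION)

print("min absolute:", min(absolute, key=absolute.get))
print("min relative:", min(relative, key=relative.get))
```

With these invented numbers the country with the least absolute traffic is not the one with the least traffic per inhabitant, which is why the two maps per variable can rank countries differently.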

Example of the map representation for general traffic

General traffic
General relative traffic

Example of the Milan grid map

Full grid map
City centre grid map
