# Predicting the best location for *Gilets Jaunes* demonstrations to limit material damage in Paris

*Morgane Nadal - PSL University & Ecole Normale Supérieure Student, Paris*

## **Business Plan**

  The *Gilets Jaunes* have now been demonstrating in Paris for five months. Some are peaceful, whereas a growing number of "black blocs" are devastating the old neighborhoods of the city, trashing streets and buildings and specifically targeting gastronomic restaurants, luxury shops and other so-called symbolic places. The overall cost for France currently exceeds one hundred million euros.

  In this project, we will try to identify the neighborhoods most likely to suffer vandalism, and to find neighborhoods toward which demonstrators could be directed in order to limit damage as much as possible.

  We believe this can help the French government and Parisians estimate the damage a march could cause. It also matters to the *Gilets Jaunes* who genuinely want to make their voices heard during a planned, government-approved march, and to demonstrate without the violence and destruction that have accompanied the protests every Saturday.

  We are making this project public, knowing that many other factors must be taken into account on this very sensitive subject. It is only a preliminary step toward further analyses.

## Imports

In [None]:
import numpy as np 

import pandas as pd

import json 

#!conda install -c conda-forge geopy --yes # uncomment this line if you haven't completed the Foursquare API lab
from geopy.geocoders import Nominatim # convert an address into latitude and longitude values

import requests # library to handle requests
from pandas import json_normalize # transform JSON into a pandas dataframe (top-level import, pandas >= 1.0)

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

# import k-means from clustering stage
from sklearn.cluster import KMeans

#!conda install -c conda-forge folium=0.5.0 --yes # uncomment this line if you haven't completed the Foursquare API lab
import folium # map rendering library

## Data Cleaning

We will use the borough and neighborhood data to find the venues in each neighborhood with Foursquare. For that, we need a table with the borough, neighborhood, coordinates and venues. Let's go!

***First, download the JSON file of the Paris neighborhoods from this address:***

    https://opendata.paris.fr/explore/dataset/quartier_paris/export/?location=12,48.85889,2.34692&basemap=jawg.streets&dataChart=eyJxdWVyaWVzIjpbeyJjb25maWciOnsiZGF0YXNldCI6InF1YXJ0aWVyX3BhcmlzIiwib3B0aW9ucyI6e319LCJjaGFydHMiOlt7ImFsaWduTW9udGgiOnRydWUsInR5cGUiOiJjb2x1bW4iLCJmdW5jIjoiQVZHIiwieUF4aXMiOiJuX3NxX3F1Iiwic2NpZW50aWZpY0Rpc3BsYXkiOnRydWUsImNvbG9yIjoiIzI2Mzg5MiJ9XSwieEF4aXMiOiJuX3NxX3F1IiwibWF4cG9pbnRzIjo1MCwic29ydCI6IiJ9XSwidGltZXNjYWxlIjoiIiwiZGlzcGxheUxlZ2VuZCI6dHJ1ZSwiYWxpZ25Nb250aCI6dHJ1ZX0%3D
    
***OR you can find it on my Github repository:*** https://github.com/Smaragdy/Coursera-Project-Gilets-Jaunes
    
    
We then extract a pandas dataframe from it:

In [None]:
with open('YOUR_DIRECTORY/quartier_paris.json') as json_data:
    quartier_paris = json.load(json_data)

In [None]:
#Have a look at the data
quartier_paris[0]

In [1]:
# define the dataframe columns
column_names = ['Borough', 'Neighborhood', 'Latitude','Longitude'] 

# instantiate the dataframe
neigh = pd.DataFrame(columns=column_names)

In [None]:
# Collect one row per neighborhood, then build the dataframe in a single step
# (DataFrame.append was removed in pandas 2.0)
rows = []
for data in quartier_paris:
    borough = data['fields']['c_ar']
    neighborhood_name = data['fields']['l_qu']

    # 'coordinates' is stored as [longitude, latitude]
    neighborhood_latlon = data['geometry']['coordinates']
    neighborhood_lat = neighborhood_latlon[1]
    neighborhood_lon = neighborhood_latlon[0]

    rows.append({'Borough': borough,
                 'Neighborhood': neighborhood_name,
                 'Latitude': neighborhood_lat,
                 'Longitude': neighborhood_lon})

neigh = pd.DataFrame(rows, columns=column_names)

In [None]:
neigh.head()
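
As an alternative to an explicit loop, the same table can be sketched with `pd.json_normalize`, which flattens nested records into dotted column names. The `sample_records` list below is a made-up stand-in that mirrors only the keys used above (`fields.c_ar`, `fields.l_qu`, `geometry.coordinates`); its values are illustrative and not taken from the real file. It assumes pandas 1.0 or later.

```python
import pandas as pd

# Hypothetical records mimicking the structure of quartier_paris.json
# (names and coordinates are placeholders, not real data)
sample_records = [
    {"fields": {"c_ar": 1, "l_qu": "Quartier A"},
     "geometry": {"type": "Point", "coordinates": [2.34, 48.86]}},
    {"fields": {"c_ar": 2, "l_qu": "Quartier B"},
     "geometry": {"type": "Point", "coordinates": [2.35, 48.87]}},
]

# Nested dicts become columns such as 'fields.c_ar'; lists stay as list values
flat = pd.json_normalize(sample_records)

neigh_sketch = pd.DataFrame({
    "Borough": flat["fields.c_ar"],
    "Neighborhood": flat["fields.l_qu"],
    # coordinates are [longitude, latitude]; .str[i] picks element i of each list
    "Latitude": flat["geometry.coordinates"].str[1],
    "Longitude": flat["geometry.coordinates"].str[0],
})
print(neigh_sketch)
```

With the real file, `sample_records` would simply be replaced by `quartier_paris`, provided every record carries these keys.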

In [None]:
print('The dataframe has {} boroughs and {} neighborhoods.'.format(
        len(neigh['Borough'].unique()),
        neigh.shape[0]
    )
)

Now that we have the coordinates of each neighborhood, we will find its venues using Foursquare. Once we have the venues, we will cluster the neighborhoods to identify those most likely to suffer heavy damage.

***But FIRST, we have to take into account the kind of population living in these areas: who might or might not join the movement, who would be more affected, etc.***

*If the data were easily available, it would be worth adding a crime index for each neighborhood to this table, as well as an index of population type (student, residential, ...).*

Instead, we will use the poverty rate and median standard of living for each Paris borough (INSEE 2015). ***Please download the file from my GitHub repository:***
https://github.com/Smaragdy/Coursera-Project-Gilets-Jaunes

In [None]:
Pov_df = pd.read_excel('YOUR_DIRECTORY/base-cc-filosofi-2015.xls',header = 4)

In [None]:
Pov_df.head()

In [None]:
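# Keep the name column too: it is renamed to 'Borough_name' and used for filtering below
PRC = Pov_df[["Code géographique","Libellé géographique","Taux de pauvreté-Ensemble (%)","Médiane du niveau vie (€)"]].copy()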

In [None]:
PRC = PRC.drop([0], axis=0)

In [None]:
PRC.rename(columns={'Code géographique':'Borough',
                          'Libellé géographique':'Borough_name',
                          'Taux de pauvreté-Ensemble (%)':'Poverty',
                          'Médiane du niveau vie (€)':'Median_Life_level'}, 
                 inplace=True)

In [None]:
PRC.head()

In [None]:
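# Paris arrondissement codes start with 751 (75101 to 75120); na=False treats missing codes as non-matches
PRCC = PRC[PRC['Borough'].str.startswith('751', na=False)]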
PRCC.head()
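
Paris arrondissements carry the INSEE codes 75101 to 75120, so a stricter membership test than a prefix or substring match can be sketched with a fully anchored regular expression. The helper name `is_paris_arrondissement` and the sample codes are illustrative assumptions, not part of the original notebook.

```python
import re

# 751 followed by 01-09, 10-19 or 20
PARIS_ARR = re.compile(r"751(0[1-9]|1[0-9]|20)")

def is_paris_arrondissement(code: str) -> bool:
    """Return True only for the 20 arrondissement codes 75101..75120."""
    return PARIS_ARR.fullmatch(code) is not None

# 75056 is the code of the commune of Paris as a whole, not of an arrondissement
codes = ["75101", "75120", "75121", "27510", "75056"]
print([c for c in codes if is_paris_arrondissement(c)])
# → ['75101', '75120']
```

With pandas, the same pattern could be passed to `Series.str.match` with a trailing `$`, since `str.match` only anchors at the start of the string.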

In [None]:
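# .copy() gives an independent dataframe, so the column assignment below does not act on a slice
P = PRCC[PRCC['Borough_name'].str.contains('Paris', na=False)].copy()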
P

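In [None]:
# Derive the borough number (1 to 20) from the last two digits of the INSEE code,
# rather than relying on the rows being sorted
P['Borough'] = P['Borough'].str[-2:].astype(int)
P = P.drop(['Borough_name'], axis=1)
P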

In [None]:
Bo = list(P.iloc[:,0])
Pov = list(P.iloc[:,1])
Med = list(P.iloc[:,2])

In [None]:
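# Look up each neighborhood's borough-level indicators by borough number
pov_by_borough = dict(zip(Bo, Pov))
med_by_borough = dict(zip(Bo, Med))
L = [pov_by_borough[b] for b in neigh['Borough']]
M = [med_by_borough[b] for b in neigh['Borough']]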

In [None]:
neigh['Poverty'] = L
neigh['Median_Life_Level'] = M

In [None]:
neigh.head()
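
The step above attaches borough-level indicators to each neighborhood, which is a many-to-one join. A minimal sketch of the same idea with `DataFrame.merge`, on two small made-up tables (`neigh_demo` and `borough_demo` are hypothetical names, and the numbers are placeholders, not INSEE figures):

```python
import pandas as pd

# Hypothetical neighborhood table: several neighborhoods per borough
neigh_demo = pd.DataFrame({
    "Borough": [1, 1, 2],
    "Neighborhood": ["Quartier A", "Quartier B", "Quartier C"],
})

# Hypothetical borough-level indicators: exactly one row per borough
borough_demo = pd.DataFrame({
    "Borough": [1, 2],
    "Poverty": [10.0, 12.5],
    "Median_Life_Level": [30000.0, 28000.0],
})

# validate="many_to_one" raises if a borough appears twice in borough_demo
merged = neigh_demo.merge(borough_demo, on="Borough",
                          how="left", validate="many_to_one")
print(merged)
```

`how="left"` keeps every neighborhood even when its borough is missing from the indicator table, in which case the indicator columns are `NaN`. Both `Borough` columns should share the same type for the keys to match.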

Done! We have finished the data cleaning!

## Exploration of the venues in each neighborhood

## Clustering the neighborhoods

## Conclusion

## Discussion