# About 
The goal of this notebook is to find a systematic way to identify the French communes most likely to be good candidates for hosting refugees.
The proposed approach is to geo-cluster communes with K-means, using features that matter to J'accueille.
Intuitively, the idea is that:
- we look at a given commune and the adjacent ones
- some relevant features are picked
- each selected feature can have a custom weight based on the refugee profile
- optionally, additional discretionary filters can be applied (e.g. 'Look only in South West France')
- based on these features, communes are clustered with K-means
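
The steps above can be sketched as a weighted K-means. This is only a minimal illustration on a made-up feature matrix: the function name `weighted_kmeans`, the weights, and the toy data are hypothetical, and a library implementation such as scikit-learn's `KMeans` would be the more usual choice in practice. Scaling each standardized column by the square root of its weight makes the squared Euclidean distance used by K-means a weighted sum over features.

```python
import numpy as np

def weighted_kmeans(X, weights, k, n_iter=50, seed=0):
    """Plain Lloyd's algorithm on standardized, weight-scaled features (illustrative)."""
    X = np.asarray(X, dtype=float)
    # Standardize each feature so that no column dominates because of its units
    std = X.std(axis=0)
    std[std == 0] = 1.0
    Z = (X - X.mean(axis=0)) / std
    # Scaling by sqrt(w) turns squared Euclidean distance into a weighted sum
    Z = Z * np.sqrt(np.asarray(weights, dtype=float))
    rng = np.random.default_rng(seed)
    centers = Z[rng.choice(len(Z), size=k, replace=False)]
    for _ in range(n_iter):
        # Assign every row to its nearest center
        dists = ((Z[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        # Recompute centers; keep the old center if a cluster is empty
        new_centers = np.array([
            Z[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers

# Toy data: 6 hypothetical communes x 2 hypothetical features
X = np.array([[1.0, 10.0], [1.2, 11.0], [0.9, 9.5],
              [8.0, 50.0], [8.3, 52.0], [7.9, 49.0]])
labels, centers = weighted_kmeans(X, weights=[1.0, 1.0], k=2)
```

Changing `weights` per refugee profile changes which features drive the grouping, which is what would produce a different map each time.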

## In more detail

Here are the features we start with:
- % of vacant housing (logements vacants): the lower the better
- Unemployment for shortage occupations (métiers en tension): the higher the better
- Citizen hosting (hébergement citoyen) availability: the higher the better
- Primary school at closure risk
- Political color: the redder the better

These are the 5 features we can start with, and we will start with an equally weighted feature matrix. In the future, we want to make feature selection and weight selection dynamic, each combination resulting in a different map of France.
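
As a rough illustration of the equally weighted starting point, the sketch below orients every feature so that higher means better, standardizes it, and applies one weight per feature. The column names, the values, and the `orientation` mapping are hypothetical placeholders, not the notebook's actual data. School closure risk is left out because its direction is not stated in the list.

```python
import pandas as pd

# Hypothetical feature table: one row per commune, made-up values
features = pd.DataFrame(
    {
        "pct_vacant_housing": [4.0, 9.0, 6.5],
        "tension_jobs": [120, 30, 75],
        "citizen_hosting": [3, 0, 1],
        "political_color": [0.7, 0.2, 0.5],
    },
    index=["commune_a", "commune_b", "commune_c"],
)

# +1 when higher is better, -1 when lower is better (as listed above)
orientation = pd.Series(
    {"pct_vacant_housing": -1, "tension_jobs": 1,
     "citizen_hosting": 1, "political_color": 1}
)

# Equal weights that sum to 1; profile-specific weights could be swapped in later
weights = pd.Series(1.0 / features.shape[1], index=features.columns)

# Z-score each column, flip "lower is better" columns, then apply the weights
zscores = (features - features.mean()) / features.std(ddof=0)
weighted = zscores * orientation * weights

# A simple composite score per commune (sum of weighted, oriented z-scores)
score = weighted.sum(axis=1).sort_values(ascending=False)
```

The `weighted` matrix is what a clustering step would consume; the composite `score` is only a convenience for ranking.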

# Data Work

Sources:
- Besoin Main d'Oeuvre: https://www.data.gouv.fr/fr/datasets/enquete-besoins-en-main-doeuvre-bmo/
- Métiers en tension: https://www.herault.gouv.fr/contenu/telechargement/49273/367990/file/Liste%20des%20m%C3%A9tiers%20en%20tension.pdf
- Communes <-> Bassin Emploi: https://www.insee.fr/fr/information/4652957

## Let's get the relevant datapoints

In [54]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import geopandas as gpd
import folium as flm
from shapely import wkt

In [55]:
# Base df listing all communes with some geographic characteristics
df = pd.read_csv('../csv/communes-avec-coords-polygons-population-voisins-aa.csv')


In [56]:
df.head()

Unnamed: 0,codgeo,codpost,nom,longitude,latitude,geometry,polygon,libgeo,p21_pop,p20_pop,population,nb_voisins,liste_voisins,name,codgeo_aa,aa_name,aa_cat
0,1001,1400,L ABERGEMENT CLEMENCIAT,4.9306,46.151702,POINT (4.9306005 46.1517018),"POLYGON ((4.904571 46.160961, 4.913322 46.1829...",L' Abergement-Clémenciat,832.0,806.0,832.0,6.0,"['01412', '01093', '01028', '01146', '01351', ...",L'Abergement-Clémenciat,524,Châtillon-sur-Chalaronne,20.0
1,1002,1640,L ABERGEMENT DE VAREY,5.424644,46.007131,POINT (5.4246442 46.007131),"POLYGON ((5.424759 46.031308, 5.441286 46.0254...",L' Abergement-de-Varey,267.0,262.0,267.0,6.0,"['01056', '01277', '01384', '01007', '01363', ...",L'Abergement-de-Varey,0,Commune hors attraction des villes,30.0
2,1004,1500,AMBERIEU EN BUGEY,5.370568,45.957471,POINT (5.37056825 45.9574707),"POLYGON ((5.386191 45.930928, 5.357241 45.9486...",Ambérieu-en-Bugey,14854.0,14288.0,14854.0,7.0,"['01384', '01421', '01041', '01345', '01089', ...",Ambérieu-en-Bugey,243,Ambérieu-en-Bugey,11.0
3,1005,1330,AMBERIEUX EN DOMBES,4.911872,45.999229,POINT (4.9118718 45.99922935),"POLYGON ((4.942867 45.979142, 4.92773 45.98003...",Ambérieux-en-Dombes,1897.0,1782.0,1897.0,7.0,"['01382', '01207', '01261', '01362', '01318', ...",Ambérieux-en-Dombes,2,Lyon,20.0
4,1006,1300,AMBLEON,5.592785,45.748314,POINT (5.5927847 45.74831435),"POLYGON ((5.570824 45.753383, 5.584292 45.7625...",Ambléon,113.0,113.0,113.0,6.0,"['01358', '01110', '01117', '01216', '01233', ...",Ambléon,286,Belley,20.0
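
In the preview above, `codgeo` appears without its leading zero (1001) while the codes in `liste_voisins` are zero-padded strings ('01412'), and the list itself is most likely stored as a string after the CSV round trip. Here is a minimal sketch of how both could be normalized, assuming that is indeed the case. The three-row frame is a made-up stand-in for `df`, and codes that are not purely numeric (such as Corsican '2A…' codes) would need separate handling.

```python
import ast
import pandas as pd

# Made-up stand-in for the communes table (same column names, toy values)
toy = pd.DataFrame(
    {
        "codgeo": [1001, 1002, 1004],
        "liste_voisins": ["['01002', '01004']", "['01001']", "['01001']"],
    }
)

# INSEE commune codes are 5 characters long; restore the leading zero
toy["codgeo"] = toy["codgeo"].astype(str).str.zfill(5)

# Turn the string representation back into a real Python list
toy["liste_voisins"] = toy["liste_voisins"].apply(ast.literal_eval)

# One row per (commune, neighbour) pair, convenient for joining features later
pairs = toy.explode("liste_voisins").rename(columns={"liste_voisins": "voisin"})
```

With `pairs` in hand, the features of adjacent communes can be attached by merging on `voisin`.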


In [57]:
emploi_tension = pd.read_csv('../csv/metiers_en_tension_mars_2024.csv')
regions_insee = pd.read_csv('../csv/insee_region_2022.csv')

In [58]:
emploi_tension = emploi_tension.rename({'Region':'nom_region','Code FAP':'code_fap','Famille Professionnelle':'nom_fap'},axis=1)
emploi_tension.head()

Unnamed: 0,nom_region,nom_fap,code_fap
0,France Entière,Agriculteurs salariés,A0Z40
1,France Entière,Éleveurs salariés,A0Z41
2,France Entière,Maraîchers; horticulteurs salariés,A1Z40
3,France Entière,Viticulteurs; arboriculteurs salariés,A1Z42
4,Auvergne-Rhône-Alpes,Agriculteurs salariés,A0Z40


In [59]:
regions_insee = regions_insee.rename({'LIBELLE':'nom_region','REG':'code_region'},axis=1)
regions_insee = regions_insee[['nom_region','code_region']]
regions_insee.head()

Unnamed: 0,nom_region,code_region
0,Guadeloupe,1
1,Martinique,2
2,Guyane,3
3,La Réunion,4
4,Mayotte,6


In [60]:
emploi_tension = pd.merge(emploi_tension, regions_insee, on='nom_region')
emploi_tension

Unnamed: 0,nom_region,nom_fap,code_fap,code_region
0,Auvergne-Rhône-Alpes,Agriculteurs salariés,A0Z40,84
1,Auvergne-Rhône-Alpes,Éleveurs salariés,A0Z41,84
2,Auvergne-Rhône-Alpes,Maraîchers; horticulteurs salariés,A1Z40,84
3,Auvergne-Rhône-Alpes,Viticulteurs; arboriculteurs salariés,A1Z42,84
4,Auvergne-Rhône-Alpes,Agents de maîtrise et assimilés des industries...,E2Z80,84
...,...,...,...,...
356,Provence-Alpes-Côte d'Azur,Techniciens en électricité et en électronique,C2Z70,93
357,Provence-Alpes-Côte d'Azur,Techniciens en mécanique et travail des métaux,D6Z70,93
358,Provence-Alpes-Côte d'Azur,Techniciens et agents de maîtrise de la mainte...,G1Z70,93
359,Provence-Alpes-Côte d'Azur,Techniciens et chargés d'études du bâtiment et...,B6Z71,93
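
`pd.merge` defaults to an inner join, so rows whose `nom_region` has no counterpart in `regions_insee` are silently dropped. That appears to be the case for 'France Entière', which is visible in the earlier preview but not in the merged output. Here is a minimal sketch of how unmatched rows could be surfaced, using toy frames with the same column names (values are illustrative only).

```python
import pandas as pd

# Toy stand-ins with the same column names as the notebook's frames
tension = pd.DataFrame(
    {
        "nom_region": ["France Entière", "Auvergne-Rhône-Alpes"],
        "nom_fap": ["Agriculteurs salariés", "Agriculteurs salariés"],
        "code_fap": ["A0Z40", "A0Z40"],
    }
)
regions = pd.DataFrame(
    {"nom_region": ["Auvergne-Rhône-Alpes"], "code_region": [84]}
)

# A left join with indicator=True keeps every row and flags where it came from
checked = pd.merge(tension, regions, on="nom_region", how="left", indicator=True)

# Region names that an inner join would have dropped
unmatched = checked.loc[checked["_merge"] == "left_only", "nom_region"].unique()
```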


In [64]:
emploi_besoin = pd.read_csv('../csv/france_travail_base_open_data_BMO_2024.csv')
emploi_besoin = emploi_besoin.rename({'Code métier BMO':'code_fap', 'REG': 'code_region', 'Dept':'code_dept','BE24':'code_bassin'}, axis=1)
emploi_besoin = emploi_besoin[['code_fap','code_region','code_dept','code_bassin','met','xmet','smet']]
emploi_besoin.head()

Unnamed: 0,code_fap,code_region,code_dept,code_bassin,met,xmet,smet
0,A0X40,1,971,101,95,31,44
1,A0X40,1,971,102,167,69,86
2,A0X40,1,971,103,17,*,*
3,A0X40,1,978,104,6,6,*
4,A0X40,1,971,105,45,37,30


In [65]:
emploi_besoin.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 53443 entries, 0 to 53442
Data columns (total 7 columns):
 #   Column       Non-Null Count  Dtype 
---  ------       --------------  ----- 
 0   code_fap     53443 non-null  object
 1   code_region  53443 non-null  int64 
 2   code_dept    53443 non-null  object
 3   code_bassin  53443 non-null  int64 
 4   met          53443 non-null  object
 5   xmet         53443 non-null  object
 6   smet         53443 non-null  object
dtypes: int64(2), object(5)
memory usage: 2.9+ MB
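
`met`, `xmet` and `smet` are read as `object` because some cells hold `*` instead of a number (presumably values withheld for confidentiality, which is an assumption here). Here is a minimal sketch of one way to make them numeric, treating `*` as missing. The toy frame mirrors the shape of the preview above.

```python
import pandas as pd

# Toy rows shaped like the preview above
toy = pd.DataFrame(
    {
        "code_bassin": [101, 102, 103],
        "met": ["95", "167", "17"],
        "xmet": ["31", "69", "*"],
        "smet": ["44", "86", "*"],
    }
)

count_cols = ["met", "xmet", "smet"]

# errors="coerce" turns anything non-numeric (here "*") into NaN
toy[count_cols] = toy[count_cols].apply(pd.to_numeric, errors="coerce")
```

Whether `*` should become missing, zero, or a small imputed value depends on how the counts are used downstream.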


In [66]:
emploi_tension_besoin = pd.merge(emploi_besoin, emploi_tension, on=['code_region','code_fap'])
emploi_tension_besoin.head()

Unnamed: 0,code_fap,code_region,code_dept,code_bassin,met,xmet,smet,nom_region,nom_fap
0,G0A40,24,18,2401,48,48,8,Centre-Val de Loire,Ouvriers qualifiés de la maintenance en mécanique
1,G0A40,24,18,2402,10,9,*,Centre-Val de Loire,Ouvriers qualifiés de la maintenance en mécanique
2,G0A40,24,18,2403,10,10,*,Centre-Val de Loire,Ouvriers qualifiés de la maintenance en mécanique
3,G0A40,24,18,2404,5,5,*,Centre-Val de Loire,Ouvriers qualifiés de la maintenance en mécanique
4,G0A40,24,28,2405,76,46,*,Centre-Val de Loire,Ouvriers qualifiés de la maintenance en mécanique


In [67]:
emploi_tension_besoin.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 657 entries, 0 to 656
Data columns (total 9 columns):
 #   Column       Non-Null Count  Dtype 
---  ------       --------------  ----- 
 0   code_fap     657 non-null    object
 1   code_region  657 non-null    int64 
 2   code_dept    657 non-null    object
 3   code_bassin  657 non-null    int64 
 4   met          657 non-null    object
 5   xmet         657 non-null    object
 6   smet         657 non-null    object
 7   nom_region   657 non-null    object
 8   nom_fap      657 non-null    object
dtypes: int64(2), object(7)
memory usage: 46.3+ KB
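
One possible next step, sketched under assumptions, is to turn these rows into a per-bassin feature: the total of the `met` column for shortage occupations. The toy frame reuses the column names above. The output name `projets_tension` is hypothetical, as is the reading of `met` as a count of recruitment projects. Treating `*` as missing is a choice rather than something the source prescribes.

```python
import pandas as pd

# Toy stand-in for emploi_tension_besoin (values are illustrative)
toy = pd.DataFrame(
    {
        "code_fap": ["G0A40", "G0A40", "G0A40", "B6Z71"],
        "code_bassin": [2401, 2402, 2401, 2401],
        "met": ["48", "10", "*", "5"],
    }
)

# Make the counts numeric first; "*" becomes NaN and is skipped by sum()
toy["met"] = pd.to_numeric(toy["met"], errors="coerce")

# Total of `met` over shortage occupations, per bassin d'emploi
projets_tension = (
    toy.groupby("code_bassin", as_index=False)["met"]
    .sum()
    .rename(columns={"met": "projets_tension"})
)
```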


In [68]:
gdf = gpd.read_file('../csv/com_ze2020_2024.shp')

In [69]:
gdf.head()

Unnamed: 0,codgeo,libgeo,ze2020,libze2020,ze20part_r,dep,reg,geometry
0,61145,Domfront en Poiraie,2809,Flers,,61,28,"POLYGON ((-0.75826 48.59475, -0.75166 48.60343..."
1,14644,Saint-Philbert-des-Champs,2814,Lisieux,,14,28,"POLYGON ((0.23419 49.22047, 0.23696 49.2232, 0..."
2,19059,Concèze,7507,Brive-la-Gaillarde,,19,75,"POLYGON ((1.30773 45.35797, 1.31596 45.36118, ..."
3,51084,Bréban,4427,Vitry-le-François Saint-Dizier,,51,44,"POLYGON ((4.34966 48.59794, 4.35265 48.60065, ..."
4,61166,Ferrières-la-Verrerie,51,Alençon,0051-28,61,28,"POLYGON ((0.34238 48.67928, 0.35574 48.68171, ..."


In [70]:
#gdf.explore()

In [71]:
gdf[gdf['libze2020']=='Bordeaux'].head(20)

Unnamed: 0,codgeo,libgeo,ze2020,libze2020,ze20part_r,dep,reg,geometry
171,33236,Lège-Cap-Ferret,7505,Bordeaux,,33,75,"POLYGON ((-1.26055 44.62666, -1.26052 44.65666..."
613,33233,Laruscade,7505,Bordeaux,,33,75,"POLYGON ((-0.3843 45.14075, -0.3834 45.14334, ..."
686,33101,Cartelègue,7505,Bordeaux,,33,75,"POLYGON ((-0.60174 45.18286, -0.59611 45.2056,..."
720,33089,Campugnan,7505,Bordeaux,,33,75,"POLYGON ((-0.58411 45.16282, -0.56315 45.19314..."
752,33389,Saint-Ciers-sur-Gironde,7505,Bordeaux,,33,75,"POLYGON ((-0.71581 45.32726, -0.70824 45.32748..."
1029,33011,Arès,7505,Bordeaux,,33,75,"POLYGON ((-1.16193 44.77491, -1.14815 44.77869..."
1059,33018,Val de Virvée,7505,Bordeaux,,33,75,"POLYGON ((-0.42687 45.01546, -0.42304 45.02131..."
1586,33182,Gauriac,7505,Bordeaux,,33,75,"POLYGON ((-0.65909 45.06458, -0.63839 45.0778,..."
1751,33086,Camiac-et-Saint-Denis,7505,Bordeaux,,33,75,"POLYGON ((-0.31521 44.79379, -0.31237 44.79953..."
1785,33349,Quinsac,7505,Bordeaux,,33,75,"POLYGON ((-0.51158 44.74584, -0.51119 44.76495..."


In [72]:
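# NOMBE24 was dropped from emploi_besoin by the column selection above, so reload the raw file
# (assumes the raw BMO file has a NOMBE24 column holding the bassin name)
bmo_raw = pd.read_csv('../csv/france_travail_base_open_data_BMO_2024.csv')
bmo_raw[bmo_raw['NOMBE24']=='BORDEAUX'].head(20)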

In [75]:
fermetures_ecoles = gpd.read_file('../csv/fr-en-etablissements-fermes.geojson')
fermetures_ecoles.head()

Unnamed: 0,numero_uai,appellation_officielle,denomination_principale,patronyme_uai,secteur_public_prive_libe,date_ouverture,date_fermeture,adresse_uai,lieu_dit_uai,boite_postale_uai,...,libelle_region,libelle_academie,libelle_commune,restauration,herbergement,ecole_maternelle,ecole_elementaire,ulis,greta,geometry
0,0133285A,Institut privé Leschi Les Chemins (Ecole secon...,ECOLE 2ND DEGRE GENERAL PRIVEE,INSTITUT LESCHI LES CHEMINS,Privé,1987-09-01,2006-12-31,16 RUE MATHERON,,,...,Provence-Alpes-Côte d'Azur,Aix-Marseille,Aix-en-Provence,0,0,,,0,0,POINT (5.44928 43.53058)
1,0133310C,Ecole privée Prado-Plage Coiffure (Ecole secon...,ECOLE SECONDAIRE PROF.PRIVEE,PRADO-PLAGE (COIFFURE),Privé,1988-05-20,1996-08-31,RUE DES MOUSSES ZAC LA PLAGE,,,...,Provence-Alpes-Côte d'Azur,Aix-Marseille,Marseille 8e Arrondissement,0,0,,,0,0,POINT (5.37525 43.26672)
2,0133379C,Cours Miramas (Ecole secondaire technologique ...,ECOLE TECHNIQUE PRIVEE,MIRAMAS (COURS),Privé,1990-09-01,1995-08-31,PARC LA CARRAIRE,,,...,Provence-Alpes-Côte d'Azur,Aix-Marseille,Miramas,0,0,,,0,0,POINT (5.00122 43.57224)
3,0133411M,Ecole privée Sud Institut Européen (Ecole seco...,ECOLE TECHNIQUE PRIVEE,SUD INSTITUT EUROPEEN,Privé,1991-09-01,1994-08-31,,PARADIS ST ROCH ALLEE,,...,Provence-Alpes-Côte d'Azur,Aix-Marseille,Martigues,0,0,,,0,0,POINT (5.04363 43.40773)
4,0133426D,Collège privé Saint Louis - Sainte Marthe,COLLEGE PRIVE,ST LOUIS - STE MARTHE,Privé,1992-09-01,2003-08-31,DOMAINE CHESNERAIE 105 CH BOSQUE,,,...,Provence-Alpes-Côte d'Azur,Aix-Marseille,Aix-en-Provence,1,1,,,0,0,POINT (5.43453 43.57433)


In [78]:
fermetures_ecoles[fermetures_ecoles.secteur_public_prive_libe=='Public'].sort_values('date_fermeture', ascending=False).head(20)

Unnamed: 0,numero_uai,appellation_officielle,denomination_principale,patronyme_uai,secteur_public_prive_libe,date_ouverture,date_fermeture,adresse_uai,lieu_dit_uai,boite_postale_uai,...,libelle_region,libelle_academie,libelle_commune,restauration,herbergement,ecole_maternelle,ecole_elementaire,ulis,greta,geometry
21062,9710618V,Ecole primaire Riflet,ECOLE PRIMAIRE PUBLIQUE,RIFLET,Public,1974-03-27,2024-12-31,,RIFLET,,...,Guadeloupe,Guadeloupe,Deshaies,1,0,1.0,1.0,0,0,POINT (-61.77727 16.34028)
18573,0860332U,Ecole primaire Leugny,ECOLE ELEMENTAIRE PUBLIQUE,,Public,1965-08-16,2024-09-05,5 rue de la Mairie,,,...,Nouvelle-Aquitaine,Poitiers,Leugny,1,0,1.0,1.0,0,0,POINT (0.70148 46.91094)
41481,0540547B,Ecole élémentaire des Petits Princes,ECOLE ELEMENTAIRE PUBLIQUE,DES PETITS PRINCES,Public,1965-07-15,2024-09-01,1 place de la Mairie,,,...,Grand Est,Nancy-Metz,Jaillon,0,0,0.0,1.0,0,0,POINT (5.96847 48.75714)
930,0250794C,Ecole maternelle Vannolles,ECOLE MATERNELLE PUBLIQUE,VANNOLLES,Public,1965-07-06,2024-09-01,4 rue de Vannolles,,,...,Bourgogne-Franche-Comté,Besançon,Pontarlier,0,0,1.0,0.0,0,0,POINT (6.35598 46.90393)
35796,0540409B,Ecole élémentaire,ECOLE ELEMENTAIRE PUBLIQUE,,Public,1965-07-15,2024-09-01,16 GRANDE RUE GDE RUE,,,...,Grand Est,Nancy-Metz,Emberménil,0,0,0.0,1.0,0,0,POINT (6.69494 48.6288)
16697,0541182S,Ecole primaire,ECOLE PRIMAIRE,,Public,1965-07-15,2024-09-01,51 rue des Généraux Mangin,,,...,Grand Est,Nancy-Metz,Xermaménil,0,0,1.0,1.0,0,0,POINT (6.46193 48.53294)
35974,0541910H,Ecole maternelle,ECOLE MATERNELLE PUBLIQUE,,Public,1976-05-24,2024-09-01,GRANDE RUE GDE RUE,,,...,Grand Est,Nancy-Metz,Méréville,0,0,1.0,0.0,0,0,POINT (6.15243 48.59797)
30747,0251077K,Ecole maternelle Jean de la Fontaine,ECOLE MATERNELLE PUBLIQUE,JEAN DE LA FONTAINE,Public,1967-10-10,2024-09-01,5 bis rue de Champvallon,,,...,Bourgogne-Franche-Comté,Besançon,Bethoncourt,0,0,1.0,0.0,0,0,POINT (6.79949 47.53701)
27399,0101073M,Ecole élémentaire,ECOLE ELEMENTAIRE PUBLIQUE,,Public,1998-09-01,2024-09-01,Place du Tilleul,,,...,Grand Est,Reims,Jaucourt,0,0,0.0,1.0,0,0,POINT (4.6461 48.25998)
1887,0410275B,Ecole élémentaire,ECOLE ELEMENTAIRE PUBLIQUE,,Public,1965-09-06,2024-09-01,1 rue de la Manufacture,,,...,Centre-Val de Loire,Orléans-Tours,Meslay,0,0,0.0,1.0,0,0,POINT (1.09681 47.81193)
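
To move from this listing towards a 'primary school at closure risk' feature, one option is to count recent public primary-level closures per commune. The sketch below is a proxy built on assumptions, not the notebook's method: the cutoff date, the output name `n_fermetures`, and the use of past closures as a stand-in for risk are all hypothetical choices. A plain `DataFrame` with made-up rows replaces the `GeoDataFrame` so the example runs on its own.

```python
import pandas as pd

# Made-up rows with the same column names as fermetures_ecoles
toy = pd.DataFrame(
    {
        "libelle_commune": ["Commune A", "Commune B", "Commune B", "Commune C"],
        "secteur_public_prive_libe": ["Public", "Public", "Public", "Privé"],
        "date_fermeture": ["2024-09-05", "2024-09-01", "2010-08-31", "1995-08-31"],
        "ecole_maternelle": [1.0, 0.0, 1.0, None],
        "ecole_elementaire": [1.0, 1.0, 0.0, None],
    }
)

toy["date_fermeture"] = pd.to_datetime(toy["date_fermeture"])
cutoff = pd.Timestamp("2020-01-01")  # hypothetical cutoff for "recent"

# Public schools with a maternelle or élémentaire level, closed since the cutoff
is_primary = (toy["ecole_maternelle"] == 1) | (toy["ecole_elementaire"] == 1)
recent_public = toy[
    (toy["secteur_public_prive_libe"] == "Public")
    & is_primary
    & (toy["date_fermeture"] >= cutoff)
]

# Number of such closures per commune
n_fermetures = recent_public.groupby("libelle_commune").size().rename("n_fermetures")
```

Grouping on the commune name is only for illustration. Since names can repeat across departments, a commune code column would be a safer key if the dataset exposes one.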
