## Immigrants, Nationalities, Districts and Neighbourhoods
- What is the percentage of immigrants living in certain neighbourhoods? And percentage by nationality?
- Is there a link between immigrants' country of origin and the Barcelona neighbourhood they choose to live in?

### What is the percentage of immigrants living in certain neighbourhoods? And percentage by nationality?

- The largest single group of immigrants comes from Spain (36.30%), although it is not a majority. The next biggest group is Italy (6.44%), followed by China (3.40%), Colombia (3.32%) and Venezuela (3.10%).
- The neighbourhood with the highest share of immigrants is *el Barri Gòtic* (16.73% of total neighbourhood population).
- The district with the highest share of immigrants is *Ciutat Vella* (12.99%, averaged over its neighbourhoods) and the one with the lowest is *Sant Andreu* (4.64%, averaged over its neighbourhoods).
- *Ciutat Vella* holds the highest share of foreign vs national immigrants, at over 75% in all its neighbourhoods. *El Gòtic* is the top neighbourhood at 82.06%.
- Immigrants from one country are dispersed among different districts: among the rows shown in the per-nationality tables below, the largest single concentration is Pakistan in *Ciutat Vella* (33.64%).

In [1]:
import pandas as pd
import numpy as np

In [2]:
df = pd.read_csv("../datasets/Data_filtered/complete_dataset.csv", index_col=0)
df_original = df.copy()

immi = pd.read_csv("../datasets/Data_filtered/immi.csv", index_col=0)
immi.columns = ['dist', 'nbh', 'nationality', 'num_immi']
immi_original = immi.copy()

immi_nbh = pd.read_csv("../datasets/Data_filtered/immi_nbh.csv", index_col=0)
immi_nbh.columns = ['nbh', 'dist', 'num_immi']
immi_nbh_original = immi_nbh.copy()

immi_foreign = pd.read_csv("../datasets/Data_filtered/immi_foreign.csv", index_col=0)
immi_foreign.columns = ['dist', 'nbh', 'nationality', 'num_immi']
immi_foreign_original = immi_foreign.copy()

immi_foreign_nbh = pd.read_csv("../datasets/Data_filtered/immi_foreign_nbh.csv", index_col=0)
immi_foreign_nbh.columns = ['nbh', 'dist', 'num_immi']
immi_foreign_nbh_original = immi_foreign_nbh.copy()

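# Reload the complete dataset without index_col, so the "Unnamed: 0" column is kept
# (this overrides the df loaded at the top of the cell), then add the share of immigrants per row
df = pd.read_csv("../datasets/Data_filtered/complete_dataset.csv")
df["perc_immi"] = round(df["num_immi"] / df["population"] * 100, 2)
df_original = df.copy()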

In [3]:
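# Total immigrants (including Spain) per district; select num_immi explicitly so text columns are not summed
immi.groupby("dist")[["num_immi"]].sum().sort_values("num_immi", ascending=False)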

Unnamed: 0_level_0,num_immi
dist,Unnamed: 1_level_1
Eixample,19047
Sant Martí,12720
Ciutat Vella,12611
Sants-Montjuïc,11683
Nou Barris,8274
Horta-Guinardó,7799
Gràcia,7254
Sarrià-Sant Gervasi,7227
Sant Andreu,6335
Les Corts,4375


In [4]:
# Foreign immigrants by district (absolute counts; the percentage of the city total is added further down)
immi_foreign_grouped = immi_foreign.groupby("dist")[["num_immi"]].sum().sort_values("num_immi", ascending=False)
immi_foreign_grouped

Unnamed: 0_level_0,num_immi
dist,Unnamed: 1_level_1
Eixample,12487
Ciutat Vella,10069
Sant Martí,7994
Sants-Montjuïc,7464
Nou Barris,5261
Horta-Guinardó,4544
Gràcia,4310
Sarrià-Sant Gervasi,3802
Sant Andreu,3622
Les Corts,2420


### First analysis: include "Spain" in the dataset

#### Basic analysis *Immi*

In [5]:
immi.head()

Unnamed: 0,dist,nbh,nationality,num_immi
0,Ciutat Vella,el Raval,Spain,1109
1,Ciutat Vella,el Barri Gòtic,Spain,482
2,Ciutat Vella,la Barceloneta,Spain,414
3,Ciutat Vella,"Sant Pere, Santa Caterina i la Ribera",Spain,537
4,Eixample,el Fort Pienc,Spain,663


In [6]:
immi.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 11766 entries, 0 to 11765
Data columns (total 4 columns):
 #   Column       Non-Null Count  Dtype 
---  ------       --------------  ----- 
 0   dist         11766 non-null  object
 1   nbh          11766 non-null  object
 2   nationality  11766 non-null  object
 3   num_immi     11766 non-null  int64 
dtypes: int64(1), object(3)
memory usage: 459.6+ KB


In [7]:
immi.describe()

Unnamed: 0,num_immi
count,11766.0
mean,8.271885
std,50.821491
min,0.0
25%,0.0
50%,0.0
75%,2.0
max,1593.0


In [8]:
# list of nbh, ordered by highest % of immigrants
df.groupby("nbh").aggregate({"num_immi":"sum",
                             "population":"sum",
                            "perc_immi":"mean"}).sort_values("perc_immi", ascending=False)

Unnamed: 0_level_0,num_immi,population,perc_immi
nbh,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1
el Barri Gòtic,2687.0,16062.0,16.73
"Sant Pere, Santa Caterina i la Ribera",2765.0,22721.0,12.17
la Barceloneta,1759.0,14996.0,11.73
el Raval,5400.0,47608.0,11.34
el Besòs i el Maresme,2016.0,23009.0,8.76
...,...,...,...
la Marina del Prat Vermell,32.0,1149.0,2.79
Can Peguera,57.0,2271.0,2.51
Canyelles,140.0,6856.0,2.04
el Poble-sec,0.0,0.0,


In [9]:
# list of dist, ordered by highest % of immigrants
df.groupby("dist").aggregate({"num_immi":"sum",
                             "population":"sum",
                            "perc_immi":"mean"}).sort_values("perc_immi", ascending=False)

Unnamed: 0_level_0,num_immi,population,perc_immi
dist,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1
Ciutat Vella,12611.0,101387.0,12.9925
Eixample,19047.0,266416.0,7.148333
Sants-Montjuïc,11683.0,181910.0,5.87625
Les Corts,4375.0,82033.0,5.616667
Gràcia,7254.0,121347.0,5.554
Sant Martí,12720.0,235513.0,5.418
Nou Barris,8274.0,166579.0,4.755385
Horta-Guinardó,7799.0,168751.0,4.700909
Sarrià-Sant Gervasi,7227.0,149279.0,4.645
Sant Andreu,6335.0,147594.0,4.635714
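
Note that `perc_immi` in the district table above is the plain mean of the neighbourhood percentages, so a small neighbourhood weighs as much as a large one. A minimal sketch of the difference, using the four *Ciutat Vella* neighbourhoods from the earlier output (the names `cv`, `mean_of_percentages` and `weighted_percentage` are illustrative and not part of the notebook):

```python
import pandas as pd

# Ciutat Vella neighbourhoods, figures copied from the neighbourhood table above
cv = pd.DataFrame({
    "nbh": ["el Barri Gòtic", "Sant Pere, Santa Caterina i la Ribera",
            "la Barceloneta", "el Raval"],
    "num_immi": [2687, 2765, 1759, 5400],
    "population": [16062, 22721, 14996, 47608],
})
cv["perc_immi"] = (cv["num_immi"] / cv["population"] * 100).round(2)

# Unweighted mean of the neighbourhood percentages (what the groupby reports)
mean_of_percentages = cv["perc_immi"].mean()

# Population-weighted share: total immigrants over total population
weighted_percentage = round(
    cv["num_immi"].sum() / cv["population"].sum() * 100, 2
)

print(mean_of_percentages, weighted_percentage)
```

With these figures the weighted share is 12.44%, slightly below the 12.99% mean, because *el Raval* is both the largest neighbourhood and the one with the lowest percentage of the four.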


__*Dropping rows where num_immi = 0*__

In [10]:
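# .copy() makes the filtered frame independent, so adding columns later does not trigger SettingWithCopyWarning
immi = immi[immi["num_immi"] != 0].copy()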
immi.head()

Unnamed: 0,dist,nbh,nationality,num_immi
0,Ciutat Vella,el Raval,Spain,1109
1,Ciutat Vella,el Barri Gòtic,Spain,482
2,Ciutat Vella,la Barceloneta,Spain,414
3,Ciutat Vella,"Sant Pere, Santa Caterina i la Ribera",Spain,537
4,Eixample,el Fort Pienc,Spain,663


In [11]:
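immi.info  # parentheses are missing, so this shows the bound method (the frame repr) rather than the summary; use immi.info() for the summary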

<bound method DataFrame.info of                  dist                                    nbh     nationality  \
0        Ciutat Vella                               el Raval           Spain   
1        Ciutat Vella                         el Barri Gòtic           Spain   
2        Ciutat Vella                         la Barceloneta           Spain   
3        Ciutat Vella  Sant Pere, Santa Caterina i la Ribera           Spain   
4            Eixample                          el Fort Pienc           Spain   
...               ...                                    ...             ...   
11724  Horta-Guinardó                       el Baix Guinardó  No information   
11726  Horta-Guinardó                            el Guinardó  No information   
11735      Nou Barris          Vilapicina i la Torre Llobeta  No information   
11747      Nou Barris                               Vallbona  No information   
11756      Sant Martí                                el Clot  No information   

       

In [12]:
immi.describe()

Unnamed: 0,num_immi
count,4716.0
mean,20.637617
std,78.673256
min,1.0
25%,1.0
50%,4.0
75%,13.0
max,1593.0


In [13]:
# Selecting the columns needed to rank neighbourhoods by percentage of immigrants
perc_immi = df[["dist", "nbh", "num_immi", "population", "perc_immi"]]
perc_immi.sort_values("perc_immi", ascending=False)

Unnamed: 0,dist,nbh,num_immi,population,perc_immi
10,Ciutat Vella,el Barri Gòtic,2687.0,16062.0,16.73
7,Ciutat Vella,"Sant Pere, Santa Caterina i la Ribera",2765.0,22721.0,12.17
21,Ciutat Vella,la Barceloneta,1759.0,14996.0,11.73
19,Ciutat Vella,el Raval,5400.0,47608.0,11.34
50,Sant Martí,el Besòs i el Maresme,2016.0,23009.0,8.76
...,...,...,...,...,...
62,Sants-Montjuïc,la Marina del Prat Vermell,32.0,1149.0,2.79
32,Nou Barris,Can Peguera,57.0,2271.0,2.51
33,Nou Barris,Canyelles,140.0,6856.0,2.04
16,,el Poble-sec,,,


### Second analysis: foreign immigrants only (excluding "Spain")

In [14]:
# Neighbourhoods by % of foreign immigrants vs total number of immigrants

df_foreign = pd.merge(df, immi_foreign_nbh, how="left", on=["nbh", "dist"])

df_foreign.columns = ['Unnamed: 0', 'nbh', 'bars', 'children_places', 'cinemas_theatres',
       'schools', 'pre-schools', 'hospitals', 'libraries_theatres',
       'park_gardens', 'sport_centers', 'population', 'net_density(hab/ha)',
       'avg_occupation', 'dist', 'num_immi_total', 'mort_rate', 'rent_price',
       'num_crimes', 'children_places_pop', 'cinemas_theatres_pop',
       'schools_pop', 'pre-schools_pop', 'hospitals_pop',
       'libraries_theatres_pop', 'park_gardens_pop', 'sport_centers_pop',
       'facilities_pop', 'perc_immi', 'num_immi_foreign']

df_foreign["perc_foreign"]=round(df_foreign["num_immi_foreign"]/df_foreign["num_immi_total"]*100,2)

df_foreign["perc_immi_foreign"]=round(df_foreign["num_immi_foreign"]/df_foreign["population"]*100, 2)

df_foreign1 = df_foreign[["dist", "nbh", "num_immi_total", "num_immi_foreign", "perc_foreign"]].sort_values("perc_foreign", ascending=False)
df_foreign1.head(10)

Unnamed: 0,dist,nbh,num_immi_total,num_immi_foreign,perc_foreign
10,Ciutat Vella,el Barri Gòtic,2687.0,2205.0,82.06
7,Ciutat Vella,"Sant Pere, Santa Caterina i la Ribera",2765.0,2228.0,80.58
19,Ciutat Vella,el Raval,5400.0,4291.0,79.46
21,Ciutat Vella,la Barceloneta,1759.0,1345.0,76.46
67,Sant Andreu,la Trinitat Vella,675.0,501.0,74.22
50,Sant Martí,el Besòs i el Maresme,2016.0,1478.0,73.31
34,Nou Barris,Ciutat Meridiana,838.0,614.0,73.27
74,Sants-Montjuïc,el Poble Sec,3087.0,2261.0,73.24
13,Eixample,el Fort Pienc,2237.0,1574.0,70.36
66,Nou Barris,la Trinitat Nova,458.0,320.0,69.87
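
The cell above renames every column by position after the merge, which is easy to get wrong if the column order changes. A minimal sketch of an alternative, assuming both inputs carry a `num_immi` column: `pd.merge` accepts a `suffixes` argument that names the two clashing columns directly. The frame names `total` and `foreign` are illustrative, and the two rows are copied from the output above.

```python
import pandas as pd

# Two rows copied from the output above; the frame names are illustrative
total = pd.DataFrame({
    "dist": ["Ciutat Vella", "Ciutat Vella"],
    "nbh": ["el Barri Gòtic", "el Raval"],
    "num_immi": [2687, 5400],
})
foreign = pd.DataFrame({
    "dist": ["Ciutat Vella", "Ciutat Vella"],
    "nbh": ["el Barri Gòtic", "el Raval"],
    "num_immi": [2205, 4291],
})

# suffixes= names the two clashing num_immi columns without touching the others
merged = pd.merge(total, foreign, how="left", on=["nbh", "dist"],
                  suffixes=("_total", "_foreign"))

# Share of foreign immigrants among all immigrants, per neighbourhood
merged["perc_foreign"] = (
    merged["num_immi_foreign"] / merged["num_immi_total"] * 100
).round(2)
print(merged)
```

For these two rows the sketch reproduces the 82.06 and 79.46 values shown in the output.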


In [15]:
df_foreign1.tail(12)

Unnamed: 0,dist,nbh,num_immi_total,num_immi_foreign,perc_foreign
6,Sarrià-Sant Gervasi,Sant Gervasi - la Bonanova,1181.0,594.0,50.3
71,Sarrià-Sant Gervasi,les Tres Torres,702.0,339.0,48.29
30,Sant Andreu,Baró de Viver,92.0,44.0,47.83
44,Nou Barris,Vallbona,69.0,33.0,47.83
3,Sant Andreu,Sant Andreu,1965.0,923.0,46.97
60,Nou Barris,la Guineueta,493.0,228.0,46.25
58,Horta-Guinardó,la Font d'en Fargues,283.0,126.0,44.52
32,Nou Barris,Can Peguera,57.0,24.0,42.11
33,Nou Barris,Canyelles,140.0,40.0,28.57
62,Sants-Montjuïc,la Marina del Prat Vermell,32.0,9.0,28.12


In [16]:
# list of nbh, ordered by highest % of foreign immigrants
df_foreign.groupby("nbh").aggregate({"num_immi_foreign":"sum",
                             "population":"sum",
                            "perc_immi_foreign":"mean"}).sort_values("perc_immi_foreign", ascending=False)

Unnamed: 0_level_0,num_immi_foreign,population,perc_immi_foreign
nbh,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1
el Barri Gòtic,2205.0,16062.0,13.73
"Sant Pere, Santa Caterina i la Ribera",2228.0,22721.0,9.81
el Raval,4291.0,47608.0,9.01
la Barceloneta,1345.0,14996.0,8.97
el Besòs i el Maresme,1478.0,23009.0,6.42
...,...,...,...
Can Peguera,24.0,2271.0,1.06
la Marina del Prat Vermell,9.0,1149.0,0.78
Canyelles,40.0,6856.0,0.58
el Poble-sec,0.0,0.0,


In [17]:
# list of dist, ordered by highest % of foreign immigrants
df_foreign.groupby("dist").aggregate({"num_immi_foreign":"sum",
                             "population":"sum",
                            "perc_immi_foreign":"mean"}).sort_values("perc_immi_foreign", ascending=False)

Unnamed: 0_level_0,num_immi_foreign,population,perc_immi_foreign
dist,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1
Ciutat Vella,10069.0,101387.0,10.38
Eixample,12487.0,266416.0,4.716667
Sants-Montjuïc,7464.0,181910.0,3.5575
Sant Martí,7994.0,235513.0,3.425
Gràcia,4310.0,121347.0,3.242
Les Corts,2420.0,82033.0,3.17
Nou Barris,5261.0,166579.0,2.948462
Sant Andreu,3622.0,147594.0,2.802857
Horta-Guinardó,4544.0,168751.0,2.682727
Sarrià-Sant Gervasi,3802.0,149279.0,2.421667


### Third analysis

In [18]:
# Top 3 most frequent foreign nationalities by neighbourhood
immi_foreign.groupby('nbh', group_keys=False).apply(lambda x: x.nlargest(3, "num_immi"))

Unnamed: 0,dist,nbh,nationality,num_immi
649,Sant Andreu,Baró de Viver,Peru,7
279,Sant Andreu,Baró de Viver,Colombia,6
797,Sant Andreu,Baró de Viver,Argentina,6
107,Horta-Guinardó,Can Baró,Italy,27
255,Horta-Guinardó,Can Baró,Colombia,18
...,...,...,...,...
197,Nou Barris,les Roquetes,China,37
1159,Nou Barris,les Roquetes,Ecuador,33
911,Sarrià-Sant Gervasi,les Tres Torres,United States,59
2613,Sarrià-Sant Gervasi,les Tres Torres,Japan,34
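
The same "top N per group" selection can also be written without `groupby(...).apply`. A minimal sketch, assuming ties may be broken arbitrarily: sort once, then keep the first three rows of each group with `groupby(...).head(3)`. The Peru, Colombia and Argentina counts for *Baró de Viver* and the Italy and Colombia counts for *Can Baró* come from the output above; the remaining rows are hypothetical filler.

```python
import pandas as pd

# Toy rows: Morocco (Baró de Viver) and France/Peru (Can Baró) are hypothetical
toy = pd.DataFrame({
    "nbh": ["Baró de Viver"] * 4 + ["Can Baró"] * 4,
    "nationality": ["Peru", "Colombia", "Argentina", "Morocco",
                    "Italy", "Colombia", "France", "Peru"],
    "num_immi": [7, 6, 6, 2, 27, 18, 5, 3],
})

# Sort by neighbourhood, then by descending count, and keep 3 rows per neighbourhood
top3 = (
    toy.sort_values(["nbh", "num_immi"], ascending=[True, False])
       .groupby("nbh")
       .head(3)
)
print(top3)
```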


In [19]:
# Immigrants per district and nationality, including Spain
immi_dist = immi.groupby(["dist","nationality"])["num_immi"].sum()
immi_dist = immi_dist.to_frame()
immi_dist

Unnamed: 0_level_0,Unnamed: 1_level_0,num_immi
dist,nationality,Unnamed: 2_level_1
Ciutat Vella,Afghanistan,10
Ciutat Vella,Albania,9
Ciutat Vella,Algeria,75
Ciutat Vella,Andorra,1
Ciutat Vella,Angola,1
...,...,...
Sarrià-Sant Gervasi,United States,263
Sarrià-Sant Gervasi,Uruguay,13
Sarrià-Sant Gervasi,Uzbekistan,1
Sarrià-Sant Gervasi,Venezuela,215


In [20]:
# Top 5 most frequent immigrant nationality by District, including Spain
immi_dist.groupby('dist', group_keys=False).apply(lambda x: x.nlargest(5, "num_immi"))

Unnamed: 0_level_0,Unnamed: 1_level_0,num_immi
dist,nationality,Unnamed: 2_level_1
Ciutat Vella,Spain,2542
Ciutat Vella,Italy,1275
Ciutat Vella,Pakistan,998
Ciutat Vella,France,596
Ciutat Vella,Bangladesh,566
Eixample,Spain,6560
Eixample,Italy,1568
Eixample,China,918
Eixample,Colombia,781
Eixample,Venezuela,724


In [21]:
# Immigrants per district and nationality, not including Spain
immi_dist_foreign = immi_foreign.groupby(["dist","nationality"])["num_immi"].sum()
immi_dist_foreign = immi_dist_foreign.to_frame()
immi_dist_foreign

Unnamed: 0_level_0,Unnamed: 1_level_0,num_immi
dist,nationality,Unnamed: 2_level_1
Ciutat Vella,Afghanistan,10
Ciutat Vella,Albania,9
Ciutat Vella,Algeria,75
Ciutat Vella,Andorra,1
Ciutat Vella,Angola,1
...,...,...
Sarrià-Sant Gervasi,Venezuela,215
Sarrià-Sant Gervasi,Vietnam,4
Sarrià-Sant Gervasi,Yemen,0
Sarrià-Sant Gervasi,Zambia,0


In [22]:
# Top 3 most frequent immigrant nationalities by district, not including Spain
immi_dist_foreign.groupby('dist', group_keys=False).apply(lambda x: x.nlargest(3, "num_immi"))

Unnamed: 0_level_0,Unnamed: 1_level_0,num_immi
dist,nationality,Unnamed: 2_level_1
Ciutat Vella,Italy,1275
Ciutat Vella,Pakistan,998
Ciutat Vella,France,596
Eixample,Italy,1568
Eixample,China,918
Eixample,Colombia,781
Gràcia,Italy,598
Gràcia,France,277
Gràcia,Colombia,200
Horta-Guinardó,Italy,413


In [23]:
# Computing percentage of immigrants from each nationality vs total number of immigrants
immi["perc_immi"]=round((immi["num_immi"]/immi["num_immi"].sum())*100,2)
immi

Unnamed: 0,dist,nbh,nationality,num_immi,perc_immi
0,Ciutat Vella,el Raval,Spain,1109,1.14
1,Ciutat Vella,el Barri Gòtic,Spain,482,0.50
2,Ciutat Vella,la Barceloneta,Spain,414,0.43
3,Ciutat Vella,"Sant Pere, Santa Caterina i la Ribera",Spain,537,0.55
4,Eixample,el Fort Pienc,Spain,663,0.68
...,...,...,...,...,...
11724,Horta-Guinardó,el Baix Guinardó,No information,1,0.00
11726,Horta-Guinardó,el Guinardó,No information,1,0.00
11735,Nou Barris,Vilapicina i la Torre Llobeta,No information,1,0.00
11747,Nou Barris,Vallbona,No information,1,0.00


In [24]:
# Listing countries by percentage of immigrants vs total number of immigrants, including Spain
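# Select the numeric columns explicitly so the text columns (dist, nbh) are not summed
nationality_immi = immi.groupby("nationality")[["num_immi", "perc_immi"]].sum().sort_values("perc_immi", ascending=False)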
nationality_immi.head(20)

Unnamed: 0_level_0,num_immi,perc_immi
nationality,Unnamed: 1_level_1,Unnamed: 2_level_1
Spain,35354,36.3
Italy,6309,6.44
China,3299,3.4
Colombia,3255,3.32
Venezuela,3021,3.1
Pakistan,2967,3.01
Honduras,2767,2.85
France,2670,2.75
Peru,2473,2.54
Argentina,1885,1.99
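
One caveat with the table above: `perc_immi` was rounded to two decimals per row and then summed per nationality, so rows that round to 0.00 (such as the single-person "No information" rows) contribute nothing. A minimal sketch of the effect with hypothetical counts, comparing that approach with summing the counts first and rounding once:

```python
import pandas as pd

# Hypothetical counts: nationality "A" is spread over three tiny rows
toy = pd.DataFrame({
    "nationality": ["A", "A", "A", "B"],
    "num_immi": [4, 4, 4, 99988],
})
total = toy["num_immi"].sum()

# Approach used above: round per row, then sum the rounded values
toy["perc_immi"] = (toy["num_immi"] / total * 100).round(2)
summed_rounded = toy.groupby("nationality")["perc_immi"].sum()

# Alternative: sum the counts first, then compute and round the percentage once
counts = toy.groupby("nationality")["num_immi"].sum()
exact = (counts / total * 100).round(2)

print(summed_rounded["A"], exact["A"])
```

In this toy case the per-row rounding reports 0.0% for "A", while rounding once after aggregation gives 0.01%.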


In [25]:
immi_foreign["perc_immi"]=round((immi_foreign["num_immi"]/immi_foreign["num_immi"].sum())*100, 2)

In [26]:
# Listing countries by percentage of immigrants vs total number of immigrants, not including Spain
# (numeric columns are selected explicitly so the text columns are not summed)
nationality_immi_foreign = immi_foreign.groupby("nationality", as_index=False)[["num_immi", "perc_immi"]].sum().sort_values("perc_immi", ascending=False)
nationality_immi_foreign.head(10)

Unnamed: 0,nationality,num_immi,perc_immi
69,Italy,6309,10.18
29,China,3299,5.31
30,Colombia,3255,5.26
153,Venezuela,3021,4.91
110,Pakistan,2967,4.73
60,Honduras,2767,4.45
48,France,2670,4.26
113,Peru,2473,3.97
97,Morocco,1931,3.12
6,Argentina,1885,3.03


In [27]:
list_of_nationalities = list(nationality_immi_foreign.head(5)['nationality'])

In [28]:
immi_dist = immi.groupby(["dist","nationality"])["num_immi"].sum()
immi_dist = immi_dist.to_frame()
immi_dist

Unnamed: 0_level_0,Unnamed: 1_level_0,num_immi
dist,nationality,Unnamed: 2_level_1
Ciutat Vella,Afghanistan,10
Ciutat Vella,Albania,9
Ciutat Vella,Algeria,75
Ciutat Vella,Andorra,1
Ciutat Vella,Angola,1
...,...,...
Sarrià-Sant Gervasi,United States,263
Sarrià-Sant Gervasi,Uruguay,13
Sarrià-Sant Gervasi,Uzbekistan,1
Sarrià-Sant Gervasi,Venezuela,215


In [29]:
# Top 3 districts by num_immi for each of the top 5 foreign nationalities
immi_countries = immi.groupby(["nationality", "dist"], as_index=False)["num_immi"].sum()
immi_countries = immi_countries.loc[immi_countries.nationality.isin(list_of_nationalities)]
immi_countries = immi_countries.groupby("nationality", group_keys=False).apply(lambda x: x.nlargest(3, "num_immi"))
immi_countries

Unnamed: 0,nationality,dist,num_immi
209,China,Eixample,918
215,China,Sant Martí,606
216,China,Sants-Montjuïc,512
219,Colombia,Eixample,781
226,Colombia,Sants-Montjuïc,431
225,Colombia,Sant Martí,375
553,Italy,Eixample,1568
552,Italy,Ciutat Vella,1275
559,Italy,Sant Martí,798
831,Pakistan,Ciutat Vella,998


In [30]:
# Attaching each nationality's city-wide total so the district share can be computed
# num_immi_x = immigrants per district, by nationality
# num_immi_y = total immigrants, by nationality

immi_countries_perc = pd.merge(immi_countries, nationality_immi_foreign, on="nationality")
immi_countries_perc = immi_countries_perc[["nationality", "dist", "num_immi_x", "num_immi_y"]]
immi_countries_perc.columns = ["nationality", "dist", "num_immi_dist", "num_immi_total"]
immi_countries_perc

Unnamed: 0,nationality,dist,num_immi_dist,num_immi_total
0,China,Eixample,918,3299
1,China,Sant Martí,606,3299
2,China,Sants-Montjuïc,512,3299
3,Colombia,Eixample,781,3255
4,Colombia,Sants-Montjuïc,431,3255
5,Colombia,Sant Martí,375,3255
6,Italy,Eixample,1568,6309
7,Italy,Ciutat Vella,1275,6309
8,Italy,Sant Martí,798,6309
9,Pakistan,Ciutat Vella,998,2967


In [31]:
# Percentage of each nationality's immigrants living in the given district

immi_countries_perc["num_immi_%"] = round(immi_countries_perc["num_immi_dist"]/immi_countries_perc["num_immi_total"]*100, 2)
immi_countries_perc

Unnamed: 0,nationality,dist,num_immi_dist,num_immi_total,num_immi_%
0,China,Eixample,918,3299,27.83
1,China,Sant Martí,606,3299,18.37
2,China,Sants-Montjuïc,512,3299,15.52
3,Colombia,Eixample,781,3255,23.99
4,Colombia,Sants-Montjuïc,431,3255,13.24
5,Colombia,Sant Martí,375,3255,11.52
6,Italy,Eixample,1568,6309,24.85
7,Italy,Ciutat Vella,1275,6309,20.21
8,Italy,Sant Martí,798,6309,12.65
9,Pakistan,Ciutat Vella,998,2967,33.64


In [32]:
# Showing only % column
immi_countries_perc = immi_countries_perc[["nationality", "dist", "num_immi_%"]]
immi_countries_perc

Unnamed: 0,nationality,dist,num_immi_%
0,China,Eixample,27.83
1,China,Sant Martí,18.37
2,China,Sants-Montjuïc,15.52
3,Colombia,Eixample,23.99
4,Colombia,Sants-Montjuïc,13.24
5,Colombia,Sant Martí,11.52
6,Italy,Eixample,24.85
7,Italy,Ciutat Vella,20.21
8,Italy,Sant Martí,12.65
9,Pakistan,Ciutat Vella,33.64
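
The merge-then-divide steps above can also be collapsed into a single expression. A minimal sketch, assuming one row per nationality and district: `groupby(...).transform("sum")` broadcasts each nationality's total back onto its rows, so no second frame is needed. The district counts and nationality totals are copied from the tables above; the "Other districts" rows are a hypothetical bucket holding the remainder of each total.

```python
import pandas as pd

# Counts copied from the tables above; "Other districts" is a hypothetical
# bucket so that each nationality adds up to its city-wide total
counts = pd.DataFrame({
    "nationality": ["China", "China", "China", "Pakistan", "Pakistan"],
    "dist": ["Eixample", "Sant Martí", "Other districts",
             "Ciutat Vella", "Other districts"],
    "num_immi": [918, 606, 3299 - 918 - 606, 998, 2967 - 998],
})

# Total per nationality, repeated on every row of that nationality
nationality_total = counts.groupby("nationality")["num_immi"].transform("sum")

# Share of each nationality living in each district
counts["num_immi_%"] = (counts["num_immi"] / nationality_total * 100).round(2)
print(counts)
```

For the rows taken from the notebook this reproduces 27.83, 18.37 and 33.64.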


### Analyzing immigration and districts

In [117]:
population_ranked = pd.read_csv("../datasets/Data_filtered/Population_ranked.csv", index_col=0)
population_ranked = population_ranked[['dist', 'population', 'net_density(hab/ha)', 'mort_rate',
       'rent_price', 'facilities_sum', 'crime_rate']]
population_ranked

Unnamed: 0,dist,population,net_density(hab/ha),mort_rate,rent_price,facilities_sum,crime_rate
0,Ciutat Vella,101387,High,High,Neutral,Neutral,High
1,Eixample,266416,High,Neutral,Neutral,Neutral,High
2,Sant Martí,235513,High,Neutral,Neutral,Neutral,High
3,Sants-Montjuïc,181910,Neutral,Neutral,Neutral,Low,High
4,Gràcia,121347,Neutral,Neutral,Neutral,Neutral,Low
5,Horta-Guinardó,168751,Low,Neutral,Low,Neutral,Low
6,Les Corts,82033,Low,Low,High,High,Low
7,Nou Barris,166579,Neutral,Neutral,Low,Low,Neutral
8,Sant Andreu,147594,Neutral,Neutral,Low,Low,Neutral
9,Sarrià-Sant Gervasi,149279,Low,Neutral,High,Neutral,Neutral


In [118]:
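# df_foreign_compare is not defined in the cells shown; its columns match the district-level aggregate from In [17]
# Foreign immigrants as a share of each district's total population
df_foreign_compare["%immi_total_dist"] = round(df_foreign_compare["num_immi_foreign"] / df_foreign_compare["population"] * 100, 2)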

In [119]:
df_foreign_compare

Unnamed: 0_level_0,num_immi_foreign,population,perc_immi_foreign,%immi_total_dist
dist,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1
Ciutat Vella,10069.0,101387.0,10.38,9.93
Eixample,12487.0,266416.0,4.716667,4.69
Sants-Montjuïc,7464.0,181910.0,3.5575,4.1
Sant Martí,7994.0,235513.0,3.425,3.39
Gràcia,4310.0,121347.0,3.242,3.55
Les Corts,2420.0,82033.0,3.17,2.95
Nou Barris,5261.0,166579.0,2.948462,3.16
Sant Andreu,3622.0,147594.0,2.802857,2.45
Horta-Guinardó,4544.0,168751.0,2.682727,2.69
Sarrià-Sant Gervasi,3802.0,149279.0,2.421667,2.55


In [122]:
total_immi = immi_foreign_grouped.sum()
total_immi

num_immi            61973.00
%immi_total_immi       99.98
dtype: float64

In [123]:
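# Each district's share of all foreign immigrants, using the computed total (61973) instead of a hard-coded constant
immi_foreign_grouped["%immi_total_immi"] = immi_foreign_grouped["num_immi"] / immi_foreign_grouped["num_immi"].sum() * 100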

In [124]:
immi_foreign_grouped["%immi_total_immi"]=round(immi_foreign_grouped["%immi_total_immi"], 2)

In [125]:
immi_foreign_grouped

Unnamed: 0_level_0,num_immi,%immi_total_immi
dist,Unnamed: 1_level_1,Unnamed: 2_level_1
Eixample,12487,20.15
Ciutat Vella,10069,16.25
Sant Martí,7994,12.9
Sants-Montjuïc,7464,12.04
Nou Barris,5261,8.49
Horta-Guinardó,4544,7.33
Gràcia,4310,6.95
Sarrià-Sant Gervasi,3802,6.13
Sant Andreu,3622,5.84
Les Corts,2420,3.9


In [126]:
new_df = pd.merge(population_ranked, immi_foreign_grouped, on="dist", how="left")
new_df

Unnamed: 0,dist,population,net_density(hab/ha),mort_rate,rent_price,facilities_sum,crime_rate,num_immi,%immi_total_immi
0,Ciutat Vella,101387,High,High,Neutral,Neutral,High,10069,16.25
1,Eixample,266416,High,Neutral,Neutral,Neutral,High,12487,20.15
2,Sant Martí,235513,High,Neutral,Neutral,Neutral,High,7994,12.9
3,Sants-Montjuïc,181910,Neutral,Neutral,Neutral,Low,High,7464,12.04
4,Gràcia,121347,Neutral,Neutral,Neutral,Neutral,Low,4310,6.95
5,Horta-Guinardó,168751,Low,Neutral,Low,Neutral,Low,4544,7.33
6,Les Corts,82033,Low,Low,High,High,Low,2420,3.9
7,Nou Barris,166579,Neutral,Neutral,Low,Low,Neutral,5261,8.49
8,Sant Andreu,147594,Neutral,Neutral,Low,Low,Neutral,3622,5.84
9,Sarrià-Sant Gervasi,149279,Low,Neutral,High,Neutral,Neutral,3802,6.13


In [133]:
new_df1 = pd.merge(new_df, df_foreign_compare, on=["dist", "population"], how="left")
new_df1

Unnamed: 0,dist,population,net_density(hab/ha),mort_rate,rent_price,facilities_sum,crime_rate,num_immi,%immi_total_immi,num_immi_foreign,perc_immi_foreign,%immi_total_dist
0,Ciutat Vella,101387,High,High,Neutral,Neutral,High,10069,16.25,10069.0,10.38,9.93
1,Eixample,266416,High,Neutral,Neutral,Neutral,High,12487,20.15,12487.0,4.716667,4.69
2,Sant Martí,235513,High,Neutral,Neutral,Neutral,High,7994,12.9,7994.0,3.425,3.39
3,Sants-Montjuïc,181910,Neutral,Neutral,Neutral,Low,High,7464,12.04,7464.0,3.5575,4.1
4,Gràcia,121347,Neutral,Neutral,Neutral,Neutral,Low,4310,6.95,4310.0,3.242,3.55
5,Horta-Guinardó,168751,Low,Neutral,Low,Neutral,Low,4544,7.33,4544.0,2.682727,2.69
6,Les Corts,82033,Low,Low,High,High,Low,2420,3.9,2420.0,3.17,2.95
7,Nou Barris,166579,Neutral,Neutral,Low,Low,Neutral,5261,8.49,5261.0,2.948462,3.16
8,Sant Andreu,147594,Neutral,Neutral,Low,Low,Neutral,3622,5.84,3622.0,2.802857,2.45
9,Sarrià-Sant Gervasi,149279,Low,Neutral,High,Neutral,Neutral,3802,6.13,3802.0,2.421667,2.55


In [143]:
new_df1_good = new_df1[['dist', 'population', 'net_density(hab/ha)', 'mort_rate', 'rent_price',
       'facilities_sum', 'crime_rate', 'num_immi', '%immi_total_immi', '%immi_total_dist']]

In [145]:
new_df1_good

Unnamed: 0,dist,population,net_density(hab/ha),mort_rate,rent_price,facilities_sum,crime_rate,num_immi,%immi_total_immi,%immi_total_dist
0,Ciutat Vella,101387,High,High,Neutral,Neutral,High,10069,16.25,9.93
1,Eixample,266416,High,Neutral,Neutral,Neutral,High,12487,20.15,4.69
2,Sant Martí,235513,High,Neutral,Neutral,Neutral,High,7994,12.9,3.39
3,Sants-Montjuïc,181910,Neutral,Neutral,Neutral,Low,High,7464,12.04,4.1
4,Gràcia,121347,Neutral,Neutral,Neutral,Neutral,Low,4310,6.95,3.55
5,Horta-Guinardó,168751,Low,Neutral,Low,Neutral,Low,4544,7.33,2.69
6,Les Corts,82033,Low,Low,High,High,Low,2420,3.9,2.95
7,Nou Barris,166579,Neutral,Neutral,Low,Low,Neutral,5261,8.49,3.16
8,Sant Andreu,147594,Neutral,Neutral,Low,Low,Neutral,3622,5.84,2.45
9,Sarrià-Sant Gervasi,149279,Low,Neutral,High,Neutral,Neutral,3802,6.13,2.55



In [148]:
dist3 = pd.read_csv("../datasets/Data_filtered/data_extraFiltered.csv")

In [155]:
dist3 = dist3[['dist', 'mort_rate','facilities_rate', 'crime_rate']]

In [156]:
dist3

Unnamed: 0,dist,mort_rate,facilities_rate,crime_rate
0,Ciutat Vella,1002.95,0.37,272.956283
1,Eixample,794.48,0.36,285.104537
2,Gràcia,805.1,0.25,47.698402
3,Horta-Guinardó,871.53,0.21,48.562169
4,Les Corts,677.3,0.52,45.927682
5,Nou Barris,885.38,0.12,53.226506
6,Sant Andreu,819.19,0.2,65.751116
7,Sant Martí,777.4,0.23,138.307475
8,Sants-Montjuïc,827.7,0.19,124.746346
9,Sarrià-Sant Gervasi,821.88,0.31,57.082605


In [153]:
new_df1_good.columns

Index(['dist', 'population', 'net_density(hab/ha)', 'mort_rate', 'rent_price',
       'facilities_sum', 'crime_rate', 'num_immi', '%immi_total_immi',
       '%immi_total_dist'],
      dtype='object')

In [164]:
new_df1_good2 = new_df1_good[['dist', '%immi_total_immi','%immi_total_dist']]
new_df1_good2

Unnamed: 0,dist,%immi_total_immi,%immi_total_dist
0,Ciutat Vella,16.25,9.93
1,Eixample,20.15,4.69
2,Sant Martí,12.9,3.39
3,Sants-Montjuïc,12.04,4.1
4,Gràcia,6.95,3.55
5,Horta-Guinardó,7.33,2.69
6,Les Corts,3.9,2.95
7,Nou Barris,8.49,3.16
8,Sant Andreu,5.84,2.45
9,Sarrià-Sant Gervasi,6.13,2.55


In [165]:
crime_immi_compare = pd.merge(new_df1_good2, dist3, on="dist", how="outer")

In [168]:
crime_immi_compare

Unnamed: 0,dist,%immi_total_immi,%immi_total_dist,mort_rate,facilities_rate,crime_rate
0,Ciutat Vella,16.25,9.93,1002.95,0.37,272.956283
1,Eixample,20.15,4.69,794.48,0.36,285.104537
2,Sant Martí,12.9,3.39,777.4,0.23,138.307475
3,Sants-Montjuïc,12.04,4.1,827.7,0.19,124.746346
4,Gràcia,6.95,3.55,805.1,0.25,47.698402
5,Horta-Guinardó,7.33,2.69,871.53,0.21,48.562169
6,Les Corts,3.9,2.95,677.3,0.52,45.927682
7,Nou Barris,8.49,3.16,885.38,0.12,53.226506
8,Sant Andreu,5.84,2.45,819.19,0.2,65.751116
9,Sarrià-Sant Gervasi,6.13,2.55,821.88,0.31,57.082605


In [None]:
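# Bin rent_price into labels: "High" (>= 5), "Neutral" (strictly between 3 and 5), "Low" (everything else)
# Population is a DataFrame that is not defined in the cells shown
choices = ['High', 'Neutral']
conditions5 = [(Population.rent_price >= 5), ((Population.rent_price < 5) & (Population.rent_price > 3))]
Population.rent_price = np.select(conditions5, choices, default='Low')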
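
The binning in the cell above relies on `np.select`, which checks the conditions in order and assigns the matching choice, falling back to `default` when none applies. A minimal, self-contained sketch of the same thresholds on hypothetical scores (the `toy` frame and the `rent_level` column are illustrative names):

```python
import numpy as np
import pandas as pd

# Hypothetical scores on the same scale as the thresholds above
toy = pd.DataFrame({"rent_price": [6, 5, 4, 3, 1]})

conditions = [
    toy["rent_price"] >= 5,                             # -> "High"
    (toy["rent_price"] < 5) & (toy["rent_price"] > 3),  # -> "Neutral"
]
choices = ["High", "Neutral"]

# Rows matching no condition (rent_price <= 3) get the default label
toy["rent_level"] = np.select(conditions, choices, default="Low")
print(toy)
```

Writing the labels to a new column, as here, keeps the numeric scores available for later checks; the cell above overwrites `rent_price` instead.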

In [169]:
crime_immi_compare.to_csv("../datasets/Data_filtered/comparison_table.csv")