## Questions to answer:
- What could define a better or worse neighbourhood?
- Does the quality (bad, normal, good, etc.) of a neighbourhood relate to its death rate?
- Do the neighbourhoods with more births or a larger population have more schools or hospitals? What about the other facilities (parks, sports centers, etc.)?

### Values to analyse:
- **Define Nbh quality**:
    - Are some facilities more valuable than others?
        - top 5 for each facility
        - worst 5 for each facility
        - top 5 **Nbh** and **Dist** with the highest number of facilities
        - worst 5 **Nbh** and **Dist**
<br><br>  
- **Do Nbh and Dist with higher birth rates have more facilities? And schools in particular?**
    - Top 5 Nbh and Dist with highest birth rates
    - Facilities number
    - schools number
<br><br>  
- **Do Nbh and Dist with a larger population have more facilities? And schools in particular?**
    - Top 5 Nbh and Dist with highest population
    - Facilities number
    - schools number

### Variable descriptions:
- **nbh**: Barcelona's Neighbourhoods
- **bars**: number of bars in 2017
- **children_places**: number of places dedicated to children  in 2017
- **cinemas_theatres**: number of cinemas or theatres  in 2017
- **schools**: number of schools in 2017
- **pre-schools**: number of pre-schools in 2017
- **hospitals**: number of hospitals in 2017
- **libraries_museums**: number of libraries and museums in 2017
- **park_gardens**: number of parks and gardens in 2017
- **sport_centers**: number of sports centers in 2017
- **population**: number of people in 2017
- **net_density(hab/ha)**: net density (inhabitants per habitable hectare) in 2017
- **avg_occupation**: average number of people living in a house
- **dist**: Barcelona's district
- **num_immi**: number of immigrants in 2017
- **mort_rate**: rate of people that died in 2017
- **rent_price**: price of rent per m2 in 2017
- **num_crimes**: number of crimes during 2017
- **crime_rate**: crimes per 10000 hab in 2017
- **facilities**: children_places, schools, pre-schools, hospitals, cinemas_theatres, libraries_museums, park_gardens
- **facilities_sum**: sum of facilities per 10000 hab

In [133]:
#import libraries to work
import pandas as pd
import numpy as np

In [134]:
#import the cleaned dataset

Complete=pd.read_csv("../datasets/Data_filtered/complete_dataset.csv")

In [214]:
#Have a quick look

Complete.head()

Unnamed: 0,nbh,bars,children_places,cinemas_theatres,schools,pre-schools,hospitals,libraries_theatres,park_gardens,sport_centers,...,num_crimes,children_places_pop,cinemas_theatres_pop,schools_pop,pre-schools_pop,hospitals_pop,libraries_theatres_pop,park_gardens_pop,sport_centers_pop,facilities_pop
0,Horta,4.0,9.0,3.0,11.0,7.0,3.0,5.0,3.0,2.0,...,7871.0,0.336889,0.112296,0.411754,0.262025,0.112296,0.187161,0.112296,0.074864,0.168445
1,Navas,2.0,3.0,,6.0,3.0,,,,,...,10657.0,0.135569,,0.271137,0.135569,,,,,
2,Pedralbes,1.0,7.0,,8.0,41.0,,10.0,9.0,1.0,...,7444.0,0.579662,,0.662471,3.395164,,0.828089,0.74528,0.082809,
3,Sant Andreu,3.0,22.0,5.0,22.0,17.0,5.0,2.0,5.0,3.0,...,10657.0,0.38473,0.087439,0.38473,0.297291,0.087439,0.034975,0.087439,0.052463,0.139902
4,Sant Antoni,9.0,8.0,4.0,11.0,11.0,1.0,5.0,7.0,1.0,...,46210.0,0.208632,0.104316,0.286869,0.286869,0.026079,0.130395,0.182553,0.026079,0.120615


In [136]:
#Eliminate the index Unnamed:0

Complete=Complete.drop(["Unnamed: 0"],axis=1)
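The `Unnamed: 0` column appears when a CSV has been written together with its index. A minimal sketch of two ways to avoid the extra `drop` step; the in-memory buffers and the toy frame are assumptions for illustration, not the project's files:

```python
import io
import pandas as pd

# Toy frame standing in for the real dataset (two rows from the preview above).
toy = pd.DataFrame({"nbh": ["Horta", "Navas"], "bars": [4.0, 2.0]})

# Option 1: write without the index, so no "Unnamed: 0" column is created.
buf = io.StringIO()
toy.to_csv(buf, index=False)
buf.seek(0)
clean = pd.read_csv(buf)

# Option 2: the file already contains the index; read it back as the index.
buf2 = io.StringIO()
toy.to_csv(buf2)
buf2.seek(0)
restored = pd.read_csv(buf2, index_col=0)
```

Either way the resulting frame has only the `nbh` and `bars` columns.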

In [137]:
#Make a copy to modify stuff

Complete.to_csv("../datasets/Data_filtered/Analysis_M.csv",index=False)

***

In [138]:
#Import the copy

Analysis=pd.read_csv("../datasets/Data_filtered/Analysis_M.csv")


In [139]:
Analysis.columns

Index(['nbh', 'bars', 'children_places', 'cinemas_theatres', 'schools',
       'pre-schools', 'hospitals', 'libraries_theatres', 'park_gardens',
       'sport_centers', 'population', 'net_density(hab/ha)', 'avg_occupation',
       'dist', 'num_immi', 'mort_rate', 'rent_price', 'num_crimes',
       'children_places_pop', 'cinemas_theatres_pop', 'schools_pop',
       'pre-schools_pop', 'hospitals_pop', 'libraries_theatres_pop',
       'park_gardens_pop', 'sport_centers_pop', 'facilities_pop'],
      dtype='object')

In [170]:
#Creation of a dataset with fewer features

Analysis=Analysis.drop(['children_places_pop', 'cinemas_theatres_pop', 'schools_pop',
       'pre-schools_pop', 'hospitals_pop', 'libraries_theatres_pop',
       'park_gardens_pop', 'sport_centers_pop', 'facilities_pop'], axis=1)

In [141]:
#changing the variables order

Analysis=Analysis[['dist','nbh','population', 'net_density(hab/ha)','num_immi',
                  'bars', 'children_places','cinemas_theatres', 'schools','pre-schools',
                  'hospitals', 'libraries_theatres','park_gardens','sport_centers',
                  'avg_occupation','mort_rate', 'rent_price', 'num_crimes']]

In [142]:
#Renaming the column libraries (wrong name)

Analysis=Analysis.rename(columns={"libraries_theatres":"libraries_museums"})

In [143]:
Analysis.head(5)

Unnamed: 0,dist,nbh,population,net_density(hab/ha),num_immi,bars,children_places,cinemas_theatres,schools,pre-schools,hospitals,libraries_museums,park_gardens,sport_centers,avg_occupation,mort_rate,rent_price,num_crimes
0,Horta-Guinardó,Horta,26715.0,422.0,1127.0,4.0,9.0,3.0,11.0,7.0,3.0,5.0,3.0,2.0,2.5,805.5,2.97,7871.0
1,Sant Andreu,Navas,22129.0,984.0,988.0,2.0,3.0,,6.0,3.0,,,,,2.5,696.0,3.029,10657.0
2,Les Corts,Pedralbes,12076.0,147.0,764.0,1.0,7.0,,8.0,41.0,,10.0,9.0,1.0,2.9,661.6,6.34,7444.0
3,Sant Andreu,Sant Andreu,57183.0,746.0,1965.0,3.0,22.0,5.0,22.0,17.0,5.0,2.0,5.0,3.0,2.4,763.0,3.21,10657.0
4,Eixample,Sant Antoni,38345.0,928.0,2490.0,9.0,8.0,4.0,11.0,11.0,1.0,5.0,7.0,1.0,2.3,737.3,4.591,46210.0


In [144]:
#Step 1: Define Nbh quality
#Are there facilities more valuable than others?:
#top 5 for each facility
#worst 5
#top 5 Nbh and Dist with higher number of facilities
#worst 5

In [145]:
Analysis["dist"].describe()
#There are 10 dist

count             73
unique            10
top       Nou Barris
freq              13
Name: dist, dtype: object

In [146]:
#Everything needs to be in float

Analysis["num_crimes"]=Analysis["num_crimes"].astype("float")

In [147]:
Analysis["rent_price"]=Analysis["rent_price"].astype("float")

In [148]:
Analysis["mort_rate"]=Analysis["mort_rate"].astype("float")

In [149]:
#just to make sure all went as expected
Analysis.dtypes

dist                    object
nbh                     object
population             float64
net_density(hab/ha)    float64
num_immi               float64
bars                   float64
children_places        float64
cinemas_theatres       float64
schools                float64
pre-schools            float64
hospitals              float64
libraries_museums      float64
park_gardens           float64
sport_centers          float64
avg_occupation         float64
mort_rate              float64
rent_price             float64
num_crimes             float64
dtype: object

In [162]:
#Information by dist (rounded)

MeanDist=Analysis.groupby("dist").aggregate({"mean"}).round(decimals=2)

In [163]:
MeanDist.head()
MeanDist.columns

MultiIndex([(         'population', 'mean'),
            ('net_density(hab/ha)', 'mean'),
            (           'num_immi', 'mean'),
            (               'bars', 'mean'),
            (    'children_places', 'mean'),
            (   'cinemas_theatres', 'mean'),
            (            'schools', 'mean'),
            (        'pre-schools', 'mean'),
            (          'hospitals', 'mean'),
            (  'libraries_museums', 'mean'),
            (       'park_gardens', 'mean'),
            (      'sport_centers', 'mean'),
            (     'avg_occupation', 'mean'),
            (          'mort_rate', 'mean'),
            (         'rent_price', 'mean'),
            (         'num_crimes', 'mean')],
           )

In [164]:
# renaming the columns

MeanDist.columns =['population', 'net_density(hab/ha)', 'num_immi', 'bars',
       'children_places', 'cinemas_theatres', 'schools', 'pre-schools',
       'hospitals', 'libraries_museums', 'park_gardens', 'sport_centers',
       'avg_occupation', 'mort_rate', 'rent_price', 'num_crimes']
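The `aggregate({"mean"})` call produces a two-level column index that then has to be renamed by hand, and on recent pandas versions it may fail on the text column `nbh`. A hedged alternative, shown on a toy frame (the frame and the name `mean_by_dist` are assumptions for illustration), is to call `mean(numeric_only=True)` directly:

```python
import pandas as pd

# Toy frame: three neighbourhoods from the preview above, two numeric columns.
toy = pd.DataFrame({
    "dist": ["Sant Andreu", "Sant Andreu", "Eixample"],
    "nbh": ["Navas", "Sant Andreu", "Sant Antoni"],
    "population": [22129.0, 57183.0, 38345.0],
    "schools": [6.0, 22.0, 11.0],
})

# numeric_only=True skips the text column; the result keeps flat column names,
# so no manual renaming of a MultiIndex is needed afterwards.
mean_by_dist = toy.groupby("dist").mean(numeric_only=True).round(2)
```

The result is indexed by `dist` and has one plain column per numeric variable.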

In [165]:
MeanDist.head()

Unnamed: 0_level_0,population,net_density(hab/ha),num_immi,bars,children_places,cinemas_theatres,schools,pre-schools,hospitals,libraries_museums,park_gardens,sport_centers,avg_occupation,mort_rate,rent_price,num_crimes
dist,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1,Unnamed: 11_level_1,Unnamed: 12_level_1,Unnamed: 13_level_1,Unnamed: 14_level_1,Unnamed: 15_level_1,Unnamed: 16_level_1
Ciutat Vella,25346.75,809.25,3152.75,28.75,5.0,9.0,7.5,15.5,2.0,17.75,3.5,2.0,2.45,1002.95,4.81,44241.0
Eixample,44402.67,800.33,3174.5,15.0,9.6,3.2,12.8,15.0,2.4,6.6,9.0,1.25,2.38,794.48,4.8,46210.0
Gràcia,24269.4,622.2,1450.8,15.5,5.4,9.5,9.6,8.2,3.75,2.5,2.2,2.6,2.34,805.1,3.98,7731.0
Horta-Guinardó,15341.0,559.18,709.0,2.5,7.2,3.0,7.4,7.33,2.4,3.0,3.55,1.71,2.43,871.53,2.92,7871.0
Les Corts,27344.33,528.67,1458.33,4.33,8.67,3.0,12.33,37.33,3.5,8.33,10.67,2.0,2.6,677.3,5.17,7444.0


In [166]:
#saving the dataframe with the mean values per dist and not the info per nbh

MeanDist.to_csv("../datasets/Data_filtered/Analysis_M_dist.csv")

In [167]:
#importing the dataset

Dist=pd.read_csv("../datasets/Data_filtered/Analysis_M_dist.csv")

In [173]:
Dist.head()

Unnamed: 0,dist,population,net_density(hab/ha),num_immi,children_places,cinemas_theatres,schools,pre-schools,hospitals,libraries_museums,park_gardens,sport_centers,avg_occupation,mort_rate,rent_price,num_crimes
0,Ciutat Vella,25346.75,809.25,3152.75,5.0,9.0,7.5,15.5,2.0,17.75,3.5,2.0,2.45,1002.95,4.81,44241.0
1,Eixample,44402.67,800.33,3174.5,9.6,3.2,12.8,15.0,2.4,6.6,9.0,1.25,2.38,794.48,4.8,46210.0
2,Gràcia,24269.4,622.2,1450.8,5.4,9.5,9.6,8.2,3.75,2.5,2.2,2.6,2.34,805.1,3.98,7731.0
3,Horta-Guinardó,15341.0,559.18,709.0,7.2,3.0,7.4,7.33,2.4,3.0,3.55,1.71,2.43,871.53,2.92,7871.0
4,Les Corts,27344.33,528.67,1458.33,8.67,3.0,12.33,37.33,3.5,8.33,10.67,2.0,2.6,677.3,5.17,7444.0


In [171]:
#I am going to drop the bars and sport_centers columns because they don't make sense, probably due to the little info available.
#the values in certain districts are too low

Dist=Dist.drop(['bars'], axis=1)

In [216]:
Dist=Dist.drop(['sport_centers'], axis=1)

In [195]:
#let's have a look at the max values for each variable per dist. function returns max value and name
def max_value (column):
    Max = Dist[column].max()
    index = Dist[column].idxmax()
    name = Dist.loc[index, 'dist']
    return (name,Max)
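`max_value` handles one column at a time and reads the global `Dist`. As a sketch only, the same idea can be applied to every numeric column of a frame passed as an argument; the helper name `max_per_column` and the toy frame are hypothetical:

```python
import pandas as pd

# Toy frame with three districts and two numeric columns from the output above.
toy = pd.DataFrame({
    "dist": ["Ciutat Vella", "Eixample", "Les Corts"],
    "population": [25346.75, 44402.67, 27344.33],
    "rent_price": [4.81, 4.80, 5.17],
})

def max_per_column(df, label_col="dist"):
    """Return {column: (label, max value)} for every numeric column of df."""
    numeric = df.select_dtypes("number")
    return {
        col: (df.loc[numeric[col].idxmax(), label_col], numeric[col].max())
        for col in numeric.columns
    }

summary = max_per_column(toy)
```

`summary["rent_price"]` then holds the district name and the maximum rent price as a pair.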

In [217]:
Dist.columns

Index(['dist', 'population', 'net_density(hab/ha)', 'num_immi',
       'children_places', 'cinemas_theatres', 'schools', 'pre-schools',
       'hospitals', 'libraries_museums', 'park_gardens', 'avg_occupation',
       'mort_rate', 'rent_price', 'num_crimes'],
      dtype='object')

In [209]:
columns=Dist.columns.tolist()

In [184]:
Dist["population"].idxmax()

1

In [None]:
# Criteria for the best dist:
# population      -> not a defining factor
# net_density     -> doesn't really apply
# num_immi        -> not a defining factor
# avg_occupation  -> does not apply
# rent_price      -> higher prices?
#
# num_crimes      -> low
# facilities      -> high

In [219]:
Dist.head()

Unnamed: 0,dist,population,net_density(hab/ha),num_immi,children_places,cinemas_theatres,schools,pre-schools,hospitals,libraries_museums,park_gardens,avg_occupation,mort_rate,rent_price,num_crimes
0,Ciutat Vella,25346.75,809.25,3152.75,5.0,9.0,7.5,15.5,2.0,17.75,3.5,2.45,1002.95,4.81,44241.0
1,Eixample,44402.67,800.33,3174.5,9.6,3.2,12.8,15.0,2.4,6.6,9.0,2.38,794.48,4.8,46210.0
2,Gràcia,24269.4,622.2,1450.8,5.4,9.5,9.6,8.2,3.75,2.5,2.2,2.34,805.1,3.98,7731.0
3,Horta-Guinardó,15341.0,559.18,709.0,7.2,3.0,7.4,7.33,2.4,3.0,3.55,2.43,871.53,2.92,7871.0
4,Les Corts,27344.33,528.67,1458.33,8.67,3.0,12.33,37.33,3.5,8.33,10.67,2.6,677.3,5.17,7444.0


In [215]:
max_value("net_density(hab/ha)")

('Sant Martí', 877.2)

In [233]:
Dist["population"].sum()

241772.66000000003

In [280]:
#Is having many libraries and museums good? And cinemas and theatres?
#1620809 is the total city population (the sum of the ten district populations)
Dist["facilities_rate"]=(
                    ((Dist["children_places"]/1620809) + (Dist["cinemas_theatres"]/1620809)
                    + (Dist["schools"]/1620809) + (Dist["pre-schools"]/1620809)
                    + (Dist["hospitals"]/1620809) + (Dist["libraries_museums"]/1620809)
                    + (Dist["park_gardens"]/1620809))*10000
                    ).round(decimals=2)

In [281]:
Dist["facilities_rate"] #sum of the seven facility counts, expressed per 10000 hab of the whole city

0    0.37
1    0.36
2    0.25
3    0.21
4    0.52
5    0.12
6    0.20
7    0.23
8    0.19
9    0.31
Name: facilities_rate, dtype: float64
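The same rate can be written more compactly by summing the facility columns first and dividing once. The sketch below rebuilds two rows from the district table; the names `CITY_POPULATION`, `facility_cols` and `toy` are assumptions introduced for illustration:

```python
import pandas as pd

CITY_POPULATION = 1620809  # denominator used in the cell above
facility_cols = ["children_places", "cinemas_theatres", "schools", "pre-schools",
                 "hospitals", "libraries_museums", "park_gardens"]

# Two rows copied from the district means (Ciutat Vella and Les Corts).
toy = pd.DataFrame(
    [["Ciutat Vella", 5.0, 9.0, 7.5, 15.5, 2.0, 17.75, 3.5],
     ["Les Corts", 8.67, 3.0, 12.33, 37.33, 3.5, 8.33, 10.67]],
    columns=["dist"] + facility_cols,
)

# Sum the facility columns row-wise, then scale to "per 10000 hab".
toy["facilities_rate"] = (
    toy[facility_cols].sum(axis=1) / CITY_POPULATION * 10000
).round(2)
```

Note that `sum(axis=1)` skips missing values, whereas the chained `+` in the original cell would propagate them; the district means here contain none, so both forms agree.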

In [282]:
Dist

Unnamed: 0,dist,population,net_density(hab/ha),num_immi,children_places,cinemas_theatres,schools,pre-schools,hospitals,libraries_museums,park_gardens,avg_occupation,mort_rate,rent_price,num_crimes,facilities_rate
0,Ciutat Vella,25346.75,809.25,3152.75,5.0,9.0,7.5,15.5,2.0,17.75,3.5,2.45,1002.95,4.81,44241.0,0.37
1,Eixample,44402.67,800.33,3174.5,9.6,3.2,12.8,15.0,2.4,6.6,9.0,2.38,794.48,4.8,46210.0,0.36
2,Gràcia,24269.4,622.2,1450.8,5.4,9.5,9.6,8.2,3.75,2.5,2.2,2.34,805.1,3.98,7731.0,0.25
3,Horta-Guinardó,15341.0,559.18,709.0,7.2,3.0,7.4,7.33,2.4,3.0,3.55,2.43,871.53,2.92,7871.0,0.21
4,Les Corts,27344.33,528.67,1458.33,8.67,3.0,12.33,37.33,3.5,8.33,10.67,2.6,677.3,5.17,7444.0,0.52
5,Nou Barris,12813.77,669.46,636.46,5.31,1.0,4.92,4.23,1.0,1.17,1.86,2.59,885.38,2.29,8627.0,0.12
6,Sant Andreu,21084.86,762.86,905.0,7.86,2.33,8.57,6.71,2.25,1.8,2.4,2.59,819.19,2.83,10657.0,0.2
7,Sant Martí,23551.3,877.2,1272.0,9.6,2.57,8.8,8.9,1.22,2.0,3.44,2.54,777.4,3.85,22417.0,0.23
8,Sants-Montjuïc,22738.75,757.5,1460.38,5.0,2.75,8.0,7.71,2.75,1.8,3.5,2.48,827.7,3.41,20219.0,0.19
9,Sarrià-Sant Gervasi,24879.83,345.17,1204.5,7.17,2.5,11.83,14.33,4.67,3.67,6.67,2.77,821.88,5.27,9252.0,0.31


In [283]:
#eliminate the facilities columns not needed
Dist2=Dist.drop(['children_places', 'cinemas_theatres', 'schools', 'pre-schools',
       'hospitals', 'libraries_museums', 'park_gardens'],axis=1)


In [284]:
Dist2.head()

Unnamed: 0,dist,population,net_density(hab/ha),num_immi,avg_occupation,mort_rate,rent_price,num_crimes,facilities_rate
0,Ciutat Vella,25346.75,809.25,3152.75,2.45,1002.95,4.81,44241.0,0.37
1,Eixample,44402.67,800.33,3174.5,2.38,794.48,4.8,46210.0,0.36
2,Gràcia,24269.4,622.2,1450.8,2.34,805.1,3.98,7731.0,0.25
3,Horta-Guinardó,15341.0,559.18,709.0,2.43,871.53,2.92,7871.0,0.21
4,Les Corts,27344.33,528.67,1458.33,2.6,677.3,5.17,7444.0,0.52


In [295]:
#The population column above is the mean per nbh, not the district total, so it is replaced with the district totals

pop=[101387,266416,181910,82033,149279,121347,168751,166579,147594,235513]
district=["Ciutat Vella","Eixample","Sants-Montjuïc","Les Corts",
         "Sarrià-Sant Gervasi","Gràcia","Horta-Guinardó","Nou Barris","Sant Andreu","Sant Martí"]
my_dict={"dist":district,"population":pop}

In [296]:
population=pd.DataFrame(my_dict)

In [297]:
population.head()

Unnamed: 0,dist,population
0,Ciutat Vella,101387
1,Eixample,266416
2,Sants-Montjuïc,181910
3,Les Corts,82033
4,Sarrià-Sant Gervasi,149279


In [301]:
Dist2=Dist2.drop(["population"],axis=1)

In [302]:
Dist3 = pd.merge(Dist2,population, how="right", on="dist")

In [303]:
Dist3

Unnamed: 0,dist,net_density(hab/ha),num_immi,avg_occupation,mort_rate,rent_price,num_crimes,facilities_rate,population
0,Ciutat Vella,809.25,3152.75,2.45,1002.95,4.81,44241.0,0.37,101387
1,Eixample,800.33,3174.5,2.38,794.48,4.8,46210.0,0.36,266416
2,Gràcia,622.2,1450.8,2.34,805.1,3.98,7731.0,0.25,121347
3,Horta-Guinardó,559.18,709.0,2.43,871.53,2.92,7871.0,0.21,168751
4,Les Corts,528.67,1458.33,2.6,677.3,5.17,7444.0,0.52,82033
5,Nou Barris,669.46,636.46,2.59,885.38,2.29,8627.0,0.12,166579
6,Sant Andreu,762.86,905.0,2.59,819.19,2.83,10657.0,0.2,147594
7,Sant Martí,877.2,1272.0,2.54,777.4,3.85,22417.0,0.23,235513
8,Sants-Montjuïc,757.5,1460.38,2.48,827.7,3.41,20219.0,0.19,181910
9,Sarrià-Sant Gervasi,345.17,1204.5,2.77,821.88,5.27,9252.0,0.31,149279
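Because the district names in `population` were typed by hand, a misspelling would silently produce missing values after the merge. A minimal sketch of a safer merge using the `validate` and `indicator` options of `pd.merge`; the toy frames are assumptions for illustration:

```python
import pandas as pd

left = pd.DataFrame({"dist": ["Ciutat Vella", "Eixample"],
                     "rent_price": [4.81, 4.80]})
right = pd.DataFrame({"dist": ["Ciutat Vella", "Eixample"],
                      "population": [101387, 266416]})

# validate raises if a district appears more than once on either side;
# indicator adds a "_merge" column telling where each row came from.
merged = pd.merge(left, right, on="dist", how="outer",
                  validate="one_to_one", indicator=True)

# Rows that did not match on both sides (e.g. because of a typo in a name).
unmatched = merged[merged["_merge"] != "both"]
```

An empty `unmatched` frame means every district was found in both tables.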


In [308]:
#moving population
Dist3=Dist3[['dist','population', 'net_density(hab/ha)', 'num_immi', 'avg_occupation',
       'mort_rate', 'rent_price', 'num_crimes', 'facilities_rate',
       ]]

In [342]:
Dist3.sort_values("num_crimes",ascending=False)
#highest crime: Eixample and Ciutat Vella (far ahead of the third, Sant Martí)
#lowest: Les Corts and Gràcia, then Horta-Guinardó (little difference among the three)

Unnamed: 0,dist,population,net_density(hab/ha),num_immi,avg_occupation,mort_rate,rent_price,num_crimes,facilities_rate,crime_rate
1,Eixample,266416,800.33,3174.5,2.38,794.48,4.8,46210.0,0.36,285.104537
0,Ciutat Vella,101387,809.25,3152.75,2.45,1002.95,4.81,44241.0,0.37,272.956283
7,Sant Martí,235513,877.2,1272.0,2.54,777.4,3.85,22417.0,0.23,138.307475
8,Sants-Montjuïc,181910,757.5,1460.38,2.48,827.7,3.41,20219.0,0.19,124.746346
6,Sant Andreu,147594,762.86,905.0,2.59,819.19,2.83,10657.0,0.2,65.751116
9,Sarrià-Sant Gervasi,149279,345.17,1204.5,2.77,821.88,5.27,9252.0,0.31,57.082605
5,Nou Barris,166579,669.46,636.46,2.59,885.38,2.29,8627.0,0.12,53.226506
3,Horta-Guinardó,168751,559.18,709.0,2.43,871.53,2.92,7871.0,0.21,48.562169
2,Gràcia,121347,622.2,1450.8,2.34,805.1,3.98,7731.0,0.25,47.698402
4,Les Corts,82033,528.67,1458.33,2.6,677.3,5.17,7444.0,0.52,45.927682
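The `crime_rate` column shows up in this output, but the cell that created it is not part of this export. The values are consistent with dividing `num_crimes` by the city-wide population (1620809) and multiplying by 10000; the sketch below reproduces two rows under that assumption:

```python
import pandas as pd

CITY_POPULATION = 1620809  # assumption: same denominator as facilities_rate

toy = pd.DataFrame({
    "dist": ["Eixample", "Les Corts"],
    "num_crimes": [46210.0, 7444.0],
})

# Crimes per 10000 inhabitants of the whole city.
toy["crime_rate"] = toy["num_crimes"] / CITY_POPULATION * 10000
```

With a constant denominator, `crime_rate` is just `num_crimes` rescaled, so the two columns rank the districts identically. Dividing by each district's own `population` instead would give a per-district rate and could change the ranking.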


In [343]:
Dist3.sort_values("facilities_rate",ascending=False)
#higher: Les Corts, Ciutat Vella, Eixample
#lower: Sant Andreu, Sants-Montjuïc, Nou Barris

Unnamed: 0,dist,population,net_density(hab/ha),num_immi,avg_occupation,mort_rate,rent_price,num_crimes,facilities_rate,crime_rate
4,Les Corts,82033,528.67,1458.33,2.6,677.3,5.17,7444.0,0.52,45.927682
0,Ciutat Vella,101387,809.25,3152.75,2.45,1002.95,4.81,44241.0,0.37,272.956283
1,Eixample,266416,800.33,3174.5,2.38,794.48,4.8,46210.0,0.36,285.104537
9,Sarrià-Sant Gervasi,149279,345.17,1204.5,2.77,821.88,5.27,9252.0,0.31,57.082605
2,Gràcia,121347,622.2,1450.8,2.34,805.1,3.98,7731.0,0.25,47.698402
7,Sant Martí,235513,877.2,1272.0,2.54,777.4,3.85,22417.0,0.23,138.307475
3,Horta-Guinardó,168751,559.18,709.0,2.43,871.53,2.92,7871.0,0.21,48.562169
6,Sant Andreu,147594,762.86,905.0,2.59,819.19,2.83,10657.0,0.2,65.751116
8,Sants-Montjuïc,181910,757.5,1460.38,2.48,827.7,3.41,20219.0,0.19,124.746346
5,Nou Barris,166579,669.46,636.46,2.59,885.38,2.29,8627.0,0.12,53.226506


In [344]:
Dist3.sort_values("rent_price",ascending=False)
#higher: Sarrià-Sant Gervasi, Les Corts, Ciutat Vella
#lower: Horta-Guinardó, Sant Andreu, Nou Barris

Unnamed: 0,dist,population,net_density(hab/ha),num_immi,avg_occupation,mort_rate,rent_price,num_crimes,facilities_rate,crime_rate
9,Sarrià-Sant Gervasi,149279,345.17,1204.5,2.77,821.88,5.27,9252.0,0.31,57.082605
4,Les Corts,82033,528.67,1458.33,2.6,677.3,5.17,7444.0,0.52,45.927682
0,Ciutat Vella,101387,809.25,3152.75,2.45,1002.95,4.81,44241.0,0.37,272.956283
1,Eixample,266416,800.33,3174.5,2.38,794.48,4.8,46210.0,0.36,285.104537
2,Gràcia,121347,622.2,1450.8,2.34,805.1,3.98,7731.0,0.25,47.698402
7,Sant Martí,235513,877.2,1272.0,2.54,777.4,3.85,22417.0,0.23,138.307475
8,Sants-Montjuïc,181910,757.5,1460.38,2.48,827.7,3.41,20219.0,0.19,124.746346
3,Horta-Guinardó,168751,559.18,709.0,2.43,871.53,2.92,7871.0,0.21,48.562169
6,Sant Andreu,147594,762.86,905.0,2.59,819.19,2.83,10657.0,0.2,65.751116
5,Nou Barris,166579,669.46,636.46,2.59,885.38,2.29,8627.0,0.12,53.226506


In [368]:
Dist3[["dist","crime_rate","facilities_rate","net_density(hab/ha)","num_immi","mort_rate"]].sort_values("mort_rate",ascending=False)
#higher: Ciutat Vella, Nou Barris, Horta-Guinardó
#lower: Eixample, Sant Martí, Les Corts

Unnamed: 0,dist,crime_rate,facilities_rate,net_density(hab/ha),num_immi,mort_rate
0,Ciutat Vella,272.956283,0.37,809.25,3152.75,1002.95
5,Nou Barris,53.226506,0.12,669.46,636.46,885.38
3,Horta-Guinardó,48.562169,0.21,559.18,709.0,871.53
8,Sants-Montjuïc,124.746346,0.19,757.5,1460.38,827.7
9,Sarrià-Sant Gervasi,57.082605,0.31,345.17,1204.5,821.88
6,Sant Andreu,65.751116,0.2,762.86,905.0,819.19
2,Gràcia,47.698402,0.25,622.2,1450.8,805.1
1,Eixample,285.104537,0.36,800.33,3174.5,794.48
7,Sant Martí,138.307475,0.23,877.2,1272.0,777.4
4,Les Corts,45.927682,0.52,528.67,1458.33,677.3


In [364]:
Dist3[["dist","crime_rate","facilities_rate","net_density(hab/ha)"]].sort_values("net_density(hab/ha)",ascending=False)

Unnamed: 0,dist,crime_rate,facilities_rate,net_density(hab/ha)
7,Sant Martí,138.307475,0.23,877.2
0,Ciutat Vella,272.956283,0.37,809.25
1,Eixample,285.104537,0.36,800.33
6,Sant Andreu,65.751116,0.2,762.86
8,Sants-Montjuïc,124.746346,0.19,757.5
5,Nou Barris,53.226506,0.12,669.46
2,Gràcia,47.698402,0.25,622.2
3,Horta-Guinardó,48.562169,0.21,559.18
4,Les Corts,45.927682,0.52,528.67
9,Sarrià-Sant Gervasi,57.082605,0.31,345.17


In [347]:
Dist3.corr()

Unnamed: 0,population,net_density(hab/ha),num_immi,avg_occupation,mort_rate,rent_price,num_crimes,facilities_rate,crime_rate
population,1.0,0.453958,0.141743,-0.204465,-0.095976,-0.17712,0.402391,-0.354558,0.402391
net_density(hab/ha),0.453958,1.0,0.404369,-0.511504,0.231659,-0.278312,0.624093,-0.218737,0.624093
num_immi,0.141743,0.404369,1.0,-0.434911,0.256742,0.62259,0.922054,0.552726,0.922054
avg_occupation,-0.204465,-0.511504,-0.434911,1.0,-0.193699,0.132333,-0.418129,0.049523,-0.418129
mort_rate,-0.095976,0.231659,0.256742,-0.193699,1.0,-0.245994,0.386923,-0.361552,0.386923
rent_price,-0.17712,-0.278312,0.62259,0.132333,-0.245994,1.0,0.379735,0.866505,0.379735
num_crimes,0.402391,0.624093,0.922054,-0.418129,0.386923,0.379735,1.0,0.287412,1.0
facilities_rate,-0.354558,-0.218737,0.552726,0.049523,-0.361552,0.866505,0.287412,1.0,0.287412
crime_rate,0.402391,0.624093,0.922054,-0.418129,0.386923,0.379735,1.0,0.287412,1.0
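On recent pandas versions (2.0 and later) `DataFrame.corr()` no longer skips text columns such as `dist` by default, so the call above may raise an error there. A minimal sketch of the explicit form, on a toy frame that is an assumption for illustration:

```python
import pandas as pd

toy = pd.DataFrame({
    "dist": ["Ciutat Vella", "Eixample", "Les Corts"],
    "rent_price": [4.81, 4.80, 5.17],
    "facilities_rate": [0.37, 0.36, 0.52],
})

# numeric_only=True restricts the correlation matrix to numeric columns.
corr = toy.corr(numeric_only=True)
```

The resulting matrix contains only `rent_price` and `facilities_rate`.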


In [351]:
Dist3.describe().round(decimals=2)

Unnamed: 0,population,net_density(hab/ha),num_immi,avg_occupation,mort_rate,rent_price,num_crimes,facilities_rate,crime_rate
count,10.0,10.0,10.0,10.0,10.0,10.0,10.0,10.0,10.0
mean,162080.9,673.18,1542.37,2.52,828.29,3.93,18466.9,0.28,113.94
std,56627.86,161.78,906.03,0.13,83.55,1.06,15071.78,0.12,92.99
min,82033.0,345.17,636.46,2.34,677.3,2.29,7444.0,0.12,45.93
25%,127908.75,574.93,979.88,2.44,797.14,3.04,8060.0,0.2,49.73
50%,157929.0,713.48,1361.4,2.51,820.54,3.92,9954.5,0.24,61.42
75%,178620.25,790.96,1459.87,2.59,860.57,4.81,21867.5,0.35,134.92
max,266416.0,877.2,3174.5,2.77,1002.95,5.27,46210.0,0.52,285.1


In [465]:
Dist3["%immi"]=((Dist3["num_immi"]/Dist3["population"])*100).round(decimals=2)

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  Dist3["%immi"]=((Dist3["num_immi"]/Dist3["population"])*100).round(decimals=2)
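The warning above is typically raised when a new column is assigned to a frame that was itself obtained by selecting columns from another frame. One common way to avoid it, sketched here on toy data (the frame names `full` and `subset` are assumptions), is to take an explicit copy when making the selection:

```python
import pandas as pd

full = pd.DataFrame({
    "dist": ["Ciutat Vella", "Eixample"],
    "population": [101387, 266416],
    "num_immi": [3152.75, 3174.5],
    "avg_occupation": [2.45, 2.38],
})

# .copy() makes the selection an independent frame, so adding a column is unambiguous.
subset = full[["dist", "population", "num_immi"]].copy()
subset["%immi"] = (subset["num_immi"] / subset["population"] * 100).round(2)
```

The original frame `full` is left untouched, and the new column exists only in `subset`.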


In [466]:
Dist3

Unnamed: 0,dist,population,net_density(hab/ha),num_immi,avg_occupation,mort_rate,rent_price,num_crimes,facilities_rate,crime_rate,%immi
0,Ciutat Vella,101387,809.25,3152.75,2.45,1002.95,4.81,44241.0,0.37,272.956283,3.11
1,Eixample,266416,800.33,3174.5,2.38,794.48,4.8,46210.0,0.36,285.104537,1.19
2,Gràcia,121347,622.2,1450.8,2.34,805.1,3.98,7731.0,0.25,47.698402,1.2
3,Horta-Guinardó,168751,559.18,709.0,2.43,871.53,2.92,7871.0,0.21,48.562169,0.42
4,Les Corts,82033,528.67,1458.33,2.6,677.3,5.17,7444.0,0.52,45.927682,1.78
5,Nou Barris,166579,669.46,636.46,2.59,885.38,2.29,8627.0,0.12,53.226506,0.38
6,Sant Andreu,147594,762.86,905.0,2.59,819.19,2.83,10657.0,0.2,65.751116,0.61
7,Sant Martí,235513,877.2,1272.0,2.54,777.4,3.85,22417.0,0.23,138.307475,0.54
8,Sants-Montjuïc,181910,757.5,1460.38,2.48,827.7,3.41,20219.0,0.19,124.746346,0.8
9,Sarrià-Sant Gervasi,149279,345.17,1204.5,2.77,821.88,5.27,9252.0,0.31,57.082605,0.81


In [376]:
#USING CRIME_RATE (crime cases/total pop*10000) INSTEAD OF CRIME CASES

Dist4=Dist3[["dist","crime_rate","facilities_rate"]]
Dist4.sort_values("crime_rate",ascending=False).round(decimals=2)
#low (most secure): <=50
#neutral: 50-100
#high: >=100

Unnamed: 0,dist,crime_rate,facilities_rate
1,Eixample,285.1,0.36
0,Ciutat Vella,272.96,0.37
7,Sant Martí,138.31,0.23
8,Sants-Montjuïc,124.75,0.19
6,Sant Andreu,65.75,0.2
9,Sarrià-Sant Gervasi,57.08,0.31
5,Nou Barris,53.23,0.12
3,Horta-Guinardó,48.56,0.21
2,Gràcia,47.7,0.25
4,Les Corts,45.93,0.52


In [468]:
Dist3.corr()


Unnamed: 0,population,net_density(hab/ha),num_immi,avg_occupation,mort_rate,rent_price,num_crimes,facilities_rate,crime_rate,%immi
population,1.0,0.453958,0.141743,-0.204465,-0.095976,-0.17712,0.402391,-0.354558,0.402391,-0.507518
net_density(hab/ha),0.453958,1.0,0.404369,-0.511504,0.231659,-0.278312,0.624093,-0.218737,0.624093,0.143191
num_immi,0.141743,0.404369,1.0,-0.434911,0.256742,0.62259,0.922054,0.552726,0.922054,0.75528
avg_occupation,-0.204465,-0.511504,-0.434911,1.0,-0.193699,0.132333,-0.418129,0.049523,-0.418129,-0.223254
mort_rate,-0.095976,0.231659,0.256742,-0.193699,1.0,-0.245994,0.386923,-0.361552,0.386923,0.33178
rent_price,-0.17712,-0.278312,0.62259,0.132333,-0.245994,1.0,0.379735,0.866505,0.379735,0.62085
num_crimes,0.402391,0.624093,0.922054,-0.418129,0.386923,0.379735,1.0,0.287412,1.0,0.565336
facilities_rate,-0.354558,-0.218737,0.552726,0.049523,-0.361552,0.866505,0.287412,1.0,0.287412,0.68591
crime_rate,0.402391,0.624093,0.922054,-0.418129,0.386923,0.379735,1.0,0.287412,1.0,0.565336
%immi,-0.507518,0.143191,0.75528,-0.223254,0.33178,0.62085,0.565336,0.68591,0.565336,1.0


In [423]:
Dist3.to_csv("../datasets/Data_filtered/data_extraFiltered.csv",index=False)
Population=pd.read_csv("../datasets/Data_filtered/data_extraFiltered.csv")

In [424]:
Population=Population.round(decimals=2)

In [416]:
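#crime ranking (high, neutral, low)
#first attempt, kept for reference; it referenced an undefined `df` and is superseded by crime_ranker below

#Population.loc[Population.crime_rate > 100, "crime rate"] = "high"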

In [387]:
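#generic pattern used as a reference for conditional assignment (`df` is not defined in this notebook):
#df.loc[df.Weight == "155", "Name"] = "John"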

In [388]:
#Population['crime_rate'] = np.where(Population['crime_rate'] > 100, "high", df['crime_rate'])

In [417]:
#low (most secure): <=50
#neutral: 50-100
#high: >=100


def crime_ranker(num):
    """Map a crime rate (cases per 10000 hab) to 'High', 'Neutral' or 'Low'."""
    if num >= 100:
        return 'High'
    elif 50 < num < 100:
        return 'Neutral'
    else:
        return 'Low'
    
Population.crime_rate = Population.crime_rate.apply(crime_ranker)

In [428]:
#eliminate num_crimes since we have now the crime rate

Population=Population.drop(['num_crimes'],axis=1)

In [430]:
Population.rename(columns={"facilities_rate":"facilities_sum"},inplace=True)

In [438]:
choices = ['High', 'Neutral']

#crime_rate
#conditions = [(Population.crime_rate >= 100), ((Population.crime_rate < 100) & (Population.crime_rate > 50))]


#facilities_sum
conditions2 = [(Population.facilities_sum >=0.5 ), ((Population.facilities_sum < 0.5) & (Population.facilities_sum >0.2))]

#mort_rate
conditions3 = [(Population.mort_rate >= 900), ((Population.mort_rate < 900) & (Population.mort_rate > 700))]

#net_density(hab/ha)
conditions4 = [(Population["net_density(hab/ha)"] >= 800), ((Population["net_density(hab/ha)"] < 800) & (Population["net_density(hab/ha)"] > 600))]


#Population.crime_rate = np.select(conditions, choices, default='Low')

Population.facilities_sum = np.select(conditions2, choices, default='Low')

Population.mort_rate = np.select(conditions3, choices, default='Low')

Population["net_density(hab/ha)"] = np.select(conditions4, choices, default='Low')

In [440]:
#rent_price

choices = ['High', 'Neutral']
conditions5 = [(Population.rent_price >= 5), ((Population.rent_price < 5) & (Population.rent_price > 3))]
Population.rent_price = np.select(conditions5, choices, default='Low')
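The cells above repeat the same `np.select` pattern for each variable with different thresholds. As a sketch only, the pattern can be wrapped in a small helper; the name `rank_levels` and its parameters are hypothetical, and the boundaries follow the original conditions (`>= high` is High, strictly between `low` and `high` is Neutral, everything else is Low):

```python
import numpy as np
import pandas as pd

def rank_levels(values, high, low):
    """Label values as 'High', 'Neutral' or 'Low' using two thresholds."""
    values = pd.Series(values)
    conditions = [values >= high, (values < high) & (values > low)]
    labels = np.select(conditions, ["High", "Neutral"], default="Low")
    return pd.Series(labels, index=values.index)

# A few facilities_sum values from the district table, with the thresholds used above.
facilities = pd.Series([0.52, 0.37, 0.21, 0.20, 0.12])
labels = rank_levels(facilities, high=0.5, low=0.2)
```

A value exactly equal to the lower threshold (0.20 here) is labelled Low, which matches how Sant Andreu is classified in the final table.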

In [None]:
#choices = ['High', 'Neutral']
#conditions5 = [(Population.rent_price >= 5), ((Population.rent_price < 5) & (Population.rent_price > 3))]
#Population.%immi_total_dist = np.select(conditions5, choices, default='Low')

In [442]:
#eliminate avg_occupation since it is not needed

Population=Population.drop(['avg_occupation'],axis=1)

In [447]:
Population=Population[['dist', 'population','num_immi','net_density(hab/ha)', 'mort_rate',
       'rent_price', 'facilities_sum', 'crime_rate']]

In [452]:
#sort by crime_rate (the labels are strings now, so the order is alphabetical: High, Low, Neutral)

Population=Population.sort_values("crime_rate",ascending=True)

In [457]:
#reset index

Population.reset_index(inplace=True,drop=True)

In [460]:
Population

Unnamed: 0,dist,population,num_immi,net_density(hab/ha),mort_rate,rent_price,facilities_sum,crime_rate
0,Ciutat Vella,101387,3152.75,High,High,Neutral,Neutral,High
1,Eixample,266416,3174.5,High,Neutral,Neutral,Neutral,High
2,Sant Martí,235513,1272.0,High,Neutral,Neutral,Neutral,High
3,Sants-Montjuïc,181910,1460.38,Neutral,Neutral,Neutral,Low,High
4,Gràcia,121347,1450.8,Neutral,Neutral,Neutral,Neutral,Low
5,Horta-Guinardó,168751,709.0,Low,Neutral,Low,Neutral,Low
6,Les Corts,82033,1458.33,Low,Low,High,High,Low
7,Nou Barris,166579,636.46,Neutral,Neutral,Low,Low,Neutral
8,Sant Andreu,147594,905.0,Neutral,Neutral,Low,Low,Neutral
9,Sarrià-Sant Gervasi,149279,1204.5,Low,Neutral,High,Neutral,Neutral


In [461]:
Population.to_csv("../datasets/Data_filtered/Population_ranked.csv",index=False)

## Conclusions

### Question 1

We wanted to use the available data to define the quality of a neighbourhood. However, many more factors determine quality, and it is hard to conclude anything unless a specific variable, such as crime, is used.
If we use crime as an indicator (crime seems to be related to the number of immigrants, r ≈ 0.92 at district level), the most secure districts would be:
- Les Corts
- Gràcia
- Horta-Guinardó
<br>```(<8000 cases)```<br><br>
And the least secure:<br><br>
- Eixample
- Ciutat Vella
- Sant Martí
- Sants-Montjuïc
<br>```(>20000 cases)```
<br><br><br>

If we use the number of facilities (schools, hospitals, parks, etc.), the best equipped are:
- Les Corts
- Ciutat Vella
- Eixample
(closely followed by Sarrià-Sant Gervasi)<br><br>
And the worst equipped:<br><br>
- Sant Andreu
- Sants-Montjuïc
- Nou Barris
<br><br><br>

Finally, a quick look at the rental prices shows that the most expensive districts are:
- Sarrià-Sant Gervasi
- Les Corts
- Ciutat Vella
- Eixample
<br>```(>4€/m2)```<br><br>
And the least expensive:<br><br>
- Horta-Guinardó
- Sant Andreu
- Nou Barris
<br>```(<3€/m2)```<br><br>

So, what conclusions can we draw? First, a deeper analysis is needed to establish the quality of a neighbourhood. Second, according to our data, the only district that could be classed as high quality, because it combines a low crime category with a high facilities rate, is **Les Corts**. It also has one of the highest rental prices, which is consistent with our hypothesis that it is a valued area.

### Question 2

Does the quality (bad, normal, good, etc.) of a neighbourhood relate to its death rate?<br><br>
The overall quality of the neighbourhoods could not be established, but the data showed that **Les Corts** is a good district thanks to its low crime and high number of facilities. Interestingly, it is also the district with the lowest mortality rate. <br><br>
Nonetheless, the correlation between mortality and crime is weak (r ≈ 0.39): two districts with high crime levels, Eixample and Sant Martí, have two of the lowest mortality rates.

### Question 3
Do the neighbourhoods with higher population numbers have more facilities?<br><br>
The districts with highest net density are: 
- Sant Martí
- Ciutat Vella
- Eixample
<br>```(>800hab/ha)```<br><br>
And the best equipped are:<br><br>
- Les Corts
- Ciutat Vella
- Eixample
<br><br>
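At first glance there could be a relationship, since Ciutat Vella and Eixample are both well equipped and densely populated. However, across all ten districts the correlation between net density and facilities rate is weak and slightly negative (r ≈ -0.22), so once again the data do not show one.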
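The correlations discussed in Questions 2 and 3 can be checked directly from the district table. The sketch below copies the four relevant columns from the outputs above (crime rates rounded to two decimals) into a standalone frame; the frame and variable names are assumptions for illustration:

```python
import pandas as pd

# Values copied from the district table, in the same district order.
districts = pd.DataFrame({
    "net_density": [809.25, 800.33, 622.2, 559.18, 528.67,
                    669.46, 762.86, 877.2, 757.5, 345.17],
    "facilities_rate": [0.37, 0.36, 0.25, 0.21, 0.52,
                        0.12, 0.20, 0.23, 0.19, 0.31],
    "mort_rate": [1002.95, 794.48, 805.1, 871.53, 677.3,
                  885.38, 819.19, 777.4, 827.7, 821.88],
    "crime_rate": [272.96, 285.10, 47.70, 48.56, 45.93,
                   53.23, 65.75, 138.31, 124.75, 57.08],
})

# Pearson correlation for the two pairs discussed in the conclusions.
r_density_facilities = districts["net_density"].corr(districts["facilities_rate"])
r_mortality_crime = districts["mort_rate"].corr(districts["crime_rate"])
```

With only ten districts these coefficients are sensitive to single observations, so they are best read as rough indications rather than firm evidence.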

In [486]:
immigrants=pd.read_csv("../datasets/Data_filtered/comparison_table.csv")

In [478]:
immigrants.head()

Unnamed: 0,dist,%immi_total_immi,%immi_total_dist,mort_rate,facilities_rate,crime_rate
0,Ciutat Vella,16.25,9.93,1002.95,0.37,272.956283
1,Eixample,20.15,4.69,794.48,0.36,285.104537
2,Sant Martí,12.9,3.39,777.4,0.23,138.307475
3,Sants-Montjuïc,12.04,4.1,827.7,0.19,124.746346
4,Gràcia,6.95,3.55,805.1,0.25,47.698402


In [487]:
immigrants.sort_values("%immi_total_immi",ascending=False)

Unnamed: 0.1,Unnamed: 0,dist,%immi_total_immi,%immi_total_dist,mort_rate,facilities_rate,crime_rate
1,1,Eixample,20.15,4.69,794.48,0.36,285.104537
0,0,Ciutat Vella,16.25,9.93,1002.95,0.37,272.956283
2,2,Sant Martí,12.9,3.39,777.4,0.23,138.307475
3,3,Sants-Montjuïc,12.04,4.1,827.7,0.19,124.746346
7,7,Nou Barris,8.49,3.16,885.38,0.12,53.226506
5,5,Horta-Guinardó,7.33,2.69,871.53,0.21,48.562169
4,4,Gràcia,6.95,3.55,805.1,0.25,47.698402
9,9,Sarrià-Sant Gervasi,6.13,2.55,821.88,0.31,57.082605
8,8,Sant Andreu,5.84,2.45,819.19,0.2,65.751116
6,6,Les Corts,3.9,2.95,677.3,0.52,45.927682


In [488]:
immigrants=immigrants.drop(["Unnamed: 0"],axis=1)
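
The `Unnamed: 0` column dropped above usually appears when a DataFrame is written with `to_csv` and its index is saved as a nameless first column. Assuming that is what happened here, a minimal sketch of avoiding the column at load time with `index_col=0` follows; `csv_text` is a hypothetical two-row excerpt, not the real file.

```python
import io

import pandas as pd

# Hypothetical excerpt: the leading nameless column is the saved index.
csv_text = (
    ",dist,mort_rate,facilities_rate,crime_rate\n"
    "0,Ciutat Vella,1002.95,0.37,272.956283\n"
    "1,Eixample,794.48,0.36,285.104537\n"
)

# Default behaviour: the nameless column is loaded as "Unnamed: 0".
with_extra = pd.read_csv(io.StringIO(csv_text))

# index_col=0 uses that column as the index, so no extra column is created.
without_extra = pd.read_csv(io.StringIO(csv_text), index_col=0)

print(with_extra.columns.tolist())
print(without_extra.columns.tolist())
```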

In [489]:
# Share of immigrants within each district: >= 5 is 'High', between 3 and 5 is 'Neutral', the rest is 'Low'
choices = ['High', 'Neutral']
conditions6 = [
    immigrants["%immi_total_dist"] >= 5,
    (immigrants["%immi_total_dist"] < 5) & (immigrants["%immi_total_dist"] > 3),
]
immigrants["%immi_total_dist"] = np.select(conditions6, choices, default='Low')

In [490]:
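# Share of the city's immigrants living in each district: >= 15 is 'High', between 5 and 15 is 'Neutral', the rest is 'Low'
choices = ['High', 'Neutral']
conditions7 = [
    immigrants["%immi_total_immi"] >= 15,
    (immigrants["%immi_total_immi"] < 15) & (immigrants["%immi_total_immi"] > 5),
]
immigrants["%immi_total_immi"] = np.select(conditions7, choices, default='Low')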
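
The two cells above apply the same three-level rule with different thresholds. As a sketch only, the rule could be wrapped in a small helper; `to_level` and `demo` are hypothetical names that do not exist in the notebook, and the sample rates are taken from the comparison table.

```python
import numpy as np
import pandas as pd


def to_level(values, high, low):
    """Label values 'High' (>= high), 'Neutral' (strictly between low and high), else 'Low'."""
    conditions = [values >= high, (values < high) & (values > low)]
    return np.select(conditions, ['High', 'Neutral'], default='Low')


# Three facilities rates copied from the comparison table.
demo = pd.DataFrame({
    "dist": ["Les Corts", "Eixample", "Sant Andreu"],
    "facilities_rate": [0.52, 0.36, 0.2],
})

# Write the labels to a new column so the numeric values stay available.
demo["facilities_level"] = to_level(demo["facilities_rate"], high=0.5, low=0.2)
print(demo)
```

A value exactly equal to the lower threshold (Sant Andreu's 0.2 in this sample) falls into 'Low', because the 'Neutral' condition uses a strict `>`. Writing to a separate column also means the cell can be re-run without comparing labels against numbers.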

In [497]:
immigrants.rename(columns={"facilities_rate":"facilities_sum"},inplace=True)

In [498]:
immigrants

Unnamed: 0,dist,%immi_total_immi,%immi_total_dist,mort_rate,facilities_sum,crime_rate
0,Ciutat Vella,High,High,1002.95,0.37,272.956283
1,Eixample,High,Neutral,794.48,0.36,285.104537
2,Sant Martí,Neutral,Neutral,777.4,0.23,138.307475
3,Sants-Montjuïc,Neutral,Neutral,827.7,0.19,124.746346
4,Gràcia,Neutral,Neutral,805.1,0.25,47.698402
5,Horta-Guinardó,Neutral,Low,871.53,0.21,48.562169
6,Les Corts,Low,Low,677.3,0.52,45.927682
7,Nou Barris,Neutral,Neutral,885.38,0.12,53.226506
8,Sant Andreu,Neutral,Low,819.19,0.2,65.751116
9,Sarrià-Sant Gervasi,Neutral,Low,821.88,0.31,57.082605


In [499]:
choices = ['High', 'Neutral']

# crime_rate: >= 100 is 'High', between 50 and 100 is 'Neutral'
conditions = [(immigrants.crime_rate >= 100), ((immigrants.crime_rate < 100) & (immigrants.crime_rate > 50))]

# facilities_sum: >= 0.5 is 'High', between 0.2 and 0.5 is 'Neutral'
conditions2 = [(immigrants.facilities_sum >= 0.5), ((immigrants.facilities_sum < 0.5) & (immigrants.facilities_sum > 0.2))]

# mort_rate: >= 900 is 'High', between 700 and 900 is 'Neutral'
conditions3 = [(immigrants.mort_rate >= 900), ((immigrants.mort_rate < 900) & (immigrants.mort_rate > 700))]

# Any value that matches neither condition is labelled 'Low'
immigrants.crime_rate = np.select(conditions, choices, default='Low')
immigrants.facilities_sum = np.select(conditions2, choices, default='Low')
immigrants.mort_rate = np.select(conditions3, choices, default='Low')



In [512]:
immigrants.sort_values("%immi_total_dist",ascending=True)

Unnamed: 0,dist,%immi_total_immi,%immi_total_dist,mort_rate,facilities_sum,crime_rate
0,Ciutat Vella,High,High,High,Neutral,High
5,Horta-Guinardó,Neutral,Low,Neutral,Neutral,Low
6,Les Corts,Low,Low,Low,High,Low
8,Sant Andreu,Neutral,Low,Neutral,Low,Neutral
9,Sarrià-Sant Gervasi,Neutral,Low,Neutral,Neutral,Neutral
1,Eixample,High,Neutral,Neutral,Neutral,High
2,Sant Martí,Neutral,Neutral,Neutral,Neutral,High
3,Sants-Montjuïc,Neutral,Neutral,Neutral,Low,High
4,Gràcia,Neutral,Neutral,Neutral,Neutral,Low
7,Nou Barris,Neutral,Neutral,Neutral,Low,Neutral
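
Because the level labels are plain strings, the sort above is alphabetical (High, Low, Neutral) rather than by level. A minimal sketch of one way to sort by level instead, using an ordered categorical dtype; the `levels` and `demo` names are hypothetical and the three rows are a small excerpt of the table.

```python
import pandas as pd

# Ordered categories: Low < Neutral < High.
levels = pd.CategoricalDtype(["Low", "Neutral", "High"], ordered=True)

demo = pd.DataFrame({
    "dist": ["Ciutat Vella", "Les Corts", "Eixample"],
    "%immi_total_dist": ["High", "Low", "Neutral"],
})

# Converting the column makes sort_values respect the category order.
demo["%immi_total_dist"] = demo["%immi_total_dist"].astype(levels)
ordered = demo.sort_values("%immi_total_dist", ascending=True)
print(ordered)
```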


In [501]:
immipop = pd.merge(Population,immigrants, how="right", on="dist")

In [502]:
immipop.head()

Unnamed: 0,dist,population,num_immi,net_density(hab/ha),mort_rate_x,rent_price,facilities_sum_x,crime_rate_x,%immi_total_immi,%immi_total_dist,mort_rate_y,facilities_sum_y,crime_rate_y
0,Ciutat Vella,101387,3152.75,High,High,Neutral,Neutral,High,High,High,High,Neutral,High
1,Eixample,266416,3174.5,High,Neutral,Neutral,Neutral,High,High,Neutral,Neutral,Neutral,High
2,Sant Martí,235513,1272.0,High,Neutral,Neutral,Neutral,High,Neutral,Neutral,Neutral,Neutral,High
3,Sants-Montjuïc,181910,1460.38,Neutral,Neutral,Neutral,Low,High,Neutral,Neutral,Neutral,Low,High
4,Gràcia,121347,1450.8,Neutral,Neutral,Neutral,Neutral,Low,Neutral,Neutral,Neutral,Neutral,Low


In [504]:
immipop.columns

Index(['dist', 'population', 'num_immi', 'net_density(hab/ha)', 'mort_rate_x',
       'rent_price', 'facilities_sum_x', 'crime_rate_x', '%immi_total_immi',
       '%immi_total_dist', 'mort_rate_y', 'facilities_sum_y', 'crime_rate_y'],
      dtype='object')
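
The `_x` and `_y` suffixes in the column list above come from `pd.merge`: both tables contain `mort_rate`, `facilities_sum` and `crime_rate`, so pandas keeps one copy from each side. As a sketch, assuming the duplicated columns hold the same values in both tables, the duplicates can be avoided by merging only the columns that the left table lacks. The two small frames are hypothetical stand-ins for `Population` and `immigrants`.

```python
import pandas as pd

# Hypothetical stand-ins for the notebook's Population and immigrants tables.
population = pd.DataFrame({
    "dist": ["Ciutat Vella", "Les Corts"],
    "population": [101387, 82033],
    "mort_rate": ["High", "Low"],
})
immigrants = pd.DataFrame({
    "dist": ["Ciutat Vella", "Les Corts"],
    "%immi_total_dist": ["High", "Low"],
    "mort_rate": ["High", "Low"],
})

# Keep the join key plus only the columns the left table does not already have.
new_cols = ["dist"] + [c for c in immigrants.columns if c not in population.columns]
merged = pd.merge(population, immigrants[new_cols], how="right", on="dist")
print(merged.columns.tolist())
```

With this approach no suffixed columns are created, so the later renaming and dropping steps become unnecessary.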

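In [508]:
# Assumption: the "_y" duplicates were dropped in a cell not shown here; without this step
# the 10 names below would not match the 13 columns produced by the merge
immipop = immipop.drop(['mort_rate_y', 'facilities_sum_y', 'crime_rate_y'], axis=1)
immipop.columns=['dist', 'population', 'num_immi', 'net_density(hab/ha)', 'mort_rate',
       'rent_price', 'facilities_sum', 'crime_rate', '%immi_total_immi',
       '%immi_total_dist']

In [509]:
# Drop num_immi only after the rename, which still expects that column
immipop=immipop.drop(['num_immi'],axis=1)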

In [510]:
immipop

Unnamed: 0,dist,population,net_density(hab/ha),mort_rate,rent_price,facilities_sum,crime_rate,%immi_total_immi,%immi_total_dist
0,Ciutat Vella,101387,High,High,Neutral,Neutral,High,High,High
1,Eixample,266416,High,Neutral,Neutral,Neutral,High,High,Neutral
2,Sant Martí,235513,High,Neutral,Neutral,Neutral,High,Neutral,Neutral
3,Sants-Montjuïc,181910,Neutral,Neutral,Neutral,Low,High,Neutral,Neutral
4,Gràcia,121347,Neutral,Neutral,Neutral,Neutral,Low,Neutral,Neutral
5,Horta-Guinardó,168751,Low,Neutral,Low,Neutral,Low,Neutral,Low
6,Les Corts,82033,Low,Low,High,High,Low,Low,Low
7,Nou Barris,166579,Neutral,Neutral,Low,Low,Neutral,Neutral,Neutral
8,Sant Andreu,147594,Neutral,Neutral,Low,Low,Neutral,Neutral,Low
9,Sarrià-Sant Gervasi,149279,Low,Neutral,High,Neutral,Neutral,Neutral,Low
