## Code to generate the JSON file passed to the backend, containing average rent prices and sale prices per square meter for commercial properties in Barcelona

**Data is downloaded following the Barcelona open data API instructions: https://opendata-ajuntament.barcelona.cat**

In [1]:
# Requirements
import http.client
import json
import requests
from pandas import json_normalize
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
# Set up matplotlib to display graphs inline in a Jupyter Notebook
%matplotlib inline

## Sell Prices

In [2]:
connect_opendata = http.client.HTTPSConnection("opendata-ajuntament.barcelona.cat")

headers = {
    'cache-control': "no-cache"
    }

# Request the resource by path; the host is already set on the connection
connect_opendata.request("GET", "/data/api/action/datastore_search?resource_id=e42cf2cf-a76e-4a32-9357-cf90e0ea8ead", headers=headers)

response = connect_opendata.getresponse()
data = response.read()

json_data = json.loads(data.decode('utf-8'))

# Build a DataFrame from the records, failing clearly if the response has an unexpected shape
if 'result' in json_data and 'records' in json_data['result']:
    precio_venta_m2 = json_normalize(json_data['result']['records'])
else:
    raise ValueError("Unexpected API response: 'result.records' not found")

# Remove the leading numeric prefix (digits, a dot, whitespace) from the neighborhood names
precio_venta_m2['BARRIS'] = precio_venta_m2['BARRIS'].str.replace(r'^\d+\.\s+', '', regex=True)

# Rename columns by position; this assumes the API returns them in exactly this order
precio_venta_m2.columns = ['Nom_Barri', '2011', '2008', '2010','DTE','2009','ID']

precio_venta_m2 = precio_venta_m2.drop(columns=['DTE', 'ID'])

new_order = ['Nom_Barri', '2008', '2009', '2010','2011']
precio_venta_m2 = precio_venta_m2[new_order]

precio_venta_m2.to_csv("precio_venta_m2.csv")

# Display the DataFrame
precio_venta_m2.head()


Unnamed: 0,Nom_Barri,2008,2009,2010,2011
0,el Raval,3.065,2.773,2.445,2.176
1,el Barri Gòtic,3.828,3.750,3.026,2.632
2,la Barceloneta,--,--,3.005,2.135
3,Sant Pere Santa Caterina i la Ribera,3.450,3.146,2.828,2.482
4,el Fort Pienc,2.893,2.736,2.422,2.224
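The download step above is repeated for every resource in this notebook, so it could be factored into a small helper. The following is a minimal sketch, not the notebook's actual code: the function names `build_search_path`, `extract_records`, and `fetch_records` are hypothetical, and the optional `limit` query parameter is assumed to behave as in the standard CKAN DataStore API.

```python
import http.client
import json
from urllib.parse import urlencode

API_HOST = "opendata-ajuntament.barcelona.cat"
SEARCH_PATH = "/data/api/action/datastore_search"


def build_search_path(resource_id, limit=None):
    """Return the request path (not the full URL) for a datastore_search call."""
    params = {"resource_id": resource_id}
    if limit is not None:
        # Assumption: 'limit' is honored as in the standard CKAN DataStore API
        params["limit"] = limit
    return f"{SEARCH_PATH}?{urlencode(params)}"


def extract_records(payload):
    """Return the list of records from a decoded datastore_search response."""
    try:
        return payload["result"]["records"]
    except (KeyError, TypeError) as exc:
        raise ValueError("Unexpected API response: 'result.records' not found") from exc


def fetch_records(resource_id, limit=None, timeout=30):
    """Download one resource and return its records as a list of dicts."""
    conn = http.client.HTTPSConnection(API_HOST, timeout=timeout)
    try:
        conn.request("GET", build_search_path(resource_id, limit),
                     headers={"cache-control": "no-cache"})
        response = conn.getresponse()
        if response.status != 200:
            raise RuntimeError(f"HTTP {response.status} {response.reason}")
        return extract_records(json.loads(response.read().decode("utf-8")))
    finally:
        conn.close()
```

With such a helper, each cell would reduce to something like `json_normalize(fetch_records("e42cf2cf-a76e-4a32-9357-cf90e0ea8ead"))`, and a malformed response would fail with a clear message instead of a `KeyError`.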


In [3]:
# Convert year columns to numeric; placeholders such as '--' become NaN
year_columns = precio_venta_m2.columns[1:]  # Every column except 'Nom_Barri'
precio_venta_m2[year_columns] = precio_venta_m2[year_columns].apply(pd.to_numeric, errors='coerce')

# Calculate the average price per sqm over the years for each neighborhood
precio_venta_m2['Average_Price'] = precio_venta_m2[year_columns].mean(axis=1)

precio_venta_m2.to_csv("precio_venta_m2.csv")

# Display the updated dataframe
precio_venta_m2.head()


Unnamed: 0,Nom_Barri,2008,2009,2010,2011,Average_Price
0,el Raval,3.065,2.773,2.445,2.176,2.61475
1,el Barri Gòtic,3.828,3.75,3.026,2.632,3.309
2,la Barceloneta,,,3.005,2.135,2.57
3,Sant Pere Santa Caterina i la Ribera,3.45,3.146,2.828,2.482,2.9765
4,el Fort Pienc,2.893,2.736,2.422,2.224,2.56875
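To make the effect of `errors='coerce'` concrete, here is a small self-contained sketch using two rows copied from the output above (the `sample` DataFrame is illustrative only). The `--` placeholders become `NaN`, and `mean(axis=1)` skips them, so la Barceloneta's average is computed from 2010 and 2011 alone.

```python
import pandas as pd

# Two rows copied from the sale-price table; '--' marks a missing value
sample = pd.DataFrame(
    {
        "Nom_Barri": ["el Raval", "la Barceloneta"],
        "2008": ["3.065", "--"],
        "2009": ["2.773", "--"],
        "2010": ["2.445", "3.005"],
        "2011": ["2.176", "2.135"],
    }
)

year_columns = ["2008", "2009", "2010", "2011"]

# Non-numeric strings such as '--' are coerced to NaN instead of raising
sample[year_columns] = sample[year_columns].apply(pd.to_numeric, errors="coerce")

# mean() ignores NaN by default, so rows with gaps average over the years present
sample["Average_Price"] = sample[year_columns].mean(axis=1)

print(sample)
```

One consequence worth keeping in mind: averages for neighborhoods with gaps cover fewer years, so they are not strictly comparable with averages computed over all four years.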


In [4]:
# Descriptive statistics for the dataset
descriptive_stats = precio_venta_m2.describe()

# Display descriptive statistics
descriptive_stats

Unnamed: 0,2008,2009,2010,2011,Average_Price
count,56.0,53.0,56.0,58.0,61.0
mean,2.705982,2.440208,2.28775,2.00481,2.330217
std,0.685203,0.546705,0.55166,0.464295,0.53836
min,1.491,1.387,1.289,1.11,1.43225
25%,2.17425,2.058,1.9095,1.68175,1.90025
50%,2.6705,2.392,2.2825,1.974,2.2515
75%,3.14275,2.773,2.66175,2.264,2.67275
max,4.326,3.75,3.989,3.351,3.84825


Not every neighborhood has a price for every year. 2009 has the most missing values (20 of 73 neighborhoods), 2011 has the fewest (15), and 2008 and 2010 are each missing 17.

The rest of this section summarizes the descriptive statistics and the main observations.
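The missing-value counts follow directly from the `count` row of the descriptive statistics. As a quick check, the sketch below subtracts each year's non-null count from the 73 neighborhoods the dataset is reported to contain; on the real DataFrame, `precio_venta_m2[year_columns].isna().sum()` should give the same figures.

```python
# Non-null counts per year, copied from the describe() output above
counts = {"2008": 56, "2009": 53, "2010": 56, "2011": 58}

# Number of neighborhoods in the dataset, as reported in this notebook
total_neighborhoods = 73

# Missing values per year = neighborhoods without a price for that year
missing = {year: total_neighborhoods - n for year, n in counts.items()}

print(missing)  # {'2008': 17, '2009': 20, '2010': 17, '2011': 15}
```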

The descriptive statistics summarize the price data for commercial properties in Barcelona from 2008 to 2011. Key observations:

**Count:** The dataset contains 73 neighborhoods, but not every year has data for every neighborhood, as the varying counts show (56 for 2008, 53 for 2009, 56 for 2010, and 58 for 2011). Only 61 neighborhoods have a price for at least one year.

**Mean prices:** The average price per square meter was highest in 2008 (≈2.706 EUR) and decreased each subsequent year, reaching its lowest in 2011 (≈2.005 EUR). The mean of the per-neighborhood averages across all years is approximately 2.330 EUR.

**Standard deviation:** The standard deviation measures how much prices vary across neighborhoods. Variability was highest in 2008 (≈0.685) and lowest in 2011 (≈0.464), with a slight uptick from 2009 (≈0.547) to 2010 (≈0.552).

**Minimum and maximum prices:** The minimum price per square meter decreased every year, from 1.491 EUR in 2008 to 1.110 EUR in 2011. The maximum fell from 4.326 EUR in 2008 to 3.351 EUR in 2011, with a partial rebound in 2010 (3.989 EUR, up from 3.750 EUR in 2009).

**Quartiles:** The 25th, 50th (median), and 75th percentiles all decrease year over year.

Overall, the price per square meter for commercial properties in Barcelona's neighborhoods fell steadily from 2008 to 2011. The spread of prices across neighborhoods also narrowed between 2008 and 2011, which may indicate that property values became more uniform across areas.
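The downward trend can be quantified as a year-over-year percentage change in the mean price. The sketch below works from the mean values in the descriptive statistics rather than the full dataset, so it only illustrates the arithmetic; the variable names are illustrative.

```python
# Mean price per square meter by year, copied from the describe() output
means = {"2008": 2.705982, "2009": 2.440208, "2010": 2.28775, "2011": 2.00481}

years = sorted(means)

# Percentage change of each year's mean relative to the previous year
yoy_change = {
    curr: (means[curr] - means[prev]) / means[prev] * 100
    for prev, curr in zip(years, years[1:])
}

# Cumulative change between the first and last year
total_change = (means[years[-1]] - means[years[0]]) / means[years[0]] * 100

for year, pct in yoy_change.items():
    print(f"{year}: {pct:.1f}%")
print(f"{years[0]} to {years[-1]}: {total_change:.1f}%")
```

Every yearly change is negative, and the cumulative drop in the mean between 2008 and 2011 is roughly a quarter.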




In [5]:
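# Display the table without the helper column; the result is not assigned, so the DataFrame itself is unchanged
precio_venta_m2.drop(columns=['Average_Price'])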


Unnamed: 0,Nom_Barri,2008,2009,2010,2011
0,el Raval,3.065,2.773,2.445,2.176
1,el Barri Gòtic,3.828,3.750,3.026,2.632
2,la Barceloneta,,,3.005,2.135
3,Sant Pere Santa Caterina i la Ribera,3.450,3.146,2.828,2.482
4,el Fort Pienc,2.893,2.736,2.422,2.224
...,...,...,...,...,...
68,Diagonal Mar i el Front Marítim del Poblenou,3.752,2.445,2.991,2.651
69,el Besòs i el Maresme,3.669,2.842,2.751,2.121
70,Provençals del Poblenou,3.857,2.660,2.516,2.255
71,Sant Martí de Provençals,1.957,2.266,2.175,1.812


In [6]:
precio_venta_m2=pd.melt(precio_venta_m2, id_vars=['Nom_Barri'], value_vars=['2008', '2009','2010','2011'])

precio_venta_m2.columns = ['Nom_Barri', 'Anio', 'Price_m2']
precio_venta_m2['Nom_Barri'] = precio_venta_m2['Nom_Barri'].str.lower()

precio_venta_m2

Unnamed: 0,Nom_Barri,Anio,Price_m2
0,el raval,2008,3.065
1,el barri gòtic,2008,3.828
2,la barceloneta,2008,
3,sant pere santa caterina i la ribera,2008,3.450
4,el fort pienc,2008,2.893
...,...,...,...
287,diagonal mar i el front marítim del poblenou,2011,2.651
288,el besòs i el maresme,2011,2.121
289,provençals del poblenou,2011,2.255
290,sant martí de provençals,2011,1.812


## Rent Prices

In [7]:
connect_opendata = http.client.HTTPSConnection("opendata-ajuntament.barcelona.cat")

headers = {
    'cache-control': "no-cache"
    }

# Request the resource by path; the host is already set on the connection
connect_opendata.request("GET", "/data/api/action/datastore_search?resource_id=97356d26-30b9-436a-8dbb-d0d05f0a87fd", headers=headers)

response = connect_opendata.getresponse()
data = response.read()

json_data = json.loads(data.decode('utf-8'))

# Build a DataFrame from the records, failing clearly if the response has an unexpected shape
if 'result' in json_data and 'records' in json_data['result']:
    precio_alquiler = json_normalize(json_data['result']['records'])
else:
    raise ValueError("Unexpected API response: 'result.records' not found")

# Remove the leading numeric prefix (digits, a dot, whitespace) from the neighborhood names
precio_alquiler['BARRIS'] = precio_alquiler['BARRIS'].str.replace(r'^\d+\.\s+', '', regex=True)

# Rename columns by position; this assumes the API returns them in exactly this order
precio_alquiler.columns = ['Nom_Barri', '2011', '2008', '2010','DTE','2009','ID']

precio_alquiler = precio_alquiler.drop(columns=['DTE', 'ID'])

new_order = ['Nom_Barri', '2008', '2009', '2010','2011']
precio_alquiler = precio_alquiler[new_order]

precio_alquiler.to_csv("precio_alquiler.csv")

# Display the DataFrame
precio_alquiler.head()

Unnamed: 0,Nom_Barri,2008,2009,2010,2011
0,el Raval,1543,1362,1216,1189
1,el Barri Gòtic,1859,1481,1369,1325
2,la Barceloneta,1469,1526,1660,1648
3,Sant Pere Santa Caterina i la Ribera,1585,1374,1291,1225
4,el Fort Pienc,1243,1098,1000,921


In [8]:
precio_alquiler=pd.melt(precio_alquiler, id_vars=['Nom_Barri'], value_vars=['2008', '2009','2010','2011'])

precio_alquiler.columns = ['Nom_Barri', 'Anio', 'Price']
precio_alquiler['Nom_Barri'] = precio_alquiler['Nom_Barri'].str.lower()

precio_alquiler

Unnamed: 0,Nom_Barri,Anio,Price
0,el raval,2008,1543
1,el barri gòtic,2008,1859
2,la barceloneta,2008,1469
3,sant pere santa caterina i la ribera,2008,1585
4,el fort pienc,2008,1243
...,...,...,...
287,diagonal mar i el front marítim del poblenou,2011,1126
288,el besòs i el maresme,2011,--
289,provençals del poblenou,2011,832
290,sant martí de provençals,2011,976
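Unlike the sale prices, the rent prices are never converted to numbers, so `Price` still holds strings, including the `--` placeholder visible in row 288. A possible fix, sketched here on three rows copied from the output above (the `rent_long` name is illustrative), is to apply the same `pd.to_numeric` coercion after melting.

```python
import pandas as pd

# Three rows copied from the melted rent table; '--' marks a missing value
rent_long = pd.DataFrame(
    {
        "Nom_Barri": [
            "diagonal mar i el front marítim del poblenou",
            "el besòs i el maresme",
            "provençals del poblenou",
        ],
        "Anio": ["2011", "2011", "2011"],
        "Price": ["1126", "--", "832"],
    }
)

# Coerce the string column to numbers; '--' becomes NaN
rent_long["Price"] = pd.to_numeric(rent_long["Price"], errors="coerce")

print(rent_long.dtypes)
print(rent_long)
```

Without this step, the rent values stay as strings all the way into the exported JSON while the sale prices are written as numbers. If whole-number output is preferred, the coerced column could additionally be cast with `.astype("Int64")`, pandas' nullable integer type.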


In [9]:
df_precios = pd.merge(precio_alquiler, precio_venta_m2, on=['Nom_Barri', 'Anio'])
df_precios

Unnamed: 0,Nom_Barri,Anio,Price,Price_m2
0,el raval,2008,1543,3.065
1,el barri gòtic,2008,1859,3.828
2,la barceloneta,2008,1469,
3,sant pere santa caterina i la ribera,2008,1585,3.450
4,el fort pienc,2008,1243,2.893
...,...,...,...,...
287,diagonal mar i el front marítim del poblenou,2011,1126,2.651
288,el besòs i el maresme,2011,--,2.121
289,provençals del poblenou,2011,832,2.255
290,sant martí de provençals,2011,976,1.812
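`pd.merge` defaults to an inner join, which silently drops any neighborhood and year pair present in only one table. A hedged way to check this, shown on toy data rather than the real tables, is an outer join with `indicator=True`, plus `validate="one_to_one"` to confirm that each key pair appears at most once per side.

```python
import pandas as pd

# Toy stand-ins for the melted rent and sale tables (values are illustrative)
rent = pd.DataFrame(
    {
        "Nom_Barri": ["el raval", "el raval", "el fort pienc"],
        "Anio": ["2008", "2009", "2008"],
        "Price": [1543, 1362, 1243],
    }
)
sale = pd.DataFrame(
    {
        "Nom_Barri": ["el raval", "el raval", "la barceloneta"],
        "Anio": ["2008", "2009", "2010"],
        "Price_m2": [3.065, 2.773, 3.005],
    }
)

# how="outer" keeps unmatched rows; indicator adds a '_merge' column;
# validate raises MergeError if a key pair is duplicated on either side
checked = pd.merge(
    rent,
    sale,
    on=["Nom_Barri", "Anio"],
    how="outer",
    validate="one_to_one",
    indicator=True,
)

# Rows that an inner join would have dropped
unmatched = checked[checked["_merge"] != "both"]
print(unmatched)
```

If `unmatched` is empty on the real data, the inner join used above loses nothing; otherwise the listed rows show which names or years fail to line up.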


In [10]:

connect_opendata = http.client.HTTPSConnection("opendata-ajuntament.barcelona.cat")

headers = {
    'cache-control': "no-cache"
    }

# Request the resource by path; the host is already set on the connection
connect_opendata.request("GET", "/data/api/action/datastore_search?resource_id=c897c912-0f3c-4463-bdf2-a67ee97786ac", headers=headers)

response = connect_opendata.getresponse()
data = response.read()

json_data = json.loads(data.decode('utf-8'))

# Build a DataFrame from the records, failing clearly if the response has an unexpected shape
if 'result' in json_data and 'records' in json_data['result']:
    df_codigos = json_normalize(json_data['result']['records'])
else:
    raise ValueError("Unexpected API response: 'result.records' not found")

# Keep only the district and neighborhood identifiers; .copy() avoids a SettingWithCopyWarning below
df_codigos = df_codigos[['Codi_Districte', 'Nom_Districte', 'Codi_Barri','Nom_Barri']].copy()

# Lowercase the names so they match the price tables in the merge
df_codigos['Nom_Districte'] = df_codigos['Nom_Districte'].str.lower()
df_codigos['Nom_Barri'] = df_codigos['Nom_Barri'].str.lower()

precios_final = pd.merge(df_precios, df_codigos, on=['Nom_Barri'])

new_order = ['Codi_Districte', 'Nom_Districte', 'Codi_Barri', 'Nom_Barri','Anio','Price','Price_m2']
precios_final = precios_final[new_order]

precios_final.columns = ['codiDistricte', 'nomDistricte', 'codiBarri', 'nomBarri','Anio','rentPrice','sellPriceSqm']

precios_final

# lines=True writes one JSON object per line (JSON Lines) rather than a single JSON array
json_precios = precios_final.to_json(orient='records', lines=True)


file_path = "json_precios.json"

with open(file_path, 'w') as file:
    file.write(json_precios)

json_precios


'{"codiDistricte":"1","nomDistricte":"ciutat vella","codiBarri":"2","nomBarri":"el barri g\\u00f2tic","Anio":"2008","rentPrice":"1859","sellPriceSqm":3.828}\n{"codiDistricte":"1","nomDistricte":"ciutat vella","codiBarri":"2","nomBarri":"el barri g\\u00f2tic","Anio":"2009","rentPrice":"1481","sellPriceSqm":3.75}\n{"codiDistricte":"1","nomDistricte":"ciutat vella","codiBarri":"2","nomBarri":"el barri g\\u00f2tic","Anio":"2010","rentPrice":"1369","sellPriceSqm":3.026}\n{"codiDistricte":"1","nomDistricte":"ciutat vella","codiBarri":"2","nomBarri":"el barri g\\u00f2tic","Anio":"2011","rentPrice":"1325","sellPriceSqm":2.632}\n{"codiDistricte":"2","nomDistricte":"eixample","codiBarri":"5","nomBarri":"el fort pienc","Anio":"2008","rentPrice":"1243","sellPriceSqm":2.893}\n{"codiDistricte":"2","nomDistricte":"eixample","codiBarri":"5","nomBarri":"el fort pienc","Anio":"2008","rentPrice":"1243","sellPriceSqm":2.893}\n{"codiDistricte":"2","nomDistricte":"eixample","codiBarri":"5","nomBarri":"el fo
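The output above contains an identical record twice (el fort pienc, 2008). One plausible cause, offered here as an inference rather than something the notebook confirms, is that the lookup table holds more than one row per neighborhood, so the merge multiplies the price rows. The sketch below uses toy data to show the effect, a `drop_duplicates` fix, and the difference between JSON Lines and a single JSON array, in case the backend expects the latter.

```python
import json
import pandas as pd

# Toy lookup table with a repeated neighborhood (illustrative values)
codes = pd.DataFrame(
    {
        "Codi_Districte": ["2", "2"],
        "Nom_Districte": ["eixample", "eixample"],
        "Codi_Barri": ["5", "5"],
        "Nom_Barri": ["el fort pienc", "el fort pienc"],
    }
)
prices = pd.DataFrame(
    {"Nom_Barri": ["el fort pienc"], "Anio": ["2008"], "Price": [1243], "Price_m2": [2.893]}
)

# Merging against the raw lookup duplicates the price row
duplicated = pd.merge(prices, codes, on="Nom_Barri")

# Keeping one lookup row per neighborhood restores one output row per price row;
# validate raises MergeError if the lookup still has repeated keys
codes_unique = codes.drop_duplicates(subset="Nom_Barri")
merged = pd.merge(prices, codes_unique, on="Nom_Barri", validate="many_to_one")

# Without lines=True, to_json emits a single JSON array
as_array = merged.to_json(orient="records")
# With lines=True, it emits one JSON object per line (JSON Lines)
as_lines = merged.to_json(orient="records", lines=True)

print(len(duplicated), len(merged))
print(json.loads(as_array))
```

A JSON Lines file has to be parsed line by line, whereas the array form can be loaded with a single `json.loads` call; which one is appropriate depends on what the backend expects.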