# ADM Homework 4 Group 11

## 1) Does basic house information reflect the house's description?
In this assignment we will perform a clustering analysis of house announcements in Rome from Immobiliare.it.

Let's start by preparing the environment and loading the libraries:

In [1]:
import pandas as pd
from bs4 import BeautifulSoup
from requests import get
import csv
import re

We'll scrape the data from the website, starting from this URL:

https://www.immobiliare.it/vendita-case/roma/?criterio=rilevanza&pag=1

In the URL we can see a parameter (pag) that controls the pagination of the results. Each page contains 25 announcements.

To reach at least 10,000 announcements, we need to scrape at least 400 pages (10,000 / 25 = 400).
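
The page count is a ceiling division. Here is a minimal sketch of that arithmetic and of how the page URLs can be built; the names `ANNOUNCES_PER_PAGE`, `TARGET` and `page_url` are illustrative and are not used elsewhere in the notebook:

```python
import math

BASE_URL = 'https://www.immobiliare.it/vendita-case/roma/?criterio=rilevanza&pag='
ANNOUNCES_PER_PAGE = 25   # announcements shown on each result page
TARGET = 10_000           # minimum number of announcements we want

# Ceiling division: the smallest number of pages that covers the target
pages_needed = math.ceil(TARGET / ANNOUNCES_PER_PAGE)

def page_url(page):
    """Build the URL of a result page (pages are numbered from 1)."""
    return BASE_URL + str(page)

print(pages_needed)   # 400
print(page_url(1))
```

Since some announcements are filtered out later (new constructions, pages with missing fields), scraping a few pages beyond this minimum leaves a margin.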

First we create the function that returns the URLs of the announcements listed on a page.

In [2]:
def get_announces(url):
    response = get(url)

    html_soup = BeautifulSoup(response.text, 'html.parser')
    announce_containers = html_soup.find_all('p', class_ = 'titolo text-primary')
    
    urls = []
    
    for container in announce_containers:
        if "/nuove_costruzioni/" not in container.a['href']: 
            urls.append(container.a['href'])
        
    return urls

Let's create a list with all the announcement URLs we need. We save it in a CSV file to avoid scraping all the pages again.

In [3]:
#url_list = []

#for i in range(1,450):
#    url = 'https://www.immobiliare.it/vendita-case/roma/?criterio=rilevanza&pag='
#    url_list = url_list + get_announces(url + str(i))

#with open('data/url_list.csv', 'w+', newline='') as myfile:
#    wr = csv.writer(myfile, quoting=csv.QUOTE_ALL)
#    for url in url_list:
#        wr.writerow([url])

In [4]:
url_list = pd.read_csv('data/url_list.csv', header=None)
url_list = url_list[0]
url_list.head()

0    https://www.immobiliare.it/53131931-Vendita-Bi...
1    https://www.immobiliare.it/70420586-Vendita-Bi...
2    https://www.immobiliare.it/70288308-Vendita-Ap...
3    https://www.immobiliare.it/70114826-Vendita-Tr...
4    https://www.immobiliare.it/70355074-Vendita-Tr...
Name: 0, dtype: object
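
The save-then-reload pattern used for the URL list can be checked in isolation. The sketch below writes one URL per row with `csv.QUOTE_ALL` and reads the rows back; the URLs are placeholders and an in-memory buffer stands in for `data/url_list.csv`:

```python
import csv
import io

# Placeholder URLs; in the notebook the list comes from get_announces
urls = ['https://example.com/1-first.html', 'https://example.com/2-second.html']

# In-memory buffer standing in for open('data/url_list.csv', 'w', newline='')
buffer = io.StringIO()
writer = csv.writer(buffer, quoting=csv.QUOTE_ALL)
for url in urls:
    writer.writerow([url])   # one URL per row, wrapped in quotes

# Read the rows back, as a later run of the notebook would
buffer.seek(0)
restored = [row[0] for row in csv.reader(buffer)]

print(restored == urls)   # True
```

Quoting every field keeps a URL intact even if it happens to contain a comma.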

Now we define the function that extracts the info we need from an announcement page.

In [5]:
def get_data(url):
    """Return [[id, rooms, area, bathrooms, floor], [id, description]], or False if any field is missing."""

    announce_id = re.findall(r'(\d+)', url)[0] # Get announce ID parsing the url

    response = get(url)

    html_soup = BeautifulSoup(response.text, 'html.parser')
    data_container = html_soup.find('ul', class_ = 'list-inline list-piped features__list')

    if data_container is None:
        return False

    # Feature label on the page -> tag that holds its value
    # (rooms, surface area, bathrooms, floor)
    value_tags = {'locali': 'span', 'superficie': 'span', 'bagni': 'span', 'piano': 'abbr'}
    features = {}

    # Visit only the direct child tags of the list, skipping whitespace text nodes
    for item in data_container.find_all(True, recursive=False):
        label = item.find('div', class_ = 'features__label')
        if label is None or not label.contents:
            continue
        name = str(label.contents[0])
        if name in value_tags:
            value = item.find(value_tags[name], class_ = 'text-bold')
            if value is not None and value.contents:
                features[name] = re.sub('[^A-Za-z0-9]+', '', value.contents[0])

    # Extract the description (once, outside the loop)
    try:
        description = html_soup.find('div', class_ = 'col-xs-12 description-text text-compressed').div.contents[0]
        description = re.sub('[^a-zA-Z0-9-_*. ]', '', description) # Remove special characters
        description = description.lstrip(' ') # Remove leading blank spaces
    except (AttributeError, IndexError, TypeError):
        return False

    # Discard the announce unless all four features were found
    fields = ['locali', 'superficie', 'bagni', 'piano']
    if not all(name in features for name in fields):
        return False

    return [[announce_id] + [features[name] for name in fields], [announce_id, description]]

In [6]:
get_data('https://www.immobiliare.it/70355074-Vendita-Trilocale-viale-Cortina-D-Ampezzo-Roma.html')

False
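
The two regular expressions in `get_data` can be tried without any network access. In this sketch the helper names `extract_id` and `clean_description` are illustrative, the URL is one of those in the list, and the sample sentence is made up for the example:

```python
import re

def extract_id(url):
    """Return the first run of digits in the URL, i.e. the announcement ID."""
    return re.findall(r'(\d+)', url)[0]

def clean_description(text):
    """Keep only ASCII letters, digits, '-', '_', '*', '.' and spaces."""
    return re.sub('[^a-zA-Z0-9-_*. ]', '', text).lstrip(' ')

# A street number later in the URL does not matter: only the first match is used
url = 'https://www.immobiliare.it/69389486-Vendita-Trilocale-via-Cassia-1684-Roma.html'
print(extract_id(url))   # 69389486

# Made-up sample text with an apostrophe, a comma and an accented letter
sample = "  Nelle vicinanze del Parco dell'Appia Antica, una delle vie più importanti"
print(clean_description(sample))
# Nelle vicinanze del Parco dellAppia Antica una delle vie pi importanti
```

Note that the character class also drops accented letters and apostrophes, so Italian words can lose characters (for example "più" becomes "pi").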

Now we can iterate over the URL list, extracting all the data into two dataframes.

To save execution time on later runs, we also save the two dataframes in two CSV files.

In [19]:
import time
from requests.exceptions import ConnectionError # Raised by requests; the built-in ConnectionError would not catch it

data_df = pd.DataFrame(columns = ['ID','Rooms','Area','Bathrooms','Floor'])

description_df = pd.DataFrame(columns = ['ID','Description'])

for i in range(0,len(url_list)):
    
    break # Remove this line to execute the full code
    
    print(url_list[i])
    
    # This while loop is needed to retry the request in case of connection error
    while True:
        try:
            result = get_data(url_list[i]) # Request each page only once
            if result:

                # Convert the two lists to one-row dataframes
                row_data = pd.DataFrame([result[0]], columns = ['ID','Rooms','Area','Bathrooms','Floor'])
                row_description = pd.DataFrame([result[1]], columns = ['ID','Description'])

                # Append results to the two dataframes
                # (pd.concat replaces DataFrame.append, which was removed in pandas 2.0)
                data_df = pd.concat([data_df, row_data])
                description_df = pd.concat([description_df, row_description])
                
                # Create two csv files line by line
                with open('data/data.csv', 'a') as f:
                    row_data.to_csv(f, header=False)
                with open('data/description.csv', 'a') as f:
                    row_description.to_csv(f, header=False)
        
        # Wait two seconds in case of connection error and retry
        except ConnectionError:
            print('Connection Error')
            time.sleep(2)
            continue
        break

https://www.immobiliare.it/69476824-Vendita-Appartamento-via-Ticino-Roma.html
https://www.immobiliare.it/68946947-Vendita-Appartamento-via-ERGISTO-BEZZI-Roma.html
https://www.immobiliare.it/67988503-Vendita-Quadrilocale-via-Raffaele-Aversa-Roma.html
https://www.immobiliare.it/66995631-Vendita-Attico-Mansarda-via-Peirce-Roma.html
https://www.immobiliare.it/70442000-appartamento-in-asta-via-Andrea-Millevoi-801-Roma.html
https://www.immobiliare.it/70412594-Vendita-Quadrilocale-via-Vincenzo-Viara-de-Ricci-Roma.html
https://www.immobiliare.it/69976270-Vendita-Villa-via-Senofonte-Roma.html
https://www.immobiliare.it/69681206-Vendita-Quadrilocale-via-di-Dragone-Roma.html
https://www.immobiliare.it/67528141-appartamento-in-asta-piazza-Euclide-2-Roma.html
https://www.immobiliare.it/68871847-Vendita-Trilocale-via-Italo-Orto-Roma.html
https://www.immobiliare.it/69180402-Vendita-Monolocale-piazzale-Jonio-Roma.html
[... output truncated ...]
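
The retry logic of the loop can be isolated in a small helper. This is only a sketch: `fetch_with_retry` and `flaky_fetch` are hypothetical names, the attempt limit is an addition that the notebook's `while True` loop does not have, and the built-in `ConnectionError` is used so that the example runs offline (with requests, the exception to catch would be `requests.exceptions.ConnectionError`):

```python
import time

def fetch_with_retry(fetch, url, delay=2, max_attempts=5):
    """Call fetch(url), waiting `delay` seconds and retrying on ConnectionError."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fetch(url)
        except ConnectionError:
            if attempt == max_attempts:
                raise   # give up after the last attempt
            time.sleep(delay)

# Hypothetical stand-in for get_data: fails twice, then succeeds
calls = {'count': 0}

def flaky_fetch(url):
    calls['count'] += 1
    if calls['count'] < 3:
        raise ConnectionError('simulated network failure')
    return 'data for ' + url

result = fetch_with_retry(flaky_fetch, 'https://example.com/announce', delay=0)
print(result, calls['count'])
```

Bounding the number of attempts avoids looping forever on a URL that never becomes reachable.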

Once the extraction is complete, we can import the data from the CSV files.
We also need to do some cleaning: dropping the saved index column, adding column names and removing duplicates.

In [26]:
data_df = pd.read_csv('data/data.csv', header=None)
data_df = data_df.drop([0], axis=1)
data_df.columns = ['ID','Rooms','Area','Bathrooms','Floor']
data_df = data_df.drop_duplicates()
data_df.head()

|   | ID | Rooms | Area | Bathrooms | Floor |
|---|---|---|---|---|---|
| 0 | 53131931 | 2 | 50 | 1 | 1 |
| 1 | 70420586 | 2 | 70 | 1 | 5 |
| 2 | 70288308 | 5 | 140 | 2 | 2 |
| 3 | 70114826 | 3 | 105 | 2 | 1 |
| 4 | 69659060 | 5 | 160 | 2 | 4 |
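
The cleaning steps can be reproduced on a tiny in-memory file. The rows below are made up; they follow the layout produced by `to_csv(f, header=False)` on a one-row dataframe, where the first field is the dataframe index:

```python
import io
import pandas as pd

# Made-up rows: saved index first (always 0 for a one-row frame), then the fields
raw = io.StringIO(
    "0,111,2,50,1,1\n"
    "0,222,3,80,2,4\n"
    "0,111,2,50,1,1\n"   # the same announcement scraped twice
)

df = pd.read_csv(raw, header=None)
df = df.drop([0], axis=1)   # drop the saved index column
df.columns = ['ID', 'Rooms', 'Area', 'Bathrooms', 'Floor']
df = df.drop_duplicates()   # keep the first copy of each identical row

print(len(df))              # 2
print(df['ID'].tolist())    # [111, 222]
```

Duplicates are plausible in the real files because the same announcement can appear on more than one result page.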


In [27]:
description_df = pd.read_csv('data/description.csv', header=None)
description_df = description_df.drop([0], axis=1)
description_df.columns = ['ID','Description']
description_df = description_df.drop_duplicates()
description_df.head()

|   | ID | Description |
|---|---|---|
| 0 | 53131931 | PAPILLO EUR in elegante complesso residenziale... |
| 1 | 70420586 | Prenestina Appartamento in Vendita adiacente L... |
| 2 | 70288308 | Nelle vicinanze del Parco dellAppia Antica e d... |
| 3 | 70114826 | Proponiamo in vendita in via Genserico Fontana... |
| 4 | 69659060 | Nel quartiere Prati in una delle vie pi import... |
