# ZOOPLA HOUSE PRICE DATA SCRAPING USING BEAUTIFUL SOUP
In this project, I scrape new-home listings for London from zoopla.co.uk using BeautifulSoup. I will use this data to analyse trends in the housing market compared with last year.

For scraping data from this website, I'll perform the following tasks:

[**Task 1**](#task1): Importing the libraries

[**Task 2**](#task2): Creating the base url and choosing the header

[**Task 3**](#task3): Extracting product links on the first page

[**Task 4**](#task4): Extracting product links on all the pages

[**Task 5**](#task5): Extracting information of the first product

[**Task 6**](#task6): Extracting information of all the products

<a id='task1'></a>
# Task 1: Importing the libraries

In [1]:
from bs4 import BeautifulSoup
import requests
import pandas as pd

<a id='task2'></a>
# Task 2: Creating the base url and choosing the header



In [2]:
base_url = 'https://www.zoopla.co.uk/'
header = {
    'user-agent': "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.157 Safari/537.36"
}

<a id='task3'></a>
# Task 3: Extracting product links on the first page




In [3]:
source = requests.get('https://www.zoopla.co.uk/new-homes/property/london/?q=London&results_sort=newest_listings&search_source=new-homes', headers = header)
soup = BeautifulSoup(source.content, 'lxml')

In [4]:
productlist = soup.find_all('div', class_ = 'listing-results-right clearfix')


In [5]:
for item in productlist:
    for link in item.find_all('a', href = True, class_ = 'listing-results-price text-price'):
        print(link['href'])

/new-homes/details/57671483
/new-homes/details/57671066
/new-homes/details/57671056
/new-homes/details/57671434
/new-homes/details/57671435
/new-homes/details/57671439
/new-homes/details/57671442
/new-homes/details/57671444
/new-homes/details/57671430
/new-homes/details/57671432
/new-homes/details/57671426
/new-homes/details/57671431
/new-homes/details/57671433
/new-homes/details/57671436
/new-homes/details/57671437
/new-homes/details/57671438
/new-homes/details/57671440
/new-homes/details/57671441
/new-homes/details/57671443
/new-homes/details/57671445
/new-homes/details/57671034
/new-homes/details/57671020
/new-homes/details/57671018
/new-homes/details/57671024
/new-homes/details/53742286
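
The hrefs printed above are relative paths, so they need the site root in front of them before they can be requested. A minimal sketch of one way to do this with the standard library's `urljoin`; `BASE_URL` and `relative_links` are illustrative names, and the two hrefs are copied from the output above.

```python
from urllib.parse import urljoin

BASE_URL = "https://www.zoopla.co.uk/"

# Two relative hrefs copied from the printed output.
relative_links = [
    "/new-homes/details/57671483",
    "/new-homes/details/57671066",
]

# urljoin resolves each href against the base, so a trailing slash on the
# base and a leading slash on the href do not produce a double slash.
absolute_links = [urljoin(BASE_URL, href) for href in relative_links]

for url in absolute_links:
    print(url)
```

Plain string concatenation also works, as long as exactly one slash ends up between the host and the path.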


<a id='task4'></a>
# Task 4: Extracting product links on all the pages




In [6]:
productlinks = []
for i in range(1,30):  # result pages 1 to 29
    source = requests.get(f'https://www.zoopla.co.uk/new-homes/property/london/?identifier=london&q=London&new_homes=only&search_source=new-homes&radius=0&pn={i}', headers = header)
    soup = BeautifulSoup(source.content, 'lxml')
    productlist = soup.find_all('div', class_ = 'listing-results-right clearfix')
    for item in productlist:
        for link in item.find_all('a', href = True, class_ = 'listing-results-price text-price'):
            # base_url ends with '/' and each href starts with '/', so drop one of the slashes
            productlinks.append(base_url.rstrip('/') + link['href'])
#print(productlinks)

In [7]:
print(len(productlinks))

725
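
725 links is 29 pages of 25 links each. The notebook does not check whether every link is distinct, and I am only assuming that repeats are possible (for example if listings shift between pages while the loop runs). A small sketch of removing repeats while keeping the original order; `dedupe_keep_order` and `sample_links` are hypothetical names.

```python
def dedupe_keep_order(links):
    """Return links with repeats removed, keeping first-seen order."""
    # dict keys are unique and preserve insertion order (Python 3.7+).
    return list(dict.fromkeys(links))


sample_links = [
    "https://www.zoopla.co.uk/new-homes/details/57671483",
    "https://www.zoopla.co.uk/new-homes/details/57671066",
    "https://www.zoopla.co.uk/new-homes/details/57671483",  # repeat of the first
]

unique_links = dedupe_keep_order(sample_links)
print(len(sample_links), "collected,", len(unique_links), "unique")
```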


<a id='task5'></a>
# Task 5: Extracting information of the first product

In [8]:
testlink = 'https://www.zoopla.co.uk/new-homes/details/56572113?search_identifier=79819001d635d79c7f10a7fb6a12fb4e'
source = requests.get(testlink, headers = header)
soup = BeautifulSoup(source.content, 'lxml')

In [9]:
name = soup.find('article', class_ = 'dp-sidebar-wrapper__summary').h1.text
print(name)

3 bed flat for sale


In [10]:
property_address = soup.find('article', class_ = 'dp-sidebar-wrapper__summary').h2.text
print(property_address)

Buckhold Road, Wandsworth, London SW18


In [11]:
guide_price = soup.find('p', class_ = 'ui-pricing__main-price ui-text-t4').text
print(guide_price)

£1,450,000
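
The guide price comes back as text with a pound sign and thousands separators, which cannot be averaged or compared directly. A minimal sketch of converting it to an integer for later analysis; `parse_price` is a hypothetical helper and it assumes prices are whole pounds.

```python
import re


def parse_price(text):
    """Convert a string such as '£1,450,000' to an int, or None if it has no digits."""
    digits = re.sub(r"[^\d]", "", text)  # drop everything except digits
    return int(digits) if digits else None


print(parse_price("£1,450,000"))      # 1450000
print(parse_price("No information"))  # None
```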


In [12]:
nob = soup.find('article', class_ = 'dp-sidebar-wrapper__summary').h1.text
nob = nob.split()
nob = nob[0]
print(nob)

3
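
Taking the first word of the title works for "3 bed flat for sale", but a title such as "Studio for sale" would give "Studio" as the bedroom count. A sketch of a slightly stricter parser; `parse_bedrooms` is a hypothetical helper, and counting a studio as 0 bedrooms is my assumption, not something the site states.

```python
import re


def parse_bedrooms(title):
    """Return the bedroom count from a listing title, 0 for a studio, else None."""
    match = re.match(r"\s*(\d+)\s+bed", title, flags=re.IGNORECASE)
    if match:
        return int(match.group(1))
    if title.strip().lower().startswith("studio"):
        return 0  # assumption: treat a studio as having no separate bedroom
    return None


for title in ["3 bed flat for sale", "Studio for sale", "No information"]:
    print(title, "->", parse_bedrooms(title))
```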


In [13]:
avg_estimated_price_area = soup.find('span', class_ = 'dp-market-stats__price ui-text-t4').text.strip()
print(avg_estimated_price_area)

£558,616


In [14]:
result = []
for li in soup.find('ul', class_ = 'dp-market-stats__list ui-list-flat').find_all('li'):
    result.append(list(li.stripped_strings))

print(result)

[['Average sale price', '£545,588'], ['Properties sold', '185']]


In [15]:
avg_sale_price_area = result[0].pop()
print(avg_sale_price_area)

£545,588


In [16]:
no_of_properties_sold_area = result[1].pop()
print(no_of_properties_sold_area)

185
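
Reading the statistics by position (`result[0]`, `result[1]`) relies on the list items always appearing in the same order. A sketch of an alternative that looks values up by their label instead; `rows` and `stats` are illustrative names, and the data is copied from the printed `result` above.

```python
# Label/value pairs as printed for the market-stats list.
rows = [["Average sale price", "£545,588"], ["Properties sold", "185"]]

# Map each label to its value, skipping any item that lacks a value.
stats = {row[0]: row[-1] for row in rows if len(row) >= 2}

print(stats.get("Average sale price", "No information"))
print(stats.get("Properties sold", "No information"))
```

`dict.get` with a default returns the same 'No information' placeholder when a label is absent.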


In [17]:
rent_area_pm = soup.find('div', class_='dp-market-stats dp-market-stats--border-top').span.text.strip()[:-3]
print(rent_area_pm)

£2,457 


In [18]:
areacode = soup.find('h2', class_ = 'ui-property-summary__address').text
area = areacode.split()
a = area[-1]
print(a)

SW18


In [19]:
agent_name = soup.find('div', class_ = 'ui-agent__text').h4.text
print(agent_name)

Strawberry Star - Nine Elms


In [20]:
agent_address = soup.find('address', class_ = 'ui-agent__address').text
print(agent_address)

Sky Gardens, 157 Wandsworth Road, London, SW8 2GB


In [21]:
agent_contact = soup.find('a', href = True, class_ = 'ui-link')
contact = agent_contact['href'][-11:]
print(contact)

02080337758


In [22]:
property_age = soup.find('span', class_ = 'ui-tag').text.strip()[:-5]
print(property_age)

New


In [23]:
floorplan = soup.find('a', class_ = 'dp-floorplan-assets__thumbnail js-ui-modal-open').img['data-src']
print(floorplan)

https://lid.zoocdn.com/u/480/360/2abe41e8ca989149b41f022576480f984e68527b.jpg


In [24]:
house = {
    'Property Name':name,
    'Property Address':property_address,
    'Property Age': property_age,
    'Area code':a,
    'Guide Price': guide_price,
    'No. of Bedrooms':nob,
    'Average estimated property price in this area in past 12 months':avg_estimated_price_area,
    'Average property sale price in this area in past 12 months':avg_sale_price_area,
    'No. of properties sold in this area in past 12 months':no_of_properties_sold_area,
    'Rental price of this property (pcm)':rent_area_pm,
    'Name of agent': agent_name,
    'Address of agent': agent_address,
    'Agent contact number': contact,
    'Floor plan link': floorplan
}
print(house)

{'Property Name': '3 bed flat for sale', 'Property Address': 'Buckhold Road, Wandsworth, London SW18', 'Property Age': 'New', 'Area code': 'SW18', 'Guide Price': '£1,450,000', 'No. of Bedrooms': '3', 'Average estimated property price in this area in past 12 months': '£558,616', 'Average property sale price in this area in past 12 months': '£545,588', 'No. of properties sold in this area in past 12 months': '185', 'Rental price of this property (pcm)': '£2,457 ', 'Name of agent': 'Strawberry Star - Nine Elms', 'Address of agent': 'Sky Gardens, 157 Wandsworth Road, London, SW8 2GB', 'Agent contact number': '02080337758', 'Floor plan link': 'https://lid.zoocdn.com/u/480/360/2abe41e8ca989149b41f022576480f984e68527b.jpg'}
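
Most of the lookups in this task follow the same pattern: find one element by tag and class, then read its text. On a page where that element is missing, `soup.find` returns `None` and the attribute access fails. A sketch of a small helper that returns a placeholder instead; `text_or_default` is a hypothetical name, and the HTML is a made-up fragment that only mimics the class names used above, parsed with the built-in `html.parser`.

```python
from bs4 import BeautifulSoup


def text_or_default(soup, tag, class_name, default="No information"):
    """Return the stripped text of the first matching element, or default."""
    element = soup.find(tag, class_=class_name)
    return element.get_text(strip=True) if element is not None else default


# Made-up fragment for illustration; it is not real page source.
sample_html = """
<article class="dp-sidebar-wrapper__summary">
  <h1>3 bed flat for sale</h1>
</article>
<p class="ui-pricing__main-price ui-text-t4">£1,450,000</p>
"""
soup_sample = BeautifulSoup(sample_html, "html.parser")

price = text_or_default(soup_sample, "p", "ui-pricing__main-price ui-text-t4")
agent = text_or_default(soup_sample, "address", "ui-agent__address")
print(price)  # present in the fragment
print(agent)  # absent, so the placeholder is returned
```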


<a id='task6'></a>
# Task 6: Extracting information of all the products





In [25]:
houselist = []

for link in productlinks:
    source = requests.get(link, headers = header)
    soup = BeautifulSoup(source.content, 'lxml')

    # Each lookup falls back to a placeholder when the element is missing from the page.
    # 'except Exception' (rather than a bare 'except') still lets Ctrl-C stop the loop.
    try:
        name = soup.find('article', class_ = 'dp-sidebar-wrapper__summary').h1.text
    except Exception:
        name = 'No information'

    try:
        property_address = soup.find('article', class_ = 'dp-sidebar-wrapper__summary').h2.text
    except Exception:
        property_address = 'No information'

    try:
        guide_price = soup.find('p', class_ = 'ui-pricing__main-price ui-text-t4').text
    except Exception:
        guide_price = 'No information'

    try:
        nob = soup.find('article', class_ = 'dp-sidebar-wrapper__summary').h1.text
        nob = nob.split()
        nob = nob[0]
    except Exception:
        nob = 'No information'

    try:
        avg_estimated_price_area = soup.find('span', class_ = 'dp-market-stats__price ui-text-t4').text.strip()
    except Exception:
        avg_estimated_price_area = 'No information'

    # Guard the market-stats list too: soup.find returns None when the list is absent.
    result = []
    try:
        for li in soup.find('ul', class_ = 'dp-market-stats__list ui-list-flat').find_all('li'):
            result.append(list(li.stripped_strings))
    except Exception:
        pass
    try:
        avg_sale_price_area = result[0].pop()
    except Exception:
        avg_sale_price_area = 'No information'
    try:
        no_of_properties_sold_area = result[1].pop()
    except Exception:
        no_of_properties_sold_area = 'No information'

    try:
        rent_area_pm = soup.find('div', class_='dp-market-stats dp-market-stats--border-top').span.text.strip()[:-3]
    except Exception:
        rent_area_pm = 'No information'

    try:
        areacode = soup.find('h2', class_ = 'ui-property-summary__address').text
        area = areacode.split()
        a = area[-1]
    except Exception:
        a = 'No information'

    try:
        agent_name = soup.find('div', class_ = 'ui-agent__text').h4.text
    except Exception:
        agent_name = 'No information'

    try:
        agent_address = soup.find('address', class_ = 'ui-agent__address').text
    except Exception:
        agent_address = 'No information'

    try:
        agent_contact = soup.find('a', href = True, class_ = 'ui-link')
        contact = agent_contact['href'][-11:]
    except Exception:
        contact = 'No information'

    try:
        property_age = soup.find('span', class_ = 'ui-tag').text.strip()[:-5]
    except Exception:
        property_age = 'No information'

    try:
        floorplan = soup.find('a', class_ = 'dp-floorplan-assets__thumbnail js-ui-modal-open').img['data-src']
    except Exception:
        floorplan = 'Not given'

    house = {
        'Property Name':name,
        'Property Address':property_address,
        'Property Age': property_age,
        'Area code':a,
        'Guide Price': guide_price,
        'No. of Bedrooms':nob,
        'Average estimated property price in this area in past 12 months':avg_estimated_price_area,
        'Average property sale price in this area in past 12 months':avg_sale_price_area,
        'No. of properties sold in this area in past 12 months':no_of_properties_sold_area,
        'Rental price of this property (pcm)':rent_area_pm,
        'Name of agent': agent_name,
        'Address of agent': agent_address,
        'Agent contact number': contact,
        'Floor plan link': floorplan
    }

    houselist.append(house)
    print('Saving:',house['Property Name'])

df = pd.DataFrame(houselist)

Saving: 1 bed flat for sale
Saving: 2 bed flat for sale
Saving: 2 bed flat for sale
Saving: 2 bed flat for sale
Saving: 1 bed flat for sale
Saving: 3 bed flat for sale
Saving: 2 bed flat for sale
Saving: 1 bed flat for sale
Saving: 3 bed flat for sale
Saving: 1 bed flat for sale
Saving: 3 bed flat for sale
Saving: 1 bed flat for sale
Saving: 2 bed flat for sale
Saving: 3 bed flat for sale
Saving: 1 bed flat for sale
Saving: 1 bed flat for sale
Saving: 1 bed flat for sale
Saving: 1 bed flat for sale
Saving: 2 bed flat for sale
Saving: 2 bed flat for sale
Saving: 1 bed flat for sale
Saving: 2 bed flat for sale
Saving: 2 bed flat for sale
Saving: 1 bed flat for sale
Saving: 1 bed flat for sale
Saving: 1 bed flat for sale
Saving: 1 bed flat for sale
Saving: 2 bed flat for sale
Saving: 3 bed flat for sale
Saving: 1 bed flat for sale
Saving: 3 bed flat for sale
Saving: 2 bed flat for sale
Saving: Studio for sale
Saving: 1 bed flat for sale
Saving: 3 bed terraced house for sale
...
Saving: 3 bed flat for sale
Saving: 2 bed flat for sale
Saving: 2 bed flat for sale
Saving: 1 bed flat for sale
Saving: 2 bed flat for sale
Saving: 3 bed flat for sale
Saving: 2 bed flat for sale
Saving: 1 bed flat for sale
Saving: 2 bed flat for sale
Saving: 3 bed town house for sale
Saving: Studio for sale
Saving: 3 bed flat for sale
Saving: 1 bed flat for sale
Saving: 1 bed flat for sale
Saving: 2 bed flat for sale
Saving: 1 bed flat for sale
Saving: 3 bed flat for sale
Saving: 1 bed flat for sale
Saving: 1 bed flat for sale
Saving: 1 bed flat for sale
Saving: 2 bed flat for sale
Saving: 2 bed semi-detached house for sale
Saving: 2 bed flat for sale
Saving: 2 bed flat for sale
Saving: 2 bed flat for sale
Saving: 1 bed flat for sale
Saving: 1 bed flat for sale
Saving: 1 bed flat for sale
Saving: 2 bed flat for sale
Saving: 4 bed flat for sale
Saving: 2 bed flat for sale
Saving: 2 bed flat for sale
Saving: 3 bed flat for sale
Saving: 1 bed flat for sale
Saving: 2 bed flat for sale
...
Saving: 1 bed flat for sale
Saving: 1 bed flat for sale
Saving: 2 bed flat for sale
Saving: 1 bed flat for sale
Saving: 1 bed flat for sale
Saving: 1 bed flat for sale
Saving: 1 bed flat for sale
Saving: 1 bed flat for sale
Saving: Studio for sale
Saving: 3 bed flat for sale
Saving: 1 bed flat for sale
Saving: 1 bed flat for sale
Saving: 1 bed flat for sale
Saving: 2 bed flat for sale
Saving: 1 bed flat for sale
Saving: 1 bed flat for sale
Saving: 1 bed flat for sale
Saving: 3 bed flat for sale
Saving: 1 bed flat for sale
Saving: 2 bed flat for sale
Saving: 1 bed flat for sale
Saving: 3 bed flat for sale
Saving: 2 bed flat for sale
Saving: 3 bed flat for sale
Saving: 1 bed flat for sale
Saving: 1 bed flat for sale
Saving: 2 bed flat for sale
Saving: 2 bed flat for sale
Saving: 2 bed flat for sale
Saving: 1 bed flat for sale
Saving: 1 bed flat for sale
Saving: 1 bed flat for sale
Saving: 2 bed flat for sale
Saving: 3 bed flat for sale
Saving: 2 bed flat for sale
...

# FINAL DATA 

In [26]:
df

Unnamed: 0,Property Name,Property Address,Property Age,Area code,Guide Price,No. of Bedrooms,Average estimated property price in this area in past 12 months,Average property sale price in this area in past 12 months,No. of properties sold in this area in past 12 months,Rental price of this property (pcm),Name of agent,Address of agent,Agent contact number,Floor plan link
0,1 bed flat for sale,"Church Road, London SE19",New,SE19,"£85,000",1,"£333,045","£385,761",88,"£1,079",Brick By Brick - Auckland Rise & Sylvan Hill,"427 Church Road, Croydon, SE19 2QL",02080332909,https://lid.zoocdn.com/u/480/360/59355026ae544...
1,2 bed flat for sale,"Lionel Road South, Kew Bridge TW8",New,TW8,"£600,000",2,"£476,960","£385,675",54,"£1,959",EcoWorld London - Verdo,"Lionel Road S, London, TW8 0JA",02080223244,https://lid.zoocdn.com/u/480/360/efe194ddbadd3...
2,2 bed flat for sale,"Lionel Road South, Kew Bridge TW8",New,TW8,"£635,000",2,"£476,960","£385,675",54,"£1,959",EcoWorld London - Verdo,"Lionel Road S, London, TW8 0JA",02080223244,https://lid.zoocdn.com/u/480/360/0118da23be282...
3,2 bed flat for sale,"Swift House, Southmere, Thamesmead SE2",New,SE2,"£434,600",2,"£216,237","£216,363",11,£984,CBRE,"Millennium Harbour, 22 Westferry Road, London,...",02081155164,Not given
4,1 bed flat for sale,"Swift Court, Southmere, Thamesmead SE2",New,SE2,"£296,400",1,"£216,237","£216,363",11,£750,CBRE,"Millennium Harbour, 22 Westferry Road, London,...",02081155164,Not given
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
720,1 bed flat for sale,"Calum Court, Central Purley, Purley CR8",New,CR8,"£335,000",1,"£285,034","£265,194",27,£849,Foxtons - New Homes Greater South London,"2 High Street, Croydon, CR0 1YA",02080221811,https://lid.zoocdn.com/u/480/360/c7a6d53940e25...
721,1 bed flat for sale,"Calum Court, Central Purley, Purley CR8",New,CR8,"£345,000",1,"£285,034","£265,194",27,£849,Foxtons - New Homes Greater South London,"2 High Street, Croydon, CR0 1YA",02080221811,https://lid.zoocdn.com/u/480/360/11d2cc3ecb835...
722,1 bed flat for sale,"Arc, Wallington SM6",New,SM6,"£340,000",1,"£275,070","£259,964",62,£975,Foxtons - New Homes South West,"173-177 Clarence Street, Kingston, KT1 1QT",02080220913,https://lid.zoocdn.com/u/480/360/26d19c1c17ab5...
723,2 bed flat for sale,"Calum Court, Central Purley, Purley CR8",New,CR8,"£464,999",2,"£285,034","£265,194",27,"£1,265",Foxtons - New Homes Greater South London,"2 High Street, Croydon, CR0 1YA",02080221811,https://lid.zoocdn.com/u/480/360/43eade80cb5dc...
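
Every column in the table is stored as text, including the prices and counts, so they need converting before any trend analysis. A minimal sketch of that conversion with pandas; the `sample` frame holds a few values copied from the table plus the scraper's placeholder, and the new column names `guide_price_gbp` and `bedrooms` are hypothetical.

```python
import pandas as pd

# A few values copied from the table, plus the 'No information' placeholder.
sample = pd.DataFrame({
    "Guide Price": ["£85,000", "£600,000", "No information"],
    "No. of Bedrooms": ["1", "2", "Studio"],
})

# Strip the pound sign and thousands separators, then coerce to numbers;
# anything that is not numeric becomes NaN instead of raising an error.
sample["guide_price_gbp"] = pd.to_numeric(
    sample["Guide Price"].str.replace(r"[£,]", "", regex=True),
    errors="coerce",
)
sample["bedrooms"] = pd.to_numeric(sample["No. of Bedrooms"], errors="coerce")

print(sample.dtypes)
print(sample["guide_price_gbp"].mean())  # NaN rows are skipped by default
```

The same two steps could be applied to the other price columns of `df`.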


# CONVERT TO XLSX FILE

In [27]:
df.to_excel("zoopla.xlsx", sheet_name = "House_Data_zoopla")