# UK Property Data Exploration

## Note Dump


- Temporal Trends:
    - Examine trends in property transactions over time using the 'transfer_date' column.
    - Identify patterns or seasonality in property transactions.
- Geospatial Analysis:
    - Visualize the distribution of property transactions on a map based on the 'postcode', 'city', 'district', or 'county'.
    - Analyze variations in property prices across different regions.
- Property Type Analysis:
    - Compare average prices and transaction frequencies for different property types.
    - Explore the distribution of property types in various localities or cities.
- New Build vs. Resale Analysis:
    - Compare prices and transaction volumes between new builds and resale properties.
    - Investigate whether the new build flag influences property prices.
- Duration of Ownership:
    - Analyze the average duration of ownership ('duration') for different types of properties.
    - Explore relationships between duration and property prices.
- Categorical Analysis:
    - Investigate the distribution of properties based on categorical features like 'ppd_category' and 'record_status'.
    - Explore relationships between these categories and property prices.
- Address Analysis:
    - Explore patterns in property transactions based on address-related columns ('street', 'locality', 'city', 'district', 'county').
    - Identify popular streets or localities for property transactions.
- Record Status Analysis:
    - Analyze the distribution and characteristics of records based on the 'record_status' column.
- Predictive Modeling:
    - Develop a predictive model to estimate property prices based on relevant features.
    - Evaluate the model's performance and identify key predictors of property prices.

Remember to clean and preprocess the data appropriately, handle missing values, and ensure the data is in a suitable format for analysis. Additionally, these ideas can be adapted based on specific questions or goals you have in mind for your analysis.
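
As a minimal sketch of the temporal-trends idea, the snippet below counts sales and takes the median price per calendar month. The `sample` frame and the `monthly_seasonality` helper are hypothetical stand-ins; the real analysis would presumably pass `price_paid_data` with `transfer_date` already converted to datetime.

```python
import pandas as pd

# Hypothetical sample rows; the real notebook would use price_paid_data instead.
sample = pd.DataFrame({
    "transfer_date": pd.to_datetime(
        ["2020-01-15", "2020-06-10", "2020-06-25", "2021-06-05", "2021-12-01"]
    ),
    "price": [200_000, 250_000, 260_000, 300_000, 280_000],
})


def monthly_seasonality(df: pd.DataFrame) -> pd.DataFrame:
    """Count sales and take the median price per calendar month (1-12)."""
    month = df["transfer_date"].dt.month.rename("month")
    return (
        df.groupby(month)["price"]
        .agg(transactions="count", median_price="median")
        .reset_index()
    )


seasonality = monthly_seasonality(sample)
print(seasonality)
```

Pooling all years by calendar month is only one way to look for seasonality; grouping by year and month instead would show the long-run trend alongside it.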

User
What are some examples of external data we could use to complement this analysis?
ChatGPT
To enhance your analysis and gain more insights, you can integrate external data sources that provide additional context to the property transactions dataset. Here are some examples of external data you could consider:

- Economic Indicators:
    - Include economic indicators such as inflation rates, interest rates, and GDP growth to understand the broader economic context during the time of property transactions.
- Population Demographics:
    - Integrate demographic data for the regions in your dataset. Information on population density, age distribution, and income levels can provide valuable context for property trends.
- Real Estate Market Indices:
    - Include real estate market indices or housing price indices for the specific regions covered in your dataset. This can help you compare local property price trends with broader market trends.
- Interest Rates:
    - Include historical and current interest rates, as changes in interest rates can impact property demand and prices.
- Crime Rates:
    - Explore crime data for the regions in your dataset. Safety and crime rates can influence property values and buyer decisions.
- School Ratings:
    - If available, include data on school ratings in the areas covered by your dataset. Proximity to well-rated schools can affect property prices.
- Transportation Infrastructure:
    - Integrate information on transportation infrastructure, such as the presence of public transportation, highways, and airports. Proximity to transportation hubs can influence property values.
- Urban Development Plans:
    - Include data on urban development plans or major infrastructure projects in the regions covered by your dataset. This can provide insights into future property value trends.
- Weather Data:
    - Consider including weather data, especially if the dataset spans multiple years. Extreme weather events or climate patterns can have localized effects on property markets.
- Retail and Entertainment Venues:
    - Integrate data on the proximity of properties to shopping centers, entertainment venues, and other amenities. Access to such facilities can impact property values.
- Employment Data:
    - Include data on local employment rates, job opportunities, and industry trends. Economic stability and employment opportunities can influence property demand.
- Mortgage Rates and Lending Conditions:
    - Incorporate information on mortgage rates and lending conditions. Changes in lending practices can affect the ability of buyers to enter the market.

Remember to carefully validate and clean any external data before integration. Additionally, consider the relevance of the external data to the specific questions or hypotheses you are investigating with your property transactions dataset.
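
One way such an external series might be attached to the transactions is a left join on a shared monthly key. This is only a sketch: the `sales` rows are hypothetical, and the `rates` table with its `base_rate_pct` column holds made-up numbers rather than real interest rates.

```python
import pandas as pd

# Hypothetical transactions standing in for price_paid_data.
sales = pd.DataFrame({
    "transfer_date": pd.to_datetime(["2021-01-20", "2021-02-03", "2021-02-17"]),
    "price": [210_000, 305_000, 195_000],
})

# Made-up monthly rates for illustration only; these are not real figures.
rates = pd.DataFrame({
    "month": ["2021-01", "2021-02"],
    "base_rate_pct": [0.10, 0.25],
})

# Derive a "YYYY-MM" key on the transaction side, then left-join so that
# every sale is kept even when no external value exists for its month.
sales["month"] = sales["transfer_date"].dt.strftime("%Y-%m")
enriched = sales.merge(rates, on="month", how="left", validate="many_to_one")
print(enriched)
```

The `validate="many_to_one"` argument makes pandas raise an error if the external table has duplicate months, which would otherwise silently duplicate transactions.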

User
I'm reading Rich Dad Poor Dad; recommend me some other similar books.
ChatGPT
"Rich Dad Poor Dad" by Robert Kiyosaki is a classic personal finance book that emphasizes financial education and building wealth. If you enjoyed that book, here are some other personal finance and wealth-building books that you might find interesting:

- "The Millionaire Next Door" by Thomas J. Stanley and William D. Danko:
    - This book studies the habits and characteristics of millionaires in the United States, challenging common stereotypes and providing insights into building wealth.
- "The Richest Man in Babylon" by George S. Clason:
    - This classic book uses parables set in ancient Babylon to offer timeless principles of wealth-building and financial success.
- "Your Money or Your Life" by Vicki Robin and Joe Dominguez:
    - This book focuses on transforming your relationship with money and achieving financial independence. It offers a holistic approach to managing money and life.
- "The Total Money Makeover" by Dave Ramsey:
    - Dave Ramsey provides a step-by-step plan for getting out of debt, building an emergency fund, and achieving financial stability.
- "The Bogleheads' Guide to Investing" by Taylor Larimore, Mel Lindauer, and Michael LeBoeuf:
    - This book offers practical advice on investing based on the principles of John C. Bogle, the founder of Vanguard Group, emphasizing simplicity and low-cost index fund investing.
- "The Intelligent Investor" by Benjamin Graham:
    - Often referred to as the "bible of investing," this classic by Benjamin Graham provides timeless principles for value investing and sound investment strategies.
- "I Will Teach You to Be Rich" by Ramit Sethi:
    - Ramit Sethi provides a practical, no-nonsense approach to personal finance, covering topics such as budgeting, investing, and automating your financial life.
- "The Little Book of Common Sense Investing" by John C. Bogle:
    - In this book, Bogle advocates for the simplicity and effectiveness of index fund investing, explaining why low-cost, passive investing is a sound strategy for most investors.
- "Secrets of the Millionaire Mind" by T. Harv Eker:
    - T. Harv Eker explores the mindset and beliefs that differentiate wealthy individuals from others, offering insights on how to reprogram your financial blueprint for success.
- "Money: Master the Game" by Tony Robbins:
    - Tony Robbins interviews successful investors and financial experts to provide a comprehensive guide to achieving financial freedom and mastering the game of money.

Remember to approach each book with an open mind and consider how the principles and strategies align with your personal financial goals and values.

## Questions
- What is the distribution of property prices in the dataset?
- What are the summary statistics of house prices?
- How have property prices changed over time?
- What are the average property prices for different property types?
- Is there a significant difference in property prices between new builds and resale properties?
- Does the price of new builds decrease faster over time compared to existing properties?
- Which areas have the highest and lowest property prices? 
- How does the property price vary based on postcode?
- Are there any temporal patterns in property transactions, such as seasonality?
- What is the average duration of property ownership?
    - Does this change over time?
- Are there any correlations between property prices and other factors like property size or number of bedrooms?
- How do different property categories ('ppd_category') affect transaction prices?
- Is there a correlation between property prices and proximity to certain amenities or landmarks?
- How does the record status ('record_status') affect property prices?
- What is the overall transaction volume trend over the years?
- Are there certain streets or localities that consistently have high or low property prices?
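
Several of these questions reduce to grouped summary statistics. The sketch below uses a small hypothetical `sample` frame with the notebook's column names and codes (`property_type` D/S/T/F, `new_build_flag` Y/N); the prices are invented for illustration.

```python
import pandas as pd

# Hypothetical rows using the notebook's column names and codes.
sample = pd.DataFrame({
    "price": [300_000, 340_000, 150_000, 190_000, 210_000, 400_000],
    "property_type": ["D", "D", "F", "F", "T", "D"],
    "new_build_flag": ["N", "Y", "N", "Y", "N", "N"],
})

# Overall summary statistics for price (count, mean, quartiles, ...).
print(sample["price"].describe())

# Mean, median and count of price per property type.
by_type = sample.groupby("property_type")["price"].agg(["mean", "median", "count"])

# Median price for new builds versus resales.
by_new_build = sample.groupby("new_build_flag")["price"].median()

print(by_type)
print(by_new_build)
```

A difference in medians alone does not show significance; a formal test, and controlling for property type and location, would be a separate step.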

## Analysis Type Ideas
- Dashboard of descriptive statistics


## Import Data

In [1]:
import pandas as pd

# The Price Paid CSV has no header row, so supply the column names directly
column_names = [
    'transaction_id', 'price', 'transfer_date', 'postcode',
    'property_type', 'new_build_flag', 'duration',
    'primary_address', 'secondary_address', 'street', 'locality',
    'city', 'district', 'county', 'ppd_category', 'record_status',
]

price_paid_data = pd.read_csv("pp-complete.csv", header=None, names=column_names)

# Convert 'transfer_date' to datetime format
price_paid_data['transfer_date'] = pd.to_datetime(price_paid_data['transfer_date'])

In [None]:
price_paid_data.head()

Unnamed: 0,transaction_id,price,transfer_date,postcode,property_type,new_build_flag,duration,primary_address,secondary_address,street,locality,city,district,county,ppd_category,record_status
0,{F887F88E-7D15-4415-804E-52EAC2F10958},70000,1995-07-07 00:00,MK15 9HP,D,N,F,31,,ALDRICH DRIVE,WILLEN,MILTON KEYNES,MILTON KEYNES,MILTON KEYNES,A,A
1,{40FD4DF2-5362-407C-92BC-566E2CCE89E9},44500,1995-02-03 00:00,SR6 0AQ,T,N,F,50,,HOWICK PARK,SUNDERLAND,SUNDERLAND,SUNDERLAND,TYNE AND WEAR,A,A
2,{7A99F89E-7D81-4E45-ABD5-566E49A045EA},56500,1995-01-13 00:00,CO6 1SQ,T,N,F,19,,BRICK KILN CLOSE,COGGESHALL,COLCHESTER,BRAINTREE,ESSEX,A,A
3,{28225260-E61C-4E57-8B56-566E5285B1C1},58000,1995-07-28 00:00,B90 4TG,T,N,F,37,,RAINSBROOK DRIVE,SHIRLEY,SOLIHULL,SOLIHULL,WEST MIDLANDS,A,A
4,{444D34D7-9BA6-43A7-B695-4F48980E0176},51000,1995-06-28 00:00,DY5 1SA,S,N,F,59,,MERRY HILL,BRIERLEY HILL,BRIERLEY HILL,DUDLEY,WEST MIDLANDS,A,A


In [10]:
import plotly.express as px
import pandas as pd
import dash
from dash import dcc, html
from dash.dependencies import Input, Output
import numpy as np


# Create a Dash web application
app = dash.Dash(__name__)

# Layout of the web application
app.layout = html.Div([
    html.H1("Aggregated Distribution of Property Prices"),
    
    # Dropdown menu for selecting the year
    dcc.Dropdown(
        id='year-dropdown',
        options=[{'label': year, 'value': year} for year in price_paid_data['transfer_date'].dt.year.unique()],
        value=price_paid_data['transfer_date'].dt.year.max(),  # Default to the latest year
        multi=False,
        style={'width': '50%'}
    ),

    # Plotly bar chart
    dcc.Graph(id='property-price-bar-chart'),
])

# Callback to update the bar chart based on the selected year
@app.callback(
    Output('property-price-bar-chart', 'figure'),
    [Input('year-dropdown', 'value')]
)
def update_bar_chart(selected_year):
    # Filter data based on the selected year; copy so the new column below is
    # added to an independent frame rather than a slice of price_paid_data
    subset = price_paid_data[price_paid_data['transfer_date'].dt.year == selected_year].copy()

    # Define bins and labels for price ranges
    bins = [0, 100000, 200000, 300000, 400000, 500000, 1000000, float('inf')]
    labels = ['0-100k', '100k-200k', '200k-300k', '300k-400k', '400k-500k', '500k-1M', '1M+']
    
    # Create a new column for price ranges
    subset['price_range'] = pd.cut(subset['price'], bins=bins, labels=labels)
    
    # Aggregate data based on price ranges (observed=False keeps empty bins)
    aggregated_data = subset.groupby('price_range', observed=False).size().reset_index(name='count')

    # Create the bar chart using Plotly Express
    fig = px.bar(aggregated_data, x='price_range', y='count', labels={'price_range': 'Price Range', 'count': 'Count'})
    fig.update_layout(title=f'Aggregated Distribution of Property Prices - Year {selected_year}')

    return fig

# Run the Dash web application
if __name__ == '__main__':
    app.run_server(debug=True)
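
The binning step in the callback can be checked on its own, without Dash. The sketch below reuses the same `bins` and `labels` on a handful of hypothetical prices picked to sit on the bin edges.

```python
import pandas as pd

# Hypothetical prices, including values exactly on bin edges.
prices = pd.Series([95_000, 100_000, 250_000, 999_999, 1_000_000, 2_500_000])

bins = [0, 100_000, 200_000, 300_000, 400_000, 500_000, 1_000_000, float("inf")]
labels = ["0-100k", "100k-200k", "200k-300k", "300k-400k", "400k-500k", "500k-1M", "1M+"]

# pd.cut uses right-closed intervals by default, so 100,000 falls in "0-100k"
# and 1,000,000 falls in "500k-1M".
price_range = pd.cut(prices, bins=bins, labels=labels)

# value_counts on a categorical includes empty bins with a count of zero;
# reindex puts the bins back in their defined order.
counts = price_range.value_counts().reindex(labels)
print(counts)
```

Because the intervals are right-closed, a price of exactly 0 would fall outside the first bin and become NaN; passing `include_lowest=True` to `pd.cut` would cover that case if it matters.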



In [12]:
import plotly.express as px
import pandas as pd
import dash
from dash import dcc, html
from dash.dependencies import Input, Output
import numpy as np

# Create a Dash web application
app = dash.Dash(__name__)

# Layout of the web application
app.layout = html.Div([
    html.H1("Aggregated Distribution of Property Prices"),
    
    # Dropdown menu for selecting the year
    dcc.Dropdown(
        id='year-dropdown',
        options=[{'label': year, 'value': year} for year in price_paid_data['transfer_date'].dt.year.unique()],
        value=price_paid_data['transfer_date'].dt.year.max(),  # Default to the latest year
        multi=False,
        style={'width': '50%'}
    ),

    # Plotly histogram
    dcc.Graph(id='property-price-histogram'),
])

# Callback to update the histogram based on the selected year
@app.callback(
    Output('property-price-histogram', 'figure'),
    [Input('year-dropdown', 'value')]
)
def update_histogram(selected_year):
    # Filter data based on the selected year
    subset = price_paid_data[price_paid_data['transfer_date'].dt.year == selected_year]

    # Drop the top 0.1% of prices as a simple outlier filter (adjust as needed)
    subset = subset[subset['price'] < subset['price'].quantile(0.999)]

    # Create the histogram using Plotly Express with nbins argument and log_y
    fig = px.histogram(subset, x='price', labels={'price': 'Property Price', 'count': 'Density'},
                       nbins=100, histnorm='probability density', log_y=True)
    fig.update_layout(title=f'Aggregated Distribution of Property Prices - Year {selected_year}')

    return fig

# Run the Dash web application
if __name__ == '__main__':
    app.run_server(debug=True)


## Import required libraries

In [15]:
import pandas as pd

## Import data and fix data issues

### Data links and info
- Price Paid Data downloads: https://www.gov.uk/government/statistical-data-sets/price-paid-data-downloads
- About the Price Paid Data (including excluded data): https://www.gov.uk/guidance/about-the-price-paid-data#data-excluded-from-price-paid-data

In [18]:
price_paid_data.transfer_date.max()

'2023-09-29 00:00'

In [19]:
price_paid_data.transfer_date.min()

'1995-01-01 00:00'

In [5]:
price_paid_data.loc[price_paid_data.postcode == "SW17 9DT"]

Unnamed: 0,transaction_id,price,transfer_date,postcode,property_type,new_build_flag,duration,primary_address,secondary_address,street,locality,city,district,county,ppd_category,record_status
67261,{E9331642-EC29-44F8-BC5D-D56A8272B6BE},51000,1995-07-28 00:00,SW17 9DT,F,N,L,87B,,LYVEDEN ROAD,LONDON,LONDON,MERTON,GREATER LONDON,A,A
289037,{81BA2F5F-C690-4C70-9D3E-376EBABDC4E1},86000,1995-05-12 00:00,SW17 9DT,S,N,F,35,,LYVEDEN ROAD,LONDON,LONDON,MERTON,GREATER LONDON,A,A
350894,{D288430A-4CF0-42AB-8487-7312CC41679B},53000,1995-08-24 00:00,SW17 9DT,F,N,L,25,FLAT 2,LYVEDEN ROAD,LONDON,LONDON,MERTON,GREATER LONDON,A,A
629018,{20D1F1EE-9BA0-4772-9161-77B19E0F4AC8},67500,1995-05-25 00:00,SW17 9DT,T,N,F,109,,LYVEDEN ROAD,LONDON,LONDON,MERTON,GREATER LONDON,A,A
686951,{182A6629-0680-438E-B682-99AE03F72E39},61750,1995-02-03 00:00,SW17 9DT,T,N,F,73,,LYVEDEN ROAD,LONDON,LONDON,MERTON,GREATER LONDON,A,A
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
27214284,{CFC9085C-FB4D-9A70-E053-6B04A8C09D6A},560000,2021-06-11 00:00,SW17 9DT,T,N,F,65,,LYVEDEN ROAD,,LONDON,MERTON,GREATER LONDON,A,A
27305243,{D707E536-9023-0AD9-E053-6B04A8C067CC},550000,2021-11-23 00:00,SW17 9DT,T,N,F,109,,LYVEDEN ROAD,,LONDON,MERTON,GREATER LONDON,B,A
27442723,{E53EDD2E-2F09-83EC-E053-6B04A8C03A59},418000,2022-06-24 00:00,SW17 9DT,F,N,L,81,FLAT 1,LYVEDEN ROAD,,LONDON,MERTON,GREATER LONDON,A,A
27484009,{D93B27B1-4855-3100-E053-6C04A8C08887},415000,2022-01-27 00:00,SW17 9DT,F,N,L,81B,,LYVEDEN ROAD,,LONDON,MERTON,GREATER LONDON,A,A


In [6]:
price_paid_data.loc[price_paid_data.postcode == "GU22 8QR"]

Unnamed: 0,transaction_id,price,transfer_date,postcode,property_type,new_build_flag,duration,primary_address,secondary_address,street,locality,city,district,county,ppd_category,record_status
1330262,{2672C88F-FD94-4F2B-BE40-ABE20235537C},322000,1996-11-21 00:00,GU22 8QR,D,N,F,GRAN BAITA,,PYRFORD WOODS ROAD,WOKING,WOKING,WOKING,SURREY,A,A
4236661,{5B60D6C1-1A89-49F4-9808-6BD997C01878},460000,1999-03-17 00:00,GU22 8QR,D,N,F,CHATEAUGUAY,,PYRFORD WOODS ROAD,,WOKING,WOKING,SURREY,A,A
5248769,{B91DA968-1A44-45E0-BAB0-885D6A424280},499950,2000-11-24 00:00,GU22 8QR,D,N,F,BIRCH BURROW,,PYRFORD WOODS ROAD,,WOKING,WOKING,SURREY,A,A
5326579,{FF333375-6C02-4941-A4A5-3E604D707522},412500,2000-09-05 00:00,GU22 8QR,D,N,F,INCHMORE,,PYRFORD WOODS ROAD,,WOKING,WOKING,SURREY,A,A
6931928,{5C517103-A1FE-499C-B207-E562A7DA7FEF},425000,2001-07-02 00:00,GU22 8QR,D,N,F,SENANG,,PYRFORD WOODS ROAD,WOKING,WOKING,WOKING,SURREY,A,A
8803998,{77952907-3F84-49AA-BABE-441A023BFFDF},485000,2002-11-12 00:00,GU22 8QR,D,N,F,RESTWYNDE,,PYRFORD WOODS ROAD,WOKING,WOKING,WOKING,SURREY,A,A
11683864,{829E69B5-4C44-4DF2-A6BC-28E710AA22FE},753000,2005-03-18 00:00,GU22 8QR,D,N,F,RESTWYNDE,,PYRFORD WOODS ROAD,WOKING,WOKING,WOKING,SURREY,A,A
11871206,{FB18CE9D-20B7-4ADD-8129-0E1851BFAA47},660000,2005-09-14 00:00,GU22 8QR,D,N,F,NEW HAVEN,,PYRFORD WOODS ROAD,WOKING,WOKING,WOKING,SURREY,A,A
15950477,{A74C7EB5-9BF7-4874-9EEB-353810571BCE},925000,2009-11-26 00:00,GU22 8QR,D,N,F,CHATEAUGUAY,,PYRFORD WOODS ROAD,,WOKING,WOKING,SURREY,A,A
20176657,{02E92A0F-FC38-4F6E-8348-AD3317728FDE},1000000,2015-03-31 00:00,GU22 8QR,D,N,F,BROADWAY,,PYRFORD WOODS ROAD,,WOKING,WOKING,SURREY,A,A


In [7]:
price_paid_data.loc[(price_paid_data.postcode == "TN6 2HN") & (price_paid_data.street=="OSBORNE ROAD")].head(70)

Unnamed: 0,transaction_id,price,transfer_date,postcode,property_type,new_build_flag,duration,primary_address,secondary_address,street,locality,city,district,county,ppd_category,record_status
97913,{E24370D9-80C9-43CA-BEF5-C041CAEF9150},192450,1995-11-16 00:00,TN6 2HN,D,Y,F,2,,OSBORNE ROAD,CROWBOROUGH,CROWBOROUGH,WEALDEN,EAST SUSSEX,A,A
102436,{740801F1-EF9C-43B2-B4CC-CB702550F2F9},197000,1995-05-26 00:00,TN6 2HN,D,Y,F,23,,OSBORNE ROAD,CROWBOROUGH,CROWBOROUGH,WEALDEN,EAST SUSSEX,A,A
116751,{94C60664-EE7C-4EBB-8509-3D19D13C2B12},188000,1995-01-27 00:00,TN6 2HN,D,Y,F,19,,OSBORNE ROAD,CROWBOROUGH,CROWBOROUGH,WEALDEN,EAST SUSSEX,A,A
152414,{F2A089D0-8484-4915-A130-486A7B72B428},192000,1995-06-30 00:00,TN6 2HN,D,Y,F,16,,OSBORNE ROAD,CROWBOROUGH,CROWBOROUGH,WEALDEN,EAST SUSSEX,A,A
211086,{21F9454E-C7DA-41F5-867F-5FBDBE879E45},188500,1995-03-24 00:00,TN6 2HN,D,Y,F,15,,OSBORNE ROAD,CROWBOROUGH,CROWBOROUGH,WEALDEN,EAST SUSSEX,A,A
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
25718693,{B82222EC-BB16-6691-E053-6B04A8C02FB2},645000,2020-11-09 00:00,TN6 2HN,D,N,F,IVY LODGE,,OSBORNE ROAD,,CROWBOROUGH,WEALDEN,EAST SUSSEX,A,A
26125984,{D707E535-2183-0AD9-E053-6B04A8C067CC},687500,2021-08-31 00:00,TN6 2HN,D,N,F,14,,OSBORNE ROAD,,CROWBOROUGH,WEALDEN,EAST SUSSEX,A,A
26429714,{CB0035E5-B77E-58AE-E053-6B04A8C091AF},795000,2021-04-28 00:00,TN6 2HN,D,N,F,19,,OSBORNE ROAD,,CROWBOROUGH,WEALDEN,EAST SUSSEX,A,A
26730125,{D22473F6-7A19-7B40-E053-6C04A8C0A630},550000,2021-06-28 00:00,TN6 2HN,D,N,F,FERNDELL,,OSBORNE ROAD,,CROWBOROUGH,WEALDEN,EAST SUSSEX,B,A


In [8]:
price_paid_data.loc[(price_paid_data.postcode == "SW9 0NU") & (price_paid_data.primary_address=="131")]

Unnamed: 0,transaction_id,price,transfer_date,postcode,property_type,new_build_flag,duration,primary_address,secondary_address,street,locality,city,district,county,ppd_category,record_status
1965720,{2A289E9D-32D1-CDC8-E050-A8C063054829},46250,1997-04-24 00:00,SW9 0NU,F,N,L,131,,CRANWORTH GARDENS,,LONDON,LAMBETH,GREATER LONDON,A,A
4412078,{86F65D75-6E28-402F-82B6-C73AD36288BE},184500,1999-11-22 00:00,SW9 0NU,F,N,L,131,,CRANWORTH GARDENS,,LONDON,LAMBETH,GREATER LONDON,A,A
7501987,{025517EC-0094-4EC0-BA50-498DEEDA3D2D},250000,2002-05-10 00:00,SW9 0NU,F,N,L,131,,CRANWORTH GARDENS,LONDON,LONDON,LAMBETH,GREATER LONDON,A,A
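
The same address appears three times in this output, which suggests a repeat-sales view. As a rough sketch, the snippet below copies the three dates and prices from the rows above and computes the change between consecutive sales; the `pct_change` and `years_held` column names are illustrative choices.

```python
import pandas as pd

# The three sales of 131 Cranworth Gardens, SW9 0NU, copied from the output above.
repeat_sales = pd.DataFrame({
    "transfer_date": pd.to_datetime(["1997-04-24", "1999-11-22", "2002-05-10"]),
    "price": [46_250, 184_500, 250_000],
}).sort_values("transfer_date")

# Percentage change relative to the previous sale of the same property.
repeat_sales["pct_change"] = repeat_sales["price"].pct_change() * 100

# Years elapsed since the previous sale (approximate, using 365.25-day years).
repeat_sales["years_held"] = repeat_sales["transfer_date"].diff().dt.days / 365.25

print(repeat_sales.round(2))
```

On the full dataset, the same calculation could be applied per property by grouping on postcode plus the address columns, assuming that combination identifies a property reliably.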


In [9]:
price_paid_data.loc[(price_paid_data.postcode == "TN6 2JD")]

Unnamed: 0,transaction_id,price,transfer_date,postcode,property_type,new_build_flag,duration,primary_address,secondary_address,street,locality,city,district,county,ppd_category,record_status
27082048,{D93B27B1-C99F-3100-E053-6C04A8C08887},378000,2021-09-10 00:00,TN6 2JD,O,N,F,ST FLORIAN HOUSE,UNIT 1,BROOKLANDS PARK,,CROWBOROUGH,WEALDEN,EAST SUSSEX,B,A


In [10]:
import plotly.express as px

# Copy the filtered rows so adding 'year' does not trigger SettingWithCopyWarning
crowborough_house_prices = price_paid_data.loc[price_paid_data.city=='CROWBOROUGH'].copy()
crowborough_house_prices['year'] = pd.DatetimeIndex(crowborough_house_prices['transfer_date']).year
crowborough_house_prices = crowborough_house_prices.groupby('year')['price'].mean().reset_index(name ='mean_price')

fig = px.line(crowborough_house_prices, x="year", y="mean_price", title='Average House Price in Crowborough')
fig.show()


In [11]:
import plotly.express as px

# Copy the filtered rows so adding 'year' does not trigger SettingWithCopyWarning
crowborough_house_prices = price_paid_data.loc[price_paid_data.city=='CROWBOROUGH'].copy()
crowborough_house_prices['year'] = pd.DatetimeIndex(crowborough_house_prices['transfer_date']).year
crowborough_house_prices = crowborough_house_prices.groupby('year')['price'].median().reset_index(name ='median_price')

fig = px.line(crowborough_house_prices, x="year", y="median_price", title='Median House Price in Crowborough')
fig.show()



In [12]:
import plotly.express as px

crowborough_house_prices = price_paid_data.loc[price_paid_data.city=='CROWBOROUGH']
crowborough_house_prices=crowborough_house_prices.query('price == price.max()')
crowborough_house_prices

Unnamed: 0,transaction_id,price,transfer_date,postcode,property_type,new_build_flag,duration,primary_address,secondary_address,street,locality,city,district,county,ppd_category,record_status
22315678,{9FF0D96A-75B7-11ED-E053-6C04A8C06383},10000000,2017-10-02 00:00,TN6 3FQ,O,N,F,1,,KINGFISHER DRIVE,,CROWBOROUGH,WEALDEN,EAST SUSSEX,B,A


In [13]:
import plotly.express as px

# Copy the filtered rows so adding 'year' does not trigger SettingWithCopyWarning
london_house_prices = price_paid_data.loc[price_paid_data.city=='LONDON'].copy()
london_house_prices['year'] = pd.DatetimeIndex(london_house_prices['transfer_date']).year
london_house_prices = london_house_prices.groupby('year')['price'].mean().reset_index(name ='mean_price')

fig = px.line(london_house_prices, x="year", y="mean_price", title='Average House Price in London')
fig.show()



## Analysis Ideas
- Find a measure of size and quality
    - Council tax band data?
    - Size data or number-of-bedrooms data?
    - Web-scrape Rightmove for number of bedrooms etc.
    - Amenities density nearby
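
For the amenities-density idea, one possible measure is a count of amenities within a fixed radius of each property. The sketch below is only illustrative: the Price Paid Data has no coordinates, so it assumes a postcode-to-latitude/longitude lookup has already been joined in, and the `haversine_km` and `amenities_within` helpers and all coordinates are hypothetical.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points in kilometres."""
    r = 6371.0  # mean Earth radius in km
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def amenities_within(prop, amenities, radius_km=1.0):
    """Count amenities within radius_km of a property given as (lat, lon)."""
    return sum(haversine_km(*prop, *a) <= radius_km for a in amenities)


# Hypothetical coordinates, chosen only to exercise the functions.
property_location = (51.0600, 0.1600)
amenity_locations = [
    (51.0605, 0.1610),  # roughly 90 m away
    (51.0650, 0.1600),  # roughly 560 m away
    (51.1600, 0.1600),  # roughly 11 km away
]

nearby = amenities_within(property_location, amenity_locations, radius_km=1.0)
print(nearby)
```

Looping over every property and amenity pair would be slow at the scale of the full dataset; a spatial index or a vectorised distance calculation would likely be needed there.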