## Compare subtype of property with price
In this project, we analyze property data from a CSV file using pandas and plotly.express. The data contains the location, type, subtype, area, number of rooms, and price of properties in Belgium. In this example, we use only the subtype and the price of each property.

### Importing Libraries

In [17]:
import pandas as pd
import plotly.express as px

### Loading the Dataset
Next, we load our dataset using pandas' read_csv function.


In [18]:
df = pd.read_csv("property_data.csv")
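Since only the subtype and price columns are used in this example, the load could be restricted to those two columns with `usecols`. The sketch below is only an illustration: it reads a few made-up rows from an in-memory string instead of `property_data.csv`, and the `Locality` column is a hypothetical stand-in for the other columns in the file.

```python
import io
import pandas as pd

# Made-up rows standing in for property_data.csv.
# "Locality" is a hypothetical extra column, not taken from the real file.
sample_csv = io.StringIO(
    "Locality,Subtype of property,Price of property in euro\n"
    "Gent,villa,450000\n"
    "Brussels,apartment,250000\n"
)

# usecols keeps only the two columns this analysis needs.
cols = ["Subtype of property", "Price of property in euro"]
sample_df = pd.read_csv(sample_csv, usecols=cols)

print(sample_df.columns.tolist())
```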

## Step 1: Data Cleaning
First, we need to clean our data to ensure its quality and enhance the accuracy of our results.

### Removing Duplicates
We start by removing any duplicate rows in the dataset.

In [19]:
df = df.drop_duplicates()
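To illustrate what this call does, here is a small sketch on made-up data (the rows are invented for the example). By default, `drop_duplicates` compares all columns and keeps the first occurrence of each repeated row.

```python
import pandas as pd

# Invented rows: the first two are exact duplicates.
toy = pd.DataFrame({
    "Subtype of property": ["villa", "villa", "apartment"],
    "Price of property in euro": [450000, 450000, 250000],
})

deduped = toy.drop_duplicates()

print(len(toy), len(deduped))  # 3 2
```

Note that two genuinely different properties with identical values in every column would also be collapsed into one row.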

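### Stripping Whitespace
Next, we remove any leading or trailing whitespace from our string data, so that values such as 'villa ' and 'villa' are treated as the same subtype.

In [20]:
# Note: DataFrame.applymap is deprecated since pandas 2.1; DataFrame.map is its replacement there.
df = df.applymap(lambda x: x.strip() if isinstance(x, str) else x)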
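The effect of stripping is easiest to see on the grouping column. The sketch below uses invented values and applies the same lambda to a single column with `Series.map`, which is one possible alternative to the whole-frame call above, not necessarily what the notebook needs.

```python
import pandas as pd

# Invented values: " villa" and "villa " differ only by whitespace.
toy = pd.DataFrame({
    "Subtype of property": [" villa", "villa ", "apartment"],
    "Price of property in euro": [450000, 500000, 250000],
})

before = toy["Subtype of property"].nunique()

# Strip strings only; leave non-string values (e.g. NaN) untouched.
toy["Subtype of property"] = toy["Subtype of property"].map(
    lambda x: x.strip() if isinstance(x, str) else x
)

after = toy["Subtype of property"].nunique()
print(before, after)  # 3 2
```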

### Handling Missing Values
We fill missing values in the 'Price of property in euro' column with 0, and in the 'Subtype of property' column with 'unknown'.

In [21]:
df = df.fillna({'Price of property in euro': 0, 'Subtype of property': 'unknown'})
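Passing a dictionary to `fillna` applies a different fill value to each named column and leaves other columns alone. A minimal sketch on invented data:

```python
import pandas as pd

# Invented rows with one missing subtype and one missing price.
toy = pd.DataFrame({
    "Subtype of property": ["villa", None, "apartment"],
    "Price of property in euro": [450000.0, 300000.0, float("nan")],
})

filled = toy.fillna({
    "Price of property in euro": 0,
    "Subtype of property": "unknown",
})

print(filled)
```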

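### Dropping Null Values
Finally, we drop any rows that still have null values in the 'Price of property in euro' column. Because missing prices were already filled with 0 in the previous step, this call acts only as a safeguard and removes no rows.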



In [22]:
df = df.dropna(subset=['Price of property in euro'])
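The order of the fill and drop steps matters for the averages computed later. The sketch below, on invented prices, compares the mean when a missing price is skipped with the mean when it is filled with 0, and checks how many rows `dropna` removes after the fill.

```python
import pandas as pd

# Invented prices with one missing value.
prices = pd.Series(
    [200000.0, 300000.0, float("nan")],
    name="Price of property in euro",
)

# Series.mean skips NaN by default.
mean_skipping_missing = prices.mean()

# A 0 placeholder counts as a real price and lowers the mean.
filled = prices.fillna(0)
mean_with_zero_fill = filled.mean()

# After the fill there is nothing left for dropna to remove.
rows_dropped_after_fill = len(filled) - len(filled.dropna())

print(mean_skipping_missing, round(mean_with_zero_fill, 2), rows_dropped_after_fill)
```

If the 0 placeholders are not wanted in the averages, one option is to leave the price column out of `fillna` so that `dropna` removes those rows instead.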

## Step 2: Data Analysis
Once our data is cleaned, we move on to analyze it.

### Grouping Data
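First, we group our data by 'Subtype of property', compute the mean price of each subtype, and sort the subtypes from highest to lowest mean price. The resulting index gives the order of the categories in the plot.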

In [23]:
sorted_subtypes = df.groupby('Subtype of property')['Price of property in euro'].mean().sort_values(ascending=False).index
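The same chain can be traced on a few invented rows to see what it produces: a mean per subtype, sorted in descending order, whose index is the list of subtype names.

```python
import pandas as pd

# Invented rows for illustration.
toy = pd.DataFrame({
    "Subtype of property": ["villa", "apartment", "villa", "studio"],
    "Price of property in euro": [500000, 250000, 400000, 120000],
})

mean_prices = (
    toy.groupby("Subtype of property")["Price of property in euro"]
    .mean()
    .sort_values(ascending=False)
)

# The index holds the subtype names in descending order of mean price.
order = list(mean_prices.index)
print(order)  # ['villa', 'apartment', 'studio']
```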


## Step 3: Data Visualization
We use Plotly Express to create a box plot that shows how property prices vary across subtypes. The x-axis shows the 'Subtype of property' and the y-axis shows the 'Price of property in euro'. Each box spans the first to the third quartile of the prices for one subtype, with a line at the median.

In [25]:
fig = px.box(df, x='Subtype of property', y='Price of property in euro',
             category_orders={"Subtype of property": sorted_subtypes},
             color='Subtype of property',
             color_discrete_sequence=px.colors.qualitative.Pastel)

fig.update_layout(
    title={
        'text': "Subtype of Property vs. Price of Property in Euro (Sorted by Average Price)",
        'x': 0.5,
        'font': {'size': 24, 'color': 'black', 'family': 'Arial'}
    },
    xaxis={'title': {'text': 'Subtype of Property', 'font': {'size': 14, 'color': 'black', 'family': 'Arial'}}},
    yaxis={'title': {'text': 'Price of Property in Euro', 'font': {'size': 14, 'color': 'black', 'family': 'Arial'}}},
    boxmode='group',  # Grouped box plots
    width=800,  # Adjust the width of the figure
    height=600,  # Adjust the height of the figure
    plot_bgcolor='rgba(0,0,0,0.02)',  # Light gray background color
    paper_bgcolor='white'  # White background color
)

fig.show()
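To read the boxes numerically, the quartiles behind each one can be approximated with pandas. This is a sketch on invented prices; Plotly may use a different interpolation rule for quartiles, so its values can differ slightly from the ones pandas reports.

```python
import pandas as pd

# Invented prices, five per subtype.
toy = pd.DataFrame({
    "Subtype of property": ["villa"] * 5 + ["apartment"] * 5,
    "Price of property in euro": [
        300000, 400000, 500000, 600000, 700000,
        100000, 150000, 200000, 250000, 300000,
    ],
})

# One row per subtype, one column per quantile (0.25, 0.5, 0.75).
quartiles = (
    toy.groupby("Subtype of property")["Price of property in euro"]
    .quantile([0.25, 0.5, 0.75])
    .unstack()
)

print(quartiles)
```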
