# Relationship Between House Properties and Prices in Saudi Arabia



## 1. Introduction

This report provides a comprehensive analysis of the Saudi Arabian real estate market using the **SA_Aqar** dataset. The primary objective is to uncover insights, trends, and patterns in the data without the use of machine learning techniques. The analysis focuses on data cleaning, exploratory data analysis (EDA), and deriving actionable insights through statistical methods. The findings and recommendations aim to assist stakeholders in making data-driven decisions to maximize profitability and meet buyer demands.


## 2. Data Overview

### Location Features
- **city**: The city where the property is located.
- **district**: The district within the city.

### Property Characteristics
- **size**: The size of the property in square meters.
- **property_age**: The age of the property in years.
- **bedrooms**: The number of bedrooms.
- **bathrooms**: The number of bathrooms.
- **livingrooms**: The number of living rooms.
- **kitchen**: The number of kitchens.
- **garage**: Whether the property has a garage (1 for Yes, 0 for No).
- **driver_room**: Whether the property has a driver's room (1 for Yes, 0 for No).
- **maid_room**: Whether the property has a maid's room (1 for Yes, 0 for No).

- **furnished**: Whether the property is furnished (1 for Yes, 0 for No).
- **ac**: Whether the property has air conditioning (1 for Yes, 0 for No).
- **roof**: Whether the property has a roof (1 for Yes, 0 for No).
- **pool**: Whether the property has a pool (1 for Yes, 0 for No).
- **frontyard**: Whether the property has a front yard (1 for Yes, 0 for No).
- **basement**: Whether the property has a basement (1 for Yes, 0 for No).
- **duplex**: Whether the property is a duplex (1 for Yes, 0 for No).
- **stairs**: Whether the property has stairs (1 for Yes, 0 for No).
- **elevator**: Whether the property has an elevator (1 for Yes, 0 for No).
- **fireplace**: Whether the property has a fireplace (1 for Yes, 0 for No).

### Target Variable
- **price**: The price of the property.

### Additional Details
- **details**: A text description of the property.

## 3. Data Cleaning


### 3.1 Handling Null Values
- No null values were found in the columns used for this analysis.

### 3.2 Handling Outliers
- Outliers were identified in the `size` column using the Interquartile Range (IQR) method.
- Outliers were removed to ensure the data was clean and suitable for analysis, improving the accuracy of the statistical analysis.
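
The IQR filter described above can be sketched as follows. This is a minimal illustration, not the exact code behind the report: the toy `df` is hypothetical data standing in for the `size` column, and the 1.5 multiplier is the conventional Tukey fence, assumed here because the report does not state which threshold was used.

```python
import pandas as pd

# Hypothetical toy data standing in for the `size` column.
df = pd.DataFrame({"size": [200, 250, 300, 320, 350, 400, 5000]})

# First and third quartiles, and the interquartile range.
q1 = df["size"].quantile(0.25)
q3 = df["size"].quantile(0.75)
iqr = q3 - q1

# Conventional fences: 1.5 * IQR beyond each quartile (assumed multiplier).
lower = q1 - 1.5 * iqr
upper = q3 + 1.5 * iqr

# Keep only the rows whose size lies inside the fences.
df_clean = df[df["size"].between(lower, upper)]
print(f"{len(df)} rows before, {len(df_clean)} rows after")
```

Comparing row counts before and after the filter is a quick way to check how much data the rule discards.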

![image.png](attachment:image.png)  ![image-2.png](attachment:image-2.png)

## 4. Exploratory Data Analysis (EDA)

### 4.1 Univariate Analysis

#### 4.1.1 Distribution of Numerical Features
- The distributions of numerical features such as `size`, `property_age`, `bedrooms`, and `bathrooms` were analyzed.
- Most houses had:
  - Sizes between 200 and 400 square meters.
  - Ages between 1 and 8 years.
  - **3 to 7 bedrooms** and **3 to 5 bathrooms.**

![image-2.png](attachment:image-2.png)

#### 4.1.2 Distribution of Houses by City
- **Al Khobar** had the highest number of listed houses.
- Other cities with significant listings included **Riyadh** and **Jeddah**.
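
A count of listings per city like the one summarized above could be produced along these lines. The toy `df` is hypothetical; only the `city` column name comes from the dataset description.

```python
import pandas as pd

# Hypothetical toy data; the real dataset has one row per listing.
df = pd.DataFrame(
    {"city": ["Al Khobar", "Riyadh", "Al Khobar", "Jeddah", "Riyadh", "Al Khobar"]}
)

# Number of listings per city, sorted from most to fewest.
city_counts = df["city"].value_counts()
print(city_counts)
```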

![image-2.png](attachment:image-2.png)

#### 4.1.3 Most Frequent Words in the `details` Column
- The most frequent words in property descriptions included:
  - **rent**
  - **floor**
  - **hall**
- These words are likely used in listings to attract buyers and to highlight key features of the properties.
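
One way to compute word frequencies of this kind is sketched below, using only the standard library. The sample descriptions and the tiny stop-word list are hypothetical, and the tokenizer (a simple `\w+` match) is an assumption rather than the method used in the report; it matches letters in any script, so the same approach should also apply to Arabic text.

```python
import re
from collections import Counter

# Hypothetical sample descriptions standing in for the `details` column.
details = [
    "Apartment for rent, second floor, large hall",
    "Villa for rent with hall and two floor levels",
    "Ground floor flat for rent",
]

# Tiny illustrative stop-word list (an assumption, not the report's list).
stop_words = {"for", "with", "and", "two"}

words = []
for text in details:
    # Lowercase, then split into runs of word characters.
    tokens = re.findall(r"\w+", text.lower())
    words.extend(t for t in tokens if t not in stop_words)

word_counts = Counter(words)
print(word_counts.most_common(3))
```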


![image-3.png](attachment:image-3.png)

### 4.2 Bivariate Analysis

#### 4.2.1 Average Prices by City
- **Jeddah** had the highest average house prices, followed by **Riyadh** and **Al Khobar**.
- This could be due to higher demand or premium locations in these cities.
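
The per-city averages can be obtained with a group-by, roughly as follows. The numbers in the toy `df` are invented for illustration and are not taken from the dataset.

```python
import pandas as pd

# Hypothetical toy data; prices are made up for illustration.
df = pd.DataFrame(
    {
        "city": ["Jeddah", "Jeddah", "Riyadh", "Riyadh", "Al Khobar"],
        "price": [120000, 100000, 90000, 80000, 70000],
    }
)

# Mean price per city, highest first.
avg_price = df.groupby("city")["price"].mean().sort_values(ascending=False)
print(avg_price)
```

Replacing `"price"` with `"property_age"` gives the average property age per city in the same way.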

![image-3.png](attachment:image-3.png)

#### 4.2.2 Average Property Age by City
- **Riyadh** had the oldest houses on average, indicating a potential need for renovation or maintenance.
- **Jeddah** and **Dammam** had relatively newer properties.

![image-3.png](attachment:image-3.png)

#### 4.2.3 Price vs. Property Age
- Houses aged **30 to 35 years** had significantly higher prices compared to other age groups.
- This could be due to unique features such as larger sizes, additional amenities (e.g., driver_room, maid_room, fireplace), or prime locations.
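
A comparison of prices across age groups might be built by binning `property_age` and averaging `price` per bin, as in this sketch. The 5-year bin width and the toy data are assumptions; the report does not state how its age groups were defined.

```python
import pandas as pd

# Hypothetical toy data; values are made up for illustration.
df = pd.DataFrame(
    {
        "property_age": [2, 7, 12, 31, 34, 33],
        "price": [80000, 90000, 85000, 150000, 170000, 160000],
    }
)

# 5-year bins: [0, 5), [5, 10), ..., [35, 40). The bin width is an assumption.
bins = list(range(0, 45, 5))
df["age_group"] = pd.cut(df["property_age"], bins=bins, right=False)

# Mean price and number of listings per age group (empty bins are dropped).
summary = df.groupby("age_group", observed=True)["price"].agg(["mean", "count"])
print(summary)
```

Reporting the count next to the mean helps show whether a high average rests on only a handful of listings.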

![image.png](attachment:image.png)

### 4.3 Multivariate Analysis

#### 4.3.1 Correlation Matrix
- The correlation matrix revealed that:
  - `size` and `ac` had the strongest positive correlations with `price`.
  - Features like `bathrooms` and `livingrooms` also showed moderate positive correlations with `price`.
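
Correlations with `price` can be computed as sketched below. Pearson correlation (the pandas default) is assumed, since the report does not name the coefficient, and the toy values are invented for illustration.

```python
import pandas as pd

# Hypothetical toy data; values are made up for illustration.
df = pd.DataFrame(
    {
        "city": ["Riyadh", "Jeddah", "Riyadh", "Jeddah", "Riyadh"],
        "size": [200, 250, 300, 350, 400],
        "bathrooms": [3, 3, 4, 4, 5],
        "ac": [0, 1, 0, 1, 1],
        "price": [70000, 85000, 95000, 110000, 130000],
    }
)

# Restrict to numeric columns so text columns such as `city` are ignored.
corr_matrix = df.select_dtypes("number").corr()

# Correlation of each feature with `price`, strongest first.
corr_with_price = corr_matrix["price"].drop("price").sort_values(ascending=False)
print(corr_with_price)
```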

![image.png](attachment:image.png)

#### 4.3.2 Scatter Plots
- Scatter plots between numerical features and `price` confirmed the trends observed in the correlation matrix.
- Larger houses with more bedrooms and bathrooms tended to have higher prices.

![image-2.png](attachment:image-2.png)

#### 4.3.3 Facet Plot
To further explore the relationship between the key numerical variables `size` and `price` in each city, a facet plot was created using Seaborn. This visualization shows how size and price relate to each other within each city.

![image-2.png](attachment:image-2.png)

## 5. My Insights

1. **Location Matters:**
   - Cities like **Jeddah** and **Riyadh** have higher average prices, likely due to higher demand and premium locations.
   - **Al Khobar** has the oldest houses, indicating a potential market for renovations and redevelopment.
   - District-level analysis reveals that certain districts within these cities have even higher demand, suggesting targeted investments in these areas could yield higher returns.


2. **Size and Amenities Drive Prices:**
   - Larger houses with more bedrooms, bathrooms, and amenities (e.g., fireplace, maid_room, pool) command significantly higher prices.
   - Houses aged **30 to 35 years** are priced higher, possibly due to features such as larger plots, historical value, or prime locations.
   - Properties with luxury amenities (e.g., elevators, duplexes, and fireplaces) are in high demand, especially among affluent buyers.


3. **Marketing Words Influence Buyers:**
   - Words like **شقق (apartments)**, **فيلا (villa)**, and **مفروشة (furnished)** are frequently used in property descriptions and may influence buyer decisions.
   - Properties described as **مفروشة (furnished)** or **فيلا (villa)** tend to attract higher interest, suggesting that marketing campaigns should emphasize these features.
   - The presence of **شقق (apartments)** in descriptions indicates a strong market for smaller, more affordable housing options.

## 6. Recommendations

1. **Focus on High-Demand Cities and Districts:**
   - Real estate developers should prioritize investments in high-demand cities like **Jeddah** and **Riyadh**, particularly in districts with the highest average prices.
   - Conduct micro-market analysis to identify specific neighborhoods or districts within these cities that offer the best ROI.

2. **Renovate and Reposition Older Properties:**
   - Older houses in **Al Khobar** present a unique opportunity for renovation and repositioning as premium properties.
   - Focus on modernizing these properties by adding luxury amenities (e.g., smart home features, energy-efficient systems) to attract buyers willing to pay a premium.

3. **Highlight Key Words in Marketing Campaigns:**
   - Emphasize features like size, bedrooms, bathrooms, and amenities in marketing campaigns to justify higher prices.
   - Use data-driven insights to identify which features resonate most with buyers in specific cities or districts.

4. **Use Data-Driven Pricing Strategies:**
   - Use statistical insights to set competitive prices based on property characteristics, market trends, and buyer preferences.

## 7. Conclusion

This analysis provides valuable insights into the Saudi Arabian real estate market, highlighting key factors that influence house prices. By leveraging these insights and recommendations, stakeholders can make informed decisions to maximize profitability and meet buyer demands. The findings emphasize the importance of location, property features, and marketing strategies in driving real estate success. With a focus on high-demand cities, targeted renovations, and tailored marketing campaigns, stakeholders can unlock new opportunities and stay competitive in the evolving market landscape.

## 8. Next Steps

1. **Expand the Dataset:**
   - Collect more data, including additional features like proximity to schools, hospitals, and transportation hubs, to improve the depth of analysis.

2. **Monitor Market Trends:**
   - Continuously update the dataset to reflect changing market conditions and trends.

3. **Explore Advanced Statistical Methods:**
   - Experiment with more advanced statistical techniques to uncover deeper insights and patterns in the data.

**Prepared by: Ziad Elkafoury**  
**Contact: [LinkedIn](https://www.linkedin.com/in/ziad-elkafoury)**