# Data Analysis and Machine Learning Report

**Date of Report:** [Insert report date]

## Executive Summary

This report analyzes the "USA Housing Listings" dataset, which contains detailed information about housing listings posted in the United States. The analysis aims to describe the US real estate market in terms of property types, pricing, amenities, and geographic location. The dataset also serves as the foundation for machine learning models used for predictive analysis.

### Dataset Overview

- **Number of Records:** 384,977 rows and 22 columns (from the `dataset.info()` output)
- **Data Collection Period:** The dataset is updated periodically.
- **Available Information:** Property prices, types, square footage, bedroom and bathroom counts, and amenity flags such as pet allowances, wheelchair accessibility, and electric vehicle charging.

This report covers exploratory data analysis and the application of machine learning models to two predictive tasks: price prediction and property type classification.

## Introduction

The "USA Housing Listings" dataset, sourced from Craigslist, describes individual listings across the United States by price, size, amenities, and location. It supports both descriptive analysis of the market and the development of machine learning models that add a predictive dimension to the analysis.

[Include any specific objectives, goals, or additional context here.]

## Exploratory Data Analysis

[Insert sections for EDA, data cleaning, visualizations, and insights.]

## Machine Learning Models

[Detail the machine learning models being employed, their objectives, and methodologies.]

## Conclusion

[Summarize key findings, insights, and any future directions.]

## Property Listings Table

For reference, the following table describes the columns within the "USA Housing Listings" dataset:


| Column Name               | Description                                       |
|---------------------------|---------------------------------------------------|
| id                        | Unique identifier for each listing.              |
| url                       | Listing URL for more details.                   |
| region                    | Geographical region or city where the property is located. |
| region_url                | URL of the regional Craigslist site for the listing (e.g., https://reno.craigslist.org). |
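| price                     | Listed price as an integer without a currency symbol; sample values such as 1148 are consistent with monthly rent rather than a sale price. |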
| type                      | Property type (house, apartment, etc.).         |
| sqfeet                    | Property area in square feet.                    |
| beds                      | Number of bedrooms in the property.             |
| baths                     | Number of bathrooms in the property.            |
| cats_allowed              | Indication if cats are allowed (1 for allowed, 0 for not allowed). |
| dogs_allowed              | Indication if dogs are allowed (1 for allowed, 0 for not allowed). |
| smoking_allowed           | Indication if smoking is allowed (1 for allowed, 0 for not allowed). |
| wheelchair_access         | Indication if the property is wheelchair accessible (1 for accessible, 0 for not accessible). |
| electric_vehicle_charge   | Indication if the property offers electric vehicle charging (1 for available, 0 for not available). |
| comes_furnished           | Indication if the property is furnished (1 for furnished, 0 for unfurnished). |
| laundry_options           | Laundry facilities available on the property (e.g., w/d in unit, w/d hookups, laundry on site). |
| parking_options           | Parking options available on the property (e.g., carport, attached garage). |
| image_url                 | URL of an image of the property.                |
| description               | Description of the property provided in the listing. |
| lat                       | Latitude of the property's location.             |
| long                      | Longitude of the property's location.            |
| state                     | State in which the property is located.        |
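
The 0/1 indicator columns described above lend themselves to a quick validity check before analysis. The following is a minimal sketch rather than the notebook's own code: `toy` is a made-up stand-in for the real data, and `flags_to_bool` is a hypothetical helper name.

```python
import pandas as pd

# Toy frame standing in for the real data (values are invented for illustration)
toy = pd.DataFrame({
    "cats_allowed": [1, 0, 1],
    "dogs_allowed": [1, 1, 0],
    "smoking_allowed": [0, 1, 1],
})

FLAG_COLUMNS = ["cats_allowed", "dogs_allowed", "smoking_allowed"]


def flags_to_bool(df, columns):
    """Return a copy of df with the given 0/1 columns cast to bool.

    Raises ValueError if a column holds anything other than 0 or 1.
    """
    out = df.copy()
    for col in columns:
        if not out[col].isin([0, 1]).all():
            raise ValueError(f"{col} contains values other than 0/1")
        out[col] = out[col].astype(bool)
    return out


toy_bool = flags_to_bool(toy, FLAG_COLUMNS)
print(toy_bool.dtypes)
```

On the real data, the same helper could be applied to the remaining flag columns (`wheelchair_access`, `electric_vehicle_charge`, `comes_furnished`), assuming they follow the 0/1 encoding described in the table.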

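### Example Rows (Illustrative)

The rows below are placeholder values that only illustrate the column layout; actual records appear in the `dataset.head()` output further down.
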
| id | url | region | region_url | price | type | sqfeet | beds | baths | cats_allowed | dogs_allowed | smoking_allowed | wheelchair_access | electric_vehicle_charge | comes_furnished | laundry_options | parking_options | image_url | description | lat | long | state |
|----|-----|--------|------------|-------|------|--------|------|-------|-------------|-------------|-----------------|-------------------|-------------------------|----------------|-----------------|----------------|-----------|-------------|-----|------|-------|
| 1  | [Link](url) | New York | [New York Region](region_url) | $500,000 | House | 2000 | 3 | 2 | 1 | 1 | 0 | 1 | 1 | 1 | On-site | Garage | [Image Link](image_url) | Beautiful 3-bedroom house in New York City | 40.7128 | -74.0060 | NY |
| 2  | [Link](url) | Los Angeles | [LA Region](region_url) | $750,000 | Apartment | 1200 | 2 | 2 | 0 | 1 | 1 | 0 | 0 | 1 | In-unit | Street | [Image Link](image_url) | Modern 2-bedroom apartment in the heart of LA | 34.0522 | -118.2437 | CA |
| 3  | [Link](url) | Chicago | [Chicago Region](region_url) | $400,000 | Condo | 1000 | 1 | 1 | 1 | 0 | 1 | 1 | 0 | 0 | On-site | Covered | [Image Link](image_url) | Cozy 1-bedroom condo in downtown Chicago | 41.8781 | -87.6298 | IL |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |




In [None]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import missingno as msno

In [None]:
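# Load the listings; housing.csv is expected in the working directory
dataset = pd.read_csv('housing.csv')
# Preview the first five rows
dataset.head()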

| | id | url | region | region_url | price | type | sqfeet | beds | baths | cats_allowed | ... | wheelchair_access | electric_vehicle_charge | comes_furnished | laundry_options | parking_options | image_url | description | lat | long | state |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 7049044568 | https://reno.craigslist.org/apa/d/reno-beautif... | reno / tahoe | https://reno.craigslist.org | 1148 | apartment | 1078 | 3 | 2.0 | 1 | ... | 0 | 0 | 0 | w/d in unit | carport | https://images.craigslist.org/01616_daghmBUvTC... | Ridgeview by Vintage is where you will find al... | 39.5483 | -119.796 | ca |
| 1 | 7049047186 | https://reno.craigslist.org/apa/d/reno-reduced... | reno / tahoe | https://reno.craigslist.org | 1200 | condo | 1001 | 2 | 2.0 | 0 | ... | 0 | 0 | 0 | w/d hookups | carport | https://images.craigslist.org/00V0V_5va0MkgO9q... | Conveniently located in the middle town of Ren... | 39.5026 | -119.789 | ca |
| 2 | 7043634882 | https://reno.craigslist.org/apa/d/sparks-state... | reno / tahoe | https://reno.craigslist.org | 1813 | apartment | 1683 | 2 | 2.0 | 1 | ... | 0 | 0 | 0 | w/d in unit | attached garage | https://images.craigslist.org/00t0t_erYqC6LgB8... | 2BD \| 2BA \| 1683SQFTDiscover exceptional servi... | 39.6269 | -119.708 | ca |
| 3 | 7049045324 | https://reno.craigslist.org/apa/d/reno-1x1-fir... | reno / tahoe | https://reno.craigslist.org | 1095 | apartment | 708 | 1 | 1.0 | 1 | ... | 0 | 0 | 0 | w/d in unit | carport | https://images.craigslist.org/00303_3HSJz75zlI... | MOVE IN SPECIAL FREE WASHER/DRYER WITH 6 OR 12... | 39.4477 | -119.771 | ca |
| 4 | 7049043759 | https://reno.craigslist.org/apa/d/reno-no-long... | reno / tahoe | https://reno.craigslist.org | 289 | apartment | 250 | 0 | 1.0 | 1 | ... | 1 | 0 | 1 | laundry on site | | https://images.craigslist.org/01616_fALAWFV8zQ... | Move In Today: Reno Low-Cost, Clean & Furnishe... | 39.5357 | -119.805 | ca |
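
The preview above includes several long text columns (`url`, `image_url`, `description`) that numeric analysis rarely needs. As a hedged sketch, the snippet below shows one way to load only selected columns with `usecols` and to derive a price-per-square-foot feature. The in-memory CSV and the `price_per_sqft` column name are assumptions for illustration, not part of the original notebook.

```python
import io

import pandas as pd

# Tiny in-memory CSV standing in for housing.csv (rows are invented for illustration)
toy_csv = io.StringIO(
    "id,url,price,type,sqfeet,description\n"
    "1,https://example.org/a,1100,apartment,900,Sunny unit\n"
    "2,https://example.org/b,1500,condo,1200,Near transit\n"
)

# Keep only the columns needed for numeric analysis; long text columns are skipped
wanted = ["id", "price", "type", "sqfeet"]
subset = pd.read_csv(toy_csv, usecols=wanted, dtype={"type": "category"})

# Price per square foot as a simple derived feature (hypothetical column name)
subset["price_per_sqft"] = subset["price"] / subset["sqfeet"]
print(subset)
```

Rows with `sqfeet` equal to zero would yield infinite values here, so on the real data such rows would need to be filtered or handled first.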


In [None]:
dataset.columns 

Index(['id', 'url', 'region', 'region_url', 'price', 'type', 'sqfeet', 'beds',
       'baths', 'cats_allowed', 'dogs_allowed', 'smoking_allowed',
       'wheelchair_access', 'electric_vehicle_charge', 'comes_furnished',
       'laundry_options', 'parking_options', 'image_url', 'description', 'lat',
       'long', 'state'],
      dtype='object')

In [None]:
dataset.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 384977 entries, 0 to 384976
Data columns (total 22 columns):
 #   Column                   Non-Null Count   Dtype  
---  ------                   --------------   -----  
 0   id                       384977 non-null  int64  
 1   url                      384977 non-null  object 
 2   region                   384977 non-null  object 
 3   region_url               384977 non-null  object 
 4   price                    384977 non-null  int64  
 5   type                     384977 non-null  object 
 6   sqfeet                   384977 non-null  int64  
 7   beds                     384977 non-null  int64  
 8   baths                    384977 non-null  float64
 9   cats_allowed             384977 non-null  int64  
 10  dogs_allowed             384977 non-null  int64  
 11  smoking_allowed          384977 non-null  int64  
 12  wheelchair_access        384977 non-null  int64  
 13  electric_vehicle_charge  384977 non-null  int64  
 14  come

In [None]:

# Plot the nullity matrix: white gaps mark missing values in each column
msno.matrix(dataset)
plt.show()
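
The matrix plot gives a visual impression of missingness; a numeric summary can complement it. The sketch below is an illustration under stated assumptions: `toy` is an invented frame, and `missing_summary` is a hypothetical helper rather than anything defined in the notebook.

```python
import pandas as pd

# Toy frame with gaps (invented values; the real dataset has far more rows and columns)
toy = pd.DataFrame({
    "price": [1148, 1200, 289, 1095],
    "laundry_options": ["w/d in unit", None, "laundry on site", "w/d in unit"],
    "parking_options": ["carport", "carport", None, None],
})


def missing_summary(df):
    """Return per-column missing counts and percentages, most-missing first."""
    counts = df.isna().sum()
    summary = pd.DataFrame({
        "missing": counts,
        "percent": (counts / len(df) * 100).round(2),
    })
    return summary.sort_values("missing", ascending=False)


report = missing_summary(toy)
print(report)
```

Calling `missing_summary(dataset)` on the loaded data would produce the same kind of table for all 22 columns.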
