<center><h1>An Introduction to the Dataset</h1></center>
<center><h3>New York Housing Market</h3></center><br>
<center>Welcome to this introductory exploration of the "New York Housing Market" dataset, a collection of property listings that covers broker titles, property types, prices, bedrooms, bathrooms, square footage, addresses, and geographic coordinates. This notebook looks at how property square footage relates to price, how prices and sizes differ by house type, and how listings are distributed geographically. Whether you are a real estate professional, a data enthusiast, or a curious observer, it offers a starting point for analyzing and visualizing New York's housing market.</center>
    
   **Dataset:** [CLICK HERE](https://www.kaggle.com/datasets/nelgiriyewithana/new-york-housing-market/data)

In [1]:
import pandas as pd
import plotly.express as px

# Load Dataset

In [2]:
df = pd.read_csv("/kaggle/input/new-york-housing-market/NY-House-Dataset.csv")
df.head()

Unnamed: 0,BROKERTITLE,TYPE,PRICE,BEDS,BATH,PROPERTYSQFT,ADDRESS,STATE,MAIN_ADDRESS,ADMINISTRATIVE_AREA_LEVEL_2,LOCALITY,SUBLOCALITY,STREET_NAME,LONG_NAME,FORMATTED_ADDRESS,LATITUDE,LONGITUDE
0,Brokered by Douglas Elliman -111 Fifth Ave,Condo for sale,315000,2,2.0,1400.0,2 E 55th St Unit 803,"New York, NY 10022","2 E 55th St Unit 803New York, NY 10022",New York County,New York,Manhattan,East 55th Street,Regis Residence,"Regis Residence, 2 E 55th St #803, New York, N...",40.761255,-73.974483
1,Brokered by Serhant,Condo for sale,195000000,7,10.0,17545.0,Central Park Tower Penthouse-217 W 57th New Yo...,"New York, NY 10019",Central Park Tower Penthouse-217 W 57th New Yo...,United States,New York,New York County,New York,West 57th Street,"217 W 57th St, New York, NY 10019, USA",40.766393,-73.980991
2,Brokered by Sowae Corp,House for sale,260000,4,2.0,2015.0,620 Sinclair Ave,"Staten Island, NY 10312","620 Sinclair AveStaten Island, NY 10312",United States,New York,Richmond County,Staten Island,Sinclair Avenue,"620 Sinclair Ave, Staten Island, NY 10312, USA",40.541805,-74.196109
3,Brokered by COMPASS,Condo for sale,69000,3,1.0,445.0,2 E 55th St Unit 908W33,"Manhattan, NY 10022","2 E 55th St Unit 908W33Manhattan, NY 10022",United States,New York,New York County,New York,East 55th Street,"2 E 55th St, New York, NY 10022, USA",40.761398,-73.974613
4,Brokered by Sotheby's International Realty - E...,Townhouse for sale,55000000,7,2.373861,14175.0,5 E 64th St,"New York, NY 10065","5 E 64th StNew York, NY 10065",United States,New York,New York County,New York,East 64th Street,"5 E 64th St, New York, NY 10065, USA",40.767224,-73.969856
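
The preview above shows a `BATH` value of 2.373861 for the fifth listing. That is not a plausible bathroom count and may be an imputed placeholder, although the source does not say so. The sketch below flags such values. It runs on a small sample copied by hand from the preview rather than on the full CSV, and `sample`, `is_suspicious_bath`, and `flagged` are hypothetical names introduced only for illustration.

```python
import pandas as pd

# Hand-copied subset of the df.head() preview (assumption: the full `df`
# has the same column names and can be substituted for `sample`).
sample = pd.DataFrame({
    "TYPE": ["Condo for sale", "Condo for sale", "House for sale",
             "Condo for sale", "Townhouse for sale"],
    "PRICE": [315000, 195000000, 260000, 69000, 55000000],
    "BEDS": [2, 7, 4, 3, 7],
    "BATH": [2.0, 10.0, 2.0, 1.0, 2.373861],
    "PROPERTYSQFT": [1400.0, 17545.0, 2015.0, 445.0, 14175.0],
})

def is_suspicious_bath(bath: pd.Series) -> pd.Series:
    """Flag bathroom counts that are not a multiple of 0.5 (whole or half baths)."""
    doubled = bath * 2
    return (doubled - doubled.round()).abs() > 1e-9

# Rows whose bathroom count looks like a filled-in value rather than a real count.
flagged = sample[is_suspicious_bath(sample["BATH"])]
print(flagged[["TYPE", "PRICE", "BATH"]])
```

Running the same check on the full `df` would show how many listings carry such values before they influence the bathroom-based plots.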


# Scatter Plot - Price vs. Property Square Footage

In [3]:
scatter_plot_multicolor_log = px.scatter(df, x='PROPERTYSQFT', y='PRICE', color='TYPE',
                                          title='Price vs. Property Square Footage (Logarithmic Scale)',
                                          labels={'PROPERTYSQFT': 'Property Square Footage', 'PRICE': 'Price', 'TYPE': 'House Type'},
                                          log_x=True, log_y=True)
scatter_plot_multicolor_log.show()
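
Because both axes of the scatter plot are logarithmic, a natural numeric companion is the Pearson correlation of the log-transformed columns. The sketch below computes it on the five preview rows only, so the resulting number is an illustration and not a finding about the full dataset. `sample` and `log_log_correlation` are hypothetical names.

```python
import numpy as np
import pandas as pd

# Hand-copied values from the df.head() preview; replace with the full `df`.
sample = pd.DataFrame({
    "PRICE": [315000, 195000000, 260000, 69000, 55000000],
    "PROPERTYSQFT": [1400.0, 17545.0, 2015.0, 445.0, 14175.0],
})

def log_log_correlation(frame: pd.DataFrame,
                        x: str = "PROPERTYSQFT",
                        y: str = "PRICE") -> float:
    """Pearson correlation of log10(x) and log10(y), skipping non-positive rows."""
    valid = frame[(frame[x] > 0) & (frame[y] > 0)]
    return float(np.log10(valid[x]).corr(np.log10(valid[y])))

r = log_log_correlation(sample)
print(f"log-log Pearson r = {r:.3f}")
```

Filtering out non-positive values mirrors what a logarithmic axis does implicitly, since such points cannot be drawn on it.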

# Price Distribution by House Type (Logarithmic Scale)

In [4]:
box_plot_log = px.box(df, x='TYPE', y='PRICE', color='TYPE',
                      title='Price Distribution by House Type (Logarithmic Scale)',
                      labels={'TYPE': 'House Type', 'PRICE': 'Price'},
                      log_y=True)
box_plot_log.show()
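
The center line of each box is the median price for that house type, and the same figure can be read off directly with a group-by. This is a minimal sketch on the five preview rows, not on the full dataset. `sample` and `median_by_type` are hypothetical names.

```python
import pandas as pd

# Hand-copied values from the df.head() preview; replace with the full `df`.
sample = pd.DataFrame({
    "TYPE": ["Condo for sale", "Condo for sale", "House for sale",
             "Condo for sale", "Townhouse for sale"],
    "PRICE": [315000, 195000000, 260000, 69000, 55000000],
})

# Median listing price per house type, highest first.
median_by_type = (
    sample.groupby("TYPE")["PRICE"]
    .median()
    .sort_values(ascending=False)
)
print(median_by_type)
```

The median is used here instead of the mean because a single very expensive listing, such as the penthouse in the preview, would otherwise dominate its group.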

# Geographical Distribution of Houses

In [5]:
import plotly.graph_objects as go

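# "satellite" is a Mapbox-hosted style that needs an access token; without one the
# basemap may not render. "open-street-map" works without a token.
map_layout = go.Layout(
    mapbox_style="open-street-map",
    mapbox_zoom=10,
    mapbox_center={"lat": 40.7128, "lon": -74.0060},  # roughly New York City
)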

scatter_map = go.Scattermapbox(
    lat=df['LATITUDE'],
    lon=df['LONGITUDE'],
    mode='markers',
    marker=go.scattermapbox.Marker(size=8),
    text=df['ADDRESS'], 
)

fig = go.Figure(data=[scatter_map], layout=map_layout)

fig.show()


# Price vs. Bedrooms vs. Bathrooms

In [6]:
scatter_3d_multicolor = px.scatter_3d(df, x='BEDS', y='BATH', z='PRICE', title='Price vs. Bedrooms vs. Bathrooms',
                                       labels={'BEDS': 'Bedrooms', 'BATH': 'Bathrooms', 'PRICE': 'Price'},
                                       color='PRICE')
scatter_3d_multicolor.show()

# Categorization of Houses

In [7]:
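# Note: in this dataset the STATE column holds strings such as "New York, NY 10022"
# (city, state, ZIP), so the innermost ring is more granular than a state-level grouping.
sunburst_chart = px.sunburst(df, path=['STATE', 'LOCALITY', 'SUBLOCALITY'], title='Categorization of Houses')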
sunburst_chart.show()


# Distribution of Property Square Footage by House Type (Logarithmic Scale)

In [8]:
violin_plot_multicolor_log = px.violin(df, x='TYPE', y='PROPERTYSQFT', color='TYPE',
                                       title='Distribution of Property Square Footage by House Type (Logarithmic Scale)',
                                       labels={'TYPE': 'House Type', 'PROPERTYSQFT': 'Property Square Footage'},
                                       log_y=True)

violin_plot_multicolor_log.show()