# Exploratory Data Analysis (EDA) on NYC Airbnb Dataset
In this notebook, we perform an exploratory data analysis of the NYC Airbnb dataset. We use `wandb` to track the analysis and log our results, and `ydata_profiling` to generate data profiles.

## 1. Import Required Libraries

In [None]:
import wandb
import os
import warnings
import pandas as pd
from ydata_profiling import ProfileReport

# Ignore warnings for cleaner output
warnings.filterwarnings('ignore')

## 2. Initialize W&B Run

In [None]:
# Start a W&B run to log EDA steps and save code
run = wandb.init(project='nyc_airbnb', group='eda', save_code=True)

## 3. Download and Load Data

In [None]:
# Download the latest version of the artifact from W&B; download() returns the local directory holding its files
local_path = wandb.use_artifact('sample.csv:latest').download()

# Load the data into a pandas DataFrame
df = pd.read_csv(os.path.join(local_path, 'sample1.csv'))

## 4. Generate Profile Report for Data Overview

In [None]:
# Generate an interactive profile report to explore the dataset
profile = ProfileReport(df)
profile.to_widgets()
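
When the widget view is unavailable (for example, outside a notebook), a few plain pandas calls can give a rough first look at missing values and distributions. The sketch below runs on a hypothetical toy frame; its values are made up and its column names are only assumed to resemble the dataset's. It is not a replacement for the profile report.

```python
import pandas as pd

# Hypothetical toy data; values are invented for illustration only
toy = pd.DataFrame({
    "price": [80, 150, 9999, 45],
    "room_type": ["Private room", "Entire home/apt", "Entire home/apt", None],
})

# Number of missing values per column
missing = toy.isna().sum()

# Summary statistics for a numeric column (count, mean, min, max, quartiles)
price_stats = toy["price"].describe()

# Frequency of each category, counting missing values as their own group
room_counts = toy["room_type"].value_counts(dropna=False)

print(missing)
print(price_stats)
print(room_counts)
```

A very large `max` next to a modest median in `describe()` is the kind of signal that motivates a price filter.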

## 5. Drop Price Outliers

In [None]:
# Set the minimum and maximum price range to remove outliers
min_price = 10
max_price = 350

# Keep only listings whose price falls within the range (both bounds inclusive)
idx = df['price'].between(min_price, max_price)
df = df[idx].copy()
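
`Series.between` is inclusive at both ends by default, so listings priced exactly at `min_price` or `max_price` are kept. A minimal sketch on a hypothetical toy frame (the prices are made up, not taken from the dataset) shows the effect of the filter:

```python
import pandas as pd

# Hypothetical toy data; prices are invented for illustration only
toy = pd.DataFrame({"price": [0, 10, 120, 350, 351, 9999]})

min_price = 10
max_price = 350

# between() includes both bounds by default
mask = toy["price"].between(min_price, max_price)
kept = toy[mask].copy()

print(kept["price"].tolist())
print(f"dropped {len(toy) - len(kept)} of {len(toy)} rows")
```

Printing the number of dropped rows is a cheap check that the chosen bounds do not discard more data than intended.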

## 6. Convert `last_review` to Datetime

In [None]:
# Convert 'last_review' column to datetime format
df['last_review'] = pd.to_datetime(df['last_review'])

# Regenerate the profile report so that it reflects the cleaned data
profile = ProfileReport(df)
profile.to_widgets()
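
If some rows have no `last_review` value (as may be the case for listings without any reviews), `pd.to_datetime` converts the missing entries to `NaT` instead of raising. A small sketch with hypothetical values, assuming ISO-formatted date strings:

```python
import pandas as pd

# Hypothetical toy data; one entry is missing on purpose
toy = pd.DataFrame({"last_review": ["2019-05-21", None, "2018-11-03"]})

# Missing values become NaT; valid strings become timestamps
toy["last_review"] = pd.to_datetime(toy["last_review"])

print(toy["last_review"].dtype)
print(toy["last_review"].isna().sum())
```

After conversion, the column supports date operations such as `toy["last_review"].dt.year` or `toy["last_review"].max()`.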

## 7. Inspect Data

In [None]:
# Display basic information about the cleaned dataset
# (info() prints directly and returns None, so no print() is needed)
df.info()

## 8. Finish W&B Run

In [None]:
# Finish the W&B run so that everything logged is uploaded
run.finish()