This project performs a detailed exploratory data analysis (EDA) on a landslide dataset using Python, pandas, matplotlib, and seaborn. The goal is to uncover meaningful patterns, trends, and outliers related to landslide occurrences, helping researchers or decision-makers understand environmental and human factors.
- Correlation heatmap of all numerical features
- Monthly frequency of landslides
- Environmental condition distributions:
  - Precipitation
  - Temperature
  - Slope angle
- Population density vs. Landslide category
- Types of landslides
- Top 10 locations with most landslides
- Landslides by day of the week
- Landslides by hour of the day
- Outlier detection using IQR for:
  - Fatalities
  - Injuries
  - Economic loss
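
The IQR rule mentioned in the list above can be sketched as follows. This is a minimal illustration, not the project's actual code: the helper name `iqr_outliers`, the conventional multiplier `k = 1.5`, and the sample values are all assumptions.

```python
import pandas as pd


def iqr_outliers(series: pd.Series, k: float = 1.5) -> pd.Series:
    """Return a boolean mask for values outside [Q1 - k*IQR, Q3 + k*IQR].

    Hypothetical helper; k = 1.5 is the common convention, assumed here.
    """
    q1 = series.quantile(0.25)
    q3 = series.quantile(0.75)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return (series < lower) | (series > upper)


# Illustrative values only, not taken from the real dataset.
fatalities = pd.Series([0, 1, 0, 2, 1, 0, 3, 1, 50])
mask = iqr_outliers(fatalities)
outliers = fatalities[mask]
print(outliers.tolist())
```

The same function could be applied to the `injuries` and `economic_loss` columns, assuming they are numeric.
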
All visualizations are displayed automatically in sequence; fallback (dummy) data is generated for any missing columns.
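
One way the fallback behavior could work is sketched below. The helper name `ensure_columns`, the random distributions, and their ranges are assumptions made for illustration; the project may generate its dummy data differently.

```python
import numpy as np
import pandas as pd


def ensure_columns(df: pd.DataFrame, seed: int = 0) -> pd.DataFrame:
    """Add dummy values for expected columns that are absent from df.

    Hypothetical helper: the generators and ranges below are assumptions,
    not the values the project actually uses.
    """
    rng = np.random.default_rng(seed)
    n = len(df)
    fallbacks = {
        "precipitation": lambda: rng.uniform(0, 300, n),
        "temperature": lambda: rng.normal(20, 8, n),
        "slope_angle": lambda: rng.uniform(0, 60, n),
        "fatalities": lambda: rng.poisson(1, n),
    }
    out = df.copy()  # leave the caller's DataFrame untouched
    for name, make in fallbacks.items():
        if name not in out.columns:
            out[name] = make()
    return out


# Illustrative input: only event_date and temperature are present.
df = pd.DataFrame({
    "event_date": ["2020-01-05 14:30", "2020-06-17 03:10"],
    "temperature": [12.5, 27.0],
})
filled = ensure_columns(df)
print(sorted(filled.columns))
```

Columns that already exist are left as they are, so real measurements are never overwritten by dummy values.
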
- Python 3.x
- pandas
- seaborn
- matplotlib
- numpy
The dataset is expected to be a CSV file containing at least the following column:
- `event_date` (used to extract the month, hour, and day of the week)
Optional (but auto-filled if missing):
- `precipitation`
- `temperature`
- `slope_angle`
- `population_density`
- `landslide_category`
- `landslide_type`
- `location`
- `fatalities`
- `injuries`
- `economic_loss`
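
As a rough sketch of how the month, hour, and day of the week might be derived from `event_date`, assuming pandas' datetime accessor is used (the derived column names `month`, `hour`, and `day_of_week` and the sample rows are illustrative, not taken from the project):

```python
import pandas as pd

# Illustrative rows only; the real data would come from the CSV file.
df = pd.DataFrame({
    "event_date": ["2021-03-15 08:45:00", "2021-07-04 22:10:00", "not a date"]
})

# Unparseable values become NaT instead of raising an error.
df["event_date"] = pd.to_datetime(df["event_date"], errors="coerce")
df = df.dropna(subset=["event_date"])

# Derived columns (names are assumptions for this sketch).
df["month"] = df["event_date"].dt.month
df["hour"] = df["event_date"].dt.hour
df["day_of_week"] = df["event_date"].dt.day_name()

print(df[["month", "hour", "day_of_week"]])
```

Dropping rows with unparseable dates is one possible choice; keeping them and skipping only the time-based plots would be another.
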
- Clone the repository:
    git clone https://github.com/your-username/landslide-eda.git
    cd landslide-eda