This repository has been archived by the owner on Mar 15, 2024. It is now read-only.

EDA On Meteorite Landing Sites

This was our final project for the 3rd-semester Introduction to Data Science course (UE18CS203). The dataset, with slight preprocessing applied, is hosted on my Google Drive: https://drive.google.com/file/d/1nLCVDfQy8NUu9NnD55meyj54ynkxiowp/view

Modules utilized for the project

  • pandas
  • numpy
  • sklearn
  • statsmodels
  • seaborn
  • matplotlib
  • mpl_toolkits
  • scipy
  • cython
  • pydrive
  • google
  • oauth2client

Contents of the project

  • Data Cleaning
  • Data Normalization - using StandardScaler()
  • Visualizations
    • Box plot - Check for outliers
    • Histogram - Check whether the data is normally distributed
    • Q-Q plot - Check whether the data is normally distributed
    • Map visualizations - Visualize a heat map for landing sites
    • Pie charts - Distribution of different types of meteorites
    • Heat map - Correlation matrix between columns
    • Scatter plot - Visualize relationships between correlated columns
  • Correlation graph - Find correlations between columns using the generated heat map
  • Hypothesis testing -
    H0: The difference between the sample mean mass and the population mean mass is a statistical fluctuation.
    H1: The difference between the sample mean mass and the population mean mass is significant and not merely a statistical fluctuation.
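
The hypotheses above compare a sample mean with a population mean. This README does not say which test the project used, so the following is only a minimal sketch, assuming a two-sided one-sample z-test with a known population standard deviation. The function name `one_sample_z_test`, the synthetic log-normal "masses", the sample size, and the 0.05 significance level are all illustrative assumptions and are not taken from the project. It uses only the Python standard library.

```python
import math
import random
import statistics


def one_sample_z_test(sample, pop_mean, pop_std):
    """Two-sided z-test of a sample mean against a known population mean."""
    n = len(sample)
    # Standard error of the mean under H0
    standard_error = pop_std / math.sqrt(n)
    z = (statistics.fmean(sample) - pop_mean) / standard_error
    # Two-sided p-value from the standard normal distribution
    p_value = 2 * (1 - statistics.NormalDist().cdf(abs(z)))
    return z, p_value


random.seed(0)
# Synthetic stand-in for the mass column (illustrative only, not real data)
population = [random.lognormvariate(5, 2) for _ in range(10_000)]
sample = random.sample(population, 200)

z, p = one_sample_z_test(
    sample,
    statistics.fmean(population),
    statistics.pstdev(population),
)

alpha = 0.05  # assumed significance level
reject_h0 = p < alpha
print(f"z = {z:.3f}, p = {p:.3f}, reject H0: {reject_h0}")
```

If `p` falls below `alpha`, H0 is rejected in favour of H1. Otherwise the difference is treated as a statistical fluctuation.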

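The normalization step listed under the contents uses StandardScaler(), which rescales each column to zero mean and unit variance using the population standard deviation. As a rough sketch of that computation for a single column, written with the standard library rather than sklearn, the following may help. The helper name `standardize` and the mass values are hypothetical and are not taken from the dataset.

```python
import statistics


def standardize(values):
    """Return z-scores: (x - mean) / population standard deviation."""
    mean = statistics.fmean(values)
    # Population standard deviation (ddof=0), matching StandardScaler
    std = statistics.pstdev(values)
    if std == 0:
        # A constant column has no spread, so every z-score is 0
        return [0.0 for _ in values]
    return [(x - mean) / std for x in values]


# Hypothetical meteorite masses in grams (illustrative values only)
masses = [21.0, 720.0, 107000.0, 1914.0, 780.0]
scaled = standardize(masses)
print(scaled)
```

Standardizing changes only the location and scale of a column, not its shape, so a skewed column stays skewed. This is why the histogram and Q-Q plot checks are still useful afterwards.
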
Team