
Exploratory Data Analysis (EDA) on the Titanic dataset using Python. Includes summary statistics, visualizations (histograms, boxplots, correlation heatmap), missing value analysis, and feature-level insights. Tools used: Pandas, NumPy, Matplotlib, Seaborn.


ad980096/Exploratory-data-analysis-task2


Exploratory-data-analysis-task2

Overview: This project performs Exploratory Data Analysis (EDA) on the Titanic dataset to understand patterns, correlations, missing values, and the behavior of individual features. The analysis relies on statistical summaries and visualizations.

🎯 Objectives

- Generate summary statistics (mean, median, std, etc.)
- Visualize numeric features using histograms and boxplots
- Analyze feature relationships using a correlation matrix
- Detect patterns, trends, and anomalies
- Make basic feature-level inferences from the data

🧰 Tools & Libraries: Python, Pandas, NumPy, Matplotlib, Seaborn (only for loading the dataset or optional visuals)

📂 Dataset

The project uses the Titanic dataset. You may load it through Seaborn or use a local titanic.csv.

    import seaborn as sns
    df = sns.load_dataset("titanic")

Or place titanic.csv in the project folder.

📊 What This Project Includes:

✔ Summary statistics (numeric & categorical)
✔ Missing value analysis
✔ Histograms for numeric variables
✔ Boxplots grouped by survival
✔ Correlation heatmap
✔ Scatter plots for key features
✔ Categorical value counts
✔ Basic insights like survival rate by sex and class

STRUCTURE:

    ├── eda_titanic.ipynb      # Jupyter Notebook with full EDA
    ├── titanic.csv            # Dataset (if used)
    ├── titanic_summary.csv    # Generated summary file
    └── README.md              # Project documentation
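The two loading routes from the Dataset section, together with the summary-statistics and missing-value steps, could be sketched roughly as follows. This is an illustration and not the notebook's actual code: the helper names `load_titanic` and `summarize` are hypothetical, and the toy frame is made-up data used only to show the shape of the output. Column names are assumed to be lowercase, as in Seaborn's copy of the dataset; a local titanic.csv from another source may use different capitalization.

```python
from pathlib import Path

import pandas as pd


def load_titanic(csv_path="titanic.csv"):
    """Hypothetical helper: use a local CSV if present, else Seaborn's copy."""
    path = Path(csv_path)
    if path.exists():
        return pd.read_csv(path)
    import seaborn as sns  # imported lazily; only needed for the fallback
    return sns.load_dataset("titanic")  # fetches the data on first use


def summarize(df):
    """Hypothetical helper: numeric summary plus a per-column missing-value table."""
    numeric_summary = df.describe()  # count, mean, std, min, quartiles, max
    missing = pd.DataFrame({
        "missing": df.isna().sum(),                    # absolute count of NaNs
        "percent": (df.isna().mean() * 100).round(1),  # share of rows that are NaN
    })
    return numeric_summary, missing


# Tiny made-up frame (not real Titanic rows), only to demonstrate the helpers
toy = pd.DataFrame({
    "age": [22.0, 38.0, None, 35.0],
    "fare": [7.25, 71.28, 7.92, 53.10],
    "sex": ["male", "female", "female", "female"],
})
numeric_summary, missing = summarize(toy)
print(numeric_summary)
print(missing)
```

On the real data, `summarize(load_titanic())` would give the same two tables. Calling `df.describe(include="object")` instead is one way to summarize text columns.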

Install required libraries:

    pip install pandas numpy matplotlib seaborn

Open the notebook:

    jupyter notebook

Run all cells.
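For readers who want a feel for the analysis before opening the notebook, here is a minimal sketch of two of the listed steps: survival rate by sex and class, and the correlation matrix over numeric features. The function names are hypothetical and the toy frame is made-up data. It assumes a 0/1 `survived` column and lowercase `sex` and `pclass` columns, as in Seaborn's copy of the dataset.

```python
import pandas as pd


def survival_rates(df):
    """Mean of the 0/1 'survived' column per group, i.e. the survival rate."""
    by_sex = df.groupby("sex")["survived"].mean()
    by_class = df.groupby("pclass")["survived"].mean()
    return by_sex, by_class


def numeric_correlations(df):
    """Pearson correlation matrix restricted to numeric columns."""
    return df.select_dtypes(include="number").corr()


# Made-up rows for illustration only; not taken from the real dataset
toy = pd.DataFrame({
    "survived": [0, 1, 1, 1, 0, 0],
    "pclass": [3, 1, 3, 1, 3, 2],
    "sex": ["male", "female", "female", "female", "male", "male"],
    "fare": [7.25, 71.28, 7.92, 53.10, 8.05, 13.00],
})
by_sex, by_class = survival_rates(toy)
corr = numeric_correlations(toy)
print(by_sex)
print(by_class)
print(corr.round(2))
```

The resulting `corr` frame can be passed to `seaborn.heatmap(corr, annot=True)` to draw a correlation heatmap. Restricting to numeric columns first avoids errors from text columns such as `sex`.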

📈 Results

The EDA provides:

- Key numerical distributions
- Relationship between passenger class, gender, age, and survival
- Data quality insights (missing values, outliers)
- Correlations among numeric features

📬 Author

Project completed as part of a Data Analytics/ML learning task.
