RickyH22/pandas_getting_started

Pandas Learning Project

This repository contains my pandas learning exercises, following the official pandas tutorial.

Files

  • titanic_data.py - Basic pandas DataFrame and Series operations: practice with Titanic passenger data, covering DataFrame creation, filtering, and statistical analysis

What I Learned

  • Creating DataFrames from dictionaries
  • Working with pandas Series
  • Basic statistical operations (.describe(), .max(), .mean())
  • Filtering data
  • Understanding DataFrame vs Series
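
The topics above can be sketched in a few lines. This is a minimal illustration, not the contents of titanic_data.py: the three sample rows and the variable names are assumptions chosen to resemble the passenger data used in the official pandas tutorial.

```python
import pandas as pd

# Hypothetical sample rows; the data in titanic_data.py may differ.
passengers = pd.DataFrame(
    {
        "Name": [
            "Braund, Mr. Owen Harris",
            "Allen, Mr. William Henry",
            "Bonnell, Miss. Elizabeth",
        ],
        "Age": [22, 35, 58],
        "Sex": ["male", "male", "female"],
    }
)

# Selecting a single column returns a Series.
ages = passengers["Age"]

# Basic statistics on the Age column.
oldest = ages.max()
average = ages.mean()

# describe() summarizes numeric columns only by default.
summary = passengers.describe()

# Boolean-mask filtering: keep passengers older than 30.
over_30 = passengers[passengers["Age"] > 30]

print(summary)
print(over_30)
```

The filter works because `passengers["Age"] > 30` produces a boolean Series, and indexing the DataFrame with it keeps only the rows where the value is True.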

How to Run

python titanic_data.py

Sample Output

The script demonstrates:

  • DataFrame creation with passenger data
  • Basic statistics on Age column
  • Series creation and manipulation
  • Comparison between DataFrame columns and standalone Series
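
To make the last point concrete, here is a small sketch of how a DataFrame column relates to a standalone Series. The values and names are illustrative assumptions, not output copied from the script.

```python
import pandas as pd

# Hypothetical two-column table.
df = pd.DataFrame({"Name": ["A", "B", "C"], "Age": [22, 35, 58]})

# Single brackets select one column and return a Series.
column = df["Age"]

# A Series can also be built directly, without any DataFrame.
standalone = pd.Series([22, 35, 58], name="Age")

# Same values, same dtype, same default index, so the two compare equal.
same_values = column.equals(standalone)

# Double brackets keep the result two-dimensional: a one-column DataFrame.
single_col_frame = df[["Age"]]

print(type(column).__name__, column.shape)
print(type(single_col_frame).__name__, single_col_frame.shape)
```

A Series is one-dimensional and has no column labels of its own, only a name, which is why its shape has a single entry while the one-column DataFrame still reports rows and columns.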

Next Steps

  • Load real CSV data
  • Data cleaning and manipulation
  • Advanced filtering and grouping
  • Data visualization with matplotlib

Files

  • titanic_data.py - Basic pandas DataFrame and Series operations with sample data
  • titanic.csv.py - Comprehensive real Titanic dataset analysis

titanic.csv.py Features

This advanced analysis includes:

  • Data loading that tries a local file first and falls back to an online copy
  • Complete demographic survival analysis
  • Age group categorization with pd.cut()
  • Statistical summaries and missing data handling
  • Real-world dataset exploration techniques
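
Two of these features, the loading fallback and the age grouping, might look roughly like the sketch below. Everything named here is an assumption made for illustration: the helper names `load_titanic` and `add_age_group`, the fallback URL (the copy of the dataset used by the pandas documentation), the `Age` column name, and the bin edges and labels. The script itself may do this differently.

```python
from pathlib import Path

import pandas as pd

# Assumed fallback location; the URL used by the script is not shown in this README.
FALLBACK_URL = (
    "https://raw.githubusercontent.com/pandas-dev/pandas/main/doc/data/titanic.csv"
)


def load_titanic(local_path="titanic.csv", url=FALLBACK_URL):
    """Read the local CSV if it exists, otherwise read from the URL."""
    path = Path(local_path)
    if path.exists():
        return pd.read_csv(path)
    return pd.read_csv(url)


def add_age_group(df):
    """Return a copy of df with a categorical AgeGroup column (hypothetical bins)."""
    out = df.copy()
    out["AgeGroup"] = pd.cut(
        out["Age"],
        bins=[0, 12, 18, 60, 120],  # right-inclusive: (0, 12], (12, 18], ...
        labels=["Child", "Teen", "Adult", "Senior"],
    )
    return out


# Small offline demo; a missing age stays missing after binning.
demo = add_age_group(pd.DataFrame({"Age": [5, 15, 40, 70, None]}))
print(demo)
```

With the real data, `add_age_group(load_titanic())` would chain the two steps. Rows with a missing age get a missing `AgeGroup`, so they drop out of any later `groupby` on that column unless handled explicitly.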

Sample Analysis Output

The script provides insights like:

  • Overall survival rate: ~38.4%
  • Female survival rate: ~74.2%
  • Male survival rate: ~18.9%
  • 1st class survival rate: ~62.9%
  • 3rd class survival rate: ~24.2%
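
Rates like these are typically computed as the mean of a 0/1 survival column, overall and per group. The sketch below assumes the column names `Survived`, `Sex`, and `Pclass` and uses a made-up eight-row table, so its numbers are not the figures listed above; `survival_rates` is a hypothetical helper name.

```python
import pandas as pd


def survival_rates(df):
    """Return overall, per-sex, and per-class survival rates in percent."""
    return {
        "overall": df["Survived"].mean() * 100,
        "by_sex": (df.groupby("Sex")["Survived"].mean() * 100).round(1),
        "by_class": (df.groupby("Pclass")["Survived"].mean() * 100).round(1),
    }


# Toy data for illustration only, not the real Titanic dataset.
toy = pd.DataFrame(
    {
        "Survived": [1, 0, 1, 0, 0, 1, 0, 1],
        "Sex": ["female", "male", "female", "male", "male", "female", "female", "male"],
        "Pclass": [1, 3, 1, 3, 2, 3, 1, 2],
    }
)

rates = survival_rates(toy)
print(f"Overall: {rates['overall']:.1f}%")
print(rates["by_sex"])
print(rates["by_class"])
```

Because `Survived` holds 1 for survivors and 0 otherwise, its mean is the fraction who survived, and `groupby` applies the same calculation within each group.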
