Learn end-to-end Data Analysis: import data from CSV, SQL, Excel, and more; process it with NumPy & Pandas; visualize using Matplotlib & Seaborn; and clean it for reporting. Includes a hands-on Jupyter Notebook tutorial and a quick Python refresher.

OmarHeshamShehab/Data-Analysis-with-Python

🧠 Data Analysis with Python

A complete hands-on guide to mastering data analysis in Python — from NumPy and Pandas to data cleaning, data sourcing, and practical projects.


🌟 Overview

This repository is a comprehensive Python data analysis learning path designed for beginners and intermediate learners who want to build strong foundations in data science.

Through interactive Jupyter notebooks, you’ll explore:

  • 🧮 Numerical computing with NumPy
  • 📊 Data manipulation with Pandas
  • 🧹 Data cleaning and preprocessing
  • 📁 Reading from multiple data sources (CSV, Excel, JSON, APIs)
  • 🐍 Python programming recap
  • 💡 Real-world practical examples

Each module contains theory, examples, and exercises to help you develop intuition and practical skills in data analysis.


🧱 Project Structure

Data-Analysis-with-Python/
│
├── 1- Intro to Numpy/
│   ├── 1. NumPy.ipynb
│   └── 2. NumPy exercises.ipynb
│
├── 2- Intro to Pandas/
│   ├── Pandas Basics.ipynb
│   ├── DataFrames and Series.ipynb
│   ├── Pandas Exercises.ipynb
│   └── README.md
│
├── 3- Intro to Data Cleaning/
│   ├── Missing Values.ipynb
│   ├── Outlier Detection.ipynb
│   ├── Data Transformation.ipynb
│   └── README.md
│
├── 4- Reading From Different Data Source/
│   ├── CSV_Files.ipynb
│   ├── Excel_Files.ipynb
│   ├── JSON_and_APIs.ipynb
│   └── README.md
│
├── 5- Python Recap/
│   ├── Python_Basics.ipynb
│   ├── Functions_and_Loops.ipynb
│   └── README.md
│
├── 6- Practical Examples/
│   ├── Sales_Data_Analysis.ipynb
│   ├── Customer_Data_Insights.ipynb
│   ├── Visualization_with_Matplotlib.ipynb
│   └── README.md
│
├── .gitignore
└── README.md

🧭 Learning Roadmap

| Stage | Focus Area | Key Skills |
|-------|------------|------------|
| 1️⃣ | NumPy | Arrays, broadcasting, vectorization |
| 2️⃣ | Pandas | DataFrames, indexing, filtering, grouping |
| 3️⃣ | Data Cleaning | Missing values, outlier detection, transformation |
| 4️⃣ | Data Sources | Reading from CSV, Excel, JSON, and APIs |
| 5️⃣ | Python Recap | Loops, functions, conditionals, list comprehensions |
| 6️⃣ | Practical Examples | Real-world analytics and visualization |

🛠️ Installation

Step 1: Clone the repository

git clone https://github.com/<your-username>/Data-Analysis-with-Python.git
cd Data-Analysis-with-Python

Step 2: Create and activate a conda environment

conda create --name myenv python=3.11
conda activate myenv

Step 3: Install dependencies into the environment from Step 2

conda env update --name myenv --file environment.yml

🚀 Usage

With the environment active, start Jupyter Notebook from the repository root:

jupyter notebook

Then open any notebook (e.g. 1- Intro to Numpy/1. NumPy.ipynb) and start exploring.


🧩 Modules Overview

1️⃣ Intro to NumPy

  • Learn the basics of data science and numerical computing
  • Understand NumPy arrays, vectorization, and matrix operations
  • Perform element-wise computations efficiently
    📘 Key files: 1. NumPy.ipynb, 2. NumPy exercises.ipynb
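
To give a feel for what "vectorization" and "broadcasting" mean in practice, here is a minimal sketch. The arrays and variable names are hypothetical sample data, not taken from the notebooks.

```python
import numpy as np

# Hypothetical sample data: unit prices and quantities for four items.
prices = np.array([2.5, 4.0, 1.25, 10.0])
quantities = np.array([4, 1, 8, 2])

# Vectorized element-wise multiplication: no explicit Python loop.
line_totals = prices * quantities

# Broadcasting a scalar: 0.9 is applied to every element.
discounted = line_totals * 0.9

# Broadcasting a (3, 1) column against a (4,) row yields a (3, 4) grid,
# one row of totals per tax rate.
tax_rates = np.array([[0.0], [0.05], [0.10]])
totals_by_rate = line_totals * (1 + tax_rates)

# Matrix-style operation: the dot product sums price * quantity.
order_total = prices @ quantities

print(line_totals, totals_by_rate.shape, order_total)
```

The same results could be produced with loops, but the array expressions are shorter and run in compiled code inside NumPy.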

2️⃣ Intro to Pandas

  • Explore Series and DataFrames
  • Learn indexing, grouping, and aggregation
  • Handle missing data and compute statistics
    📘 Key files: Pandas Basics.ipynb, Pandas Exercises.ipynb
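
As a rough illustration of the topics above (filtering, missing data, grouping, aggregation), the sketch below uses a small made-up DataFrame. The column names and values are assumptions for demonstration only.

```python
import numpy as np
import pandas as pd

# Hypothetical sales records; one amount is missing on purpose.
df = pd.DataFrame({
    "region": ["North", "South", "North", "South", "East"],
    "amount": [100.0, 250.0, np.nan, 150.0, 75.0],
})

# Filtering with a boolean mask.
south = df.loc[df["region"] == "South"]

# Count missing values per column.
missing = df.isna().sum()

# Fill the missing amount with the column mean (mean() skips NaN by default).
filled = df.assign(amount=df["amount"].fillna(df["amount"].mean()))

# Group by region and compute two aggregates per group.
summary = filled.groupby("region")["amount"].agg(["sum", "mean"])

print(summary)
```

Filling with the mean is only one possible strategy; dropping rows or using a group-wise median can be more appropriate depending on the data.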

3️⃣ Intro to Data Cleaning

  • Identify and handle missing or inconsistent values
  • Detect outliers using IQR and z-score
  • Standardize and transform data for analysis
    📘 Key files: Missing Values.ipynb, Outlier Detection.ipynb
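
The two outlier rules named above can be sketched as follows. The sample values, the 1.5 × IQR fences, and the z-score cutoff of 2 are assumptions chosen for this small example; the notebooks may use different thresholds.

```python
import pandas as pd

# Hypothetical measurements with one obviously extreme value.
values = pd.Series([10, 12, 11, 13, 12, 11, 10, 95])

# IQR rule: flag points beyond 1.5 * IQR from the quartiles.
q1 = values.quantile(0.25)
q3 = values.quantile(0.75)
iqr = q3 - q1
iqr_mask = (values < q1 - 1.5 * iqr) | (values > q3 + 1.5 * iqr)

# Z-score rule: standardize to mean 0 and unit (population) standard deviation.
z = (values - values.mean()) / values.std(ddof=0)

# A cutoff of 3 is common for large samples. With only 8 points the largest
# possible |z| is sqrt(8 - 1), about 2.65, so this sketch uses 2 instead.
z_mask = z.abs() > 2

print(values[iqr_mask].tolist(), values[z_mask].tolist())
```

The standardized series `z` is also an example of the transformation step: after it, the values have mean 0 and standard deviation 1.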

4️⃣ Reading From Different Data Sources

  • Load data from CSV, Excel, JSON, and API endpoints
  • Understand data import best practices and performance
    📘 Key files: CSV_Files.ipynb, JSON_and_APIs.ipynb
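
A minimal sketch of loading tabular and nested data with pandas is shown below. To stay self-contained it reads from in-memory strings instead of real files or a live endpoint; the CSV content and the JSON payload are invented for illustration. Excel files are typically read the same way with `pd.read_excel`, which needs an engine such as openpyxl installed.

```python
import io
import json

import pandas as pd

# Hypothetical CSV content; in practice this would be a file path.
csv_text = "id,date,amount\n1,2024-01-05,19.99\n2,2024-01-06,5.00\n"

# Declaring dtypes and date columns up front avoids surprises later
# and can reduce memory use on large files.
csv_df = pd.read_csv(
    io.StringIO(csv_text),
    dtype={"id": "int64"},
    parse_dates=["date"],
)

# Hypothetical API response body with nested objects.
api_payload = (
    '{"results": [{"id": 1, "user": {"name": "Ada"}},'
    ' {"id": 2, "user": {"name": "Lin"}}]}'
)
records = json.loads(api_payload)["results"]

# json_normalize flattens nested keys into dotted column names.
api_df = pd.json_normalize(records)

print(csv_df.dtypes)
print(api_df)
```

With a real API, the payload string would come from an HTTP client response; the flattening step stays the same.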

5️⃣ Python Recap

  • Review essential Python programming concepts
  • Learn about data types, loops, functions, and file handling
    📘 Key files: Python_Basics.ipynb, Functions_and_Loops.ipynb
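
The concepts listed above fit into a few lines of plain Python. The sketch below is illustrative only; the function name, the data, and the file name are hypothetical.

```python
import tempfile
from pathlib import Path


def mean(values):
    """Return the arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


readings = [3, 8, 5, 12, 7]
average = mean(readings)

# Loop with a conditional: collect readings above the average.
above = []
for r in readings:
    if r > average:
        above.append(r)

# The same logic as a list comprehension.
above_lc = [r for r in readings if r > average]

# File handling: write the readings to a temporary file and read them back.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "readings.txt"
    path.write_text("\n".join(str(r) for r in readings))
    loaded = [int(line) for line in path.read_text().splitlines()]

print(average, above, loaded)
```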

6️⃣ Practical Examples

  • Work through real-world scenarios that combine the skills from earlier modules
  • Practice sales analysis, customer segmentation, and data visualization
    📘 Key files: Sales_Data_Analysis.ipynb, Visualization_with_Matplotlib.ipynb

🧠 Skills You’ll Gain

  • Data wrangling with Pandas
  • Efficient computation with NumPy
  • Data cleaning and preprocessing
  • Working with structured and semi-structured data sources (CSV, Excel, JSON, APIs)
  • Exploratory Data Analysis (EDA)
  • Python scripting and visualization

🌱 Future Improvements

  • Add machine learning module using scikit-learn
  • Include interactive dashboards (Plotly, Dash)
  • Add SQL integration examples
  • Integrate with real datasets (Kaggle / APIs)

🤝 Contributing

Contributions are welcome!
If you’d like to improve examples, fix bugs, or add new lessons:

  1. Fork this repo
  2. Create a feature branch
  3. Submit a pull request


👤 Author

Omar Shehab
💼 GitHub: github.com/OmarHeshamShehab
