# Python for Data Analytics
--- 

**Skill Level:** 
Absolute beginners to intermediate learners

**Career Paths:** 
Data Analyst, Business Analyst, Junior Data Scientist.

**Goal:** 
Equip learners with job-ready skills in Python for data analytics, from programming fundamentals to applied analytics, data wrangling, visualization, and storytelling.

**Tools:** 
Jupyter Notebook, pandas, NumPy, matplotlib, seaborn, and scikit-learn.

**Project Format:** 
Mini-projects, real-world datasets, code reviews

## Phase 1: Python Programming Foundations
***(Weeks 1–4) - Build foundational Python skills for data work.***


### Week 1: Python Basics & Environment Setup
**Topics:**
* Installing Python and Jupyter Notebooks
* Data types, variables, inputs, outputs
* Basic arithmetic and string formatting

**Mini Project:**
* BMI calculator 
* Name Formatter
* Currency Converter

**Project:**
* “Data Entry Sanitizer” – Clean and format text inputs like phone numbers or names.


### Installing Python and Jupyter Notebooks

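To start programming in Python, you need a Python interpreter and a notebook environment.

**Python**

Install it from https://python.org. The installer includes pip, which the commands that follow use to install Jupyter.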

**JupyterLab**
```bash
# Install JupyterLab with pip
pip install jupyterlab

# Launch JupyterLab
jupyter lab
```
**Jupyter Notebook**
```bash
# Install the classic Jupyter Notebook
pip install notebook

# Run the notebook server
jupyter notebook
```

## Markdown Basics

Markdown is a lightweight markup language that uses characters like # for headings and * for emphasis to format text simply and intuitively.

| Element        | Markdown Syntax                        |
|----------------|----------------------------------------|
| Heading        | `# H1`<br>`## H2`<br>`### H3`          |
| Bold           | `**bold text**`                        |
| Italic         | `*italicized text*`                    |
| Blockquote     | `> blockquote`                         |
| Ordered List   | `1. First item`<br>`2. Second item`<br>`3. Third item` |
| Unordered List | `- First item`<br>`- Second item`<br>`- Third item`   |
| Code           | `` `code` ``                           |
| Horizontal Rule| `---`                                  |
| Link           | `[title](https://www.example.com)`     |
| Image          | `![alt text](image.jpg)`               |

[More information on Markdown syntax](https://www.markdownguide.org/basic-syntax/)

### Data Types, Variables, Inputs, Outputs

#### Common Data Types

In [5]:
# Integer
age = 25

# Float
height = 5.9

# String
name = "Alice"

# Boolean
is_student = True


#### Getting User Input & Displaying Output

In [None]:
# Input
user_name = input("Enter your name: ")

# Output
print("Hello,", user_name)
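
`input()` always returns a string, so numeric input has to be converted before doing arithmetic with it. The short sketch below uses a fixed string in place of a real prompt (the value `"25"` is only an illustration) to show the difference:

```python
# Stand-in for: raw_age = input("Enter your age: ")
raw_age = "25"

print(type(raw_age))    # <class 'str'>

# Convert the text to an integer before doing arithmetic
age = int(raw_age)
print(age + 1)          # 26

# Without conversion, + joins strings instead of adding numbers
print(raw_age + "1")    # 251
```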


### Basic Arithmetic and String Formatting

#### Arithmetic Operators

In [None]:
a = 10
b = 3

print("Addition:", a + b)
print("Subtraction:", a - b)
print("Multiplication:", a * b)
print("Division:", a / b)
print("Modulus:", a % b)
print("Exponent:", a ** b)
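
One detail worth knowing about the operators above: `/` always produces a float, while `//` performs floor division. A small sketch with the same values of `a` and `b`:

```python
a = 10
b = 3

print(a / b)          # 3.3333333333333335 (true division, always a float)
print(a // b)         # 3 (floor division)
print(divmod(a, b))   # (3, 1): quotient and remainder together
print(10 / 2)         # 5.0, still a float even though the result is whole
```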


#### String Formatting

In [None]:
name = "Alex"
age = 30

# Using f-strings
print(f"My name is {name} and I am {age} years old.")
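
f-strings also accept format specifiers after a colon, which is useful when reporting numbers. The values below are made up for illustration:

```python
price = 1234.5
share = 0.256
label = "Alex"

price_text = f"Price: {price:,.2f}"   # thousands separator, two decimals
share_text = f"Share: {share:.1%}"    # percentage with one decimal
padded = f"[{label:>8}]"              # right-aligned in a field of width 8

print(price_text)   # Price: 1,234.50
print(share_text)   # Share: 25.6%
print(padded)       # [    Alex]
```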


### Mini Project:

#### 1. BMI Calculator

In [1]:
# BMI = weight (kg) / height (m)^2

weight = float(input("Enter your weight in kg: "))
height = float(input("Enter your height in meters: "))

bmi = weight / (height ** 2)

print(f"Your BMI is {bmi:.2f}")


#### 2. Name Formatter

In [2]:
full_name = input("Enter your full name: ")

# Clean and format
clean_name = full_name.strip().title()

print(f"Formatted Name: {clean_name}")


#### 3. Currency Converter

In [3]:

usd = float(input("Enter amount in USD: "))
exchange_rate = 75  # Example: 1 USD = 75 INR

inr = usd * exchange_rate

print(f"{usd} USD = {inr:.2f} INR")


### Project:
“Data Entry Sanitizer” – Clean and format text inputs like phone numbers or names.


In [8]:
# Names (remove extra spaces, capitalize properly)
def sanitize_name(name):
    return name.strip().title()

# Example
raw_name = input("Enter a name: ")
cleaned_name = sanitize_name(raw_name)
print(f"Sanitized Name: {cleaned_name}")


Enter a name:  drani Godfrey 


Sanitized Name: Drani Godfrey


In [10]:
# Phone numbers (remove separators and add a country code prefix; digits are not validated)
def sanitize_phone_number(phone):
    # Remove spaces, dashes, parentheses
    phone = phone.replace(" ", "").replace("-", "").replace("(", "").replace(")", "")

    # Numbers that already carry a country code are left as they are
    if phone.startswith("+"):
        return phone

    # Otherwise add a prefix
    if len(phone) == 10:
        phone = "+256" + phone  # Example: +256 is Uganda's country code
    else:
        phone = "+" + phone

    return phone

# Example
raw_phone = input("Enter phone number: ")
clean_phone = sanitize_phone_number(raw_phone)
print(f"Sanitized Phone Number: {clean_phone}")


Enter phone number:  242342432423r23434r234r234r324


Sanitized Phone Number: +242342432423r23434r234r234r324
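
The run above shows that letters pass straight through, because the function only strips separators and never checks what remains. A stricter variant might look like the sketch below. The name `sanitize_phone_strict`, the default country code, and the 9 to 15 digit plausibility range are assumptions for illustration, not part of the original project.

```python
import re

def sanitize_phone_strict(phone, country_code="+256"):
    """Return a normalized phone number, or None if the input is not plausible."""
    # Drop spaces, dashes, dots, and parentheses
    cleaned = re.sub(r"[\s\-().]", "", phone)
    has_plus = cleaned.startswith("+")
    digits = cleaned[1:] if has_plus else cleaned

    # Reject anything that still contains non-digit characters (or is empty)
    if not digits.isdigit():
        return None

    if not has_plus and len(digits) == 10:
        # Same rule as the cell above: prefix 10-digit numbers with the country code
        normalized = country_code + digits
    else:
        normalized = "+" + digits

    # Assumed plausibility check: 9 to 15 digits after the plus sign
    return normalized if 9 <= len(normalized) - 1 <= 15 else None

print(sanitize_phone_strict("242342432423r23434r234r234r324"))  # None
print(sanitize_phone_strict("+256 772 123456"))                 # +256772123456
```

Returning `None` for bad input lets the calling code decide whether to re-prompt the user or flag the record.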


In [None]:
name = input("Enter your name: ")
phone = input("Enter your phone number: ")

print("\nSanitized Outputs:")
print("Name:", sanitize_name(name))
print("Phone:", sanitize_phone_number(phone))


### Week 2: Control Flow & Logic in Data Contexts
**Topics:**
* Conditional logic: if, elif, else
* Loops: for, while, range()
* Boolean expressions and comparison operators
* Simple data validation (email, age, salary)

**Project:** 
* "Survey Quality Checker" – Write a script to flag bad entries in a mock survey dataset.
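
As a rough idea of what the Survey Quality Checker could look like, the sketch below loops over a few mock rows and flags problems with conditionals. The rows, the field names, and the validation rules (age between 18 and 100, non-negative salary, a very loose email check) are all assumptions for illustration.

```python
# Mock survey rows (invented data)
survey = [
    {"email": "ann@example.com", "age": 34, "salary": 52000},
    {"email": "bob.example.com", "age": 29, "salary": 48000},
    {"email": "cy@example.com", "age": -5, "salary": 61000},
    {"email": "di@example.com", "age": 41, "salary": -100},
]

def find_problems(row):
    """Return a list of human-readable problems found in one survey row."""
    problems = []
    email = row["email"]
    # Loose email check: needs an @ and a dot in the domain part
    if "@" not in email or "." not in email.split("@")[-1]:
        problems.append("invalid email")
    if not 18 <= row["age"] <= 100:
        problems.append("age out of range")
    if row["salary"] < 0:
        problems.append("negative salary")
    return problems

for index, row in enumerate(survey):
    problems = find_problems(row)
    if problems:
        print(f"Row {index}: {', '.join(problems)}")
```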


### Week 3: Functions, Reusability & Error Handling
**Key concepts:**

* Defining functions (def)
* Parameters and return values
* Built-in functions vs. user-defined functions
* Default arguments and variable scope
* Lambda expressions for quick filtering
* Error handling: try/except, assert, basic logging

**Mini Project:**
* Reusable data cleaning and transformation functions

**Project:** 
* "Data Cleaning Toolbox" – Build reusable functions for tasks like removing whitespace, fixing dates, capitalizing names.
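
A minimal sketch of two toolbox functions follows, combining default arguments, `try`/`except`, and a lambda. The function names, the `case` parameter, and the day/month/year input format are assumptions chosen for the example.

```python
from datetime import datetime

def clean_text(value, case="title"):
    """Strip outer whitespace, collapse inner runs of spaces, and apply a case style."""
    text = " ".join(value.split())
    return text.title() if case == "title" else text.lower()

def parse_date(value, fmt="%d/%m/%Y"):
    """Return an ISO date string, or None when the text does not match fmt."""
    try:
        return datetime.strptime(value.strip(), fmt).date().isoformat()
    except ValueError:
        return None

names = ["  alice   SMITH ", "", "bob jones"]
# Lambda used as a quick filter: keep only non-blank entries
non_empty = list(filter(lambda name: name.strip(), names))

print([clean_text(name) for name in non_empty])  # ['Alice Smith', 'Bob Jones']
print(parse_date("31/01/2024"))                  # 2024-01-31
print(parse_date("2024-01-31"))                  # None
```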


### Week 4: Lists, Dictionaries & Working with Nested Data
**Topics:**
* Data structures: list, dict, tuple, set
* Indexing, slicing, updating
* Nested structures and iteration
* List comprehensions (intro)
  
**Mini Project:**
* Frequency counters
* Filtering and grouping

**Project:** 
* "Mini CRM System" – Store, search, and update mock customer records using nested dictionaries/lists.
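
One possible shape for the Mini CRM data is a dictionary of customer IDs mapping to nested records. The sketch below, with invented customers, shows updating, searching, and counting over that structure:

```python
# Invented customer records keyed by ID
customers = {
    "C001": {"name": "Alice", "city": "Kampala", "orders": [120, 80]},
    "C002": {"name": "Bob", "city": "Nairobi", "orders": [45]},
    "C003": {"name": "Cara", "city": "Kampala", "orders": []},
}

# Update a nested value
customers["C002"]["orders"].append(60)

# Search with a list comprehension
in_kampala = [c["name"] for c in customers.values() if c["city"] == "Kampala"]
print(in_kampala)    # ['Alice', 'Cara']

# Frequency counter built with a plain dict
city_counts = {}
for record in customers.values():
    city_counts[record["city"]] = city_counts.get(record["city"], 0) + 1
print(city_counts)   # {'Kampala': 2, 'Nairobi': 1}

# Total order value per customer with a dict comprehension
totals = {cid: sum(c["orders"]) for cid, c in customers.items()}
print(totals)        # {'C001': 200, 'C002': 105, 'C003': 0}
```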


## Phase 2: Data Analytics With Python
**(Weeks 5–10) - Hands-on with Real-World Data Workflows**

### Week 5: File I/O and Data Ingestion
**Topics:**
* Reading/writing: .txt, .csv using open(), csv module
* Working with file paths
* JSON parsing with json module
* Basic exception handling for file errors
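
The topics above can be combined in a few lines. This sketch reads CSV text with `csv.DictReader` and skips rows whose amount is missing or not numeric. An in-memory `io.StringIO` stands in for a real file so the example runs anywhere; the column names and values are made up.

```python
import csv
import io

# Stand-in for open("sales.csv", newline=""); file name and columns are hypothetical
raw = io.StringIO("order_id,amount\n1,100.50\n2,\n3,abc\n4,49.50\n")

amounts = []
skipped = []
for row in csv.DictReader(raw):
    try:
        amounts.append(float(row["amount"]))
    except ValueError:
        # Empty or non-numeric amount: remember the row instead of crashing
        skipped.append(row["order_id"])

total = sum(amounts)
average = total / len(amounts) if amounts else 0.0
print(f"Total: {total:.2f}, average: {average:.2f}, skipped rows: {skipped}")
# Total: 150.00, average: 75.00, skipped rows: ['2', '3']
```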

**Project:** 
* "Sales Data Summary" – Load a CSV and calculate totals and averages, handling missing or malformed rows.


### Week 6: NumPy for Efficient Numerical Computing
**Topics:**
* Why NumPy: performance vs. Python lists
* Arrays, slicing, reshaping
* Element-wise operations and broadcasting
* Aggregation: sum(), mean(), std()
* Boolean indexing for filtering

**Project:** 
* Analyze a small dataset using NumPy arrays for transformation and filtering
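
A short sketch of the NumPy ideas listed for this week, using invented price and quantity arrays:

```python
import numpy as np

prices = np.array([10.0, 25.0, 40.0, 55.0])
quantities = np.array([3, 1, 4, 2])

# Element-wise multiplication, no explicit loop needed
revenue = prices * quantities            # [ 30.  25. 160. 110.]

# Broadcasting: the scalar is applied to every element
discounted = prices * 0.9

# Boolean indexing keeps only the entries where the mask is True
high = revenue[revenue > 100]            # [160. 110.]

print(revenue.sum(), revenue.mean())     # 325.0 81.25
print(prices.reshape(2, 2).shape)        # (2, 2)
```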


### Week 7: Pandas Fundamentals
**Topics:**
* DataFrame vs Series
* Loading CSV/Excel data
* Exploring datasets: .head(), .info(), .describe()
* Selecting/filtering with .loc[], .iloc[]

**Mini Project:**
* Filter high-value transactions or recent records

**Project:** 
* “HR Analytics Explorer” – Analyze headcount, attrition rate, salary bands
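
To make the selection methods concrete, here is a small sketch on an invented HR table. The column names and values are assumptions; a real project would load the data with `pd.read_csv` instead.

```python
import pandas as pd

# Invented HR data
df = pd.DataFrame({
    "employee": ["Ann", "Ben", "Cy", "Di"],
    "department": ["Sales", "Sales", "IT", "IT"],
    "salary": [52000, 48000, 61000, 75000],
    "left_company": [False, True, False, False],
})

print(df.head(2))        # first two rows
print(df.describe())     # summary statistics for numeric columns

# .loc selects by label or boolean mask; here rows by condition, columns by name
high_paid = df.loc[df["salary"] > 50000, ["employee", "salary"]]

# .iloc selects by integer position
first_row = df.iloc[0]

# The mean of a boolean column is the share of True values
attrition_rate = df["left_company"].mean()
print(f"Attrition rate: {attrition_rate:.0%}")   # Attrition rate: 25%
```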


### Week 8: Data Cleaning with Pandas
**Topics:**
* Handling nulls: isna(), fillna()
* String operations: .str.lower(), .str.strip(), .str.replace()
* Date parsing and type conversion
* Renaming columns, dropping duplicates

**Project:** 
* “CRM Export Cleaner” – Clean a messy dataset for sales or marketing team
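
A cleaning pass over a messy export might chain the steps above roughly as follows. The data and column names are invented, and filling missing spend with 0 is an assumption that would need to be checked against the business meaning of the column.

```python
import pandas as pd

# Invented messy export
df = pd.DataFrame({
    "Customer Name": ["  alice smith", "BOB JONES ", "  alice smith", None],
    "signup": ["2024-01-05", "2024-02-10", "2024-01-05", "not a date"],
    "spend": [120.0, None, 120.0, 80.0],
})

# Consistent column name
df = df.rename(columns={"Customer Name": "customer_name"})

# Trim whitespace and normalize capitalization
df["customer_name"] = df["customer_name"].str.strip().str.title()

# Unparseable dates become NaT instead of raising an error
df["signup"] = pd.to_datetime(df["signup"], errors="coerce")

# Assumption: a missing spend means no spend
df["spend"] = df["spend"].fillna(0)

# Remove exact duplicates and rows without a customer name
df = df.drop_duplicates().dropna(subset=["customer_name"])
print(df)
```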


### Week 9: Data Transformation & Feature Engineering
**Topics:**
* Creating new columns with apply() & lambda
* Binning & categorization (pd.cut(), pd.qcut())
* Dummy variables (pd.get_dummies())
* Mapping values (e.g., scoring systems)

**Project:** 
* Customer segmentation or performance tiering (e.g., "Gold", "Silver", "Bronze")
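
The tiering project could start from something like the sketch below. The spend thresholds (500 and 1000) and the point values are arbitrary assumptions; real cut-offs would come from the business.

```python
import pandas as pd

df = pd.DataFrame({"customer": ["A", "B", "C", "D"], "spend": [150, 900, 2500, 400]})

# Bin spend into tiers; intervals are (0, 500], (500, 1000], (1000, inf)
df["tier"] = pd.cut(
    df["spend"],
    bins=[0, 500, 1000, float("inf")],
    labels=["Bronze", "Silver", "Gold"],
)

# Map each tier to a score
points = {"Bronze": 1, "Silver": 2, "Gold": 3}
df["score"] = df["tier"].map(points).astype(int)

# New column from apply() with a lambda
df["spend_k"] = df["spend"].apply(lambda x: round(x / 1000, 2))

# One indicator column per tier
dummies = pd.get_dummies(df["tier"], prefix="tier")
print(df)
print(dummies)
```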


### Week 10: Aggregation, Grouping, and Pivoting
**Topics:**
* Grouping with .groupby() and .agg()
* Pivot tables with .pivot_table()
* Sorting and filtering summaries
* Multi-level aggregation

**Project:** 
* “Revenue by Product Category” – Analyze revenue trends and generate pivot tables by region/month
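
As an illustration of grouping and pivoting, the sketch below summarizes a tiny invented sales table by category and then by region and month:

```python
import pandas as pd

# Invented sales records
sales = pd.DataFrame({
    "region": ["East", "East", "West", "West", "East"],
    "month": ["Jan", "Feb", "Jan", "Feb", "Jan"],
    "category": ["Toys", "Toys", "Books", "Toys", "Books"],
    "revenue": [100, 150, 200, 50, 80],
})

# Several aggregations per group, sorted by total revenue
by_category = (
    sales.groupby("category")["revenue"]
    .agg(["sum", "mean"])
    .sort_values("sum", ascending=False)
)
print(by_category)

# Regions as rows, months as columns, summed revenue in the cells
pivot = sales.pivot_table(
    index="region", columns="month", values="revenue", aggfunc="sum", fill_value=0
)
print(pivot)
```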


## Phase 3: Visualization & Analytical Storytelling
***(Weeks 11–13) - Communicate insights through data visuals***


### Week 11: Matplotlib Basics for Plotting
**Topics:**
* Line, bar, scatter, histogram
* Titles, labels, ticks, legends
* Saving charts as images

**Project:**
* “Revenue Dashboard” – Visualize monthly revenue and product trends
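
A minimal line chart covering titles, labels, a legend, and saving to an image file might look like this. The revenue figures and the output file name are made up. The `matplotlib.use("Agg")` line is only there so the script runs without a display; inside a Jupyter notebook it can be left out.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; omit this line in a notebook
import matplotlib.pyplot as plt

months = ["Jan", "Feb", "Mar", "Apr"]
revenue = [120, 135, 128, 160]  # invented values, in thousands of USD

fig, ax = plt.subplots(figsize=(6, 4))
ax.plot(months, revenue, marker="o", label="Revenue")
ax.set_title("Monthly Revenue")
ax.set_xlabel("Month")
ax.set_ylabel("Revenue (USD thousands)")
ax.legend()

# Save the chart as an image (hypothetical file name)
fig.savefig("monthly_revenue.png", dpi=150)
plt.close(fig)
```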


### Week 12: Seaborn for Statistical Visualization
**Topics:**
* Distribution plots: histplot (successor to the deprecated distplot), boxplot, violinplot
* Relationship plots: scatterplot, pairplot, heatmap
* Themes and color palettes

**Project:** 
* “Product Analytics Visuals” – Visualize sales vs pricing vs customer rating


### Week 13: Exploratory Data Analysis (EDA) Projects
**Topics:**
* Combining pandas + matplotlib/seaborn
* Detect outliers, correlations, missing data
* Business storytelling: crafting a data narrative

**Capstone Project:**
* “EDA Case Study” – Titanic dataset, HR dataset, or a Kaggle dataset. Include:
    * Data cleaning
    * Visual exploration
    * Insight summary in Markdown or slides


## Phase 4: Advanced Analytics & Projects
**(Weeks 14–16) - Apply the complete workflow and optionally explore ML**


### Week 14: Time Series & Date Analysis
**Topics:**
* Using .dt accessor for date components
* Resampling: daily, weekly, monthly
* Rolling averages, time windowing

**Mini-Project:**
* Analyze and visualize website traffic or sales trends over time
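
The three time-series tools listed for this week can be tried on synthetic data. The sketch below generates 14 days of made-up visit counts starting on a Monday:

```python
import pandas as pd

# Synthetic daily traffic: 14 days starting Monday 2024-01-01
dates = pd.date_range("2024-01-01", periods=14, freq="D")
traffic = pd.DataFrame({"date": dates, "visits": range(100, 114)})

# .dt accessor exposes date components
traffic["weekday"] = traffic["date"].dt.day_name()

# Resample daily data into weekly totals (weeks end on Sunday by default)
weekly = traffic.set_index("date")["visits"].resample("W").sum()

# 3-day rolling average; the first two values are NaN because the window is incomplete
traffic["rolling_3d"] = traffic["visits"].rolling(window=3).mean()

print(weekly)
print(traffic.head())
```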


### Week 15: Intro to Predictive Modeling (Optional)
**Topics:**
* Linear regression with scikit-learn
* Model training and evaluation basics
* Avoiding overfitting with a train/test split
  
**Mini-Project:** 
* Predict future sales or a team's attrition rate using linear regression
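
A bare-bones version of the workflow (split, fit, evaluate) is sketched below on synthetic data. The relationship between ad spend and sales is invented so that the example is self-contained; a real project would use an actual dataset.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

# Synthetic data: sales = 3 * ad_spend + 5 + noise (assumed relationship)
rng = np.random.default_rng(0)
ad_spend = rng.uniform(1, 10, size=(100, 1))
sales = 3.0 * ad_spend[:, 0] + 5.0 + rng.normal(0, 1, size=100)

# Hold out 20% of the rows to evaluate on data the model has not seen
X_train, X_test, y_train, y_test = train_test_split(
    ad_spend, sales, test_size=0.2, random_state=0
)

model = LinearRegression().fit(X_train, y_train)
r2 = r2_score(y_test, model.predict(X_test))

print(f"slope={model.coef_[0]:.2f}, intercept={model.intercept_:.2f}, test R^2={r2:.2f}")
```

Scoring on the held-out rows rather than the training rows is what gives an early warning of overfitting.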


### Week 16: Final Capstone Project
**Choose one real dataset and business problem:**
* Retail Sales Dashboard
* HR Attrition Analysis
* Marketing Funnel Performance
* Financial KPIs Tracker
* Survey Sentiment and NPS Analysis

**Deliverables:**
* Cleaned dataset
* Data transformation notebook
* 3–5 meaningful visualizations
* Insightful summary (Markdown / PowerPoint)
* GitHub repo or downloadable PDF
