# Level 13: Real-World Projects (Practice)

Congratulations on making it through all the levels! You've learned the core concepts of Pandas, from the basics of DataFrames to advanced topics like time series and performance optimization.

The best way to solidify these skills is to apply them to real-world datasets. This final notebook suggests project ideas and points you to resources where you can find interesting data to explore.

## Project Ideas

Here are some ideas for projects. For each one, try to follow the full data analysis workflow: **Load -> Clean -> Transform -> Aggregate -> Visualize**.

### 1. Sales Data Analysis
- **Goal:** Analyze sales data to find top-selling products, identify regional performance, and visualize monthly sales trends.
- **Skills Used:** `read_csv`, `to_datetime`, `groupby`, `agg`, `sort_values`, `.dt` accessor, plotting.
- **Potential Dataset:** [Superstore Sales Dataset on Kaggle](https://www.kaggle.com/datasets/rohitsahoo/sales-forecasting)
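
As a starting point, the workflow for this project might look like the sketch below. The inline CSV and its column names (`order_date`, `region`, `product`, `sales`) are hypothetical stand-ins so the example runs without a download; the actual Superstore file uses different column names and is far larger.

```python
import io

import pandas as pd

# Hypothetical sample data; replace with pd.read_csv("your_file.csv") for a real dataset.
csv_text = """order_date,region,product,sales
2023-01-05,West,Chair,120.0
2023-01-17,East,Desk,300.0
2023-02-02,West,Desk,250.0
2023-02-20,East,Chair,80.0
2023-02-25,West,Chair,100.0
"""

# Load
sales = pd.read_csv(io.StringIO(csv_text))

# Clean: parse the date strings into real datetimes
sales["order_date"] = pd.to_datetime(sales["order_date"])

# Aggregate: total sales per product, best sellers first
top_products = sales.groupby("product")["sales"].sum().sort_values(ascending=False)

# Aggregate: total and average sales per region
by_region = sales.groupby("region")["sales"].agg(["sum", "mean"])

# Transform + aggregate: monthly totals, using the .dt accessor to bucket by month
monthly = sales.groupby(sales["order_date"].dt.to_period("M"))["sales"].sum()

print(top_products)
print(by_region)
print(monthly)
```

If matplotlib is installed, `monthly.plot(kind="bar")` is one way to cover the Visualize step.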

### 2. Log File Processing
- **Goal:** Parse web server logs to find the most frequent IP addresses, count 404 errors, and determine peak traffic hours.
- **Skills Used:** `read_csv` (with custom separators), string methods (`.str.extract`), `to_datetime`, `groupby`, `value_counts`.
- **Potential Dataset:** You can find sample Apache log files online or generate your own.
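
One possible approach is sketched below. It assumes the logs follow the Apache Common Log Format; the sample lines and the regular expression are illustrative, and real logs (for example the combined format with referrer and user agent) would need an adjusted pattern.

```python
import pandas as pd

# Hypothetical log lines in Apache Common Log Format.
log_lines = [
    '192.168.0.1 - - [10/Oct/2023:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 2326',
    '192.168.0.2 - - [10/Oct/2023:13:58:01 +0000] "GET /missing HTTP/1.1" 404 153',
    '192.168.0.1 - - [10/Oct/2023:14:02:10 +0000] "GET /about.html HTTP/1.1" 200 1024',
    '192.168.0.1 - - [10/Oct/2023:14:15:42 +0000] "GET /old-page HTTP/1.1" 404 153',
]

# Named groups become column names in the DataFrame returned by .str.extract
pattern = (
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<timestamp>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) (?P<size>\S+)$'
)

logs = pd.Series(log_lines).str.extract(pattern)

# Convert the extracted strings to proper types
logs["status"] = logs["status"].astype(int)
logs["timestamp"] = pd.to_datetime(logs["timestamp"], format="%d/%b/%Y:%H:%M:%S %z")

# Most frequent IP addresses
top_ips = logs["ip"].value_counts()

# Number of 404 responses
not_found = int((logs["status"] == 404).sum())

# Requests per hour of day, to spot peak traffic
requests_per_hour = logs.groupby(logs["timestamp"].dt.hour).size()

print(top_ips)
print("404 errors:", not_found)
print(requests_per_hour)
```

For a log file on disk, reading the lines with plain Python and wrapping them in a `pd.Series` is an alternative to `read_csv` when no single separator splits the fields cleanly.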

### 3. Survey Data Cleanup
- **Goal:** Clean a messy survey dataset by handling missing responses, converting categorical answers to numerical codes, and analyzing demographic breakdowns.
- **Skills Used:** `isna`, `fillna`, `dropna`, `map`, `astype`, `pd.get_dummies`, `crosstab`.
- **Potential Dataset:** [Kaggle Machine Learning & Data Science Survey 2022](https://www.kaggle.com/datasets/kaggle/kaggle-survey-2022)
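
The cleanup steps could be sketched roughly as follows. The DataFrame, its column names, and the Low/Medium/High answer scale are all made up for illustration; a real survey export will have its own columns and coding scheme.

```python
import numpy as np
import pandas as pd

# Hypothetical survey responses with some gaps.
survey = pd.DataFrame({
    "age_group": ["18-24", "25-34", None, "25-34", "18-24"],
    "country": ["US", "IN", "US", None, "IN"],
    "satisfaction": ["High", "Low", "Medium", np.nan, "High"],
})

# Inspect: how many answers are missing in each column?
missing_per_column = survey.isna().sum()

# Drop respondents who skipped the key question
survey = survey.dropna(subset=["satisfaction"]).copy()

# Fill any remaining demographic gaps with an explicit placeholder
survey = survey.fillna({"age_group": "Unknown", "country": "Unknown"})

# Convert ordered categorical answers to numeric codes
codes = {"Low": 1, "Medium": 2, "High": 3}
survey["satisfaction_code"] = survey["satisfaction"].map(codes).astype(int)

# One-hot encode an unordered category
country_dummies = pd.get_dummies(survey["country"], prefix="country")

# Demographic breakdown: satisfaction counts per age group
breakdown = pd.crosstab(survey["age_group"], survey["satisfaction"])

print(missing_per_column)
print(breakdown)
```

Whether to drop or fill a missing answer is a judgment call that depends on the question; the choices above are only one reasonable option.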

### 4. Stock Price Analysis
- **Goal:** Analyze historical stock data (Open, High, Low, Close) to calculate daily returns, 30-day moving averages, and volatility.
- **Skills Used:** `read_csv`, `to_datetime`, `set_index`, `shift`, `rolling`, plotting.
- **Potential Dataset:** You can download historical data from Yahoo Finance or use a library like `yfinance`.
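
A minimal sketch of these calculations follows. The prices are invented, and the window is set to 2 only so that the four-row sample produces visible results; for real data you would set it to 30. Volatility is taken here as the rolling standard deviation of daily returns, which is one common definition among several.

```python
import io

import pandas as pd

# Hypothetical OHLC data; a real file from a data provider would have many more rows.
csv_text = """Date,Open,High,Low,Close
2024-01-02,100,102,99,100
2024-01-03,100,111,100,110
2024-01-04,110,112,98,99
2024-01-05,99,109,99,108.9
"""

prices = pd.read_csv(io.StringIO(csv_text))
prices["Date"] = pd.to_datetime(prices["Date"])
prices = prices.set_index("Date").sort_index()

# Daily return: today's close relative to the previous close
prices["daily_return"] = prices["Close"] / prices["Close"].shift(1) - 1

# Window of 2 for this tiny sample; use 30 for a 30-day moving average
WINDOW = 2
prices["moving_avg"] = prices["Close"].rolling(window=WINDOW).mean()

# Volatility as the rolling standard deviation of daily returns
prices["volatility"] = prices["daily_return"].rolling(window=WINDOW).std()

print(prices[["Close", "daily_return", "moving_avg", "volatility"]])
```

The first rows of each rolling column are `NaN` because a full window of values is not yet available.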

### 5. Web Scraping + Pandas
- **Goal:** Scrape a table from a website (e.g., a list of tallest buildings from Wikipedia), load it into a DataFrame, clean it up, and perform analysis.
- **Skills Used:** `requests` to fetch the page, a parser like `BeautifulSoup`, `pd.read_html`, string methods, `astype`.
- **Potential Project:** Scrape the "List of largest companies by revenue" page on Wikipedia.
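
The cleanup half of this project might look like the sketch below. To keep it runnable offline, the scraping step is skipped: the `scraped` DataFrame is a hypothetical stand-in for what `pd.read_html` could return, with the footnote markers, thousands separators, and units that scraped tables often carry. The building names and figures are invented.

```python
import pandas as pd

# Stand-in for a scraped table (every value arrives as a string).
scraped = pd.DataFrame({
    "Name": ["Tower A[1]", "Tower B", "Tower C[note 2]"],
    "Height": ["828 m", "1,002 m", "601 m"],
    "Year": ["2010", "2024", "2012"],
})

cleaned = scraped.copy()

# Strip bracketed footnote markers such as "[1]" from the names
cleaned["Name"] = cleaned["Name"].str.replace(r"\[.*?\]", "", regex=True).str.strip()

# Remove the thousands separator and the unit, then convert to numbers
cleaned["Height"] = (
    cleaned["Height"]
    .str.replace(",", "", regex=False)
    .str.replace(" m", "", regex=False)
    .astype(float)
)
cleaned["Year"] = cleaned["Year"].astype(int)

# Analysis: which building is tallest?
tallest = cleaned.sort_values("Height", ascending=False).iloc[0]

print(cleaned)
print("Tallest:", tallest["Name"])
```

When scraping a live page, `pd.read_html` returns a list of DataFrames (one per table found) and needs an HTML parser such as `lxml` installed, so expect to pick the right table by index first.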

## Recommended Resources for Datasets

- **[Kaggle Datasets](https://www.kaggle.com/datasets):** A huge collection of datasets on almost any topic imaginable.
- **[Data.gov](https://data.gov/):** The home of the U.S. Government’s open data.
- **[Awesome Public Datasets on GitHub](https://github.com/awesomedata/awesome-public-datasets):** A curated list of high-quality public datasets.
- **[Google Dataset Search](https://datasetsearch.research.google.com/):** A search engine for datasets.

## ✅ Learning Path Summary

Throughout this journey, you have learned to:

- ✅ **Install & load data:** Get data from files like CSV, Excel, and JSON into a DataFrame.
- ✅ **Explore DataFrame basics:** Understand Series and DataFrames, and use attributes and methods like `.shape`, `.info()`, and `.describe()`.
- ✅ **Select & filter data:** Use `.loc`, `.iloc`, boolean indexing, and `.query()` to access specific parts of your data.
- ✅ **Handle missing values & strings:** Clean your data by handling NaNs and duplicates, fixing data types, and manipulating strings.
- ✅ **Group, aggregate, & pivot:** Summarize your data using `groupby`, `pivot_table`, and `crosstab`.
- ✅ **Merge & reshape:** Combine and restructure data with `concat`, `merge`, `melt`, and `pivot`.
- ✅ **Work with dates:** Master time series data with `to_datetime`, `DatetimeIndex`, resampling, and rolling windows.
- ✅ **Optimize performance:** Write faster, more memory-efficient code.
- ✅ **Build end-to-end projects:** Apply your skills to solve real-world problems.
- ✅ **Master method chaining & pipelines:** Write clean, reusable, and professional-grade code with `.pipe()`.
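
As a quick refresher on that last point, here is a small sketch of a `.pipe()` chain you could adapt for any of the projects. The helper functions, column names, and the 0.5 rate are hypothetical and chosen only to keep the arithmetic easy to follow.

```python
import pandas as pd


def drop_missing_amounts(df):
    """Remove rows where the amount was not recorded."""
    return df.dropna(subset=["amount"])


def add_amount_with_tax(df, rate):
    """Return a copy with an extra column; the input frame is left untouched."""
    return df.assign(amount_with_tax=df["amount"] * (1 + rate))


# Hypothetical order data
orders = pd.DataFrame({
    "customer": ["a", "b", "a", "c"],
    "amount": [10.0, None, 30.0, 20.0],
})

# Each step takes a DataFrame and returns a new one, so the steps chain cleanly
summary = (
    orders
    .pipe(drop_missing_amounts)
    .pipe(add_amount_with_tax, rate=0.5)
    .groupby("customer", as_index=False)["amount_with_tax"]
    .sum()
    .sort_values("amount_with_tax", ascending=False)
)

print(summary)
```

Keeping each step as a small named function makes the chain readable top to bottom and lets you reuse or test the steps individually.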

**Keep practicing, and happy coding!**