This repository contains the full course materials and projects for Python for Data Science, designed for learners who want to explore data manipulation, visualization, and machine learning using Python.
The course covers both the theory and practice of data science, starting from the foundations of Python programming and moving on to data analysis, visualization, and machine learning.
The course emphasizes working with real datasets while introducing key Python libraries such as NumPy, Pandas, Matplotlib, Seaborn, and Scikit-Learn, as well as advanced topics like web scraping, deep learning, and natural language processing (NLP).
By the end, participants will be able to manage the full data science workflow, from data collection and preprocessing to building and evaluating machine learning models.
By the end of this course, students will be able to:
- Analyze and explore data using Python libraries.
- Create clear, informative, and professional data visualizations.
- Preprocess datasets: handle missing values, scale features, and encode categorical variables.
- Build and evaluate supervised and unsupervised machine learning models.
- Implement real-world projects involving data collection, analysis, and model building.
- Gain proficiency in Python libraries: NumPy, Pandas, Matplotlib, Seaborn, Scikit-Learn, BeautifulSoup.
- Perform web scraping to collect data from websites.
- Apply advanced ML techniques: ensemble methods & hyperparameter tuning.
- Understand fundamentals of Deep Learning using TensorFlow/Keras.
- Apply Natural Language Processing (NLP) for text analysis.
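
As a rough illustration of the preprocessing outcome above (missing values, feature scaling, and encoding), here is a minimal sketch using Pandas and Scikit-Learn. The toy DataFrame, its column names, and the choice of median imputation are assumptions made for this example only; they are not taken from the course materials.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical toy data: one numeric value is missing on purpose.
df = pd.DataFrame({
    "age": [25.0, 32.0, np.nan, 41.0],
    "income": [40000.0, 52000.0, 61000.0, 58000.0],
    "city": ["A", "B", "A", "B"],
})

numeric_cols = ["age", "income"]
categorical_cols = ["city"]

# Numeric columns: fill gaps with the column median, then standardize.
numeric_steps = Pipeline([
    ("impute", SimpleImputer(strategy="median")),
    ("scale", StandardScaler()),
])

# Categorical columns: one-hot encode. sparse_threshold=0 forces a dense array.
preprocess = ColumnTransformer(
    [
        ("num", numeric_steps, numeric_cols),
        ("cat", OneHotEncoder(handle_unknown="ignore"), categorical_cols),
    ],
    sparse_threshold=0,
)

X = preprocess.fit_transform(df)
print(X.shape)  # two scaled numeric columns plus one column per city
```

Wrapping the steps in a `ColumnTransformer` keeps the same transformations reusable on new data through `preprocess.transform(...)`, which is one common way to avoid leaking test-set statistics into training.
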
| Session | Topic | Key Focus |
|---|---|---|
| 1 | Python & Data Science Intro | Python setup, Jupyter, variables, control structures |
| 2 | Data Structures & Functions | Lists, tuples, dictionaries, sets, custom modules |
| 3 | Working with NumPy | Arrays, operations, math & statistics |
| 4 | Data Manipulation with Pandas | DataFrames, cleaning, merging, grouping |
| 5 | Data Visualization | Matplotlib & Seaborn plots, customization |
| 6 | Advanced Pandas | Time series, missing values, feature scaling, encoding |
| 7 | Web Scraping & EDA | BeautifulSoup, parsing HTML, data exploration |
| 8 | Intro to Machine Learning | ML types, Scikit-Learn, preprocessing |
| 9 | Supervised Learning | Linear/logistic regression, decision trees, evaluation |
| 10 | Unsupervised Learning | K-Means, hierarchical clustering, PCA |
| 11 | Advanced ML Techniques | Ensemble methods, hyperparameter tuning |
| 12 | Final Project | End-to-end project: plan, collect, analyze, model, present |
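
The topics of sessions 8, 9, and 11 (preprocessing, logistic regression, evaluation, and hyperparameter tuning) can be combined in one short workflow. The sketch below is an assumption about how such an exercise might look, not the course's own notebook: it uses synthetic data from `make_classification`, and the grid of `C` values is arbitrary.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic binary classification data (stand-in for a real dataset).
X, y = make_classification(
    n_samples=300, n_features=8, n_informative=4, random_state=0
)

# Hold out 25% of the rows for the final evaluation.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Scaling lives inside the pipeline so each CV fold is scaled independently.
pipeline = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))

# Arbitrary example grid for the regularization strength C.
param_grid = {"logisticregression__C": [0.01, 0.1, 1.0, 10.0]}
search = GridSearchCV(pipeline, param_grid, cv=5, scoring="accuracy")
search.fit(X_train, y_train)

test_accuracy = accuracy_score(y_test, search.predict(X_test))
print("best params:", search.best_params_)
print("test accuracy:", round(test_accuracy, 3))
```
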
- House Price Predictor: Predict housing prices using regression models.
- Customer Segmentation Tool: Cluster customers using unsupervised learning.
- Stock Market Analyzer: Visualize and analyze stock trends with Pandas & Matplotlib.
- Movie Review Classifier: Apply NLP to classify movie reviews as positive/negative.
- Weather Scraper: Collect and analyze weather data from online sources.
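
To give a feel for the kind of code these projects involve, here is a hedged sketch of the clustering step behind a customer segmentation tool. The data is synthetic, the column names `annual_spend` and `visits_per_month` are hypothetical, and the choice of three clusters is an assumption; a real project would pick the number of clusters from the data.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic customers drawn around three made-up group centers.
centers = np.array([[200.0, 2.0], [800.0, 6.0], [1500.0, 12.0]])
points = np.vstack(
    [c + rng.normal(scale=[50.0, 1.0], size=(50, 2)) for c in centers]
)
customers = pd.DataFrame(points, columns=["annual_spend", "visits_per_month"])

# Standardize first so both features contribute comparably to distances.
scaled = StandardScaler().fit_transform(customers)

# Fit K-Means and attach the cluster label to each customer.
model = KMeans(n_clusters=3, n_init=10, random_state=0)
customers["segment"] = model.fit_predict(scaled)

# Per-segment averages give a quick, readable profile of each cluster.
print(customers.groupby("segment").mean().round(1))
```
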
- Python 3.x (Anaconda recommended)
- IDE: PyCharm or Jupyter Notebook
- Text Editor: Visual Studio Code
- Libraries: NumPy, Pandas, Matplotlib, Seaborn, Scikit-Learn, BeautifulSoup, TensorFlow/Keras
- Hardware: Laptop (Core i5, RAM ≥ 8 GB, Windows 10 64-bit or later)
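
One way to confirm that the libraries listed above are installed is a small check script like the following sketch. It uses only the standard library; the helper name `missing_packages` is hypothetical, and the mapping uses each library's usual import name (for example, `sklearn` for Scikit-Learn and `bs4` for BeautifulSoup).

```python
import importlib.util

# Display name -> import name for each required library.
REQUIRED = {
    "NumPy": "numpy",
    "Pandas": "pandas",
    "Matplotlib": "matplotlib",
    "Seaborn": "seaborn",
    "Scikit-Learn": "sklearn",
    "BeautifulSoup": "bs4",
    "TensorFlow/Keras": "tensorflow",
}


def missing_packages(required=REQUIRED):
    """Return the display names of libraries that cannot be found."""
    return [
        name
        for name, module in required.items()
        if importlib.util.find_spec(module) is None
    ]


missing = missing_packages()
if missing:
    print("Missing libraries:", ", ".join(missing))
else:
    print("All required libraries are installed.")
```
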
- AI Publishing. Python Crash Course for Data Analysis.
- Galea, Alex. Applied Data Science with Python and Jupyter.
- Morgan, Peters. Data Analysis from Scratch with Python.
- Apeltsin, Leonard. Data Science Bookcamp: Five Real-World Python Projects.
- Cielen, Davy, Meysman, Arno D. B., & Ali, Mohamed. Introducing Data Science.
| Component | Weight |
|---|---|
| Attendance & Participation | 20% |
| Assignments & Activities | 20% |
| Final Project: Implementation | 30% |
| Final Project: Presentation | 30% |
- Familiarity with computer use and Windows (especially file management).
- Prior programming experience is preferred but not mandatory.
- Web Development with Python
- Python for Machine Learning
- Python for Fullstack with DevSecOps