This repository contains a collection of Python scripts that serve as a practical workshop for learning data science and machine learning. The code is structured to guide a user from Python fundamentals to building and evaluating basic machine learning models.
This workshop covers the essential libraries and concepts in the Python data science stack:
- Python Fundamentals: Core data structures (lists, dictionaries, sets), functions, error handling, and list comprehensions.
- NumPy: Creating and manipulating numerical arrays for scientific computing.
- Pandas: Data manipulation and analysis using Series and DataFrames, including handling missing data and file I/O.
- Data Visualization: Creating static plots and charts using Matplotlib and statistical visualizations with Seaborn.
- Machine Learning with Scikit-Learn:
  - Supervised Learning fundamentals.
  - Linear Regression for predicting continuous values.
  - Logistic Regression for classification tasks.
  - Decision Trees and Support Vector Machines (SVM) for more complex classification.
  - Model evaluation using metrics like accuracy, confusion matrix, and ROC curves.
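
As a rough illustration of how the Python fundamentals listed above connect to the model evaluation topic, here is a minimal sketch that computes accuracy and a confusion matrix using only the standard library. The function names (`accuracy`, `confusion_counts`) and the label lists are hypothetical and are not taken from the workshop scripts. In practice, scikit-learn provides `accuracy_score` and `confusion_matrix` in `sklearn.metrics` for the same purpose.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Return the fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("cannot score an empty list of labels")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def confusion_counts(y_true, y_pred):
    """Count how often each (true_label, predicted_label) pair occurs."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    return Counter(zip(y_true, y_pred))


# Hypothetical labels for a binary classifier (illustrative data only).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

# A set collects the distinct labels; sorting fixes the row/column order.
labels = sorted(set(y_true) | set(y_pred))
counts = confusion_counts(y_true, y_pred)

# Rows are true labels, columns are predicted labels.
matrix = [[counts[(t, p)] for p in labels] for t in labels]

print("accuracy:", accuracy(y_true, y_pred))
for label, row in zip(labels, matrix):
    print(f"true={label}: {row}")
```

The diagonal of `matrix` holds the correct predictions, so accuracy is the diagonal sum divided by the total number of samples. The sketch uses the rows-as-true-labels layout, which matches the convention scikit-learn documents for `confusion_matrix`.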