This repository contains the coding examples from the book Machine Learning Mastery with Python by Jason Brownlee.
- Define Problem: Investigate and characterize the problem in order to better understand the goals of the project.
- Analyze Data: Use descriptive statistics and visualization to better understand the data you have available.
- Prepare Data: Use data transforms in order to better expose the structure of the prediction problem to modeling algorithms.
- Evaluate Algorithms: Design a test harness to evaluate a number of standard algorithms on the data and select the top few to investigate further.
- Improve Results: Use algorithm tuning and ensemble methods to get the most out of well-performing algorithms on your data.
- Present Results: Finalize the model, make predictions and present results.
- Lesson 1: Python Ecosystem for Machine Learning.
- Lesson 2: Python and SciPy Crash Course.
- Lesson 3: Load Datasets from CSV.
- Lesson 4: Understand Data With Descriptive Statistics. (Analyze Data)
- Lesson 5: Understand Data With Visualization. (Analyze Data)
- Lesson 6: Pre-Process Data. (Prepare Data)
- Lesson 7: Feature Selection. (Prepare Data)
- Lesson 8: Resampling Methods. (Evaluate Algorithms)
- Lesson 9: Algorithm Evaluation Metrics. (Evaluate Algorithms)
- Lesson 10: Spot-Check Classification Algorithms. (Evaluate Algorithms)
- Lesson 11: Spot-Check Regression Algorithms. (Evaluate Algorithms)
- Lesson 12: Model Selection. (Evaluate Algorithms)
- Lesson 13: Pipelines. (Evaluate Algorithms)
- Lesson 14: Ensemble Methods. (Improve Results)
- Lesson 15: Algorithm Parameter Tuning. (Improve Results)
- Lesson 16: Model Finalization. (Present Results)
- How to work through a small to medium sized dataset end-to-end.
- How to deliver a model that can make accurate predictions on new unseen data.
- How to complete all subtasks of a predictive modeling problem with Python.
- How to learn new and different techniques in Python and SciPy.
- How to get help with Python machine learning.
SciPy is an ecosystem of Python libraries for mathematics, science and engineering. It is an add-on to Python that you will need for machine learning. The SciPy ecosystem is comprised of the following core modules relevant to machine learning:
- NumPy: A foundation for SciPy that allows you to efficiently work with data in arrays. (The required NumPy basics are covered in my Linear Algebra repository.)
- Matplotlib: Allows you to create 2D charts and plots from data.
- Pandas: Tools and data structures to organize and analyze your data.
- Load CSV Files with the Python Standard Library.
- Load CSV Files with NumPy.
- Load CSV Files with Pandas.
The dataset is available for free from the UCI Machine Learning Repository: https://archive.ics.uci.edu/ml/index.php
Alternatively, it can be found in GitHub repositories that host such datasets.
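The three CSV loading approaches listed above can be sketched roughly as follows. This is a minimal illustration, not the book's code: the inline `CSV_TEXT` sample and its column names are made up for the example, and `io.StringIO` stands in for a real file so the snippet runs without any download.

```python
import csv
import io

import numpy as np
import pandas as pd

# Hypothetical sample data; a real project would open a downloaded .csv file instead.
CSV_TEXT = "a,b,label\n1.0,2.0,0\n3.0,4.0,1\n"

# 1. Python standard library: read the header, then convert each row to floats.
reader = csv.reader(io.StringIO(CSV_TEXT))
header = next(reader)
rows_stdlib = [[float(value) for value in row] for row in reader]

# 2. NumPy: loadtxt parses numeric data directly into an array (header skipped).
data_numpy = np.loadtxt(io.StringIO(CSV_TEXT), delimiter=",", skiprows=1)

# 3. Pandas: read_csv returns a DataFrame and picks up the header as column names.
data_pandas = pd.read_csv(io.StringIO(CSV_TEXT))

print(header)
print(data_numpy.shape)
print(data_pandas.dtypes)
```

With a file on disk, the same calls would take an opened file or a path in place of the `io.StringIO` object.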
- Rescale data.
- Standardize data.
- Normalize data.
- Binarize data.
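The four transforms above can be written out with plain NumPy to show what each one does. This is a sketch under assumptions: the small array `X` and the binarization threshold of `2.0` are invented for illustration. scikit-learn offers equivalent transformers (`MinMaxScaler`, `StandardScaler`, `Normalizer`, `Binarizer`), which are likely the more convenient choice in practice.

```python
import numpy as np

# Hypothetical feature matrix: 3 samples (rows), 2 features (columns).
X = np.array([[1.0, 200.0],
              [2.0, 400.0],
              [3.0, 600.0]])

# Rescale: map each column to the range [0, 1].
col_min = X.min(axis=0)
col_max = X.max(axis=0)
rescaled = (X - col_min) / (col_max - col_min)

# Standardize: give each column zero mean and unit standard deviation.
standardized = (X - X.mean(axis=0)) / X.std(axis=0)

# Normalize: scale each row to unit (L2) length.
row_norms = np.linalg.norm(X, axis=1, keepdims=True)
normalized = X / row_norms

# Binarize: values above the threshold become 1, all others 0.
threshold = 2.0  # assumed value, chosen only for this example
binarized = (X > threshold).astype(float)

print(rescaled)
print(standardized)
print(normalized)
print(binarized)
```

Note that rescaling and standardizing work per column (feature), while normalizing works per row (sample).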
- Univariate Selection.
- Recursive Feature Elimination.
- Principal Component Analysis.
- Feature Importance.
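The four feature selection techniques above might look like this with scikit-learn. This is a hedged sketch rather than the book's exact code: the synthetic dataset from `make_classification`, the choice of 3 features to keep, and the specific estimators (`LogisticRegression` for RFE, `ExtraTreesClassifier` for importances) are all assumptions made for the example.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.feature_selection import RFE, SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression

# Synthetic stand-in data: 200 samples, 8 features, 3 of them informative.
X, y = make_classification(n_samples=200, n_features=8, n_informative=3,
                           n_redundant=0, random_state=7)

# Univariate selection: keep the 3 features with the highest ANOVA F-scores.
selector = SelectKBest(score_func=f_classif, k=3)
X_best = selector.fit_transform(X, y)

# Recursive feature elimination: repeatedly drop the weakest feature
# according to the fitted model until 3 remain.
rfe = RFE(estimator=LogisticRegression(max_iter=1000), n_features_to_select=3)
rfe.fit(X, y)
rfe_mask = rfe.support_

# Principal component analysis: project the data onto 3 components.
# Unlike the other methods, this creates new features instead of picking existing ones.
pca = PCA(n_components=3)
X_pca = pca.fit_transform(X)

# Feature importance: tree ensembles expose a relative score per feature.
forest = ExtraTreesClassifier(n_estimators=100, random_state=7)
forest.fit(X, y)
importances = forest.feature_importances_

print(X_best.shape, rfe_mask, X_pca.shape)
print(importances)
```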