A predictive modeling machine learning project can be broken down into six top-level tasks:

* `Define Problem:` Investigate and characterize the problem in order to better understand
the goals of the project.

* `Analyze Data:` Use descriptive statistics and visualization to better understand the data
you have available.

* `Prepare Data:` Use data transforms in order to better expose the structure of the
prediction problem to modeling algorithms.

* `Evaluate Algorithms:` Design a test harness to evaluate a number of standard algorithms
on the data and select the top few to investigate further.

* `Improve Results:` Use algorithm tuning and ensemble methods to get the most out of
well-performing algorithms on your data.

* `Present Results:` Finalize the model, make predictions and present results.
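The tasks above can be sketched end to end in a few lines. The following is only a minimal illustration, assuming the iris dataset bundled with scikit-learn, three arbitrarily chosen standard classifiers, and illustrative `random_state` values; none of these choices come from the text above, and the `Prepare Data` and `Improve Results` steps are left out for brevity.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# Define Problem / Analyze Data: load a small classification dataset (an assumption).
X, y = load_iris(return_X_y=True)
print("samples, features:", X.shape)

# Hold back a validation set that is used only for the final check.
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.2, random_state=7, stratify=y
)

# Evaluate Algorithms: a test harness that scores several standard models
# with the same 10-fold cross-validation splits.
models = {
    "LR": LogisticRegression(max_iter=1000),
    "KNN": KNeighborsClassifier(),
    "CART": DecisionTreeClassifier(random_state=7),
}
kfold = StratifiedKFold(n_splits=10, shuffle=True, random_state=7)
scores = {}
for name, model in models.items():
    cv_scores = cross_val_score(model, X_train, y_train, cv=kfold, scoring="accuracy")
    scores[name] = cv_scores.mean()
    print("{}: {:.3f} ({:.3f})".format(name, cv_scores.mean(), cv_scores.std()))

# Present Results: refit the best-scoring model and check it on the held-back data.
best_name = max(scores, key=scores.get)
best_model = models[best_name].fit(X_train, y_train)
val_accuracy = best_model.score(X_val, y_val)
print("best: {}, validation accuracy: {:.3f}".format(best_name, val_accuracy))
```

Keeping the validation split out of the cross-validation loop means the final accuracy is measured on data that played no part in choosing the model.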

The Python ecosystem for machine learning is growing, and Python may become the dominant
platform for the field. The main reason to adopt Python for machine learning is that it is a
general-purpose programming language that you can use for both R&D and production.

This section covers:

* Python and its rising use for machine learning.

* SciPy and the functionality it provides through NumPy, Matplotlib and Pandas.

* scikit-learn, which provides the machine learning algorithms.

* How to set up your Python ecosystem for machine learning and which versions to use.

#### The philosophy of Python is captured in the Zen of Python, which includes phrases like:

* Beautiful is better than ugly.
* Explicit is better than implicit.
* Simple is better than complex.
* Complex is better than complicated.
* Flat is better than nested.
* Sparse is better than dense.
* Readability counts.

In [12]:
import this

Python is also widely used for machine learning and data science because of its excellent
library support and because it is a general-purpose programming language, unlike more
specialized environments such as R or MATLAB.

## SciPy

`SciPy` is an ecosystem of Python libraries for `mathematics`, `science` and `engineering`. It is an
add-on to Python that you will need for `machine learning`. The `SciPy` ecosystem includes
the following core libraries relevant to `machine learning`:

* NumPy: A foundation for SciPy that allows you to efficiently work with data in arrays.
    
* Matplotlib: Allows you to create 2D charts and plots from data.
    
* Pandas: Tools and data structures to organize and analyze your data.
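As a rough illustration of how the three libraries divide the work, the sketch below builds a tiny array with NumPy, summarizes it with Pandas, and draws it with Matplotlib. The data values and column names `a` and `b` are made up for the example, and the `Agg` backend is an assumption chosen so that the code runs without a display.

```python
import io

import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# NumPy: hold the data in an array and compute on it without Python loops.
data = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
column_means = data.mean(axis=0)
print("column means:", column_means)

# Pandas: label the same data and summarize it with descriptive statistics.
frame = pd.DataFrame(data, columns=["a", "b"])
summary = frame.describe()
print(summary)

# Matplotlib: draw a simple 2D line plot and save it to an in-memory PNG.
fig, ax = plt.subplots()
ax.plot(frame["a"], frame["b"], marker="o")
ax.set_xlabel("a")
ax.set_ylabel("b")
buffer = io.BytesIO()
fig.savefig(buffer, format="png")
plt.close(fig)
```

In a notebook you would normally skip the backend line and the in-memory buffer and simply call `plt.show()`.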

## scikit-learn

The `scikit-learn` library is how you develop and practice machine learning in Python. It is
built upon and requires the `SciPy` ecosystem. The name `scikit` suggests that it is a SciPy plug-in
or toolkit. The focus of the library is `machine learning algorithms` for `classification`, `regression`,
`clustering` and more. It also provides tools for related tasks such as `evaluating models`, `tuning
parameters` and `pre-processing data`.
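The sketch below shows one way these pieces can fit together: pre-processing, a classifier, parameter tuning and model evaluation in a single script. It assumes the bundled iris dataset, a k-nearest neighbors classifier, and a small illustrative parameter grid; these are choices made for the example, not recommendations from the text.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=1, stratify=y
)

# Pre-processing data: chain scaling and the classifier so that the scaler
# is fitted on the training folds only.
pipeline = Pipeline([
    ("scale", StandardScaler()),
    ("knn", KNeighborsClassifier()),
])

# Tuning parameters: try a few neighborhood sizes with 5-fold cross-validation.
grid = GridSearchCV(pipeline, param_grid={"knn__n_neighbors": [1, 3, 5, 7]}, cv=5)
grid.fit(X_train, y_train)

# Evaluating models: score the tuned pipeline on data it has not seen.
test_accuracy = accuracy_score(y_test, grid.predict(X_test))
print("best parameters:", grid.best_params_)
print("test accuracy: {:.3f}".format(test_accuracy))
```

The `knn__n_neighbors` key follows the pipeline convention of `<step name>__<parameter>`, which lets the grid search reach into a named step.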

## Some Installed Libraries

In [15]:
# scipy
import scipy
print('scipy: {}'.format(scipy.__version__))

scipy: 1.4.1


In [16]:
# numpy
import numpy
print('numpy: {}'.format(numpy.__version__))

numpy: 1.16.2


In [17]:
# matplotlib
import matplotlib
print('matplotlib: {}'.format(matplotlib.__version__))

matplotlib: 3.0.3


In [18]:
# pandas
import pandas
print('pandas: {}'.format(pandas.__version__))

pandas: 0.25.3


In [19]:
# scikit-learn
import sklearn
print('sklearn: {}'.format(sklearn.__version__))

sklearn: 0.20.3
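The five checks above can also be combined into one cell. This is a small convenience sketch: `library_versions` is a hypothetical helper written for this example, not part of any of the libraries, and it reports `None` for a library that is not installed instead of raising an error.

```python
import importlib


def library_versions(names):
    """Return {name: version string, or None if the library is not installed}."""
    versions = {}
    for name in names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            versions[name] = None
        else:
            # Fall back to "unknown" for modules that do not expose __version__.
            versions[name] = getattr(module, "__version__", "unknown")
    return versions


versions = library_versions(["scipy", "numpy", "matplotlib", "pandas", "sklearn"])
for name, version in versions.items():
    print("{}: {}".format(name, version))
```

The printed versions depend on your installation and will likely differ from the ones shown above.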
