A computer program is said to learn from experience E with respect to some class of tasks T, and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E. 

Thus there are many different kinds of machine learning, depending on:
- the nature of the tasks T we wish the system to learn,
- the nature of the performance measure P we use to evaluate the system, and
- the nature of the training signal or experience E we give it.
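
To make the roles of T, P, and E concrete, here is a minimal sketch using only the Python standard library. Everything in it is an illustrative assumption rather than something taken from the text: the task T is a toy two-class problem with one Gaussian feature per class, the learner is a nearest-class-mean classifier, the experience E is the number of labeled training examples, and the performance measure P is accuracy on held-out data. The helper names (`sample`, `fit_class_means`, `accuracy`) are hypothetical.

```python
import random

def sample(rng, n):
    """Draw n labeled points: label y in {0, 1}, feature x ~ N(-1, 1) or N(+1, 1)."""
    data = []
    for _ in range(n):
        y = rng.randint(0, 1)
        x = rng.gauss(1.0 if y == 1 else -1.0, 1.0)
        data.append((x, y))
    return data

def fit_class_means(train):
    """Learner: estimate the mean feature value of each class from the training data."""
    means = {}
    for label in (0, 1):
        xs = [x for x, y in train if y == label]
        # Fall back to 0.0 if a class is absent from a very small training set.
        means[label] = sum(xs) / len(xs) if xs else 0.0
    return means

def accuracy(means, test):
    """Performance measure P: fraction of test points assigned to the correct class."""
    correct = 0
    for x, y in test:
        # Predict the class whose estimated mean is closest to x.
        pred = min(means, key=lambda label: abs(x - means[label]))
        correct += (pred == y)
    return correct / len(test)

rng = random.Random(0)           # fixed seed so the run is repeatable
test_set = sample(rng, 2000)     # held-out data used only to measure P
train_pool = sample(rng, 1000)   # source of experience E

learning_curve = {}
for n in (2, 10, 100, 1000):
    means = fit_class_means(train_pool[:n])
    learning_curve[n] = accuracy(means, test_set)
    print(f"E = {n:4d} examples -> P = {learning_curve[n]:.3f}")
```

In this setup, accuracy typically rises toward the best achievable value as the number of training examples grows, although with very few examples the result can fluctuate from run to run. Changing the classifier, the metric, or the kind of training data gives a different instance of the same T, P, E template.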

In this book, we cover the most common types of ML from a probabilistic perspective. Roughly speaking, this means that we treat all unknown quantities (e.g., predictions about the future value of some quantity of interest, such as tomorrow’s temperature, or the parameters of some model) as random variables. Each such variable is endowed with a probability distribution, which describes a weighted set of possible values the variable may take.
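
As a small illustration of "a weighted set of possible values", the sketch below represents tomorrow's temperature as a discrete distribution and computes a few summaries from it. The temperatures and probabilities are made-up numbers chosen for the example, and the variable names are hypothetical.

```python
# Hypothetical forecast: possible temperatures (deg C) mapped to their probabilities.
forecast = {18: 0.1, 20: 0.2, 22: 0.4, 24: 0.2, 26: 0.1}

# A valid distribution has non-negative weights that sum to one.
assert all(p >= 0 for p in forecast.values())
assert abs(sum(forecast.values()) - 1.0) < 1e-9

# Expected value: each possible value weighted by its probability.
mean = sum(t * p for t, p in forecast.items())

# Variance: expected squared deviation from the mean, a measure of uncertainty.
variance = sum(p * (t - mean) ** 2 for t, p in forecast.items())

# Probability of an event, here "at least 24 degrees".
prob_warm = sum(p for t, p in forecast.items() if t >= 24)

print(f"mean = {mean:.1f}, variance = {variance:.1f}, P(T >= 24) = {prob_warm:.1f}")
```

The point of the representation is that a single number such as the mean can be reported when needed, while the full distribution still records how uncertain that number is.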

There are two main reasons we adopt a probabilistic approach. 
- First, it is the optimal approach to decision making under uncertainty. 
- Second, probabilistic modeling is the language used by most other areas of science and engineering, and thus provides a unifying framework between these fields.
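
One common way to read "optimal" in the first point is in the sense of minimizing expected loss: given a distribution over unknown outcomes and a loss for each action and outcome pair, choose the action whose average loss is smallest. The sketch below assumes that reading. The outcomes, probabilities, and loss values are invented for illustration.

```python
# Hypothetical belief about tomorrow's weather.
belief = {"rain": 0.3, "dry": 0.7}

# Hypothetical loss[action][outcome]: cost of taking an action when an outcome occurs.
loss = {
    "take umbrella": {"rain": 0.0, "dry": 1.0},   # small cost of carrying it for nothing
    "leave umbrella": {"rain": 5.0, "dry": 0.0},  # larger cost of getting wet
}

def expected_loss(action):
    """Average the loss of an action over outcomes, weighted by their probabilities."""
    return sum(belief[outcome] * loss[action][outcome] for outcome in belief)

expected = {action: expected_loss(action) for action in loss}
best_action = min(expected, key=expected.get)

for action, value in expected.items():
    print(f"{action}: expected loss = {value:.2f}")
print("chosen action:", best_action)
```

With these numbers, carrying the umbrella has the lower expected loss even though rain is the less likely outcome, because the decision depends on both the probabilities and the costs.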

In [1]:
import numpy as np
import pandas as pd  # used below for pd.DataFrame and pd.Series
import matplotlib.pyplot as plt
import os
try:
    import probml_utils as pml
except ModuleNotFoundError:
    %pip install -qq git+https://github.com/probml/probml-utils.git
    import probml_utils as pml
import seaborn as sns
sns.set(style="ticks", color_codes=True)

from sklearn.datasets import load_iris
iris = load_iris()

# Extract numpy arrays
X = iris.data 
y = iris.target

# Convert to pandas dataframe 
df = pd.DataFrame(data=X, columns=iris.feature_names)
df['label'] = pd.Series(iris.target_names[y], dtype='category')

# Pick a palette to match the colors used by decision tree graphviz
palette = {'setosa': 'orange', 'versicolor': 'green', 'virginica': 'purple'}

# Pairwise scatter plots of the four features, colored by class label
g = sns.pairplot(df, vars=df.columns[0:4], hue="label", palette=palette)
pml.savefig('iris_scatterplot_purple.pdf')
plt.show()


# Shorten the column names: sepal length/width (sl, sw), petal length/width (pl, pw)
iris_df = df.copy()
iris_df.columns = ['sl', 'sw', 'pl', 'pw', 'label']

g = sns.pairplot(iris_df, vars=iris_df.columns[0:4], hue="label")
plt.tight_layout()
pml.savefig('iris_pairplot.pdf')
plt.show()


sns.stripplot(x="label", y="sl", data=iris_df, jitter=True)
pml.savefig('iris_sepal_length_strip_plot.pdf')
plt.show()
