## Roadmap for Machine Learning


1. Preprocessing – getting data into shape

Raw data rarely comes in the form and shape that is necessary for the optimal performance of a learning algorithm. Thus, the preprocessing of the data is one of the most crucial steps in any machine learning application.

Many machine learning algorithms also require that the selected features be on the same scale for optimal performance. This is often achieved by rescaling the features to the range [0, 1] or by standardizing them to zero mean and unit variance.
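The two rescaling schemes mentioned above can be sketched in a few lines of plain Python. This is a minimal illustration, not a library implementation: the helper names `min_max_scale` and `standardize` and the sample values are assumptions made up for this example.

```python
from statistics import mean, pstdev

def min_max_scale(values):
    """Rescale values linearly into the range [0, 1]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

def standardize(values):
    """Shift values to zero mean and scale them to unit variance."""
    mu, sigma = mean(values), pstdev(values)  # population standard deviation
    return [(v - mu) / sigma for v in values]

feature = [2.0, 4.0, 6.0, 8.0]  # hypothetical feature column

normalized = min_max_scale(feature)
standardized = standardize(feature)

print(normalized)
print(standardized)
```

Both helpers divide by the spread of the feature, so a constant column (all values equal) would raise a `ZeroDivisionError` and needs to be handled separately.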

Some of the selected features may be highly correlated and therefore redundant to a certain degree. In those cases, dimensionality reduction techniques are useful for compressing the features onto a lower-dimensional subspace. Reducing the dimensionality of the feature space means that less storage space is required and that the learning algorithm typically runs faster.
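To make "highly correlated and therefore redundant" concrete, the sketch below computes the Pearson correlation coefficient between two features. It is not a dimensionality reduction technique itself, only a quick check that can motivate one. The function name `pearson` and the two height columns are hypothetical and chosen for illustration.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical example: the same measurement stored in two units
height_cm = [150.0, 160.0, 170.0, 180.0]
height_in = [h / 2.54 for h in height_cm]

r = pearson(height_cm, height_in)
print(round(r, 6))
```

A coefficient close to 1 or -1 indicates that one feature is almost a linear function of the other, so keeping both adds little information.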

2. Training and selecting a predictive model

Before we can compare different models, we first have to decide on a metric to measure performance. A common choice for classification is accuracy, the fraction of instances that are classified correctly.
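As one possible performance metric, classification accuracy can be computed as follows. The choice of accuracy, the function name `accuracy`, and the label lists are assumptions for this sketch; the text does not prescribe a particular metric.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)

# Hypothetical true labels and model predictions
y_true = [1, -1, 1, 1, -1]
y_pred = [1, -1, -1, 1, -1]

acc = accuracy(y_true, y_pred)
print(acc)  # 0.8 (4 of 5 predictions are correct)
```

Computing the same metric for every candidate model on the same data makes their results directly comparable.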

3. Evaluating models and predicting unseen data instances

After we have selected a model that has been fitted on the training dataset, we can use the test dataset to estimate how well it performs on unseen data, that is, to estimate its generalization error. If we are satisfied with its performance, we can use the model to predict new, future data.
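The workflow described above (fit on the training dataset, then estimate the error on held-out test data) might look like the following sketch. Everything here is illustrative: `train_test_split` is a hand-written helper, not a library function, the dataset is synthetic, and the "model" is a deliberately simple threshold rule.

```python
import random

def train_test_split(samples, test_fraction=0.3, seed=0):
    """Shuffle a copy of the samples and split it into (train, test)."""
    shuffled = samples[:]
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

# Synthetic (feature, label) pairs: label is 1 when the feature exceeds 5
data = [(x, 1 if x > 5 else -1) for x in range(10)]
train, test = train_test_split(data)

# "Fit" the model on the training data only:
# the threshold is the midpoint between the two class means
pos = [x for x, y in train if y == 1]
neg = [x for x, y in train if y == -1]
threshold = (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2

def predict(x):
    """Classify a single feature value with the fitted threshold."""
    return 1 if x > threshold else -1

# Estimate the generalization error on data the model has not seen
errors = sum(predict(x) != y for x, y in test)
test_error = errors / len(test)
print(f"estimated generalization error: {test_error:.2f}")
```

The important design point is that the test samples play no role in choosing `threshold`; otherwise the error estimate would be optimistically biased.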

## Perceptron Algorithm

In [2]:
# enumerate: iterate over the indices and items of a list
alist = ['a1', 'a2', 'a3']

for i, a in enumerate(alist):
    print(i, a)

0 a1
1 a2
2 a3


In [4]:
# zip: iterate over two lists in parallel
alist = ['a1', 'a2', 'a3']
blist = ['b1', 'b2', 'b3']

for a, b in zip(alist, blist):
    print(a, b)

a1 b1
a2 b2
a3 b3


In [6]:
# enumerate with zip: iterate over two lists together with their indices
alist = ['a1', 'a2', 'a3']
blist = ['b1', 'b2', 'b3']

for i, (a, b) in enumerate(zip(alist, blist)):
    print(i, a, b)

0 a1 b1
1 a2 b2
2 a3 b3
