# [How to Implement the Backpropagation Algorithm From Scratch In Python](https://machinelearningmastery.com/implement-backpropagation-algorithm-scratch-python/)

## Courtesy of Jason Brownlee at [Machine Learning Mastery](https://machinelearningmastery.com/). Thanks Jason!

In [9]:
import pandas as pd

## Description

This section provides a brief introduction to the Backpropagation algorithm and the Wheat Seeds dataset that we will be using in this tutorial.

### Backpropagation Algorithm

The Backpropagation algorithm is a supervised learning method for multilayer feed-forward networks from the field of Artificial Neural Networks.  
Feed-forward neural networks are inspired by the information processing of one or more neural cells, called neurons.  
A neuron accepts input signals via its dendrites, which pass the electrical signal down to the cell body.  
The axon carries the signal out to synapses, which are the connections of a cell’s axon to other cells’ dendrites.  
The principle of the backpropagation approach is to model a given function by modifying internal weightings of input signals to produce an expected output signal.  
The system is trained using a supervised learning method, where the error between the system’s output and a known expected output is presented to the system and used to modify its internal state.
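
As a rough illustration of the idea above, the sketch below passes two inputs through a single neuron and repeatedly nudges its weights toward an expected output. The sigmoid transfer function, the learning rate of 0.5, and every name and value here are assumptions chosen for illustration, not code taken from the tutorial.

```python
from math import exp

def activate(weights, inputs):
    # Weighted sum of the inputs; the last weight is treated as the bias.
    total = weights[-1]
    for w, x in zip(weights[:-1], inputs):
        total += w * x
    return total

def transfer(activation):
    # Sigmoid squashes the activation into the range (0, 1).
    return 1.0 / (1.0 + exp(-activation))

def train_step(weights, inputs, expected, l_rate=0.5):
    # One supervised update: compare the output with the expected value,
    # then move each weight in the direction that reduces the error.
    output = transfer(activate(weights, inputs))
    delta = (expected - output) * output * (1.0 - output)
    updated = [w + l_rate * delta * x for w, x in zip(weights[:-1], inputs)]
    updated.append(weights[-1] + l_rate * delta)
    return updated, output

weights = [0.1, -0.2, 0.05]      # hypothetical starting weights (last is bias)
inputs, expected = [1.0, 0.5], 1.0
for _ in range(20):
    weights, output = train_step(weights, inputs, expected)
```

With each step the neuron's output moves closer to the expected value of 1.0, which is the "modify its internal state" part of the description in miniature.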

Technically, the backpropagation algorithm is a method for training the weights in a multilayer feed-forward neural network.  
As such, it requires a defined network structure of one or more layers, where each layer is fully connected to the next layer.  
A standard network structure is one input layer, one hidden layer, and one output layer.  
Backpropagation can be used for both classification and regression problems, but we will focus on classification in this tutorial.  
In classification problems, best results are achieved when the network has one neuron in the output layer for each class value.  
For example, consider a 2-class (binary) classification problem with the class values A and B.  
These expected outputs would have to be transformed into binary vectors with one column for each class value, such as [1, 0] and [0, 1] for A and B respectively.  
This is called a one hot encoding.
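
The encoding described above can be sketched in a few lines of plain Python. The function name `one_hot_encode` and the choice to order columns by sorted class value are assumptions made for this example.

```python
def one_hot_encode(labels):
    # Map each distinct class value to a column index (sorted for a stable order).
    classes = sorted(set(labels))
    index = {c: i for i, c in enumerate(classes)}
    encoded = []
    for label in labels:
        row = [0] * len(classes)
        row[index[label]] = 1
        encoded.append(row)
    return classes, encoded

classes, encoded = one_hot_encode(['A', 'B', 'A'])
print(classes)   # ['A', 'B']
print(encoded)   # [[1, 0], [0, 1], [1, 0]]
```

Each row has exactly one 1, in the column of its class, so a network with one output neuron per class can be trained against these vectors directly.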

### Wheat Seeds Dataset

The seeds dataset involves the prediction of species given measurements of seeds from different varieties of wheat.  
There are 201 records and 7 numerical input variables.  
It is a classification problem with 3 output classes.  
The scale of each numeric input value varies, so some data normalization may be required for use with algorithms that weight inputs like the backpropagation algorithm.  
Using the Zero Rule algorithm that predicts the most common class value, the baseline accuracy for the problem is 28.095%.  
You can learn more and download the seeds dataset from the [UCI Machine Learning Repository](https://archive.ics.uci.edu/ml/datasets/seeds).  
Download the seeds dataset and place it into your current working directory with the filename seeds_dataset.csv.  
The dataset is in tab-separated format, so you must convert it to CSV using a text editor or a spreadsheet program.
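
If you would rather script the conversion than do it by hand, a minimal sketch using only the standard library is shown below. It assumes the downloaded file separates fields with one or more tabs or spaces. The function name `whitespace_to_csv` and the input filename `seeds_dataset.txt` are hypothetical, so adjust them to match your download.

```python
import csv

def whitespace_to_csv(src_path, dst_path):
    # Split each non-empty line on any run of whitespace (tabs or spaces)
    # and write the fields out as one comma-separated row.
    with open(src_path) as src, open(dst_path, 'w', newline='') as dst:
        writer = csv.writer(dst)
        for line in src:
            fields = line.split()
            if fields:
                writer.writerow(fields)

# Hypothetical usage; 'seeds_dataset.txt' is an assumed name for the download.
# whitespace_to_csv('seeds_dataset.txt', 'seeds_dataset.csv')
```

Splitting on runs of whitespace rather than on a single tab means repeated tabs between fields do not produce empty columns.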

In [10]:
seeds_dataset = pd.read_csv('seeds_dataset.csv', header=None)

Below is a sample of the first 5 rows of the seeds dataset.

In [11]:
seeds_dataset[:5]

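|   | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
|---|---|---|---|---|---|---|---|---|
| 0 | 15.26 | 14.84 | 0.871 | 5.763 | 3.312 | 2.221 | 5.22 | 1 |
| 1 | 14.88 | 14.57 | 0.8811 | 5.554 | 3.333 | 1.018 | 4.956 | 1 |
| 2 | 14.29 | 14.09 | 0.905 | 5.291 | 3.337 | 2.699 | 4.825 | 1 |
| 3 | 13.84 | 13.94 | 0.8955 | 5.324 | 3.379 | 2.259 | 4.805 | 1 |
| 4 | 16.14 | 14.99 | 0.9034 | 5.658 | 3.562 | 1.355 | 5.175 | 1 |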
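
The sample shows why normalization may be needed: column 0 ranges around 14 to 16 while column 2 stays below 1. One common option is min-max scaling to the range 0 to 1, sketched below on the five rows shown above with the class column left out. The helper names `column_minmax` and `normalize` are assumptions for this example, and in practice the minimum and maximum would be computed over the whole dataset rather than a five-row sample.

```python
def column_minmax(rows):
    # Collect the (min, max) pair of every column.
    return [(min(col), max(col)) for col in zip(*rows)]

def normalize(rows, minmax):
    # Rescale each value to [0, 1]; a constant column maps to 0.0.
    scaled = []
    for row in rows:
        scaled.append([
            (v - lo) / (hi - lo) if hi > lo else 0.0
            for v, (lo, hi) in zip(row, minmax)
        ])
    return scaled

# Input columns of the five sample rows (class column left out).
sample = [
    [15.26, 14.84, 0.871, 5.763, 3.312, 2.221, 5.22],
    [14.88, 14.57, 0.8811, 5.554, 3.333, 1.018, 4.956],
    [14.29, 14.09, 0.905, 5.291, 3.337, 2.699, 4.825],
    [13.84, 13.94, 0.8955, 5.324, 3.379, 2.259, 4.805],
    [16.14, 14.99, 0.9034, 5.658, 3.562, 1.355, 5.175],
]
scaled = normalize(sample, column_minmax(sample))
```

After scaling, the smallest value in each column becomes 0.0 and the largest becomes 1.0, so no single input dominates the weighted sum simply because of its units.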


## Tutorial