
# Explanation of the intuition behind the Perceptron algorithm

The Perceptron algorithm is one of the fundamental building blocks of machine learning and artificial neural networks. It is a simple supervised learning algorithm used for binary classification, meaning it's used to separate data into two distinct classes. Developed by Frank Rosenblatt in the late 1950s, the Perceptron is considered one of the earliest neural network models.

Here are the key components and steps of the Perceptron algorithm:

1. **Initialization:** The algorithm starts by initializing the weights (often denoted as "w") and the bias (denoted as "b", which plays the role of a negated threshold). These are typically set to small random values or initialized to zeros.

2. **Input Data:** For each training example, the algorithm takes a feature vector as input. The feature vector represents the data to be classified.

3. **Weighted Sum:** The Perceptron calculates the weighted sum of the input features by multiplying each feature by its corresponding weight and summing them up. The formula for the weighted sum is:

   $z = \sum_{i=1}^{n} w_i \cdot x_i + b$

   where $z$ is the weighted sum, $w_i$ are the weights, $x_i$ are the input features, $n$ is the number of features, and $b$ is the bias.

4. **Activation Function:** The weighted sum $z$ is then passed through an activation function. In the classic Perceptron, the activation function is a step function or sign function. If $z$ is greater than or equal to zero, the Perceptron outputs one class (e.g., +1); otherwise, it outputs the other class (e.g., -1). The step function essentially acts as a binary threshold.

5. **Updating Weights and Bias:** If the Perceptron makes an incorrect prediction, it updates the weights and bias to minimize the error. The update is based on the misclassified data point, and the algorithm tries to push the decision boundary in the correct direction. The update rule is as follows:

   $w_i \leftarrow w_i + \alpha \cdot (y - \hat{y}) \cdot x_i$

   $b \leftarrow b + \alpha \cdot (y - \hat{y})$

   where $w_i$ is the weight for feature $i$, $\alpha$ is the learning rate (a hyperparameter that controls the step size in weight updates), $y$ is the true label, and $\hat{y}$ is the predicted label.

6. **Iteration:** Steps 2 to 5 are repeated for each data point in the training set; one full pass over the training set is called an epoch. Training continues for a set number of epochs or until an epoch completes with no misclassifications.
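As a minimal sketch of steps 3 to 5 on a single example, the snippet below computes the weighted sum, applies the step function, and performs one update. All numeric values (weights, bias, learning rate, features, label) are made up for illustration and do not come from any dataset.

```python
# Illustrative values only; none of these numbers come from the text.
w = [0.5, -0.5]   # weights, one per feature
b = 0.0           # bias
alpha = 0.1       # learning rate
x = [1.0, 2.0]    # feature vector of one training example
y = 1             # true label, in {-1, +1}

# Step 3: weighted sum z = sum_i w_i * x_i + b
z = sum(wi * xi for wi, xi in zip(w, x)) + b

# Step 4: step activation (z >= 0 maps to +1, otherwise -1)
y_hat = 1 if z >= 0 else -1

# Step 5: update rule; (y - y_hat) is 0 for a correct prediction, +2 or -2 otherwise
w_new = [wi + alpha * (y - y_hat) * xi for wi, xi in zip(w, x)]
b_new = b + alpha * (y - y_hat)

# Re-evaluate the same example with the updated parameters
z_new = sum(wi * xi for wi, xi in zip(w_new, x)) + b_new
y_hat_new = 1 if z_new >= 0 else -1

print("before:", z, y_hat)
print("after: ", z_new, y_hat_new)
```

With these values the example starts on the wrong side of the boundary ($z = -0.5$) and a single update moves it to the correct side.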

The Perceptron's main idea is to learn a linear decision boundary that separates two classes. If the data is linearly separable, the Perceptron finds a boundary that classifies every training example correctly after a finite number of updates. If the data is not linearly separable, no such boundary exists, so every pass over the data contains at least one mistake and the weights never settle. In such cases, other methods such as a soft-margin or kernel Support Vector Machine (SVM) are often used.
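The difference between the two cases can be sketched on two tiny datasets: logical AND (linearly separable) and XOR (not linearly separable). The helper `train_perceptron` and its return value are assumptions made for this illustration; the learning rate 0.5 is chosen only so that every update is a whole number.

```python
def train_perceptron(examples, epochs=100, alpha=0.5):
    """Train on (features, label) pairs with labels in {-1, +1}.

    Returns (weights, bias, number of mistakes in the last epoch run).
    """
    w = [0.0] * len(examples[0][0])
    b = 0.0
    errors = 0
    for _ in range(epochs):
        errors = 0
        for x, y in examples:
            z = sum(wi * xi for wi, xi in zip(w, x)) + b
            y_hat = 1 if z >= 0 else -1
            if y_hat != y:
                errors += 1
                w = [wi + alpha * (y - y_hat) * xi for wi, xi in zip(w, x)]
                b += alpha * (y - y_hat)
        if errors == 0:  # a full pass without mistakes: converged
            break
    return w, b, errors

AND = [((0, 0), -1), ((0, 1), -1), ((1, 0), -1), ((1, 1), 1)]
XOR = [((0, 0), -1), ((0, 1), 1), ((1, 0), 1), ((1, 1), -1)]

_, _, and_errors = train_perceptron(AND)
_, _, xor_errors = train_perceptron(XOR)
print("AND mistakes in final epoch:", and_errors)
print("XOR mistakes in final epoch:", xor_errors)
```

On AND the loop stops early with zero mistakes, while on XOR it runs through all epochs and still misclassifies at least one point.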

The Perceptron is a simple yet important concept, forming the basis for more complex neural network architectures. While it's not suitable for all types of problems, it played a crucial role in the development of machine learning and the history of artificial intelligence.


# Pseudocode

Function PerceptronTraining(training_data, number_of_epochs, learning_rate):
    # Initialize one weight per feature, plus the bias
    for i in 1..number_of_features:
        weights[i] = random_initial_value()  # You can initialize them randomly or to zero
    bias = random_initial_value()  # You can also initialize it randomly or to zero

    # Perceptron training (example.class is the true label, either 1 or -1)
    for each epoch in 1..number_of_epochs:
        for each example in training_data:
            # Calculate weighted sum
            weighted_sum = 0
            for i in 1..number_of_features:
                weighted_sum += weights[i] * example.features[i]

            weighted_sum += bias

            # Apply the activation function (can be the sign function)
            prediction = sign_function(weighted_sum)

            # Update weights and bias if the prediction is incorrect
            if prediction != example.class:
                for i in 1..number_of_features:
                    weights[i] += learning_rate * (example.class - prediction) * example.features[i]
                bias += learning_rate * (example.class - prediction)

    return weights, bias

Function sign_function(value):
    if value >= 0:
        return 1
    else:
        return -1
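
The pseudocode above can be translated into plain Python roughly as follows. This is a sketch under a few assumptions not fixed by the pseudocode: weights and bias start at zero, each training example is a `(features, label)` tuple, and labels are -1 or +1. The OR dataset is only a small illustration.

```python
def sign_function(value):
    return 1 if value >= 0 else -1

def perceptron_training(training_data, number_of_epochs, learning_rate):
    """training_data: list of (features, label) pairs, label in {-1, +1}."""
    number_of_features = len(training_data[0][0])
    weights = [0.0] * number_of_features  # assumption: zero initialization
    bias = 0.0

    for _ in range(number_of_epochs):
        for features, label in training_data:
            # Weighted sum of the inputs plus the bias
            weighted_sum = bias
            for i in range(number_of_features):
                weighted_sum += weights[i] * features[i]

            prediction = sign_function(weighted_sum)

            # Update only when the prediction is wrong
            if prediction != label:
                for i in range(number_of_features):
                    weights[i] += learning_rate * (label - prediction) * features[i]
                bias += learning_rate * (label - prediction)

    return weights, bias

# Logical OR with -1/+1 labels (linearly separable)
or_data = [((0, 0), -1), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]
weights, bias = perceptron_training(or_data, number_of_epochs=10, learning_rate=0.5)

for features, label in or_data:
    z = sum(w * f for w, f in zip(weights, features)) + bias
    print(features, "true:", label, "predicted:", sign_function(z))
```

Unlike the loop described in step 6, this version always runs the full number of epochs, matching the pseudocode, which has no early exit.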

# Algorithm

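import tensorflow as tf
from tensorflow import keras
import numpy as np

# XOR truth table: four inputs with 0/1 labels.
# Note that XOR is not linearly separable.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([0, 1, 1, 0])

# A single neuron with a sigmoid activation. This is a smooth variant of the
# Perceptron: the step function is replaced by a sigmoid so that the model
# can be trained with gradient descent.
model = keras.Sequential([
    keras.layers.Dense(units=1, input_shape=(2,), activation='sigmoid')
])

# Train with stochastic gradient descent on the binary cross-entropy loss
model.compile(optimizer='sgd', loss='binary_crossentropy', metrics=['accuracy'])

model.fit(X, y, epochs=1000, verbose=0)

# Evaluate on the same four points used for training
loss, accuracy = model.evaluate(X, y)
print(f'Loss: {loss:.4f}, Accuracy: {accuracy * 100:.2f}%')

# Print the predicted probability of class 1 for each input
predictions = model.predict(X)
print('Predictions:')
for i in range(len(X)):
    print(f'Input: {X[i]}, Predicted Output: {predictions[i][0]:.2f}')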


Loss: 0.7015, Accuracy: 50.00%
Predictions:
Input: [0 0], Predicted Output: 0.41
Input: [0 1], Predicted Output: 0.52
Input: [1 0], Predicted Output: 0.46
Input: [1 1], Predicted Output: 0.57
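
The low accuracy above is expected rather than a training failure. A single unit can only draw one straight decision boundary (thresholding the sigmoid output at 0.5 is the same as checking the sign of the weighted sum), and XOR is not linearly separable. As a rough check, the sketch below tries every combination of two weights and a bias on a coarse grid and reports the best accuracy any of them reaches. The grid range and step are arbitrary choices for illustration.

```python
from itertools import product

X = [(0, 0), (0, 1), (1, 0), (1, 1)]
y = [0, 1, 1, 0]  # XOR labels, as in the cell above

def accuracy(w1, w2, b):
    """Fraction of XOR points classified correctly by the rule w1*x1 + w2*x2 + b >= 0."""
    preds = [1 if w1 * x1 + w2 * x2 + b >= 0 else 0 for x1, x2 in X]
    return sum(p == t for p, t in zip(preds, y)) / len(y)

# Candidate values from -5.0 to 5.0 in steps of 0.25
grid = [i / 4 for i in range(-20, 21)]
best = max(accuracy(w1, w2, b) for w1, w2, b in product(grid, repeat=3))
print(f"Best accuracy of any linear rule on the grid: {best:.2f}")
# prints: Best accuracy of any linear rule on the grid: 0.75
```

No linear rule gets all four XOR points right; the best possible is three out of four.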


# Understanding Loss Function and Optimization in Perceptrons

In the context of Perceptrons, it's important to discuss the loss function and optimization method used for training.

## Loss Function:

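The classic Perceptron is usually analyzed with the perceptron criterion rather than the mean squared error (MSE). For a single training example it is defined as:

$L(y, z) = \max(0, -y \cdot z)$

Where:
- $L(y, z)$ is the loss for one training example.
- $y$ is the true label of the training example (1 or -1 in binary classification).
- $z = \sum_{i=1}^{n} w_i \cdot x_i + b$ is the weighted sum, and $\hat{y}$ is the prediction obtained by thresholding $z$.

This loss is zero when the example is classified correctly and grows with the distance from the decision boundary when it is not. The squared error $(y - \hat{y})^2$ on the thresholded prediction is less useful for training: it only takes the values 0 and 4, and because $\hat{y}$ is a step function of $z$, its gradient with respect to the weights is zero almost everywhere. The goal of learning in the Perceptron is to minimize the loss by adjusting the weights and bias so that the predictions match the true labels.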
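
To make the behaviour of a per-example loss concrete, the sketch below evaluates two candidates for a positive example ($y = +1$) at several values of the weighted sum $z$: the squared error on the thresholded prediction, and the perceptron criterion $\max(0, -y \cdot z)$. The function names are illustrative, not part of any library.

```python
def step(z):
    return 1 if z >= 0 else -1

def squared_error(y, z):
    """Squared error on the thresholded prediction; only 0 or 4 for labels in {-1, +1}."""
    return (y - step(z)) ** 2

def perceptron_criterion(y, z):
    """max(0, -y*z): zero when correct, grows linearly with z on the wrong side."""
    return max(0.0, -y * z)

y = 1
for z in (-2.0, -0.5, 0.5, 2.0):
    print(f"z={z:+.1f}  squared_error={squared_error(y, z)}  "
          f"perceptron_criterion={perceptron_criterion(y, z)}")
```

The squared error jumps between 0 and 4 and says nothing about how far a misclassified point is from the boundary, whereas the perceptron criterion shrinks as the point approaches the correct side.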

## Optimization Method:

Perceptrons use a simple form of optimization known as "Stochastic Gradient Descent" (SGD). SGD is used to iteratively adjust the weights and bias to minimize the loss function.

The learning algorithm processes one training example at a time and moves the weights and bias against the gradient of the loss for that example. A correct prediction gives $y - \hat{y} = 0$, so only misclassified examples change the parameters. The weight and bias updates are defined as follows:

$w_i \leftarrow w_i + \alpha \cdot (y - \hat{y}) \cdot x_i$

$b \leftarrow b + \alpha \cdot (y - \hat{y})$

Where:
- $ w_i $ is the weight associated with feature $ x_i $.
- $ \alpha $ is the learning rate, controlling the step size for weight updates.
- $ y $ is the true label.
- $ \hat{y} $ is the Perceptron's prediction.
- $ x_i $ is the $ i $-th feature of the training example.
- $ b $ is the bias.
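
One way to see where this rule comes from is to treat it as a (sub)gradient step on a per-example loss. The derivation below is a sketch that assumes labels in $\{-1, +1\}$ and takes the perceptron criterion as the loss; the step size $\eta$ is a symbol introduced here for illustration.

$$
\begin{aligned}
L(w, b) &= \max\bigl(0,\; -y\,(w \cdot x + b)\bigr) \\
\frac{\partial L}{\partial w_i} &= -y\,x_i, \qquad \frac{\partial L}{\partial b} = -y \qquad \text{(for a misclassified example)} \\
w_i &\leftarrow w_i - \eta\,\frac{\partial L}{\partial w_i} = w_i + \eta\,y\,x_i, \qquad b \leftarrow b + \eta\,y \\
y - \hat{y} &= 2y \;\Rightarrow\; \alpha\,(y - \hat{y})\,x_i = 2\alpha\,y\,x_i \qquad \text{(same step with } \eta = 2\alpha\text{)}
\end{aligned}
$$

For a correctly classified example the loss and its gradient are zero, which matches $y - \hat{y} = 0$ in the update rule.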

Each SGD step adjusts the weights and bias to reduce the loss on the current training example. This process continues until a fixed number of epochs is reached or until there are no classification errors on the training data.
