In [1]:
# Step 1: Import necessary libraries
import numpy as np
import tensorflow as tf
from tensorflow import keras

# Step 2: Generate synthetic data
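# Create synthetic data for a binary classification problem.
# Note: the labels are drawn independently of the features, so there is no
# pattern to learn and accuracy near 50% (chance level) is expected.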
np.random.seed(0)
X_train = np.random.rand(1000, 10)  # 1000 samples with 10 features
y_train = np.random.randint(2, size=1000)  # Binary labels (0 or 1)

X_test = np.random.rand(200, 10)  # 200 samples for testing
y_test = np.random.randint(2, size=200)

# Step 3: Build the ANN Model
model = keras.Sequential([
    keras.layers.Dense(64, activation='relu', input_shape=(10,)),
    keras.layers.Dense(64, activation='relu'),
    keras.layers.Dense(1, activation='sigmoid')
])

# Step 4: Compile the Model
model.compile(optimizer='adam',
              loss='binary_crossentropy',
              metrics=['accuracy'])

# Step 5: Train the Model
model.fit(X_train, y_train, epochs=10, batch_size=32, verbose=1)

# Step 6: Evaluate the Model
test_loss, test_acc = model.evaluate(X_test, y_test, verbose=1)
print('Test accuracy:', test_acc)

# Step 7: Test the Model
# Let's generate new synthetic data for testing
X_new = np.random.rand(10, 10)  # 10 new samples
predictions = model.predict(X_new)
print("Predictions for new data:")
print(predictions)




Epoch 1/10


Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10
Test accuracy: 0.4699999988079071
Predictions for new data:
[[0.5053637 ]
 [0.38357165]
 [0.5373104 ]
 [0.5237094 ]
 [0.53734785]
 [0.45743692]
 [0.52177835]
 [0.51759934]
 [0.48160008]
 [0.48362982]]




#### Theory Behind ANN:

Artificial Neural Networks (ANNs) are a class of machine learning models inspired by the structure and functioning of the human brain. They consist of interconnected nodes, called neurons, organized in layers. In a typical ANN, there are three types of layers:

    Input Layer: Receives input data.
    Hidden Layers: Perform computations on the input data.
    Output Layer: Produces the final output.

Each connection between neurons has a weight associated with it, which determines the strength of influence of one neuron on another. During training, the network adjusts these weights to minimize the difference between the actual and predicted outputs using an optimization algorithm like gradient descent.
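To make the weight-adjustment idea concrete, here is a minimal NumPy sketch of gradient descent on a single sigmoid neuron. The toy data, the learning rate, and the helper names (`sigmoid`, `loss_and_grads`) are assumptions made for illustration; they are not part of the notebook's Keras model.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy data: 200 samples, 3 features, labels that depend on the features
X = rng.random((200, 3))
y = (X.sum(axis=1) > 1.5).astype(float)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def loss_and_grads(w, b, X, y):
    """Binary cross-entropy of one sigmoid neuron and its gradients."""
    p = np.clip(sigmoid(X @ w + b), 1e-12, 1 - 1e-12)
    loss = -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))
    grad_w = X.T @ (p - y) / len(y)  # d(loss)/d(w)
    grad_b = np.mean(p - y)          # d(loss)/d(b)
    return loss, grad_w, grad_b

w = np.zeros(3)
b = 0.0
learning_rate = 0.5  # assumed value for this sketch
losses = []
for step in range(200):
    loss, grad_w, grad_b = loss_and_grads(w, b, X, y)
    losses.append(loss)
    # Move the weights a small step against the gradient
    w -= learning_rate * grad_w
    b -= learning_rate * grad_b

print(f"loss before training: {losses[0]:.3f}, after: {losses[-1]:.3f}")
```

Because these labels depend on the features, the loss falls as the weights are updated. A full network repeats the same idea for every weight in every layer.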

Steps to Build the ANN Model:

    Import Necessary Libraries: We'll need TensorFlow and NumPy for this task.
    Generate Synthetic Data: We'll create synthetic data for a simple classification task.
    Build the ANN Model: We'll define the architecture of the ANN using TensorFlow's Keras API.
    Compile the Model: Specify the loss function, optimizer, and metrics for training.
    Train the Model: Fit the model to the synthetic data.
    Evaluate the Model: Check the model's performance on the test data.
    Test the Model: Use the trained model to make predictions on new data.
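
The last step returns sigmoid probabilities rather than class labels. A minimal sketch of turning them into 0/1 labels, using the probabilities printed in the notebook output above and assuming the conventional 0.5 threshold:

```python
import numpy as np

# Probabilities copied from the notebook's printed predictions
probabilities = np.array([
    [0.5053637], [0.38357165], [0.5373104], [0.5237094], [0.53734785],
    [0.45743692], [0.52177835], [0.51759934], [0.48160008], [0.48362982],
])

# Assumed decision rule: class 1 if the probability exceeds 0.5, else class 0
labels = (probabilities > 0.5).astype(int).ravel()
print(labels)  # [1 0 1 1 1 0 1 1 0 0]
```

All ten probabilities sit close to 0.5, which is what we would expect from a model trained on labels that carry no signal.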


Activation Functions:
Activation functions are mathematical functions applied to the output of each neuron in a neural network. They introduce non-linearities to the network, allowing it to learn complex patterns in the data. Here are some commonly used activation functions:

Sigmoid Function:
$\sigma(x) = \frac{1}{1 + e^{-x}}$
Range: (0, 1)
Used in the output layer of binary classification problems.

ReLU (Rectified Linear Unit):
$\mathrm{ReLU}(x) = \max(0, x)$
Range: [0, +∞)
Most widely used activation function for hidden layers due to its simplicity and effectiveness.

Tanh (Hyperbolic Tangent):
$\tanh(x) = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}}$
Range: (-1, 1)
Similar to the sigmoid function but centered around zero, often used in hidden layers.

Softmax Function:
$\mathrm{softmax}(x)_i = \frac{e^{x_i}}{\sum_j e^{x_j}}$
Range: (0, 1), with the outputs summing to 1
Used in the output layer of multi-class classification problems to obtain probabilities.
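
The four activation functions above can be sketched in a few lines of NumPy. This is an illustrative version, not the Keras implementation; the function names and the sample input `z` are chosen for this example.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def relu(x):
    return np.maximum(0.0, x)

def tanh(x):
    return np.tanh(x)

def softmax(x):
    # Subtract the max before exponentiating for numerical stability
    shifted = x - np.max(x, axis=-1, keepdims=True)
    exps = np.exp(shifted)
    return exps / np.sum(exps, axis=-1, keepdims=True)

z = np.array([-2.0, 0.0, 3.0])
print("sigmoid:", sigmoid(z))
print("relu:   ", relu(z))
print("tanh:   ", tanh(z))
print("softmax:", softmax(z), "sum =", softmax(z).sum())
```

Note that sigmoid, ReLU, and tanh act on each value independently, while softmax normalizes across the whole vector.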

Dense Layer:

A dense layer, also known as a fully connected layer, is the basic building block of a neural network. It consists of multiple neurons (or nodes), each connected to every neuron in the previous layer, hence "dense." The computation performed in a dense layer is a linear operation followed by an activation function: output = activation(input · W + b), where W is the layer's weight matrix and b is its bias vector. The number of neurons in the layer determines the output dimensionality.
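
A minimal NumPy sketch of what a dense layer computes, assuming a batch of 4 samples with 10 features and a layer of 64 neurons. The helper `dense_forward` and the random weights are hypothetical and only illustrate the shapes involved.

```python
import numpy as np

def dense_forward(x, W, b, activation):
    """Linear operation (x @ W + b) followed by an activation function."""
    return activation(x @ W + b)

def relu(z):
    return np.maximum(0.0, z)

rng = np.random.default_rng(0)
x = rng.random((4, 10))                   # 4 samples, 10 features each
W = rng.normal(scale=0.1, size=(10, 64))  # one column of weights per neuron
b = np.zeros(64)                          # one bias per neuron

h = dense_forward(x, W, b, relu)
print(h.shape)  # (4, 64)
```

The output has one column per neuron, which is why the number of neurons sets the output dimensionality.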

Sequential Model:

The sequential model is a linear stack of layers in Keras, a high-level neural networks API running on top of TensorFlow. It allows you to create models layer-by-layer, where each layer has exactly one input tensor and one output tensor. The sequential model is ideal for building simple neural networks where the data flows sequentially from one layer to the next.

In the provided code example:

keras.Sequential() creates a sequential model from a list of layers.
keras.layers.Dense() defines a dense layer; each one in the list is added to the model in order.
The activation parameter in Dense() specifies the activation function to use.
The input_shape parameter in the first layer specifies the shape of a single input sample, here (10,) for 10 features.

By stacking dense layers with activation functions, the sequential model transforms the input data through multiple layers to produce output predictions.
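
To get a feel for the size of the stacked model, we can count its trainable parameters by hand. This sketch assumes every dense layer has a bias term (the Keras default) and uses the layer sizes from the code above: 10 inputs, then 64, 64, and 1 neurons.

```python
# Layer widths: input features, two hidden layers, one output neuron
layer_sizes = [10, 64, 64, 1]

def count_dense_params(n_in, n_out):
    """Weights (n_in * n_out) plus one bias per output neuron."""
    return n_in * n_out + n_out

per_layer = [
    count_dense_params(n_in, n_out)
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])
]
total = sum(per_layer)
print(per_layer, total)  # [704, 4160, 65] 4929
```

Calling model.summary() on the notebook's model should report the same per-layer counts.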


    Optimizer:
    An optimizer is an algorithm used to minimize the loss function during training by adjusting the weights of the neural network. It determines how the weights are updated in each iteration of the training process. Some popular optimizers include:

    Gradient Descent: The basic optimization algorithm that adjusts the weights in the direction of the steepest descent of the loss function.

    Adam (Adaptive Moment Estimation): An adaptive learning rate optimization algorithm that combines the advantages of two other extensions of gradient descent: AdaGrad and RMSProp. It adapts the learning rates for each parameter based on estimates of first and second moments of the gradients.

    SGD (Stochastic Gradient Descent): A variant of gradient descent that updates the weights using a single sample or a small subset of the training data (a mini-batch) at each iteration rather than the full dataset, which makes each update much cheaper to compute.
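
The update rules can be compared on a toy problem. The sketch below minimizes the hypothetical function f(w) = (w - 3)^2 with a plain gradient step and with an Adam step written out by hand. The learning rate of 0.1 is an assumption for this example; the beta and epsilon values follow commonly used Adam defaults.

```python
import numpy as np

def sgd_step(w, grad, lr=0.1):
    """Plain gradient step: move against the gradient."""
    return w - lr * grad

def adam_step(w, grad, m, v, t, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update using running estimates of the first and second moments."""
    m = beta1 * m + (1 - beta1) * grad        # first moment (mean of gradients)
    v = beta2 * v + (1 - beta2) * grad ** 2   # second moment (mean of squared gradients)
    m_hat = m / (1 - beta1 ** t)              # bias correction for early steps
    v_hat = v / (1 - beta2 ** t)
    w = w - lr * m_hat / (np.sqrt(v_hat) + eps)
    return w, m, v

def grad_fn(w):
    # Gradient of the toy loss f(w) = (w - 3)^2
    return 2.0 * (w - 3.0)

w_sgd = np.array([0.0])
w_adam = np.array([0.0])
m = np.zeros(1)
v = np.zeros(1)
for t in range(1, 301):
    w_sgd = sgd_step(w_sgd, grad_fn(w_sgd))
    w_adam, m, v = adam_step(w_adam, grad_fn(w_adam), m, v, t)

print("gradient step result:", w_sgd)
print("Adam result:         ", w_adam)
```

Both should end up close to the minimum at w = 3. Adam scales each step by the running gradient statistics, which is what "adaptive learning rate" refers to.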

    Loss Function (Binary Cross-Entropy):
    The loss function, also known as the cost function or objective function, quantifies how well the model's predictions match the actual labels during training. In binary classification tasks (where there are only two classes), binary cross-entropy loss is commonly used. It measures the dissimilarity between the actual and predicted probability distributions of the binary classes.

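    In the binary cross-entropy loss function, for each instance, the model computes the cross-entropy between the true binary label and the predicted probability, and the results are averaged over all instances. The formula for binary cross-entropy loss is:

    $$L = -\frac{1}{N} \sum_{i=1}^{N} \left[ y_i \log(p_i) + (1 - y_i) \log(1 - p_i) \right]$$

    where $N$ is the number of instances, $y_i$ is the true label (0 or 1) of instance $i$, and $p_i$ is the predicted probability that instance $i$ belongs to class 1.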

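Binary cross-entropy can also be computed directly in NumPy. This is a minimal sketch rather than the Keras implementation; the function name and the clipping constant `eps` are assumptions, with clipping added only to avoid taking log(0).

```python
import numpy as np

def binary_cross_entropy(y_true, y_pred, eps=1e-7):
    """Mean binary cross-entropy between 0/1 labels and predicted probabilities."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.clip(np.asarray(y_pred, dtype=float), eps, 1 - eps)
    return -np.mean(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred))

labels = [1, 0, 1]
confident = binary_cross_entropy(labels, [0.9, 0.1, 0.8])  # mostly right
unsure = binary_cross_entropy(labels, [0.5, 0.5, 0.5])     # always predicts 0.5
wrong = binary_cross_entropy(labels, [0.1, 0.9, 0.2])      # mostly wrong

print(f"confident: {confident:.4f}")
print(f"unsure:    {unsure:.4f}")   # equals ln(2), about 0.6931
print(f"wrong:     {wrong:.4f}")
```

A model that always predicts 0.5 scores ln(2), roughly 0.693, which is a useful reference point when reading the training loss.
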
    Epoch:
    An epoch refers to one complete pass through the entire training dataset during the training phase. In other words, one epoch is completed when the model has processed every sample in the training dataset once. Training typically involves running multiple epochs, allowing the model to learn from the data iteratively.

    Batch Size:
    Batch size refers to the number of training examples utilized in one iteration. Instead of feeding the entire dataset into the model at once, training is typically done in smaller batches. Each batch consists of a subset of the training data, and the model updates its weights based on the gradients computed on this batch.

    Using mini-batches (batches smaller than the full dataset) instead of processing the entire dataset at once offers several advantages:

    It reduces memory requirements, especially for large datasets.
    It introduces noise into the optimization process, which can help the model generalize better.
    It allows for parallelization on hardware like GPUs, leading to faster training times.
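
How epochs and batches fit together can be sketched with a small NumPy loop. The generator `iterate_minibatches` is a hypothetical helper written for this example and only mimics the batching; it does not train anything. The dataset size and batch size match the notebook's code.

```python
import math
import numpy as np

def iterate_minibatches(X, y, batch_size, rng):
    """Yield shuffled mini-batches that together cover the dataset once (one epoch)."""
    indices = rng.permutation(len(X))
    for start in range(0, len(X), batch_size):
        batch_idx = indices[start:start + batch_size]
        yield X[batch_idx], y[batch_idx]

rng = np.random.default_rng(0)
X = rng.random((1000, 10))
y = rng.integers(0, 2, size=1000)

epochs = 2
batch_size = 32
steps_per_epoch = math.ceil(len(X) / batch_size)  # 32 weight updates per epoch

for epoch in range(epochs):
    batch_sizes = [len(xb) for xb, _ in iterate_minibatches(X, y, batch_size, rng)]
    print(f"Epoch {epoch + 1}/{epochs}: {len(batch_sizes)} batches, "
          f"last batch has {batch_sizes[-1]} samples")
```

With 1,000 samples and a batch size of 32, each epoch has 32 batches; the last one holds only the 8 leftover samples.
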

    Verbose:
    The verbose parameter controls how much information is displayed during training. It accepts the values 0, 1, or 2:

    verbose=0: Silent mode. No output is displayed.
    verbose=1: Progress bar mode, which is what Keras uses by default in most setups. A progress bar is displayed for each epoch, along with the training metrics.
    verbose=2: One-line mode. One line per epoch is printed, showing the training metrics without a progress bar.

    Example Usage in Code:

    model.fit(X_train, y_train, epochs=10, batch_size=32, verbose=1)

    epochs=10: Specifies that the model should be trained for 10 epochs.
    batch_size=32: Specifies that each training iteration should process 32 samples at a time, which gives 32 iterations per epoch for the 1,000 training samples.
    verbose=1: Specifies that progress bars will be displayed during training.

