<a href="https://colab.research.google.com/github/MariaMuu/AI/blob/main/simple_text_summarizer_transformers.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [1]:
# installing transformers

!pip install transformers



In [2]:
# importing dependencies

from transformers import pipeline

In [3]:
# loading summarization pipeline (no model is passed, so the library falls back to its default checkpoint)

summarizer = pipeline('summarization')

No model was supplied, defaulted to sshleifer/distilbart-cnn-12-6 and revision a4f8f3e (https://huggingface.co/sshleifer/distilbart-cnn-12-6).
Using a pipeline without specifying a model name and revision in production is not recommended.
The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.
Device set to use cpu
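
The warning above notes that the pipeline fell back to a default checkpoint and that this is not recommended in production. A minimal sketch of pinning the model and revision explicitly, using the name and revision reported in that log. `load_pinned_summarizer` is a hypothetical helper name, not part of transformers, and the import is deferred so that defining the function does not load anything by itself.

```python
# Checkpoint name and revision copied from the warning printed by pipeline('summarization').
MODEL_NAME = "sshleifer/distilbart-cnn-12-6"
MODEL_REVISION = "a4f8f3e"


def load_pinned_summarizer():
    """Build the summarization pipeline with an explicit model and revision.

    Hypothetical helper for illustration; not part of the transformers library.
    """
    # Imported lazily; requires `pip install transformers`.
    from transformers import pipeline

    return pipeline("summarization", model=MODEL_NAME, revision=MODEL_REVISION)
```

Calling `summarizer = load_pinned_summarizer()` should then behave like the cell above, without the "No model was supplied" warning.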


In [7]:
# the text we want to summarize

text = """Certainly! Here's a comprehensive and detailed explanation of how neural networks work, encompassing their structure, functionality, training processes, and various architectures:

---

A neural network is a computational model inspired by the human brain's structure and function. It consists of interconnected units called neurons, organized into layers: an input layer, one or more hidden layers, and an output layer. Each neuron receives input, processes it, and passes the output to the next layer.

**Structure and Functionality:**

* **Neurons and Layers:** Each neuron in a layer is connected to neurons in adjacent layers through weighted connections. The input layer receives raw data, hidden layers perform computations, and the output layer produces the final result.

* **Weights and Biases:** Connections between neurons have associated weights, which determine the strength of the signal transmitted. Each neuron also has a bias value that adjusts the output along with the weighted sum of inputs.

* **Activation Functions:** After computing the weighted sum and adding the bias, the result is passed through an activation function. This function introduces non-linearity into the model, enabling it to learn complex patterns. Common activation functions include:

  * **Sigmoid:** Outputs values between 0 and 1.
  * **Tanh:** Outputs values between -1 and 1.
  * **ReLU (Rectified Linear Unit):** Outputs zero for negative inputs and the input itself for positive inputs.([khanacademy.org][1])

**Training Process:**

* **Forward Propagation:** Input data is passed through the network layer by layer. Each neuron computes its output based on inputs, weights, biases, and the activation function. The final output is compared to the actual target to compute the error.

* **Loss Function:** The error between the predicted output and the actual target is quantified using a loss function. Common loss functions include Mean Squared Error for regression tasks and Cross-Entropy Loss for classification tasks.

* **Backpropagation:** To minimize the loss, the network adjusts its weights and biases. Backpropagation computes the gradient of the loss function with respect to each weight and bias by applying the chain rule of calculus. This process propagates the error backward through the network.

* **Gradient Descent:** Using the gradients computed during backpropagation, the network updates its weights and biases in the direction that reduces the loss. The learning rate determines the size of these updates. Variants like Stochastic Gradient Descent (SGD), Adam, and RMSprop are commonly used optimization algorithms.

**Architectures:**

* **Feedforward Neural Networks (FNN):** The simplest type, where connections do not form cycles. Data moves in one direction from input to output.([en.wikipedia.org][2])

* **Convolutional Neural Networks (CNN):** Specialized for processing grid-like data such as images. They use convolutional layers to extract spatial features, pooling layers to reduce dimensionality, and fully connected layers for classification.([en.wikipedia.org][3])

* **Recurrent Neural Networks (RNN):** Designed for sequential data. They have connections that form cycles, allowing information to persist across time steps. Variants like Long Short-Term Memory (LSTM) and Gated Recurrent Units (GRU) address issues like vanishing gradients.

* **Residual Neural Networks (ResNet):** Introduce skip connections that allow gradients to flow directly through the network, enabling the training of very deep networks.

* **Transformer Networks:** Utilize self-attention mechanisms to process sequential data without recurrence. They have become the foundation for many state-of-the-art models in natural language processing.

**Regularization Techniques:**

To prevent overfitting, various regularization methods are employed:

* **Dropout:** Randomly disables a fraction of neurons during training to prevent co-adaptation.

* **L1 and L2 Regularization:** Add penalty terms to the loss function based on the magnitude of weights, encouraging simpler models.

* **Early Stopping:** Halts training when performance on a validation set starts to degrade.

**Applications:**

Neural networks are applied in various domains:

* **Computer Vision:** Image classification, object detection, and segmentation.

* **Natural Language Processing:** Language translation, sentiment analysis, and text generation.

* **Speech Recognition:** Converting spoken language into text.

* **Time Series Prediction:** Forecasting stock prices, weather, and other temporal data.

* **Healthcare:** Disease diagnosis, medical image analysis, and personalized treatment recommendations.

**Challenges and Considerations:**

* **Data Requirements:** Neural networks often require large amounts of labeled data for effective training.

* **Computational Resources:** Training deep networks can be computationally intensive, necessitating specialized hardware like GPUs.

* **Interpretability:** Understanding the decision-making process of neural networks can be challenging, leading to research in explainable AI.

* **Hyperparameter Tuning:** Selecting appropriate hyperparameters (e.g., learning rate, number of layers, activation functions) is crucial for model performance and often requires experimentation.

**Advancements:**

Recent developments have led to more efficient and powerful neural network models:

* **Transfer Learning:** Leveraging pre-trained models on new tasks, reducing data and computational requirements.

* **Neural Architecture Search (NAS):** Automating the design of neural network architectures for optimal performance.

* **Self-Supervised Learning:** Learning representations from unlabeled data, expanding the applicability of neural networks.

"""

In [8]:
print(len(text))

5797
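
`len(text)` counts characters, whereas the summarizer's length limits are expressed in tokens. A rough sketch of reporting a word count alongside the character count. `describe_length` is a hypothetical helper, and whitespace-separated words are only a crude proxy for the tokens the model actually sees.

```python
def describe_length(text):
    """Return character and whitespace-separated word counts for a string."""
    return {"characters": len(text), "words": len(text.split())}


sample = "A neural network is a computational model."
print(describe_length(sample))
```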


In [25]:
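# request a summary of 100-500 tokens (max_length and min_length count tokens, not words)
# truncation=True cuts the input to the model's maximum input length, so any text beyond that limit is ignored

summary = summarizer(text, max_length=500, min_length=100, do_sample=False, truncation=True)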
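
Because `truncation=True` drops whatever exceeds the model's input limit, one common workaround is to split the text into chunks and summarize each one. A minimal sketch, assuming paragraphs are separated by blank lines and using a character budget as a stand-in for a token limit. `chunk_paragraphs` and `max_chars` are hypothetical names, and the default of 2000 is an arbitrary illustration rather than a tuned setting.

```python
def chunk_paragraphs(text, max_chars=2000):
    """Group blank-line-separated paragraphs into chunks of at most max_chars characters.

    A single paragraph longer than max_chars is kept whole as its own chunk.
    """
    chunks, current = [], ""
    for paragraph in text.split("\n\n"):
        paragraph = paragraph.strip()
        if not paragraph:
            continue  # skip empty pieces produced by consecutive blank lines
        candidate = f"{current}\n\n{paragraph}" if current else paragraph
        if len(candidate) <= max_chars or not current:
            # Still within budget, or this is the first paragraph of a new chunk.
            current = candidate
        else:
            # Adding this paragraph would exceed the budget: close the chunk.
            chunks.append(current)
            current = paragraph
    if current:
        chunks.append(current)
    return chunks
```

Each chunk could then be passed to `summarizer` in turn and the partial summaries joined. The `min_length` used above may need lowering for short chunks.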


In [28]:
# the pipeline returns a list with one dict per input;
# take its 'summary_text' entry and strip surrounding whitespace

cleaned_summary = summary[0]['summary_text'].strip()

cleaned_summary

"A neural network is a computational model inspired by the human brain's structure and function . It consists of interconnected units called neurons, organized into layers: an input layer, one or more hidden layers, and an output layer . Each neuron receives input, processes it, and passes the output to the next layer . The loss function is quantified using a loss function . The training process propagates the error backward through the network . The network updates its weights and biases in the direction that reduces the loss ."

In [30]:
# remove the stray space the model leaves before each period
final_cleaned_summary = cleaned_summary.replace(' .', '.')
final_cleaned_summary

"A neural network is a computational model inspired by the human brain's structure and function. It consists of interconnected units called neurons, organized into layers: an input layer, one or more hidden layers, and an output layer. Each neuron receives input, processes it, and passes the output to the next layer. The loss function is quantified using a loss function. The training process propagates the error backward through the network. The network updates its weights and biases in the direction that reduces the loss."