In [13]:
from langchain_huggingface import HuggingFaceEndpoint, ChatHuggingFace
from dotenv import load_dotenv
from langchain_core.prompts import PromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnableParallel

load_dotenv()

llm1 = HuggingFaceEndpoint(
    repo_id='google/gemma-2-2b-it',
    task='text-generation'
)

llm2 = HuggingFaceEndpoint(
    repo_id="meta-llama/Llama-3.2-3B-Instruct",
    task="text-generation"
)


model1 = ChatHuggingFace(llm=llm1)
model2 = ChatHuggingFace(llm=llm2)

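Both endpoints need a Hugging Face access token, which `load_dotenv()` is presumably loading from a `.env` file here. A minimal pre-flight check might look like the sketch below. It assumes the token is stored under `HUGGINGFACEHUB_API_TOKEN` or `HF_TOKEN` (names commonly read by these integrations), and `find_hf_token` is a hypothetical helper, not part of any library.

```python
import os

# Environment variable names assumed here; adjust to whatever your .env defines.
TOKEN_VARS = ("HUGGINGFACEHUB_API_TOKEN", "HF_TOKEN")


def find_hf_token(env=None):
    """Return the name of the first token variable that is set and non-empty, else None."""
    env = os.environ if env is None else env
    for name in TOKEN_VARS:
        if env.get(name):
            return name
    return None


if __name__ == "__main__":
    found = find_hf_token()
    if found:
        print(f"token variable found: {found}")
    else:
        print("no Hugging Face token variable is set")
```

Running a check like this right after `load_dotenv()` surfaces a missing token before the first model call instead of during it.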

In [14]:
prompt1=PromptTemplate(
    template='Give me notes in simple words with real world examples on the following text \n {text1}',
    input_variables=['text1']
)

prompt2=PromptTemplate(
    template='Generate 5 question answers from the following text \n {text2}',
    input_variables=['text2']
)

prompt3=PromptTemplate(
    template='Merge the provided notes and quiz to a single document \n notes -> {notes} and Quiz -> {quiz}',
    input_variables=['notes','quiz']
)
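To see what each model actually receives, it can help to render a template by hand before wiring it into a chain. The sketch below uses plain `str.format` as a stand-in, on the assumption that `PromptTemplate` with its default f-string format fills simple `{name}` placeholders the same way. The sample notes and quiz strings are made up for illustration, and `render` is a hypothetical helper.

```python
# Same template text as prompt3 above, rendered with plain str.format.
MERGE_TEMPLATE = (
    "Merge the provided notes and quiz to a single document \n"
    " notes -> {notes} and Quiz -> {quiz}"
)


def render(template, **values):
    """Fill {name} placeholders; raises KeyError if a variable is missing."""
    return template.format(**values)


# Illustrative inputs only; in the notebook these come from the two branches.
rendered = render(
    MERGE_TEMPLATE,
    notes="RNNs keep a hidden state.",
    quiz="Q1: What is BPTT?",
)
print(rendered)
```

Leaving out one of the variables raises a `KeyError`, which is a quick way to catch a mismatch between a template and the keys passed to it.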

In [15]:
parser=StrOutputParser()

In [16]:
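# Run both branches on the same input dict; the result is a dict
# with the keys 'notes' and 'quiz'.
parallel_chain = RunnableParallel({
    'notes': prompt1 | model1 | parser,
    'quiz': prompt2 | model2 | parser
})

# prompt3 expects exactly those two keys as its input variables.
merge_chain = prompt3 | model2 | parser

In [17]:
chain = parallel_chain | merge_chain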
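Conceptually, the composed chain fans the input dict out to two branches, collects their results into a dict keyed by `notes` and `quiz`, and hands that dict to the merge step. The plain-Python sketch below mimics that data flow with a thread pool and stub functions in place of the models. It illustrates the pattern only and is not how `RunnableParallel` is implemented; `fake_notes`, `fake_quiz`, `fake_merge`, `run_parallel`, and `run_chain` are hypothetical stand-ins.

```python
from concurrent.futures import ThreadPoolExecutor


def fake_notes(inputs):
    # Stand-in for: prompt1 | model1 | parser
    return f"NOTES({len(inputs['text1'])} chars)"


def fake_quiz(inputs):
    # Stand-in for: prompt2 | model2 | parser
    return f"QUIZ({len(inputs['text2'])} chars)"


def fake_merge(parts):
    # Stand-in for: prompt3 | model2 | parser
    return f"notes -> {parts['notes']} and Quiz -> {parts['quiz']}"


def run_parallel(branches, inputs):
    """Run every branch on the same input concurrently; return {name: result}."""
    with ThreadPoolExecutor(max_workers=len(branches)) as pool:
        futures = {name: pool.submit(fn, inputs) for name, fn in branches.items()}
        return {name: fut.result() for name, fut in futures.items()}


def run_chain(inputs):
    # Fan out to both branches, then feed the collected dict to the merge step.
    parts = run_parallel({"notes": fake_notes, "quiz": fake_quiz}, inputs)
    return fake_merge(parts)


print(run_chain({"text1": "abc", "text2": "hello"}))
```

The point to carry over is that the branch names in the parallel step must match the variable names the merge prompt expects, since the dict produced by the first stage is the input to the second.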

In [18]:
text1 = """
Recurrent Neural Networks (RNNs) – Detailed Notes
1. Introduction
Recurrent Neural Networks (RNNs) are a class of neural networks designed to handle sequential data. Unlike feedforward neural networks, RNNs have connections that form directed cycles, allowing information to persist and enabling them to maintain a form of memory of previous inputs. This makes them particularly suitable for tasks where context and order matter, such as natural language processing, time series forecasting, and speech recognition.

2. Why RNNs?
Traditional neural networks do not have an internal memory and treat each input independently. This is inefficient for sequential tasks where the output depends on the previous computations or inputs. RNNs address this limitation by:

Processing inputs one step at a time (sequentially).

Maintaining a hidden state that gets updated with each input.

Sharing parameters across all time steps.

This allows RNNs to learn patterns over time and make predictions based on both current input and prior context.

3. Architecture of RNN
At each time step t, an RNN receives:

An input vector xₜ

The previous hidden state hₜ₋₁

And computes:

The current hidden state hₜ using the formula:
hₜ = tanh(Wₕₕ·hₜ₋₁ + Wₓₕ·xₜ + bₕ)

An optional output yₜ, often computed as:
yₜ = Wₕy·hₜ + b_y

Where:

Wₓₕ is the weight matrix for input to hidden state

Wₕₕ is the weight matrix for hidden to hidden state

Wₕy is the weight matrix for hidden to output

bₕ and b_y are biases

All time steps share these weights, which significantly reduces the number of parameters.

4. Sequence Modeling with RNNs
RNNs can be configured in different ways depending on the task:

One-to-One: A single input produces a single output (e.g., image classification).

One-to-Many: A single input produces a sequence of outputs (e.g., image captioning).

Many-to-One: A sequence of inputs produces a single output (e.g., sentiment analysis).

Many-to-Many: A sequence of inputs produces a sequence of outputs (e.g., machine translation).

5. Backpropagation Through Time (BPTT)
Training an RNN involves unfolding it in time and applying backpropagation, a process known as Backpropagation Through Time (BPTT). In BPTT:

The loss is computed at each time step.

Gradients are propagated backward through each time step.

Weight updates are made by accumulating gradients over time.

This allows RNNs to learn long-term dependencies, but also introduces challenges due to the recursive nature of the network.

6. Challenges with RNNs
Despite their strengths, RNNs have several limitations:

a. Vanishing Gradient Problem
During BPTT, gradients can become very small as they are multiplied across many time steps. This leads to extremely slow learning or an inability to learn long-range dependencies.

b. Exploding Gradient Problem
Gradients can also grow exponentially, causing instability in training.

c. Short-Term Memory
Due to vanishing gradients, RNNs tend to remember only short-term dependencies unless special mechanisms are introduced.

7. Variants of RNNs
To address the limitations of vanilla RNNs, several improved architectures were developed:

a. Long Short-Term Memory (LSTM)
LSTM introduces memory cells and gates (input, forget, output gates) that control the flow of information, allowing the network to remember or forget information over longer periods.

b. Gated Recurrent Unit (GRU)
GRUs simplify LSTM by combining the forget and input gates into a single update gate, reducing complexity while retaining the ability to learn long-term dependencies.

c. Bidirectional RNN
These RNNs process data in both forward and backward directions, which helps in capturing context from both past and future.

8. Applications of RNNs
RNNs are widely used in domains where sequential information is crucial:

Natural Language Processing (NLP): Sentiment analysis, machine translation, text generation, question answering.

Speech Recognition: Transcribing audio signals into text.

Time Series Prediction: Forecasting stock prices, weather, or sales.

Music and Audio: Music composition and audio generation.

Video Processing: Action recognition and captioning.

9. Limitations in Modern Context
While RNNs laid the foundation for sequential modeling, they are increasingly being replaced or complemented by models like Transformers, which handle sequences using self-attention mechanisms rather than recurrence. Transformers offer better parallelization, handle long-range dependencies more effectively, and have become state-of-the-art in many NLP tasks.

10. Conclusion
Recurrent Neural Networks are powerful tools for modeling sequential data. Their ability to maintain hidden states and learn patterns over time makes them suitable for a variety of temporal and linguistic tasks. However, due to their training difficulties and limitations with long-term dependencies, they are often augmented or replaced by advanced architectures like LSTM, GRU, or Transformers in modern applications.

Understanding RNNs is fundamental to grasping the evolution of deep learning models for sequence data.
"""

In [19]:
text2="""
Convolutional Neural Networks (CNNs) – Detailed Theoretical Notes
1. Introduction
Convolutional Neural Networks (CNNs) are a specialized type of artificial neural networks designed primarily for processing structured grid data, such as images. Unlike traditional feedforward neural networks, CNNs are particularly efficient in handling spatial hierarchies in images through the use of convolutional operations.

CNNs were inspired by the visual cortex of animals, where individual neurons respond to stimuli only in a restricted region of the visual field known as the receptive field.

2. Motivation Behind CNNs
In classical machine learning and fully connected networks, each neuron connects to every input pixel, which becomes computationally infeasible for high-dimensional data like images. Additionally, such models ignore the spatial structure of the data. CNNs overcome these limitations by:

Leveraging local connections

Applying weight sharing

Using hierarchical feature extraction

These characteristics reduce the number of parameters and preserve spatial relationships.

3. Key Components of a CNN
a. Input Layer
The input to a CNN is typically a multi-dimensional array (e.g., a 2D grayscale image or a 3D color image). Each image has:

Width (W)

Height (H)

Depth or Channels (C), such as 3 for RGB images

b. Convolutional Layer
This layer is the core building block of a CNN. It involves:

A set of learnable filters or kernels (e.g., 3×3 or 5×5) that slide over the input image.

At each location, the filter performs an element-wise multiplication followed by a summation, producing a feature map.

Important concepts:

Stride: The step size with which the filter moves.

Padding: Adding borders to the input so that the output feature map retains dimensionality. There are two common types: "valid" (no padding) and "same" (output size equals input size).

Receptive field: The region in the input that a filter looks at.

The goal of convolution is to extract features such as edges, textures, shapes, etc.

c. Activation Function
After each convolution operation, a non-linear activation function (most commonly ReLU: Rectified Linear Unit) is applied to introduce non-linearity and allow the network to learn complex patterns.

Formula for ReLU:
f(x) = max(0, x)

Other activation functions include Leaky ReLU, Tanh, and Sigmoid, though ReLU is most common due to its simplicity and performance.

d. Pooling Layer (Subsampling or Downsampling)
Pooling layers reduce the spatial size of feature maps, which helps:

Decrease computation

Reduce overfitting

Introduce translation invariance

Types of pooling:

Max Pooling: Takes the maximum value in each region (e.g., 2×2 window)

Average Pooling: Takes the average value

Max pooling is more commonly used because it highlights the most prominent features.

e. Fully Connected Layer (Dense Layer)
After several convolutional and pooling layers, the output is flattened and passed to one or more fully connected layers. These layers act as a classifier and produce the final output, such as class scores in classification tasks.

f. Output Layer
For classification tasks, the final layer often uses:

Softmax for multi-class classification

Sigmoid for binary classification

4. CNN Architecture Summary
A typical CNN architecture looks like this:

Input → [Conv → Activation → Pool]*N → Flatten → Fully Connected → Output
Here, [Conv → Activation → Pool]*N represents N repetitions of convolution, activation, and pooling.

5. Parameter Sharing and Sparsity
CNNs are highly efficient due to:

Parameter Sharing: The same filter (set of weights) is used across the entire input image, greatly reducing the number of parameters compared to fully connected layers.

Sparse Connectivity: Each neuron in the convolutional layer is connected to only a small region of the input.

This makes CNNs scalable and trainable on large image datasets.

6. Advantages of CNNs
Capture spatial and temporal dependencies

Fewer parameters compared to fully connected networks

Require minimal preprocessing

Perform well in image-related tasks

Translation invariance due to pooling

7. Training a CNN
CNNs are trained using the same approach as other neural networks:

Forward Propagation: Input is passed through the layers to compute output.

Loss Calculation: Using functions like Cross-Entropy for classification.

Backpropagation: Compute gradients of loss with respect to weights.

Gradient Descent Optimization: Update weights using optimizers like SGD or Adam.

Techniques such as data augmentation, dropout, and batch normalization are often used to improve performance and reduce overfitting.

8. Applications of CNNs
CNNs have revolutionized computer vision and are widely used in:

Image classification (e.g., recognizing objects in images)

Object detection (e.g., identifying multiple objects with bounding boxes)

Facial recognition

Medical image analysis (e.g., tumor detection in MRIs or X-rays)

Self-driving cars (e.g., lane detection, pedestrian recognition)

Image segmentation

Style transfer and image generation

9. Variants and Enhancements
Several advanced CNN architectures have been proposed to improve performance:

LeNet: One of the earliest CNNs, used for digit recognition.

AlexNet: Popularized CNNs after winning ImageNet in 2012.

VGGNet: Used very deep networks with 3x3 convolutions.

GoogLeNet/Inception: Used parallel convolutional layers (Inception modules).

ResNet: Introduced skip connections (residual learning) to combat vanishing gradients and allow deeper networks.

These architectures differ in design but all follow the core principles of CNNs.

10. Limitations of CNNs
Despite their success, CNNs have certain limitations:

Lack of ability to model global relationships explicitly

Depend heavily on labeled data for supervised training

Require large amounts of computational power and memory

Not ideal for sequential data (where RNNs or Transformers are better suited)

Sensitive to adversarial attacks in security-critical applications

11. Recent Trends
Modern deep learning has seen CNNs being combined with or replaced by newer models:

Vision Transformers (ViTs): These use self-attention instead of convolutions to model images.

Hybrid Models: Combine CNNs with transformers or recurrent layers.

Self-supervised learning: Uses unlabelled data to pretrain CNNs before fine-tuning.

While CNNs remain foundational, they are now part of a broader ecosystem of computer vision models.

12. Conclusion
Convolutional Neural Networks are powerful tools for processing visual and spatial data. Their architecture, inspired by biological systems, efficiently captures local features while reducing computational complexity. From simple pattern recognition to complex object detection and scene understanding, CNNs are central to many modern AI applications. Understanding CNN theory is critical for working in fields like computer vision, medical imaging, and even audio and signal processing.
"""

In [20]:
res = chain.invoke({'text1':text1,'text2':text2})

In [21]:
print(res)

## Recurrent Neural Networks (RNNs): Simplified Notes and Quiz

### What are RNNs?

Imagine a computer that remembers what happened before in a story. That's like how RNNs work! They're designed to understand sequential data, which means things happening in order. Think about sentences, music, or even stock market movements.

### Why RNNs?

Traditional algorithms don't care about the order things happen. They work like this:
* **Sequence problem:** A movie doesn't make sense out of context. Imagine trying to watch it without knowing the previous details!
* **RNN answer:** They have "memory", storing how past things impact the present. This lets them understand the context of words/sounds/trends.

### How do they work? 

1. **Taking in input:** They treat each part of the sequence as a "step" and get initial information.
2. **Remembering past:** The model remembers the current input, but also how it's connected to previous information ("molecular chart"). 
3. **Generating output:** Thin

# The same text could be passed to both branches to create the notes and the quiz; I used two different texts here just to learn how it is done.
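A sketch of the single-input variant mentioned in the note above: if both prompt templates name the same variable (here a hypothetical `{text}`), one input dict can feed both branches. Plain `str.format` stands in for `PromptTemplate`, and `build_branch_prompts` is a hypothetical helper. In the notebook this would presumably mean replacing `text1` and `text2` in the templates and `input_variables` with a shared name and calling `chain.invoke({'text': text1})`.

```python
# Both templates reference the same placeholder, so one input serves both branches.
NOTES_TEMPLATE = (
    "Give me notes in simple words with real world examples "
    "on the following text \n {text}"
)
QUIZ_TEMPLATE = "Generate 5 question answers from the following text \n {text}"


def build_branch_prompts(text):
    """Render the notes and quiz prompts from a single shared input."""
    inputs = {"text": text}
    return {
        "notes": NOTES_TEMPLATE.format(**inputs),
        "quiz": QUIZ_TEMPLATE.format(**inputs),
    }


prompts = build_branch_prompts("RNNs process sequences step by step.")
for name, prompt in prompts.items():
    print(f"[{name}] {prompt}")
```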