# How LangChain Works

In [None]:
!pip install groq
!pip install autocorrect nltk



## Setting up GROQ_API_KEY and initializing the Groq client

**GROQ API key:** https://console.groq.com/keys


In [None]:
import os
from groq import Groq
from autocorrect import Speller
import nltk
import re
nltk.download('words')
from nltk.corpus import words

# Place Your API Key Here
os.environ["GROQ_API_KEY"] = "Place Your API Key Here"
client = Groq(api_key=os.environ.get("GROQ_API_KEY"))
print("Initialized Groq client.")

Initialized Groq client.


[nltk_data] Downloading package words to /root/nltk_data...
[nltk_data]   Package words is already up-to-date!


### Step 1: User Input – This is where everything starts.
The user provides a query.

In [None]:
user_query = " whatIS  FineTuning  ?? "
print(f"Original User Input: {user_query}")

Original User Input:  whatIS  FineTuning  ?? 


### Step 2: Preprocessing Chain – This prepares and transforms the user's input before passing it to the LLM.


In [None]:
def preprocess_query(query):
    spell = Speller(lang="en")
    english_words = set(words.words())

    # Step 1: Trim whitespace and lowercase
    query = query.strip().lower()

    # Step 2: Correct spelling for each word
    words_in_query = query.split()
    corrected_words = [spell(word) for word in words_in_query]
    corrected_query = " ".join(corrected_words)

    # Step 3: Detect and split combined words if they're not recognized English words
    def split_combined_words(text):
        tokens = text.split()
        split_tokens = []

        for token in tokens:
            if token not in english_words:
                for i in range(1, len(token)):
                    left, right = token[:i], token[i:]
                    if left in english_words and right in english_words:
                        split_tokens.extend([left, right])
                        break
                else:
                    split_tokens.append(token)
            else:
                split_tokens.append(token)

        return " ".join(split_tokens)

    corrected_query = split_combined_words(corrected_query)

    # Step 4: Clean any remaining extra spaces and adjust punctuation
    corrected_query = " ".join(corrected_query.split())
    corrected_query = corrected_query.replace(" ?", "?").replace(" !", "!").replace(" .", ".")
    return corrected_query

preprocessed_query = preprocess_query(user_query)
print(f"Preprocessed Query: {preprocessed_query}")

Preprocessed Query: what is fine tuning??
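
The preprocessed query above still ends in a doubled question mark. Below is a minimal sketch of one extra cleanup step, assuming repeated terminal punctuation should collapse to a single mark. `collapse_repeats` is a hypothetical helper, not part of the original pipeline.

```python
import re

def collapse_repeats(text):
    """Collapse repeated whitespace and repeated terminal punctuation."""
    # Runs of whitespace become a single space
    text = re.sub(r"\s+", " ", text).strip()
    # Runs of the same mark ("??", "!!") become one mark.
    # Note: this also turns an ellipsis "..." into ".", which may not be wanted.
    return re.sub(r"([?!.])\1+", r"\1", text)

print(collapse_repeats("what is fine tuning??"))
```

This could run as a final line inside `preprocess_query`, after the punctuation spacing is adjusted.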


### Step 3: LLM Query – The language model processes the question and searches for a relevant answer.
Create a chat completion request to the language model

In [None]:
chat_completion = client.chat.completions.create(
    messages=[
        {
            "role": "user",
            "content": preprocessed_query,
        }
    ],
    model="llama3-8b-8192",
)
print("Query sent to LLM.")

Query sent to LLM.
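
The `messages` argument in the request above is a list of role/content dictionaries. As a rough sketch, a small helper can build that list and optionally prepend a system instruction. `build_messages` and the wording of the system prompt are assumptions for illustration, not part of the notebook.

```python
def build_messages(user_query, system_prompt=None):
    """Build the role/content list passed as `messages` to a chat completion call."""
    messages = []
    if system_prompt:
        # An optional system message steers the model's overall behaviour
        messages.append({"role": "system", "content": system_prompt})
    messages.append({"role": "user", "content": user_query})
    return messages

messages = build_messages(
    "what is fine tuning?",
    system_prompt="Answer in three sentences or fewer.",
)
for message in messages:
    print(message["role"], "->", message["content"])
```

The resulting list could be passed as `messages=messages` to `client.chat.completions.create`.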


### Step 4: Postprocessing Chain – The answer is formatted by using methods such as filtering, summarization, and answer enhancement.

Extract and enhance the response from the LLM

In [None]:
response = chat_completion.choices[0].message.content
print("Raw Response from LLM received.")

def postprocess_response(response_text):
    response_text = response_text.strip()
    enhanced_response = response_text.replace("**", "")
    return enhanced_response

enhanced_response = postprocess_response(response)
print(f"Enhanced Response:\n{enhanced_response}")

Raw Response from LLM received.
Enhanced Response:
Fine-tuning refers to the process of making adjustments or adjustments to a system, model, or algorithm to achieve a specific outcome, goal, or performance. It's a common practice in various fields, including science, engineering, and machine learning.

In the context of machine learning, fine-tuning typically involves adjusting the parameters of a pre-trained model to fit a new, specific task or dataset. This can be done to improve the model's performance on the target task, adapt to a new domain or dataset, or achieve a better balance between different objectives.

Fine-tuning can be done in various ways, such as:

1. Parameter adjustment: Adjusting specific model parameters to better fit the new task or dataset.
2. Iteration: Re-training a model on a small dataset or a subset of the original dataset to adapt to the new task or domain.
3. Ensemble methods: Combining multiple models or averaging their predictions to improve overall pe
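
Step 4 mentions filtering and answer enhancement, while the cell above only strips `**`. Here is a sketch of a slightly broader cleanup, assuming plain text is the desired output. `strip_markdown` is a hypothetical helper that handles only bold markers, heading markers, and inline code marks.

```python
import re

def strip_markdown(text):
    """Remove a few common Markdown markers, keeping the text they wrap."""
    text = re.sub(r"\*\*(.+?)\*\*", r"\1", text)                 # **bold** -> bold
    text = re.sub(r"^#{1,6}\s*", "", text, flags=re.MULTILINE)   # "## Title" -> "Title"
    text = re.sub(r"`([^`]*)`", r"\1", text)                     # `code` -> code
    return text.strip()

print(strip_markdown("## Title\n**Fine-tuning** uses `weights`."))
```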

### Step 5: Output – The final result is presented to the user.

In [None]:
def generate_final_output(enhanced_response):
    final_output = f"Here's what I found:\n{enhanced_response}\n\nIf you have more questions, feel free to ask!"
    return final_output

# Generate Final Output
final_output = generate_final_output(enhanced_response)
print(f"Final Output:\n{final_output}")

Final Output:
Here's what I found:
Fine-tuning refers to the process of making adjustments or adjustments to a system, model, or algorithm to achieve a specific outcome, goal, or performance. It's a common practice in various fields, including science, engineering, and machine learning.

In the context of machine learning, fine-tuning typically involves adjusting the parameters of a pre-trained model to fit a new, specific task or dataset. This can be done to improve the model's performance on the target task, adapt to a new domain or dataset, or achieve a better balance between different objectives.

Fine-tuning can be done in various ways, such as:

1. Parameter adjustment: Adjusting specific model parameters to better fit the new task or dataset.
2. Iteration: Re-training a model on a small dataset or a subset of the original dataset to adapt to the new task or domain.
3. Ensemble methods: Combining multiple models or averaging their predictions to improve overall performance.
4. Hy

# LangChain Components

In [None]:
!pip install groq langchain langchain-groq


### Setting up GROQ_API_KEY and initializing the Groq client


**GROQ API key:** https://console.groq.com/keys


In [None]:
import os
from groq import Groq

os.environ["GROQ_API_KEY"] = "Place Your API Key Here"
client = Groq(api_key=os.environ.get("GROQ_API_KEY"))
print("Initialized Groq client with API key.")

Initialized Groq client with API key.


### 1. LLMs - Text Generation

In [None]:
# User-defined theme for the story
theme = "A Student learning the basics of LangChain"
print(f"User-defined theme: {theme}")

# Generate story using LLM
story_completion = client.chat.completions.create(
    messages=[
        {
            "role": "user",
            "content": f"Generate a creative story about {theme}.",
        }
    ],
    model="llama3-8b-8192",
)
print("Story generation request sent to the LLM.")

print("Generated Story:")
print(story_completion.choices[0].message.content)

User-defined theme: A Student learning the basics of LangChain
Story generation request sent to the LLM.
Generated Story:
**The Mysterious Box of LangChain**

It was a typical Friday morning for Emily, a student in her second year of college, huddled in the campus library's corner, surrounded by stacks of books and notes. She had spent the entire week stressing about the upcoming exams, and her professor, Dr. Patel, had just assigned a challenging project that required mastering the basics of LangChain, a revolutionary natural language processing (NLP) framework.

As Emily scrolled through her email, her eyes widened at the sight of an unmarked package with her name on it. Curious, she ripped open the box to find a sleek, futuristic device called the "LangChain Primer."

The device, about the size of a smartwatch, glowed with an eerie blue light. Suddenly, a holographic interface materialized before her, beckoning her to touch the screen. Emily hesitated for a moment before gingerly pl

### 2. Prompts - Simplifying Information

In [None]:
from langchain.prompts import PromptTemplate

# Defining a prompt template
template = """
I want you to act as a sports expert.
Can you list some popular games for the sport: {sport}?
"""

# Creating the prompt template
prompt = PromptTemplate(
    input_variables=["sport"],
    template=template,
)

# Format the prompt with a specific game related to sports
formatted_prompt = prompt.format(sport="cricket")

response = client.chat.completions.create(
    messages=[{"role": "user", "content": formatted_prompt}],
    model="llama3-8b-8192",
)

ai_response = response.choices[0].message.content
cleaned_response = ai_response.replace("**", "")

print("Model Response:")
print(cleaned_response)

Model Response:
As a sports expert, I'd be happy to help you with that! Cricket is a popular sport with a huge following globally. Here are some of the most popular formats of the game:

1. One-Day Internationals (ODIs): A limited-overs format where each team plays one innings, with a maximum of 50 overs (300 balls).
2. T20 Internationals (T20Is): A limited-overs format where each team plays one innings, with a maximum of 20 overs (120 balls).
3. Test Cricket: A longer format of the game, where teams play for several days, with two innings per team, and a match can last up to 5 days.
4. First-Class Cricket: Another long-format version of the game, played between teams at the domestic level, often with four-day matches.
5. List A Cricket: A limited-overs format, similar to ODIs, but played at the domestic level, often with a maximum of 50 overs per team.
6. Twenty20 Cup: A domestic T20 league in England and Wales, similar to the Indian Premier League (IPL) or Big Bash League (BBL).
7. C
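
Conceptually, formatting a prompt template fills named placeholders with values. The sketch below is a plain-Python stand-in, not LangChain's implementation; `template_variables` and `fill_template` are hypothetical names. It shows how the placeholders can be discovered and checked before formatting.

```python
from string import Formatter

def template_variables(template):
    """Return the placeholder names found in a format-style template."""
    return sorted({name for _, name, _, _ in Formatter().parse(template) if name})

def fill_template(template, **values):
    """Fill the template, raising a clear error if a placeholder has no value."""
    missing = [name for name in template_variables(template) if name not in values]
    if missing:
        raise KeyError(f"Missing template variables: {missing}")
    return template.format(**values)

template = (
    "I want you to act as a sports expert.\n"
    "Can you list some popular games for the sport: {sport}?"
)
print(template_variables(template))
print(fill_template(template, sport="cricket"))
```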

### 3. Chains - Data Processing Workflow

In [None]:
from langchain_groq import ChatGroq
from langchain_core.prompts import ChatPromptTemplate
from langchain.chains import LLMChain, SimpleSequentialChain

# LLMChain expects a LangChain chat model, not the raw Groq client
llm = ChatGroq(model="llama3-8b-8192")

# Step 1: Summarization Chain - summarize user input
summarization_prompt = ChatPromptTemplate.from_messages(
    [
        ("system", "You are an assistant that summarizes text."),
        ("human", "{input}"),
    ]
)
summarization_chain = LLMChain(llm=llm, prompt=summarization_prompt)

# Step 2: Sentiment Analysis Chain - analyze the sentiment of the summarized text
sentiment_prompt = ChatPromptTemplate.from_messages(
    [
        ("system", "You are an assistant that detects the sentiment of the following text."),
        ("human", "{input}"),
    ]
)
sentiment_chain = LLMChain(llm=llm, prompt=sentiment_prompt)

# Step 3: Response Generation Chain - generate a response based on sentiment
response_prompt = ChatPromptTemplate.from_messages(
    [
        ("system", "You are a friendly assistant that responds appropriately based on sentiment."),
        ("human", "{input}"),
    ]
)
response_chain = LLMChain(llm=llm, prompt=response_prompt)

# Combine all chains into a SimpleSequentialChain
combined_chain = SimpleSequentialChain(
    chains=[summarization_chain, sentiment_chain, response_chain], verbose=True
)

# Run the combined chain with initial user text input
user_text = "I can't believe how terrible my experience was. The service was prolonged. This is honestly the worst experience I've had, and I'm really disappointed."
final_response = combined_chain.run(user_text)

# Print the final response generated by the chain
print("Final Response to User:")
print(final_response)



> Entering new SimpleSequentialChain chain...
The user reports a terrible experience with a service. They describe it as prolonged and the worst they've ever had, expressing disappointment.
The sentiment of the text you provided appears to be negative. The user describes their experience with a service as terrible, prolonged, and the worst they've ever had. They also express disappointment, which further indicates a negative sentiment. I'm here to help with any questions or concerns you might have, so please let me know how I can assist you further.
I'm really sorry to hear about the negative experience you've had. I assure you that I will do my best to assist you and address your concerns. Could you please provide me with more details about the service that you found terrible and prolonged? This will help me better understand the situation and provide a more appropriate response. Thank you.

> Finished chain.
Fin
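
The idea behind a sequential chain is that each step's output becomes the next step's input. The sketch below is a plain-Python illustration, not LangChain's implementation. `run_sequential` is a hypothetical helper, and the three step functions are deterministic stubs standing in for LLM calls.

```python
def run_sequential(steps, initial_input, verbose=False):
    """Feed the output of each step into the next one and return the last output."""
    value = initial_input
    for step in steps:
        value = step(value)
        if verbose:
            print(f"[{step.__name__}] {value}")
    return value

# Stub steps standing in for LLM calls (hypothetical, for illustration only)
def summarize(text):
    # Keep only the first sentence
    return text.split(".")[0].strip() + "."

def detect_sentiment(summary):
    negative_words = {"terrible", "worst", "disappointed"}
    words = {word.strip(".,!").lower() for word in summary.split()}
    return "negative" if words & negative_words else "neutral"

def respond(sentiment):
    if sentiment == "negative":
        return "I'm sorry to hear that. How can I help?"
    return "Thanks for the feedback!"

result = run_sequential(
    [summarize, detect_sentiment, respond],
    "I can't believe how terrible my experience was. The service was prolonged.",
    verbose=True,
)
print(result)
```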

### 4. Agents - Decision-Making

In [None]:
# User query that needs a decision
user_query = "I'm having trouble with my order."
print(f"User query: {user_query}")

# Simulate agent decision-making
agent_response = client.chat.completions.create(
    messages=[
        {
            "role": "user",
            "content": f"User query: '{user_query}'. Should I respond or escalate this issue?",
        }
    ],
    model="llama3-8b-8192",
)
print("Request for agent decision sent to the LLM.")

# Print agent's decision
print("Agent's Decision:")
print(agent_response.choices[0].message.content)

User query: I'm having trouble with my order.
Request for agent decision sent to the LLM.
Agent's Decision:
A timely question!

If a customer is having trouble with their order, it's always best to respond promptly and professionally to address their concern. Many customers will simply want to resolve the issue quickly and easily, so a swift and helpful response can go a long way in keeping them satisfied and loyal.

Before escalating the issue, consider offering support and troubleshooting steps to help resolve the problem. This might include:

1. Asking the customer to provide more information about the issue, such as order details or a description of the problem they're experiencing.
2. Offering to resend or cancel the order, depending on the customer's needs.
3. Providing troubleshooting steps or explanations to help the customer resolve the issue themselves.

Some examples of responses you could send:

* "Sorry to hear you're having trouble with your order. Can you please provide 
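
The reply above is free-form text, which is hard to act on in code. One possible refinement, sketched under the assumption that the model is asked for a single keyword, is to constrain the prompt and parse the answer. `DECISION_PROMPT` and `parse_decision` are hypothetical names.

```python
import re

# Hypothetical prompt wording that asks for a machine-readable decision
DECISION_PROMPT = (
    "User query: '{query}'. "
    "Reply with exactly one word, RESPOND or ESCALATE, then a short reason."
)

def parse_decision(reply):
    """Map a free-text model reply onto 'respond', 'escalate', or 'unknown'."""
    match = re.search(r"\b(respond|escalate)\b", reply, flags=re.IGNORECASE)
    return match.group(1).lower() if match else "unknown"

print(DECISION_PROMPT.format(query="I'm having trouble with my order."))
print(parse_decision("RESPOND - ask the customer for order details first."))
print(parse_decision("A timely question!"))
```

An `unknown` result is a signal to retry or fall back to a default action rather than guess.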

### 5. Memory - Chat History Recall

In [None]:
class ChatMessageHistory:
    def __init__(self):
        self.messages = []

    def add_user_message(self, message):
        self.messages.append({"role": "user", "content": message})

    def add_ai_message(self, message):
        self.messages.append({"role": "assistant", "content": message})

history = ChatMessageHistory()

history.add_user_message("Hi!")
history.add_ai_message("What's up?")

# Print current messages in history
print("Current Messages in History:")
for msg in history.messages:
    print(f"{msg['role']}: {msg['content']}")

Current Messages in History:
user: Hi!
assistant: What's up?


In [None]:
def generate_response(history):
    response = client.chat.completions.create(
        messages=history.messages,
        model="llama3-8b-8192"
    )
    return response.choices[0].message.content

history.add_user_message("what is deep learning?")
ai_response = generate_response(history)
ai_response_cleaned = ai_response.replace("**", "")

In [None]:
history.add_ai_message(ai_response_cleaned)

print("\nUpdated Messages in History:")
for msg in history.messages:
    print(f"{msg['role']}: {msg['content']}")


Updated Messages in History:
user: Hi!
assistant: What's up?
user: what is deep learning?
assistant: Deep learning is a subfield of artificial intelligence (AI) that involves the use of neural networks with multiple layers to analyze and interpret data. The term "deep" refers to the number of layers in the network, which can be hundreds or even thousands.

In traditional machine learning, algorithms are trained to recognize features or patterns in data. For example, a machine learning model might be trained to recognize images of cats and dogs based on their shapes, colors, and textures. However, these models have limitations when it comes to analyzing complex data, such as images with multiple objects, sounds with multiple frequencies, or text with multiple contexts.

Deep learning models, on the other hand, are designed to learn complex patterns and relationships in data by iterating through multiple layers of neural networks. Each layer processes the data and extracts features that

### 5. Memory - Chat History Recall to find out the k value

In [None]:
class ChatMessageHistory:
    def __init__(self):
        self.messages = []

    def add_user_message(self, message):
        self.messages.append({"role": "user", "content": message})

    def add_ai_message(self, message):
        self.messages.append({"role": "assistant", "content": message})

    def calculate_token_length(self):
        # Counts characters, not model tokens: a rough proxy for how large the history has grown
        return sum(len(msg['content']) for msg in self.messages)

history = ChatMessageHistory()

history.add_user_message("Hi!")
history.add_ai_message("What's up?")

def generate_response(history):
    total_token_length = history.calculate_token_length()
    print(f"Total token length: {total_token_length}")
    response = client.chat.completions.create(
        messages=history.messages,
        model="llama3-8b-8192"
    )
    return response.choices[0].message.content

In [None]:
# Open-ended loop to keep accepting user input
interaction_count = 1
while True:
    user_input = input("You: ")
    history.add_user_message(user_input)
    try:
        ai_response = generate_response(history)
        history.add_ai_message(ai_response)
    except Exception as e:
        print("\nError occurred:", e)
        break

    print("\nCurrent Messages in History (Interaction {}):".format(interaction_count))
    for msg in history.messages:
        print(f"{msg['role']}: {msg['content']}")

    print("\nAI:", ai_response)

    # Check for hallucination indicators
    if "unrelated" in ai_response.lower() or "nonsense" in ai_response.lower():
        print("\nModel might be hallucinating. Stopping interaction.")
        break

    interaction_count += 1

You: hi
Total token length: 15

Current Messages in History (Interaction 1):
user: Hi!
assistant: What's up?
user: hi
assistant: Hi again! How's it going?

AI: Hi again! How's it going?
You: hello
Total token length: 45

Current Messages in History (Interaction 2):
user: Hi!
assistant: What's up?
user: hi
assistant: Hi again! How's it going?
user: hello
assistant: Hello again! It's nice to chat with you. Is there something on your mind that you'd like to talk about, or are you just saying hello?

AI: Hello again! It's nice to chat with you. Is there something on your mind that you'd like to talk about, or are you just saying hello?
You: do you know about india
Total token length: 201

Current Messages in History (Interaction 3):
user: Hi!
assistant: What's up?
user: hi
assistant: Hi again! How's it going?
user: hello
assistant: Hello again! It's nice to chat with you. Is there something on your mind that you'd like to talk about, or are you just saying hello?
user: do you know about in
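
The history above grows with every turn, so the request sent to the model keeps getting longer. A common remedy is a sliding window over the history. This sketch assumes `k` means the number of most recent messages to keep; `trim_history` is a hypothetical helper.

```python
def trim_history(messages, k):
    """Keep only the k most recent messages (k is an assumed window size)."""
    if k <= 0:
        return []
    return messages[-k:]

messages = [
    {"role": "user", "content": "Hi!"},
    {"role": "assistant", "content": "What's up?"},
    {"role": "user", "content": "hi"},
    {"role": "assistant", "content": "Hi again! How's it going?"},
    {"role": "user", "content": "hello"},
]

window = trim_history(messages, k=3)
for msg in window:
    print(f"{msg['role']}: {msg['content']}")
```

Passing `trim_history(history.messages, k)` instead of `history.messages` would bound the request size, at the cost of the model forgetting older turns.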

### 6. Retrieval - Simulating retrieval from an external source

In [None]:
!pip install -qU langchain-groq langchain-community arxiv

In [None]:
from langchain_community.retrievers import ArxivRetriever
from langchain_groq import ChatGroq
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnablePassthrough

# Initialize the Arxiv retriever to fetch relevant documents
retriever = ArxivRetriever(
    load_max_docs=2,
    # Abstracts only, matching the output below; True fetches full PDFs (needs PyMuPDF)
    get_full_documents=False,
)

# Fetch documents based on a query
query = "What is the transformers?"
docs = retriever.invoke(query)

# Display metadata and content of the retrieved document
print("Document Metadata:", docs[0].metadata)
print("Document Content Snippet:", docs[0].page_content[:1500])

Document Metadata: {'Entry ID': 'http://arxiv.org/abs/2408.08670v1', 'Published': datetime.date(2024, 8, 16), 'Title': 'Adaptive Layer Selection for Efficient Vision Transformer Fine-Tuning', 'Authors': 'Alessio Devoto, Federico Alvetreti, Jary Pomponi, Paolo Di Lorenzo, Pasquale Minervini, Simone Scardapane'}
Document Content Snippet: Recently, foundation models based on Vision Transformers (ViTs) have become
widely available. However, their fine-tuning process is highly
resource-intensive, and it hinders their adoption in several edge or low-energy
applications. To this end, in this paper we introduce an efficient fine-tuning
method for ViTs called $\textbf{ALaST}$ ($\textit{Adaptive Layer Selection
Fine-Tuning for Vision Transformers}$) to speed up the fine-tuning process
while reducing computational cost, memory load, and training time. Our approach
is based on the observation that not all layers are equally critical during
fine-tuning, and their importance varies depending on the 

In [None]:
# Initialize the ChatGroq language model
llm = ChatGroq(model="llama3-8b-8192")

# Define the prompt template for the question-answering task
prompt = ChatPromptTemplate.from_template(
    """Answer the question based only on the context provided.

Context: {context}

Question: {question}"""
)

# Function to format the retrieved documents for the model
def format_docs(docs):
    return "\n\n".join(doc.page_content for doc in docs)

# Create the chain that includes retrieving, formatting, and generating a response
chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)

response = chain.invoke(query)

# Remove asterisks from the AI response
cleaned_response = response.replace("**", "")
print("Response:", cleaned_response)

Response: Based on the provided context, the transformers referred to are:

1. Vision Transformers (ViTs): These are foundation models used in computer vision tasks.
2. Probabilistic Transformers: These are small neural networks used for data representations in an arbitrary metric space.
3. Adaptive Layer Selection Transformers (ALaST): This is an efficient fine-tuning method for Vision Transformers that adaptively allocates computational resources to layers.
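
The first steps of the chain above retrieve documents, join them into one context string, and fill the prompt. The sketch below mirrors that flow in plain Python without any network calls. `Doc`, `fake_retrieve`, and `build_prompt` are hypothetical stand-ins, and the document texts are made up for illustration.

```python
from dataclasses import dataclass

@dataclass
class Doc:
    """Minimal stand-in for a retrieved document with a page_content field."""
    page_content: str

PROMPT = """Answer the question based only on the context provided.

Context: {context}

Question: {question}"""

def format_docs(docs):
    # Join document bodies with blank lines, as in the cell above
    return "\n\n".join(doc.page_content for doc in docs)

def fake_retrieve(query):
    # Hypothetical stand-in for retriever.invoke(query)
    return [
        Doc("Vision Transformers are foundation models."),
        Doc("Fine-tuning them is resource-intensive."),
    ]

def build_prompt(question, retrieve):
    # Retrieve and format the context, then fill both placeholders
    context = format_docs(retrieve(question))
    return PROMPT.format(context=context, question=question)

final_prompt = build_prompt("What is the transformers?", fake_retrieve)
print(final_prompt)
```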


### 7. Indexing - Document Search

In [None]:
# Simulate indexing documents related to deep learning
documents = {
    "doc1": "This document explains the basics of deep learning and neural networks.",
    "doc2": "This document covers popular deep learning frameworks like TensorFlow and PyTorch.",
    "doc3": "This document discusses the applications of deep learning in image recognition and natural language processing."
}
print("Documents indexed for search on deep learning.")

# User search query
search_query = "deep learning"
print(f"User search query: {search_query}")

# Search through indexed documents
search_response = client.chat.completions.create(
    messages=[
        {
            "role": "user",
            "content": f"Search for documents containing: '{search_query}'.",
        }
    ],
    model="llama3-8b-8192",
)
print("Request for document search sent to the LLM.")

# Print search results
print("Search Results:")
print(search_response.choices[0].message.content)

Documents indexed for search on deep learning.
User search query: deep learning
Request for document search sent to the LLM.
Search Results:
I've searched for documents containing the phrase "deep learning". Here are some results:

1. **Deep Learning** by Ian Goodfellow, Yoshua Bengio, and Aaron Courville (2016) - This is a comprehensive textbook on deep learning, widely considered a seminal work in the field.

Source: Book excerpt on GitHub
Text: "Deep learning is a subfield of machine learning that employs neural networks with multiple layers to learn hierarchical representations of data".

2. **Deep Learning for Computer Vision with Python** by Adrian Rosebrock (2017) - This is a book that covers the application of deep learning to computer vision tasks using Python.

Source: Book excerpt on GitHub
Text: "In this chapter, we'll explore the basics of deep learning and how it can be used to create advanced computer vision systems".

3. Research paper: **Deep Learning to Classify Invas
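
Note that the request in the cell above never includes the `documents` dictionary, so the results printed come from the model rather than from the three indexed documents. A minimal sketch of searching those documents locally with an inverted index follows. `tokenize`, `build_index`, and `search` are hypothetical helpers, and matching is exact-word only.

```python
import re
from collections import defaultdict

documents = {
    "doc1": "This document explains the basics of deep learning and neural networks.",
    "doc2": "This document covers popular deep learning frameworks like TensorFlow and PyTorch.",
    "doc3": "This document discusses the applications of deep learning in image recognition and natural language processing.",
}

def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_index(docs):
    """Map each word to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in tokenize(text):
            index[word].add(doc_id)
    return index

def search(index, query):
    """Return the ids of documents containing every word in the query."""
    words = tokenize(query)
    if not words:
        return []
    matches = set.intersection(*(index.get(word, set()) for word in words))
    return sorted(matches)

index = build_index(documents)
print(search(index, "deep learning"))
print(search(index, "PyTorch"))
```

The matching documents could then be placed in the prompt so the model answers from them rather than from its own recall.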

# Multi-Agent Framework in LangChain








**Personalized News Aggregator**

**News API Key** - https://newsapi.org/

In [None]:
!pip install groq feedparser
!pip install langchain groq newsapi-python


In [None]:
import os
import requests
from groq import Groq
from newsapi import NewsApiClient
os.environ["GROQ_API_KEY"] = "Place Your API Key Here"
os.environ["NEWS_API_KEY"] = "Place Your News API Key Here"

### News Retriever Agent: Responsible for fetching news articles based on user queries


**Explanation of How Agents Work in the Personalized News Aggregator:**

1. **NewsRetrieverAgent:** This agent fetches news articles based on the user’s input. It takes the search query, retrieves relevant articles from the News API, and filters them based on the keywords.

2. **UserSelectionAgent:** This agent presents the list of retrieved articles to the user and allows them to select one for reading. It manages user input and ensures valid selection.

3. **SummarizationAgent:** This agent takes the content of the selected article and summarizes it using llama3. It ensures the content is suitable for summarization and processes the request to generate a concise summary.

4. **Presentation step:** The main pipeline formats and prints the selected article's details and its summary.

In [None]:
class NewsRetrieverAgent:
    def __init__(self, api_key):
        # Initialize the News API client with the provided API key
        self.newsapi = NewsApiClient(api_key=api_key)

    def get_news(self, query_tokens, language='en', page_size=5):
        try:
            # Fetch articles based on the provided tokens
            all_articles = self.newsapi.get_everything(
                q=' '.join(query_tokens),
                language=language,
                page_size=100,
                sort_by="publishedAt"
            )

            # Check if the API response is successful
            if all_articles['status'] != 'ok':
                print(f"Error fetching articles: {all_articles['message']}")
                return []

            # Filter articles that contain all tokens in the title or description
            # (either field may be missing or None, so fall back to an empty string)
            relevant_articles = [
                article for article in all_articles['articles']
                if all(
                    token.lower() in (article.get('title') or '').lower()
                    or token.lower() in (article.get('description') or '').lower()
                    for token in query_tokens
                )
            ]
            return relevant_articles[:page_size]

        except Exception as e:
            print(f"An error occurred while retrieving news: {e}")
            return []
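
The filtering rule in `get_news` can be tried without an API key. This sketch applies the same all-tokens check to made-up sample data shaped like NewsAPI article dictionaries, including entries whose title or description is `None`. `matches_all_tokens` and the sample articles are hypothetical.

```python
def matches_all_tokens(article, query_tokens):
    """True when every token appears in the article's title or description."""
    title = (article.get("title") or "").lower()
    description = (article.get("description") or "").lower()
    return all(
        token.lower() in title or token.lower() in description
        for token in query_tokens
    )

# Hypothetical sample data, for illustration only
sample_articles = [
    {"title": "Tesla unveils new Cybercab", "description": "Elon Musk presents the robotaxi."},
    {"title": "Elon Musk on rockets", "description": None},
    {"title": None, "description": "Musk and Elon mentioned here."},
    {"title": "Weather update", "description": "Rain expected."},
]

tokens = "elon musk".split()
results = [matches_all_tokens(article, tokens) for article in sample_articles]
for article, matched in zip(sample_articles, results):
    print(article["title"], "->", matched)
```

Because the check is substring-based, a token such as `car` also matches a word like "Cartman", which is a plausible reason an unrelated headline can appear in the results for a query containing `car`.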

### User Selection Agent: Allows users to select an article from the retrieved articles

In [None]:
class UserSelectionAgent:
    @staticmethod
    def select_article(articles):
        if not articles:
            print("No articles available to select.")
            return None

        print("\nHere are the latest articles:")
        for i, article in enumerate(articles):
            print(f"{i + 1}. {article['title']}")

        while True:
            try:
                # Prompt user to select an article by entering its number
                selection = int(input("\nSelect the number of the article you want to read: ")) - 1
                if 0 <= selection < len(articles):
                    return articles[selection]  # Return the selected article
                else:
                    print("Invalid selection, please try again.")
            except ValueError:
                print("Please enter a valid number.")

### Summarization Agent: Summarizes the content of the selected article

In [None]:
class SummarizationAgent:
    def __init__(self):
        self.client = Groq(api_key=os.environ.get("GROQ_API_KEY"))

    def summarize(self, text):
        # Check if the content is missing or too short to summarize
        if not text or len(text) < 100:
            return "The content is too short for summarization."
        # Create a chat completion request for summarization
        chat_completion = self.client.chat.completions.create(
            messages=[
                {
                    "role": "user",
                    "content": f"Summarize the following news article:\n\n{text}\n\nSummary:",
                }
            ],
            model="llama3-8b-8192",
        )
        return chat_completion.choices[0].message.content.strip()

### Main Pipeline: Coordinates the entire process

In [None]:
def main():
    # Step 1: Instantiate the NewsRetrieverAgent
    news_retriever = NewsRetrieverAgent(api_key=os.environ["NEWS_API_KEY"])

    # Step 2: Get user input and tokenize
    query = input("Enter a topic or person (e.g., 'Elon Musk') to search for news: ")
    query_tokens = query.split()

    # Step 3: Retrieve news articles related to the tokens
    articles = news_retriever.get_news(query_tokens=query_tokens)
    if not articles:
        print(f"No news found for {query}. Please try again.")
        return

    # Step 4: Instantiate the UserSelectionAgent and let the user pick an article
    user_selection_agent = UserSelectionAgent()
    selected_article = user_selection_agent.select_article(articles)
    if selected_article is None:
        print("No article selected. Exiting.")
        return

    print("\nSelected Article Details:")
    print(f"Title: {selected_article['title']}")
    print(f"Source: {selected_article['source']['name']}")
    print(f"Published At: {selected_article['publishedAt']}")
    print(f"Description: {selected_article['description']}")
    print(f"Content: {selected_article['content']}")

    # Step 5: Instantiate the SummarizationAgent and summarize the selected article
    summarizer = SummarizationAgent()

    # Summarize the article content
    summary = summarizer.summarize(selected_article['content'])
    print("\nSummary:", summary)

In [None]:
if __name__ == "__main__":
    main()

Enter a topic or person (e.g., 'Elon Musk') to search for news: elon musk new car

Here are the latest articles:
1. Elon Musk’s New Robot Venture Is Essentially Cartman’s AWESOM-O Prank
2. Jim Cramer Advises Against Buying Tesla Stock, So I Guess It's Going Up Forever
3. Elon Musk’s Tesla Moves $760,000,000 in Bitcoin to New Wallets for First Time in Two Years: Arkham Intelligence
4. Is Elon Musk Dumping Bitcoin? Tesla Moves Entire $765 Million BTC Stash To Unknown Wallets
5. Here's What We Think of Tesla's New Cybercab

Select the number of the article you want to read: 5

Selected Article Details:
Title: Here's What We Think of Tesla's New Cybercab
Source: Biztoc.com
Published At: 2024-10-14T15:11:25Z
Description: In this bonus episode of the Elon, Inc. podcast, we discuss Tesla's unveiling of the long-awaited robotaxi, and why it may be even more challenging to put a safe self-driving car on the road than it is to send rockets to space. #musk #tesla #cybercab #robotax…
Content: In t

In [None]:
if __name__ == "__main__":
    main()

Enter a topic or person (e.g., 'Elon Musk') to search for news: Baba Siddique

Here are the latest articles:
1. Baba Siddique murder case: Another suspect held for funding shooters
2. Cops Make 4th Arrest In Baba Siddique Murder Case, Catch Accused From UP
3. The Lawrence of Sabarmati jail: Bishnoi's name echoes from Mumbai to Canada
4. Baba Siddique: What we know so far about public murder of an Indian politician - BBC.com
5. 'Zulmi' Zeeshan: What Cop Records Show On Baba Siddique Shooters' Handler

Select the number of the article you want to read: 1

Selected Article Details:
Title: Baba Siddique murder case: Another suspect held for funding shooters
Source: Business Standard
Published At: 2024-10-15T10:09:18Z
Description: The Mumbai police have arrested one more person in connection with the killing of NCP leader and former Maharashtra minister Baba Siddique, officials said on Tuesday.
The accused, identified as Harishkumar Balakram (23), hailing from Bahraich in Uttar Pradesh…
Con

In [None]:
if __name__ == "__main__":
    main()

Enter a topic or person (e.g., 'Elon Musk') to search for news: tata's heir 

Here are the latest articles:
1. Tata Trusts meeting today: Decision about successor on agenda, Noel Tata most likely heir to Ratan Tata, say reports

Select the number of the article you want to read: 1

Selected Article Details:
Title: Tata Trusts meeting today: Decision about successor on agenda, Noel Tata most likely heir to Ratan Tata, say reports
Source: Livemint
Published At: 2024-10-11T07:28:36Z
Description: The Tata Trusts board is meeting today to select Ratan Tata's successor — a pivotal role overseeing a $165 billion empire, with potential contenders including Noel Tata, Mehli Mistry, and Darius Khambata.
Content: Ahead of the Tata Trust meeting today on October 11, the late Ratan Tata's half-brother Noel Tata (67) has emerged as the front-runner to be the successor.
Bloomberg reported, citing sources in the … [+2892 chars]

Summary:
Ahead of a Tata Trust meeting on October 11, Noel Tata, the 67-y

# AI Automations by LangChain



## Interview Preparation Assistant

**Job Info -** [Job Details](https://www.naukri.com/job-listings-principal-data-scientist-senior-data-scientist-data-scientist-nlp-benovymed-healthcare-noida-gurugram-delhi-ncr-0-to-5-years-151024914631?src=drecomm_mightlike&sid=17291039228452039&xp=1&px=1)

User Inputs Job Description -> Analyze Job Description -> Generate Interview Questions -> Extract Questions -> Conduct Interview Process (Ask Questions) -> Provide Feedback -> Display Overall Score and Improvement Areas


**How AI Automation Works:**

**Chaining Tasks:** Tasks are chained so that the output of one step serves as the input for the next.

**Human-Like Decision-Making:** The program simulates an interview by asking questions, collecting the user's answers, and giving feedback tailored to each response.

**External Data Interaction:** Each step calls Groq's API, which provides the language model used for text generation and analysis.
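
The chaining idea can be sketched without any LLM call. This is only an illustration: `run_chain`, `analyze`, and `generate_questions` are hypothetical stand-ins, not functions from this notebook or from LangChain.

```python
from typing import Callable, List

def run_chain(steps: List[Callable[[str], str]], initial_input: str) -> str:
    """Feed the output of each step into the next one."""
    value = initial_input
    for step in steps:
        value = step(value)
    return value

# Stand-in steps (hypothetical); in the notebook each step would call the LLM.
def analyze(job_description: str) -> str:
    return f"skills extracted from: {job_description}"

def generate_questions(analysis: str) -> str:
    return f"questions based on [{analysis}]"

result = run_chain([analyze, generate_questions], "Data Scientist, NLP")
print(result)
```

Because every step takes a string and returns a string, steps can be reordered or swapped for real LLM calls without changing `run_chain`.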

In [None]:
!pip install langchain groq

Collecting langchain
  Downloading langchain-0.3.4-py3-none-any.whl.metadata (7.1 kB)
Collecting groq
  Downloading groq-0.11.0-py3-none-any.whl.metadata (13 kB)
Collecting langchain-core<0.4.0,>=0.3.12 (from langchain)
  Downloading langchain_core-0.3.12-py3-none-any.whl.metadata (6.3 kB)
Collecting langchain-text-splitters<0.4.0,>=0.3.0 (from langchain)
  Downloading langchain_text_splitters-0.3.0-py3-none-any.whl.metadata (2.3 kB)
Collecting langsmith<0.2.0,>=0.1.17 (from langchain)
  Downloading langsmith-0.1.136-py3-none-any.whl.metadata (13 kB)
Collecting httpx<1,>=0.23.0 (from groq)
  Downloading httpx-0.27.2-py3-none-any.whl.metadata (7.1 kB)
Collecting httpcore==1.* (from httpx<1,>=0.23.0->groq)
  Downloading httpcore-1.0.6-py3-none-any.whl.metadata (21 kB)
Collecting h11<0.15,>=0.13 (from httpcore==1.*->httpx<1,>=0.23.0->groq)
  Downloading h11-0.14.0-py3-none-any.whl.metadata (8.2 kB)
Collecting jsonpatch<2.0,>=1.33 (from langchain-core<0.4.0,>=0.3.12->langchain)
  Downloadi

### Setting up GROQ_API_KEY and Initializing the Groq client


In [None]:
import os
import re
from langchain.prompts import PromptTemplate
from groq import Groq

# Set the GROQ_API_KEY directly in the environment
os.environ["GROQ_API_KEY"] = "Place Your API Key Here"

# Initialize the Groq client for LLM processing
client = Groq(api_key=os.environ.get("GROQ_API_KEY"))

### Automated Interview Preparation Assistant Prompts and Utilities

In [None]:
# Step 1: Define the prompt for Job Description Analyzer Agent
# This agent extracts key responsibilities, skills, and qualifications from the provided job description.

job_description_analyzer_template = """
Given the following job description, extract the key responsibilities, skills, and qualifications required for the job.
Job Description: {job_description}
"""
job_description_analyzer_prompt = PromptTemplate(
    template=job_description_analyzer_template, input_variables=["job_description"]
)

# Step 2: Define the Interview Question Generator Prompt
# This prompt generates interview questions based on the analyzed job description, role, and difficulty level.

interview_question_generator_template = """
Based on the extracted responsibilities, skills, and qualifications for a {role} role at a {difficulty} level, generate a list of five interview questions.
"""
interview_question_prompt = PromptTemplate(
    template=interview_question_generator_template, input_variables=["role", "difficulty"]
)

# Step 3: Feedback Agent Prompt for scoring and feedback
# This agent provides feedback on a user's mock interview answer, focusing on strengths, weaknesses, and score.

feedback_agent_template = """
Here is a mock interview answer: {answer}
Provide feedback on this answer, focusing on strengths, weaknesses, and improvements. Assign a score between 0 and 10.
"""
feedback_agent_prompt = PromptTemplate(
    template=feedback_agent_template, input_variables=["answer"]
)

# Step 4: Function to extract individual questions from the generated text
# The regex keeps the text that follows each "**Question N:**" label, up to the next blank line.
# Note: the prompt in Step 2 does not request this format, so the result is an empty list
# whenever the model formats its output differently.

def extract_questions(generated_text):
    # \d+ (rather than \d) also matches question numbers above 9
    questions = re.findall(r'\*\*Question \d+:\*\* (.*?)(?=\n\n|\Z)', generated_text, re.DOTALL)
    return [question.strip() for question in questions]
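
To see which output format the extraction step expects, here is a small check on a hand-written sample. The sample text is made up for illustration and is not real model output; the pattern mirrors `extract_questions`, using `\d+` for the question number.

```python
import re

def extract_questions(generated_text):
    # Text after each "**Question N:**" label, up to the next blank line or end of text
    pattern = r'\*\*Question \d+:\*\* (.*?)(?=\n\n|\Z)'
    return [q.strip() for q in re.findall(pattern, generated_text, re.DOTALL)]

# Hypothetical model output, written by hand for illustration
sample_output = (
    "Here are some questions:\n\n"
    "**Question 1:** What is overfitting?\n\n"
    "**Question 2:** How do you handle missing values?\n"
    "This tests data-cleaning basics.\n\n"
    "3. What is a confusion matrix?"
)

questions = extract_questions(sample_output)
for q in questions:
    print(repr(q))
```

Two things are worth noticing in this sample: the third item is dropped because it lacks the `**Question N:**` label, and the explanation line directly under Question 2 is kept because only a blank line ends a match.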

### Step 5: Chained Workflow for Interview Preparation

1. **Analyze Job Description:** The job description is sent to the LLM through the Groq API, and the model extracts the key responsibilities, skills, and qualifications.

2. **Generate Interview Questions:** Based on the extracted information, interview questions are generated.

3. **Extract Questions:** The generated questions are extracted and displayed to the user.

In [None]:
def interview_preparation_workflow(job_description, role, difficulty):
    # 1. Analyze the job description
    prompt = job_description_analyzer_prompt.format(job_description=job_description)
    response = client.chat.completions.create(
        messages=[{"role": "user", "content": prompt}],
        model="llama3-8b-8192"
    )
    analyzed_info = response.choices[0].message.content
    print("Analyzed Job Description: \n", analyzed_info)

    # 2. Generate interview questions, passing the analysis from step 1 as context
    #    so that the two steps are actually chained
    question_prompt = (
        interview_question_prompt.format(role=role, difficulty=difficulty)
        + "\nExtracted information:\n"
        + analyzed_info
    )
    response = client.chat.completions.create(
        messages=[{"role": "user", "content": question_prompt}],
        model="llama3-8b-8192"
    )
    generated_questions = response.choices[0].message.content

    # 3. Extract individual questions from the generated response
    questions = extract_questions(generated_questions)
    if questions:
        print("Generated Interview Questions:")
        for i, question in enumerate(questions, start=1):
            print(f"Question {i}: {question}")
    else:
        print("No questions could be extracted from the model's response.")
    return questions


### Step 6: User Interaction for the Interview Preparation

`conduct_interview` manages the interview loop: it asks each question, collects the user's answer, requests feedback from the LLM, and computes an overall score.

In [None]:
def conduct_interview(questions):
    overall_score = 0
    asked_questions = 0
    total_questions = len(questions)
    improvement_areas = []
    # Display total questions available before starting the interview
    print(f"Total Questions Available: {total_questions}")

    for question_index, question in enumerate(questions):
        asked_questions += 1
        print(f"\nQuestion {question_index + 1}: {question.strip()}")
        knows_answer = input("Do you know the answer to this question? (yes/no): ").strip().lower()
        if knows_answer == "yes":
            # Ask the user to enter their answer
            answer = input("Please enter your answer: ")

            # Ask the LLM for feedback, using the feedback prompt defined in Step 3
            response = client.chat.completions.create(
                messages=[{"role": "user", "content": feedback_agent_prompt.format(answer=answer)}],
                model="llama3-8b-8192"
            )
            feedback = response.choices[0].message.content
            print(f"Feedback on your answer: \n{feedback}")

            # Take the first number after "Score" (e.g. "Score: 7" or "**Score:** 7/10"); default to 0
            score_match = re.search(r'score\W*(\d+(?:\.\d+)?)', feedback, re.IGNORECASE)
            overall_score += float(score_match.group(1)) if score_match else 0
            if "improvement" in feedback.lower():
                improvement_areas.append(feedback)
        else:
            print("Skipping this question...")

        # Ask if they want another question
        another_question = input("Do you want another question? (yes/no): ").strip().lower()
        if another_question != "yes":
            break

    # Average over the questions actually asked (skipped ones count as 0),
    # so stopping early does not drag the score down
    average_score = overall_score / asked_questions if asked_questions else 0
    print(f"\nOverall Score: {average_score:.2f}/10")
    if improvement_areas:
        print("\nAreas to Improve:")
        for improvement in improvement_areas:
            print(f"- {improvement}")
    else:
        print("No major areas of improvement identified. Keep practicing!")
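
The score extraction in `conduct_interview` depends on how the model phrases its score, which the feedback prompt does not pin down. A minimal sketch of that step as a standalone helper, so it can be checked without calling the LLM; `parse_score` and the sample strings are hypothetical and not part of the original notebook.

```python
import re
from typing import Optional

def parse_score(feedback: str) -> Optional[float]:
    """Return the first number that follows the word 'Score', or None if there is none."""
    # \W* skips punctuation such as ':', '*' and spaces between the label and the number
    match = re.search(r'score\W*(\d+(?:\.\d+)?)', feedback, re.IGNORECASE)
    return float(match.group(1)) if match else None

# Hand-written feedback snippets (hypothetical), covering a few plausible formats
samples = [
    "Score: 7",
    "**Score:** 6.5/10",
    "Overall a solid answer with no explicit rating.",
]
for text in samples:
    print(repr(text), "->", parse_score(text))
```

Returning `None` rather than 0 lets the caller tell "no score found" apart from a genuine score of zero. Asking the model for a fixed line such as `Score: <number>` in the feedback prompt would make the parsing more dependable.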

### Step 7: Get user input for job description, role, and difficulty

In [None]:
job_description = input("Please enter the job description: ")
role = input("Please enter the job title/role based on the job description: ")
difficulty = input("Please select the difficulty level (easy, medium, hard): ").strip().lower()

Please enter the job description: We are looking for passionate, Strong experience with an entrepreneur mindset to join us as Hardcore Full Stack Data Scientist end-to-end single-handed multi-hat role who is already working in applied AI in ML, deep Learning, ANN, and CNN platform specifically working as single-handed doers or in a small Data Science Team with full end to end ownership of Data Science work commitment delivery as Data Scientist working in any reputed AI Data Science is driven Tech startup preferably in Healthcare.
Please enter the job title/role based on the job description: Data Scientist
Please select the difficulty level (easy, medium, hard): easy


In [None]:
questions = interview_preparation_workflow(job_description, role, difficulty)

Analyzed Job Description: 
 Here are the key responsibilities, skills, and qualifications required for the job:

**Key Responsibilities:**

* Working as a single-handed Data Scientist, with full end-to-end ownership of Data Science work and commitment to delivery
* Developing and implementing end-to-end data science solutions using applied AI, ML, deep learning, ANN, and CNN platforms
* Driving data science projects in a small team or as a single-handed doer
* Providing data-driven insights and recommendations to improve business outcomes

**Skills:**

* Strong experience with deep learning and AI platforms
* Expertise in ML, ANN, and CNN algorithms
* Proficiency in programming languages such as Python
* Strong data science skills, including data preprocessing, feature engineering, and model development
* Experience working with large datasets and building data-driven products
* Excellent problem-solving and analytical skills
* Ability to work independently and deliver end-to-end owner

In [None]:
conduct_interview(questions)

Total Questions Available: 5

Question 1: Can you explain the concept of data preprocessing and provide an example of how you would handle missing values in a dataset?
Do you know the answer to this question? (yes/no): yes
Please enter your answer: Data preprocessing involves cleaning and transforming raw data into a usable format, ensuring accuracy and consistency. For example, I have used normalization to scale numerical features to a common range, enhancing model performance.
Feedback on your answer: 
Here's the feedback on the mock interview answer:

**Strengths:** (6/10)

* The answer starts by defining what data preprocessing is, which is important for non-technical interviewers.
* The example provided is specific and related to data preprocessing.
* The candidate mentions the benefits of normalization, such as enhancing model performance.

**Weaknesses:** (4/10)

* The answer is quite generic and doesn't provide any unique insights or personal experiences.
* The candidate doesn'

## Tweet summarizer

In [None]:
!pip install langchain-groq textblob

Collecting langchain-groq
  Downloading langchain_groq-0.2.0-py3-none-any.whl.metadata (2.9 kB)
Downloading langchain_groq-0.2.0-py3-none-any.whl (14 kB)
Installing collected packages: langchain-groq
Successfully installed langchain-groq-0.2.0


In [None]:
from langchain_groq import ChatGroq
from langchain.prompts import PromptTemplate
from textblob import TextBlob
from langchain.chains import LLMChain

# Simulated list of tweets (replace this with actual Twitter API fetching logic)
tweets = [
    "Just attended an amazing conference on AI!",
    "The weather is terrible today, can't believe it's raining.",
    "I love learning about new technologies, especially AI and ML!",
    "Had a great time at the park with family. Nature is the best!",
    "Struggling with a project deadline. Need some motivation!",
]

# Step 1: Set up Groq LLM
os.environ["GROQ_API_KEY"] = "Place Your API Key Here"
groq = ChatGroq(api_key=os.environ.get("GROQ_API_KEY"))

# Step 2: Create a prompt template for summarization
prompt_template = PromptTemplate(
    input_variables=["tweet"],
    template="Summarize the following tweet: {tweet}"
)

# Step 3: Create an LLMChain for summarization
summary_chain = LLMChain(llm=groq, prompt=prompt_template)

# Step 4: Function for sentiment analysis
def analyze_sentiment(tweet):
    analysis = TextBlob(tweet)
    return analysis.sentiment.polarity

# Step 5: Process each tweet, summarize, and analyze sentiment
summarized_tweets = []
alerts = []

for tweet in tweets:
    summary = summary_chain.run(tweet)
    summarized_tweets.append(summary)
    sentiment_score = analyze_sentiment(tweet)

    if sentiment_score < 0:
        alerts.append(f"Alert: The tweet is negative: '{tweet}'")
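
The alerting rule above flags any tweet whose polarity is below zero (TextBlob polarity ranges from -1 to 1). A minimal sketch of the same rule with the scorer passed in as a parameter, so the threshold logic can be tried without TextBlob; `collect_alerts` and `toy_polarity` are hypothetical, and the word lists are a crude stand-in for a real sentiment model.

```python
from typing import Callable, List

def collect_alerts(tweets: List[str], score_fn: Callable[[str], float],
                   threshold: float = 0.0) -> List[str]:
    """Return an alert message for every tweet whose score is below the threshold."""
    return [
        f"Alert: The tweet is negative: '{tweet}'"
        for tweet in tweets
        if score_fn(tweet) < threshold
    ]

# Toy scorer (hypothetical stand-in for TextBlob polarity): counts a few hand-picked words
NEGATIVE = {"terrible", "struggling"}
POSITIVE = {"amazing", "love", "great"}

def toy_polarity(text: str) -> float:
    tokens = [t.strip(".,!'").lower() for t in text.split()]
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    return score / max(len(tokens), 1)

sample_tweets = [
    "Just attended an amazing conference on AI!",
    "The weather is terrible today, can't believe it's raining.",
]
sample_alerts = collect_alerts(sample_tweets, toy_polarity)
print(sample_alerts)
```

Lowering `threshold` (for example to -0.3) would flag only strongly negative tweets.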

In [None]:
for original, summarized in zip(tweets, summarized_tweets):
    print(f"Original Tweet: {original}")
    print(f"Summarized: {summarized}\n")

if alerts:
    for alert in alerts:
        print(alert)

Original Tweet: Just attended an amazing conference on AI!
Summarized: The person who tweeted has recently attended a conference focused on artificial intelligence (AI) and found it to be impressive or exciting. They might have learned about new developments in AI, heard from experts in the field, or networked with other professionals interested in AI. The tweet could also be used to express their enthusiasm for the potential of AI and its impact on various industries.

Original Tweet: The weather is terrible today, can't believe it's raining.
Summarized: The person expresses their disbelief that it's raining and describes the weather as terrible.

Original Tweet: I love learning about new technologies, especially AI and ML!
Summarized: The tweet expresses the author's enthusiasm for learning about new technological developments, specifically focusing on artificial intelligence (AI) and machine learning (ML).

Original Tweet: Had a great time at the park with family. Nature is the best

# RAG Using LangChain

### Research Paper Q&A using RAG for LangChain



```
Document Loading and Preprocessing -> Embedding Creation and Vector Storage -> Retrieval Mechanism Setup -> Response Generation Using RAG
```

Papers:

1. Paper 1: [An Effective Query System Using LLMs and LangChain](https://www.ijert.org/an-effective-query-system-using-llms-and-langchain)

2. Paper 2: [Creating Large Language Model Applications Utilizing LangChain: A Primer on Developing LLM Apps Fast](https://www.researchgate.net/publication/372669736_Creating_Large_Language_Model_Applications_Utilizing_LangChain_A_Primer_on_Developing_LLM_Apps_Fast)

3. Paper 3: [LLM Based News Research Tool Using LangChain with Enhancing Similarity Search and Token Limit](https://ijrpr.com/uploads/V5ISSUE7/IJRPR31768.pdf)

### Installing libraries

In [None]:
!pip install langchain-community langchain-chroma langchain-text-splitters langchain-huggingface langchain-groq "unstructured[pdf]" nltk groq



### Setting up Groq for LLM access.

In [None]:
import os
from groq import Groq
from langchain_community.document_loaders import UnstructuredPDFLoader, DirectoryLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_huggingface import HuggingFaceEmbeddings
from langchain_chroma import Chroma
from langchain_groq import ChatGroq
from langchain.chains import RetrievalQA

os.environ["GROQ_API_KEY"] = "Place Your API Key Here"
client = Groq(api_key=os.environ.get("GROQ_API_KEY"))

### Directory Loader and Text Splitter

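DirectoryLoader: Loads every PDF in the `langchain_papers/` folder, parsing each file with `UnstructuredPDFLoader`.

TextSplitter: Splits each document into chunks of up to 1,500 characters, with a 300-character overlap between consecutive chunks, so that retrieval can return focused passages that fit in the LLM's context.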

In [None]:
# Load multiple LangChain-related research PDFs
loader = DirectoryLoader("langchain_papers/", glob="./*.pdf", loader_cls=UnstructuredPDFLoader)
documents = loader.load()

# Split documents into smaller chunks for efficient retrieval
text_splitter = RecursiveCharacterTextSplitter(chunk_size=1500, chunk_overlap=300)
text_chunks = text_splitter.split_documents(documents)
persist_directory = "langchain_doc_db"
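
To make `chunk_size` and `chunk_overlap` concrete, here is a deliberately simplified sketch using fixed-size sliding windows. It is not how `RecursiveCharacterTextSplitter` works internally (that splitter prefers to break on separators such as paragraphs and sentences); `chunk_text` is a hypothetical helper for illustration only.

```python
from typing import List

def chunk_text(text: str, chunk_size: int, chunk_overlap: int) -> List[str]:
    """Split text into fixed-size windows; consecutive windows share chunk_overlap characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    return [text[start:start + chunk_size] for start in range(0, len(text), step)]

sample = "abcdefghijklmnopqrstuvwxyz"
chunks = chunk_text(sample, chunk_size=10, chunk_overlap=3)
for chunk in chunks:
    print(chunk)
```

Each chunk starts with the last three characters of the previous one. With the notebook's settings (1,500 and 300) the same overlap means a sentence that straddles a boundary still appears whole in at least one chunk, provided it is shorter than the overlap.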

### Embedding and VectorStore

Convert document chunks into vector embeddings for similarity-based retrieval. Called with no arguments, `HuggingFaceEmbeddings` uses its default sentence-transformers model.

In [None]:
embedding = HuggingFaceEmbeddings()

In [None]:
vectorstore = Chroma.from_documents(
    documents=text_chunks,
    embedding=embedding,
    persist_directory=persist_directory
)
retriever = vectorstore.as_retriever()
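
Similarity-based retrieval boils down to comparing the query's embedding with each chunk's embedding and keeping the closest ones. A minimal sketch with toy vectors, assuming cosine similarity as the measure; Chroma's actual distance metric and index are configurable and may differ, and `cosine_similarity`, `top_k`, and the vectors below are made up for illustration.

```python
import math
from typing import List, Tuple

def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query_vec: List[float], chunk_vecs: List[List[float]], k: int = 2) -> List[Tuple[int, float]]:
    """Return (chunk index, similarity) pairs for the k most similar chunks."""
    scored = [(i, cosine_similarity(query_vec, v)) for i, v in enumerate(chunk_vecs)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]

# Toy 3-dimensional "embeddings" (made up); real embeddings have hundreds of dimensions
chunk_vectors = [
    [1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]
query_vector = [1.0, 0.05, 0.0]

results = top_k(query_vector, chunk_vectors, k=2)
for index, score in results:
    print(f"chunk {index}: similarity {score:.4f}")
```

A vector store performs the same ranking, but with an index that avoids comparing the query against every stored vector.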

### LLM and RetrievalQA Chain

In [None]:
llm = ChatGroq(
    model="llama-3.1-8b-instant",
    temperature=0
)

qa_chain = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type="map_reduce",
    retriever=retriever,
    return_source_documents=True
)



In [None]:
query = "What does the document say about LangChain?"
response = qa_chain.invoke({"query": query})

print(response["result"])

print("\n### Detailed Source Document Excerpts ###")
for idx, doc in enumerate(response["source_documents"], start=1):
    cleaned_text = doc.page_content.replace("\n", " ").strip()
    print(f"\n--- Document {idx} ---\n")
    print(cleaned_text[:2000])
    print("...")

The document provides an overview of LangChain, specifically mentioning the following points:

1. LangChain is a system that utilizes Large Language Models (LLMs) and other types of models, including Chat Models and Text Embedding Models.
2. It allows users to combine these models with prompts to carry out a sequence of operations on text or other data.
3. LangChain provides classes such as Simple Sequential Chain and SequentialChain to concatenate multiple chains together.
4. LangChain is an open-source library.
5. It is proficient in sourcing and integrating with diverse data applications.
6. It is an influential tool in the AI community.
7. It provides walkthroughs of common end-to-end use cases on topics such as autonomous agents, chatbots, code understanding agents, extraction, question answering over analyzing documents, and structured data.
8. An increasing number of examples of LangChain use cases are documented at the LangChain website.

Additionally, the document mentions Lan