## Retrieval Augmented Generation (RAG)

Retrieval Augmented Generation (RAG) pairs a language model with a retrieval step: documents relevant to a question are fetched from an index and passed to the model as context, so that the answer is grounded in your own data. In this series of notebooks, we will cover the fundamentals of RAG, its architecture, and practical implementations, and work through hands-on examples to grasp the concepts better.

Let us embark on this journey together and explore the powerful capabilities of RAG!

Happy learning!


In this Jupyter notebook, we will be working with the llama_index and langchain libraries to perform document indexing and retrieval using GPT-3.5-turbo, an advanced language model. The purpose of this notebook is to demonstrate how to set up the environment, load documents, and create an index for efficient document retrieval.

Before we proceed, please note that we will be using the OpenAI API to leverage the capabilities of the GPT-3.5-turbo model. As a security measure, remember never to reveal your API keys directly in code. Instead, use environment variables or other secure means to store sensitive information.
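
A minimal sketch of that advice, assuming the key is stored in the `OPENAI_API_KEY` environment variable. The helper name `get_openai_api_key` is hypothetical (it is not part of any library), and the `getpass` fallback is just one option for interactive sessions:

```python
import os
from getpass import getpass


def get_openai_api_key(env_var="OPENAI_API_KEY"):
    """Return the API key from the environment, prompting once if it is missing.

    Hypothetical helper for illustration; the key never appears in the notebook source.
    """
    key = os.environ.get(env_var)
    if not key:
        # getpass hides the typed value, so the key does not end up in cell output.
        key = getpass(f"Enter {env_var}: ")
        os.environ[env_var] = key
    return key
```

In a notebook you could then write `openai.api_key = get_openai_api_key()` instead of pasting the key into a cell.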

We will follow these steps:

- Import necessary classes and functions from the llama_index and langchain libraries.
- Set up the OpenAI API key, both through an environment variable and directly in code for demonstration purposes (please avoid the latter in production code).
- Load data from the specified directory, where we assume the documents are stored. You may adjust the path according to your data location.
- Initialize the LLMPredictor with the desired GPT-3.5-turbo model and temperature setting.
- Create a ServiceContext using the initialized predictor.
- Index the loaded documents using the created service context to enable efficient document retrieval.

Now that you have an overview of the tasks we'll be performing, let's proceed with the document loading and indexing process. Happy coding! 🚀
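
Before using the libraries, it may help to see the retrieve-then-generate idea in miniature. The sketch below is a toy illustration only: it ranks documents by bag-of-words cosine similarity instead of the learned embeddings a real vector index uses, and it stops at building the prompt rather than calling a model. All function names and the three sample sentences are made up for this example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def build_index(docs):
    """'Index' each document by storing its token counts next to the text."""
    return [(doc, Counter(tokenize(doc))) for doc in docs]


def retrieve(index, question, top_k=2):
    """Return the top_k documents most similar to the question."""
    q = Counter(tokenize(question))
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [doc for doc, _ in ranked[:top_k]]


def build_rag_prompt(question, context_docs):
    """Place the retrieved documents in front of the question."""
    context = "\n".join(f"- {doc}" for doc in context_docs)
    return (
        f"Context:\n{context}\n\n"
        f"Question: {question}\n"
        "Answer using only the context above."
    )


# Toy corpus (made-up sentences, standing in for the loaded papers).
docs = [
    "ActionCLIP treats video action recognition as a video-text matching problem.",
    "Fine-tuning adapts a pre-trained language model to a specific task.",
    "SheetCopilot controls spreadsheets from natural language requests.",
]
toy_index = build_index(docs)
question = "How does fine-tuning adapt a language model?"
top = retrieve(toy_index, question, top_k=1)
prompt = build_rag_prompt(question, top)
print(prompt)
```

The prompt produced here is what would be sent to the language model; in the rest of this notebook, llama_index takes care of both the retrieval and the model call.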

In [None]:
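# Install llama_index, a popular middleware used in many GenAI applications.
# Note: this notebook was run against llama_index 0.7.16 (see the install log below).
# Later releases reorganized the API, so consider pinning the versions from the log if the imports fail.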
!pip install llama_index

Collecting llama_index
  Downloading llama_index-0.7.16-py3-none-any.whl (626 kB)
Collecting tiktoken (from llama_index)
  Downloading tiktoken-0.4.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.7 MB)
Collecting langchain>=0.0.218 (from llama_index)
  Downloading langchain-0.0.247-py3-none-any.whl (1.4 MB)
Collecting openai>=0.26.4 (from llama_index)
  Downloading openai-0.27.8-py3-none-any.whl (73 kB)
Collecting langsmith<0.1.0,>=0.0.11 (from langchain>=0.0

In [None]:
# Import necessary classes and functions from the llama_index and langchain libraries
from llama_index import (
    GPTVectorStoreIndex,
    SimpleDirectoryReader,
    ServiceContext,
    StorageContext,
    LLMPredictor,
    load_index_from_storage,
)
from langchain.chat_models import ChatOpenAI

# Import the openai library and os module to set the API key
import openai
import os

# SECURITY ALERT: Never reveal your API keys directly in code. Use environment variables or other secure means.
# Here, we set the OpenAI API key both through an environment variable and directly (for demonstration purposes only).
os.environ['OPENAI_API_KEY'] = 'YOUR-API-KEY'
openai.api_key = 'YOUR-API-KEY'

# Notify the user that the document loading process has begun
print("started the loading document process...")

# Read the data from the specified directory. Change the path below to wherever your documents are stored.
documents = SimpleDirectoryReader('/kaggle/input/nlp-and-llm-related-arxiv-papers/').load_data()

# Initialize the LLMPredictor with the desired GPT-3.5-turbo model and temperature setting
llm_predictor = LLMPredictor(llm=ChatOpenAI(temperature=0, model_name="gpt-3.5-turbo"))

# Create a ServiceContext using the initialized predictor
service_context = ServiceContext.from_defaults(llm_predictor=llm_predictor)

# Notify the user that the indexing process has begun
print("started the indexing process...")

# Create an index using the loaded documents and the created service context
index = GPTVectorStoreIndex.from_documents(documents, service_context=service_context)



started the loading document process...
started the indexing process...


In [None]:
# Store the created index to the disk at the specified location
print("storing the index to disk")
index.storage_context.persist(persist_dir="/kaggle/working/nlp-and-llm-related-arxiv-papers-documents-index")


storing the index to disk
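
The imports above include `StorageContext` and `load_index_from_storage`, which the notebook does not use afterwards. Here is a sketch of how a persisted index could be reloaded in a later session, assuming the llama_index 0.7.x API installed above. `load_persisted_index` is a hypothetical helper name, and the directory check is an added convenience rather than something the library requires:

```python
import os

# Same directory that was passed to persist() above.
PERSIST_DIR = "/kaggle/working/nlp-and-llm-related-arxiv-papers-documents-index"


def load_persisted_index(persist_dir, service_context=None):
    """Return the index stored in persist_dir, or None if the directory does not exist."""
    if not os.path.isdir(persist_dir):
        return None
    # Imported lazily so the directory check works even without llama_index installed.
    from llama_index import StorageContext, load_index_from_storage

    storage_context = StorageContext.from_defaults(persist_dir=persist_dir)
    # Assumption: passing the same service_context used at build time keeps the LLM settings.
    return load_index_from_storage(storage_context, service_context=service_context)
```

Calling `load_persisted_index(PERSIST_DIR, service_context)` in a fresh session would then avoid re-embedding the documents, which is the main reason to persist the index in the first place.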


In [None]:
# Notify the user that we are querying the index
print("Querying the index...")

# Query the index for the provided question and store the response
response = index.as_query_engine().query("Write a detailed summary of various prompting techniques. Please provide an executive summary after which you can present succinct bullet points? Remember to add formatting elements so that the output is easy to read and well-formatted. Use line breaks to improve formatting.")

# Print the received response
print(response)

Querying the index...
Executive Summary:
This context provides a table summarizing various prompting techniques for designing prompts in large language models. The techniques include making the prompt detailed, specifying the expertise of the model, providing task descriptions, using contextual information, and utilizing demonstrations. The table also mentions the related principles for each technique.

Prompts Techniques:
- Make the prompt as detailed as possible, specifying the required length and including major storyline and conclusion while omitting unimportant details.
- Specify the expertise of the model in the prompt, such as "You are a sophisticated expert in the domain of computer science."
- Instruct the model on what it should do rather than what it should not do.
- Use a prompt format like "Question: Short Answer:" to avoid generating excessively long outputs.
- Retrieve relevant documents via a search engine and concatenate them into the prompt as reference for factual kn
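
The query cells in this notebook repeat the same formatting instructions in every prompt. One way to avoid the repetition is sketched below. `build_prompt` and `ask` are hypothetical helpers, and `query_engine` is assumed to be any object with a `.query(str)` method, such as the result of `index.as_query_engine()`:

```python
# Formatting instructions shared by the prompts in this notebook.
FORMAT_SUFFIX = (
    "Remember to add formatting elements so that the output is easy to read "
    "and well-formatted. Use line breaks to improve formatting."
)


def build_prompt(question, style=""):
    """Join the question, an optional style instruction, and the shared suffix."""
    parts = [question.strip(), style.strip(), FORMAT_SUFFIX]
    return " ".join(part for part in parts if part)


def ask(query_engine, question, style=""):
    """Send the assembled prompt to the query engine and return its response."""
    return query_engine.query(build_prompt(question, style))
```

For example, `ask(index.as_query_engine(), "Write a detailed summary of ActionCLIP.")` would send the question followed by the formatting suffix.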

In [None]:
# Notify the user that we are querying the index
print("Querying the index...")

# Query the index for the provided question and store the response
response = index.as_query_engine().query("Write a detailed summary of ActionCLIP. Please provide an executive summary after which you can present succinct bullet points? Remember to add formatting elements so that the output is easy to read and well-formatted. Use line breaks to improve formatting.")

# Print the received response
print(response)

Querying the index...
Executive Summary:
ActionCLIP is a multimodal learning framework for video action recognition that leverages both video and text information. It introduces a new paradigm called "pre-train, prompt, and fine-tune" to utilize pre-trained models and reduce pre-training costs. ActionCLIP achieves impressive results in both general and zero-shot/few-shot action recognition tasks, demonstrating the potential of the multimodal learning framework.

Detailed Summary:
- ActionCLIP proposes a new perspective for action recognition by considering it as a video-text multimodal learning problem.
- Unlike traditional approaches that treat action recognition as a video classification problem, ActionCLIP utilizes semantic information from label texts to enhance performance.
- The framework follows a "pre-train, prompt, and fine-tune" paradigm, allowing it to reuse pre-trained models trained on large-scale web data, thereby reducing pre-training costs.
- ActionCLIP implementation a

In [None]:
# Notify the user that we are querying the index
print("Querying the index...")

# Query the index for the provided question and store the response
response = index.as_query_engine().query("Write a detailed summary of AI-assisted coding. Please provide an executive summary after which you can present succinct bullet points? Remember to add formatting elements so that the output is easy to read and well-formatted. Use line breaks to improve formatting.")

# Print the received response
print(response)

Querying the index...
Executive Summary:
AI-assisted coding refers to the use of artificial intelligence technologies to assist in the process of writing code. It involves the use of machine learning algorithms and natural language processing techniques to automate certain coding tasks and provide suggestions and recommendations to developers. AI-assisted coding aims to improve productivity, accuracy, and efficiency in software development by reducing manual effort and enhancing code quality.

Detailed Summary:
AI-assisted coding is a technique that leverages artificial intelligence technologies to aid developers in writing code. It involves the use of machine learning algorithms and natural language processing techniques to analyze code and provide intelligent suggestions and recommendations. The goal of AI-assisted coding is to enhance productivity and efficiency in software development by automating repetitive tasks and improving code quality.

Some key points about AI-assisted codi

In [None]:
# Notify the user that we are querying the index
print("Querying the index...")

# Query the index for the provided question and store the response
response = index.as_query_engine().query("Write a detailed summary of how ChatGPT is a Jack of all trades but master of none. Please provide an executive summary after which you can present succinct bullet points? Remember to add formatting elements so that the output is easy to read and well-formatted. Use line breaks to improve formatting.")

# Print the received response
print(response)

Querying the index...
Executive Summary:
ChatGPT is a versatile language model that can perform various tasks but lacks expertise in any specific domain. While it can handle different tasks, its accuracy and quality may vary, leading to inconsistent results. ChatGPT is interactive and creative, providing multiple answers, but it can also be repetitive. It is currently in the beta testing stage and has hidden biases and limitations in its reasoning and understanding abilities.

Detailed Summary:
- ChatGPT is a general-purpose language model that can perform a wide range of tasks, making it a "Jack of all trades."
- However, its performance in these tasks may not be accurate or consistent, leading to a drop in quality compared to more specialized models.
- ChatGPT is interactive and can engage in conversations, but its answers may not always be accurate or reliable.
- It has the ability to generate creative responses and provide multiple answers, but it can also be repetitive, giving the

In [None]:
# Notify the user that we are querying the index
print("Querying the index...")

# Query the index for the provided question and store the response
response = index.as_query_engine().query("Write a detailed summary of fine-tuning of language models. Discuss techniques of fine-tuning, why is fine-tuning important and its applications. Please provide an executive summary after which you can present succinct bullet points? Remember to add formatting elements so that the output is easy to read and well-formatted. Use line breaks to improve formatting.")

# Print the received response
print(response)

Querying the index...
Executive Summary:
Fine-tuning of language models is a crucial process that involves adapting pre-trained models to specific tasks or domains. This process enhances the models' performance and enables them to generalize better. Several techniques have been developed for fine-tuning, including Supervised Instruction Tuning, Continual Learning, Parameter-Efficient Fine-Tuning, and Semi-Supervised Fine-Tuning. These techniques address different challenges and aim to make the fine-tuning process more efficient and effective. Fine-tuning has various applications, such as instruction following, generating responses, summarization, translation, and more.

Key Points:
- Fine-tuning of language models involves adapting pre-trained models to specific tasks or domains.
- Techniques for fine-tuning include Supervised Instruction Tuning, Continual Learning, Parameter-Efficient Fine-Tuning, and Semi-Supervised Fine-Tuning.
- Supervised Instruction Tuning involves fine-tuning wi

In [None]:
# Notify the user that we are querying the index
print("Querying the index...")

# Query the index for the provided question and store the response
response = index.as_query_engine().query("Write a detailed summary of fine-tuning of language models. Discuss techniques of fine-tuning, why is fine-tuning important and its applications. Write a detailed blog post that is sufficiently technical and is targeted to an audience that has machine learning and NLP knowledge? Remember to add formatting elements so that the output is easy to read and well-formatted. Use line breaks to improve formatting.")

# Print the received response
print(response)

Querying the index...
Fine-tuning is a crucial process in natural language processing (NLP) that involves adapting pre-trained language models to specific tasks or domains. It plays a significant role in improving model performance and enhancing task generalization abilities. In this blog post, we will explore the concept of fine-tuning in detail, discussing various techniques, its importance, and applications.

Fine-tuning a language model involves updating its parameters using task-specific or domain-specific data. This process allows the model to learn from new data without forgetting previously learned information. There are several commonly employed methods for fine-tuning language models:

1. Supervised Instruction Tuning: This technique focuses on adapting language models for instruction following and enhancing their task generalization abilities. It involves fine-tuning the model using data that provides task instruction supervision. This approach has been widely studied and ha

In [None]:
# Notify the user that we are querying the index
print("Querying the index...")

# Query the index for the provided question and store the response
response = index.as_query_engine().query("what are examples of multi-modal embeddings? Discuss the importance of multi-modal embeddings and its applications. Write a detailed blog post that is sufficiently technical and is targeted to an audience that has machine learning and NLP knowledge? Remember to add formatting elements so that the output is easy to read and well-formatted. Use line breaks to improve formatting.")

# Print the received response
print(response)

Querying the index...
Title: Exploring the Power of Multi-Modal Embeddings in Machine Learning and NLP

Introduction:
In the era of advanced machine learning techniques, the combination of multiple modalities has emerged as a powerful approach to enhance the performance of various tasks. Multi-modal embeddings, which efficiently and effectively combine different modalities such as audio, video, and text, have become a hot topic in the field of Natural Language Processing (NLP). In this blog post, we will delve into the importance of multi-modal embeddings and explore their applications in machine learning and NLP.

Importance of Multi-Modal Embeddings:
Multi-modal embeddings play a crucial role in bridging the gap between different modalities and enabling machines to understand and process information from various senses, including vision, hearing, and language. By combining different types of data simultaneously, multi-modal models can leverage the complementary nature of modalities, 

In [None]:
# Notify the user that we are querying the index
print("Querying the index...")

# Query the index for the provided question and store the response
response = index.as_query_engine().query("Summarize the sparks of AGI paper and identify what are some tasks that GPT-4 excels in at human level? Write a detailed blog post that is sufficiently technical and is targeted to an audience that has machine learning and NLP knowledge? Remember to add formatting elements so that the output is easy to read and well-formatted. Use line breaks to improve formatting.")

# Print the received response
print(response)

Querying the index...
Title: Sparks of Artificial General Intelligence: Unveiling the Potential of GPT-4

Introduction:
Artificial General Intelligence (AGI) has long been a goal in the field of machine learning and natural language processing (NLP). In a recent paper titled "Sparks of Artificial General Intelligence," researchers shed light on the capabilities and limitations of GPT-4, a cutting-edge language model. This blog post aims to summarize the key findings of the paper and highlight the tasks in which GPT-4 excels at a human-level performance.

Summary of the Paper:
The paper presents an in-depth analysis of GPT-4, comparing its performance with ChatGPT, a previous state-of-the-art language model. The researchers found that GPT-4 surpasses ChatGPT in terms of generating impressive outputs. Moreover, GPT-4's performance is at least comparable, if not superior, to that of a human in certain tasks.

Tasks in which GPT-4 Excels at Human-Level Performance:
1. Language Translation:

In [None]:
# Notify the user that we are querying the index
print("Querying the index...")

# Query the index for the provided question and store the response
response = index.as_query_engine().query("How can Large language models be used in recommendation engines? Write a detailed blog post that is sufficiently technical and is targeted to an audience that has machine learning and NLP knowledge? Provide a concrete example in ecommerce where a Large Language Model can be used for recommendations? Remember to add formatting elements so that the output is easy to read and well-formatted. Use line breaks to improve formatting.")

# Print the received response
print(response)

Querying the index...
Title: Enhancing Recommendation Engines with Large Language Models

Introduction:
Large language models (LLMs) have gained significant attention in recent years due to their remarkable capabilities in natural language processing (NLP) tasks. These models, such as GPT-3 and GPT-4, have the potential to revolutionize recommendation engines by providing more accurate and personalized recommendations to users. In this blog post, we will explore how LLMs can be utilized in recommendation engines, specifically in the context of e-commerce. We will delve into the technical aspects of integrating LLMs into recommendation systems and provide a concrete example to illustrate their effectiveness.

Understanding Recommendation Engines:
Recommendation engines are algorithms that analyze user data and provide personalized suggestions or recommendations. These systems are widely used in e-commerce platforms to enhance user experience, increase customer engagement, and drive sale

In [None]:
# Notify the user that we are querying the index
print("Querying the index...")

# Query the index for the provided question and store the response
response = index.as_query_engine().query("Provide a concrete example in ecommerce where a Large Language Model can be used for recommendations using the three-step prompting approach? Remember to add formatting elements so that the output is easy to read and well-formatted. Use line breaks to improve formatting.")

# Print the received response
print(response)

Querying the index...
Example:
Prompt: "Please provide personalized product recommendations for a user based on their browsing history and preferences. The recommendations should be in the form of a list, with each product's name, description, and price. Additionally, include a short explanation for why each product is recommended. Use the following three-step prompting approach:

Step 1 - Retrieve the user's browsing history and preferences:
- Retrieve the user's browsing history from the database.
- Retrieve the user's preferences, such as favorite categories or brands.

Step 2 - Analyze the user's data and generate initial recommendations:
- Analyze the user's browsing history to identify their interests and preferences.
- Use machine learning algorithms to generate initial recommendations based on the user's data.

Step 3 - Refine the recommendations and provide explanations:
- Filter the initial recommendations based on the user's preferences.
- For each recommended product, provi

In [None]:
# Notify the user that we are querying the index
print("Querying the index...")

# Query the index for the provided question and store the response
response = index.as_query_engine().query("Provide solutions on how I can use ChatGPT in languages other than English? Is there a way to use the English translation of my prompt which is in Hindi? Remember to add formatting elements so that the output is easy to read and well-formatted. Use line breaks to improve formatting.")

# Print the received response
print(response)

Querying the index...
To use ChatGPT in languages other than English, you can follow these solutions:

1. Translate the prompt: If your prompt is in a language other than English, you can translate it to English before using it with ChatGPT. There are various online translation tools available that can help you with this. Once you have the translated prompt in English, you can use it with ChatGPT.

2. Use multilingual models: OpenAI has released multilingual models that can understand and generate text in multiple languages. You can use these models to directly interact with ChatGPT in languages other than English. These models support languages such as Spanish, French, German, Italian, Dutch, Portuguese, Russian, Chinese, Japanese, Korean, and more.

3. Incorporate language-specific prompts: If you want to use the English translation of your prompt, you can include language-specific keywords or phrases in your prompt to guide the model's response. For example, if your prompt is in Hin

In [None]:
# Notify the user that we are querying the index
print("Querying the index...")

# Query the index for the provided question and store the response
response = index.as_query_engine().query("Provide a fairly comprehensive list of NLP tasks that LLMs can execute? Include details from the paper SUPER-NATURALINSTRUCTIONS: Generalization via Declarative Instructions on 1600+ NLP Tasks and other papers that discuss NLP tasks. In your response provide a long list of tasks and their descriptions? Remember to add formatting elements so that the output is easy to read and well-formatted. Use line breaks to improve formatting.")

# Print the received response
print(response)



Querying the index...
Here is a fairly comprehensive list of NLP tasks that LLMs (Language Models) can execute, based on the context information provided:

1. Question Answering: Answering questions based on given context or knowledge.
2. Classification: Assigning labels or categories to text data.
3. Named Entity Recognition: Identifying and classifying named entities (e.g., person, organization, location) in text.
4. Sentiment Analysis: Determining the sentiment or emotion expressed in text.
5. Text Summarization: Generating a concise summary of a longer text.
6. Machine Translation: Translating text from one language to another.
7. Text Generation: Generating coherent and contextually relevant text.
8. Text Completion: Predicting missing or next words in a given text.
9. Text Classification: Assigning predefined categories or labels to text documents.
10. Text Similarity: Measuring the similarity between two or more texts.
11. Text Segmentation: Dividing a text into smaller segments

In [None]:
# Notify the user that we are querying the index
print("Querying the index...")

# Query the index for the provided question and store the response
response = index.as_query_engine().query("Provide a fairly comprehensive list of tasks that LLMs can perform that may not be entirely NLP related. Include examples from papers related to StructGPT, Gorilla, Starcoder and others. In your response provide a long list of tasks and their descriptions? Remember to add formatting elements so that the output is easy to read and well-formatted. Use line breaks to improve formatting.")

# Print the received response
print(response)



Querying the index...
LLMs (Large Language Models) have the potential to perform a wide range of tasks beyond NLP (Natural Language Processing). Here is a comprehensive list of tasks that LLMs can perform:

1. Chatbot: LLMs like ChatGPT can act as conversational agents, engaging in dialogue with users and providing informative and reliable responses.

2. Data Annotation: LLMs can be used as annotators to label or annotate data for various tasks, such as sentiment analysis, named entity recognition, or part-of-speech tagging.

3. Data Generation: LLMs can generate synthetic data for data augmentation, improving the performance of models in tasks like text classification or machine translation.

4. Quality Assessment: LLMs can be used to evaluate the quality of generated text in NLG (Natural Language Generation) tasks like summarization and translation. They can provide human-like assessments and align well with human judgments.

5. Visual Question Answering (VQA): Fine-tuned multimodal 

In [None]:
# Notify the user that we are querying the index
print("Querying the index...")

# Query the index for the provided question and store the response
response = index.as_query_engine().query("""
Can you summarize the papers related to Code synthesis, code generation? Keywords like CoPilot, Starcoder, SQL-PALM and so on. I want a summary of what these LLMs can do and how can a student who wants to learn coding
start to use them? Give me a step by step approach on how these LLMs can be used to teach students how to code?
""")

# Print the received response
print(response)



Querying the index...
The context information provided includes two papers related to code synthesis and code generation using Large Language Models (LLMs). The first paper, titled "SheetCopilot- Bringing Software Productivity to the Next Level through Large Language Models," focuses on spreadsheet manipulation using LLMs. The second paper, titled "A Survey of Large Language Models," discusses various tasks that LLMs can perform, including code synthesis.

LLMs have shown strong abilities in generating both natural language text and formal language, such as computer programs. Code synthesis refers to the generation of code that satisfies specific conditions. Existing LLMs have been evaluated based on the quality of the generated code by calculating the pass rate against test cases.

To teach students how to code using LLMs, a step-by-step approach can be followed:

1. Familiarize students with the concept of LLMs: Introduce students to the concept of Large Language Models and explain h

In [None]:
# Notify the user that we are querying the index
print("Querying the index...")

# Query the index for the provided question and store the response
response = index.as_query_engine().query("""
Can you provide a detailed summary of the paper - "SheetCopilot- Bringing Software Productivity to the Next Level through Large Language Models".
""")

# Print the received response
print(response)




Querying the index...
The paper "SheetCopilot: Bringing Software Productivity to the Next Level through Large Language Models" introduces a system called SheetCopilot, which aims to enhance software productivity by using large language models (LLMs). The authors highlight that many computer end users spend a significant amount of time on repetitive and error-prone tasks, such as tabular data processing and project timeline scheduling. However, most users lack the skills to automate these tasks. With the advancements in LLMs, the authors propose using natural language user requests to direct software.

The SheetCopilot agent is designed to take natural language tasks and control spreadsheets to fulfill the requirements. The authors propose a set of atomic actions that serve as an abstraction of spreadsheet software functionalities. They also develop a state machine-based task planning framework to enable LLMs to interact robustly with spreadsheets. To evaluate the performance of LLMs in

# **The End**