<a href="https://colab.research.google.com/github/arjunp17/TalkToData-LLM-LangChain-Streamlit/blob/main/TalkToYourPaper.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

 Install required Python packages

In [None]:
!pip install python-dotenv
!pip install openai
!pip install --upgrade langchain
!pip install unstructured
!pip install docarray
!pip install tiktoken
!pip install PyPDF2
!pip install chromadb
!pip install google-search-results

 Import Packages

In [313]:
import os
import openai
from langchain.chat_models import ChatOpenAI
from langchain.document_loaders import ArxivLoader, PyPDFLoader, UnstructuredFileLoader, OnlinePDFLoader, UnstructuredPDFLoader
from IPython.display import display, Markdown
from langchain.text_splitter import CharacterTextSplitter
from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import Chroma
from langchain.chains import RetrievalQA
from langchain.agents import load_tools, initialize_agent, Tool
from langchain.agents import AgentType
from langchain import SerpAPIWrapper

 Set up API keys

In [314]:
# Replace the placeholders with your own keys; never commit real keys to a shared notebook.
os.environ['OPENAI_API_KEY'] = 'YOUR-OPENAI-API'
openai.api_key = os.getenv('OPENAI_API_KEY')
os.environ["SERPAPI_API_KEY"] = 'YOUR-SERPAPI'
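Hard-coded keys are easy to leak when a notebook is shared. A minimal sketch of an alternative, assuming the keys were already exported in the environment before the notebook started (or loaded from a `.env` file with `python-dotenv`, which is installed above). `require_env` is a hypothetical helper written for this illustration, not part of any library.

```python
import os


def require_env(name: str) -> str:
    """Return the value of environment variable `name`; raise if it is unset or empty."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"Environment variable {name} is not set")
    return value


# Usage (assumption: both variables are already set in the environment):
# openai.api_key = require_env("OPENAI_API_KEY")
# serpapi_key = require_env("SERPAPI_API_KEY")
```

Failing early with a clear message is usually easier to debug than an authentication error raised later inside a chain or agent call.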

 Load the PDF of the paper

In [315]:
loader = OnlinePDFLoader("https://arxiv.org/pdf/2112.10752.pdf")
docs = loader.load()

 Define the LLM

In [316]:
# temperature=0.0 minimizes sampling randomness, so answers are more repeatable
llm = ChatOpenAI(temperature=0.0)

Construct the vector database and QA chain

In [317]:
# Split documents into chunks
text_splitter = CharacterTextSplitter(chunk_size=2000, chunk_overlap=0)
texts = text_splitter.split_documents(docs)
# Select embeddings
embeddings = OpenAIEmbeddings()
# Create a vectorstore from documents
db = Chroma.from_documents(texts, embeddings)
# Create retriever interface
retriever = db.as_retriever()
# Create QA chain
qa_system = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type="stuff",  # put all retrieved chunks into a single prompt
    retriever=retriever,
    verbose=False,
)

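To make the steps in the cell above concrete, here is a rough, dependency-free sketch of the same pipeline: split, embed, retrieve, then "stuff" the retrieved chunks into one prompt. Everything in it is an assumption made for illustration: `split_text` is a simplified approximation of separator-based character splitting, `embed` is a toy word-count vector standing in for `OpenAIEmbeddings`, and `retrieve` does a brute-force cosine ranking instead of a Chroma query. It is not how the libraries are implemented.

```python
import math
from collections import Counter
from typing import List


def split_text(text: str, chunk_size: int, separator: str = "\n\n") -> List[str]:
    """Greedily merge separator-delimited pieces into chunks of at most chunk_size characters.

    A single piece longer than chunk_size is kept whole as its own chunk.
    """
    chunks, current = [], ""
    for piece in text.split(separator):
        piece = piece.strip()
        if not piece:
            continue
        candidate = piece if not current else current + separator + piece
        if len(candidate) <= chunk_size:
            current = candidate
        else:
            if current:
                chunks.append(current)
            current = piece
    if current:
        chunks.append(current)
    return chunks


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (a stand-in for a real embedding model)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(query: str, chunks: List[str], k: int = 2) -> List[str]:
    """Return the k chunks most similar to the query."""
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]


def stuff_prompt(query: str, context_chunks: List[str]) -> str:
    """Concatenate all retrieved chunks into one prompt, as a "stuff"-style chain does."""
    context = "\n\n".join(context_chunks)
    return f"Use the following context to answer the question.\n\n{context}\n\nQuestion: {query}"


# Toy "paper" with three paragraphs separated by blank lines.
toy_paper = (
    "Diffusion models denoise a variable step by step.\n\n"
    "An autoencoder compresses images into a latent space.\n\n"
    "Cross-attention enables conditioning on text."
)
toy_query = "How does the autoencoder compress images?"
toy_chunks = split_text(toy_paper, chunk_size=60)
toy_top = retrieve(toy_query, toy_chunks, k=1)
toy_prompt = stuff_prompt(toy_query, toy_top)
print(toy_prompt)
```

The sketch also shows why the chain can only answer from the paper: the prompt sent to the LLM contains nothing except the retrieved chunks and the question.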


Observe the response for a query

In [320]:
response=qa_system.run("Explain different training stages of the latent diffusion model")
display(Markdown(response))

The training of the latent diffusion model (LDM) involves two distinct stages: the perceptual compression stage and the semantic compression stage.

1. Perceptual Compression Stage: In this stage, an autoencoder is trained to provide a lower-dimensional representation space that is perceptually equivalent to the data space. The objective is to remove high-frequency details from the input images while still preserving the overall perceptual quality. This stage focuses on compressing the data in a way that eliminates imperceptible details. The autoencoder is trained to reconstruct the input images from their compressed representations.

2. Semantic Compression Stage: After the perceptual compression stage, the actual generative model is trained to learn the semantic and conceptual composition of the data. This stage aims to capture the higher-level features and variations in the data. The generative model is trained using diffusion models, which gradually denoise a normally distributed variable to learn the reverse process of a fixed Markov Chain. The generative model consists of a sequence of denoising autoencoders, which are trained to predict a denoised variant of their input. The objective is to learn the reverse process of the Markov Chain and approximate the true data distribution.

The training of the LDM involves training the autoencoder in the perceptual compression stage and then training the generative model in the semantic compression stage. The resulting LDM can be used for high-resolution image synthesis and other tasks such as inpainting and super-resolution. The LDM offers a computationally tractable approach to diffusion-based image synthesis and provides flexibility in modeling various image modalities.

 What happens when you ask about something outside the paper's content, e.g., to compare this work with related work?

In [326]:
response=qa_system.run("Are there any new papers for Text-to-Audio generation based on latent diffusion models?, If true, give the details of those papers")
display(Markdown(response))

Based on the provided context, there is no mention of any new papers specifically related to Text-to-Audio generation based on latent diffusion models. Therefore, it is not possible to provide any details about such papers.

 How to solve this? Add a web search tool for questions that the paper itself cannot answer.

In [322]:
# Initialize the SerpAPIWrapper for search functionality
search = SerpAPIWrapper()

Define the QA chain as a tool alongside the SerpAPI search tool

In [323]:
tools = [
    Tool(
        name="Search",
        func=search.run,
        description="Useful when you need to answer questions about current events or past events related to the scientific paper that are not explicitly covered in the paper."
    ),
    Tool(
        name="QA system",
        func=qa_system.run,
        description="Useful when you need to answer questions from the contents of the given scientific paper."
    ),
]

Construct an Agent using the tools

In [324]:
agent = initialize_agent(
    tools,
    llm,
    agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,
    handle_parsing_errors=True,
    verbose=False)
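A ReAct-style agent works by having the LLM emit text that names a tool and its input; the framework parses that text, calls the tool, and feeds the result back as an observation. The sketch below illustrates one such step under stated assumptions: the `Action:` / `Action Input:` layout follows the usual ReAct prompt format, `parse_action` and `dispatch` are hypothetical helpers, and `toy_tools` holds stub functions standing in for `search.run` and `qa_system.run`. It is not LangChain's actual parser.

```python
import re
from typing import Callable, Dict, Optional, Tuple


def parse_action(llm_output: str) -> Optional[Tuple[str, str]]:
    """Extract (tool name, tool input) from ReAct-style text, or None if absent."""
    match = re.search(r"Action:\s*(.+?)\s*\nAction Input:\s*(.+)", llm_output)
    if match is None:
        return None
    return match.group(1).strip(), match.group(2).strip()


def dispatch(llm_output: str, tools: Dict[str, Callable[[str], str]]) -> str:
    """Run the tool named in the LLM output and return an observation string."""
    parsed = parse_action(llm_output)
    if parsed is None:
        # Loosely mirrors the idea behind handle_parsing_errors=True:
        # report the problem back instead of raising.
        return "Observation: could not parse an action"
    name, tool_input = parsed
    if name not in tools:
        return f"Observation: {name} is not a valid tool"
    return f"Observation: {tools[name](tool_input)}"


# Stub tools (assumptions for illustration only).
toy_tools = {
    "Search": lambda q: f"[web results for: {q}]",
    "QA system": lambda q: f"[answer from the paper for: {q}]",
}

toy_step = (
    "Thought: this is not covered by the paper.\n"
    "Action: Search\n"
    "Action Input: text-to-audio latent diffusion papers"
)
print(dispatch(toy_step, toy_tools))
```

Because the LLM chooses a tool by reading the `description` fields, wording those descriptions so they clearly separate "inside the paper" from "outside the paper" matters as much as the tools themselves.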

In [327]:
response=agent.run("Are there any new papers for Text-to-Audio generation based on latent diffusion models?, If true, give the details of those papers")
display(Markdown(response))

Yes, there is a new paper called "AudioLDM: Text-to-Audio Generation Based on Latent Diffusion Models" which is a TTA system that is built on a latent space to learn the continuous audio representations from contrastive training.