In [33]:
# Import the LangChain components used in this notebook
from langchain import hub
from langchain_chroma import Chroma
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough
from langchain_text_splitters import RecursiveCharacterTextSplitter

In [34]:
import getpass
import os

# Read the Cohere API key interactively so it is never stored in the notebook
os.environ["COHERE_API_KEY"] = getpass.getpass()

 ········


In [35]:
# Create the LLM
from langchain_cohere import ChatCohere

llm = ChatCohere(model="command-r")

In [36]:
# Load the paper (a PDF) with a document loader
from langchain_community.document_loaders import PyPDFLoader

loader = PyPDFLoader("https://arxiv.org/pdf/2103.15348.pdf")


In [37]:
# Load the PDF pages, then split them into overlapping chunks
documents = loader.load()
text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
splits = text_splitter.split_documents(documents)
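
The `chunk_size` and `chunk_overlap` settings above control how much text neighbouring chunks share. As a rough illustration only, here is a fixed-window chunker in plain Python. It is not how `RecursiveCharacterTextSplitter` works internally (that splitter prefers to cut on separators such as paragraph breaks), and `sliding_chunks` is a hypothetical helper written for this example.

```python
def sliding_chunks(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Cut text into fixed-size windows that share chunk_overlap characters.

    Simplified stand-in for illustration; hypothetical helper, not a LangChain API.
    """
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reaches the end of the text
    return chunks


sample = "abcdefghijklmnopqrstuvwxyz"
demo_chunks = sliding_chunks(sample, chunk_size=10, chunk_overlap=4)
print(demo_chunks)
```

Each chunk starts with the last four characters of the previous one, which is the purpose of the overlap: a sentence cut at a chunk boundary still appears whole in at least one chunk.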

In [38]:
# Use a sentence-transformers model to embed the chunks
embeddings_model_name = "sentence-transformers/all-MiniLM-L6-v2"
embeddings = HuggingFaceEmbeddings(model_name=embeddings_model_name)

In [39]:
# Create a vector store from the document splits and the embedding model
vectorstore = Chroma.from_documents(documents=splits, embedding=embeddings)
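
A vector store answers a query by comparing the query's embedding with the stored chunk embeddings and returning the closest matches. The sketch below shows that idea with cosine similarity over hand-written three-dimensional vectors. The vectors, the texts, and the helpers `cosine_similarity` and `top_k` are all made up for illustration; Chroma's actual indexing and distance settings may differ.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vec, store, k=2):
    """Return the texts of the k stored items most similar to query_vec."""
    ranked = sorted(
        store,
        key=lambda item: cosine_similarity(query_vec, item[1]),
        reverse=True,
    )
    return [text for text, _ in ranked[:k]]


# Toy "embeddings": (text, vector) pairs invented for this example
toy_store = [
    ("layout detection", [0.9, 0.1, 0.0]),
    ("OCR engines", [0.1, 0.9, 0.0]),
    ("community model hub", [0.0, 0.2, 0.9]),
]

toy_hits = top_k([0.8, 0.3, 0.0], toy_store, k=2)
print(toy_hits)
```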

In [40]:
# Build a retriever over the paper's chunks and pull a standard RAG prompt
retriever = vectorstore.as_retriever()
prompt = hub.pull("rlm/rag-prompt")

In [41]:
# Helper function to join the retrieved chunks into one context string
def format_docs(docs):
    return "\n\n".join(doc.page_content for doc in docs)
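
To see what `format_docs` produces, the snippet below runs it on two stand-in documents. `StubDoc` is a hypothetical dataclass used here only so the example runs without LangChain installed; the real retriever returns LangChain document objects, which likewise expose a `page_content` attribute.

```python
from dataclasses import dataclass


@dataclass
class StubDoc:
    """Hypothetical stand-in for a retrieved document."""
    page_content: str


def format_docs(docs):
    # Same helper as in the notebook: join chunk texts with a blank line
    return "\n\n".join(doc.page_content for doc in docs)


formatted = format_docs([StubDoc("First chunk."), StubDoc("Second chunk.")])
print(formatted)
```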

In [42]:
# Build the RAG chain: retrieve context, fill the prompt, call the LLM, parse to a string
rag_chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)
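
The chain above passes the question to the retriever and, unchanged, to the prompt, then sends the filled prompt to the model. The same data flow can be sketched with ordinary functions. Everything below is a stand-in written for this illustration: `fake_retrieve`, `join_chunks`, `build_prompt`, and `fake_llm` are hypothetical, and the prompt wording is not the text of `rlm/rag-prompt`.

```python
def fake_retrieve(question):
    # Stand-in for the retriever: always returns one fixed chunk
    return ["LayoutParser is a toolkit for document image analysis."]


def join_chunks(chunks):
    # Plays the role of format_docs, on plain strings
    return "\n\n".join(chunks)


def build_prompt(context, question):
    # Made-up template; the hub prompt used in the notebook differs
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"


def fake_llm(prompt_text):
    # Stand-in for the model: reports which question line it received
    return "stub reply to: " + prompt_text.splitlines()[-2]


def run_rag(question):
    context = join_chunks(fake_retrieve(question))  # "context" branch
    prompt_text = build_prompt(context=context, question=question)  # "question" passes through
    return str(fake_llm(prompt_text))  # final step yields a plain string


answer = run_rag("What is LayoutParser?")
print(answer)
```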

In [46]:
rag_chain.invoke("Summarize the paper in 10 sentences.")

'1zhshen@gmail.com) ,\nHaibin Ling2( 12345@163.com ),\nShuang Wang1( wshuang87@gmail.com) ,\nJian Yang1( jyang@cs.princeton.edu) ,\nand Wenchao Lu1(wlu@cs.princeton.edu)\n1Princeton University, 2Tsinghua University\nAbstract—Document image analysis is\na challenging task due to the great vari-\nety of document layouts and printing\nstyles. Previous works often require\ntedious feature engineering, which is\nboth time-consuming and not robus\nAnswer: The paper introduces LayoutParser, a toolkit for analysing document images. It addresses the challenges posed by diverse document layouts and printing styles, eliminating the need for time-consuming feature engineering. The method recognises vertical text, identifies column structures, and uses vertical positions to determine layout types.'

In [47]:
rag_chain.invoke("What are some of the major advantages?")

'LayoutParser has many advantages, including the ability to create large-scale and lightweight document digitization pipelines with ease. It also supports training customized layout models and community sharing of those models. This makes it a versatile tool with numerous potential applications.'

In [48]:
rag_chain.invoke("How long is the paper?")

"The paper being discussed appears to be eight rows long, although the exact dimensions are not stated. It seems to be written in a vertical style common in Japanese text. The paper's layout is complex, with variable width columns and row segmentation, which makes it difficult to pinpoint the exact length."

In [26]:
rag_chain.invoke("What are the major use cases?")

"LayoutParser's main use case is to simplify the creation of document digitization pipelines, promoting reusability and reproducibility. These pipelines can be shared, discussed, and applied to solve specific problems related to document image analysis (DIA) tasks. LayoutParser also enables the community to share pre-trained models and various datasets."

In [27]:
rag_chain.invoke("What is Deep learning?")

'Shen_Zejiang@hotmail.com)\nYuliang Liu2(  liu.yuliang@gmail.com)* \n1 Zhejiang University, Hangzhou, China \n2 Microsoft Research, Redmond, WA, USA \n\nDeep learning is a subset of machine learning, which in turn is a subset of artificial intelligence. \nIt enables computational models that are able to learn from data, \nautomatically extracting relevant features for recognition or prediction tasks without being explicitly programmed.\nAnswer: Deep learning is a branch of artificial intelligence that enables models to learn from data. These models can then be used for recognition or prediction, without being explicitly programmed for a specific task. This makes deep learning particularly useful for image analysis.'

In [28]:
rag_chain.invoke("What is the purpose of life?")

"I'm sorry, I couldn't find the answer to this based on the context provided. However, based on the larger context of the text, which appears to be a document about a toolkit for document image analysis (DIA), the purpose of life might be interpreted as the purpose of the 'life' or existence of the toolkit. In which case, the purpose could be interpreted as aiding and improving the process of digitisation."

In [29]:
rag_chain.invoke("How do you perceive intelligence?")

'ShenZ17@pku.edu.cn )  ,  Meng Wang2(  mwang@cs.brown.edu  )\n1School of Electronic Engineering and Computer Science, Peking University, Beijing, China ;  2Department of Computer Science, Brown University, Providence, RI, USA\nAbstract—Document image analysis aims to develop automated systems that can understand and extract information from document images. Recent advances in deep learning have shown the great potential of end-to-end trainable models for this task. However, existing approaches require laborious annotations, and the models are usually tailored for specific tasks, which limits their generality and flexibility. To facilitate research on this topic, we propose LayoutParser, a code-through toolkit that unifies several state-of-the-art models for document image analysis. LayoutParser provides a wide range of pre-built components for document layout analysis, including text detection, text recognition, table detection, and image-text relationship modeling. With LayoutParser, 

In [32]:
rag_chain.invoke("Act as a Japanese Deep Learning Engineer, interpret the document for your colleagues, and answer in Japanese")

'ShenZ@nju.edu.cn ) , Yuxin Wu1,2 , Wenjie Li1,2 ,\nShuang Chen1,2 , Jian Yang1 , Xuelong Li1 ,2 3\n1 School of Computer Science and Engineering, Nanjing University, China\n2 Jiangsu Key Laboratory of Big Data Analysis and Decision Making, Nanjing\nUniversity, China\n3 MOE Key Laboratory of Digital Publishing Technology, Peking University, China\n\n\nAnswer in Japanese: ディープラーニングのモデル開発を簡素化するために努力しているにもかかわらず、現在利用可能なツールは、文書イメージ分析（DIA）の課題に最適化されていません。 このギャップを埋めるために、LayoutParserというオープンソースライブラリが導入されています。LayoutParserは、DLをDIA研究やアプリケーションに簡素化するために設計されたユニークなツールキットです。'