## Installing Packages

1. **tqdm:** A library that shows the progress of an action (downloading, training, ...).
2. **jq:** A lightweight and flexible JSON processor.
3. **unstructured:** A library that prepares raw documents for downstream ML tasks.
4. **pypdf:** A pure-Python PDF library capable of splitting, merging, cropping, and transforming PDF files.
5. **tiktoken:** A fast open-source tokenizer by OpenAI.

In [1]:
# %pip install tqdm jq unstructured pypdf tiktoken

Collecting jq
  Obtaining dependency information for jq from https://files.pythonhosted.org/packages/1c/07/cd009fd872e62192493c2082604d6bb0e513d41d88649d2ecc67eddf0557/jq-1.6.0-cp311-cp311-macosx_11_0_arm64.whl.metadata
  Downloading jq-1.6.0-cp311-cp311-macosx_11_0_arm64.whl.metadata (6.8 kB)
Collecting unstructured
  Obtaining dependency information for unstructured from https://files.pythonhosted.org/packages/c8/3a/e74f3a33685d8421840311766cb11ccbafa06368c1c8222543accb86ee9c/unstructured-0.12.5-py3-none-any.whl.metadata
  Downloading unstructured-0.12.5-py3-none-any.whl.metadata (26 kB)
Collecting filetype (from unstructured)
  Obtaining dependency information for filetype from https://files.pythonhosted.org/packages/18/79/1b8fa1bb3568781e84c9200f951c735f3f157429f44be0495da55894d620/filetype-1.2.0-py2.py3-none-any.whl.metadata
  Downloading filetype-1.2.0-py2.py3-none-any.whl.metadata (6.5 kB)
Collecting python-magic (from unstructured)
  Obtaining dependency information for python-

  Downloading ordered_set-4.1.0-py3-none-any.whl.metadata (5.3 kB)
Downloading jq-1.6.0-cp311-cp311-macosx_11_0_arm64.whl (412 kB)
Downloading unstructured-0.12.5-py3-none-any.whl (1.8 MB)
Downloading unstructured_client-0.21.1-py3-none-any.whl (28 kB)
Downloading emoji-2.10.1-py2.py3-none-any.whl (421 kB)
Downloading filetype-1.2.0-py2.py3-none-any.whl (19 kB)
Downloading python_iso639-2024.2.7-py3-none-any.whl (274 kB)
Do

In [1]:
# Loading documents
from langchain.document_loaders import (
    UnstructuredCSVLoader,
    UnstructuredHTMLLoader,
    UnstructuredImageLoader,
    PythonLoader,
    PyPDFLoader,
    JSONLoader,
)
from langchain.document_loaders.csv_loader import CSVLoader

# CSVLoader creates one Document per CSV row
csv_loader = CSVLoader('olympic_athletes.csv')

# load() reads the whole file and returns a list of Document objects
olympic_data = csv_loader.load()

In [2]:
olympic_data

[Document(page_content="ID: 1\nName: A Dijiang\nSex: M\nAge: 24\nHeight: 180\nWeight: 80\nTeam: China\nNOC: CHN\nGames: 1992 Summer\nYear: 1992\nSeason: Summer\nCity: Barcelona\nSport: Basketball\nEvent: Basketball Men's Basketball\nMedal: NA", metadata={'source': 'olympic_athletes.csv', 'row': 0}),
 Document(page_content="ID: 2\nName: A Lamusi\nSex: M\nAge: 23\nHeight: 170\nWeight: 60\nTeam: China\nNOC: CHN\nGames: 2012 Summer\nYear: 2012\nSeason: Summer\nCity: London\nSport: Judo\nEvent: Judo Men's Extra-Lightweight\nMedal: NA", metadata={'source': 'olympic_athletes.csv', 'row': 1}),
 Document(page_content="ID: 3\nName: Gunnar Nielsen Aaby\nSex: M\nAge: 24\nHeight: NA\nWeight: NA\nTeam: Denmark\nNOC: DEN\nGames: 1920 Summer\nYear: 1920\nSeason: Summer\nCity: Antwerpen\nSport: Football\nEvent: Football Men's Football\nMedal: NA", metadata={'source': 'olympic_athletes.csv', 'row': 2}),
 Document(page_content="ID: 4\nName: Edgar Lindenau Aabye\nSex: M\nAge: 34\nHeight: NA\nWeight: NA\nT

In [3]:
len(olympic_data)

271116

In [4]:
# getting the first element of the data
olympic_data[0]

Document(page_content="ID: 1\nName: A Dijiang\nSex: M\nAge: 24\nHeight: 180\nWeight: 80\nTeam: China\nNOC: CHN\nGames: 1992 Summer\nYear: 1992\nSeason: Summer\nCity: Barcelona\nSport: Basketball\nEvent: Basketball Men's Basketball\nMedal: NA", metadata={'source': 'olympic_athletes.csv', 'row': 0})

In [5]:
olympic_data[0].page_content

"ID: 1\nName: A Dijiang\nSex: M\nAge: 24\nHeight: 180\nWeight: 80\nTeam: China\nNOC: CHN\nGames: 1992 Summer\nYear: 1992\nSeason: Summer\nCity: Barcelona\nSport: Basketball\nEvent: Basketball Men's Basketball\nMedal: NA"

In [6]:
# getting a gist of data
import pandas as pd

df = pd.read_csv('olympic_athletes.csv')

df.head(5)

| | ID | Name | Sex | Age | Height | Weight | Team | NOC | Games | Year | Season | City | Sport | Event | Medal |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | A Dijiang | M | 24.0 | 180.0 | 80.0 | China | CHN | 1992 Summer | 1992 | Summer | Barcelona | Basketball | Basketball Men's Basketball | |
| 1 | 2 | A Lamusi | M | 23.0 | 170.0 | 60.0 | China | CHN | 2012 Summer | 2012 | Summer | London | Judo | Judo Men's Extra-Lightweight | |
| 2 | 3 | Gunnar Nielsen Aaby | M | 24.0 | | | Denmark | DEN | 1920 Summer | 1920 | Summer | Antwerpen | Football | Football Men's Football | |
| 3 | 4 | Edgar Lindenau Aabye | M | 34.0 | | | Denmark/Sweden | DEN | 1900 Summer | 1900 | Summer | Paris | Tug-Of-War | Tug-Of-War Men's Tug-Of-War | Gold |
| 4 | 5 | Christine Jacoba Aaftink | F | 21.0 | 185.0 | 82.0 | Netherlands | NED | 1988 Winter | 1988 | Winter | Calgary | Speed Skating | Speed Skating Women's 500 metres | |


In [7]:
# loading a PDF file

pdf_loader = PyPDFLoader('document.pdf')

# by default load_and_split() uses RecursiveCharacterTextSplitter
pdf_data = pdf_loader.load_and_split()  # loads the PDF and splits it into smaller chunks that we can pass to an LLM

In [8]:
pdf_data

[Document(page_content='Sparks of Artiﬁcial General Intelligence:\nEarly experiments with GPT-4\nS´ ebastien Bubeck Varun Chandrasekaran Ronen Eldan Johannes Gehrke\nEric Horvitz Ece Kamar Peter Lee Yin Tat Lee Yuanzhi Li Scott Lundberg\nHarsha Nori Hamid Palangi Marco Tulio Ribeiro Yi Zhang\nMicrosoft Research\nAbstract\nArtiﬁcial intelligence (AI) researchers have been developing and reﬁning large language models (LLMs)\nthat exhibit remarkable capabilities across a variety of domains and tasks, challenging our understanding\nof learning and cognition. The latest model developed by OpenAI, GPT-4 [Ope23], was trained using an\nunprecedented scale of compute and data. In this paper, we report on our investigation of an early version\nof GPT-4, when it was still in active development by OpenAI. We contend that (this early version of) GPT-\n4 is part of a new cohort of LLMs (along with ChatGPT and Google’s PaLM for example) that exhibit\nmore general intelligence than previous AI models.

In [9]:
pdf_data[0]

Document(page_content='Sparks of Artiﬁcial General Intelligence:\nEarly experiments with GPT-4\nS´ ebastien Bubeck Varun Chandrasekaran Ronen Eldan Johannes Gehrke\nEric Horvitz Ece Kamar Peter Lee Yin Tat Lee Yuanzhi Li Scott Lundberg\nHarsha Nori Hamid Palangi Marco Tulio Ribeiro Yi Zhang\nMicrosoft Research\nAbstract\nArtiﬁcial intelligence (AI) researchers have been developing and reﬁning large language models (LLMs)\nthat exhibit remarkable capabilities across a variety of domains and tasks, challenging our understanding\nof learning and cognition. The latest model developed by OpenAI, GPT-4 [Ope23], was trained using an\nunprecedented scale of compute and data. In this paper, we report on our investigation of an early version\nof GPT-4, when it was still in active development by OpenAI. We contend that (this early version of) GPT-\n4 is part of a new cohort of LLMs (along with ChatGPT and Google’s PaLM for example) that exhibit\nmore general intelligence than previous AI models. 

In [10]:
pdf_data[0].page_content

'Sparks of Artiﬁcial General Intelligence:\nEarly experiments with GPT-4\nS´ ebastien Bubeck Varun Chandrasekaran Ronen Eldan Johannes Gehrke\nEric Horvitz Ece Kamar Peter Lee Yin Tat Lee Yuanzhi Li Scott Lundberg\nHarsha Nori Hamid Palangi Marco Tulio Ribeiro Yi Zhang\nMicrosoft Research\nAbstract\nArtiﬁcial intelligence (AI) researchers have been developing and reﬁning large language models (LLMs)\nthat exhibit remarkable capabilities across a variety of domains and tasks, challenging our understanding\nof learning and cognition. The latest model developed by OpenAI, GPT-4 [Ope23], was trained using an\nunprecedented scale of compute and data. In this paper, we report on our investigation of an early version\nof GPT-4, when it was still in active development by OpenAI. We contend that (this early version of) GPT-\n4 is part of a new cohort of LLMs (along with ChatGPT and Google’s PaLM for example) that exhibit\nmore general intelligence than previous AI models. We discuss the rising 

In [11]:
len(pdf_data)

192

The whole PDF has been divided into 192 chunks. Each chunk is a `Document` whose `page_content` holds the text of that chunk.

The chunks are produced by splitting the text on separator characters, and we can customize how the splitting is done with text splitters.

In [12]:
from langchain.text_splitter import(
    CharacterTextSplitter,
    RecursiveCharacterTextSplitter
)

# CharacterTextSplitter splits on '\n\n'
# '\n\n' - Double new line or most commonly Paragraph break points

# pros: Easy and Simple
# Cons: - Very rigid
#       - Doesn't take into account the structure of given data

character_splitter = CharacterTextSplitter(
    chunk_size = 1000, # how many characters will be in a chunk
    chunk_overlap = 0, # how many characters will overlap in chunks (based on your preferences)
)

# RecursiveCharacterTextSplitter splits on ['\n\n', '\n', ' ', '']
# '\n' - New Lines
# ' ' - Spaces
# '' - Characters
recursive_splitter = RecursiveCharacterTextSplitter(
    chunk_size = 1000,
    chunk_overlap = 0 
)


# creating two different datasets from the same PDF file
pdf_data_character_splitter = pdf_loader.load_and_split(
    text_splitter = character_splitter
)

pdf_data_recursive_splitter = pdf_loader.load_and_split(
    text_splitter = recursive_splitter
)
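
To make `chunk_size` and `chunk_overlap` concrete, here is a minimal pure-Python sketch of fixed-size chunking with overlap. It is only an illustration of the two parameters, not how LangChain's splitters work internally (they split on separators first), and `sliding_chunks` is a hypothetical helper name.

```python
def sliding_chunks(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Cut text into windows of chunk_size characters that share chunk_overlap characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        # stop once the current window already reaches the end of the text
        if start + chunk_size >= len(text):
            break
        start += step
    return chunks


print(sliding_chunks("abcdefghij", chunk_size=4, chunk_overlap=0))
# ['abcd', 'efgh', 'ij']
print(sliding_chunks("abcdefghij", chunk_size=4, chunk_overlap=2))
# ['abcd', 'cdef', 'efgh', 'ghij']
```

With an overlap of 0 every character appears in exactly one chunk; with an overlap of 2 each chunk repeats the last two characters of the previous one, which can help keep context across chunk boundaries.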

In [13]:
pdf_data_character_splitter

[Document(page_content='Sparks of Artiﬁcial General Intelligence:\nEarly experiments with GPT-4\nS´ ebastien Bubeck Varun Chandrasekaran Ronen Eldan Johannes Gehrke\nEric Horvitz Ece Kamar Peter Lee Yin Tat Lee Yuanzhi Li Scott Lundberg\nHarsha Nori Hamid Palangi Marco Tulio Ribeiro Yi Zhang\nMicrosoft Research\nAbstract\nArtiﬁcial intelligence (AI) researchers have been developing and reﬁning large language models (LLMs)\nthat exhibit remarkable capabilities across a variety of domains and tasks, challenging our understanding\nof learning and cognition. The latest model developed by OpenAI, GPT-4 [Ope23], was trained using an\nunprecedented scale of compute and data. In this paper, we report on our investigation of an early version\nof GPT-4, when it was still in active development by OpenAI. We contend that (this early version of) GPT-\n4 is part of a new cohort of LLMs (along with ChatGPT and Google’s PaLM for example) that exhibit\nmore general intelligence than previous AI models.

In [14]:
len(pdf_data_character_splitter)

155

In [15]:
pdf_data_recursive_splitter

[Document(page_content='Sparks of Artiﬁcial General Intelligence:\nEarly experiments with GPT-4\nS´ ebastien Bubeck Varun Chandrasekaran Ronen Eldan Johannes Gehrke\nEric Horvitz Ece Kamar Peter Lee Yin Tat Lee Yuanzhi Li Scott Lundberg\nHarsha Nori Hamid Palangi Marco Tulio Ribeiro Yi Zhang\nMicrosoft Research\nAbstract\nArtiﬁcial intelligence (AI) researchers have been developing and reﬁning large language models (LLMs)\nthat exhibit remarkable capabilities across a variety of domains and tasks, challenging our understanding\nof learning and cognition. The latest model developed by OpenAI, GPT-4 [Ope23], was trained using an\nunprecedented scale of compute and data. In this paper, we report on our investigation of an early version\nof GPT-4, when it was still in active development by OpenAI. We contend that (this early version of) GPT-\n4 is part of a new cohort of LLMs (along with ChatGPT and Google’s PaLM for example) that exhibit', metadata={'source': 'document.pdf', 'page': 0}),


In [16]:
len(pdf_data_recursive_splitter)

585

In [17]:
pdf_data_character_splitter[0].page_content

'Sparks of Artiﬁcial General Intelligence:\nEarly experiments with GPT-4\nS´ ebastien Bubeck Varun Chandrasekaran Ronen Eldan Johannes Gehrke\nEric Horvitz Ece Kamar Peter Lee Yin Tat Lee Yuanzhi Li Scott Lundberg\nHarsha Nori Hamid Palangi Marco Tulio Ribeiro Yi Zhang\nMicrosoft Research\nAbstract\nArtiﬁcial intelligence (AI) researchers have been developing and reﬁning large language models (LLMs)\nthat exhibit remarkable capabilities across a variety of domains and tasks, challenging our understanding\nof learning and cognition. The latest model developed by OpenAI, GPT-4 [Ope23], was trained using an\nunprecedented scale of compute and data. In this paper, we report on our investigation of an early version\nof GPT-4, when it was still in active development by OpenAI. We contend that (this early version of) GPT-\n4 is part of a new cohort of LLMs (along with ChatGPT and Google’s PaLM for example) that exhibit\nmore general intelligence than previous AI models. We discuss the rising 

In [18]:
len(pdf_data_character_splitter[0].page_content)

3395

In [19]:
pdf_data_recursive_splitter[0].page_content

'Sparks of Artiﬁcial General Intelligence:\nEarly experiments with GPT-4\nS´ ebastien Bubeck Varun Chandrasekaran Ronen Eldan Johannes Gehrke\nEric Horvitz Ece Kamar Peter Lee Yin Tat Lee Yuanzhi Li Scott Lundberg\nHarsha Nori Hamid Palangi Marco Tulio Ribeiro Yi Zhang\nMicrosoft Research\nAbstract\nArtiﬁcial intelligence (AI) researchers have been developing and reﬁning large language models (LLMs)\nthat exhibit remarkable capabilities across a variety of domains and tasks, challenging our understanding\nof learning and cognition. The latest model developed by OpenAI, GPT-4 [Ope23], was trained using an\nunprecedented scale of compute and data. In this paper, we report on our investigation of an early version\nof GPT-4, when it was still in active development by OpenAI. We contend that (this early version of) GPT-\n4 is part of a new cohort of LLMs (along with ChatGPT and Google’s PaLM for example) that exhibit'

In [20]:
len(pdf_data_recursive_splitter[0].page_content)

912

The first chunk from CharacterTextSplitter has 3395 characters, well over the configured `chunk_size` of 1000, whereas the first chunk from RecursiveCharacterTextSplitter has 912 characters, which is within the limit. CharacterTextSplitter splits only on '\n\n', so a stretch of text without a paragraph break cannot be cut any further, while RecursiveCharacterTextSplitter falls back to finer separators until the chunk fits. Keeping chunks within a predictable size matters because each chunk is passed to an LLM, whose context window is finite. If the input exceeds the context window, the request fails with an error.
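
The difference between the two splitters can be reproduced with a simplified pure-Python sketch. This is not LangChain's actual implementation; `split_on_separator` and `split_recursively` are hypothetical helpers that only mimic the idea of "one separator" versus "fall back to finer separators".

```python
def split_on_separator(text: str, separator: str, chunk_size: int) -> list[str]:
    """Split on one separator, then greedily merge pieces up to chunk_size.

    A piece that is itself longer than chunk_size is kept whole,
    because there is no finer separator to fall back on.
    """
    chunks, current = [], ""
    for piece in text.split(separator):
        candidate = piece if not current else current + separator + piece
        if len(candidate) <= chunk_size:
            current = candidate
        else:
            if current:
                chunks.append(current)
            current = piece
    if current:
        chunks.append(current)
    return chunks


def split_recursively(text: str, separators: list[str], chunk_size: int) -> list[str]:
    """Try each separator in turn until every chunk fits in chunk_size."""
    if len(text) <= chunk_size:
        return [text] if text else []
    if not separators or separators[0] == "":
        # last resort: hard-cut into fixed-size slices of characters
        return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]
    head, rest = separators[0], separators[1:]
    chunks = []
    for chunk in split_on_separator(text, head, chunk_size):
        if len(chunk) <= chunk_size:
            chunks.append(chunk)
        else:
            chunks.extend(split_recursively(chunk, rest, chunk_size))
    return chunks


# One long "paragraph" without any blank line, followed by a short one.
sample = ("word " * 60).strip() + "\n\n" + "short paragraph"

simple_chunks = split_on_separator(sample, "\n\n", chunk_size=100)
recursive_chunks = split_recursively(sample, ["\n\n", "\n", " ", ""], chunk_size=100)

print([len(c) for c in simple_chunks])        # [299, 15]
print(max(len(c) for c in recursive_chunks))  # 99
```

The single-separator version returns a 299-character chunk even though `chunk_size` is 100, while the recursive version keeps every chunk at or below the limit by splitting the long paragraph on spaces.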

In [27]:
# load contents from the directory
import os
from langchain.document_loaders import DirectoryLoader

mixed_loader = DirectoryLoader(
    path = os.path.join(os.getcwd(), 'content_dir'),
    use_multithreading = True,
    show_progress = True
)

In [28]:
mixed_data = mixed_loader.load_and_split()

  0%|          | 0/3 [00:00<?, ?it/s]
Error loading file /Users/shuvobarman/content_dir/ESLII.pdf
100%|██████████| 3/3 [00:00<00:00,  8.95it/s]


In [29]:
mixed_data

[Document(page_content='"company: Apple Inc.\\n filing date: 2023-11-02T20:30:32Z\\n text snippet: a8-kex991q4202309302023.htm  Document  Apple reports fourth quarter results  iPhone revenue sets September quarter record  Services revenue reaches new all-time high  CUPERTINO, CALIFORNIA — Apple® today announced financial results for its fiscal 2023 fourth quarter ended September 30, 2023. The Company posted quarterly revenue of $89.5 billion, down 1 percent year over year, and quarterly earnings per diluted share of $1.46, up 13 percent year over year. “Today Apple is pleased to report a September quarter revenue record for iPhone and an all-time revenue record in Services,” said Tim Cook, Apple’s CEO. “We now have our strongest lineup of products ever heading into the holiday season, including the iPhone 15 lineup and our first carbon neutral Apple Watch models, a major milestone in our efforts to make all Apple products carbon neutral by 2030.”\\n\\na8-kex991q4202309302023.htm  Docum

In [30]:
len(mixed_data)

14

In [31]:
mixed_data[0]

Document(page_content='"company: Apple Inc.\\n filing date: 2023-11-02T20:30:32Z\\n text snippet: a8-kex991q4202309302023.htm  Document  Apple reports fourth quarter results  iPhone revenue sets September quarter record  Services revenue reaches new all-time high  CUPERTINO, CALIFORNIA — Apple® today announced financial results for its fiscal 2023 fourth quarter ended September 30, 2023. The Company posted quarterly revenue of $89.5 billion, down 1 percent year over year, and quarterly earnings per diluted share of $1.46, up 13 percent year over year. “Today Apple is pleased to report a September quarter revenue record for iPhone and an all-time revenue record in Services,” said Tim Cook, Apple’s CEO. “We now have our strongest lineup of products ever heading into the holiday season, including the iPhone 15 lineup and our first carbon neutral Apple Watch models, a major milestone in our efforts to make all Apple products carbon neutral by 2030.”\\n\\na8-kex991q4202309302023.htm  Docume

## Text Summarization

An LLM can generate a summary of a given text. The basic idea is:

**Data -> LLM (Summarize) -> Text Summary**

However, we cannot always pass the whole text to the LLM, because the size of its context window is limited.
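
To get a feel for this limit, the sketch below estimates whether a set of chunks fits into a token budget. Every number in it is an assumption made for illustration: roughly 4 characters per token is only a rule of thumb for English text, and the context window and output reserve are hypothetical values. A real tokenizer such as tiktoken would be needed for exact counts.

```python
import math

CHARS_PER_TOKEN = 4            # rough rule of thumb for English text (assumption)
CONTEXT_WINDOW_TOKENS = 4096   # hypothetical context window size
RESERVED_FOR_OUTPUT = 500      # hypothetical room kept for the model's answer


def estimate_tokens(text: str) -> int:
    """Very rough token estimate based on the character count."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)


def fits_in_context(texts: list[str]) -> bool:
    """Check whether all texts together stay within the assumed token budget."""
    budget = CONTEXT_WINDOW_TOKENS - RESERVED_FOR_OUTPUT
    return sum(estimate_tokens(t) for t in texts) <= budget


chunk = "x" * 1000  # a chunk of 1000 characters, roughly 250 tokens

print(estimate_tokens(chunk))         # 250
print(fits_in_context([chunk] * 3))   # True: about 750 tokens
print(fits_in_context([chunk] * 20))  # False: about 5000 tokens
```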


### Text Summarization Techniques:

#### Map-Reduce Strategy

**Steps:**
1. Split the data into chunks.
2. Pass each chunk to the LLM and get a summary of each chunk separately.
3. Combine all the chunk summaries and pass them to the LLM again.
4. The result is a summary of the combined chunk summaries.
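
The steps above can be sketched in plain Python. `fake_summarize` is a hypothetical stand-in for an LLM call (it just keeps the first sentence of each paragraph), so the example shows only the control flow of map-reduce, not real summarization.

```python
from typing import Callable


def fake_summarize(text: str) -> str:
    """Stand-in for an LLM call: keep the first sentence of each paragraph."""
    paragraphs = [p for p in text.split("\n\n") if p.strip()]
    return " ".join(p.split(".")[0].strip() + "." for p in paragraphs)


def map_reduce_summary(chunks: list[str], summarize: Callable[[str], str]) -> str:
    # Map step: summarize every chunk independently.
    chunk_summaries = [summarize(chunk) for chunk in chunks]
    # Reduce step: combine the chunk summaries and summarize them once more.
    combined = "\n\n".join(chunk_summaries)
    return summarize(combined)


chunks = [
    "Cats sleep a lot. They also purr.",
    "Dogs bark. They like to fetch.",
]
print(map_reduce_summary(chunks, fake_summarize))
# Cats sleep a lot. Dogs bark.
```

Because the map step treats each chunk independently, those calls could run in parallel; only the final reduce call needs all chunk summaries at once.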


#### Refine Strategy

**Steps:**
1. Split the data into chunks.
2. Pass the first chunk (the initial chunk) to the LLM, which creates a summary based on it.
3. Combine the second chunk with the summary of the first chunk and pass both to the LLM to get an updated summary.
4. Repeat step 3 for each remaining chunk, combining it with the summary produced so far, until the last chunk is reached.
5. The final summary reflects the content of all the chunks.
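
As a rough sketch of these steps, the loop below carries a running summary from chunk to chunk. `first_sentence` and `fake_refine` are hypothetical stand-ins for LLM calls, so only the control flow is meaningful here.

```python
from typing import Callable


def first_sentence(text: str) -> str:
    """Stand-in for an LLM summary: keep only the first sentence."""
    return text.split(".")[0].strip() + "."


def fake_refine(existing_summary: str, new_chunk: str) -> str:
    """Stand-in for an LLM refine call: extend the summary with the new chunk."""
    return existing_summary + " " + first_sentence(new_chunk)


def refine_summary(
    chunks: list[str],
    summarize: Callable[[str], str],
    refine: Callable[[str, str], str],
) -> str:
    if not chunks:
        return ""
    # Initial summary from the first chunk.
    summary = summarize(chunks[0])
    # Fold every remaining chunk into the running summary, one at a time.
    for chunk in chunks[1:]:
        summary = refine(summary, chunk)
    return summary


chunks = [
    "Cats sleep a lot. They also purr.",
    "Dogs bark. They like to fetch.",
    "Birds sing. They fly south.",
]
print(refine_summary(chunks, first_sentence, fake_refine))
# Cats sleep a lot. Dogs bark. Birds sing.
```

Unlike map-reduce, each call here depends on the result of the previous one, so the calls have to run sequentially.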


In [32]:
# defining openai_api_key
import os

os.environ['OPENAI_API_KEY'] = 'YOUR_API_KEY'

## Summarizing

### The 'stuff' chain

In [36]:
# importing necessary libraries
from langchain.chains.summarize import load_summarize_chain
from langchain_openai import ChatOpenAI

# instantiate llm
llm = ChatOpenAI()

stuff_chain = load_summarize_chain(
    llm = llm,
    chain_type = 'stuff'  # simplest chain: all documents are placed into a single prompt
)


stuff_chain.invoke(pdf_data[:5])

{'input_documents': [Document(page_content='Sparks of Artiﬁcial General Intelligence:\nEarly experiments with GPT-4\nS´ ebastien Bubeck Varun Chandrasekaran Ronen Eldan Johannes Gehrke\nEric Horvitz Ece Kamar Peter Lee Yin Tat Lee Yuanzhi Li Scott Lundberg\nHarsha Nori Hamid Palangi Marco Tulio Ribeiro Yi Zhang\nMicrosoft Research\nAbstract\nArtiﬁcial intelligence (AI) researchers have been developing and reﬁning large language models (LLMs)\nthat exhibit remarkable capabilities across a variety of domains and tasks, challenging our understanding\nof learning and cognition. The latest model developed by OpenAI, GPT-4 [Ope23], was trained using an\nunprecedented scale of compute and data. In this paper, we report on our investigation of an early version\nof GPT-4, when it was still in active development by OpenAI. We contend that (this early version of) GPT-\n4 is part of a new cohort of LLMs (along with ChatGPT and Google’s PaLM for example) that exhibit\nmore general intelligence than

In [38]:
stuff_chain.invoke(olympic_data[:5])

{'input_documents': [Document(page_content="ID: 1\nName: A Dijiang\nSex: M\nAge: 24\nHeight: 180\nWeight: 80\nTeam: China\nNOC: CHN\nGames: 1992 Summer\nYear: 1992\nSeason: Summer\nCity: Barcelona\nSport: Basketball\nEvent: Basketball Men's Basketball\nMedal: NA", metadata={'source': 'olympic_athletes.csv', 'row': 0}),
  Document(page_content="ID: 2\nName: A Lamusi\nSex: M\nAge: 23\nHeight: 170\nWeight: 60\nTeam: China\nNOC: CHN\nGames: 2012 Summer\nYear: 2012\nSeason: Summer\nCity: London\nSport: Judo\nEvent: Judo Men's Extra-Lightweight\nMedal: NA", metadata={'source': 'olympic_athletes.csv', 'row': 1}),
  Document(page_content="ID: 3\nName: Gunnar Nielsen Aaby\nSex: M\nAge: 24\nHeight: NA\nWeight: NA\nTeam: Denmark\nNOC: DEN\nGames: 1920 Summer\nYear: 1920\nSeason: Summer\nCity: Antwerpen\nSport: Football\nEvent: Football Men's Football\nMedal: NA", metadata={'source': 'olympic_athletes.csv', 'row': 2}),
  Document(page_content="ID: 4\nName: Edgar Lindenau Aabye\nSex: M\nAge: 34\nHe

In [39]:
stuff_chain.run(pdf_data[:5])

'The paper discusses early experiments with GPT-4, a large language model developed by OpenAI, which shows promising capabilities in various domains beyond language mastery. The model exhibits general intelligence and can solve tasks in mathematics, coding, vision, medicine, law, and psychology without specific prompting. The paper explores the limitations of GPT-4 and discusses the challenges in advancing towards more comprehensive artificial general intelligence systems. It also highlights the societal impacts and future research directions in the field of artificial intelligence.'

From the code above we can see that both `invoke()` and `run()` return the summary of the data, but in different forms.

`invoke()` - returns a dictionary in which the `input_documents` key holds the documents that were passed to the LLM for summarization and the `output_text` key holds the summary.

`run()` - returns the summary text directly.
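
As a small illustration of the two return shapes, the dictionary below is a hand-written stand-in for what `invoke()` returns. The values are placeholders, not real chain output.

```python
# Hand-written stand-in for the dictionary returned by invoke();
# the values are placeholders, not real chain output.
invoke_result = {
    "input_documents": ["<Document 1>", "<Document 2>"],
    "output_text": "A short summary of the documents.",
}

# With invoke(), the summary is read from the 'output_text' key.
summary_from_invoke = invoke_result["output_text"]

# run() would hand back only that string.
summary_from_run = "A short summary of the documents."

print(summary_from_invoke == summary_from_run)  # True
```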


## Custom Prompt

Here we inspect the prompt that the chain uses to produce its output, and then customize it to fit our needs.

In [40]:
stuff_chain

StuffDocumentsChain(llm_chain=LLMChain(prompt=PromptTemplate(input_variables=['text'], template='Write a concise summary of the following:\n\n\n"{text}"\n\n\nCONCISE SUMMARY:'), llm=ChatOpenAI(client=<openai.resources.chat.completions.Completions object at 0x29e74a8d0>, async_client=<openai.resources.chat.completions.AsyncCompletions object at 0x29e93ae90>, openai_api_key=SecretStr('**********'), openai_proxy='')), document_variable_name='text')

In [41]:
# print the prompt template used by the chain
print(stuff_chain.llm_chain.prompt.template)

Write a concise summary of the following:


"{text}"


CONCISE SUMMARY:
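
The template has a single `{text}` placeholder. A minimal sketch of how a 'stuff'-style prompt could be assembled from several chunks is shown below. The `build_stuff_prompt` helper is hypothetical, and joining the chunks with a blank line is an assumption about the separator, not a statement about LangChain's internals.

```python
# Template copied from the printed output above.
TEMPLATE = 'Write a concise summary of the following:\n\n\n"{text}"\n\n\nCONCISE SUMMARY:'


def build_stuff_prompt(page_contents: list[str],
                       template: str = TEMPLATE,
                       separator: str = "\n\n") -> str:
    """Join all chunk texts and place them into the {text} placeholder."""
    return template.format(text=separator.join(page_contents))


prompt = build_stuff_prompt(["First chunk.", "Second chunk."])
print(prompt)
```

Since every chunk ends up in the same prompt, this approach only works while the combined text still fits into the model's context window.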


In [43]:
# creating custom template
from langchain.prompts import PromptTemplate

template = '''
Write a concise summary of the following in French:


"{text}"


CONCISE SUMMARY IN FRENCH:
'''

customized_prompt = PromptTemplate.from_template(template)

# chain_type is not given here, so the default 'stuff' chain is used
stuff_chain_customized = load_summarize_chain(
    llm = llm,
    prompt = customized_prompt
)


stuff_chain_customized.run(pdf_data[:2])

'"Étincelles d\'Intelligence Artificielle Générale : Premières expériences avec GPT-4. Les chercheurs en intelligence artificielle ont développé un modèle de langage large (LLM) nommé GPT-4, montrant des capacités remarquables dans divers domaines. Ce modèle semble démontrer une intelligence plus générale que les modèles précédents et est capable de résoudre des tâches complexes sans aucune incitation spécifique. L\'étude explore les capacités de GPT-4 dans des domaines variés tels que les mathématiques, la programmation, la vision, la médecine, le droit et la psychologie, notant des performances proches de celles humaines. Les chercheurs considèrent GPT-4 comme une version précoce d\'un système d\'intelligence artificielle générale (AGI) et soulignent les défis à relever pour avancer vers des versions plus complètes d\'AGI. Ils concluent en réfléchissant sur les implications sociétales de cette avancée technologique."'

## The Map-reduce Chain

It is well suited to summarizing a few pages of text.

In [44]:
map_reduce_chain = load_summarize_chain(
    llm = llm,
    chain_type = 'map_reduce',
    verbose = True
)

map_reduce_chain.run(pdf_data[:20])



> Entering new MapReduceDocumentsChain chain...


> Entering new LLMChain chain...
Prompt after formatting:
Write a concise summary of the following:


"Sparks of Artiﬁcial General Intelligence:
Early experiments with GPT-4
S´ ebastien Bubeck Varun Chandrasekaran Ronen Eldan Johannes Gehrke
Eric Horvitz Ece Kamar Peter Lee Yin Tat Lee Yuanzhi Li Scott Lundberg
Harsha Nori Hamid Palangi Marco Tulio Ribeiro Yi Zhang
Microsoft Research
Abstract
Artiﬁcial intelligence (AI) researchers have been developing and reﬁning large language models (LLMs)
that exhibit remarkable capabilities across a variety of domains and tasks, challenging our understanding
of learning and cognition. The latest model developed by OpenAI, GPT-4 [Ope23], was trained using an
unprecedented scale of compute and data. In this paper, we report on our investigation of an early version
of GPT-4, when it was still in active development by OpenAI. We contend that (this early version of) GPT-
4


> Finished chain.


> Entering new LLMChain chain...
Prompt after formatting:
Write a concise summary of the following:


"This paper explores the capabilities of an early version of GPT-4, a large language model developed by OpenAI. The study suggests that GPT-4, along with other models like ChatGPT and Google's PaLM, exhibit more general intelligence than previous AI models. GPT-4 shows proficiency in various tasks beyond language, such as mathematics, coding, vision, medicine, law, and psychology, approaching or exceeding human-level performance. The authors believe that GPT-4 could be considered an early version of artificial general intelligence (AGI). The paper also discusses the limitations of GPT-4 and the challenges in advancing towards deeper and more comprehensive AGI systems.

This document discusses various aspects of GPT-4's mathematical abilities, interactions with the world and humans, discriminative capabilities, limitations, societal infl


> Finished chain.

> Finished chain.


"The paper explores the capabilities of GPT-4, a large language model developed by OpenAI, suggesting it exhibits more general intelligence than previous AI models, showing proficiency in tasks beyond language like mathematics, coding, vision, medicine, law, and psychology. GPT-4 is considered an early version of artificial general intelligence (AGI), although it has limitations. The study discusses GPT-4's abilities, limitations, societal influences, and future development directions, highlighting its progress in various tasks and interdisciplinary composition. The paper also addresses the challenges of evaluating GPT-4's intelligence and the potential harms of autoregressive language models. GPT-4 excels in tasks like generating visual features, solving mathematical problems, and understanding and interacting with humans, showing human-level performance in various domains. The dialogue between GPT-4 and ChatGPT, as well as their performances in different tasks, showcases GPT-4's high

### Custom Prompt

In [45]:
map_reduce_chain

MapReduceDocumentsChain(verbose=True, llm_chain=LLMChain(verbose=True, prompt=PromptTemplate(input_variables=['text'], template='Write a concise summary of the following:\n\n\n"{text}"\n\n\nCONCISE SUMMARY:'), llm=ChatOpenAI(client=<openai.resources.chat.completions.Completions object at 0x29e74a8d0>, async_client=<openai.resources.chat.completions.AsyncCompletions object at 0x29e93ae90>, openai_api_key=SecretStr('**********'), openai_proxy='')), reduce_documents_chain=ReduceDocumentsChain(verbose=True, combine_documents_chain=StuffDocumentsChain(verbose=True, llm_chain=LLMChain(verbose=True, prompt=PromptTemplate(input_variables=['text'], template='Write a concise summary of the following:\n\n\n"{text}"\n\n\nCONCISE SUMMARY:'), llm=ChatOpenAI(client=<openai.resources.chat.completions.Completions object at 0x29e74a8d0>, async_client=<openai.resources.chat.completions.AsyncCompletions object at 0x29e93ae90>, openai_api_key=SecretStr('**********'), openai_proxy='')), document_variable_na

In [46]:
# this prompt is being used on individual data chunks
print(map_reduce_chain.llm_chain.prompt.template)

Write a concise summary of the following:


"{text}"


CONCISE SUMMARY:


In [47]:
# checking the combine chain's prompt template

# this prompt template is applied to the combined summaries produced for the individual chunks
print(map_reduce_chain.combine_document_chain.llm_chain.prompt.template)

Write a concise summary of the following:


"{text}"


CONCISE SUMMARY:


In [48]:
map_reduce_template = '''
The following is a set of documents

{text}

Based on this list of docs, please identify main themes.
Helpful Answer:'''

combine_template = '''
The following is a set of summaries:

{text}

Take these and distill it into a final, consolidated list of the main themes.
Return the list as a comma separated list.
Helpful Answer:
'''

custom_map_prompt = PromptTemplate.from_template(map_reduce_template)
custom_combine_prompt = PromptTemplate.from_template(combine_template)

custom_map_reduce_chain = load_summarize_chain(
    llm = llm,
    chain_type = 'map_reduce',
    map_prompt = custom_map_prompt,
    combine_prompt = custom_combine_prompt,
    verbose = True
)

custom_map_reduce_chain.run(pdf_data[:20])



> Entering new MapReduceDocumentsChain chain...


> Entering new LLMChain chain...
Prompt after formatting:

The following is a set of documents

Sparks of Artiﬁcial General Intelligence:
Early experiments with GPT-4
S´ ebastien Bubeck Varun Chandrasekaran Ronen Eldan Johannes Gehrke
Eric Horvitz Ece Kamar Peter Lee Yin Tat Lee Yuanzhi Li Scott Lundberg
Harsha Nori Hamid Palangi Marco Tulio Ribeiro Yi Zhang
Microsoft Research
Abstract
Artiﬁcial intelligence (AI) researchers have been developing and reﬁning large language models (LLMs)
that exhibit remarkable capabilities across a variety of domains and tasks, challenging our understanding
of learning and cognition. The latest model developed by OpenAI, GPT-4 [Ope23], was trained using an
unprecedented scale of compute and data. In this paper, we report on our investigation of an early version
of GPT-4, when it was still in active development by OpenAI. We contend that (this early version of) GPT-
4 is par


> Finished chain.


> Entering new LLMChain chain...
Prompt after formatting:

The following is a set of summaries:

The main themes identified in the set of documents are:
1. Artificial General Intelligence (AGI) and the development of large language models (LLMs) like GPT-4.
2. The capabilities and implications of GPT-4 in various domains and tasks.
3. Comparison of GPT-4 with previous AI models like ChatGPT.
4. Exploration of GPT-4's limitations and challenges in advancing towards AGI.
5. Societal influences of technological advancements in AI and future research directions.
6. Multimodal and interdisciplinary composition of GPT-4, including integrative ability, vision, and music.
7. GPT-4's performance in coding tasks, both in generating code from instructions and understanding existing code.

1. Mathematical abilities
2. Interaction with the world
3. Interaction with humans
4. Discriminative capabilities
5. Limitations of autoregressive architecture
6


> Finished chain.

> Finished chain.


'Artificial General Intelligence, Large Language Models, GPT-4 capabilities, Comparison with ChatGPT, Limitations and challenges, Societal influences, Multimodal composition, Coding tasks, Mathematics, Interaction with the world and humans, Discriminative capabilities, Limitations of autoregressive architecture, Interpretability, Common sense grounding, Vision results, Problem-solving, Ethical considerations, Deception and manipulation, Responsibility and vigilance, Dialogue quality, Interdisciplinary integration, Use of common sense, Object manipulation, Medical assessment and diagnosis, Creative synthesis, Shakespearean style, Performance evaluation, Presidential candidacy.'

## The Refine Chain

In [49]:
refine_chain = load_summarize_chain(
    llm = llm,
    chain_type = 'refine',
    verbose = True
)

refine_chain.run(pdf_data[:20])



> Entering new RefineDocumentsChain chain...


> Entering new LLMChain chain...
Prompt after formatting:
Write a concise summary of the following:


"Sparks of Artiﬁcial General Intelligence:
Early experiments with GPT-4
S´ ebastien Bubeck Varun Chandrasekaran Ronen Eldan Johannes Gehrke
Eric Horvitz Ece Kamar Peter Lee Yin Tat Lee Yuanzhi Li Scott Lundberg
Harsha Nori Hamid Palangi Marco Tulio Ribeiro Yi Zhang
Microsoft Research
Abstract
Artiﬁcial intelligence (AI) researchers have been developing and reﬁning large language models (LLMs)
that exhibit remarkable capabilities across a variety of domains and tasks, challenging our understanding
of learning and cognition. The latest model developed by OpenAI, GPT-4 [Ope23], was trained using an
unprecedented scale of compute and data. In this paper, we report on our investigation of an early version
of GPT-4, when it was still in active development by OpenAI. We contend that (this early version of) GPT-
4 is


> Finished chain.


> Entering new LLMChain chain...
Prompt after formatting:
Your job is to produce a final summary.
We have provided an existing summary up to a certain point: The paper discusses early experiments with GPT-4, a large language model developed by OpenAI, showcasing its general intelligence capabilities in various domains and tasks, including mathematical abilities, interaction with the world, and interaction with humans. It emphasizes the need to understand GPT-4's limitations and the challenges in advancing towards artificial general intelligence. The paper also explores the societal impacts of such technological advancements, such as challenges of erroneous generations, misinformation, bias, and impacts on human expertise, jobs, and economics. It suggests future research directions and highlights GPT-4's common sense grounding.
We have the opportunity to refine the existing summary (only if needed) with some more context below.
---------


> Finished chain.


> Entering new LLMChain chain...
Prompt after formatting:
Your job is to produce a final summary.
We have provided an existing summary up to a certain point: The paper discusses early experiments with GPT-4, a large language model developed by OpenAI, showcasing its general intelligence capabilities in various domains and tasks. It emphasizes the need to understand GPT-4's limitations and challenges in advancing towards artificial general intelligence. The paper also explores societal impacts of such technological advancements and suggests future research directions. Additionally, it includes an appendix for multimodal and interdisciplinary composition, providing further details on integrative ability results, vision results, and a graphic novel design example. The paper covers measuring human performance on LeetCode, visualizing IMDb data, examples of visualization, 2D HTML game development, graphical user interface programming, revers


> Finished chain.


> Entering new LLMChain chain...
Prompt after formatting:
Your job is to produce a final summary.
We have provided an existing summary up to a certain point: The paper discusses early experiments with GPT-4, a large language model developed by OpenAI, showcasing its general intelligence capabilities in various domains and tasks. It emphasizes the need to understand GPT-4's limitations and challenges in advancing towards artificial general intelligence. The paper also explores societal impacts of such technological advancements and suggests future research directions. Additionally, it includes an appendix for multimodal and interdisciplinary composition, providing further details on integrative ability results, vision results, and a graphic novel design example. The paper covers measuring human performance on LeetCode, visualizing IMDb data, examples of visualization, 2D HTML game development, graphical user interface programming, revers


> Finished chain.


> Entering new LLMChain chain...
Prompt after formatting:
Your job is to produce a final summary.
We have provided an existing summary up to a certain point: The paper discusses early experiments with GPT-4, a large language model developed by OpenAI, showcasing its general intelligence capabilities in various domains and tasks. It emphasizes the need to understand GPT-4's limitations and challenges in advancing towards artificial general intelligence. The paper also explores societal impacts of such technological advancements and suggests future research directions. Additionally, it includes an appendix for multimodal and interdisciplinary composition, providing further details on integrative ability results, vision results, and a graphic novel design example. The paper covers measuring human performance on LeetCode, visualizing IMDb data, examples of visualization, 2D HTML game development, graphical user interface programming, revers


> Finished chain.


> Entering new LLMChain chain...
Prompt after formatting:
Your job is to produce a final summary.
We have provided an existing summary up to a certain point: The paper discusses early experiments with GPT-4, a large language model developed by OpenAI, showcasing its general intelligence capabilities in various domains and tasks. It emphasizes the need to understand GPT-4's limitations and challenges in advancing towards artificial general intelligence, highlighting its performance in tasks such as passing mock technical interviews and competency tests in domains like medicine and law. The paper also explores GPT-4's ability to plan, learn from experience, interact with tools, and understand humans, addressing the importance of explainability and common sense. It delves into the societal impacts of early AGI and discusses key challenges, directions, and next steps for the field. The paper aims to shed light on GPT-4's capabilities and li


> Finished chain.


> Entering new LLMChain chain...
Prompt after formatting:
Your job is to produce a final summary.
We have provided an existing summary up to a certain point: The paper discusses early experiments with GPT-4, a large language model developed by OpenAI, showcasing its general intelligence capabilities in various domains and tasks. It emphasizes the need to understand GPT-4's limitations and challenges in advancing towards artificial general intelligence, highlighting its performance in tasks such as passing mock technical interviews and competency tests in domains like medicine and law. Additionally, the paper explores GPT-4's ability to plan, learn from experience, interact with tools, and understand humans, addressing the importance of explainability and common sense. It delves into the societal impacts of early AGI and discusses key challenges, directions, and next steps for the field. The paper aims to shed light on GPT-4's capabiliti


> Finished chain.


> Entering new LLMChain chain...
Prompt after formatting:
Your job is to produce a final summary.
We have provided an existing summary up to a certain point: The paper delves into early experiments with GPT-4, an advanced language model from OpenAI, showcasing its general intelligence across various domains and tasks. It underscores the importance of understanding GPT-4's limitations and the challenges in progressing towards artificial general intelligence. The paper highlights GPT-4's performance in tasks like mock technical interviews, competency tests in fields like medicine and law, planning, learning from experience, interacting with tools, and understanding humans. It also emphasizes the significance of explainability and common sense in AI systems. The societal impacts of early AGI are explored, along with key challenges, directions, and next steps for the field. The paper aims to illuminate GPT-4's capabilities and limitations, 


> Finished chain.


> Entering new LLMChain chain...
Prompt after formatting:
Your job is to produce a final summary.
We have provided an existing summary up to a certain point: The paper delves into early experiments with GPT-4, an advanced language model from OpenAI, showcasing its general intelligence across various domains and tasks. It underscores the importance of understanding GPT-4's limitations and the challenges in progressing towards artificial general intelligence. The paper highlights GPT-4's performance in tasks like mock technical interviews, competency tests in fields like medicine and law, planning, learning from experience, interacting with tools, and understanding humans. It also emphasizes the significance of explainability and common sense in AI systems. The societal impacts of early AGI are explored, along with key challenges, directions, and next steps for the field. The paper aims to illuminate GPT-4's capabilities and limitations, 

"The paper delves into early experiments with GPT-4, an advanced language model from OpenAI, showcasing its general intelligence across various domains and tasks. It underscores the importance of understanding GPT-4's limitations and the challenges in progressing towards artificial general intelligence. The paper highlights GPT-4's performance in tasks like mock technical interviews, competency tests in fields like medicine and law, planning, learning from experience, interacting with tools, and understanding humans. It also emphasizes the significance of explainability and common sense in AI systems. The societal impacts of early AGI are explored, along with key challenges, directions, and next steps for the field. The paper aims to illuminate GPT-4's capabilities and limitations, pushing readers to consider its depth of understanding beyond mere improvisation. Additionally, a comparison is drawn between GPT-4's dialogues and ChatGPT, with GPT-4 praised for effectively engaging in a d

In [50]:
# looking at the refine chain
refine_chain

RefineDocumentsChain(verbose=True, initial_llm_chain=LLMChain(verbose=True, prompt=PromptTemplate(input_variables=['text'], template='Write a concise summary of the following:\n\n\n"{text}"\n\n\nCONCISE SUMMARY:'), llm=ChatOpenAI(client=<openai.resources.chat.completions.Completions object at 0x29e74a8d0>, async_client=<openai.resources.chat.completions.AsyncCompletions object at 0x29e93ae90>, openai_api_key=SecretStr('**********'), openai_proxy='')), refine_llm_chain=LLMChain(verbose=True, prompt=PromptTemplate(input_variables=['existing_answer', 'text'], template="Your job is to produce a final summary.\nWe have provided an existing summary up to a certain point: {existing_answer}\nWe have the opportunity to refine the existing summary (only if needed) with some more context below.\n------------\n{text}\n------------\nGiven the new context, refine the original summary.\nIf the context isn't useful, return the original summary."), llm=ChatOpenAI(client=<openai.resources.chat.completio

In [51]:
# printing initial chain prompt
print(refine_chain.initial_llm_chain.prompt.template)

Write a concise summary of the following:


"{text}"


CONCISE SUMMARY:


In [52]:
# printing the refine chain prompt
print(refine_chain.refine_llm_chain.prompt.template)

Your job is to produce a final summary.
We have provided an existing summary up to a certain point: {existing_answer}
We have the opportunity to refine the existing summary (only if needed) with some more context below.
------------
{text}
------------
Given the new context, refine the original summary.
If the context isn't useful, return the original summary.
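
The two prompts above suggest how the refine strategy proceeds: the first chunk is summarized with the initial prompt, and every later chunk is passed to the refine prompt together with the summary produced so far. The following is a minimal sketch of that loop in plain Python, not LangChain's actual implementation. `refine_summarize`, `call_llm`, and `fake_llm` are hypothetical names introduced for illustration; `fake_llm` only counts calls so the flow can be followed without an API key.

```python
from typing import Callable, List

# Templates copied from the two prompts printed above.
INITIAL_TEMPLATE = 'Write a concise summary of the following:\n\n\n"{text}"\n\n\nCONCISE SUMMARY:'
REFINE_TEMPLATE = (
    "Your job is to produce a final summary.\n"
    "We have provided an existing summary up to a certain point: {existing_answer}\n"
    "We have the opportunity to refine the existing summary (only if needed) "
    "with some more context below.\n"
    "------------\n{text}\n------------\n"
    "Given the new context, refine the original summary.\n"
    "If the context isn't useful, return the original summary."
)


def refine_summarize(chunks: List[str], call_llm: Callable[[str], str]) -> str:
    """Summarize the first chunk, then refine that summary once per remaining chunk."""
    if not chunks:
        raise ValueError("need at least one chunk")
    summary = call_llm(INITIAL_TEMPLATE.format(text=chunks[0]))
    for chunk in chunks[1:]:
        prompt = REFINE_TEMPLATE.format(existing_answer=summary, text=chunk)
        summary = call_llm(prompt)
    return summary


# Hypothetical stand-in for a real model: it records each prompt it receives.
calls: List[str] = []


def fake_llm(prompt: str) -> str:
    calls.append(prompt)
    return f"summary after {len(calls)} chunk(s)"


result = refine_summarize(["a", "b", "c"], fake_llm)
print(result)
```

Because each step depends on the previous summary, the calls run sequentially: one LLM call per chunk, which matches the sequence of `LLMChain` entries in the verbose output above.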


The default refine prompt ends with the instruction `If the context isn't useful, return the original summary.` This instruction can confuse the LLM: instead of returning a summary, it sometimes returns a remark about the summary, such as:
`The existing summary is already comprehensive and does not require any refinement`

The custom prompts in the next cell work around this by telling the model to return the original list and ONLY the original list.
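
A complementary guard, separate from rewriting the prompt, is to inspect each refine output and keep the previous summary when the model returns a remark instead of a summary. The sketch below assumes a hand-picked list of marker phrases based on the example output above; `META_MARKERS` and `keep_or_replace` are hypothetical names, not part of LangChain, and a real list of phrases would need tuning against actual outputs.

```python
# Phrases that suggest the model commented on the summary instead of returning one.
# These are assumptions drawn from the example output above.
META_MARKERS = (
    "existing summary is already",
    "does not require any refinement",
    "no refinement is needed",
)


def keep_or_replace(previous: str, candidate: str) -> str:
    """Return the candidate summary unless it is empty or looks like a meta-remark."""
    lowered = candidate.lower()
    if not candidate.strip() or any(marker in lowered for marker in META_MARKERS):
        return previous
    return candidate


remark = "The existing summary is already comprehensive and does not require any refinement"
print(keep_or_replace("old summary", remark))
print(keep_or_replace("old summary", "new, refined summary"))
```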

In [53]:
custom_initial_template = '''
Extract the most relevant themes from the following:

{text}

THEMES:
'''

custom_refine_template = '''
Your job is to extract the most relevant themes
We have provided an existing list of themes upto a certain point: {existing_answer}
We have the opportunity to refine the existing list(only if needed) with some more context below.
------------
{text}
------------
Given the new context, refine the original list.
If the context isn't useful, return the original list and ONLY the original list.
Return that list as a comma separated list.

LIST:
'''


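# Wrap both templates so the chain can fill in {text} and {existing_answer}.
custom_initial_prompt = PromptTemplate.from_template(custom_initial_template)
custom_refine_prompt = PromptTemplate.from_template(custom_refine_template)

# question_prompt is used for the first chunk; refine_prompt for every later chunk.
custom_refine_chain = load_summarize_chain(
    llm = llm,
    chain_type = 'refine',
    question_prompt = custom_initial_prompt,
    refine_prompt = custom_refine_prompt,
    verbose = True
)

# Run on the first 20 documents in pdf_data.
custom_refine_chain.run(pdf_data[:20])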



> Entering new RefineDocumentsChain chain...


> Entering new LLMChain chain...
Prompt after formatting:
Extract the most relevant themes from the following:

Sparks of Artiﬁcial General Intelligence:
Early experiments with GPT-4
S´ ebastien Bubeck Varun Chandrasekaran Ronen Eldan Johannes Gehrke
Eric Horvitz Ece Kamar Peter Lee Yin Tat Lee Yuanzhi Li Scott Lundberg
Harsha Nori Hamid Palangi Marco Tulio Ribeiro Yi Zhang
Microsoft Research
Abstract
Artiﬁcial intelligence (AI) researchers have been developing and reﬁning large language models (LLMs)
that exhibit remarkable capabilities across a variety of domains and tasks, challenging our understanding
of learning and cognition. The latest model developed by OpenAI, GPT-4 [Ope23], was trained using an
unprecedented scale of compute and data. In this paper, we report on our investigation of an early version
of GPT-4, when it was still in active development by OpenAI. We contend that (this early version of)


> Finished chain.


> Entering new LLMChain chain...
Prompt after formatting:
Your job is to extract the most relevant themes
We have provided an existing list of themes upto a certain point: 1. Advancements in Artificial General Intelligence, 2. GPT-4 and its capabilities, 3. Multimodal and interdisciplinary composition, 4. Vision and image generation, 5. Music generation, 6. Coding challenges and capabilities, 7. Societal implications and future research directions, 8. Mathematical abilities, 9. Interaction with the world, 10. Interaction with humans, 11. Discriminative capabilities, 12. Limitations of autoregressive architecture highlighted by GPT-4, 13. Societal influences, 14. Directions and Conclusions, 15. GPT-4 has common sense grounding.
We have the opportunity to refine the existing list(only if needed) with some more context below.
------------
10.3 What is actually happening? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 


> Finished chain.


> Entering new LLMChain chain...
Prompt after formatting:
Your job is to extract the most relevant themes
We have provided an existing list of themes upto a certain point: Advancements in Artificial General Intelligence, GPT-4 and its capabilities, Multimodal and interdisciplinary composition, Vision and image generation, Music generation, Coding challenges and capabilities, Societal implications and future research directions, Mathematical abilities, Interaction with the world, Interaction with humans, Discriminative capabilities, Limitations of autoregressive architecture highlighted by GPT-4, Societal influences, Directions and Conclusions, GPT-4 has common sense grounding
We have the opportunity to refine the existing list(only if needed) with some more context below.
------------
unicorn drawing. While the system performs non-trivially on both tasks, there is no comparison with the
outputs from GPT-4. These preliminary observation


> Finished chain.


> Entering new LLMChain chain...
Prompt after formatting:
Your job is to extract the most relevant themes
We have provided an existing list of themes upto a certain point: Advancements in Artificial General Intelligence, GPT-4 and its capabilities, Multimodal and interdisciplinary composition, Vision and image generation, Music generation, Coding challenges and capabilities, Societal implications and future research directions, Mathematical abilities, Interaction with the world, Interaction with humans, Discriminative capabilities, Limitations of autoregressive architecture highlighted by GPT-4, Societal influences, Directions and Conclusions, GPT-4 has common sense grounding
We have the opportunity to refine the existing list(only if needed) with some more context below.
------------
Figure 1.3: We queried GPT-4 three times, at roughly equal time intervals over the span of a month
while the system was being reﬁned, with the prompt “Dr


> Finished chain.


> Entering new LLMChain chain...
Prompt after formatting:
Your job is to extract the most relevant themes
We have provided an existing list of themes upto a certain point: Advancements in Artificial General Intelligence, GPT-4 and its capabilities, Multimodal and interdisciplinary composition, Vision and image generation, Music generation, Coding challenges and capabilities, Societal implications and future research directions, Mathematical abilities, Interaction with the world, Interaction with humans, Discriminative capabilities, Limitations of autoregressive architecture highlighted by GPT-4, Societal influences, Directions and Conclusions, GPT-4 has common sense grounding
We have the opportunity to refine the existing list(only if needed) with some more context below.
------------
Figure 1.5: GPT-4 passes mock technical interviews on LeetCode. GPT-4 could potentially be hired
as a software engineer3.
preliminary tests (see [Ope23] 


> Finished chain.


> Entering new LLMChain chain...
Prompt after formatting:
Your job is to extract the most relevant themes
We have provided an existing list of themes upto a certain point: Advancements in Artificial General Intelligence, GPT-4 and its capabilities, Multimodal and interdisciplinary composition, Vision and image generation, Music generation, Coding challenges and capabilities, Societal implications and future research directions, Mathematical abilities, Interaction with the world, Interaction with humans, Discriminative capabilities, Limitations of autoregressive architecture highlighted by GPT-4, Societal influences, Directions and Conclusions, GPT-4 has common sense grounding
We have the opportunity to refine the existing list(only if needed) with some more context below.
------------
a slightly better job of using the dialogue format to engage in a dialectical process, where Socrates
and Aristotle question each other and refine their 


> Finished chain.


> Entering new LLMChain chain...
Prompt after formatting:
Your job is to extract the most relevant themes
We have provided an existing list of themes upto a certain point: Advancements in Artificial General Intelligence, GPT-4 and its capabilities, Multimodal and interdisciplinary composition, Vision and image generation, Music generation, Coding challenges and capabilities, Societal implications and future research directions, Mathematical abilities, Interaction with the world, Interaction with humans, Discriminative capabilities, Limitations of autoregressive architecture highlighted by GPT-4, Societal influences, Directions and Conclusions, GPT-4 has common sense grounding, Common sense reasoning
We have the opportunity to refine the existing list(only if needed) with some more context below.
------------
Figure 2.1: The ﬁrst image is Composition 8, art by Wassily Kandinsky, the second and the third
are produced by GPT-4 and ChatGPT


> Finished chain.


> Entering new LLMChain chain...
Prompt after formatting:
Your job is to extract the most relevant themes
We have provided an existing list of themes upto a certain point: Advancements in Artificial General Intelligence, GPT-4 and its capabilities, Multimodal and interdisciplinary composition, Vision and image generation, Music generation, Coding challenges and capabilities, Societal implications and future research directions, Mathematical abilities, Interaction with the world, Interaction with humans, Discriminative capabilities, Limitations of autoregressive architecture highlighted by GPT-4, Societal influences, Directions and Conclusions, GPT-4 has common sense grounding, Common sense reasoning, Primes proof in Shakespearean dialogue.
We have the opportunity to refine the existing list(only if needed) with some more context below.
------------
being doubtful, while STUDENT B used Romeo and Juliet, who are both in agreement and lov


> Finished chain.

> Finished chain.


'Advancements in Artificial General Intelligence, GPT-4 and its capabilities, Multimodal and interdisciplinary composition, Vision and image generation, Music generation, Coding challenges and capabilities, Societal implications and future research directions, Mathematical abilities, Interaction with the world, Interaction with humans, Discriminative capabilities, Limitations of autoregressive architecture highlighted by GPT-4, Societal influences, Directions and Conclusions, GPT-4 has common sense grounding, Common sense reasoning, Primes proof in Shakespearean dialogue, Comparison between GPT-4 and ChatGPT on interdisciplinary tasks'
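
Since the custom refine prompt asks for a comma separated list, the returned string can be split into individual themes for later use. The sketch below is one possible post-processing step, not part of the chain itself; `parse_themes` is a hypothetical helper. It assumes that no theme contains a comma, strips numbering such as `1. ` (which appears in one of the intermediate lists above), and drops case-insensitive duplicates.

```python
import re
from typing import List


def parse_themes(raw: str) -> List[str]:
    """Split a comma separated theme string into a de-duplicated list, keeping order."""
    themes: List[str] = []
    seen = set()
    for part in raw.split(","):
        theme = part.strip().rstrip(".")
        # Remove a leading list number such as "1. " if the model added one.
        theme = re.sub(r"^\d+\.\s*", "", theme)
        key = theme.casefold()
        if theme and key not in seen:
            seen.add(key)
            themes.append(theme)
    return themes


example = "1. Mathematics, Coding tasks, mathematics, Vision results."
print(parse_themes(example))
# → ['Mathematics', 'Coding tasks', 'Vision results']
```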