
## Introducing ShakespearGPT: Melding Shakespearean Wisdom with Conversational AI
---
**ShakespearGPT** combines **Shakespeare's works** with the conversational abilities of **ChatGPT**. The source is a text file of approximately **40,000 lines**, which is too large to fit in a single context window. To work around this, we use **LangChain** to split the text into chunks, embed them, and store them in a **vector store**. At query time, a basic **Top-K similarity search** retrieves the most relevant chunks, which are passed to **ChatGPT** through its **API** as context for the answer. The same approach extends beyond text files to **PDFs** and other large documents.
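
To make the context-window claim concrete, here is a rough back-of-the-envelope sketch. The ratio of about 4 characters per token is a common rule of thumb, the 4,096-token window is only an assumed example value, and the synthetic 25-character lines stand in for the real file, whose line lengths will differ.

```python
def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Rough token estimate; chars_per_token is a rule-of-thumb assumption."""
    return int(len(text) / chars_per_token)


def fits_in_context(text: str, context_tokens: int = 4096) -> bool:
    """True if the estimated token count fits the assumed context window."""
    return estimate_tokens(text) <= context_tokens


# Synthetic stand-in for the corpus: 40,000 lines of 25 characters each
# (24 letters plus a newline). This is NOT the real file content.
sample = ("x" * 24 + "\n") * 40_000

print(estimate_tokens(sample))   # 250000
print(fits_in_context(sample))   # False
```

Even under these loose assumptions the estimate is far above the assumed window, which is why retrieval is used to select only a few chunks per question.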




---



In [None]:
# Installations
# We use the Tiny Shakespeare text file (about 40,000 lines) and store it in the vector store.


###################################################################################################################################################################################################
!wget https://raw.githubusercontent.com/karpathy/char-rnn/master/data/tinyshakespeare/input.txt
!pip install --upgrade langchain openai -q
!pip install unstructured -q
!pip install unstructured[local-inference] -q
!pip install detectron2@git+https://github.com/facebookresearch/detectron2.git@v0.6#egg=detectron2 -q
!apt-get install poppler-utils
!pip install chromadb
!pip install tiktoken
#############################################################################################################################################################################################


## Load your data
Next, let's load some data. We will load `input.txt` using LangChain's `TextLoader` utility.

In [2]:
from langchain.document_loaders import TextLoader
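def load_docs(file_path):
  # TextLoader reads a single text file and returns a list of Document objects
  loader = TextLoader(file_path)
  documents = loader.load()
  return documents
documents = load_docs('input.txt')
print(len(documents))  # the whole file is loaded as one Document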

1

## Chunk your data up into smaller documents
Rather than passing the entire text to the model, we want to be selective about which information we share with it. The first step is to **chunk** the document into smaller pieces, so that only a few relevant pieces are passed to the LLM. For this we use LangChain's `RecursiveCharacterTextSplitter` utility.

In [3]:
from langchain.text_splitter import RecursiveCharacterTextSplitter
def split_docs(documents, chunk_size=1000, chunk_overlap=20):
  text_splitter = RecursiveCharacterTextSplitter(chunk_size=chunk_size, chunk_overlap=chunk_overlap)
  docs = text_splitter.split_documents(documents)
  return docs

docs = split_docs(documents)
print(len(docs))

1360
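
To illustrate what `chunk_size` and `chunk_overlap` mean, here is a minimal sketch of a fixed-size character chunker. It is a simplification, not the algorithm `RecursiveCharacterTextSplitter` uses: the LangChain splitter prefers to break at natural separators such as paragraph and line breaks, while this hypothetical `chunk_text` helper cuts at fixed offsets.

```python
from typing import List


def chunk_text(text: str, chunk_size: int = 1000, chunk_overlap: int = 20) -> List[str]:
    """Split text into fixed-size pieces that share chunk_overlap characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each new chunk advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last chunk already reaches the end of the text
    return chunks


sample = "abcdefghijklmnopqrstuvwxyz"
pieces = chunk_text(sample, chunk_size=10, chunk_overlap=3)
print(pieces)  # ['abcdefghij', 'hijklmnopq', 'opqrstuvwx', 'vwxyz']
```

Each piece starts with the last three characters of the previous one. The overlap reduces the chance that a sentence relevant to a query is cut in half at a chunk boundary.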


## Create embeddings of your documents to get ready for semantic search
Next, we need to prepare for **similarity searches**. We do this by embedding our documents, which produces one vector per chunk.
These vectors let us compare chunks with a query later on. For this we use `OpenAIEmbeddings`, which takes the OpenAI API key.
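
As a minimal sketch of what comparing vectors means, the snippet below computes cosine similarity, one common similarity measure. The three-dimensional vectors and their names are made up for illustration; real embeddings have many more dimensions, and a vector store may use a different metric such as Euclidean distance.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_a * norm_b)


# Hypothetical toy "embeddings", not produced by any real model
sleep_vec = [0.9, 0.1, 0.0]
drowsy_vec = [0.8, 0.2, 0.1]
crown_vec = [0.0, 0.2, 0.9]

# Vectors pointing in similar directions score higher
print(cosine_similarity(sleep_vec, drowsy_vec) > cosine_similarity(sleep_vec, crown_vec))  # True
```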

In [4]:
from langchain.embeddings.openai import OpenAIEmbeddings
import os

# Check whether there is an environment variable with your API key; if not, use the value you put below
OPENAI_API_KEY = os.getenv('OPENAI_API_KEY', 'your_api_key')
embeddings = OpenAIEmbeddings(openai_api_key=OPENAI_API_KEY)



## Creating a VectorStore
I have used **Chroma** because it runs locally and is easy to set up without an account. Let's load the chunks into `Chroma`.

In [None]:
from langchain.vectorstores import Chroma
vectorstore = Chroma.from_documents(docs, embeddings)

Let's test the vector store.


In [20]:
query = "What did Sebastian said to Antonio about eyelids"
docs = vectorstore.similarity_search(query)
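
Conceptually, a top-K similarity search scores every stored chunk against the query and keeps the K best. The sketch below shows that idea with a toy word-count "embedding" in place of a real model. The `embed`, `cosine`, and `top_k` helpers are hypothetical names for illustration and are not how Chroma is implemented internally.

```python
import math
from collections import Counter
from typing import List, Tuple


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: lowercase word counts."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def top_k(query: str, chunks: List[str], k: int = 4) -> List[Tuple[float, str]]:
    """Return the k chunks most similar to the query, best first."""
    q = embed(query)
    scored = [(cosine(q, embed(c)), c) for c in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


chunks = [
    "why doth it not then our eyelids sink",
    "my strong imagination sees a crown",
    "it is the quality of the climate",
]
results = top_k("what did sebastian say about eyelids", chunks, k=2)
print(results[0][1])  # why doth it not then our eyelids sink
```

A real embedding model also matches on meaning rather than exact words, so a query about "sleep" could still retrieve a chunk that only mentions "drowsiness".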

In [37]:
# Display each returned document line by line, highlighting the word 'eyelid'
from IPython.display import Markdown, display

def highlight_word(text, word, color='yellow'):
    highlighted_text = text.replace(word, f"<mark style='background-color: {color};'>{word}</mark>")
    return Markdown(highlighted_text)

for doc in docs:
    content = doc.page_content.split('\n')
    for line in content:
        if 'eyelid' in line:
            line = highlight_word(line, 'eyelid')
        display(line)
    print('\n')




'ALONSO:'

'What, all so soon asleep! I wish mine eyes'

'Would, with themselves, shut up my thoughts: I find'

'They are inclined to do so.'

''

'SEBASTIAN:'

'Please you, sir,'

'Do not omit the heavy offer of it:'

'It seldom visits sorrow; when it doth,'

'It is a comforter.'

''

'ANTONIO:'

'We two, my lord,'

'Will guard your person while you take your rest,'

'And watch your safety.'

''

'ALONSO:'

'Thank you. Wondrous heavy.'

''

'SEBASTIAN:'

'What a strange drowsiness possesses them!'

''

'ANTONIO:'

"It is the quality o' the climate."

''

'SEBASTIAN:'

'Why'

Doth it not then our <mark style='background-color: yellow;'>eyelid</mark>s sink? I find not

'Myself disposed to sleep.'

''

'ANTONIO:'

'Nor I; my spirits are nimble.'

'They fell together all, as by consent;'

"They dropp'd, as by a thunder-stroke. What might,"

'Worthy Sebastian? O, what might?--No more:--'

'And yet me thinks I see it in thy face,'

'What thou shouldst be: the occasion speaks thee, and'

'My strong imagination sees a crown'

'Dropping upon thy head.'

''

'SEBASTIAN:'

'What, art thou waking?'

''

'ANTONIO:'

'Do you not hear me speak?'





'SEBASTIAN:'

'I do; and surely'

"It is a sleepy language and thou speak'st"

'Out of thy sleep. What is it thou didst say?'

'This is a strange repose, to be asleep'

'With eyes wide open; standing, speaking, moving,'

'And yet so fast asleep.'

''

'ANTONIO:'

'Noble Sebastian,'

"Thou let'st thy fortune sleep--die, rather; wink'st"

'Whiles thou art waking.'





'SEBASTIAN:'

"You were kneel'd to and importuned otherwise"

'By all of us, and the fair soul herself'

"Weigh'd between loathness and obedience, at"

"Which end o' the beam should bow. We have lost your"

'son,'

'I fear, for ever: Milan and Naples have'

"More widows in them of this business' making"

'Than we bring men to comfort them:'

"The fault's your own."

''

'ALONSO:'

"So is the dear'st o' the loss."

''

'GONZALO:'

'My lord Sebastian,'

'The truth you speak doth lack some gentleness'

'And time to speak it in: you rub the sore,'

'When you should bring the plaster.'

''

'SEBASTIAN:'

'Very well.'

''

'ANTONIO:'

'And most chirurgeonly.'

''

'GONZALO:'

'It is foul weather in us all, good sir,'

'When you are cloudy.'

''

'SEBASTIAN:'

'Foul weather?'

''

'ANTONIO:'

'Very foul.'

''

'GONZALO:'

'Had I plantation of this isle, my lord,--'

''

'ANTONIO:'

"He'ld sow't with nettle-seed."

''

'SEBASTIAN:'

'Or docks, or mallows.'

''

'GONZALO:'

"And were the king on't, what would I do?"

''

'SEBASTIAN:'

"'Scape being drunk for want of wine."





'SEBASTIAN:'

'You have taken it wiselier than I meant you should.'

''

'GONZALO:'

'Therefore, my lord,--'

''

'ANTONIO:'

'Fie, what a spendthrift is he of his tongue!'

''

'ALONSO:'

'I prithee, spare.'

''

'GONZALO:'

'Well, I have done: but yet,--'

''

'SEBASTIAN:'

'He will be talking.'

''

'ANTONIO:'

'Which, of he or Adrian, for a good'

'wager, first begins to crow?'

''

'SEBASTIAN:'

'The old cock.'

''

'ANTONIO:'

'The cockerel.'

''

'SEBASTIAN:'

'Done. The wager?'

''

'ANTONIO:'

'A laughter.'

''

'SEBASTIAN:'

'A match!'

''

'ADRIAN:'

'Though this island seem to be desert,--'

''

'SEBASTIAN:'

"Ha, ha, ha! So, you're paid."

''

'ADRIAN:'

'Uninhabitable and almost inaccessible,--'

''

'SEBASTIAN:'

'Yet,--'

''

'ADRIAN:'

'Yet,--'

''

'ANTONIO:'

"He could not miss't."

''

'ADRIAN:'

'It must needs be of subtle, tender and delicate'

'temperance.'

''

'ANTONIO:'

'Temperance was a delicate wench.'

''

'SEBASTIAN:'

'Ay, and a subtle; as he most learnedly delivered.'

''

'ADRIAN:'

'The air breathes upon us here most sweetly.'

''

'SEBASTIAN:'

'As if it had lungs and rotten ones.'

''

'ANTONIO:'

"Or as 'twere perfumed by a fen."





## Integrating an LLM using LangChain

In [38]:
# Importing the LangChain utilities
from langchain.chat_models import ChatOpenAI
from langchain.chains.question_answering import load_qa_chain

In [39]:
# Setting up the LLM chain.
llm = ChatOpenAI(temperature=0, openai_api_key=OPENAI_API_KEY)
chain = load_qa_chain(llm, chain_type="stuff")



In [40]:
query = "What did Sebastian said to Antonio about eyelids"
docs = vectorstore.similarity_search(query)

In [42]:
# Running the chain
chain.run(input_documents=docs, question=query)

'Sebastian said to Antonio, "Why doth it not then our eyelids sink? I find not myself disposed to sleep."'
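
The `"stuff"` chain type used above places all retrieved chunks into a single prompt together with the question. The sketch below illustrates that idea only: the `build_stuff_prompt` helper and the template wording are hypothetical, not LangChain's actual prompt.

```python
from typing import List


def build_stuff_prompt(chunks: List[str], question: str) -> str:
    """Concatenate retrieved chunks and the question into one prompt string."""
    context = "\n\n".join(chunks)
    # Illustrative template; the wording is an assumption, not LangChain's prompt
    return (
        "Use the following context to answer the question. "
        "If the answer is not in the context, say you don't know.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )


retrieved = [
    "SEBASTIAN: Why doth it not then our eyelids sink?",
    "ANTONIO: Nor I; my spirits are nimble.",
]
prompt = build_stuff_prompt(retrieved, "What did Sebastian say about eyelids?")
print(prompt)
```

Because everything goes into one prompt, this approach only works while the retrieved chunks together fit in the model's context window, which is one reason to keep chunks small and K low.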

By combining a **vector store** and **OpenAI embeddings** through **LangChain**, we've enabled **ChatGPT** to answer from passages retrieved from the document: the response above quotes Sebastian's line about eyelids from the retrieved text.


---

