# LlamaIndex

#### Notebook Setup

In [None]:
# This notebook walks through a very basic RAG (Retrieval-Augmented Generation) solution over a digital piano manual (PDF).
# We split the manual's text into chunks, embed each chunk and store it in a vector store, then ask a question
# and answer it from the chunks that are most similar to that question.

In [None]:
# Install the llama-index packages.
# We use llama-index because it is one of the frameworks that lets us implement RAG (Retrieval-Augmented Generation).
%pip install llama-index llama-parse llama-index-vector-stores-qdrant -q
!pip install openai

In [None]:
import nest_asyncio

nest_asyncio.apply()

In [None]:
# Download the digital piano manual and store it in the data folder
!mkdir -p data
!wget https://jonfernandes.github.io/files/digital-piano.pdf -O "./data/digital-piano.pdf"

--2024-05-02 17:19:34--  https://jonfernandes.github.io/files/digital-piano.pdf
Resolving jonfernandes.github.io (jonfernandes.github.io)... 185.199.108.153, 185.199.109.153, 185.199.110.153, ...
Connecting to jonfernandes.github.io (jonfernandes.github.io)|185.199.108.153|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 11002048 (10M) [application/pdf]
Saving to: ‘./data/digital-piano.pdf’


2024-05-02 17:19:34 (105 MB/s) - ‘./data/digital-piano.pdf’ saved [11002048/11002048]



In [None]:
#Check if the digital piano manual is inside the data folder
!ls -la ./data

total 10756
drwxr-xr-x 2 root root     4096 May  2 17:19 .
drwxr-xr-x 1 root root     4096 May  2 17:19 ..
-rw-r--r-- 1 root root 11002048 Jan 12 12:09 digital-piano.pdf


In [None]:
import openai

openai.api_key = ""  # set your OpenAI API key here

In [None]:
from llama_index.llms.openai import OpenAI


llm = OpenAI(model="gpt-4-turbo-preview")  # use OpenAI's gpt-4-turbo-preview model as the LLM

# Ask the model how to change the volume on a Roland F-120 and store the answer in `response`
response = llm.complete(
    "How do you change the volume on a Roland F-120"
)
print(response)



NotFoundError: Error code: 404 - {'error': {'message': 'The model `gpt-4-turbo-preview` does not exist or you do not have access to it.', 'type': 'invalid_request_error', 'param': None, 'code': 'model_not_found'}}

In [None]:
#Ask the model a question without access to the digital piano manual
response = llm.complete(
    "According to the manual, how do you play a demo piece?"
)
print(response)

To provide you with accurate instructions on how to play a demo piece, I would need to know the specific device, instrument, or software you are referring to. The process can vary significantly depending on the context. For example, playing a demo piece on a digital piano involves different steps than playing a demo on music production software or a synthesizer.

However, I can offer a general guide that applies to many electronic musical instruments, such as keyboards or digital pianos:

1. **Turn on the Device**: Ensure the instrument or device is connected to power and turned on.

2. **Select Demo Mode**: Look for a button or menu option labeled "Demo," "Song," or something similar. This might involve pressing a specific button directly or navigating through a menu using a screen and selector buttons.

3. **Choose a Demo Piece**: Some devices will start playing a demo piece immediately upon entering demo mode. Others may require you to select a specific piece from a list. This could

## 1. Build External Knowledge

With llama-index, before any transformations are applied,
data is loaded in the `Document` abstraction, which is
a container that holds the text of the document.

In [None]:
# The Document abstraction holds the text extracted from the digital piano manual in the data folder
from llama_index.core import SimpleDirectoryReader

loader = SimpleDirectoryReader(input_dir="./data")
documents = loader.load_data()  # read the files in ./data and return their text as a list of Document objects

In [None]:
# Inspect the extracted text: the manual has been split across several Documents, and documents[1] is the second one
documents[1].text

'2\n&&Look What You Can Do!\n&&Personalize  Your Piano\n&eAdjust the keyboard touch\nYou can adjust the touch sensitivity of the keyboard to match \nyour own playing style.\n&eAdd reverberation\nYou can add reverberation (reverb) to create the sensation of performing in a concert hall.\ng p. 9\ng p. 9&&Play  the Piano\n&ePlay using various sounds\nThis unit contains a wide variety of tones (sounds). You can freely select and perform using these tones.\n&eSound a metronome\nYou can sound a metronome.\n&ePlay duets\nYou can divide the keyboard into left and right halves, playing it as though it were two pianos.\ng p. 7\ng p. 7\ng p. 8&&Play and Record Songs\n&ePlay the built-in songs\nThis unit contains numerous built-in songs. For the song titles, refer to “Internal Song List” (p. 19).\n&ePlay back individual parts\nYou can practice along with a song while listening to only the right-hand or left-hand part play back.\n&eRecord your performance\nYou can record your own performances.\ng p

Chunk, Encode, and Store into a Vector Store.

- To streamline the process, use the `IngestionPipeline` class, which applies your specified transformations to the `Document`s.
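
To make the chunking step concrete, here is a minimal sketch in plain Python. It is not how LlamaIndex's `SentenceSplitter` is implemented (the real class works with token counts and chunk overlap); the function name `chunk_sentences` and the `max_chars` parameter are hypothetical and exist only for this illustration.

```python
import re
from typing import List


def chunk_sentences(text: str, max_chars: int = 60) -> List[str]:
    """Greedily pack whole sentences into chunks of roughly max_chars characters.

    A single sentence longer than max_chars still becomes its own chunk,
    so a chunk can exceed the limit in that case.
    """
    # Split after sentence-ending punctuation followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    chunks: List[str] = []
    current: List[str] = []
    for sentence in sentences:
        candidate = " ".join(current + [sentence])
        if current and len(candidate) > max_chars:
            # The next sentence does not fit: close the current chunk.
            chunks.append(" ".join(current))
            current = []
        current.append(sentence)
    if current:
        chunks.append(" ".join(current))
    return chunks


# Illustrative text only, loosely based on the manual excerpt.
sample = (
    "You can adjust the touch sensitivity of the keyboard. "
    "You can add reverberation. You can sound a metronome. "
    "You can record your own performances."
)
chunks = chunk_sentences(sample, max_chars=60)
for chunk in chunks:
    print(repr(chunk))  # three chunks; the two short middle sentences share one
```

Splitting on sentence boundaries keeps each chunk readable on its own, which matters later because the retrieved chunks are pasted into the prompt as context.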

In [None]:
# Next, chunk and encode the text: chunking breaks it into smaller pieces, and encoding embeds each piece as a vector
from llama_index.core.ingestion import IngestionPipeline
from llama_index.core.node_parser import SentenceSplitter
from llama_index.embeddings.openai import OpenAIEmbedding
from llama_index.vector_stores.qdrant import QdrantVectorStore
import qdrant_client

# Use an in-memory Qdrant vector database as the vector store for the embedded chunks
client = qdrant_client.QdrantClient(location=":memory:")
vector_store = QdrantVectorStore(client=client, collection_name="test_store")

pipeline = IngestionPipeline(
    transformations = [
        SentenceSplitter(),  # split the manual's text into chunks along sentence boundaries
        OpenAIEmbedding(),  # embed each chunk with OpenAI's embedding model
    ],
    vector_store=vector_store,
)
_nodes = pipeline.run(documents=documents, num_workers=2)

Exception in thread Thread-12 (_handle_results):
Traceback (most recent call last):
  File "/usr/lib/python3.10/threading.py", line 1016, in _bootstrap_inner
    self.run()
  File "/usr/lib/python3.10/threading.py", line 953, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/lib/python3.10/multiprocessing/pool.py", line 579, in _handle_results
    task = get()
  File "/usr/lib/python3.10/multiprocessing/connection.py", line 251, in recv
    return _ForkingPickler.loads(buf.getbuffer())
TypeError: APIStatusError.__init__() missing 2 required keyword-only arguments: 'response' and 'body'


KeyboardInterrupt: 

In [None]:
len(_nodes)  # number of nodes (chunks) produced by the pipeline

47

In [None]:
_nodes[1].text

NameError: name '_nodes' is not defined

Create an Index.

- Upload encoded documents into vector store
- Connect to it with a VectorStoreIndex


In [None]:
# The chunks of the digital piano manual have been embedded and stored in the vector store.
# VectorStoreIndex.from_vector_store() wraps that existing vector store in an index,
# which gives us access to the embedded text so that we can ask questions against it.
# NOTE: the index is an interface on top of the vector store, not the vector space itself.
from llama_index.core import VectorStoreIndex

index = VectorStoreIndex.from_vector_store(vector_store=vector_store)  # build the index on top of the populated vector store

## 2. Retrieve Against A Query


In [None]:
# Query the index
retriever = index.as_retriever(similarity_top_k=2)  # similarity_top_k=2 returns the 2 chunks most similar to the question
retrieved_nodes = retriever.retrieve(  # the question (query) is embedded as well
    "How can you play a demo piece?"
)
# The retriever then compares the embedded question against all of the chunks in the vector store,
# using a measure such as cosine similarity, and returns the 2 chunks that lie closest to the
# question in the vector space. Those chunks are the candidate context for the answer.
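
The comparison described in the retrieval step can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the code Qdrant or LlamaIndex runs: the 3-dimensional vectors are made up (real embeddings have many more dimensions), and the names `cosine_similarity` and `top_k` are hypothetical helpers.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(
    query_vec: Sequence[float],
    chunk_vecs: Dict[str, Sequence[float]],
    k: int = 2,
) -> List[Tuple[str, float]]:
    """Score every chunk against the query and return the k best (id, score) pairs."""
    scored = [(cid, cosine_similarity(query_vec, vec)) for cid, vec in chunk_vecs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Made-up vectors standing in for real chunk embeddings.
toy_chunks = {
    "built-in songs": [0.9, 0.1, 0.0],
    "metronome": [0.1, 0.9, 0.0],
    "power switch": [0.0, 0.1, 0.9],
}
toy_query = [0.8, 0.2, 0.0]  # stand-in for the embedded question

results = top_k(toy_query, toy_chunks, k=2)
print(results)  # "built-in songs" ranks first, "metronome" second
```

A production vector store avoids scoring every chunk one by one and uses an index structure instead, but the ranking idea is the same.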

In [None]:
# View the retrieved nodes: the 2 chunks that the retriever scored as most similar to the question we asked
retrieved_nodes

[NodeWithScore(node=TextNode(id_='bac74254-f207-412a-adc8-a3acb0a43dd7', embedding=None, metadata={'page_label': '10', 'file_name': 'digital-piano.pdf', 'file_path': '/content/data/digital-piano.pdf', 'file_type': 'application/pdf', 'file_size': 11002048, 'creation_date': '2024-04-29', 'last_modified_date': '2024-01-12'}, excluded_embed_metadata_keys=['file_name', 'file_type', 'file_size', 'creation_date', 'last_modified_date', 'last_accessed_date'], excluded_llm_metadata_keys=['file_name', 'file_type', 'file_size', 'creation_date', 'last_modified_date', 'last_accessed_date'], relationships={<NodeRelationship.SOURCE: '1'>: RelatedNodeInfo(node_id='f97f5e2a-308b-4d1b-afc5-f7fc2436ea7c', node_type=<ObjectType.DOCUMENT: '4'>, metadata={'page_label': '10', 'file_name': 'digital-piano.pdf', 'file_path': '/content/data/digital-piano.pdf', 'file_type': 'application/pdf', 'file_size': 11002048, 'creation_date': '2024-04-29', 'last_modified_date': '2024-01-12'}, hash='bad22a58ae7dca249aa4ba90d3

## 3. Generate Final Response


In [None]:
# The query engine retrieves the 2 most relevant chunks and sends them to the LLM together with the question.
# The LLM is an OpenAI model, which is why we need the OpenAI API key.
query_engine = index.as_query_engine()

In [None]:
# Inspect the default prompt template used by the query engine.
# The retrieved chunks are inserted as {context_str} and the question as {query_str},
# so the LLM is told to answer from the given context and not from prior knowledge.
print(
    query_engine.get_prompts()["response_synthesizer:text_qa_template"]
    .default_template.template
)
# For our question, {query_str} has the value "How can you play a demo piece?"

Context information is below.
---------------------
{context_str}
---------------------
Given the context information and not prior knowledge, answer the query.
Query: {query_str}
Answer: 
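
To see how the template above turns into the text the LLM receives, here is a minimal sketch that fills the two placeholders with plain `str.format`. The helper `build_prompt`, the way chunks are joined (a blank line between them), and the two placeholder chunks are assumptions made for illustration; LlamaIndex's response synthesizer does this work internally.

```python
from typing import List

# The default text QA template, copied from the output above.
TEXT_QA_TEMPLATE = (
    "Context information is below.\n"
    "---------------------\n"
    "{context_str}\n"
    "---------------------\n"
    "Given the context information and not prior knowledge, answer the query.\n"
    "Query: {query_str}\n"
    "Answer: "
)


def build_prompt(chunks: List[str], query: str) -> str:
    """Join the retrieved chunks and substitute them, with the query, into the template."""
    context_str = "\n\n".join(chunks)  # assumption: chunks separated by a blank line
    return TEXT_QA_TEMPLATE.format(context_str=context_str, query_str=query)


# Placeholder chunks; in the notebook these would be the texts of the retrieved nodes.
example_chunks = [
    "(text of the first retrieved chunk)",
    "(text of the second retrieved chunk)",
]
prompt = build_prompt(example_chunks, "How can you play a demo piece?")
print(prompt)
```

Because the instruction says to use the context "and not prior knowledge", the quality of the answer depends directly on which chunks the retriever returned.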


In [None]:
# The LLM now answers the question using only the chunks that we sent to it, and the answer is accurate.
response = query_engine.query(
    "How can you play a demo piece?"
)
print(response)
# With a keyword database we would be stuck: searching the digital piano manual PDF for the word "demo" would return nothing.
# An embedding-based solution helps here because the word "demo" and the phrase "built-in song" lie close together
# in the embedding space, so the exact string is not needed.

To play a demo piece, you can press the [Song] button so it's lit. The song will begin playing, and the display will indicate the currently playing measure of the song. You can fast-forward or fast-reverse the song by pressing the [–] [+] buttons while the song is playing. To stop the song, press the [Song] button so its light goes out. If you want to play songs consecutively, you can hold down the Piano [Ensemble] button and press the [Song] button.
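
The limitation of exact keyword matching mentioned in the cell above can be shown with a small sketch. The function `keyword_search` is a hypothetical helper, and the three passages are short sentences taken from the manual excerpt shown earlier, used here only as toy data.

```python
from typing import List


def keyword_search(keyword: str, passages: List[str]) -> List[str]:
    """Return the passages that contain the keyword as a case-insensitive substring."""
    needle = keyword.lower()
    return [p for p in passages if needle in p.lower()]


passages = [
    "This unit contains numerous built-in songs.",
    "You can sound a metronome.",
    "You can record your own performances.",
]

demo_hits = keyword_search("demo", passages)  # no passage contains the string "demo"
song_hits = keyword_search("song", passages)  # matches only the built-in songs passage
print(demo_hits)
print(song_hits)
```

An exact-match search finds the relevant passage only when the user happens to use the manual's own wording, whereas a similarity search over embeddings can relate a question about a "demo piece" to a passage about "built-in songs".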


## More Queries

### Comparisons

In [None]:
query = (
    "How can I select a grand piano?"
)

response = query_engine.query(query)
print(response)

To select a grand piano, you can press the [Grand] button on the digital piano.


In [None]:
query = (
    "How can I turn the power off?"
)

response = query_engine.query(query)
print(response)

To turn the power off, you should first turn the [Volume] knob all the way toward the left to minimize the volume. Then, you need to turn the [Power] switch OFF. If you do not want the power to turn off automatically, you can turn off the "Auto Off" setting.
