### Quick intro to LlamaIndex  
Sources: [1](https://lmy.medium.com/comparing-langchain-and-llamaindex-with-4-tasks-2970140edf33), [2](https://docs.llamaindex.ai/en/stable/), [3](https://github.com/run-llama/llama_index), [4](https://nanonets.com/blog/llamaindex/)  

LlamaIndex is a "data framework" for building LLM applications. It offers:

+ Data connectors to ingest your existing data sources and formats (APIs, PDFs, docs, SQL, etc.).
+ Ways to structure your data (indices, graphs) so that it can be easily used with LLMs.
+ An advanced retrieval/query interface over your data: feed in any LLM input prompt and get back retrieved context and knowledge-augmented output.
+ Easy integration with your outer application framework (e.g. LangChain, Flask, Docker, ChatGPT, or anything else).

LlamaIndex provides tools for both beginner and advanced users:

+ The high-level API lets beginners ingest and query their data in 5 lines of code.
+ The lower-level APIs let advanced users customize and extend any module (data connectors, indices, retrievers, query engines, reranking modules) to fit their needs.

The main components of LlamaIndex are:

+ Data connectors ingest your existing data from their native source and format. These could be APIs, PDFs, SQL, and (much) more.
+ Data indexes structure your data in intermediate representations that are easy and performant for LLMs to consume.
+ Engines provide natural language access to your data. For example:
    + Query engines are powerful retrieval interfaces for knowledge-augmented output.
    + Chat engines are conversational interfaces for multi-message, “back and forth” interactions with your data.
+ Data agents are LLM-powered knowledge workers augmented by tools, from simple helper functions to API integrations and more.
+ Application integrations tie LlamaIndex back into the rest of your ecosystem. This could be LangChain, Flask, Docker, ChatGPT, or anything else.

#### Installing Packages

In [None]:
!pip install -qU openai
!pip install -qU llama-index
!pip install -qU pydantic
!pip install -qU llama-index-llms-openai
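# llama-index-llms-ollama backs the llama_index.llms.ollama import used in this notebook
!pip install -qU llama-index-llms-ollama
%pip install llama-index-embeddings-ollama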
!pip install -qU pypdf
!pip install -qU docx2txt

Defaulting to user installation because normal site-packages is not writeable
Collecting llama-index-embeddings-ollama
  Downloading llama_index_embeddings_ollama-0.6.0-py3-none-any.whl (3.4 kB)
Installing collected packages: llama-index-embeddings-ollama
Successfully installed llama-index-embeddings-ollama-0.6.0
Note: you may need to restart the kernel to use updated packages.


#### Importing Packages

In [None]:
import os
import sys
import openai
import pydantic

# Replace the placeholder with your own key; do not commit a real key with the notebook
os.environ["OPENAI_API_KEY"] = "<the key>"
openai.api_key = os.environ["OPENAI_API_KEY"]

import llama_index

from llama_index.core import Settings

from llama_index.llms.openai import OpenAI
from llama_index.embeddings.openai import OpenAIEmbedding
from llama_index.embeddings.ollama import OllamaEmbedding

from llama_index.llms.ollama import Ollama
#from llama_index.embeddings.huggingface import HuggingFaceEmbedding


from llama_index.core import VectorStoreIndex
from llama_index.core import SimpleDirectoryReader
from llama_index.core import StorageContext
from llama_index.core import load_index_from_storage
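
The import cell above writes the key into the notebook itself. As an alternative, here is a minimal sketch that reads the key from the environment and only asks for it when it is missing. `get_api_key` is a hypothetical helper introduced for illustration; it is not part of `openai` or `llama_index`.

```python
import os
from getpass import getpass


def get_api_key(env=None, name="OPENAI_API_KEY", prompt=getpass):
    """Return the key stored under `name`, asking for it only when it is missing.

    Hypothetical helper for this notebook, not a library function.
    """
    env = os.environ if env is None else env
    key = env.get(name, "").strip()
    if not key:
        # Fall back to a hidden prompt so the key is never saved in the notebook.
        key = prompt(f"Enter {name}: ").strip()
        if not key:
            raise ValueError(f"{name} is required")
        env[name] = key
    return key
```

With this in place, the assignment could become `openai.api_key = get_api_key()`.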

In [3]:
print("LlamaIndex:", llama_index.core.__version__)
print("Pydantic:", pydantic.VERSION)
print("OpenAI:", openai.__version__)

LlamaIndex: 0.11.23
Pydantic: 2.11.4
OpenAI: 1.78.1


In [4]:
import logging
import sys

# Uncomment the two lines below to stream debug logs to stdout
#logging.basicConfig(stream=sys.stdout, level=logging.DEBUG)
#logging.getLogger().addHandler(logging.StreamHandler(stream=sys.stdout))

#### Defining Models

To use [Ollama models](https://ollama.com/search) instead, first check which ones are installed on your local machine (for example with `ollama list`).

In [12]:
#model="gpt-4o"
model = "gpt-4o-mini"
Settings.llm = OpenAI(
    temperature=0,
    model=model,
    #max_tokens=512,
    # Extra sampling parameters are forwarded to the OpenAI API through additional_kwargs
    additional_kwargs={"presence_penalty": 0, "top_p": 1},
)

Settings.embed_model = OpenAIEmbedding(model="text-embedding-3-small")

# Local alternative with Ollama:
#Settings.llm = Ollama(model="llama3.2:latest", request_timeout=300.0)
#Settings.embed_model = OllamaEmbedding(model_name="nomic-embed-text")

#### Defining Folders

In [7]:
print(f"Current dir: {os.getcwd()}")
DOCS_DIR = "../../Data/txt/"
# makedirs also creates missing parent folders and does nothing if the folder exists
os.makedirs(DOCS_DIR, exist_ok=True)
docs = sorted(os.listdir(DOCS_DIR))
print(f"Files in {DOCS_DIR}")
for doc in docs:
    print(doc)

Current dir: c:\Users\Renato Rocha Souza\Documents\Repos\GenAI4Humanists\Notebooks\LlamaIndex
Files in ../../Data/txt/
Pride_Prejudice.txt
RomeoJuliet.txt
WarrenCommissionReport.txt
kafka_metamorphosis.txt
nyc_text.txt
paul_graham_essay.txt
state_of_the_union.txt
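
The listing above shows every entry in the folder. If only text files should be considered, a small filter can be added. The sketch below is one way to do it with `pathlib`; `list_text_files` is a hypothetical helper name, and the `.txt` default is an assumption based on the files shown.

```python
from pathlib import Path


def list_text_files(folder, suffix=".txt"):
    """Return the sorted names of the files in `folder` that end in `suffix`."""
    root = Path(folder)
    # Create the folder (and any missing parents) so a first run does not fail.
    root.mkdir(parents=True, exist_ok=True)
    return sorted(
        p.name for p in root.iterdir()
        if p.is_file() and p.suffix.lower() == suffix
    )
```

Calling `list_text_files(DOCS_DIR)` would then return only the `.txt` file names, in the same case-sensitive order that `sort()` produces.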


In [8]:
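# Load only the Kafka text; load_data() returns a list of Document objects
kafka_path = os.path.join(DOCS_DIR, "kafka_metamorphosis.txt")
documents = SimpleDirectoryReader(input_files=[kafka_path]).load_data()
documents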



In [13]:
index = VectorStoreIndex.from_documents(documents)
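
Conceptually, a vector index embeds each chunk of text, stores the vectors, and at query time returns the chunks whose vectors are closest to the query's vector. The toy sketch below illustrates that idea with the standard library only. It is not how LlamaIndex is implemented: word counts stand in for a real embedding model, the three sentences are made-up sample chunks, and `embed`, `cosine` and `TinyVectorIndex` are hypothetical names.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy 'embedding': a bag-of-words count vector (stand-in for a real model)."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class TinyVectorIndex:
    """Keeps (chunk, vector) pairs in memory and ranks them against a query."""

    def __init__(self, chunks):
        self.entries = [(chunk, embed(chunk)) for chunk in chunks]

    def retrieve(self, query, top_k=2):
        q = embed(query)
        ranked = sorted(self.entries, key=lambda e: cosine(q, e[1]), reverse=True)
        return [chunk for chunk, _ in ranked[:top_k]]


# Made-up sample chunks, used only to exercise the sketch.
chunks = [
    "Gregor woke up transformed into an insect.",
    "Grete played the violin for the lodgers.",
    "The father threw apples at Gregor.",
]
toy_index = TinyVectorIndex(chunks)
print(toy_index.retrieve("Who played the violin?", top_k=1))
```

In a real pipeline the retrieved chunks are then passed to the LLM as context for answering the question, which is the role a query engine plays.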

In [14]:
query_engine = index.as_query_engine()
response = query_engine.query("What is the document about?")
print(response)

The document appears to be a passage from a narrative that explores the dynamics of the Samsa family, particularly focusing on their interactions and emotional states following a significant change in their lives. It highlights their decision to take a break from their usual routines, their reflections on work and future prospects, and the evolving relationship between the family members, especially regarding Grete's development into a young woman. The passage conveys themes of family, change, and the search for a better living situation.


In [15]:
INDEX_DIR = "../../Index/VectorStoreIndex/"
# Create the folder (and any missing parents), then save the index to disk
os.makedirs(INDEX_DIR, exist_ok=True)
index.storage_context.persist(persist_dir=INDEX_DIR)

In [16]:
if not os.path.exists(INDEX_DIR):
    # No saved index yet: build one from every file in DOCS_DIR and persist it
    documents = SimpleDirectoryReader(DOCS_DIR).load_data()
    index = VectorStoreIndex.from_documents(documents)
    index.storage_context.persist(persist_dir=INDEX_DIR)
else:
    # Reuse the saved index instead of embedding the text again
    storage_context = StorageContext.from_defaults(persist_dir=INDEX_DIR)
    index = load_index_from_storage(storage_context)

query_engine = index.as_query_engine()
response = query_engine.query("What is the document about?")
print(response)

The document appears to be a passage from a narrative that explores the dynamics of the Samsa family, particularly focusing on their interactions and emotional states following a significant change in their lives. It highlights their decision to take a break from their usual routines, their reflections on work and future prospects, and the evolving relationship between the family members, especially regarding Grete's development into a young woman. The passage conveys themes of family, change, and the search for a better living situation.
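
The `os.path.exists(INDEX_DIR)` test in the cell above also succeeds for an empty folder, in which case loading would most likely fail because there are no index files to read. A slightly stricter check is sketched below. `has_persisted_index` is a hypothetical helper: it only verifies that the folder contains at least one file and makes no assumption about the file names LlamaIndex writes.

```python
import os


def has_persisted_index(persist_dir):
    """Return True when `persist_dir` is a folder containing at least one file."""
    if not os.path.isdir(persist_dir):
        return False
    with os.scandir(persist_dir) as entries:
        return any(entry.is_file() for entry in entries)
```

The load-or-build condition could then be written as `if not has_persisted_index(INDEX_DIR):`.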


In [18]:
response = query_engine.query("What are the family attitudes towards Gregor Samsa?")
print(response)

The family attitudes towards Gregor Samsa are complex and evolve throughout the narrative. Initially, there is a sense of concern from his mother, who tries to explain his absence from work by claiming he is unwell. She expresses a degree of understanding, highlighting Gregor's dedication to his job and his lack of social life. However, as the story progresses, the family's attitude shifts to one of frustration and disappointment. They become increasingly disturbed by Gregor's transformation and the burden it places on them. 

His father exhibits anger and a desire to distance himself from Gregor, ultimately demanding that outsiders leave their home. Grete, his sister, initially shows some compassion but later becomes resentful and detached, focusing on the physical state of Gregor's body rather than any emotional connection. Overall, the family's attitudes range from concern to resentment, reflecting their struggle to cope with Gregor's drastic change and the impact it has on their li

In [19]:
response = query_engine.query("How does Gregor Samsa feel about his family?")
print(response)

Gregor Samsa experiences a complex mix of emotions towards his family. He feels a sense of responsibility and concern for their well-being, often reflecting on his desire to take over the family's affairs as he did before his transformation. However, he also grapples with feelings of anger and frustration due to the lack of attention and care he receives from them. As time passes, he notices a shift in their behavior, particularly from his sister, who becomes indifferent and hurried in her interactions with him. This change contributes to his feelings of isolation and despair, as he observes their struggles and the emotional toll his condition takes on them. Overall, Gregor's feelings are marked by a blend of love, guilt, and resentment as he navigates his altered relationship with his family.
