In [1]:
%pip install llama-index llama-index-llms-ollama llama-index-embeddings-ollama sentence-transformers

import logging
import sys

# basicConfig already attaches a stdout handler; adding a second one would print every record twice.
logging.basicConfig(stream=sys.stdout, level=logging.ERROR)


Note: you may need to restart the kernel to use updated packages.


In [2]:
from llama_index.core import Settings
from llama_index.llms.ollama import Ollama
from llama_index.embeddings.ollama import OllamaEmbedding

# Configure Ollama LLM
ollama_llm = Ollama(
    model="llama3.2:latest",
    base_url="http://localhost:11434",
    temperature=0.1
)

# Configure embedding model
ollama_embedding = OllamaEmbedding(
    model_name="nomic-embed-text:latest",
    base_url="http://localhost:11434",
    ollama_additional_kwargs={"mirostat": 0}
)

Settings.llm = ollama_llm
Settings.embed_model = ollama_embedding

In [3]:
from llama_index.core import SimpleDirectoryReader

documents = SimpleDirectoryReader(input_files=['../data/paul_graham_essay3.txt']).load_data()
# documents = SimpleDirectoryReader(input_files=['../data/2022 Q3 AAPL.pdf']).load_data()

In [4]:
import nest_asyncio
nest_asyncio.apply()

In [5]:
from llama_index.core import VectorStoreIndex

vector_index = VectorStoreIndex.from_documents(documents, embed_model=ollama_embedding)


In [6]:
from llama_index.core.postprocessor import SentenceTransformerRerank

rerank = SentenceTransformerRerank(
    model="cross-encoder/ms-marco-MiniLM-L-2-v2", top_n=3
)


In [7]:
from time import time

In [9]:
query_engine = vector_index.as_query_engine(
    similarity_top_k=10, node_postprocessors=[rerank]
)

now = time()
response = query_engine.query(
    "Which grad schools did the author apply for and why?",
)
print(f"Elapsed: {round(time() - now, 2)}s")

Elapsed: 7.61s
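
The cell above retrieves 10 candidates by embedding similarity and then lets the reranker keep the best 3. The two-stage idea can be sketched in plain Python as below. This is a toy illustration only, not how LlamaIndex or the cross-encoder works internally: `retrieve_then_rerank`, `word_overlap`, and `phrase_bonus` are hypothetical names, and the scoring functions are simple stand-ins for the embedding model and the cross-encoder.

```python
from typing import Callable, List, Tuple


def retrieve_then_rerank(
    query: str,
    docs: List[str],
    cheap_score: Callable[[str, str], float],
    precise_score: Callable[[str, str], float],
    top_k: int = 10,
    top_n: int = 3,
) -> List[Tuple[str, float]]:
    """Keep the top_k docs by a cheap score, then return the top_n by a precise score."""
    # Stage 1: a fast, coarse ranking over every document (stand-in for vector search).
    candidates = sorted(docs, key=lambda d: cheap_score(query, d), reverse=True)[:top_k]
    # Stage 2: a slower, finer ranking over the few survivors (stand-in for the reranker).
    rescored = [(d, precise_score(query, d)) for d in candidates]
    rescored.sort(key=lambda pair: pair[1], reverse=True)
    return rescored[:top_n]


def word_overlap(query: str, doc: str) -> float:
    # Hypothetical cheap scorer: number of distinct words shared by query and doc.
    return float(len(set(query.lower().split()) & set(doc.lower().split())))


def phrase_bonus(query: str, doc: str) -> float:
    # Hypothetical precise scorer: word overlap plus a bonus for an exact phrase match.
    return word_overlap(query, doc) + (5.0 if query.lower() in doc.lower() else 0.0)


docs = [
    "I applied to art school and left grad studies",
    "grad school applications were due in the fall",
    "Lisp was interesting for its own sake",
    "the school of design ran a summer program",
]

top = retrieve_then_rerank("grad school", docs, word_overlap, phrase_bonus, top_k=3, top_n=2)
for doc, score in top:
    print(f"{score:.1f}  {doc}")
```

In this toy data the first two documents tie in stage 1, and stage 2 moves the exact phrase match to the front, which is the kind of reordering a reranker is meant to provide.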


In [10]:

print(response)

The author applied to two art schools: RISD (Rhode Island School of Design) in the US and the Accademia di Belli Arti in Florence. The reason for applying to both is that the author wanted to attend an art school, but was also considering dropping out of their PhD program in computer science due to its demanding nature.


In [11]:
print(response.get_formatted_sources(length=200))


> Source (Doc id: b0d66b47-cd43-4bb3-9a8e-147fc8b6c3ff): I didn't want to drop out of grad school, but how else was I going to get out? I remember when my friend Robert Morris got kicked out of Cornell for writing the internet worm of 1988, I was envious...

> Source (Doc id: a7ab28d2-86bf-4718-89f3-8592fb379b35): So I looked around to see what I could salvage from the wreckage of my plans, and there was Lisp. I knew from experience that Lisp was interesting for its own sake and not just for its association ...

> Source (Doc id: f0897b54-a49f-42d1-8f13-a293be59e246): They were an impressive group. That first batch included reddit, Justin Kan and Emmett Shear, who went on to found Twitch, Aaron Swartz, who had already helped write the RSS spec and would a few ye...


In [13]:
query_engine = vector_index.as_query_engine(similarity_top_k=10)


now = time()
response = query_engine.query(
    "Which grad schools did the author apply for and why?",
)

print(f"Elapsed: {round(time() - now, 2)}s")

Elapsed: 9.99s
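
The two timings above each come from a single run, so the difference may partly reflect noise or model warm-up rather than the reranker itself. A minimal sketch of a repeatable measurement, assuming a hypothetical helper named `time_call` and using a cheap stand-in workload in place of the query:

```python
from statistics import median
from time import perf_counter
from typing import Any, Callable, List, Tuple


def time_call(fn: Callable[[], Any], repeats: int = 3) -> Tuple[Any, List[float]]:
    """Call fn `repeats` times; return the last result and the per-run durations in seconds."""
    timings: List[float] = []
    result = None
    for _ in range(repeats):
        start = perf_counter()  # monotonic clock, better suited to intervals than time()
        result = fn()
        timings.append(perf_counter() - start)
    return result, timings


# Stand-in workload so the sketch runs without an index or an LLM server.
result, timings = time_call(lambda: sum(range(100_000)), repeats=5)
print(f"median: {median(timings):.4f}s over {len(timings)} runs")
```

In this notebook the workload would be something like `lambda: query_engine.query("Which grad schools did the author apply for and why?")`. Repeated identical queries may still be influenced by caching on the serving side, so the median is a rough guide rather than a benchmark.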


In [14]:
print(response)


The author applied to two art schools: RISD in the US, and the Accademia di Belli Arti in Florence, Italy.

He applied to RISD because he was only 25 years old and wanted to attend college again, which was not as strange as it sounds since many students at that age were already enrolled. He also had a good foundation in drawing, color, and design from the RISD summer program.

The author did not apply to the Accademia di Belli Arti for artistic reasons, but rather because he imagined it would be prestigious and good for his career as an artist.


In [15]:
print(response.get_formatted_sources(length=200))


> Source (Doc id: a7ab28d2-86bf-4718-89f3-8592fb379b35): So I looked around to see what I could salvage from the wreckage of my plans, and there was Lisp. I knew from experience that Lisp was interesting for its own sake and not just for its association ...

> Source (Doc id: a8e52a9f-f439-453b-8d04-460129652ded): A lot of Lisp hackers dream of building a new Lisp, partly because one of the distinctive features of the language is that it has dialects, and partly, I think, because we have in our minds a Plato...

> Source (Doc id: f44985e9-ded7-4237-8a94-46c751745c0d): I couldn't have put this into words when I was 18. All I knew at the time was that I kept taking philosophy courses and they kept being boring. So I decided to switch to AI.

AI was in the air in t...

> Source (Doc id: aa9b1c4a-b54c-4da9-8138-30337e96bf01): What I Worked On

February 2021

Before college the two main things I worked on, outside of school, were writing and programming. I didn't write essays. I wrote what 

In [None]:
query_engine = vector_index.as_query_engine()


In [None]:
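# This question appears intended for the financial report (the commented-out AAPL PDF in cell 3),
# so load that file instead of the essay before running the cells below.
response = query_engine.query("How much did profit and sales grow?")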


In [None]:
print(response)

In [None]:
tree_summarize_query_engine = vector_index.as_query_engine(response_mode="tree_summarize")
response = tree_summarize_query_engine.query("How much did profit and sales grow?")
print("Tree Summarize Response:")
print(response)

In [None]:
from llama_index.core.response_synthesizers.type import ResponseMode
print(ResponseMode.__members__)

In [None]:
from llama_index.core.response_synthesizers.type import ResponseMode
# simple_summarize_query_engine = vector_index.as_query_engine(response_mode="simple_summarize", verbose=True)
# Refine mode: build the answer sequentially, one retrieved chunk at a time.
refine_query_engine = vector_index.as_query_engine(response_mode=ResponseMode.REFINE)

response = refine_query_engine.query("How much did profit and sales grow?")
print("Refine Response:")
print(response)
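
The `refine` and `tree_summarize` modes used above differ mainly in how the retrieved chunks are fed to the LLM: refine walks the chunks one after another, updating a running answer, while tree summarize combines chunks in groups and repeats until one answer remains. The sketch below is conceptual only and is not LlamaIndex's implementation: the functions, the `fan_out` parameter, and the string-joining "LLM" stand-ins are all hypothetical, and the real synthesizers pack chunks by prompt size rather than by a fixed group count.

```python
from typing import Callable, Dict, List

calls: Dict[str, int] = {"refine": 0, "tree": 0}  # counts the fake LLM calls per strategy


def refine(chunks: List[str], answer_fn: Callable[[str], str],
           refine_fn: Callable[[str, str], str]) -> str:
    """Answer from the first chunk, then revise that answer once per remaining chunk."""
    if not chunks:
        raise ValueError("need at least one chunk")
    answer = answer_fn(chunks[0])
    for chunk in chunks[1:]:
        answer = refine_fn(answer, chunk)
    return answer


def tree_summarize(chunks: List[str], combine_fn: Callable[[List[str]], str],
                   fan_out: int = 2) -> str:
    """Combine chunks in groups of `fan_out`, level by level, until one text remains."""
    if not chunks or fan_out < 2:
        raise ValueError("need at least one chunk and fan_out >= 2")
    level = list(chunks)
    while True:
        level = [combine_fn(level[i:i + fan_out]) for i in range(0, len(level), fan_out)]
        if len(level) == 1:
            return level[0]


# Fake "LLM" calls that only record how the chunks were combined.
def fake_answer(chunk: str) -> str:
    calls["refine"] += 1
    return chunk


def fake_refine(answer: str, chunk: str) -> str:
    calls["refine"] += 1
    return f"{answer}+{chunk}"


def fake_combine(group: List[str]) -> str:
    calls["tree"] += 1
    return "(" + " ".join(group) + ")"


chunks = ["a", "b", "c", "d"]
refined = refine(chunks, fake_answer, fake_refine)
summarized = tree_summarize(chunks, fake_combine, fan_out=2)
print(refined, calls["refine"])
print(summarized, calls["tree"])
```

The refine path is strictly sequential, so each call depends on the previous answer, whereas the calls within one level of the tree are independent of each other.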