In [1]:
import pandas as pd
import numpy as np

In [15]:
from llama_index import (
    VectorStoreIndex, SimpleDirectoryReader, ServiceContext)
from llama_index.llms import OpenAI

%load_ext dotenv
%dotenv

The dotenv extension is already loaded. To reload it, use:
  %reload_ext dotenv
cannot find .env file
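
Since the output above reports that no `.env` file was found, the OpenAI calls in later cells depend on the API key already being present in the environment. A minimal sketch of a pre-flight check, assuming the key is stored in `OPENAI_API_KEY`; the helper name `require_env` is hypothetical and not part of llama-index or dotenv:

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable, or raise a clear error."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(
            f"{name} is not set; export it or create a .env file first."
        )
    return value
```

Calling `require_env("OPENAI_API_KEY")` before building the index would fail early with a readable message instead of failing partway through an embedding request.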


In [21]:
data_fn = "/app/data/10k_pdf/NVDA/2021-01-31.pdf"

# SimpleDirectoryReader's first argument is a directory, so a single file is passed via input_files
documents = SimpleDirectoryReader(input_files=[data_fn]).load_data()


In [None]:

index = VectorStoreIndex.from_documents(documents)

In [7]:
query_engine = index.as_query_engine()
response = query_engine.query("What does the company nvidia do?")

In [8]:
print(response)

NVIDIA is a company that specializes in computing platforms. They have developed GPUs (Graphics Processing Units) and sophisticated software that enhance the gaming experience by providing smoother, higher quality graphics. In addition to gaming, NVIDIA's platforms are used in scientific computing, artificial intelligence, data science, autonomous vehicles, robotics, and augmented and virtual reality. They have a platform strategy that brings together hardware, software, algorithms, and services to create unique value for the markets they serve. NVIDIA has invested heavily in research and development, resulting in inventions that are essential to modern computing. They are known for their leadership in computer graphics and their CUDA programming model, which allows for general-purpose computing using GPUs.


# Using different LLMs

https://gpt-index.readthedocs.io/en/stable/understanding/using_llms/using_llms.html

In [10]:
llm = OpenAI(temperature=0.1, model="gpt-3.5-turbo")  # or model="gpt-4"

In [12]:
response = llm.complete("Who is George Washington?")
print(response)

George Washington was the first President of the United States. He served as the President from 1789 to 1797. He is often referred to as the "Father of His Country" due to his significant role in the American Revolutionary War and his leadership in establishing the United States as a new nation. Washington was also a key figure in the drafting of the United States Constitution. He is widely respected for his integrity, leadership, and commitment to the principles of democracy.


In [22]:
service_context = ServiceContext.from_defaults(llm=llm)

# Using the documents from the cell above
index_service_context = VectorStoreIndex.from_documents(
    documents, service_context=service_context)


In [24]:
query_engine = index_service_context.as_query_engine()
response = query_engine.query("What does the company nvidia do?")

In [25]:
print(response)

NVIDIA is a company that specializes in accelerated computing and GPU technology. They have expanded their focus beyond PC graphics and now provide platforms for various computationally intensive fields such as scientific computing, artificial intelligence (AI), data science, autonomous vehicles, robotics, and augmented and virtual reality (AR and VR). NVIDIA's GPU architecture enables parallel processing capabilities, making it essential for running deep learning algorithms and powering AI applications. They have a platform strategy that brings together hardware, software, algorithms, libraries, systems, and services to create value for the markets they serve. NVIDIA has invested heavily in research and development, resulting in inventions that have shaped modern computing, including the GPU and the CUDA programming model for general-purpose computing.


In [27]:
print(query_engine.query("how much money did nvidia make in 2021?"))

NVIDIA made $16,675 million in 2021.


In [28]:
from llama_index.llms import HuggingFaceLLM

In [29]:
# Set up prompts - specific to Writer/camel-5b-hf
from llama_index.prompts import PromptTemplate

# This will wrap the default prompts that are internal to llama-index
# taken from https://huggingface.co/Writer/camel-5b-hf
query_wrapper_prompt = PromptTemplate(
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{query_str}\n\n### Response:"
)
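
To see what the wrapper produces, the same template text can be rendered with plain `str.format`. This only illustrates the final prompt string and does not go through `PromptTemplate`; the names `TEMPLATE` and `render` and the sample question are placeholders for this sketch:

```python
# Plain-string copy of the template above, for inspection only.
TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{query_str}\n\n### Response:"
)


def render(query_str: str) -> str:
    """Substitute the user query into the instruction-style template."""
    return TEMPLATE.format(query_str=query_str)


rendered = render("What does the company nvidia do?")
print(rendered)
```

The model is expected to continue the text after `### Response:`, which is why the template ends there.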

In [30]:
import torch

llm = HuggingFaceLLM(
    context_window=2048,
    max_new_tokens=256,
    # With do_sample=False decoding is greedy, so temperature has no effect here
    generate_kwargs={"temperature": 0.25, "do_sample": False},
    query_wrapper_prompt=query_wrapper_prompt,
    tokenizer_name="Writer/camel-5b-hf",
    model_name="Writer/camel-5b-hf",
    device_map="auto",
    tokenizer_kwargs={"max_length": 2048},
    # uncomment this if using CUDA to reduce memory usage
    # model_kwargs={"torch_dtype": torch.float16}
)
service_context = ServiceContext.from_defaults(chunk_size=512, llm=llm)

  from .autonotebook import tqdm as notebook_tqdm
Downloading (…)lve/main/config.json: 100%|██████████| 787/787 [00:00<00:00, 2.15MB/s]


ImportError: Using `low_cpu_mem_usage=True` or a `device_map` requires Accelerate: `pip install accelerate`
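
The error above says that passing `device_map` requires the `accelerate` package. Installing it with `pip install accelerate`, as the message suggests, and restarting the kernel is the direct fix. As a sketch, a small check run before the model cell can report missing optional packages up front; `missing_packages` is a hypothetical helper, not part of llama-index or transformers:

```python
import importlib.util


def missing_packages(names):
    """Return the subset of `names` that cannot be imported in this environment."""
    return [n for n in names if importlib.util.find_spec(n) is None]


missing = missing_packages(["accelerate"])
if missing:
    print("Install before loading the model: pip install " + " ".join(missing))
```

Running this ahead of the `HuggingFaceLLM(...)` call avoids starting a model download that then stops on the import error.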