In [14]:
%pip install -Uq llama-index openai langchain

Note: you may need to restart the kernel to use updated packages.


## imports

In [1]:
import logging
import sys

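# Send INFO-level logs to stdout so token usage is visible in the notebook.
logging.basicConfig(stream=sys.stdout, level=logging.INFO)
# Note: basicConfig already installs a stdout handler, so this second handler
# makes each log message appear twice in the outputs below
# (once with the "INFO:root:" prefix, once without).
logging.getLogger().addHandler(logging.StreamHandler(stream=sys.stdout))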

from llama_index import GPTSimpleVectorIndex, download_loader, SimpleDirectoryReader, LLMPredictor
from IPython.display import Markdown, display
from langchain.llms import OpenAIChat


from pathlib import Path
import warnings
warnings.filterwarnings('ignore')

from dotenv import load_dotenv

# Load the OpenAI API key (OPENAI_API_KEY) from a local .env file
load_dotenv()

True
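
The `True` above only indicates that `load_dotenv()` loaded something from a `.env` file, not that the specific key is present. A minimal sketch of an explicit check, assuming the key is stored under the name `OPENAI_API_KEY`; the helper `require_env` is hypothetical and not part of any library used here:

```python
import os


def require_env(name: str) -> str:
    """Return the value of environment variable `name`, or raise if it is missing or empty."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"{name} is not set; add it to your .env file")
    return value
```

Calling `require_env("OPENAI_API_KEY")` right after `load_dotenv()` would fail fast with a readable message instead of an authentication error at query time.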

## data loader

In [4]:
PDFReader = download_loader("PDFReader")

loader = PDFReader()
documents = loader.load_data(file=Path('pdfs/lecture01-intro-2up.pdf'))
# print(documents)

## manual construction

source: https://github.com/emptycrown/llama-hub/blob/main/loader_hub/file/pdf/base.py

In [2]:
from pypdf import PdfReader
import re
from io import BytesIO
from llama_index import Document


def parse_pdf(file: BytesIO):
    """Read a PDF from a binary file object and return its text as one Document."""
    pdf = PdfReader(file)
    text_list = []

    # Iterate over every page and extract its text
    for page in pdf.pages:
        text_list.append(page.extract_text())

    # Join the pages into a single string, separated by newlines
    text = "\n".join(text_list)

    return [Document(text)]


with open('pdfs/lecture01-intro-2up.pdf', 'rb') as file:
    manual_load = parse_pdf(file)
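
Text extracted from PDFs often contains words hyphenated across line breaks and irregular whitespace. As a sketch of an optional clean-up step (not part of the loader linked above; `clean_text` is a hypothetical helper), the joined text could be normalised before it is wrapped in a `Document`:

```python
import re


def clean_text(text: str) -> str:
    """Light normalisation for PDF-extracted text (illustrative only)."""
    # Rejoin words split by a hyphen at a line break: "intelli-\ngence" -> "intelligence"
    text = re.sub(r"(\w)-\n(\w)", r"\1\2", text)
    # Collapse runs of spaces and tabs into a single space
    text = re.sub(r"[ \t]+", " ", text)
    # Collapse three or more consecutive newlines into a single blank line
    text = re.sub(r"\n{3,}", "\n\n", text)
    return text.strip()
```

One caveat: the first rule also merges genuinely hyphenated compounds that happen to break at a line end, so it may not suit every document.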

## creating index

In [5]:
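# Build a vector index from the loader output
# (`manual_load` from the previous section is also a list of Document objects)
index = GPTSimpleVectorIndex(documents)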

INFO:root:> [build_index_from_documents] Total LLM token usage: 0 tokens
> [build_index_from_documents] Total LLM token usage: 0 tokens
INFO:root:> [build_index_from_documents] Total embedding token usage: 1672 tokens
> [build_index_from_documents] Total embedding token usage: 1672 tokens


In [6]:
index.save_to_disk('index.json')

In [3]:
# load from disk
index = GPTSimpleVectorIndex.load_from_disk('index.json')
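
Saving and reloading avoids paying for embeddings again on every run. The two cells above can be folded into one "load if cached, otherwise build and save" step. A minimal sketch, where `load_or_build` is a hypothetical generic helper and the callables are supplied by the caller:

```python
from pathlib import Path
from typing import Callable, TypeVar

T = TypeVar("T")


def load_or_build(
    path: Path,
    load: Callable[[str], T],
    build: Callable[[], T],
    save: Callable[[T, str], None],
) -> T:
    """Return the cached object at `path` if it exists; otherwise build and save it."""
    if path.exists():
        return load(str(path))
    obj = build()
    save(obj, str(path))
    return obj


# Possible usage with the calls shown in this notebook (left commented out
# because it needs llama_index and the `documents` list):
# index = load_or_build(
#     Path("index.json"),
#     load=GPTSimpleVectorIndex.load_from_disk,
#     build=lambda: GPTSimpleVectorIndex(documents),
#     save=lambda idx, p: idx.save_to_disk(p),
# )
```

Note that the cache is keyed only on the file name, so `index.json` would need to be deleted by hand whenever the source PDF changes.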

## query chatgpt

In [5]:
# LLM Predictor (gpt-3.5-turbo)
llm_predictor = LLMPredictor(llm=OpenAIChat(temperature=0, model_name="gpt-3.5-turbo"))

In [7]:
response = index.query(
    "Summarize this lecture in bullet points.",
    llm_predictor=llm_predictor,
    similarity_top_k=3
)

INFO:root:> [query] Total LLM token usage: 1865 tokens
> [query] Total LLM token usage: 1865 tokens
INFO:root:> [query] Total embedding token usage: 9 tokens
> [query] Total embedding token usage: 9 tokens
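
`similarity_top_k=3` asks the index to hand the three stored chunks most similar to the query to the LLM, which is why the LLM token count is much larger than the 9-token query. The library's internals are not shown in this notebook; the following is only a conceptual sketch of top-k retrieval by cosine similarity, with hypothetical names (`cosine`, `top_k`) and toy 2-D vectors standing in for real embeddings:

```python
import math
from typing import List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either has zero length)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(
    query: Sequence[float],
    chunks: List[Tuple[str, Sequence[float]]],
    k: int,
) -> List[str]:
    """Return the texts of the k chunks whose embeddings are closest to `query`."""
    ranked = sorted(chunks, key=lambda c: cosine(query, c[1]), reverse=True)
    return [text for text, _ in ranked[:k]]


# Toy data: three "chunks" with made-up 2-D embeddings
chunks = [("a", [1.0, 0.0]), ("b", [0.0, 1.0]), ("c", [1.0, 1.0])]
print(top_k([1.0, 0.1], chunks, k=2))
```

A larger `k` gives the model more context at the cost of more prompt tokens, which matches the higher LLM token usage of the later query run with `similarity_top_k=5`.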


In [8]:
display(Markdown(f"<b>{response}</b>"))

<b>- Introduction to Artificial Intelligence (AI)
- Definition of AI: creating machines that perform intelligent functions
- Characteristics of intelligence: perception, action, reasoning, learning, communication, planning
- Turing Test for measuring intelligent behavior
- Acting rationally: designing rational agents to achieve the best outcome
- Brief history of AI: early success, collapse, industry boom and bust, emergence of intelligent agents, deep learning
- Strong AI and the concept of singularity
- Examples of AI achievements: defeating human champions in chess, checkers, Jeopardy!, Go, and poker; proving mathematical conjectures; controlling spacecraft operations; driverless cars; progress in image and speech recognition, machine translation, and robotic scientists
- AI continues to find applications in various fields.</b>

In [6]:
response = index.query(
    "Give me 3 practice questions with answers based on the content of this lecture.", 
    llm_predictor=llm_predictor,
    similarity_top_k=5
)

INFO:root:> [query] Total LLM token usage: 1976 tokens
> [query] Total LLM token usage: 1976 tokens
INFO:root:> [query] Total embedding token usage: 15 tokens
> [query] Total embedding token usage: 15 tokens


In [7]:
display(Markdown(f"<b>{response}</b>"))

<b>

1. What is the definition of AI?
Answer: AI stands for Artificial Intelligence, which is the art of creating machines that perform functions that require intelligence when performed by humans. It is the study of the computations that make it possible to perceive, reason, and act.

2. What are the four general characteristics of intelligence?
Answer: The four general characteristics of intelligence are perception, action, reasoning, and learning. Perception involves the manipulation and interpretation of data provided by sensors, while action involves the control and use of effectors to accomplish a variety of tasks. Reasoning includes deductive (logical) inference and inductive inference, while learning involves adapting behavior to better cope with changing environments, discovery of patterns, learning to reason, plan, and act.

3. What are some examples of what AI can do?
Answer: AI has achieved many impressive feats, including defeating world champions in games like chess, checkers, and Go, as well as beating human champions on the game show Jeopardy! It has also been used for logistics planning and scheduling in the military, as well as controlling the operations of spacecraft and rovers on Mars. AI has also made great progress in image recognition, speech recognition, machine translation, and driverless cars.</b>