# Challenge

## 🎯 Problem
- ⚡ Input:
    - 📄 The `billionaires_page.pdf`
- 🔄 Output: A smart index capable of answering:
    - Who was the second richest person in the world in 2023?
    - What was the net worth of the richest person in 2023?
    - What was the age of the richest person in 2022?
    - Which was the primary source of wealth of the richest person in 2022?

## Code

In [None]:
# Quote the requirement so the shell does not treat ">" as an output redirect
%pip install "llama-index>=0.11.20"

In [None]:
!mkdir -p 'data/'
!wget 'https://github.com/run-llama/llama_index/blob/main/docs/docs/examples/query_engine/pdf_tables/billionaires_page.pdf?raw=true' -O 'data/billionaires_page.pdf'
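The download URL points at a GitHub `blob` page and relies on `?raw=true` to redirect to the file itself. If that redirect does not happen, `wget` can end up saving an HTML page under the `.pdf` name. A minimal sanity check, assuming only that a well-formed PDF starts with the `%PDF-` marker; `looks_like_pdf` is a hypothetical helper name, not part of any library:

```python
from pathlib import Path


def looks_like_pdf(path: str) -> bool:
    """Return True if the file exists and starts with the PDF header marker."""
    file_path = Path(path)
    if not file_path.is_file():
        return False
    with file_path.open("rb") as handle:
        # A well-formed PDF begins with the ASCII marker "%PDF-".
        return handle.read(5) == b"%PDF-"
```

Calling `looks_like_pdf("data/billionaires_page.pdf")` right after the download is a cheap way to catch a bad fetch before any indexing work starts.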

In [None]:
import nest_asyncio
nest_asyncio.apply()

In [None]:
from llama_index.embeddings.openai import OpenAIEmbedding
from llama_index.llms.openai import OpenAI
from llama_index.core import Settings
from rich import print as rprint
import os

In [None]:
# Set the OPENAI_API_KEY environment variable; replace the placeholder with your own key
os.environ["OPENAI_API_KEY"] = "your-openai-api-key"

In [None]:
Settings.llm = OpenAI(model="gpt-3.5-turbo")
Settings.embed_model = OpenAIEmbedding(model="text-embedding-3-small")

In [None]:
from llama_index.core import VectorStoreIndex

def create_index(text_path: str) -> VectorStoreIndex:
    # add your code here...
    raise NotImplementedError
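As a starting point for the stub above, here is a naive baseline sketch, not the intended solution. It assumes the default `SimpleDirectoryReader` PDF parsing is acceptable and uses the hypothetical name `create_index_baseline` so that it does not replace `create_index`. Plain text extraction tends to flatten tables, so treat this as a reference to improve on.

```python
from pathlib import Path


def create_index_baseline(text_path: str):
    """Naive baseline: load every file under text_path and index it as-is."""
    data_dir = Path(text_path)
    if not data_dir.is_dir():
        raise FileNotFoundError(f"Data directory not found: {data_dir}")

    # Imported here so the path check above fails fast without loading the library.
    from llama_index.core import SimpleDirectoryReader, VectorStoreIndex

    # Parse each file with the default reader for its extension.
    documents = SimpleDirectoryReader(input_dir=str(data_dir)).load_data()
    # Chunk, embed and store the documents using the global Settings.
    return VectorStoreIndex.from_documents(documents)
```

Comparing the answers from this baseline with those from your own `create_index` is one way to judge whether a table-aware approach actually helps.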

In [None]:
index = create_index("data")
engine = index.as_query_engine()

In [None]:
rprint(engine.query("""Who was the second richest person in the world in 2023?""").response)

In [None]:
rprint(engine.query("""What was the net worth of the richest person in 2023?""").response)

In [None]:
rprint(engine.query("""What was the age of the richest person in 2022?""").response)

In [None]:
rprint(engine.query("""Which was the primary source of wealth of the richest person in 2022?""").response)
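The four query cells above repeat the same call, so they can be collapsed into one loop. The sketch below assumes only what the cells already rely on: an engine with a `query` method whose result has a `.response` attribute. `QUESTIONS` and `ask_all` are hypothetical names introduced for illustration.

```python
from typing import Any, Dict, Iterable

# The four challenge questions, copied from the problem statement.
QUESTIONS = [
    "Who was the second richest person in the world in 2023?",
    "What was the net worth of the richest person in 2023?",
    "What was the age of the richest person in 2022?",
    "Which was the primary source of wealth of the richest person in 2022?",
]


def ask_all(engine: Any, questions: Iterable[str]) -> Dict[str, str]:
    """Query the engine once per question and collect the answer text."""
    return {
        question: str(engine.query(question).response)
        for question in questions
    }
```

With the real engine, `ask_all(engine, QUESTIONS)` returns a dict mapping each question to its answer, which can be passed to `rprint` directly or compared across different index implementations.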