# Lesson 1: Advanced RAG Pipeline

In [1]:
import utils

import os
import openai
openai.api_key = utils.get_openai_api_key()



✅ In Answer Relevance, input prompt will be set to __record__.main_input or `Select.RecordInput` .
✅ In Answer Relevance, input response will be set to __record__.main_output or `Select.RecordOutput` .
✅ In Context Relevance, input prompt will be set to __record__.main_input or `Select.RecordInput` .
✅ In Context Relevance, input response will be set to __record__.calls[-1].rets.source_nodes[:].node.text .
✅ In Groundedness, input source will be set to __record__.calls[-1].rets.source_nodes[:].node.text .
✅ In Groundedness, input statement will be set to __record__.main_output or `Select.RecordOutput` .


In [2]:
#from llama_index import SimpleDirectoryReader
from llama_index.core import SimpleDirectoryReader

documents = SimpleDirectoryReader(
    input_files=["./eBook-How-to-Build-a-Career-in-AI.pdf"]
).load_data()

In [3]:
print(type(documents), "\n")
print(len(documents), "\n")
print(type(documents[0]))
print(documents[0])

<class 'list'> 

41 

<class 'llama_index.core.schema.Document'>
Doc ID: d3ee076f-bc22-4752-a2f9-b66b976deb83
Text: PAGE 1 Founder, DeepLearning.AI Collected Insights from Andrew
Ng How to  Build Your Career in AI A Simple Guide


## Basic RAG pipeline

In [4]:
#from llama_index import Document
from llama_index.core import Document


document = Document(text="\n\n".join([doc.text for doc in documents]))

In [5]:
# Legacy API (ServiceContext), kept for reference only.
# The next cell does the same setup with Settings.
#
# from llama_index import VectorStoreIndex
# from llama_index import ServiceContext
# from llama_index.llms import OpenAI
#
# llm = OpenAI(model="gpt-3.5-turbo", temperature=0.1)
# service_context = ServiceContext.from_defaults(
#     llm=llm, embed_model="local:BAAI/bge-small-en-v1.5"
# )
# index = VectorStoreIndex.from_documents([document],
#                                         service_context=service_context)

In [6]:
from llama_index.core import VectorStoreIndex, Settings
from llama_index.llms.openai import OpenAI
from llama_index.embeddings.huggingface import HuggingFaceEmbedding

# Define LLM and embedding
llm = OpenAI(model="gpt-3.5-turbo", temperature=0.1)
embed_model = HuggingFaceEmbedding(model_name="BAAI/bge-small-en-v1.5")

# Apply globally (so all indices/query_engines use these defaults)
Settings.llm = llm
Settings.embed_model = embed_model

# Now you can build the index without passing service_context
index = VectorStoreIndex.from_documents([document])


2025-08-27 16:32:46,793 - INFO - Load pretrained SentenceTransformer: BAAI/bge-small-en-v1.5
2025-08-27 16:32:49,212 - INFO - 1 prompt is loaded, with the key: query


In [7]:
query_engine = index.as_query_engine()

In [8]:
response = query_engine.query(
    "What are steps to take when finding projects to build your experience?"
)
print(str(response))

2025-08-27 16:32:54,022 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"


Develop a side hustle, ensure the project helps you grow technically, collaborate with good teammates, and consider if the project can serve as a stepping stone to larger projects.
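
To make the retrieve-then-generate flow behind `query_engine.query(...)` more concrete, here is a minimal sketch in plain Python. It is not LlamaIndex code: the bag-of-words `embed` function is a toy stand-in for the `BAAI/bge-small-en-v1.5` model, and `retrieve` and `build_prompt` are hypothetical names used only for illustration.

```python
import math
from collections import Counter

def embed(text):
    """Toy embedding: lowercase bag-of-words counts (stand-in for a real model)."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(query, chunks, top_k=2):
    """Return the top_k chunks most similar to the query."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:top_k]

def build_prompt(query, context_chunks):
    """Assemble the retrieved context and the question into one LLM prompt."""
    context = "\n".join(context_chunks)
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"

# Hypothetical chunks, loosely inspired by the eBook's topics
chunks = [
    "Work on projects to build experience",
    "Networking helps you find a job",
    "Imposter syndrome is common among newcomers",
]
top = retrieve("how do projects build experience", chunks, top_k=1)
prompt = build_prompt("how do projects build experience", top)
print(prompt)
```

In the real pipeline the prompt is then sent to the LLM (`gpt-3.5-turbo` above), which produces the final answer.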


## Evaluation setup using TruLens

In [9]:
eval_questions = []
with open('eval_questions.txt', 'r') as file:
    for line in file:
        # Strip surrounding whitespace, including the trailing newline
        item = line.strip()
        print(item)
        eval_questions.append(item)

What are the keys to building a career in AI?
How can teamwork contribute to success in AI?
What is the importance of networking in AI?
What are some good habits to develop for a successful career?
How can altruism be beneficial in building a career?
What is imposter syndrome and how does it relate to AI?
Who are some accomplished individuals who have experienced imposter syndrome?
What is the first step to becoming good at AI?
What are some common challenges in AI?
Is it normal to find parts of AI challenging?
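
If your own question file contains blank lines, the loop above would append empty strings to `eval_questions`. A slightly more defensive loader might look like the sketch below. The function name `load_eval_questions` and the demo file contents are assumptions made for illustration; they are not part of the course's `utils.py`.

```python
import os
import tempfile

def load_eval_questions(path):
    """Read one question per line, dropping blank lines and surrounding whitespace."""
    questions = []
    with open(path, "r", encoding="utf-8") as file:
        for line in file:
            item = line.strip()
            if item:  # skip empty lines, e.g. a blank line between questions
                questions.append(item)
    return questions

# Demo on a temporary file (hypothetical contents, not the course's eval_questions.txt)
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False, encoding="utf-8") as tmp:
    tmp.write(
        "What is the first step to becoming good at AI?\n"
        "\n"
        "  Is it normal to find parts of AI challenging?  \n"
    )
    demo_path = tmp.name

demo_questions = load_eval_questions(demo_path)
os.remove(demo_path)
print(demo_questions)
```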


In [10]:
# You can try your own question:
new_question = "What is the right AI job for me?"
eval_questions.append(new_question)

In [11]:
print(eval_questions)

['What are the keys to building a career in AI?', 'How can teamwork contribute to success in AI?', 'What is the importance of networking in AI?', 'What are some good habits to develop for a successful career?', 'How can altruism be beneficial in building a career?', 'What is imposter syndrome and how does it relate to AI?', 'Who are some accomplished individuals who have experienced imposter syndrome?', 'What is the first step to becoming good at AI?', 'What are some common challenges in AI?', 'Is it normal to find parts of AI challenging?', 'What is the right AI job for me?']


In [12]:
from trulens_eval import Tru
tru = Tru()

tru.reset_database()

2025-08-27 16:33:06,386 - INFO - Context impl SQLiteImpl.
2025-08-27 16:33:06,387 - INFO - Will assume non-transactional DDL.
2025-08-27 16:33:06,393 - INFO - Context impl SQLiteImpl.
2025-08-27 16:33:06,393 - INFO - Will assume non-transactional DDL.


🦑 Initialized with db url sqlite:///default.sqlite .
🛑 Secret keys may be written to the database. See the `database_redact_keys` option of `TruSession` to prevent this.


Updating app_name and app_version in apps table: 0it [00:00, ?it/s]
Updating app_id in records table: 0it [00:00, ?it/s]
Updating app_json in apps table: 0it [00:00, ?it/s]


For the classroom, we've written some of the code in helper functions inside a utils.py file.  
- You can view the utils.py file in the file directory by clicking on the "Jupyter" logo at the top of the notebook.
- In later lessons, you'll get to work directly with the code that's currently wrapped inside these helper functions, to give you more options to customize your RAG pipeline.

In [13]:
from utils import get_prebuilt_trulens_recorder

tru_recorder = get_prebuilt_trulens_recorder(query_engine,
                                             app_id="Direct Query Engine")

instrumenting <class 'llama_index.embeddings.huggingface.base.HuggingFaceEmbedding'> for base <class 'llama_index.embeddings.huggingface.base.HuggingFaceEmbedding'>
instrumenting <class 'llama_index.embeddings.huggingface.base.HuggingFaceEmbedding'> for base <class 'llama_index.core.embeddings.multi_modal_base.MultiModalEmbedding'>
instrumenting <class 'llama_index.embeddings.huggingface.base.HuggingFaceEmbedding'> for base <class 'llama_index.core.base.embeddings.base.BaseEmbedding'>
instrumenting <class 'llama_index.embeddings.huggingface.base.HuggingFaceEmbedding'> for base <class 'llama_index.core.schema.TransformComponent'>
instrumenting <class 'llama_index.embeddings.huggingface.base.HuggingFaceEmbedding'> for base <class 'llama_index.core.schema.BaseComponent'>
instrumenting <class 'llama_index.embeddings.huggingface.base.HuggingFaceEmbedding'> for base <class 'pydantic.main.BaseModel'>
instrumenting <class 'llama_index.embeddings.huggingface.base.HuggingFaceEmbedding'> for base

In [14]:
with tru_recorder as recording:
    for question in eval_questions:
        response = query_engine.query(question)
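
The `with tru_recorder as recording:` pattern logs every query made inside the block. The sketch below shows the general idea with a toy context manager. It is a hypothetical stand-in, not the TruLens implementation: `ToyRecorder` and `EchoEngine` are invented names, and the real recorder also stores costs, token counts, and feedback results.

```python
import time

class ToyRecorder:
    """Hypothetical stand-in for an evaluation recorder (not the TruLens API)."""

    def __init__(self, engine, app_id):
        self.engine = engine
        self.app_id = app_id
        self.records = []
        self._original_query = None

    def __enter__(self):
        # Swap in a wrapper so every query made inside the block is logged.
        self._original_query = self.engine.query

        def logged_query(question):
            start = time.perf_counter()
            answer = self._original_query(question)
            self.records.append({
                "app_id": self.app_id,
                "input": question,
                "output": answer,
                "latency": time.perf_counter() - start,
            })
            return answer

        self.engine.query = logged_query
        return self

    def __exit__(self, exc_type, exc, tb):
        # Restore the unwrapped method when the block ends.
        self.engine.query = self._original_query
        return False

class EchoEngine:
    """Hypothetical engine that just echoes the question."""

    def query(self, question):
        return f"echo: {question}"

engine = EchoEngine()
with ToyRecorder(engine, app_id="Toy Engine") as recording:
    for q in ["q1", "q2"]:
        engine.query(q)

print(len(recording.records))
```

Queries made outside the `with` block are not recorded, which is why the evaluation loop sits inside it.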

In [15]:
records, feedback = tru.get_records_and_feedback(app_ids=[])

In [16]:
records.head()

|   | app_name | app_version | app_id | app_json | type | record_id | input | output | tags | record_json | ... | Context Relevance | Context Relevance_calls | Context Relevance feedback cost in USD | Groundedness | Groundedness_calls | Groundedness feedback cost in USD | latency | total_tokens | total_cost | cost_currency |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Direct Query Engine | base | app_hash_6e8221fde876d15698298cea8c0d1bd6 | {'tru_class_info': {'name': 'TruLlama', 'modul... | RetrieverQueryEngine(llama_index.core.query_en... | record_hash_50d04792bf32f7be0d9b48701c034bc0 | What is the right AI job for me? | The right AI job for you would be one that ali... | - | {'record_id': 'record_hash_50d04792bf32f7be0d9... | ... |  |  |  |  |  |  | 0.793852 | 2182 | 0.003307 | USD |
| 1 | Direct Query Engine | base | app_hash_6e8221fde876d15698298cea8c0d1bd6 | {'tru_class_info': {'name': 'TruLlama', 'modul... | RetrieverQueryEngine(llama_index.core.query_en... | record_hash_b17e6fb2869a9309eaed4a7ff4ff7055 | Is it normal to find parts of AI challenging? | It is normal to find parts of AI challenging. | - | {'record_id': 'record_hash_b17e6fb2869a9309eae... | ... |  |  |  |  |  |  | 0.511552 | 2129 | 0.003199 | USD |
| 2 | Direct Query Engine | base | app_hash_6e8221fde876d15698298cea8c0d1bd6 | {'tru_class_info': {'name': 'TruLlama', 'modul... | RetrieverQueryEngine(llama_index.core.query_en... | record_hash_ce1a7b27a23a487bbdc789b9622fe115 | What are some common challenges in AI? | Common challenges in AI include understanding ... | - | {'record_id': 'record_hash_ce1a7b27a23a487bbdc... | ... |  |  |  |  |  |  | 0.769732 | 2123 | 0.003198 | USD |
| 3 | Direct Query Engine | base | app_hash_6e8221fde876d15698298cea8c0d1bd6 | {'tru_class_info': {'name': 'TruLlama', 'modul... | RetrieverQueryEngine(llama_index.core.query_en... | record_hash_6e595aecd248617542d11e29f2fcfcd4 | What is the first step to becoming good at AI? | Learning foundational technical skills. | - | {'record_id': 'record_hash_6e595aecd248617542d... | ... |  |  |  |  |  |  | 0.649987 | 1727 | 0.002593 | USD |
| 4 | Direct Query Engine | base | app_hash_6e8221fde876d15698298cea8c0d1bd6 | {'tru_class_info': {'name': 'TruLlama', 'modul... | RetrieverQueryEngine(llama_index.core.query_en... | record_hash_bd0ebe790090d258b67a9427d854ff3a | Who are some accomplished individuals who have... | Former Facebook COO Sheryl Sandberg, U.S. firs... | - | {'record_id': 'record_hash_bd0ebe790090d258b67... | ... |  |  |  |  |  |  | 0.714716 | 2143 | 0.003237 | USD |


In [17]:
# launches the dashboard; the port may differ from 8501, so check the printed URL
tru.run_dashboard()

Starting dashboard ...



huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
	- Avoid using `tokenizers` before the fork if possible
	- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)



Dashboard started at http://localhost:57915 .


<Popen: returncode: None args: ['streamlit', 'run', '--server.headless=True'...>

## Advanced RAG pipeline

### 1. Sentence Window retrieval

In [18]:
#from llama_index.llms import OpenAI

llm = OpenAI(model="gpt-3.5-turbo", temperature=0.1)

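#### Local replacements for `build_sentence_window_index()` and `build_automerging_index()`

The next cell reimplements these two `utils.py` helpers without `ServiceContext`.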

In [21]:
# Imports for the index-building helpers below
import os

from llama_index.core import VectorStoreIndex, StorageContext, load_index_from_storage
from llama_index.core.node_parser import (
    HierarchicalNodeParser,
    SentenceWindowNodeParser,
    get_leaf_nodes,
)
from llama_index.core.postprocessor import MetadataReplacementPostProcessor, SentenceTransformerRerank
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
from llama_index.embeddings.openai import OpenAIEmbedding

def _ensure_embed_model(embed_model):
    # Accept either a ready embedding object or a model-name string (optionally prefixed with "local:")
    if hasattr(embed_model, "get_text_embedding"):
        return embed_model
    if isinstance(embed_model, str):
        if embed_model.startswith("local:"):
            return HuggingFaceEmbedding(model_name=embed_model.split("local:", 1)[1])
        else:
            # treat as HF model name
            return HuggingFaceEmbedding(model_name=embed_model)
    raise ValueError("embed_model must be an embedding object or a model name string.")

def build_sentence_window_index(
    document,
    llm,
    embed_model="local:BAAI/bge-small-en-v1.5",
    save_dir="sentence_index",
):
    # 1) node parsing
    node_parser = SentenceWindowNodeParser.from_defaults(
        window_size=3,
        window_metadata_key="window",
        original_text_metadata_key="original_text",
    )
    nodes = node_parser.get_nodes_from_documents([document])

    # 2) ensure embedding object
    embed = _ensure_embed_model(embed_model)

    # 3) build or load index (pass llm/embed directly; no ServiceContext)
    if not os.path.exists(save_dir):
        index = VectorStoreIndex(nodes, llm=llm, embed_model=embed)
        index.storage_context.persist(persist_dir=save_dir)
    else:
        storage = StorageContext.from_defaults(persist_dir=save_dir)
        index = load_index_from_storage(storage, llm=llm, embed_model=embed)

    return index


def build_automerging_index(
    documents,
    llm,
    embed_model="local:BAAI/bge-small-en-v1.5",
    save_dir="merging_index",
    chunk_sizes=None,
):
    chunk_sizes = chunk_sizes or [2048, 512, 128]

    # 1) parse hierarchy → leaf nodes
    node_parser = HierarchicalNodeParser.from_defaults(chunk_sizes=chunk_sizes)
    nodes = node_parser.get_nodes_from_documents(documents)
    leaf_nodes = get_leaf_nodes(nodes)

    # 2) ensure embedding object
    embed = _ensure_embed_model(embed_model)

    # 3) build or load index (no ServiceContext)
    if not os.path.exists(save_dir):
        # The docstore must hold every node (parents included), not just the
        # leaves, so that parent nodes can be looked up at query time.
        storage_context = StorageContext.from_defaults()
        storage_context.docstore.add_documents(nodes)

        index = VectorStoreIndex(
            leaf_nodes,
            storage_context=storage_context,
            llm=llm,
            embed_model=embed,
        )
        index.storage_context.persist(persist_dir=save_dir)
    else:
        # Reuse the index persisted by an earlier run
        storage = StorageContext.from_defaults(persist_dir=save_dir)
        index = load_index_from_storage(storage, llm=llm, embed_model=embed)

    return index


In [23]:
#from utils import build_sentence_window_index

sentence_index = build_sentence_window_index(
    document,
    llm,
    embed_model="local:BAAI/bge-small-en-v1.5",
    save_dir="sentence_index"
)

Loading llama_index.core.storage.kvstore.simple_kvstore from sentence_index/docstore.json.
Loading llama_index.core.storage.kvstore.simple_kvstore from sentence_index/index_store.json.


In [24]:
from utils import get_sentence_window_query_engine

sentence_window_engine = get_sentence_window_query_engine(sentence_index)


In [25]:
window_response = sentence_window_engine.query(
    "how do I get started on a personal project in AI?"
)
print(str(window_response))

To get started on a personal project in AI, you can begin by identifying a project that aligns with your career goals and interests. It's important to choose a project that is responsible, ethical, and beneficial to people. Once you have selected a project, you can follow the steps outlined in the chapters provided, such as scoping the project, executing it with an eye toward career development, and building a portfolio that demonstrates skill progression. By following these guidelines, you can embark on a personal AI project that not only enhances your skills but also makes a positive impact in the field.
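
The idea behind sentence window retrieval can be sketched in a few lines of plain Python: each sentence is embedded and matched on its own, but the LLM receives the sentence together with its neighbours. This is a simplified illustration, not the `SentenceWindowNodeParser` implementation. The regex sentence splitter and the function names are assumptions; the metadata keys `window` and `original_text` mirror the ones passed in `build_sentence_window_index` above.

```python
import re

def build_sentence_nodes(text, window_size=3):
    """Split text into sentences and attach a window of neighbours to each one."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    nodes = []
    for i, sentence in enumerate(sentences):
        lo = max(0, i - window_size)
        hi = min(len(sentences), i + window_size + 1)
        nodes.append({
            "text": sentence,  # the part that is embedded and matched
            "metadata": {
                "original_text": sentence,
                "window": " ".join(sentences[lo:hi]),  # the part the LLM sees
            },
        })
    return nodes

def replace_with_window(node):
    """Post-processing step: swap the single sentence for its surrounding window."""
    return node["metadata"]["window"]

text = "A one. B two. C three. D four. E five."
nodes = build_sentence_nodes(text, window_size=1)
print(nodes[2]["text"])
print(replace_with_window(nodes[2]))
```

Matching on single sentences keeps retrieval precise, while the window gives the LLM enough surrounding context to answer.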


In [26]:
tru.reset_database()

tru_recorder_sentence_window = get_prebuilt_trulens_recorder(
    sentence_window_engine,
    app_id = "Sentence Window Query Engine"
)

Updating app_name and app_version in apps table: 0it [00:00, ?it/s]
Updating app_id in records table: 0it [00:00, ?it/s]
Updating app_json in apps table: 0it [00:00, ?it/s]



In [27]:
for question in eval_questions:
    with tru_recorder_sentence_window as recording:
        response = sentence_window_engine.query(question)
        print(question)
        print(str(response))

What are the keys to building a career in AI?
Keys to building a career in AI include developing a portfolio of projects that demonstrate skill progression, using informational interviews to find the right job, and overcoming imposter syndrome.
How can teamwork contribute to success in AI?
Teamwork can contribute to success in AI by allowing individuals to leverage the diverse skills and perspectives of their colleagues. This collaboration can lead to more innovative solutions, better problem-solving, and overall project improvement. Additionally, working in a team can create a supportive environment that fosters continuous learning and growth, ultimately enhancing the quality of the AI projects being developed.
What is the importance of networking in AI?
Networking in AI is crucial as it can provide valuable insights, guidance, and opportunities for individuals looking to advance in the field. By connecting with professionals who have experience in AI, individuals can gain knowledge a



How can altruism be beneficial in building a career?
Helping others along the way can actually benefit one's career development.
What is imposter syndrome and how does it relate to AI?
Imposter syndrome is a phenomenon where individuals doubt their accomplishments and have a persistent fear of being exposed as a fraud. In the context of AI, newcomers to the field may experience imposter syndrome, feeling like they do not truly belong in the AI community despite their success. This can be a common experience for many people in various fields, including AI, and it is important to address and overcome these feelings to continue growing and contributing effectively in the field.
Who are some accomplished individuals who have experienced imposter syndrome?
Former Facebook COO Sheryl Sandberg, U.S. first lady Michelle Obama, actor Tom Hanks, and Atlassian co-CEO Mike Cannon-Brookes.
What is the first step to becoming good at AI?
The first step to becoming good at AI is learning foundational 

In [28]:
tru.get_leaderboard(app_ids=[])

| app_name | app_version | Answer Relevance | Context Relevance | Groundedness | latency | total_cost |
|---|---|---|---|---|---|---|
| Sentence Window Query Engine | base | 0.916667 | 0.388889 | 0.5625 | 1.29267 | 0.000785 |
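
Each leaderboard row summarizes the per-record results for one app. Assuming each score is a plain per-app average (the notebook does not show how TruLens aggregates), the sketch below reproduces that step on hypothetical records. Missing feedback values, like the empty feedback columns in the `records.head()` output earlier, are skipped rather than counted as zero.

```python
from collections import defaultdict
from statistics import mean

def leaderboard(records, metrics):
    """Average each metric per app, skipping records where it is missing (None)."""
    by_app = defaultdict(list)
    for record in records:
        by_app[record["app_id"]].append(record)

    board = {}
    for app_id, rows in by_app.items():
        board[app_id] = {}
        for metric in metrics:
            values = [r[metric] for r in rows if r.get(metric) is not None]
            board[app_id][metric] = mean(values) if values else None
    return board

# Hypothetical per-record scores (not taken from the run above)
toy_records = [
    {"app_id": "Engine A", "Groundedness": 0.5, "Context Relevance": 0.25},
    {"app_id": "Engine A", "Groundedness": 1.0, "Context Relevance": None},
    {"app_id": "Engine B", "Groundedness": 0.75, "Context Relevance": 0.5},
]
board = leaderboard(toy_records, ["Groundedness", "Context Relevance"])
print(board)
```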


In [30]:
# reuses the running dashboard; the port may differ from 8501, so check the printed URL
tru.run_dashboard()

Starting dashboard ...
Dashboard already running at path:   Local URL: http://localhost:57915



<Popen: returncode: None args: ['streamlit', 'run', '--server.headless=True'...>

### 2. Auto-merging retrieval

In [31]:
#from utils import build_automerging_index

automerging_index = build_automerging_index(
    documents,
    llm,
    embed_model="local:BAAI/bge-small-en-v1.5",
    save_dir="merging_index"
)

Loading llama_index.core.storage.kvstore.simple_kvstore from merging_index/docstore.json.
Loading llama_index.core.storage.kvstore.simple_kvstore from merging_index/index_store.json.


In [32]:
from utils import get_automerging_query_engine

automerging_query_engine = get_automerging_query_engine(
    automerging_index,
)

In [33]:
auto_merging_response = automerging_query_engine.query(
    "How do I build a portfolio of AI projects?"
)
print(str(auto_merging_response))

> Merging 1 nodes into parent node.
> Parent node id: 31c89103-15e0-4fd8-a7b5-8763062be6af.
> Parent node text: PAGE 21Building a Portfolio of 
Projects that Shows 
Skill Progression CHAPTER 6
PROJECTS

> Merging 1 nodes into parent node.
> Parent node id: f78aa493-30e2-4d7e-8045-6ee68573344f.
> Parent node text: PAGE 21Building a Portfolio of 
Projects that Shows 
Skill Progression CHAPTER 6
PROJECTS

Building a portfolio of AI projects involves showcasing a progression from simple to complex undertakings over time. It is important to be able to communicate your thinking effectively to demonstrate the value of your work and gain trust from others. Identifying ideas that are worth working on is a crucial skill for an AI architect, and working on projects across various industries can help in gaining experience and diversifying your portfolio.
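
The "Merging ... nodes into parent node" messages come from the auto-merging step: when enough leaf chunks under the same parent are retrieved, they are swapped for the larger parent chunk. The sketch below shows a single-level version of that rule in plain Python. It is a simplification, not the LlamaIndex retriever: the function name, the toy hierarchy, and the `threshold=0.5` default are all assumptions made for illustration.

```python
def auto_merge(retrieved_leaf_ids, parent_of, children_of, threshold=0.5):
    """Replace sibling leaves with their parent when enough of them were retrieved."""
    # Group retrieved leaves by parent, preserving retrieval order.
    hits = {}
    for leaf in retrieved_leaf_ids:
        hits.setdefault(parent_of[leaf], []).append(leaf)

    merged = []
    for parent, leaves in hits.items():
        ratio = len(leaves) / len(children_of[parent])
        if ratio > threshold:
            merged.append(parent)   # enough siblings matched: return the parent chunk
        else:
            merged.extend(leaves)   # otherwise keep the individual leaves
    return merged

# Hypothetical two-level hierarchy: each parent chunk has four leaf chunks.
children_of = {
    "P1": ["a", "b", "c", "d"],
    "P2": ["e", "f", "g", "h"],
}
parent_of = {leaf: parent for parent, leaves in children_of.items() for leaf in leaves}

result = auto_merge(["a", "b", "c", "e"], parent_of, children_of)
print(result)
```

Here three of the four leaves under `P1` were retrieved, so they collapse into `P1`, while the lone hit under `P2` stays a leaf. With `chunk_sizes=[2048, 512, 128]`, the same rule can apply again one level up.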


In [34]:
tru.reset_database()

tru_recorder_automerging = get_prebuilt_trulens_recorder(automerging_query_engine,
                                                         app_id="Automerging Query Engine")

Updating app_name and app_version in apps table: 0it [00:00, ?it/s]
Updating app_id in records table: 0it [00:00, ?it/s]
Updating app_json in apps table: 0it [00:00, ?it/s]



In [35]:
for question in eval_questions:
    with tru_recorder_automerging as recording:
        response = automerging_query_engine.query(question)
        print(question)
        print(response)

> Merging 2 nodes into parent node.
> Parent node id: b46bba4d-5db4-4ae7-b67e-bcedd7e2e1e2.
> Parent node text: PAGE 3Table of 
ContentsIntroduction: Coding AI is the New Literacy.
Chapter 1: Three Steps to Ca...

> Merging 1 nodes into parent node.
> Parent node id: 6f8b3671-b30a-4e55-8524-bc025db7f587.
> Parent node text: PAGE 3Table of 
ContentsIntroduction: Coding AI is the New Literacy.
Chapter 1: Three Steps to Ca...

What are the keys to building a career in AI?
The keys to building a career in AI include learning foundational technical skills, working on projects, finding a job, being part of a community, prioritizing topic selection when learning, and collaborating effectively with others in team settings.
How can teamwork contribute to success in AI?
Teamwork can contribute to success in AI by enhancing the ability to collaborate effectively with others, influence team members, and be influenced by them. This collaboration allows for a more comprehensive approach to tackling 

In [36]:
tru.get_leaderboard(app_ids=[])

| app_name | app_version | Answer Relevance | Context Relevance | Groundedness | latency | total_cost |
|---|---|---|---|---|---|---|
| Automerging Query Engine | base | 0.888889 | 0.555556 | 0.645833 | 1.213081 | 0.000862 |
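
Because `tru.reset_database()` runs before each recorder is created, each leaderboard in this notebook contains only one app. To compare the two advanced engines side by side, you can collect the reported rows by hand, as in the sketch below. The numbers are copied from the two leaderboards in this notebook; the helper `best_per_metric` is a hypothetical name.

```python
# Scores copied from the two leaderboards reported in this notebook.
results = {
    "Sentence Window Query Engine": {
        "Answer Relevance": 0.916667, "Context Relevance": 0.388889,
        "Groundedness": 0.5625, "latency": 1.29267, "total_cost": 0.000785,
    },
    "Automerging Query Engine": {
        "Answer Relevance": 0.888889, "Context Relevance": 0.555556,
        "Groundedness": 0.645833, "latency": 1.213081, "total_cost": 0.000862,
    },
}

# Quality metrics: higher is better. Latency and cost: lower is better.
LOWER_IS_BETTER = {"latency", "total_cost"}

def best_per_metric(results):
    """Return the name of the best app for each metric."""
    metrics = next(iter(results.values())).keys()
    best = {}
    for metric in metrics:
        pick = min if metric in LOWER_IS_BETTER else max
        best[metric] = pick(results, key=lambda app: results[app][metric])
    return best

best = best_per_metric(results)
for metric, app in best.items():
    print(f"{metric}: {app}")
```

With only eleven questions per run, treat the differences as indicative rather than conclusive.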


In [37]:
# reuses the running dashboard; the port may differ from 8501, so check the printed URL
tru.run_dashboard()

Starting dashboard ...
Dashboard already running at path:   Local URL: http://localhost:57915



<Popen: returncode: None args: ['streamlit', 'run', '--server.headless=True'...>