# RAG Rerank Retrieval Experiment

In this experiment, we compare pure document retrieval without reranking, as typically found in naive RAG systems, with retrieval that adds a reranking step, and look at the improvement that reranking brings.

Reranking is a technique that can substantially improve the accuracy of a retrieval pipeline. To learn more about how reranking works and the challenges it addresses, see [this Notion page](https://www.notion.so/fuzzylabs/Reranking-04e3f64f27724e51875abd7eb7d97a3c#6f1682a7b243482a8d226f675816851c).

In this example notebook, we demonstrate how to build a retrieval pipeline with reranking using the [BAAI/bge-reranker-large](https://huggingface.co/BAAI/bge-reranker-large) reranker, which, according to LlamaIndex, performs comparably to Cohere's paid model.

The following components are used:
- VectorDB - [Pinecone](https://www.pinecone.io/)
- Embedding Model - [jinaai/jina-embeddings-v2-base-en](https://huggingface.co/jinaai/jina-embeddings-v2-base-en)
- Reranker - [BAAI/bge-reranker-large](https://huggingface.co/BAAI/bge-reranker-large)


# Parameters
The following parameters are used in the experiment.
   
`QUERY = "Who manufactured STM32F429ZIT6U microcontroller?"`

We have chosen the above query to check for specific content within the context and to evaluate improvements after applying reranking to the retrieval results.

Specifically, we are looking for the following content in the retrieved documents:

`In this chapter, many programs were developed using the NUCLEO board, provided with the STM32F429ZIT6U microcontroller. This microcontroller is manufactured by STMicroelectronics [13] using a Cortex-M4 processor designed by Arm Ltd. [14].`

In [1]:
######## Parameters #########

# Data
FILE_PATH = "data/a_beginners_guide_to_designing_embedded_system_applications_on_arm_cortex-m_microcontrollers.pdf"
CHUNK_SIZE = 1000
CHUNK_OVERLAP = 50
COLLECTION_NAME = 'experiment_search'
BATCH_SIZE = 32

# Embedding model
EMBEDDING_MODEL = "jinaai/jina-embeddings-v2-base-en"

# Reranker
RERANKER_MODEL = 'BAAI/bge-reranker-large'

# Query
TOP_K = 3
QUERY = "Who manufactured STM32F429ZIT6U microcontroller?"

We use a GPU, if one is available, to speed up computation.


In [2]:
import torch

DEVICE = 'cuda:0' if torch.cuda.is_available() else 'cpu'
print(f"Using device={DEVICE}")

Using device=cpu


# Data Preprocessing
> Make sure the PDF is present in the [data](data) folder.

In this step, we preprocess the example PDF defined by the `FILE_PATH` parameter. The preprocessed data will later be added to the vector store.

- We use [PyPDFLoader](https://python.langchain.com/v0.1/docs/modules/data_connection/document_loaders/pdf/#using-pypdf) from LangChain to read the PDF.
- We use [RecursiveCharacterTextSplitter](https://python.langchain.com/v0.1/docs/modules/data_connection/document_transformers/recursive_text_splitter/) from LangChain to split the text into small chunks. The parameters `CHUNK_SIZE` and `CHUNK_OVERLAP` configure the chunk size and the overlap between consecutive chunks.
> This step takes about 3 minutes for a 600-page PDF document.
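
The effect of `CHUNK_SIZE` and `CHUNK_OVERLAP` can be illustrated with a deliberately simplified sketch. It is not how `RecursiveCharacterTextSplitter` works internally (that splitter prefers to break on separators such as paragraphs and newlines); it only shows fixed-size character windows that share an overlap. The function name `chunk_text` and the sample string are assumptions made for illustration.

```python
def chunk_text(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Split text into fixed-size character windows that overlap by chunk_overlap.

    Simplified illustration only: it ignores separators entirely.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # stop once a window has reached the end of the text
        if start + chunk_size >= len(text):
            break
    return chunks


sample = "abcdefghijklmnopqrstuvwxyz"
chunks = chunk_text(sample, chunk_size=10, chunk_overlap=3)
print(chunks)
```

Each chunk starts with the last `chunk_overlap` characters of the previous one, so a sentence that straddles a boundary still appears partly in both chunks.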

In [3]:
%%time

from data_pipeline import prepare_data
from langchain_text_splitters import RecursiveCharacterTextSplitter

text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=CHUNK_SIZE,
    chunk_overlap=CHUNK_OVERLAP,
    length_function=len,
    is_separator_regex=False,
)
chunked_texts = prepare_data(FILE_PATH, text_splitter)

CPU times: user 2min 29s, sys: 2.54 s, total: 2min 31s
Wall time: 2min 36s


# Embedding The Data

In this step, we will use Jina AI's embedding model to embed all of our texts and create a pandas DataFrame to associate the embeddings with their corresponding texts.

> This step is resource-intensive and runs much faster on an Nvidia GPU.
>
> It takes around 10-15 minutes when running on a CPU.
>
> It calculates the embeddings for the entire dataset.

In [4]:
%%time

from embedding_model import embed_locally

text_embeddings = embed_locally(chunked_texts)



CPU times: user 48min 48s, sys: 3min 10s, total: 51min 59s
Wall time: 13min 47s


In [5]:
import pandas as pd

data_to_store = pd.DataFrame({'embeddings': text_embeddings})
data_to_store['metadata'] = [{'texts': text} for text in chunked_texts]
data_to_store.index.name = 'id'
data_to_store.reset_index(inplace=True)
data_to_store['id'] = data_to_store['id'].astype(str)

> We include a `metadata` column that holds each chunk's text in a dictionary, because a Pinecone record consists of an ID, a vector, and metadata, so the text has to be stored within the metadata.
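
As a rough sketch of the record shape described above, the same association between IDs, vectors, and texts can be written without pandas. The helper name `build_records` and the toy two-dimensional vectors are assumptions for illustration; the `id` / `values` / `metadata` keys mirror the three parts of a record, and the `texts` metadata key matches the one used in this notebook.

```python
def build_records(texts: list[str], embeddings: list[list[float]]) -> list[dict]:
    """Pair each chunk with its embedding and keep the text inside the metadata."""
    if len(texts) != len(embeddings):
        raise ValueError("texts and embeddings must have the same length")
    return [
        {"id": str(i), "values": list(vector), "metadata": {"texts": text}}
        for i, (text, vector) in enumerate(zip(texts, embeddings))
    ]


# Toy data standing in for the real chunks and 768-dimensional embeddings
records = build_records(
    ["first chunk", "second chunk"],
    [[0.1, 0.2], [0.3, 0.4]],
)
print(records[0])
```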

In [6]:
data_to_store

Unnamed: 0,id,embeddings,metadata
0,0,"[-0.022770140320062637, -0.049992650747299194,...",{'texts': 'A Beginner’s Guide to Designing Em...
1,1,"[-0.04170405492186546, -0.04468979313969612, 0...",{'texts': 'A Beginner’s Guide to Designing E...
2,2,"[-0.036007110029459, -0.06197873130440712, 0.0...",{'texts': 'A Beginner’s Guide to Designing E...
3,3,"[-0.028807535767555237, -0.03928277641534805, ...",{'texts': 'Arm Education Media is an imprint o...
4,4,"[-0.012164550833404064, -0.04263245314359665, ...",{'texts': 'book. Notices Knowledge and best pr...
...,...,...,...
1630,1630,"[-0.04276715964078903, -0.048491425812244415, ...",{'texts': 'ISBN 978-1911531-16-6 Operating Sy...
1631,1631,"[-0.03923221305012703, -0.051492903381586075, ...",{'texts': 'A Beginner’s Guide to Designing Em...
1632,1632,"[-0.04426056891679764, -0.06682316213846207, 0...",{'texts': 'align with a typical twelve-week se...
1633,1633,"[-0.052978359162807465, -0.04494466632604599, ...",{'texts': 'Arm Education Media is a publishing...


# Create Vector Database with Pinecone


Now we create the vector database that will store our vectors. For this we need a [free Pinecone API key](https://app.pinecone.io/). The key can be found under "API Keys" in the left navbar of the Pinecone dashboard.

In [7]:
from pinecone import Pinecone, ServerlessSpec
import time
import getpass

# initialise connection to pinecone (get API key at app.pinecone.io)
api_key = getpass.getpass()

# configure client
pc = Pinecone(api_key=api_key)

> An "Index" is a database in Pinecone.

In [8]:
index_name = "rerankers"
existing_indexes = [
    index_info["name"] for index_info in pc.list_indexes()
]

# check if index already exists (it shouldn't if this is first time)
if index_name not in existing_indexes:
    # if does not exist, create index
    pc.create_index(
        index_name,
        dimension=768,  # embedding dimensionality of jina-embeddings-v2-base
        metric='cosine',
        spec=ServerlessSpec(
            cloud="aws", region="us-east-1"
        )
    )
    # wait for index to be initialized
    while not pc.describe_index(index_name).status['ready']:
        time.sleep(1)

# connect to index
index = pc.Index(index_name)
# view index stats
index.describe_index_stats()

{'dimension': 768,
 'index_fullness': 0.0,
 'namespaces': {},
 'total_vector_count': 0}

# Data Ingestion

In this step, we store the embeddings and chunked texts in the Pinecone database we just created.

In [9]:
from tqdm.auto import tqdm

batch_size = 100  # how many embeddings we insert at once

for i in tqdm(range(0, len(data_to_store), batch_size)):
    # find end of batch
    i_end = min(len(data_to_store), i + batch_size)
    # create batch
    batch = data_to_store[i:i_end]
    to_upsert = list(zip(batch["id"], batch["embeddings"], batch["metadata"]))

    # upsert to Pinecone
    index.upsert(vectors=to_upsert)

100%|██████████| 17/17 [00:10<00:00,  1.62it/s]


# Document Retrieval

Now that we have our chunked texts and their embeddings stored in the database, we can retrieve them from Pinecone by calling the query function. The steps are as follows:

1. Embed our query.
2. Call the query function on the Pinecone index object.
3. Return the text from the result.

In [10]:
def get_closest_texts(query: str, top_k: int) -> list[tuple[str, float]]:
    # encode query
    encoded_query = embed_locally([query])
    # search Pinecone index
    res = index.query(vector=encoded_query[0], top_k=top_k, include_metadata=True)
    # get doc text
    closest_texts = [(match["metadata"]['texts'], match["score"]) for match in res["matches"]]
    return closest_texts

# Retrieval With No Reranking (Naive RAG Retrieval)

This is the most common approach for retrieving documents that are semantically similar to the query. The embeddings used here are also known as dense vectors: a dense vector captures the semantic meaning of a text as produced by the embedding model.

The query vector is matched against the entries in the database to find the closest neighbors of the search query under a similarity metric, in this case cosine similarity (the `metric='cosine'` setting of the index).

Typically, we look for the 3 closest documents in the database. These 3 documents are then passed to an LLM as context for generating a response.
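
To make the matching step concrete, here is a minimal brute-force sketch of cosine-similarity search over a handful of toy vectors. It is only an illustration: vector databases typically rely on approximate nearest-neighbour indexes rather than an exhaustive scan, and the helper names `cosine_similarity` and `top_k_by_cosine` are assumptions, not part of any library used in this notebook.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_a * norm_b)


def top_k_by_cosine(
    query_vec: list[float], doc_vecs: list[list[float]], k: int
) -> list[tuple[int, float]]:
    """Return (index, similarity) pairs for the k most similar vectors, best first."""
    scored = [(i, cosine_similarity(query_vec, vec)) for i, vec in enumerate(doc_vecs)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy 2-dimensional "embeddings"
docs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [-1.0, 0.0]]
query = [1.0, 0.1]
results = top_k_by_cosine(query, docs, k=2)
print(results)
```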

In [11]:
three_closest_texts = get_closest_texts(query=QUERY, top_k=3)

In [12]:
pd.set_option('display.max_colwidth', None)
three_closest_texts_df = pd.DataFrame(three_closest_texts, columns=["Retrieved Document", "Score"])
three_closest_texts_df

Unnamed: 0,Retrieved Document,Score
0,"Index\n567 STM 32F429 ZIT 6U microcontroller 5, 33, 37, 38 \n stop bit 46, 64, 65, 80, 256 , 257 , 273 , 468 \n ST Zio connectors xiv, 6, 9, 37, 38, 44, 346 \n superloop 13, 519 \n synchronous communication 255\nT TCP server 456 , 457 , 459 , 460 –463 , 491 \n temperature sensor 4, 5, 87, 106 , 123 , 182 , 220 , 233 , 334 , 486 , 550 \n time management xvi, 86, 95, 99, 289 , 342 , 348 , 353 \n timers xvi, xxvii, 37, 306 , 342 , 343 , 345 , 348 –350 , 353 , 354 , \n 361 , 441 , 512 –517 , 533 , 535 , 537 , 538 , 539 \n tm structure 153 \n TO- 220 package 89\nU UART xvi, xvii, 5, 44, 61, 63, 79, 82, 84, 86, 181 , 191 , 222 , 255 , 290 , 291 , \n 306 , 350 , 420 , 423 , 447 , 455 , 461 , 465 , 468 , 544 \n USB xiv, xxvii, 4, 9, 37, 44, 45, 63, 79, 82, 88, 132 , 449 , 506 , 507 , 508 , 544 \n USB connection xiv, 44, 506 \n use cases 497 , 501 –504 , 511 , 545 , 546 , 547 , 548\nV validation 496 , 497 , 546 , 547 \n verification 81, 92, 496 , 497 , 546 \n vineyard frost prevention 123 , 124",0.824475
1,"Chapter 1 | Introduction to Embedded Systems \n37The STM32F429ZIT6U microcontroller includes a Cortex-M4 processor , as shown in Figure 1.22. It \ncan be appreciated that, beyond the processor, the microcontroller includes other peripherals such as \ncommunication cores (ethernet, USB, UART, etc.), memory, timers, and GPIO (General Purpose Input Output) ports. \nNUCLEO\n-F429ZI32F429ZIT6U\nARM7B776 VQ\nPHL 7B 7213e412000K620 Y12000\nK620 Y\n120 00K620 YDGKYD\nKMS-1 102NL1706C\nSTM32F103CBT6\ne393701GH218CHN\nST890C\nGK717\n11\n22\n33\n44\n55\n66\n77\n88\n9\n10\n11\n12\n13\n14\n15PH_0\nPD_0\nPD_1\nPG_0PH_1PF_2PA_7PF_10PF_5PF_3PC_3PC_030\n29\n28\n27\n2616\n2515\n2414\n2313\n2212\n2111\n2010\n199\n18\n17\n165V\nVIN3.3VIOREF\nGND\nGNDGND\nNCNC\nUART2_RX\nCAN1_TDCAN1_ DRADC1/7ADC1/3\nADC1/10\nADC1/13\nADC3/9\nADC3/15\nADC3/8\nADC3/5NRSTPC_8\nPC_9\nPC_10\nPC_11\nPC_12\nPD_2\nPG_2\nPG_3\nGNDPD_7\nPD_6\nPD_5\nPD_4\nPD_3\nPE_2\nPE_4\nPE_3\nPF_7\nPG_1UART2_RX\nUART2_ XT\nUART_X7TUART2_RTS\nUART2_CTSUART3_TX\nUART3_RX\nUART5_TX\nUART5_RX\nPA_3\nSPI1_MOSISPI3_SCK\nSPI2_SCK",0.808997
2,"USAR Tn\nbxCANnsmcard\nirDA\nFIFODigital\nﬁlter\nDACnAPB1 45 MHz (max)AHB1 180 MHzAHB3\nVDD = 1.8 to 3.6V\nVSS\nVCAP1, VCAP2APB2 90 MHzART\nACCEL/CACHE\nGPIO POR TtOSC32_I N\nOSC32_OUTDMA2USB\nOTG HS\n8 Streams\nFIFODMA/\nFIFO PHY\nCHROM-AR T\nDMA2DFIFO\nFIFO8 Streams\nFIFODMA1\nLCD_R[7:0], LCD_G[7:0] ,\nLCD_B[7:0], LCD_HSYNC,\nLCD_VSYNC, LCD_DE,\nLCD_CLKLCD-TFTDMA/\nFIFO\nRTC_AF1\nRTC_AF1\nRTC 50Hz\n4KB BXPSRAMLS@ VBAT\nRTC\nAWU\nBackup registerXTAL32 KHz\nLSStandby\ninterfaceIWDGXTALOSC 4-26 MHzPOR\nreset\nInt\n@ VDDA @ VDDPVDPOR/PDR BORSupply\nsupervision@ VDDVDD\nVoltage regulator\n3.3 to 1.2VPower\nmanagement\nPLL1, 2, 3RC LSRC HS@ VDDA\nReset\n& clock\ncontrol\nFigure 1.22 STM32F429ZI block diagram made using information available from [9].",0.807932


### Result
The goal here is to verify whether the following context is present in any of the retrieved documents.

> In this chapter, many programs were developed using the NUCLEO board, provided with the STM32F429ZIT6U microcontroller. This microcontroller is manufactured by STMicroelectronics [13] using a Cortex-M4 processor designed by Arm Ltd. [14].

If it is present, our retrieval approach was able to find the correct document in the vector store.
> It is not present in any of the retrieved documents.

# Retrieval With Reranking

When we convert texts into vectors, we essentially compress the "meaning" of each text into an n-dimensional vector. This compression results in some loss of information.

Because of this information loss, a vector search may rank relevant documents below our `top_k` cutoff, so the top three results (for example) can miss them entirely, as demonstrated in the example above.

Reranking addresses this issue by retrieving a larger initial set of documents from the database. A reranker then reorders these documents, retaining only the most relevant ones for our LLM.

For a deeper understanding of how reranking works and the challenges it addresses, see [this Notion page](https://www.notion.so/fuzzylabs/Reranking-04e3f64f27724e51875abd7eb7d97a3c#6f1682a7b243482a8d226f675816851c).
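
The two-stage idea can be sketched independently of any particular model. In the snippet below, `toy_overlap_score` is a crude word-overlap scorer used purely as a stand-in for a real reranker such as the cross-encoder used in this notebook; both it and `retrieve_then_rerank` are hypothetical names, and the candidate texts are made up for illustration.

```python
from typing import Callable


def retrieve_then_rerank(
    query: str,
    candidates: list[str],
    score_fn: Callable[[str, str], float],
    top_k: int,
) -> list[tuple[str, float]]:
    """Score every first-stage candidate against the query and keep the best top_k."""
    scored = [(text, score_fn(query, text)) for text in candidates]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


def toy_overlap_score(query: str, text: str) -> float:
    """Fraction of query words that also occur in the text (stand-in scorer)."""
    query_words = set(query.lower().split())
    text_words = set(text.lower().split())
    return len(query_words & text_words) / len(query_words) if query_words else 0.0


# Candidates in the order a first-stage vector search might have returned them
candidates = [
    "Index of microcontroller terms and page numbers",
    "Pin mapping of the board headers",
    "This microcontroller is manufactured by STMicroelectronics",
]
top = retrieve_then_rerank(
    "who manufactured this microcontroller", candidates, toy_overlap_score, top_k=1
)
print(top)
```

The first stage only needs to be good enough to put the relevant text somewhere in the larger candidate set; the second stage decides what actually reaches the LLM.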

In [13]:
from FlagEmbedding import FlagReranker

# Setting use_fp16 to True speeds up computation with a slight performance degradation
reranker = FlagReranker(RERANKER_MODEL, use_fp16=True)


def get_top_3_texts_from_rerank_scores(
    scores: list[float], closest_texts: list[tuple[str, float]], top_k: int
) -> list[tuple[str, float]]:
    # indices of the top_k highest reranker scores, best first
    top_k_indices = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:top_k]
    return [closest_texts[i] for i in top_k_indices]


In [14]:
# Retrieve a larger candidate set, then build [query, text] pairs:
# the bge reranker takes a list of lists as input
twenty_five_closest_texts = get_closest_texts(query=QUERY, top_k=25)
texts_to_rerank = [[QUERY, doc[0]] for doc in twenty_five_closest_texts]

### Reranker Scores

Below are the reranker scores, which represent the relevance of each text retrieved from our vector database to our query. The higher the score, the more relevant the document is to our query.

In [15]:
scores = reranker.compute_score(texts_to_rerank)
print(scores)

[0.28374725580215454, -0.6582016348838806, -3.9359004497528076, -6.078181266784668, -5.264961242675781, 4.998556613922119, -5.124876022338867, -1.5439904928207397, -0.9592869877815247, -7.042023658752441, -5.507993221282959, -8.35011100769043, -7.134540557861328, -9.436647415161133, -7.423570156097412, 0.23653705418109894, -8.951774597167969, -5.671559810638428, -6.971182346343994, -7.249991416931152, -9.102521896362305, -2.066066265106201, -7.483180046081543, -7.791862964630127, -5.368914604187012]


### Result

The goal here is to verify whether the following context is present in any of the retrieved documents.

> In this chapter, many programs were developed using the NUCLEO board, provided with the STM32F429ZIT6U microcontroller. This microcontroller is manufactured by STMicroelectronics [13] using a Cortex-M4 processor designed by Arm Ltd. [14].

If it is present, our retrieval approach was able to find the correct document in the vector store.

> It is present in the first result returned after reranking, as shown below. Note that the chunk boundary cuts the passage off after "STMicroelectronics [13]".

In [16]:
get_top_3_texts_from_rerank_scores(scores, twenty_five_closest_texts, top_k=3)

[('Chapter 1 | Introduction to Embedded Systems \n33Proposed Exercise\n1. How can the code be changed in such a way that the system is blocked after three incorrect codes \nare entered?\nAnswer to the Exercise\n1. It can be achieved by means of the change in T able\xa01.11.\nT able\xa01.11 Proposed modification in the code in order to achieve the new behavior.\nLine in Code\xa01.5 New code to be used\n40 if ( numberOfIncorrectCodes < 5 ) 40 if ( numberOfIncorrectCodes < 3 )\n1.3 Under the Hood\n1.3.1 Brief Introduction to the Cortex-M Processor  Family and the NUCLEO Board\nIn this chapter, many programs were developed using the NUCLEO board, provided with the \nSTM32F429ZIT6U microcontroller . This microcontroller is manufactured by STMicroelectronics [13]',
  0.786926866),
 ('Index\n567 STM 32F429 ZIT 6U microcontroller 5, 33, 37, 38 \n stop bit 46, 64, 65, 80, 256 , 257 , 273 , 468  \n ST Zio connectors xiv, 6, 9, 37, 38, 44, 346  \n superloop 13, 519  \n synchronous communication 2

We can also visualise the relevance score for all 25 texts we have retrieved.

The reranker is optimised with a cross-entropy loss, so the relevance score is not bounded to a specific range, which is why negative values appear among the scores.
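
If bounded scores are easier to read or to threshold, one common option is to pass the raw scores through a sigmoid, which maps any real number into the interval (0, 1) without changing the ranking. The sketch below applies it to three of the raw scores printed above; the helper name `sigmoid` is our own, and whether a sigmoid is the right normalisation for a given use case is an assumption to validate.

```python
import math


def sigmoid(x: float) -> float:
    """Numerically stable logistic function mapping a real score into (0, 1)."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


# Highest, second-highest, and lowest raw reranker scores from the output above
raw_scores = [4.998556613922119, 0.28374725580215454, -9.436647415161133]
normalised = [sigmoid(score) for score in raw_scores]

for raw, norm in zip(raw_scores, normalised):
    print(f"{raw:+.3f} -> {norm:.4f}")
```

Because the sigmoid is strictly increasing, sorting by the normalised values gives the same order as sorting by the raw scores.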

In [17]:
scores_for_all_25_documents = pd.DataFrame({"Received Texts": twenty_five_closest_texts, "Score": scores})

In [18]:
scores_for_all_25_documents.sort_values(by=["Score"], ascending=False)

Unnamed: 0,Received Texts,Score
5,"(Chapter 1 | Introduction to Embedded Systems \n33Proposed Exercise\n1. How can the code be changed in such a way that the system is blocked after three incorrect codes \nare entered?\nAnswer to the Exercise\n1. It can be achieved by means of the change in T able 1.11.\nT able 1.11 Proposed modification in the code in order to achieve the new behavior.\nLine in Code 1.5 New code to be used\n40 if ( numberOfIncorrectCodes < 5 ) 40 if ( numberOfIncorrectCodes < 3 )\n1.3 Under the Hood\n1.3.1 Brief Introduction to the Cortex-M Processor Family and the NUCLEO Board\nIn this chapter, many programs were developed using the NUCLEO board, provided with the \nSTM32F429ZIT6U microcontroller . This microcontroller is manufactured by STMicroelectronics [13], 0.786926866)",4.998557
0,"(Index\n567 STM 32F429 ZIT 6U microcontroller 5, 33, 37, 38 \n stop bit 46, 64, 65, 80, 256 , 257 , 273 , 468 \n ST Zio connectors xiv, 6, 9, 37, 38, 44, 346 \n superloop 13, 519 \n synchronous communication 255\nT TCP server 456 , 457 , 459 , 460 –463 , 491 \n temperature sensor 4, 5, 87, 106 , 123 , 182 , 220 , 233 , 334 , 486 , 550 \n time management xvi, 86, 95, 99, 289 , 342 , 348 , 353 \n timers xvi, xxvii, 37, 306 , 342 , 343 , 345 , 348 –350 , 353 , 354 , \n 361 , 441 , 512 –517 , 533 , 535 , 537 , 538 , 539 \n tm structure 153 \n TO- 220 package 89\nU UART xvi, xvii, 5, 44, 61, 63, 79, 82, 84, 86, 181 , 191 , 222 , 255 , 290 , 291 , \n 306 , 350 , 420 , 423 , 447 , 455 , 461 , 465 , 468 , 544 \n USB xiv, xxvii, 4, 9, 37, 44, 45, 63, 79, 82, 88, 132 , 449 , 506 , 507 , 508 , 544 \n USB connection xiv, 44, 506 \n use cases 497 , 501 –504 , 511 , 545 , 546 , 547 , 548\nV validation 496 , 497 , 546 , 547 \n verification 81, 92, 496 , 497 , 546 \n vineyard frost prevention 123 , 124, 0.824474633)",0.283747
15,"(346\nA Beginner’s Guide to Designing Embedded System ApplicationsNUCLEO\n-F429ZI32F429ZIT6U\nARM7B776 VQ\nPHL 7B 7213e412000K620 Y12000\nK620 Y\n120 00K620 YDGKYD\nKMS-1 102NL1706C\nSTM32F103CBT6\ne393701GH218CHN\nST890C\nGK717\n11\n22\n33\n44\n55\n66\n77\n88\n9\n10\n11\n12\n13\n14\n15PH_0\nPD_0\nPD_1\nPG_0PH_1PF_2PA_7PF_10PF_5PF_3PC_3PC_030\n29\n28\n27\n2616\n2515\n2414\n2313\n2212\n2111\n2010\n199\n18\n17\n165V\nVIN3.3VIOREF\nGND\nGNDGND\nNCNC\nUART2_RX\nCAN1_TDCAN1_ DRADC1/7ADC1/3\nADC1/10\nADC1/13\nADC3/9\nADC3/15\nADC3/8\nADC3/5NRSTPC_8\nPC_9\nPC_10\nPC_11\nPC_12\nPD_2\nPG_2\nPG_3\nGNDPD_7\nPD_6\nPD_5\nPD_4\nPD_3\nPE_2\nPE_4\nPE_3\nPF_7\nPG_1UART2_RX\nUART2_ XT\nUART_X7TUART2_RTS\nUART2_CTSUART3_TX\nUART3_RX\nUART5_TX\nUART5_RX\nPA_3\nSPI1_MOSISPI3_SCK\nSPI2_SCK\nSPI4_SCK\nSPI5_SCKSPI4_CSSPI3_MISO\nSPI3_MOSI\nSPI3_MOSI\nSPI2_MOSI\nPWM1/1NPWM2/4\nPWM11/1PWM3/3\nI2C3_SDA PWM3/4\nA0\nA1\nA2\nA3\nA4\nA5\nADC3/6 PF_8 SPI5_MISO PWM /131\nADC3/7 PF_9 SPI5_MOSI PWM /141PE_6 SPI4_MOSI PWM /29PE_5 SPI4_MISO PWM /91\nUART _TX6\nUART_X6R\nUART_X1T\nUART _RTS3\nUART _CTS3\nUART_X4T, 0.773964405)",0.236537
1,"(Chapter 1 | Introduction to Embedded Systems \n37The STM32F429ZIT6U microcontroller includes a Cortex-M4 processor , as shown in Figure 1.22. It \ncan be appreciated that, beyond the processor, the microcontroller includes other peripherals such as \ncommunication cores (ethernet, USB, UART, etc.), memory, timers, and GPIO (General Purpose Input Output) ports. \nNUCLEO\n-F429ZI32F429ZIT6U\nARM7B776 VQ\nPHL 7B 7213e412000K620 Y12000\nK620 Y\n120 00K620 YDGKYD\nKMS-1 102NL1706C\nSTM32F103CBT6\ne393701GH218CHN\nST890C\nGK717\n11\n22\n33\n44\n55\n66\n77\n88\n9\n10\n11\n12\n13\n14\n15PH_0\nPD_0\nPD_1\nPG_0PH_1PF_2PA_7PF_10PF_5PF_3PC_3PC_030\n29\n28\n27\n2616\n2515\n2414\n2313\n2212\n2111\n2010\n199\n18\n17\n165V\nVIN3.3VIOREF\nGND\nGNDGND\nNCNC\nUART2_RX\nCAN1_TDCAN1_ DRADC1/7ADC1/3\nADC1/10\nADC1/13\nADC3/9\nADC3/15\nADC3/8\nADC3/5NRSTPC_8\nPC_9\nPC_10\nPC_11\nPC_12\nPD_2\nPG_2\nPG_3\nGNDPD_7\nPD_6\nPD_5\nPD_4\nPD_3\nPE_2\nPE_4\nPE_3\nPF_7\nPG_1UART2_RX\nUART2_ XT\nUART_X7TUART2_RTS\nUART2_CTSUART3_TX\nUART3_RX\nUART5_TX\nUART5_RX\nPA_3\nSPI1_MOSISPI3_SCK\nSPI2_SCK, 0.808997333)",-0.658202
8,"(Preface\nxx10A250V AC 10A 125V ACCU S\n10A 0VDC 10A VDC 32 8\nSRD-05VDC-SL- CCQCRSONGLE\n10A250V AC 10A 125V ACCU S\n10A 0VDC 10A VDC 32 8\nSRD-05VDC-SL- CCQCRSONGLE\n2 Relay ModuleK2 K1\nJD-VCC VCC GND GND IN1 IN2 VCCR3 R2D2 D2Q2 Q1\nIN2 IN1\nR4 R1+ +\nB1810\n817C\nGB1810\n817C\nG\n1982A 12381H2FPD-270A\nSolenoid valve External 12V\nPower source\nGND12\n12V\nAC POWER Adapte r\nInput 240 VCA\nOutpur 12V CC 2A\nErPAri\nHL-69\nMoisture\nsensor\nEPARIDO-LED\nAO DO GND VCC\nPWR-LED\n+\n+\n--\n--\n+\n+\n--\n--123456789101112131415161718192021222324252627282930f g\nh i\njf g\nh i\nja b\nc d\nea b\nc d\ne123456789101112131415161718192021222324252627282930(Relay IN1)\ndPF2_5V GND\n5V\nGNDNUCLEO-F429Z\nI\n32F429ZIT6UAR\nM\n7B776 VQPHL 7B 7213e412000K620 Y12000\nK620 Y\n12000K620 Y DGKYDKMS-1102NL1706C\nSTM32F103CBT6\ne393701GH218CHN\nST890C\nGK717\n3.3V\nGNDD4 to D7D8,D9CN9 CN8\nCN7CN10Relay module\nCN9\nPF 9_\nPF 7_\nPF 8_\nPG 1_(Mode)(How Often)(How Long)(Moisture)COM11NO\nA3\nVDDVSS VORSRWED0D1D2D3D4D5D6D7A K11 6, 0.781673431)",-0.959287
7,"(38\nA Beginner’s Guide to Designing Embedded System ApplicationsFigure 1.23 shows how different elements of the STM32F429ZIT6U microcontroller are mapped \nto the Zio and Arduino-compatible headers of the NUCLEO-F429ZI board. Some other elements \nare mapped to the CN11 and CN12 headers of the NUCLEO-F429ZI board, as will be discussed in upcoming chapters. Further information on these headers is available from [17].\nIn this chapter, buttons were connected to the NUCLEO board using pins D2 to D7. From Figure 1.23, \nit can be seen that those digital inputs can also be referred to as PF_15, PE_13, P_14, PE_11, PE_9, and PF_13, respectively. Throughout this book, many pins of the ST Zio connectors will be used, and they will be referred to in the code using the names shown in Figure 1.23., 0.781762719)",-1.54399
21,"(422\nA Beginner’s Guide to Designing Embedded System Applications(Relay GND)(Relay VCC)\nNUCLEO-F429Z\nI\n32F429ZIT6UARM\n7B776 VQPHL 7B 7213e412000K620 Y12000\nK620 Y\n12000K620 Y DGKYDKMS-1102NL1706C\nSTM32F103CBT6\ne393701GH218CHN\nST890C\nGK717CN8\n+\n+\n--\n--\n+\n+\n--\n--123456789101112131415161718192021222324252627282930f g\nh i\njf g\nh i\nja b\nc d\nea b\nc d\ne123456789101112131415161718192021222324252627282930\nMQ-2\nGas sensor\n-2MQ\nGND5V\nRGB LEDGND 3V3GND V53V3 5V\n3V3 5VMB-102CN9\nCN7CN10GND\n3.3VPotentiometer\n5V\nGNDA1Temperature\nsensorLM35\n3.3V\n5V\nGNDGNDHV1HV2HV\nLVHV3\nLV3LV3 LV1HV4\nLV4LV4 LV2GND\nGND\nGNDVCCVORSR/WEDBDBDBDBDBDBDBDB01234567\nNCPSB RSTVOUTBLABLK\n1 20A0PG 0_10KΩ GND 5V25V2uF 220\n25V2uF 220\n25V2uF 220PIR\nsensor\nRed\nGND\nBlue\nGreenPE 12_(Gas)\nPF 9_\nPF 7_\nPF 8_\nPG 1_(Dir1LS)\n(Dir1)\nPE 6_(Audio)\n(DirLS)2\n(Dir)2\nPB 4_\nPD 12_\nPA0_(Red)(Blue)(Green)\nLDRL 35M\nPC 9_Buzzer\n5V\nDO1033.3V\nSD card and\nSD card readerGND\nA\nB\nC\nD871\n42\n53\n6\n9\n0 #\n5151\n10K\nJAPANFCCvRKingston\n1\nSDCS/32GB, 0.766061246)",-2.066066
2,"(USAR Tn\nbxCANnsmcard\nirDA\nFIFODigital\nﬁlter\nDACnAPB1 45 MHz (max)AHB1 180 MHzAHB3\nVDD = 1.8 to 3.6V\nVSS\nVCAP1, VCAP2APB2 90 MHzART\nACCEL/CACHE\nGPIO POR TtOSC32_I N\nOSC32_OUTDMA2USB\nOTG HS\n8 Streams\nFIFODMA/\nFIFO PHY\nCHROM-AR T\nDMA2DFIFO\nFIFO8 Streams\nFIFODMA1\nLCD_R[7:0], LCD_G[7:0] ,\nLCD_B[7:0], LCD_HSYNC,\nLCD_VSYNC, LCD_DE,\nLCD_CLKLCD-TFTDMA/\nFIFO\nRTC_AF1\nRTC_AF1\nRTC 50Hz\n4KB BXPSRAMLS@ VBAT\nRTC\nAWU\nBackup registerXTAL32 KHz\nLSStandby\ninterfaceIWDGXTALOSC 4-26 MHzPOR\nreset\nInt\n@ VDDA @ VDDPVDPOR/PDR BORSupply\nsupervision@ VDDVDD\nVoltage regulator\n3.3 to 1.2VPower\nmanagement\nPLL1, 2, 3RC LSRC HS@ VDDA\nReset\n& clock\ncontrol\nFigure 1.22 STM32F429ZI block diagram made using information available from [9]., 0.807931781)",-3.9359
6,"(List of Figures\nxxxiv\nFigure 1.19 Simplified diagram of the Cortex processor family. 33\nFigure 1.20 Simplified diagram of the Cortex M0, M3, and M4 processors, and details of the \ncorresponding cores. 34\nFigure 1.21 Arm Cortex M0, M3, and M4 Instruction Set Architecture (ISA). 35\nFigure 1.22 STM32F429ZI block diagram made using information available from [9]. 36\nFigure 1.23 ST Zio connectors of the NUCLEO-F429ZI board. 37\nFigure 1.24 Hierarchy of different elements introduced in this chapter. 38\nFigure 1.25 “Smart door locks” built with Mbed contains elements introduced in this chapter. 39\nFigure 2.1 The smart home system is now connected to a PC. 45\nFigure 2.2 Website with the program documentation generated with Doxygen. 57\nFigure 2.3 Detailed description of functions and variables of the program that is available on the website (Part 1/3). 57\nFigure 2.4 Detailed description of functions and variables of the program that is available on the website (Part 2/3). 58, 0.782090366)",-5.124876
4,"(// commented// See https://os.mbed.com/teams/ST/wiki/STDIO for more information.////==============================================================================\nCode 8.1 Notes on the PinNames.h file of the NUCLEO-F429ZI board.\nCode 8.2 and Code 8.3 show the section of PinNames.h regarding PWM pins. For example, on line 4 \nof Code 8.2, it can be seen that PA_0 is related to PWM2 and channel 1 (channel 1 is indicated by \nthe 1 that is the penultimate value of line 4), which is used to control the green LED of the RGB LED \n(see Figure 8.3). This “normal” functionality of PA_0 is shown in Figure 8.5. In line 1 of Code 8.3, it can be seen that PB_0 is related to PWM1 and channel 2, with inverted behavior (ultimate value 1 of line 1). Inverted behavior means that a logic true is set by a 0 V value, as was explained above. This functionality is also shown on Figure 8.5.\n1\n23456789\n10111213141516171819202122//*** PWM ***, 0.788035452)",-5.264961


From the score table above, we can see that the top result returned by the vector search (second row, index 0) is less relevant than the text in the first row (index 5). Without reranking, however, the text at index 5 did not make it into the top 3: it was only the sixth-closest match when we retrieved by cosine similarity directly from the vector database.
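
The shift in ranking can also be read directly from the reranker scores printed earlier. In the sketch below, the list position of each score is the document's position in the original vector search (0 is the closest match), so sorting the positions by score shows where each reranked result came from. The variable names are our own.

```python
# Reranker scores copied from the output above; list position = vector-search rank (0-based)
rerank_scores = [
    0.28374725580215454, -0.6582016348838806, -3.9359004497528076,
    -6.078181266784668, -5.264961242675781, 4.998556613922119,
    -5.124876022338867, -1.5439904928207397, -0.9592869877815247,
    -7.042023658752441, -5.507993221282959, -8.35011100769043,
    -7.134540557861328, -9.436647415161133, -7.423570156097412,
    0.23653705418109894, -8.951774597167969, -5.671559810638428,
    -6.971182346343994, -7.249991416931152, -9.102521896362305,
    -2.066066265106201, -7.483180046081543, -7.791862964630127,
    -5.368914604187012,
]

# Vector-search positions ordered by reranker score, best first
reranked_order = sorted(
    range(len(rerank_scores)), key=lambda i: rerank_scores[i], reverse=True
)

for new_rank, original_index in enumerate(reranked_order[:3], start=1):
    print(
        f"rerank position {new_rank}: "
        f"vector-search position {original_index + 1} "
        f"(score {rerank_scores[original_index]:.3f})"
    )
```

Two of the three reranked results (vector-search positions 6 and 16) lie outside the original top 3, which is why retrieving a larger candidate set before reranking matters.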