# Overview

In this tutorial, we'll use Feast to inject documents and structured data (i.e., features) into the context of an LLM (Large Language Model) to power a RAG Application (Retrieval Augmented Generation).

Feast solves several common issues in this flow:
1. **Online retrieval:** At inference time, LLMs often need access to data that isn't readily 
   available and needs to be precomputed from other data sources.
   * Feast manages deployment to a variety of online stores (e.g. Milvus, DynamoDB, Redis, Google Cloud Datastore) and 
     ensures necessary features are consistently _available_ and _freshly computed_ at inference time.
2. **Vector Search:** Feast has built-in support for vector similarity search that is configured declaratively, so users can focus on their application.
3. **Richer structured data:** Along with vector search, users can query standard structured fields to inject into the LLM context for better user experiences.
4. **Feature/context reuse and versioning:** Different teams within an organization are often unable to reuse 
   data across projects and services, resulting in duplicate application logic. Models have data dependencies that need 
   to be versioned, for example when running A/B tests on model/prompt versions.
   * Feast enables discovery of and collaboration on previously used documents and features, and enables versioning of sets of 
     data.

We will:
1. Deploy a local feature store with a **Parquet file offline store** and **SQLite online store**.
2. Write/materialize the data (i.e., feature values) from the offline store (a Parquet file) into the online store (SQLite).
3. Serve the features using the Feast SDK.
4. Inject the documents into the LLM's context to answer questions.

In [1]:
%%sh
pip install feast -U -q
echo "Please restart your runtime now (Runtime -> Restart runtime). This ensures that the correct dependencies are loaded."

Please restart your runtime now (Runtime -> Restart runtime). This ensures that the correct dependencies are loaded.


**Reminder**: Please restart your runtime after installing Feast (Runtime -> Restart runtime). This ensures that the correct dependencies are loaded.

## Step 2: Create a feature repository

A feature repository is a directory that contains the configuration of the feature store and individual features. This configuration is written as code (Python/YAML) and it's highly recommended that teams track it centrally using git. See [Feature Repository](https://docs.feast.dev/reference/feature-repository) for a detailed explanation of feature repositories.

The easiest way to create a new feature repository is to use the `feast init` command. For this demo, you **do not** need to initialize a Feast repo.


### Demo data scenario 
- We have chunks of PDF documents (stored in `docling_samples.parquet`) that have been embedded into sentence embeddings to be used for vector retrieval in a RAG application.
- We want to retrieve the chunks most relevant to a user's question and inject them into the LLM's context to answer it.

In [6]:
import feast

### Step 2a: Inspecting the feature repository

Let's take a look at the demo repo itself. It breaks down into:


* `data/` contains raw demo parquet data
* `example_repo.py` contains demo feature definitions
* `feature_store.yaml` contains a demo setup configuring where data sources are
* `test_workflow.py` showcases how to run all key Feast commands, including defining, retrieving, and pushing features.
   * You can run this with `python test_workflow.py`.

In [8]:
%cd /Users/farceo/dev/feast/examples/rag-docling/feature_repo/
!ls -R

/Users/farceo/dev/feast/examples/rag-docling/feature_repo
__init__.py        data               feature_store.yaml
__pycache__        example_repo.py    test_workflow.py

./__pycache__:
example_repo.cpython-310.pyc example_repo.cpython-311.pyc

./data:
Untitled.ipynb                registry.db
docling_samples.parquet       small.pdf
metadata_samples.parquet      smallest-possible-pdf-2.0.pdf
online_store.db               tmp.ipynb


### Step 2b: Inspecting the project configuration
Let's inspect the setup of the project in `feature_store.yaml`. 

The key line defining the overall architecture of the feature store is the **provider**. 

The provider value sets default offline and online stores. 
* The offline store provides the compute layer to process historical data (for generating training data & feature 
  values for serving). 
* The online store is a low latency store of the latest feature values (for powering real-time inference).
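
The split between the two stores can be pictured with a small pure-Python sketch. It does not use Feast: the `offline_rows` list and the `latest_per_entity` helper are hypothetical names that stand in for an offline table and for the reduction that produces the values an online store would serve.

```python
from datetime import datetime

# Hypothetical offline "table": every historical row is kept.
offline_rows = [
    {"chunk_id": "chunk-1", "file_name": "a.pdf", "created": datetime(2025, 3, 26, 10, 0)},
    {"chunk_id": "chunk-1", "file_name": "a_v2.pdf", "created": datetime(2025, 3, 26, 12, 0)},
    {"chunk_id": "chunk-2", "file_name": "b.pdf", "created": datetime(2025, 3, 26, 11, 0)},
]


def latest_per_entity(rows, key="chunk_id", ts="created"):
    """Keep only the most recent row for each entity key."""
    online = {}
    for row in rows:
        current = online.get(row[key])
        if current is None or row[ts] > current[ts]:
            online[row[key]] = row
    return online


# The "online" view holds one row per entity: the freshest one.
online_store = latest_per_entity(offline_rows)
print(online_store["chunk-1"]["file_name"])  # a_v2.pdf
```

The offline list keeps both versions of `chunk-1`, while the online dictionary only answers with the newest one, which is the behavior a low-latency lookup needs.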

Valid values for `provider` in `feature_store.yaml` are:

* local: use file source with SQLite (as in this demo) or Milvus Lite
* gcp: use BigQuery/Snowflake with Google Cloud Datastore/Redis
* aws: use Redshift/Snowflake with DynamoDB/Redis

Note that there are many other offline / online stores Feast works with, including Azure, Hive, Trino, and PostgreSQL via community plugins. See https://docs.feast.dev/roadmap for all supported connectors.

A custom setup can also be made by following [Customizing Feast](https://docs.feast.dev/v/master/how-to-guides/customizing-feast)

In [9]:
!pygmentize feature_store.yaml

project: rag
provider: local
registry: data/registry.db
online_store:
  type: sqlite
  path: data/online_store.db
  vector_enabled: true
  vector_len: 384
  # type: milvus
  # path: data/online_store.db
  # embedding_dim: 384
  # index_type: "IVF_FLAT"


offline_store:

### Inspecting the raw data

The raw feature data we have in this demo is stored in local Parquet files. The dataset contains text chunks from a set of PDF documents together with their embeddings.

In [11]:
df.head()

Unnamed: 0,document_id,chunk_id,file_name,raw_chunk_markdown,vector
0,doc-1,chunk-1,2203.01017v2,"Ahmed Nassar, Nikolaos Livathinos, Maksym Lysa...","[-0.056879762560129166, 0.01667858101427555, -..."
1,doc-1,chunk-2,2203.01017v2,a. Picture of a table:\nTables organize valuab...,"[0.050771258771419525, -0.0055733839981257915,..."
2,doc-1,chunk-3,2203.01017v2,a. Picture of a table:\ncomplex column/row-hea...,"[-0.05088765174150467, 0.05101901665329933, -0..."
3,doc-1,chunk-4,2203.01017v2,a. Picture of a table:\nmodel. The latter impr...,"[0.011835305020213127, -0.09409898519515991, 0..."
4,doc-1,chunk-5,2203.01017v2,a. Picture of a table:\nwe can obtain the cont...,"[-0.0068757119588553905, 0.006624480709433556,..."


In [13]:
import pandas as pd 

df = pd.read_parquet("./data/docling_samples.parquet")
mdf = pd.read_parquet("./data/metadata_samples.parquet")
df['chunk_embedding'] = df['vector'].apply(lambda x: x.tolist())
embedding_length = len(df['vector'][0])
print(f'embedding length = {embedding_length}')

embedding length = 384
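
The printed length should agree with `vector_len: 384` in `feature_store.yaml`. As a minimal sketch, assuming each embedding can be treated as a plain sequence, a check like the following could flag rows whose vector has the wrong size before anything is written to the online store. The helper name `find_bad_lengths` and the toy vectors are hypothetical.

```python
EXPECTED_LEN = 384  # mirrors `vector_len` in feature_store.yaml


def find_bad_lengths(vectors, expected_len=EXPECTED_LEN):
    """Return the positions of vectors whose length differs from expected_len."""
    return [i for i, vec in enumerate(vectors) if len(vec) != expected_len]


# Toy stand-ins for the `vector` column; the second one is deliberately too short.
toy_vectors = [[0.0] * 384, [0.0] * 383, [0.0] * 384]
bad = find_bad_lengths(toy_vectors)
print(bad)  # [1]
```

An empty result means every vector matches the configured dimension.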


In [34]:
df['created'] = pd.Timestamp.now()
mdf['created'] = pd.Timestamp.now()

In [35]:
from IPython.display import display

display(df.head())

Unnamed: 0,document_id,chunk_id,file_name,raw_chunk_markdown,vector,chunk_embedding,created
0,doc-1,chunk-1,2203.01017v2,"Ahmed Nassar, Nikolaos Livathinos, Maksym Lysa...","[-0.056879762560129166, 0.01667858101427555, -...","[-0.056879762560129166, 0.01667858101427555, -...",2025-03-26 22:50:32.803496
1,doc-1,chunk-2,2203.01017v2,a. Picture of a table:\nTables organize valuab...,"[0.050771258771419525, -0.0055733839981257915,...","[0.050771258771419525, -0.0055733839981257915,...",2025-03-26 22:50:32.803496
2,doc-1,chunk-3,2203.01017v2,a. Picture of a table:\ncomplex column/row-hea...,"[-0.05088765174150467, 0.05101901665329933, -0...","[-0.05088765174150467, 0.05101901665329933, -0...",2025-03-26 22:50:32.803496
3,doc-1,chunk-4,2203.01017v2,a. Picture of a table:\nmodel. The latter impr...,"[0.011835305020213127, -0.09409898519515991, 0...","[0.011835305020213127, -0.09409898519515991, 0...",2025-03-26 22:50:32.803496
4,doc-1,chunk-5,2203.01017v2,a. Picture of a table:\nwe can obtain the cont...,"[-0.0068757119588553905, 0.006624480709433556,...","[-0.0068757119588553905, 0.006624480709433556,...",2025-03-26 22:50:32.803496


In [36]:
display(mdf.head())

Unnamed: 0,document_id,file_name,full_document_markdown,pdf_bytes,created
0,doc-1,2203.01017v2,## TableFormer: Table Structure Understanding ...,b'%PDF-1.5\n%\x8f\n5 0 obj\n<< /Type /XObject ...,2025-03-26 22:50:32.804817
1,doc-3,2305.03393v1-pg9,order to compute the TED score. Inference timi...,b'%PDF-1.3\n%\xc4\xe5\xf2\xe5\xeb\xa7\xf3\xa0\...,2025-03-26 22:50:32.804817
2,doc-2,2305.03393v1,## Optimized Table Tokenization for Table Stru...,b'%PDF-1.5\n%\x8f\n74 0 obj\n<< /Filter /Flate...,2025-03-26 22:50:32.804817
3,doc-4,amt_handbook_sample,"pulleys, provided the inner race of the bearin...",b'%PDF-1.6\r%\xe2\xe3\xcf\xd3\r\n875 0 obj\r<<...,2025-03-26 22:50:32.804817
4,doc-5,code_and_formula,## JavaScript Code Example\n\nLorem ipsum dolo...,b'%PDF-1.5\n%\xbf\xf7\xa2\xfe\n3 0 obj\n<< /Li...,2025-03-26 22:50:32.804817


## Step 3: Register feature definitions and deploy your feature store

`feast apply` scans Python files in the current directory for feature/entity definitions and deploys infrastructure according to `feature_store.yaml`.

### Step 3a: Inspecting feature definitions
Let's inspect what `example_repo.py` looks like:

```python
from datetime import timedelta

from feast import (
    FeatureView,
    Field,
    FileSource,
)
from feast.data_format import ParquetFormat
from feast.types import Float32, Array, String, ValueType
from feast import Entity

chunk = Entity(
    name="chunk_id",
    description="Chunk ID",
    value_type=ValueType.STRING,
)

parquet_file_path = "./data/docling_samples.parquet"

source = FileSource(
    file_format=ParquetFormat(),
    path=parquet_file_path,
    timestamp_field="created",
)

# The name must match the one used later when writing to and reading from the store.
docling_feature_view = FeatureView(
    name="docling_feature_view",
    entities=[chunk],
    schema=[
        Field(name="file_name", dtype=String),
        Field(name="full_document_markdown", dtype=String),
        Field(name="raw_chunk_markdown", dtype=String),
        Field(
            name="vector",
            dtype=Array(Float32),
            vector_index=True,
            vector_search_metric="COSINE",
        ),
        Field(name="bytes", dtype=String),
        Field(name="chunk_id", dtype=String),
    ],
    source=source,
    ttl=timedelta(hours=2),
)
```

### Step 3b: Applying feature definitions
Now we run `feast apply` to register the feature views and entities defined in `example_repo.py` and set up the SQLite online store tables. Note that we specified SQLite as the online store in `feature_store.yaml`; it is also the default for the `local` provider.

In [44]:
%rm -rf .ipynb_checkpoints/

In [45]:
! feast apply 

No project found in the repository. Using project name rag defined in feature_store.yaml
Applying changes for project rag
Updated feature view docling_feature_view
	features: [name: "file_name"
value_type: STRING
, name: "raw_chunk_markdown"
value_type: STRING
, name: "vector"
value_type: DOUBLE_LIST
vector_index: true
vector_search_metric: "COSINE"
] -> [name: "file_name"
value_type: STRING
, name: "raw_chunk_markdown"
value_type: STRING
, name: "vector"
value_type: FLOAT_LIST
vector_index: true
vector_search_metric: "COSINE"
]
Updated on demand feature view docling_transform_docs
	features: [name: "document_id"
value_type: STRING
, name: "chunk_id"
value_type: STRING
, name: "chunk_text"
va

## Step 5: Load features into your online store

In [46]:
from datetime import datetime
from feast import FeatureStore

store = FeatureStore(repo_path=".")

### Step 5a: Using `write_to_online_store`

We now write the latest feature values to the online store to prepare for serving. Note that `materialize-incremental` serializes all new features since the last `materialize` call, or since the time provided minus the `ttl` timedelta if there was no earlier call. In this case, that is `CURRENT_TIME - 2 hours` (`ttl` is set to `timedelta(hours=2)` on the `FeatureView` in `example_repo.py`).

```bash
CURRENT_TIME=$(date -u +"%Y-%m-%dT%H:%M:%S")
feast materialize-incremental $CURRENT_TIME
```
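
To make the time window concrete, here is a pure-Python sketch of the rule described above. It does not call Feast; `materialization_window` is a hypothetical helper, and the two-hour `ttl` is taken from the feature view definition shown earlier.

```python
from datetime import datetime, timedelta


def materialization_window(end, ttl, last_materialized=None):
    """Return (start, end): resume from the previous run if there is one,
    otherwise look back one ttl from the end time."""
    start = last_materialized if last_materialized is not None else end - ttl
    return start, end


end = datetime(2025, 3, 26, 22, 0)
ttl = timedelta(hours=2)  # matches ttl=timedelta(hours=2) in example_repo.py

# First run: no earlier materialization, so the window starts at end - ttl.
print(materialization_window(end, ttl))

# Later run: the window starts where the previous one stopped.
print(materialization_window(end, ttl, last_materialized=datetime(2025, 3, 26, 21, 30)))
```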

An alternative to using the CLI command is to use Python:

In [47]:
df.head()

Unnamed: 0,document_id,chunk_id,file_name,raw_chunk_markdown,vector,chunk_embedding,created
0,doc-1,chunk-1,2203.01017v2,"Ahmed Nassar, Nikolaos Livathinos, Maksym Lysa...","[-0.056879762560129166, 0.01667858101427555, -...","[-0.056879762560129166, 0.01667858101427555, -...",2025-03-26 22:50:32.803496
1,doc-1,chunk-2,2203.01017v2,a. Picture of a table:\nTables organize valuab...,"[0.050771258771419525, -0.0055733839981257915,...","[0.050771258771419525, -0.0055733839981257915,...",2025-03-26 22:50:32.803496
2,doc-1,chunk-3,2203.01017v2,a. Picture of a table:\ncomplex column/row-hea...,"[-0.05088765174150467, 0.05101901665329933, -0...","[-0.05088765174150467, 0.05101901665329933, -0...",2025-03-26 22:50:32.803496
3,doc-1,chunk-4,2203.01017v2,a. Picture of a table:\nmodel. The latter impr...,"[0.011835305020213127, -0.09409898519515991, 0...","[0.011835305020213127, -0.09409898519515991, 0...",2025-03-26 22:50:32.803496
4,doc-1,chunk-5,2203.01017v2,a. Picture of a table:\nwe can obtain the cont...,"[-0.0068757119588553905, 0.006624480709433556,...","[-0.0068757119588553905, 0.006624480709433556,...",2025-03-26 22:50:32.803496


In [48]:
store.write_to_online_store(feature_view_name='docling_feature_view', df=df)

In [49]:
store.write_to_online_store(feature_view_name='docling_transform_docs', df=mdf)

Token indices sequence length is longer than the specified maximum sequence length for this model (928 > 512). Running this sequence through the model will result in indexing errors

IndexError: list index out of range

In [50]:
# Peek at the SQLite online store through Feast's private connection handle.
# These underscore-prefixed attributes are internal and may change between releases.
conn = store._provider._online_store._conn
document_table = conn.execute(
    "SELECT name FROM sqlite_master WHERE type='table' AND name LIKE '%docling%';"
).fetchall()[0][0]
written_data = pd.read_sql_query(f"SELECT * FROM {document_table}", conn)

In [51]:
written_data

Unnamed: 0,entity_key,feature_name,value,vector_value,event_ts,created_ts
0,b'\x01\x00\x00\x00\x02\x00\x00\x00\x08\x00\x00...,file_name,b'\x12\x10right_to_left_03',right_to_left_03,2025-03-26 22:50:32,
1,b'\x01\x00\x00\x00\x02\x00\x00\x00\x08\x00\x00...,raw_chunk_markdown,b'\x12;2-5 -\xd8\xa7\xd8\xb3\xd8\xaa\xd8\xa7\x...,2-5 -استاندارد ک الا\nنام استاندارد,2025-03-26 22:50:32,
2,b'\x01\x00\x00\x00\x02\x00\x00\x00\x08\x00\x00...,vector,b'\x82\x01\x83\x0c\n\x80\x0c\x96d\xd2\xbc\x03h...,b'\x96d\xd2\xbc\x03h\xae=\x16(E=\x8bX\'\xbd\x1...,2025-03-26 22:50:32,
3,b'\x01\x00\x00\x00\x02\x00\x00\x00\x08\x00\x00...,file_name,b'\x12\x10right_to_left_03',right_to_left_03,2025-03-26 22:50:32,
4,b'\x01\x00\x00\x00\x02\x00\x00\x00\x08\x00\x00...,raw_chunk_markdown,b'\x12\x94\x012-5 -\xd8\xa7\xd8\xb3\xd8\xaa\xd...,2-5 -استاندارد ک الا\nشمشه و شمشال توليد ش...,2025-03-26 22:50:32,
...,...,...,...,...,...,...
1051,b'\x01\x00\x00\x00\x02\x00\x00\x00\x08\x00\x00...,raw_chunk_markdown,b'\x12\x8e\x01TableFormer predicted structure\...,TableFormer predicted structure\n= . Ki-67 pro...,2025-03-26 22:50:32,
1052,b'\x01\x00\x00\x00\x02\x00\x00\x00\x08\x00\x00...,vector,b'\x82\x01\x83\x0c\n\x80\x0c\xd9\x1as\xbd$\x01...,b'\xd9\x1as\xbd$\x01\xa6\xbc\xe7\x04Y\xbd\xd3\...,2025-03-26 22:50:32,
1053,b'\x01\x00\x00\x00\x02\x00\x00\x00\x08\x00\x00...,file_name,b'\x12\x0c2203.01017v2',2203.01017v2,2025-03-26 22:50:32,
1054,b'\x01\x00\x00\x00\x02\x00\x00\x00\x08\x00\x00...,raw_chunk_markdown,b'\x12\x8a\x01TableFormer predicted structure\...,TableFormer predicted structure\nFigure 16: Ex...,2025-03-26 22:50:32,
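
The output above uses a long layout: one row per entity key and feature name rather than one row per chunk. As a rough illustration, with toy rows and a hypothetical `pivot_features` helper, such rows can be grouped back into one feature dictionary per entity:

```python
from collections import defaultdict

# Toy rows mimicking the (entity_key, feature_name, value) columns above.
long_rows = [
    (b"key-1", "file_name", "2203.01017v2"),
    (b"key-1", "raw_chunk_markdown", "TableFormer predicted structure"),
    (b"key-2", "file_name", "right_to_left_03"),
]


def pivot_features(rows):
    """Group long-format rows into one {feature_name: value} dict per entity."""
    wide = defaultdict(dict)
    for entity_key, feature_name, value in rows:
        wide[entity_key][feature_name] = value
    return dict(wide)


features_by_entity = pivot_features(long_rows)
print(features_by_entity[b"key-1"]["file_name"])  # 2203.01017v2
```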


### Step 5b: Inspect materialized features

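Note that `online_store.db` and `registry.db` now exist; they store the materialized features and schema information, respectively. The cell below queries the store through a Milvus client, so it applies when the commented-out `milvus` online store in `feature_store.yaml` is enabled instead of SQLite.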

In [9]:
pymilvus_client = store._provider._online_store._connect(store.config)
COLLECTION_NAME = pymilvus_client.list_collections()[0]

milvus_query_result = pymilvus_client.query(
    collection_name=COLLECTION_NAME,
    filter="file_name == '2203.01017v2'",
)
pd.DataFrame(milvus_query_result[0]).head()

Unnamed: 0,chunk_id_pk,chunk_id,created_ts,event_ts,file_name,raw_chunk_markdown,vector
0,0100000002000000080000006368756e6b5f6964020000...,002bce0097246931724ae35b1e1a0d13fbb2c1a97e6c04...,0,1740914705958118,2203.01017v2,a. Picture of a table:\n95% on complex tables.,0.051321
1,0100000002000000080000006368756e6b5f6964020000...,002bce0097246931724ae35b1e1a0d13fbb2c1a97e6c04...,0,1740914705958118,2203.01017v2,a. Picture of a table:\n95% on complex tables.,0.091583
2,0100000002000000080000006368756e6b5f6964020000...,002bce0097246931724ae35b1e1a0d13fbb2c1a97e6c04...,0,1740914705958118,2203.01017v2,a. Picture of a table:\n95% on complex tables.,-0.039993
3,0100000002000000080000006368756e6b5f6964020000...,002bce0097246931724ae35b1e1a0d13fbb2c1a97e6c04...,0,1740914705958118,2203.01017v2,a. Picture of a table:\n95% on complex tables.,0.028728
4,0100000002000000080000006368756e6b5f6964020000...,002bce0097246931724ae35b1e1a0d13fbb2c1a97e6c04...,0,1740914705958118,2203.01017v2,a. Picture of a table:\n95% on complex tables.,-0.003588


### Quick note on entity keys
Note from the above output that the online store indexes by `entity_key`. 

[Entity keys](https://docs.feast.dev/getting-started/concepts/entity#entity-key) include a list of all entities needed (e.g. all relevant primary keys) to generate the feature vector. In this case, this is a serialized version of the `chunk_id`. We use this later to fetch all features for a given chunk at inference time.
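
The serialized form can be partly decoded by hand. The sketch below takes the first 20 bytes of the `chunk_id_pk` value printed in Step 5b and unpacks them with the standard library. Reading the three leading integers as a key count, a type tag, and a name length is an assumption about the layout; the decoded entity name itself is plain UTF-8.

```python
import struct

# First 20 bytes of the hex-encoded key from the Step 5b output.
prefix = bytes.fromhex("0100000002000000080000006368756e6b5f6964")

# Three little-endian unsigned 32-bit integers, then the entity name itself.
# Interpreting them as (key count, type tag, name length) is an assumption.
count, type_tag, name_len = struct.unpack_from("<III", prefix, 0)
entity_name = prefix[12:12 + name_len].decode("utf-8")

print(count, type_tag, name_len, entity_name)  # 1 2 8 chunk_id
```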

## Step 6: Embedding a query using PyTorch and Sentence Transformers

During inference (e.g., when a user submits a chat message), we need to embed the input text. This can be thought of as a feature transformation of the input data. In this example, we'll do this with a small Sentence Transformer from Hugging Face.

In [10]:
from sentence_transformers import SentenceTransformer

EMBED_MODEL_ID = "sentence-transformers/all-MiniLM-L6-v2"
embedding_model = SentenceTransformer(EMBED_MODEL_ID)

def embed_chunk(inputs):
    # Encode the query string and return its normalized embedding as a plain list.
    embeddings = embedding_model.encode(
        [inputs["query_string"]],
        normalize_embeddings=True,
    )
    return {"query_embedding": embeddings.tolist()[0]}

In [13]:
embed_chunk({"query_string": "test"})['query_embedding'][0:10]

[0.011573407799005508,
 0.025136204436421394,
 -0.03670184686779976,
 0.05932486802339554,
 -0.0071490369737148285,
 -0.04119417816400528,
 0.07708743214607239,
 0.037442512810230255,
 0.012449025176465511,
 -0.006117636803537607]
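
The `normalize_embeddings=True` argument scales each embedding to unit length. A minimal pure-Python sketch of that scaling, using a hypothetical `l2_normalize` helper and a two-dimensional toy vector rather than the model's real output, looks like this:

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    return list(vec) if norm == 0 else [x / norm for x in vec]


unit = l2_normalize([3.0, 4.0])
print(unit)  # [0.6, 0.8]

# The length of the result is 1 (up to floating-point rounding).
print(math.sqrt(sum(x * x for x in unit)))
```

Once vectors have unit length, their dot product equals their cosine similarity, which is convenient for the `COSINE` metric configured on the `vector` field.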

## Step 7: Fetching real-time vectors and data for online inference

At inference time, we run a vector similarity search over the document embeddings in the online feature store by passing the embedded query to `retrieve_online_documents_v2()`. The retrieved chunks and their fields can then be fed into the context of the LLM.

In [15]:
sample_query = df['raw_chunk_markdown'].values[0] 
print(sample_query)

Ahmed Nassar, Nikolaos Livathinos, Maksym Lysak, Peter Staar IBM Research
{ ahn,nli,mly,taa @zurich.ibm.com }


In [17]:
# Note we can enhance this special case to embed within the feature server, optionally.
query_embedding = embed_chunk({"query_string": sample_query})

In [24]:
from IPython.display import display

# Retrieve top k documents
context_data = store.retrieve_online_documents_v2(
    features=[
        "docling_feature_view:vector",
        "docling_feature_view:file_name",
        "docling_feature_view:raw_chunk_markdown",
        "docling_feature_view:chunk_id",
    ],
    query=query_embedding['query_embedding'],
    top_k=3,
    distance_metric='COSINE',
).to_df()

display(context_data)

Unnamed: 0,vector,file_name,raw_chunk_markdown,chunk_id,distance
0,"[-0.056879762560129166, 0.01667858101427555, -...",redp5110_sampled,1.2 Current state of IBM i security\nthe empl...,246855c6650678a5b15f8e0cfa2d2670e249140ac2541e...,0.515772
1,"[-0.056879762560129166, 0.01667858101427555, -...",2203.01017v2,"Ahmed Nassar, Nikolaos Livathinos, Maksym Lysa...",6385912fa27a8dd602cea2afaa3ecc9a27229ebd508661...,1.0
2,"[-0.056879762560129166, 0.01667858101427555, -...",redp5110_sampled,"We build confident, satisfied clients\nNo one ...",8e0a5ad8fd2216eff21b4ac27efb018586ceb9ed4e3a34...,0.5106
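
In the output above, the row whose text is identical to the query has a `distance` of 1.0, which suggests the column holds a cosine similarity score in this setup. The ranking can be sketched as a brute-force top-k search in plain Python. This is not how the online store implements it; `cosine_similarity`, `top_k`, and the toy vectors are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query, documents, k=3):
    """Score every document against the query and return the k best (id, score) pairs."""
    scored = [(doc_id, cosine_similarity(query, vec)) for doc_id, vec in documents.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]


# Toy 3-dimensional embeddings; the real ones in this tutorial have 384 dimensions.
toy_docs = {
    "chunk-a": [1.0, 0.0, 0.0],
    "chunk-b": [0.0, 1.0, 0.0],
    "chunk-c": [1.0, 1.0, 0.0],
}
results = top_k([1.0, 0.0, 0.0], toy_docs, k=2)
print([doc_id for doc_id, _ in results])  # ['chunk-a', 'chunk-c']
```

A vector store replaces the full scan with an index, but the scores it ranks by follow the same formula.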


In [53]:
def format_documents(context_df):
    output_context = ""
    
    # Remove duplicates based on 'chunk_id' (ensuring unique document chunks)
    unique_documents = context_df.drop_duplicates(subset=["chunk_id"])["raw_chunk_markdown"]
    
    # Format each document
    for i, document_text in enumerate(unique_documents):
        output_context += f"****START DOCUMENT {i}****\n"
        output_context += f"document = {{ {document_text.strip()} }}\n"
        output_context += f"****END DOCUMENT {i}****\n\n"
    
    return output_context.strip()

In [54]:
RAG_CONTEXT = format_documents(context_data)

In [55]:
print(RAG_CONTEXT)

****START DOCUMENT 0****
document = { 1.2  Current state of IBM i security
the employees that they manage. }
****END DOCUMENT 0****

****START DOCUMENT 1****
document = { Ahmed Nassar, Nikolaos Livathinos, Maksym Lysak, Peter Staar IBM Research
{ ahn,nli,mly,taa @zurich.ibm.com } }
****END DOCUMENT 1****

****START DOCUMENT 2****
document = { We build confident, satisfied clients
No one else has the vast consulting experiences, skills sharing and renown service offerings to do what we can do for you.
Because no one else is IBM. }
****END DOCUMENT 2****


In [56]:
FULL_PROMPT = f"""
You are an assistant for answering questions about a series of documents. You will be provided documentation from different documents. Provide a conversational answer.
If you don't know the answer, just say "I do not know." Don't make up an answer.

Here are document(s) you should use when answering the user's question:
{RAG_CONTEXT}
"""

In [63]:
question = 'Who are the authors of the paper?'

In [60]:
import os
from openai import OpenAI

client = OpenAI(
    api_key=os.environ.get("OPENAI_API_KEY"),
)

In [65]:
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "system", "content": FULL_PROMPT},
        {"role": "user", "content": question}
    ],
)

In [66]:
print('\n'.join([c.message.content for c in response.choices]))

The authors of the paper are Ahmed Nassar, Nikolaos Livathinos, Maksym Lysak, and Peter Staar from IBM Research.


# End