# Load
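## 📚 Overview

This lecture covers how to embed text data and load it into a vector database so that it can be stored and searched efficiently. In a RAG (Retrieval-Augmented Generation) system, the choice of embedding model and vector store has a significant effect on retrieval and response quality.

You will use several embedding models to convert text into vectors, store those vectors in a vector database, and run fast similarity searches over them.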

## Table of Contents
* [Using OpenAI Embedding Models](#using-openai-embedding-models)
* [Using Ollama Embedding Models](#using-ollama-embedding-models)
* [Storing Embedding Vectors with FAISS](#storing-embedding-vectors-with-faiss)
* [Chroma Vector DB](#chroma-vector-db)
* [Qdrant Vector DB](#qdrant-vector-db)

In [None]:
# Set environment variables (enter them here if you are not using a .env file!)
import os

# Set the API key (OpenAIEmbeddings reads it from OPENAI_API_KEY)
os.environ["OPENAI_API_KEY"] = "sk-..."

In [None]:
from dotenv import load_dotenv

# Load variables from .env; override=True replaces values already set in the environment
load_dotenv(override=True)

True
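
Before creating an embeddings client, it can help to fail fast when the key is missing. The sketch below is a minimal helper for that; `require_env` is a hypothetical name (not part of `python-dotenv` or LangChain), and it accepts the environment mapping as a parameter so it can be tried without touching real variables.

```python
import os
from typing import Mapping, Optional


def require_env(name: str, env: Optional[Mapping[str, str]] = None) -> str:
    """Return the value of an environment variable, or raise if it is unset or empty."""
    source = os.environ if env is None else env
    value = source.get(name, "").strip()
    if not value:
        raise RuntimeError(f"{name} is not set; add it to .env or set it in os.environ")
    return value


# Example with an explicit mapping (placeholder value, not a real key)
print(require_env("OPENAI_API_KEY", {"OPENAI_API_KEY": "sk-placeholder"}))
```

Calling `require_env("OPENAI_API_KEY")` with no mapping checks the real process environment after `load_dotenv()` has run.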

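### Using OpenAI Embedding Models

`text-embedding-3-small`: one of OpenAI's recent embedding models, designed to be fast and lightweight.

Document embedding vs. query embedding
* `embed_documents()` → embeds a list of texts in a single call (used when indexing documents for search).
* `embed_query()` → embeds a single query string (used at search time in a question-answering system).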

In [1]:
from langchain_openai import OpenAIEmbeddings
embeddings_model = OpenAIEmbeddings(model="text-embedding-3-small")

In [None]:
# Convert a list of documents (sentences) into vectors (embeddings)
embeddings = embeddings_model.embed_documents(
    [
        "Hi there!",
        "Oh, hello!",
        "What's your name?",
        "My friends call me World",
        "Hello World!"
    ]
)

# Show the number of embedded documents and the dimension of one embedding vector
len(embeddings), len(embeddings[0])

(5, 1536)

In [None]:
# Convert a query sentence into an embedding vector
embedded_query = embeddings_model.embed_query("What was the name mentioned in the conversation?")
embedded_query[:5]  # show only the first 5 values

[-0.010634176433086395,
 -0.01016946416348219,
 -0.0020040736999362707,
 0.023065242916345596,
 -0.026829415932297707]
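
Once documents and a query are embedded, retrieval comes down to comparing vectors. The following is a minimal sketch of ranking documents by cosine similarity in plain Python; the function names and the 3-dimensional toy vectors are made up for illustration and stand in for real embeddings.

```python
import math
from typing import List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_documents(query_vec: Sequence[float], doc_vecs: Sequence[Sequence[float]]) -> List[int]:
    """Return document indices ordered from most to least similar to the query."""
    scores = [cosine_similarity(query_vec, d) for d in doc_vecs]
    return sorted(range(len(doc_vecs)), key=lambda i: scores[i], reverse=True)


# Toy vectors standing in for real embeddings (made-up values)
toy_docs = [[1.0, 0.0, 0.0], [0.6, 0.8, 0.0], [0.0, 0.0, 1.0]]
toy_query = [0.8, 0.6, 0.0]
print(rank_documents(toy_query, toy_docs))
```

With the real model, `embedded_query` and `embeddings` from the cells above could be passed in place of the toy vectors.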

In [None]:
len(embeddings[0])  # check the dimension of a single sentence embedding

1536

### Using Ollama Embedding Models

In [2]:
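# Load a PDF document and store it page by page

from langchain_community.document_loaders import PyPDFLoader

file_path = (
    "data/arxiv_paper.pdf"
)

# Create the PDF loader object
loader = PyPDFLoader(file_path)

# Initialize the list that will hold each page of the PDF
pages = []

# Load the PDF asynchronously, one page at a time
# (top-level `async for` works in Jupyter/IPython, not in a plain script)
async for page in loader.alazy_load():
    pages.append(page)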

In [3]:
from langchain_text_splitters import RecursiveCharacterTextSplitter

# Configure text splitting with RecursiveCharacterTextSplitter
text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=1000,       # maximum chunk size: 1000 characters
    chunk_overlap=200,     # overlap consecutive chunks by up to 200 characters (to preserve context)
    length_function=len,   # function used to measure text length (len)
    is_separator_regex=False  # treat separators as plain strings, not regular expressions
)

# Split the data loaded from the PDF into text chunks
texts = text_splitter.split_documents(pages)

print(f"{texts[0].metadata}")  # metadata of the first chunk
print(texts[0].page_content)  # content of the first chunk
print("-"*100)
print(f"{texts[1].metadata}")  # metadata of the second chunk
print(texts[1].page_content)  # content of the second chunk

{'source': 'data/arxiv_paper.pdf', 'page': 0}
Uncertainty in Action: Confidence Elicitation in Embodied Agents
Tianjiao Yu, Vedant Shah, Muntasir Wahed, Kiet A. Nguyen, Adheesh Juvekar
Tal August, Ismini Lourentzou
University of Illinois Urbana-Champaign
{ty41,vrshah4,mwahed2,kietan2,adheesh2,taugust,lourent2}@illinois.edu
https://plan-lab.github.io/ece
Abstract
Expressing confidence is challenging for embod-
ied agents navigating dynamic multimodal en-
vironments, where uncertainty arises from both
perception and decision-making processes. We
present the first work investigating embodied con-
fidence elicitation in open-ended multimodal en-
vironments. We introduce Elicitation Policies,
which structure confidence assessment across
inductive, deductive, and abductive reasoning,
along with Execution Policies, which enhance
confidence calibration through scenario reinter-
pretation, action sampling, and hypothetical rea-
soning. Evaluating agents in calibration and fail-
ure prediction t
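
To see how `chunk_size` and `chunk_overlap` interact, here is a simplified sliding-window splitter. This is only a sketch of the overlap idea: the real `RecursiveCharacterTextSplitter` splits on a hierarchy of separators and merges the pieces, which this hypothetical `sliding_window_chunks` function ignores.

```python
from typing import List


def sliding_window_chunks(text: str, chunk_size: int, chunk_overlap: int) -> List[str]:
    """Cut text into fixed-size windows where each window repeats the tail of the previous one."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


sample = "abcdefghijklmnopqrstuvwxyz"
chunks = sliding_window_chunks(sample, chunk_size=10, chunk_overlap=4)
print(chunks)
```

Each chunk starts with the last four characters of the previous one, which is the same role the 200-character overlap plays in the notebook's configuration.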

In [4]:
from langchain_ollama import OllamaEmbeddings

# Create text embeddings with the "bge-m3" model
embeddings_model = OllamaEmbeddings(model="bge-m3")
# Embed the list of chunked documents
embeddings = embeddings_model.embed_documents([i.page_content for i in texts])

len(embeddings[0])  # check the dimension of the generated embedding vectors

1024

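### Storing Embedding Vectors with FAISS

FAISS (Facebook AI Similarity Search)
* A library for efficient similarity search over large collections of vectors.
* Used in document retrieval, recommendation systems, image search, and similar tasks.
* Compares vectors by similarity (or distance) to find the nearest documents quickly.

Building the FAISS vector store
* `FAISS.from_documents()` → embeds the text chunks and stores the vectors in FAISS.
* `embedding=embeddings_model` → uses Ollama's `bge-m3` model to generate the embeddings.

FAISS search methods
* `similarity_search(query, k=1)` → returns the k most similar documents.
* `similarity_search(query, k=10, filter={"page": 0})` → returns similar documents that satisfy a metadata condition (here, page=0).
* `similarity_search_with_score(query, k=10)` → returns the documents together with a score; the score is a distance, so lower means more similar.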
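
Conceptually, an exact nearest-neighbor search computes the distance from the query to every stored vector and keeps the k smallest. The sketch below does that with squared L2 distance in plain Python, assuming the store uses an exact (flat) L2 index; the function names and toy vectors are hypothetical, and FAISS itself performs this far more efficiently.

```python
from typing import List, Sequence, Tuple


def squared_l2(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def top_k(query: Sequence[float], vectors: Sequence[Sequence[float]], k: int) -> List[Tuple[int, float]]:
    """Return (index, distance) pairs for the k vectors closest to the query."""
    scored = [(i, squared_l2(query, v)) for i, v in enumerate(vectors)]
    scored.sort(key=lambda pair: pair[1])  # smaller distance = more similar
    return scored[:k]


# Toy 2-dimensional vectors (made-up values)
toy_vectors = [[0.0, 0.0], [1.0, 1.0], [3.0, 4.0]]
print(top_k([2.0, 2.0], toy_vectors, k=2))
```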

In [13]:
%pip install -qU langchain_community faiss-cpu



In [None]:
from langchain_community.vectorstores import FAISS

# Create the FAISS vector store (using OllamaEmbeddings)
# Build the store from the chunks (`texts`) and the embedding model (`embeddings_model`) created above
vector_store = FAISS.from_documents(
    documents=texts,
    embedding=embeddings_model  # Ollama embedding model
)

# Check the size of the vector store
print(f"Number of chunks: {len(texts)}")  # total number of chunks
print(f"Documents stored in the vector store: {vector_store.index.ntotal}")  # number of vectors in the FAISS index

Number of chunks: 99
Documents stored in the vector store: 99


In [None]:
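# Retrieve the single most similar document (basic search)
# The query is Korean for "What is an Embodied Agent?"
results = vector_store.similarity_search(query="Embodied Agent가 뭐야?", k=1)

# Print the retrieved documents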
for doc in results:
    print(f"* {doc.page_content} [{doc.metadata}]")

* Uncertainty in Action: Confidence Elicitation in Embodied Agents
Tianjiao Yu, Vedant Shah, Muntasir Wahed, Kiet A. Nguyen, Adheesh Juvekar
Tal August, Ismini Lourentzou
University of Illinois Urbana-Champaign
{ty41,vrshah4,mwahed2,kietan2,adheesh2,taugust,lourent2 }@illinois.edu
https://plan-lab.github.io/ece
Abstract
Expressing confidence is challenging for embod-
ied agents navigating dynamic multimodal en-
vironments, where uncertainty arises from both
perception and decision-making processes. We
present the first work investigating embodied con-
fidence elicitation in open-ended multimodal en-
vironments. We introduce Elicitation Policies,
which structure confidence assessment across
inductive, deductive, and abductive reasoning,
along with Execution Policies, which enhance
confidence calibration through scenario reinter-
pretation, action sampling, and hypothetical rea-
soning. Evaluating agents in calibration and fail-
ure prediction tasks within the Minecraft envi- [{'source':

In [None]:
# Filtered search restricted to one page (search only documents with page=0)
results = vector_store.similarity_search(query="Embodied Agent가 뭐야?", k=10, filter={"page": 0})

# Print the filtered documents
for doc in results:
    print(f"* {doc.page_content} [{doc.metadata}]")

* Uncertainty in Action: Confidence Elicitation in Embodied Agents
Tianjiao Yu, Vedant Shah, Muntasir Wahed, Kiet A. Nguyen, Adheesh Juvekar
Tal August, Ismini Lourentzou
University of Illinois Urbana-Champaign
{ty41,vrshah4,mwahed2,kietan2,adheesh2,taugust,lourent2 }@illinois.edu
https://plan-lab.github.io/ece
Abstract
Expressing confidence is challenging for embod-
ied agents navigating dynamic multimodal en-
vironments, where uncertainty arises from both
perception and decision-making processes. We
present the first work investigating embodied con-
fidence elicitation in open-ended multimodal en-
vironments. We introduce Elicitation Policies,
which structure confidence assessment across
inductive, deductive, and abductive reasoning,
along with Execution Policies, which enhance
confidence calibration through scenario reinter-
pretation, action sampling, and hypothetical rea-
soning. Evaluating agents in calibration and fail-
ure prediction tasks within the Minecraft envi- [{'source':
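
As far as the LangChain FAISS wrapper documents it, the metadata filter is applied after the vector search: a fixed number of nearest candidates (`fetch_k`, 20 by default) is fetched first, and only the matching ones are kept. The sketch below illustrates that post-filtering idea with hypothetical names and made-up data; it is not the library's implementation.

```python
from typing import Any, Dict, List


def filtered_search(
    scored_docs: List[Dict[str, Any]],
    metadata_filter: Dict[str, Any],
    k: int,
    fetch_k: int = 20,
) -> List[Dict[str, Any]]:
    """Take the fetch_k nearest candidates, then keep up to k whose metadata match the filter."""
    candidates = sorted(scored_docs, key=lambda d: d["distance"])[:fetch_k]
    matches = [
        d for d in candidates
        if all(d["metadata"].get(key) == value for key, value in metadata_filter.items())
    ]
    return matches[:k]


# Made-up candidates with precomputed distances
toy_results = [
    {"id": "a", "distance": 0.1, "metadata": {"page": 1}},
    {"id": "b", "distance": 0.2, "metadata": {"page": 0}},
    {"id": "c", "distance": 0.3, "metadata": {"page": 0}},
    {"id": "d", "distance": 0.4, "metadata": {"page": 0}},
]
print([d["id"] for d in filtered_search(toy_results, {"page": 0}, k=2, fetch_k=2)])
print([d["id"] for d in filtered_search(toy_results, {"page": 0}, k=2, fetch_k=4)])
```

One consequence of post-filtering is that a small `fetch_k` can return fewer than `k` results even when more matching documents exist, as the first call shows.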

In [None]:
# Search with scores (distance between the query and each document)
results = vector_store.similarity_search_with_score(query="Embodied Agent가 뭐야?", k=10)

print("The score here is the distance between the query and the document, so lower means more similar.\n")

# Print the retrieved documents with their scores
for doc, score in results:
    print(f"* [distance={score:.6f}] {doc.page_content[:100]} [{doc.metadata}]")
    print("-"*100)

The score here is the distance between the query and the document, so lower means more similar.

* [distance=0.848845] Uncertainty in Action: Confidence Elicitation in Embodied Agents
Tianjiao Yu, Vedant Shah, Muntasir  [{'source': 'data/arxiv_paper.pdf', 'page': 0}]
----------------------------------------------------------------------------------------------------
* [distance=0.876677] Confidence Elicitation in Embodied Agents
8. Impact Statement
This work advances Embodied AI by intr [{'source': 'data/arxiv_paper.pdf', 'page': 8}]
----------------------------------------------------------------------------------------------------
* [distance=0.892435] tend to yield improved confidence calibration. For instance,
MineLLM’s ECE achieves 0.32 and 0.30 pa [{'source': 'data/arxiv_paper.pdf', 'page': 5}]
----------------------------------------------------------------------------------------------------
* [distance=0.892554] Confidence Elicitation in Embodied Agents
ter quantify uncertainty and anticipate divergent outcomes [{'source': 'data/arxiv_paper.p

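### Chroma Vector DB


* Chroma (ChromaDB) is a vector database used for text retrieval, recommendation systems, and similar tasks.
* FAISS runs in memory, whereas Chroma also supports persistent storage (for example via the `persist_directory` argument).
* Chroma allows richer filtering and filter combinations at query time.

**Building the Chroma vector store**
* `Chroma(collection_name="test_01", embedding_function=embeddings_model)` → creates a collection named "test_01" and stores vectors produced by the Ollama embedding model.
* `add_documents(documents=texts, ids=ids)` → generates the embeddings and stores them in ChromaDB.

**Similarity search methods**
* `similarity_search(query, k=1)` → returns the k most similar documents.
* `similarity_search_with_score(query, k=5)` → also returns a score for each document; here the score is a distance, so lower means more similar.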


In [6]:
#%pip install -qU chromadb langchain-chroma


In [7]:
from langchain_chroma import Chroma
from uuid import uuid4

# Create the Chroma vector store
vector_store = Chroma(
    collection_name="test_01",  # name of the collection that will hold the data
    embedding_function=embeddings_model,  # use the Ollama embedding model
)

# Generate a unique ID per document with UUIDs
ids = [str(uuid4()) for _ in range(len(texts))]

# Add the documents to the vector store (embeddings are generated automatically)
vector_store.add_documents(documents=texts, ids=ids)

['e36a3340-2299-435e-bbe0-980b9f425a41',
 '927df71a-4ec1-4fc1-ad35-591db7f6b35e',
 'f7762f29-5604-4668-b6b8-1d2cbd262922',
 '1a0a1b08-0bc2-4a5d-be57-75e960e59062',
 '64c232a2-3f9e-4f33-91c6-5df21c6b82c6',
 '0479c143-2250-4a96-ae1f-05ae58d47566',
 'b838def6-ea76-4e22-a5ab-e53b82d778db',
 '02de8bab-82b0-476c-adcd-afba5eabd2b2',
 '64ded0df-cd27-4e41-9b89-615ab300fc41',
 '57f86230-afd3-43e1-ac63-7b84db5ad8ce',
 '468a59c9-aae2-492e-a1e8-727cb31d2d9e',
 '47aa8a17-3448-4f3b-abfe-7cb52a7dd9f7',
 'faba007d-fafd-4706-b66f-aae4b4ebcb66',
 '1034752d-086d-4301-809b-948df36c3421',
 'ee3a30d2-0c50-4404-b8b4-f0b58899de1f',
 'ce8162d5-2885-412d-ac62-1d2604d4f382',
 '621177aa-c822-4243-a2cf-f6e52f7216e8',
 '5f1e09fe-aa11-4487-9e77-6aacbbdcdf66',
 'cf823ad0-ee7d-42f5-85db-4e668a308d7e',
 'c9b8da22-1f75-4f66-b400-4e2afe2af8fd',
 'e71d1737-1dc2-47ac-8815-88cb88148ba7',
 '62ee0eca-8619-4d1f-884c-4b8624039a90',
 'e0993fa8-eda7-47ee-8a03-c477848c51a1',
 'df6774bf-85ed-467f-a684-af67f2cc8ce2',
 '818c8370-869c-
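
Random `uuid4()` IDs change on every run, so re-running the cell adds the same chunks again under new IDs. One possible alternative, sketched below, derives a stable ID from the chunk itself with `uuid5`. This is an assumption about what might be useful, not what the notebook does; `chunk_id` and the choice of fields are hypothetical, and how a store handles a repeated ID (overwrite or error) depends on the store.

```python
from uuid import NAMESPACE_URL, uuid5


def chunk_id(source: str, page: int, content: str) -> str:
    """Derive a deterministic UUID from a chunk's source, page, and text."""
    return str(uuid5(NAMESPACE_URL, f"{source}|{page}|{content}"))


# The same chunk always maps to the same ID; different text gives a different ID
first = chunk_id("data/arxiv_paper.pdf", 0, "Uncertainty in Action")
again = chunk_id("data/arxiv_paper.pdf", 0, "Uncertainty in Action")
other = chunk_id("data/arxiv_paper.pdf", 0, "Confidence Elicitation")
print(first == again, first == other)
```

In the notebook this could be used as `ids = [chunk_id(d.metadata["source"], d.metadata["page"], d.page_content) for d in texts]`.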

In [8]:
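# Retrieve the single most similar document (basic search)
# The query is Korean for "What is an Embodied_agent?"
results = vector_store.similarity_search(query="Embodied_agent가 뭐야?", k=1)

# Print the retrieved documents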
for doc in results:
    print(f"* {doc.page_content} [{doc.metadata}]")

* Uncertainty in Action: Confidence Elicitation in Embodied Agents
Tianjiao Yu, Vedant Shah, Muntasir Wahed, Kiet A. Nguyen, Adheesh Juvekar
Tal August, Ismini Lourentzou
University of Illinois Urbana-Champaign
{ty41,vrshah4,mwahed2,kietan2,adheesh2,taugust,lourent2}@illinois.edu
https://plan-lab.github.io/ece
Abstract
Expressing confidence is challenging for embod-
ied agents navigating dynamic multimodal en-
vironments, where uncertainty arises from both
perception and decision-making processes. We
present the first work investigating embodied con-
fidence elicitation in open-ended multimodal en-
vironments. We introduce Elicitation Policies,
which structure confidence assessment across
inductive, deductive, and abductive reasoning,
along with Execution Policies, which enhance
confidence calibration through scenario reinter-
pretation, action sampling, and hypothetical rea-
soning. Evaluating agents in calibration and fail-
ure prediction tasks within the Minecraft envi- [{'page': 0,

In [9]:
# Search with scores (distance between the query and each document)
results = vector_store.similarity_search_with_score(query="Embodied_agent가 뭐야?", k=5)

print("The score here is the distance between the query and the document, so lower means more similar.\n")

# Print the retrieved documents with their scores
for doc, score in results:
    print(f"* [distance={score:.3f}] {doc.page_content[:100]} [{doc.metadata}]")  # print only the first 100 characters
    print("-" * 100)

The score here is the distance between the query and the document, so lower means more similar.

* [distance=0.830] Uncertainty in Action: Confidence Elicitation in Embodied Agents
Tianjiao Yu, Vedant Shah, Muntasir  [{'page': 0, 'source': 'data/arxiv_paper.pdf'}]
----------------------------------------------------------------------------------------------------
* [distance=0.855] Confidence Elicitation in Embodied Agents
Malinin, A. and Gales, M. Uncertainty estimation in autore [{'page': 10, 'source': 'data/arxiv_paper.pdf'}]
----------------------------------------------------------------------------------------------------
* [distance=0.857] Confidence Elicitation in Embodied Agents
8. Impact Statement
This work advances Embodied AI by intr [{'page': 8, 'source': 'data/arxiv_paper.pdf'}]
----------------------------------------------------------------------------------------------------
* [distance=0.859] tend to yield improved confidence calibration. For instance,
MineLLM’s ECE achieves 0.32 and 0.30 pa [{'page': 5, 'source': 'data/arxiv_paper.p

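### Qdrant Vector DB

* Qdrant is a vector database that supports vector search and filtering.
* It can run in memory and also supports persistent storage.
* Compared with FAISS and Chroma, its metadata-based filtering is particularly expressive.

**Building the Qdrant vector store**
* `QdrantClient(":memory:")` → creates an in-memory vector store.
* `create_collection()` → creates a collection that stores 1024-dimensional vectors.
* `distance=Distance.COSINE` → compares vectors by cosine similarity to find the most similar documents.
  
**Qdrant search features**
* `similarity_search(query, k=1)` → returns the k most similar documents.
* `similarity_search_with_score(query, k=5)` → also returns the similarity score between each document and the query; with cosine, higher means more similar.
* `similarity_search(query, k=1, filter=...)` → returns only documents that satisfy the given condition.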

In [10]:
#%pip install -qU langchain_qdrant


In [11]:
# Import the Qdrant vector store from LangChain
from langchain_qdrant import QdrantVectorStore

# Import the Qdrant client library
from qdrant_client import QdrantClient
from qdrant_client.http.models import Distance, VectorParams

# Create the Qdrant client (in-memory)
# With ":memory:", Qdrant runs as a volatile in-memory database.
client = QdrantClient(":memory:")

# Create the Qdrant collection
client.create_collection(
    collection_name="test",  # collection name
    vectors_config=VectorParams(
        size=1024,  # vector dimension (must match the embedding model; bge-m3 produced 1024 above)
        distance=Distance.COSINE  # how similarity between vectors is measured (cosine)
    ),
)

# Create the Qdrant vector store object
vector_store = QdrantVectorStore(
    client=client,  # Qdrant client
    collection_name="test",  # collection name
    embedding=embeddings_model  # use the Ollama embedding model
)
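
Because the collection's `size` has to match the embedding dimension, one option is to probe the model once instead of hardcoding the number. The sketch below shows the idea with a stand-in embedder; `infer_vector_size` and `fake_embed_query` are hypothetical names used only for illustration.

```python
from typing import Callable, List


def infer_vector_size(embed_query: Callable[[str], List[float]], probe_text: str = "dimension probe") -> int:
    """Embed a short probe string and return the length of the resulting vector."""
    return len(embed_query(probe_text))


def fake_embed_query(text: str) -> List[float]:
    """Stand-in for a real embedding call; always returns a 1024-dimensional vector."""
    return [0.0] * 1024


print(infer_vector_size(fake_embed_query))
```

In the notebook, `infer_vector_size(embeddings_model.embed_query)` could supply the value passed to `VectorParams(size=...)`, at the cost of one extra embedding call.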

In [12]:
# Use UUIDs to generate document IDs
from uuid import uuid4

# Generate a unique ID for each document
ids = [str(uuid4()) for _ in range(len(texts))]

# Add the documents to the vector store
vector_store.add_documents(documents=texts, ids=ids)

['b840b5af-51c4-4073-b436-ed7e049b0bef',
 'e11116e7-534f-4b44-8895-d3cac16c0cc3',
 'bb0c0ad4-aa56-4859-a340-b1ba64601af4',
 '8df32e68-9a66-41c6-9083-ab60fa855898',
 '8605f401-379a-4099-bfb1-332b497b9a5a',
 'f13a3fd5-e5e4-468f-bab1-f4be3bb35cc9',
 '83088019-cc7c-492b-81c0-60bf2d27180f',
 'e41d6a63-a2b6-4e0a-9827-5a9d0f13f1a8',
 '83710cbf-5a27-42d3-9d5f-aed608eac089',
 '1be9b921-aeb4-41dc-a59a-e551171b756e',
 '207e508e-a8b8-495e-bbf4-45aaeac333c3',
 '018eef8c-7555-4c32-b8ba-32c06d1adcd4',
 '1f7f02d1-93ba-433f-a8d4-ea7f1df22c52',
 'ec5cda78-489f-4939-b945-80495f9e94fc',
 '6cd32a15-6c56-4e43-b892-df19da068eb0',
 '8a07c46d-c59f-4358-a1d7-50411fa5c927',
 'af88d6e9-da5e-4580-a647-eb2c37ca3cd7',
 'c48a8721-12dc-4051-8d0a-f00426c592e5',
 '68cbc027-852c-4f49-859e-233261e103e4',
 '644b20fe-1e86-4738-86f6-1696de289a82',
 '0a9b490f-b184-45a5-8733-297f1e1718c9',
 'de7da756-6634-4512-af78-6ec08e9b7b00',
 'e15fa4c4-d95d-4fea-ab5a-c022786482b7',
 'd95c8f44-a5cd-456a-b408-f8ed3fa9e961',
 '0434c7bf-fe92-

In [None]:
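# Retrieve the single most similar document (basic search)
# The query is Korean for "What is an Embodied_agent?"
results = vector_store.similarity_search(query="Embodied_agent가 뭐야?", k=1)

# Print the retrieved documents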
for doc in results:
    print(f"* {doc.page_content} [{doc.metadata}]")

* Uncertainty in Action: Confidence Elicitation in Embodied Agents
Tianjiao Yu, Vedant Shah, Muntasir Wahed, Kiet A. Nguyen, Adheesh Juvekar
Tal August, Ismini Lourentzou
University of Illinois Urbana-Champaign
{ty41,vrshah4,mwahed2,kietan2,adheesh2,taugust,lourent2 }@illinois.edu
https://plan-lab.github.io/ece
Abstract
Expressing confidence is challenging for embod-
ied agents navigating dynamic multimodal en-
vironments, where uncertainty arises from both
perception and decision-making processes. We
present the first work investigating embodied con-
fidence elicitation in open-ended multimodal en-
vironments. We introduce Elicitation Policies,
which structure confidence assessment across
inductive, deductive, and abductive reasoning,
along with Execution Policies, which enhance
confidence calibration through scenario reinter-
pretation, action sampling, and hypothetical rea-
soning. Evaluating agents in calibration and fail-
ure prediction tasks within the Minecraft envi- [{'source':

In [13]:
# Search with a filter condition applied
from qdrant_client.http import models

results = vector_store.similarity_search(
    query="thud",
    k=1,
    filter=models.Filter(must=[models.FieldCondition(
        key="metadata.page",  # filter on a payload field (here, page inside metadata)
        match=models.MatchValue(value=1),  # search only documents with page=1
    )])
)

# Print the filtered documents
for doc in results:
    print(f"* {doc.page_content} [{doc.metadata}]")

* which are increasingly prevalent in real-world applications
(Achiam et al., 2023; Touvron et al., 2023a). Additionally,
their free-form nature of outputs further complicates the
application of traditional methods. As a result, alternative
approaches have been proposed, including estimating un-
certainty by directly querying models for confidence scores
after generating responses (Xiong et al., 2024; Kadavath
et al., 2022; Lin et al., 2022a; Mielke et al., 2022; Chen &
Mueller, 2024). Despite these advancements, existing meth-
ods are not designed for embodied tasks, where confidence
elicitation must address the challenges of multimodal per-
ception, hierarchical reasoning and planning across various
open-ended tasks, as well as non-deterministic interactions.
LLM-based Embodied Agents. With the advent of lan-
guage models, leveraging their reasoning and planning abili-
ties to empower embodied agents has become quintessential
(Huang et al., 2023; Yao et al., 2023; Chen et al., 2023; 
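
The filter key is `metadata.page` rather than `page` because the LangChain Qdrant integration stores each document's metadata nested under a `metadata` field of the point payload by default. The sketch below mimics how a dotted key resolves against such a payload and how a `must` list requires every condition to hold; the helper names and payloads are hypothetical, and this is not Qdrant's implementation.

```python
from typing import Any, Dict


def get_by_path(payload: Dict[str, Any], dotted_key: str) -> Any:
    """Follow a dotted key such as 'metadata.page' through nested dictionaries."""
    current: Any = payload
    for part in dotted_key.split("."):
        if not isinstance(current, dict) or part not in current:
            return None
        current = current[part]
    return current


def matches_all(payload: Dict[str, Any], must: Dict[str, Any]) -> bool:
    """True only if every (key, value) condition matches, like a 'must' clause."""
    return all(get_by_path(payload, key) == value for key, value in must.items())


# Made-up payloads shaped like page_content plus nested metadata
toy_payloads = [
    {"page_content": "first page text", "metadata": {"page": 0, "source": "data/arxiv_paper.pdf"}},
    {"page_content": "second page text", "metadata": {"page": 1, "source": "data/arxiv_paper.pdf"}},
]
print([p["page_content"] for p in toy_payloads if matches_all(p, {"metadata.page": 1})])
```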

In [14]:
# Search with scores (similarity between the query and each document)
results = vector_store.similarity_search_with_score(query="Embodied_agent가 뭐야?", k=5)

print("The score here is the cosine similarity between the query and the document, so higher means more similar.\n")

# Print the retrieved documents with their scores
for doc, score in results:
    print(f"* [similarity={score:.3f}] {doc.page_content[:1000]} [{doc.metadata}]")  # print the first 1000 characters
    print("-" * 100)

The score here is the cosine similarity between the query and the document, so higher means more similar.

* [similarity=0.585] Uncertainty in Action: Confidence Elicitation in Embodied Agents
Tianjiao Yu, Vedant Shah, Muntasir Wahed, Kiet A. Nguyen, Adheesh Juvekar
Tal August, Ismini Lourentzou
University of Illinois Urbana-Champaign
{ty41,vrshah4,mwahed2,kietan2,adheesh2,taugust,lourent2}@illinois.edu
https://plan-lab.github.io/ece
Abstract
Expressing confidence is challenging for embod-
ied agents navigating dynamic multimodal en-
vironments, where uncertainty arises from both
perception and decision-making processes. We
present the first work investigating embodied con-
fidence elicitation in open-ended multimodal en-
vironments. We introduce Elicitation Policies,
which structure confidence assessment across
inductive, deductive, and abductive reasoning,
along with Execution Policies, which enhance
confidence calibration through scenario reinter-
pretation, action sampling, and hypothetical rea-
soning. Evaluating agents in calibration and fail
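
The Chroma and Qdrant scores for the same query and top document (0.830 and 0.585) look different but appear to describe the same geometry. For unit-length vectors, squared L2 distance and cosine similarity are related by d² = 2 − 2·cos. The sketch below applies that identity, assuming the embeddings are normalized to unit length and that Chroma reports squared L2 distance by default; both are assumptions, not something the notebook verifies.

```python
def cosine_from_squared_l2(squared_distance: float) -> float:
    """Convert a squared L2 distance between unit vectors to cosine similarity."""
    return 1.0 - squared_distance / 2.0


def squared_l2_from_cosine(cosine: float) -> float:
    """Convert cosine similarity between unit vectors to squared L2 distance."""
    return 2.0 - 2.0 * cosine


# Chroma's top score from the notebook output, converted to cosine similarity
print(round(cosine_from_squared_l2(0.830), 3))
```

Under those assumptions the Chroma distance of 0.830 corresponds to a cosine similarity of 0.585, which matches the Qdrant score. The FAISS scores above used a slightly different query string, so they are not directly comparable.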