In [1]:
%pip install llama-index-embeddings-huggingface

Note: you may need to restart the kernel to use updated packages.
Collecting llama-index-embeddings-huggingface

  Downloading llama_index_embeddings_huggingface-0.5.1-py3-none-any.whl.metadata (767 bytes)
Collecting sentence-transformers>=2.6.1 (from llama-index-embeddings-huggingface)
  Downloading sentence_transformers-3.4.1-py3-none-any.whl.metadata (10 kB)
Collecting transformers<5.0.0,>=4.41.0 (from sentence-transformers>=2.6.1->llama-index-embeddings-huggingface)
  Downloading transformers-4.49.0-py3-none-any.whl.metadata (44 kB)
Collecting torch>=1.11.0 (from sentence-transformers>=2.6.1->llama-index-embeddings-huggingface)
  Downloading torch-2.6.0-cp313-cp313-win_amd64.whl.metadata (28 kB)
Collecting scikit-learn (from sentence-transformers>=2.6.1->llama-index-embeddings-huggingface)
  Downloading scikit_learn-1.6.1-cp313-cp313-win_amd64.whl.metadata (15 kB)
Collecting scipy (from sentence-transformers>=2.6.1->llama-index-embeddings-huggingface)
  Downloading scipy-1.15.2-cp31




In [1]:
from llama_index.core import Settings, VectorStoreIndex, SimpleDirectoryReader
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
from llama_index.llms.ollama import Ollama

# Load environment variables from a .env file
import os
from dotenv import load_dotenv

load_dotenv()



True

In [7]:
# Configure the Ollama LLM
llm = Ollama(
    model="deepseek-r1:8b",  # Use the deepseek-r1:8b model served by Ollama
    temperature=0.1,  # Controls response creativity (closer to 0 gives more consistent responses)
    context_window=4096,  # Context window size
    timeout=120,  # Set the timeout to 120 seconds
    request_timeout=120,  # Set the request timeout to 120 seconds as well
    additional_kwargs={  # Additional parameters
        "num_gpu": 0,  # Run on CPU
        "seed": 42,  # Fixed seed for reproducibility
    },
)

# Configure the BGE embedding model
embed_model = HuggingFaceEmbedding(
    model_name="BAAI/bge-m3", device="cpu"  # Run on CPU; use "cuda" for GPU
)

In [4]:
# Configure the global Settings (the replacement for the deprecated ServiceContext)
Settings.llm = llm
Settings.embed_model = embed_model

# Load the PDF document
documents = SimpleDirectoryReader(
    input_files=["../data/pdf_sample1/240828_(AI리포트)_미국의_인공지능(AI)_정책,전략.pdf"],
    filename_as_id=True).load_data()

In [5]:
# Build the index
index = VectorStoreIndex.from_documents(documents)

# Create the query engine
query_engine = index.as_query_engine(
    streaming=True,  # Enable streaming responses
    similarity_top_k=3,  # Use the 3 most relevant document chunks
)
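
The `similarity_top_k=3` setting above can be pictured with a small standalone sketch: score each chunk embedding against the query embedding and keep the three best. This is an illustration only, not LlamaIndex's internal code. It assumes cosine similarity as the ranking metric, and the helper names (`cosine_similarity`, `top_k_chunks`) and the toy vectors are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # Treat a zero vector as having no similarity
    return dot / (norm_a * norm_b)


def top_k_chunks(query_vec, chunk_vecs, k=3):
    """Return (index, score) pairs for the k chunks most similar to the query."""
    scored = [(i, cosine_similarity(query_vec, v)) for i, v in enumerate(chunk_vecs)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy 3-dimensional "embeddings" (hypothetical); real embedding vectors are much larger.
toy_chunks = [
    [1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]
toy_query = [1.0, 0.05, 0.0]

top = top_k_chunks(toy_query, toy_chunks, k=3)
print(top)
```

A larger `k` gives the LLM more context at the cost of a longer prompt, which matters with the 4096-token context window configured earlier.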

In [6]:
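# Ask a question. The query is kept in Korean to match the source PDF. In English:
# "Tell me who said they believe the AI Safety Institute has an important role in
# making U.S. leadership in responsible AI development clear, and who that person
# said they would collaborate with."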
response = query_engine.query(
    """
    책임 있는 AI 개발에서 미국의 리더십을 명확히 하는 데에
    AI안전 연구소가 맡은 중요한 역할이 있다고 생각한다고 말한 사람과 
    그 사람이 누구와 협력한다고 했는지 알려줘
    """
)
print(response)

<think>
Okay, I need to figure out who the person is that mentioned the important role of the AI Safety Research Institute in establishing U.S. leadership in responsible AI development and with whom they are collaborating.

First, looking at the context provided, there's a mention of OpenAI's Chief Security Officer (CSO) saying that the AI Safety Research Institute plays an important role in making the U.S. leadership clear in responsible AI development. So, the person in question is likely this CSO from OpenAI.

Next, I need to determine who they are collaborating with. The context mentions that OpenAI, Anthropic, and Microsoft have joined forces with the U.S. AI Safety Research Institute (USAISI) for rigorous testing before deploying AI models widely. So, the collaborations involve these companies working alongside USAISI.

Putting it together, the person is the CSO from OpenAI, and they are collaborating with OpenAI, Anthropic, Microsoft, and USAISI.
</think>

The individual in ques
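
The output above starts with a `<think>...</think>` block containing the model's reasoning, followed by the final answer. A minimal sketch for separating the two, assuming the reasoning is always wrapped in literal `<think>` tags as it is in this output; `split_reasoning` is a hypothetical helper, not part of LlamaIndex or Ollama.

```python
import re

# Non-greedy match so that multiple <think> blocks are handled separately
_THINK_BLOCK = re.compile(r"<think>.*?</think>", re.DOTALL)


def split_reasoning(text):
    """Return (reasoning, answer) from model output that may contain <think> blocks."""
    reasoning = "\n".join(
        m.group(0)[len("<think>"):-len("</think>")].strip()
        for m in _THINK_BLOCK.finditer(text)
    )
    answer = _THINK_BLOCK.sub("", text).strip()
    return reasoning, answer


# Hypothetical sample text shaped like the output above
sample = "<think>\nsome reasoning\n</think>\n\nFinal answer."
reasoning, answer = split_reasoning(sample)
print(answer)
```

In the notebook this could be applied as `split_reasoning(str(response))`, assuming the streamed response converts to its full text with `str()`.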