In [7]:
!pip install langchain==0.2.16 langchain_core==0.2.38 langchain_community==0.2.16 pymupdf openai 
!pip install langchain_openai==0.1.25 langchain-qdrant qdrant_client asyncio ragas==0.1.14

Collecting langchain_openai==0.1.25
  Downloading langchain_openai-0.1.25-py3-none-any.whl.metadata (2.6 kB)
Collecting ragas==0.1.14
  Downloading ragas-0.1.14-py3-none-any.whl.metadata (5.3 kB)
Collecting langchain-core<0.3.0,>=0.2.40 (from langchain_openai==0.1.25)
  Downloading langchain_core-0.2.40-py3-none-any.whl.metadata (6.2 kB)
Downloading langchain_openai-0.1.25-py3-none-any.whl (51 kB)
Downloading ragas-0.1.14-py3-none-any.whl (163 kB)
Downloading langchain_core-0.2.40-py3-none-any.whl (396 kB)
Installing collected packages: langchain-core, langchain_openai, ragas
  Attempting uninstall: langchain-core
    Found existing installation: langchain-core 0.2.38
    Uninstalling langchain-core-0.2.38:
      Successfully uninstalled langchain-core-0.2.38
  Attempting uninstall: langchain_openai
    Found existing installation: langchain-openai 0.2.0
    Uninstalling langchain-openai-0.2.0:
      Successfully uninstalled langchain-openai-0.2.0
  Attempting uninstall: ragas
    Foun

**Step 1: Download and chunk the data**

We are going to use the following docs as our knowledge base:
1. Blueprint for an AI Bill of Rights: Making Automated Systems Work for the American People (PDF)
2. National Institute of Standards and Technology (NIST) Artificial Intelligent Risk Management Framework 

Let's start with a simple fixed chunking strategy as a baseline, and later evaluate parent-doc retrieval if we have time

In [8]:
# define constants
CHUNK_SIZE = 1500
OVERLAP = 150

RAGAS_CHUNK_SIZE = 750
RAGAS_OVERLAP = 75

GENERATOR_LLM = "gpt-4o-mini-2024-07-18"
CRITIC_LLM = "gpt-4o-2024-08-06"

N_EVAL_QUESTIONS = 50

PDFS = [
    "https://www.whitehouse.gov/wp-content/uploads/2022/10/Blueprint-for-an-AI-Bill-of-Rights.pdf",
    "https://nvlpubs.nist.gov/nistpubs/ai/NIST.AI.600-1.pdf"
]

In [9]:
import os
import openai
from getpass import getpass

# collect OpenAI key
openai.api_key = getpass("OpenAI API Key: ")
os.environ["OPENAI_API_KEY"] = openai.api_key

In [10]:
import importlib
import vanilla_rag

importlib.reload(vanilla_rag)
for pdf in PDFS:
    chunks = await vanilla_rag.load_and_chunk_pdf(pdf,CHUNK_SIZE,OVERLAP)


Loading https://www.whitehouse.gov/wp-content/uploads/2022/10/Blueprint-for-an-AI-Bill-of-Rights.pdf...
Chunking...
Loading https://nvlpubs.nist.gov/nistpubs/ai/NIST.AI.600-1.pdf...
Chunking...


In [5]:
print(chunks[100])

page_content='GAI resources; Apply organizational risk tolerances to ﬁne-tuned third-party 
models; Apply organizational risk tolerance to existing third-party models 
adapted to a new domain; Reassess risk measurements after ﬁne-tuning third-
party GAI models. 
Value Chain and Component 
Integration; Intellectual Property 
MG-3.1-002 
Test GAI system value chain risks (e.g., data poisoning, malware, other software 
and hardware vulnerabilities; labor practices; data privacy and localization 
compliance; geopolitical alignment). 
Data Privacy; Information Security; 
Value Chain and Component 
Integration; Harmful Bias and 
Homogenization 
MG-3.1-003 
Re-assess model risks after ﬁne-tuning or retrieval-augmented generation 
implementation and for any third-party GAI models deployed for applications 
and/or use cases that were not evaluated in initial testing. 
Value Chain and Component 
Integration 
MG-3.1-004 
Take reasonable measures to review training data for CBRN information, and 


**Step 2: Basic RAG Pipeline**

In [8]:
importlib.reload(vanilla_rag)
rag_chain = await vanilla_rag.vanilla_rag_chain(chunks, openai.api_key, "AI-Risk")

created qdrant client
populated vector db
created chain


In [10]:
from pprint import pprint
response = await rag_chain.ainvoke({"input":"What are some key risks associated with modern LLMs?"})
pprint(response)

{'context': [Document(metadata={'_id': 'f9c13262335345adba20b565821c7ce7', '_collection_name': 'AI-Risk'}, page_content='with greater ease and scale than other technologies. LLMs have been reported to generate dangerous or \nviolent recommendations, and some models have generated actionable instructions for dangerous or \n \n \n9 Confabulations of falsehoods are most commonly a problem for text-based outputs; for audio, image, or video \ncontent, creative generation of non-factual content can be a desired behavior.  \n10 For example, legal confabulations have been shown to be pervasive in current state-of-the-art LLMs. See also, \ne.g.,'),
             Document(metadata={'_id': '874f59b2faa24d92bec35afe72b262c3', '_collection_name': 'AI-Risk'}, page_content='development, production, or use of CBRN weapons or other dangerous materials or agents. While \nrelevant biological and chemical threat knowledge and information is often publicly accessible, LLMs \ncould facilitate its analysis or

**Step 3: Generate synthetic data**

In [6]:
from ragas.testset.generator import TestsetGenerator
from ragas.testset.evolutions import simple, reasoning, multi_context, conditional
from langchain_openai import ChatOpenAI, OpenAIEmbeddings

generator_llm = ChatOpenAI(model=GENERATOR_LLM)
critic_llm = ChatOpenAI(model=CRITIC_LLM)
embeddings = OpenAIEmbeddings()

generator = TestsetGenerator.from_langchain(
    generator_llm,
    critic_llm,
    embeddings
)

distributions = {
    simple: 0.5,
    multi_context: 0.3,
    reasoning: 0.1,
    conditional: 0.1
}




For example, replace imports like: `from langchain_core.pydantic_v1 import BaseModel`
with: `from pydantic import BaseModel`
or the v1 compatibility namespace if you are working in a code base that has not been fully upgraded to pydantic 2 yet. 	from pydantic.v1 import BaseModel

  from ragas.metrics._answer_correctness import AnswerCorrectness, answer_correctness
  from .autonotebook import tqdm as notebook_tqdm

For example, replace imports like: `from langchain.pydantic_v1 import BaseModel`
with: `from pydantic import BaseModel`
or the v1 compatibility namespace if you are working in a code base that has not been fully upgraded to pydantic 2 yet. 	from pydantic.v1 import BaseModel

  from ragas.metrics._context_entities_recall import (


In [7]:
importlib.reload(vanilla_rag)
for pdf in PDFS:
    ragas_chunks = await vanilla_rag.load_and_chunk_pdf(pdf,RAGAS_CHUNK_SIZE,RAGAS_OVERLAP)

testset = generator.generate_with_langchain_docs(ragas_chunks, N_EVAL_QUESTIONS, distributions, with_debugging_logs=True)

                                   

ExceptionInRunner: The runner thread which was running the jobs raised an exeception. Read the traceback above to debug it. You can also pass `raise_exceptions=False` incase you want to show only a warning message instead.