150 changes: 149 additions & 1 deletion docs/howtos/customisations/aws-bedrock.ipynb
@@ -9,7 +9,10 @@
"\n",
"Amazon Bedrock is a fully managed service that makes FMs from leading AI startups and Amazon available via an API, so you can choose from a wide range of FMs to find the model that is best suited for your use case.\n",
"\n",
"This tutorial will show you how to use Amazon Bedrock endpoints and LangChain."
"This tutorial will show you how to use Amazon Bedrock with Ragas.\n",
"\n",
"1. [Metrics](#load-sample-dataset)\n",
"2. [Testset generation](#test-data-generation)"
]
},
{
@@ -22,6 +25,14 @@
":::"
]
},
{
"cell_type": "markdown",
"id": "f466494a",
"metadata": {},
"source": [
"## Metrics"
]
},
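{
"cell_type": "markdown",
"id": "b3d0c0de",
"metadata": {},
"source": [
"The metrics below use `bedrock_model` and `bedrock_embeddings`, the LangChain Bedrock chat and embedding models configured for this notebook. A minimal sketch of that configuration (the profile name, region, and model ids are illustrative placeholders, adjust them for your account) could look like this:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "c4e1d1ef",
"metadata": {},
"outputs": [],
"source": [
"from langchain_community.chat_models import BedrockChat\n",
"from langchain_community.embeddings import BedrockEmbeddings\n",
"\n",
"# placeholder configuration -- set the profile, region and model ids for your account\n",
"config = {\n",
"    \"credentials_profile_name\": \"default\",\n",
"    \"region_name\": \"us-east-1\",\n",
"    \"model_id\": \"anthropic.claude-v2\",\n",
"    \"embeddings_model_id\": \"amazon.titan-embed-text-v1\",\n",
"}\n",
"\n",
"bedrock_model = BedrockChat(\n",
"    credentials_profile_name=config[\"credentials_profile_name\"],\n",
"    region_name=config[\"region_name\"],\n",
"    model_id=config[\"model_id\"],\n",
"    model_kwargs={\"temperature\": 0.4},\n",
")\n",
"\n",
"bedrock_embeddings = BedrockEmbeddings(\n",
"    credentials_profile_name=config[\"credentials_profile_name\"],\n",
"    region_name=config[\"region_name\"],\n",
"    model_id=config[\"embeddings_model_id\"],\n",
")"
]
},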
{
"cell_type": "markdown",
"id": "e54b5e01",
@@ -330,6 +341,143 @@
"df.head()"
]
},
{
"cell_type": "markdown",
"id": "b133aff0",
"metadata": {},
"source": [
"## Test Data Generation"
]
},
{
"cell_type": "markdown",
"id": "4c7192f2",
"metadata": {},
"source": [
"Load the documents using desired dataloader."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "529266ad",
"metadata": {},
"outputs": [],
"source": [
"from langchain_community.document_loaders import UnstructuredURLLoader\n",
"\n",
"urls = [\n",
" \"https://www.understandingwar.org/backgrounder/russian-offensive-campaign-assessment-february-8-2023\",\n",
" \"https://www.understandingwar.org/backgrounder/russian-offensive-campaign-assessment-february-9-2023\",\n",
"]\n",
"loader = UnstructuredURLLoader(urls=urls)\n",
"documents = loader.load()"
]
},
{
"cell_type": "markdown",
"id": "87587749",
"metadata": {},
"source": [
"now we have documents created in the form of langchain `Document`\n",
"Next step is to wrap the embedding and llm model into ragas schema."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "1d5eaed2",
"metadata": {},
"outputs": [],
"source": [
"from ragas.llms import LangchainLLMWrapper\n",
"from ragas.embeddings.base import LangchainEmbeddingsWrapper\n",
"\n",
"bedrock_model = LangchainLLMWrapper(bedrock_model)\n",
"bedrock_embeddings = LangchainEmbeddingsWrapper(bedrock_embeddings)"
]
},
{
"cell_type": "markdown",
"id": "d7d17468",
"metadata": {},
"source": [
"Next Step is to create chunks from the documents and store the chunks `InMemoryDocumentStore`"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "4e717c13",
"metadata": {},
"outputs": [],
"source": [
"from ragas.testset.extractor import KeyphraseExtractor\n",
"from langchain.text_splitter import TokenTextSplitter\n",
"from ragas.testset.docstore import InMemoryDocumentStore\n",
"\n",
"splitter = TokenTextSplitter(chunk_size=1000, chunk_overlap=100)\n",
"keyphrase_extractor = KeyphraseExtractor(llm=bedrock_model)\n",
"\n",
"docstore = InMemoryDocumentStore(\n",
" splitter=splitter,\n",
" embeddings=bedrock_embeddings,\n",
" extractor=keyphrase_extractor,\n",
")"
]
},
{
"cell_type": "markdown",
"id": "7773f4b5",
"metadata": {},
"source": [
"Initializing `TestsetGenerator` with required arguments and generating data"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "495ff805",
"metadata": {},
"outputs": [],
"source": [
"from ragas.testset import TestsetGenerator\n",
"from ragas.testset.evolutions import simple, reasoning, multi_context\n",
"\n",
"test_generator = TestsetGenerator(\n",
" generator_llm=bedrock_model,\n",
" critic_llm=bedrock_model,\n",
" embeddings=bedrock_embeddings,\n",
" docstore=docstore,\n",
")\n",
"\n",
"distributions = {simple: 0.5, reasoning: 0.25, multi_context: 0.25}\n",
"\n",
"# use generator.generate_with_llamaindex_docs if you use llama-index as document loader\n",
"testset = test_generator.generate_with_langchain_docs(\n",
" documents=documents, test_size=10, distributions=distributions\n",
")"
]
},
{
"cell_type": "markdown",
"id": "8a80046b",
"metadata": {},
"source": [
"Export the results into pandas¶"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "0b4633c8",
"metadata": {},
"outputs": [],
"source": [
"test_df = testset.to_pandas()\n",
"test_df.head()"
]
},
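{
"cell_type": "markdown",
"id": "9f2a7c31",
"metadata": {},
"source": [
"If you want to reuse the generated test set in later evaluation runs, the DataFrame can be persisted with standard pandas methods, for example (the file name is just an illustration):"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "5d8e6b42",
"metadata": {},
"outputs": [],
"source": [
"# save the generated test set so it can be reloaded for later evaluations\n",
"test_df.to_csv(\"bedrock_testset.csv\", index=False)"
]
},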
{
"cell_type": "markdown",
"id": "f668fce1",