Arcee.ai LLM & Retriever integration #11579

Merged 7 commits on Oct 10, 2023
146 changes: 146 additions & 0 deletions docs/docs_skeleton/docs/integrations/llms/arcee.ipynb
@@ -0,0 +1,146 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Arcee\n",
"This notebook demonstrates how to use the `Arcee` class to generate text with Arcee's Domain Adapted Language Models (DALMs)."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Setup\n",
"\n",
"Before using Arcee, make sure the Arcee API key is set as the `ARCEE_API_KEY` environment variable. You can also pass the API key as the named parameter `arcee_api_key`."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from langchain.llms import Arcee\n",
"\n",
"# Create an instance of the Arcee class\n",
"arcee = Arcee(\n",
"    model=\"DALM-PubMed\",\n",
"    # arcee_api_key=\"ARCEE-API-KEY\"  # if not already set in the environment\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Additional Configuration\n",
"\n",
"You can also configure Arcee's parameters, such as `arcee_api_url`, `arcee_app_url`, and `model_kwargs`, as needed.\n",
"Setting `model_kwargs` at initialization makes those parameters the defaults for all subsequent generation calls."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"arcee = Arcee(\n",
"    model=\"DALM-Patent\",\n",
"    # arcee_api_key=\"ARCEE-API-KEY\",  # if not already set in the environment\n",
"    arcee_api_url=\"https://custom-api.arcee.ai\",  # default is https://api.arcee.ai\n",
"    arcee_app_url=\"https://custom-app.arcee.ai\",  # default is https://app.arcee.ai\n",
"    model_kwargs={\n",
"        \"size\": 5,\n",
"        \"filters\": [\n",
"            {\n",
"                \"field_name\": \"document\",\n",
"                \"filter_type\": \"fuzzy_search\",\n",
"                \"value\": \"Einstein\"\n",
"            }\n",
"        ]\n",
"    }\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Generating Text\n",
"\n",
"You can generate text from Arcee by providing a prompt. Here's an example:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Generate text\n",
"prompt = \"Can AI-driven music therapy contribute to the rehabilitation of patients with disorders of consciousness?\"\n",
"response = arcee(prompt)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Additional Parameters\n",
"\n",
"Arcee lets you apply `filters` and set `size`, the number of documents retrieved to aid text generation. Filters narrow down the results. Here's how to use these parameters:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Define filters\n",
"filters = [\n",
"    {\n",
"        \"field_name\": \"document\",\n",
"        \"filter_type\": \"fuzzy_search\",\n",
"        \"value\": \"Einstein\"\n",
"    },\n",
"    {\n",
"        \"field_name\": \"year\",\n",
"        \"filter_type\": \"strict_search\",\n",
"        \"value\": \"1905\"\n",
"    }\n",
"]\n",
"\n",
"# Generate text with filters and size params\n",
"response = arcee(prompt, size=5, filters=filters)"
]
}
],
"metadata": {
"kernelspec": {
"display_name": ".venv",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.12"
}
},
"nbformat": 4,
"nbformat_minor": 2
}
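The notebook above states that `model_kwargs` set at initialization act as defaults for later calls, while `size` and `filters` can also be passed per call. The following is a minimal sketch of that precedence, assuming per-call values override the defaults. `merge_kwargs` is a hypothetical helper written for illustration; it is not part of the `Arcee` class, and the actual merge logic is not shown in this diff.

```python
from typing import Any, Dict, Optional


def merge_kwargs(defaults: Optional[Dict[str, Any]], **call_kwargs: Any) -> Dict[str, Any]:
    """Overlay per-call keyword arguments on top of default parameters.

    Hypothetical helper: assumes per-call values take precedence over defaults.
    """
    merged = dict(defaults or {})  # shallow copy so the defaults are never mutated
    merged.update(call_kwargs)     # per-call values win on key conflicts
    return merged


# Defaults shaped like the `model_kwargs` in the notebook
defaults = {
    "size": 5,
    "filters": [
        {"field_name": "document", "filter_type": "fuzzy_search", "value": "Einstein"}
    ],
}

# A call that overrides only `size`; `filters` falls back to the default
per_call = merge_kwargs(defaults, size=2)
```

Under this assumption, a call that passes no extra arguments would simply use the initialization-time defaults unchanged.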
141 changes: 141 additions & 0 deletions docs/docs_skeleton/docs/integrations/retrievers/arcee.ipynb
@@ -0,0 +1,141 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Arcee Retriever\n",
"This notebook demonstrates how to use the `ArceeRetriever` class to retrieve relevant documents for Arcee's Domain Adapted Language Models (DALMs)."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Setup\n",
"\n",
"Before using `ArceeRetriever`, make sure the Arcee API key is set as the `ARCEE_API_KEY` environment variable. You can also pass the API key as the named parameter `arcee_api_key`."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from langchain.retrievers import ArceeRetriever\n",
"\n",
"retriever = ArceeRetriever(\n",
"    model=\"DALM-PubMed\",\n",
"    # arcee_api_key=\"ARCEE-API-KEY\"  # if not already set in the environment\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Additional Configuration\n",
"\n",
"You can also configure `ArceeRetriever`'s parameters, such as `arcee_api_url`, `arcee_app_url`, and `model_kwargs`, as needed.\n",
"Setting `model_kwargs` at initialization makes its filters and size the defaults for all subsequent retrievals."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"retriever = ArceeRetriever(\n",
"    model=\"DALM-PubMed\",\n",
"    # arcee_api_key=\"ARCEE-API-KEY\",  # if not already set in the environment\n",
"    arcee_api_url=\"https://custom-api.arcee.ai\",  # default is https://api.arcee.ai\n",
"    arcee_app_url=\"https://custom-app.arcee.ai\",  # default is https://app.arcee.ai\n",
"    model_kwargs={\n",
"        \"size\": 5,\n",
"        \"filters\": [\n",
"            {\n",
"                \"field_name\": \"document\",\n",
"                \"filter_type\": \"fuzzy_search\",\n",
"                \"value\": \"Einstein\"\n",
"            }\n",
"        ]\n",
"    }\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Retrieving Documents\n",
"\n",
"You can retrieve relevant documents from uploaded contexts by providing a query. Here's an example:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"query = \"Can AI-driven music therapy contribute to the rehabilitation of patients with disorders of consciousness?\"\n",
"documents = retriever.get_relevant_documents(query=query)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Additional Parameters\n",
"\n",
"Arcee lets you apply `filters` and set `size`, the number of documents to retrieve. Filters narrow down the results. Here's how to use these parameters:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Define filters\n",
"filters = [\n",
"    {\n",
"        \"field_name\": \"document\",\n",
"        \"filter_type\": \"fuzzy_search\",\n",
"        \"value\": \"Music\"\n",
"    },\n",
"    {\n",
"        \"field_name\": \"year\",\n",
"        \"filter_type\": \"strict_search\",\n",
"        \"value\": \"1905\"\n",
"    }\n",
"]\n",
"\n",
"# Retrieve documents with filters and size params\n",
"documents = retriever.get_relevant_documents(query=query, size=5, filters=filters)"
]
}
],
"metadata": {
"kernelspec": {
"display_name": ".venv",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.12"
}
},
"nbformat": 4,
"nbformat_minor": 2
}
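Both notebooks pass filters as dictionaries with the keys `field_name`, `filter_type`, and `value`. The sketch below shows one way to build such a list while catching typos in the filter type early. `make_filter` and `KNOWN_FILTER_TYPES` are hypothetical names introduced for illustration; only the two filter types that appear in the notebooks are listed, and the full set accepted by Arcee is not shown in this diff.

```python
from typing import Dict, List

# Filter types that appear in the notebooks; this is an assumption, not an
# exhaustive list of what the Arcee API accepts.
KNOWN_FILTER_TYPES = {"fuzzy_search", "strict_search"}


def make_filter(field_name: str, filter_type: str, value: str) -> Dict[str, str]:
    """Build one filter dictionary in the shape used by the notebooks."""
    if filter_type not in KNOWN_FILTER_TYPES:
        raise ValueError(f"Unknown filter_type: {filter_type!r}")
    return {"field_name": field_name, "filter_type": filter_type, "value": value}


# Same filters as the retriever notebook's last cell
filters: List[Dict[str, str]] = [
    make_filter("document", "fuzzy_search", "Music"),
    make_filter("year", "strict_search", "1905"),
]
```

The resulting list has the same shape as the `filters` argument passed to `retriever.get_relevant_documents(...)` in the notebook.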
10 changes: 10 additions & 0 deletions libs/langchain/langchain/llms/__init__.py
@@ -52,6 +52,12 @@ def _import_anyscale() -> Any:
    return Anyscale


def _import_arcee() -> Any:
    from langchain.llms.arcee import Arcee

    return Arcee


def _import_aviary() -> Any:
    from langchain.llms.aviary import Aviary

@@ -479,6 +485,8 @@ def __getattr__(name: str) -> Any:
        return _import_anthropic()
    elif name == "Anyscale":
        return _import_anyscale()
    elif name == "Arcee":
        return _import_arcee()
    elif name == "Aviary":
        return _import_aviary()
    elif name == "AzureMLOnlineEndpoint":
@@ -633,6 +641,7 @@ def __getattr__(name: str) -> Any:
    "AmazonAPIGateway",
    "Anthropic",
    "Anyscale",
    "Arcee",
    "Aviary",
    "AzureMLOnlineEndpoint",
    "AzureOpenAI",
@@ -713,6 +722,7 @@ def get_type_to_cls_dict() -> Dict[str, Callable[[], Type[BaseLLM]]]:
        "amazon_bedrock": _import_bedrock,
        "anthropic": _import_anthropic,
        "anyscale": _import_anyscale,
        "arcee": _import_arcee,
        "aviary": _import_aviary,
        "azure": _import_azure_openai,
        "azureml_endpoint": _import_azureml_endpoint,
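
The `__init__.py` changes above register `Arcee` through a module-level `__getattr__`, so the `langchain.llms.arcee` module is imported only when the name is first requested. The sketch below illustrates that lazy-import pattern (module `__getattr__`, PEP 562) in a self-contained form. The stand-in module `lazy_mod`, the helper `_lazy_getattr`, and the use of `decimal.Decimal` as the deferred import are assumptions made so the example runs without LangChain installed.

```python
import types
from typing import Any


def _import_decimal() -> Any:
    # Deferred import: executed only when the attribute is requested
    from decimal import Decimal

    return Decimal


def _lazy_getattr(name: str) -> Any:
    # Called by Python only when normal attribute lookup on the module fails
    if name == "Decimal":
        return _import_decimal()
    raise AttributeError(f"Could not find: {name}")


# Stand-in module built at runtime so the example is self-contained
lazy_mod = types.ModuleType("lazy_mod")
lazy_mod.__getattr__ = _lazy_getattr

cls = lazy_mod.Decimal  # triggers _lazy_getattr("Decimal")
```

In the real package the same function would be defined directly in `__init__.py` rather than attached to a runtime-created module; names that are not registered raise `AttributeError` as usual.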