# Benchmark All

Here, we'll run benchmarks against all tool usage tasks.

To benchmark additional models, add their names to the `models` list.
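Each model added to the list adds one evaluation run per task. The sketch below shows how the set of runs grows, assuming the project naming scheme used in the benchmarking loop later in this notebook. `plan_runs` is a hypothetical helper written for illustration, and `"gpt-4"` is only an example of an extra model name; neither appears in the original notebook.

```python
from itertools import product


def plan_runs(task_names, models, date, experiment_id):
    """Return one project name per (task, model) pair.

    Hypothetical helper: it mirrors the f-string naming used in the
    benchmarking loop, but it does not call LangSmith or any model.
    """
    runs = []
    for task_name, model in product(task_names, models):
        dataset_name = f"{task_name}_benchmarking_{date}"
        runs.append(f"{dataset_name}-{model}-{experiment_id}")
    return runs


# Illustrative inputs; "gpt-4" is an assumed addition to the models list.
example_models = ["gpt-3.5-turbo-16k", "gpt-4"]
example_tasks = [
    "Tool Usage - Typewriter (1 tool)",
    "Tool Usage - Typewriter (26 tools)",
]

for name in plan_runs(example_tasks, example_models, "2023-12-12", "2710"):
    print(name)
```

With two tasks and two models this lists four projects, so both the runtime and the API cost scale with the length of the `models` list.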

In [None]:
import datetime
import uuid

from langsmith.client import Client

from langchain_benchmarks import clone_public_dataset, registry
from langchain_benchmarks.tool_usage import agents
from langchain_benchmarks.rate_limiting import RateLimiter, with_rate_limit

In [None]:
requests_per_minute = 50
rate_limiter = RateLimiter(requests_per_second=requests_per_minute / 60)
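The cell above converts a per-minute budget into a per-second rate: 50 requests per minute is about 0.83 requests per second, or one request roughly every 1.2 seconds. To make that arithmetic concrete, here is a minimal sketch of an interval-based limiter. `IntervalRateLimiter` is a hypothetical class written for illustration and is not the `RateLimiter` from `langchain_benchmarks`, whose internals may differ. The `clock` and `sleep` parameters are assumptions added so the behaviour can be checked without actually waiting.

```python
import time


class IntervalRateLimiter:
    """Illustrative limiter that spaces requests at least 1/rate seconds apart."""

    def __init__(self, requests_per_second, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 1.0 / requests_per_second
        self._clock = clock
        self._sleep = sleep
        # No request has been made yet, so the first one is allowed immediately.
        self._next_allowed = float("-inf")

    def acquire(self):
        """Wait until the next slot is free and return the seconds waited."""
        now = self._clock()
        wait = max(0.0, self._next_allowed - now)
        if wait > 0:
            self._sleep(wait)
        # Reserve the following slot relative to when this request goes out.
        self._next_allowed = max(now, self._next_allowed) + self.min_interval
        return wait


requests_per_minute = 50
limiter = IntervalRateLimiter(requests_per_second=requests_per_minute / 60)
print(f"Minimum spacing between requests: {limiter.min_interval:.2f} s")
```

Calling `limiter.acquire()` before each request would then keep a single worker at or below the configured rate. This simple version is not thread-safe, so it is not a drop-in replacement for a limiter shared across concurrent runs.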

In [None]:
experiment_uuid = uuid.uuid4().hex[:4]
models = ["gpt-3.5-turbo-16k"]
client = Client()  # LangSmith client used to run the evaluations
today = datetime.date.today().isoformat()

for task in registry:
    # Skip non-tool-usage tasks before cloning, so we only copy datasets we evaluate.
    if task.type != "ToolUsageTask":
        continue

    dataset_name = task.name + f"_benchmarking_{today}"
    clone_public_dataset(task.dataset_id, dataset_name=dataset_name)

    for model in models:
        print()
        print(f"Benchmarking {task.name} with model: {model}")
        eval_config = task.get_eval_config()
        agent_factory = agents.OpenAIAgentFactory(task, model=model, rate_limiter=rate_limiter)

        client.run_on_dataset(
            dataset_name=dataset_name,
            llm_or_chain_factory=agent_factory,
            evaluation=eval_config,
            verbose=False,
            project_name=f"{dataset_name}-{model}-{experiment_uuid}",
            tags=[model],
            concurrency_level=5,
            project_metadata={
                "model": model,
                "id": experiment_uuid,
                "task": task.name,
                "date": today,
            },
        )

Dataset Tool Usage - Typewriter (1 tool)_benchmarking_2023-12-12 already exists. Skipping.
You can access the dataset at https://smith.langchain.com/o/e081f11e-fbd2-41b4-9fa8-5d76c76ef854/datasets/e959b972-9b0b-4035-ad63-e0314ffc58d6.

Benchmarking Tool Usage - Typewriter (1 tool) with model: gpt-3.5-turbo-16k
View the evaluation results for project 'Tool Usage - Typewriter (1 tool)_benchmarking_2023-12-12-gpt-3.5-turbo-16k-2710' at:
https://smith.langchain.com/o/e081f11e-fbd2-41b4-9fa8-5d76c76ef854/datasets/e959b972-9b0b-4035-ad63-e0314ffc58d6/compare?selectedSessions=6ba5fbb8-1ea7-46fd-86ec-e01d9c3dc913

View all tests for Dataset Tool Usage - Typewriter (1 tool)_benchmarking_2023-12-12 at:
https://smith.langchain.com/o/e081f11e-fbd2-41b4-9fa8-5d76c76ef854/datasets/e959b972-9b0b-4035-ad63-e0314ffc58d6
[------------------------------------------------->] 20/20
Dataset Tool Usage - Typewriter (26 tools)_benchmarking_2023-12-12 already exists. Skipping.
You can access the dataset at http

Retrying langchain.chat_models.openai.ChatOpenAI.completion_with_retry.<locals>._completion_with_retry in 4.0 seconds as it raised APIError: The server had an error processing your request. Sorry about that! You can retry your request, or contact us through our help center at help.openai.com if you keep seeing this error. (Please include the request ID b4e2139abd830cf764cc7419d532791f in your email.) {
  "error": {
    "message": "The server had an error processing your request. Sorry about that! You can retry your request, or contact us through our help center at help.openai.com if you keep seeing this error. (Please include the request ID b4e2139abd830cf764cc7419d532791f in your email.)",
    "type": "server_error",
    "param": null,
    "code": null
  }
}
 500 {'error': {'message': 'The server had an error processing your request. Sorry about that! You can retry your request, or contact us through our help center at help.openai.com if you keep seeing this error. (Please include the

[-------------->                                   ] 6/20

Retrying langchain.chat_models.openai.ChatOpenAI.completion_with_retry.<locals>._completion_with_retry in 4.0 seconds as it raised APIError: The server had an error processing your request. Sorry about that! You can retry your request, or contact us through our help center at help.openai.com if you keep seeing this error. (Please include the request ID 8b68a315d94df4645585730223d0a3c4 in your email.) {
  "error": {
    "message": "The server had an error processing your request. Sorry about that! You can retry your request, or contact us through our help center at help.openai.com if you keep seeing this error. (Please include the request ID 8b68a315d94df4645585730223d0a3c4 in your email.)",
    "type": "server_error",
    "param": null,
    "code": null
  }
}
 500 {'error': {'message': 'The server had an error processing your request. Sorry about that! You can retry your request, or contact us through our help center at help.openai.com if you keep seeing this error. (Please include the

[--------------------------->                      ] 11/20

Retrying langchain.chat_models.openai.ChatOpenAI.completion_with_retry.<locals>._completion_with_retry in 4.0 seconds as it raised APIError: The server had an error processing your request. Sorry about that! You can retry your request, or contact us through our help center at help.openai.com if you keep seeing this error. (Please include the request ID 1d3a8b1298980e39a32d412390a8fc00 in your email.) {
  "error": {
    "message": "The server had an error processing your request. Sorry about that! You can retry your request, or contact us through our help center at help.openai.com if you keep seeing this error. (Please include the request ID 1d3a8b1298980e39a32d412390a8fc00 in your email.)",
    "type": "server_error",
    "param": null,
    "code": null
  }
}
 500 {'error': {'message': 'The server had an error processing your request. Sorry about that! You can retry your request, or contact us through our help center at help.openai.com if you keep seeing this error. (Please include the

[--------------------------------------->          ] 16/20

Retrying langchain.chat_models.openai.ChatOpenAI.completion_with_retry.<locals>._completion_with_retry in 4.0 seconds as it raised APIError: The server had an error processing your request. Sorry about that! You can retry your request, or contact us through our help center at help.openai.com if you keep seeing this error. (Please include the request ID af58d4a8c026da2b825d735784e28efa in your email.) {
  "error": {
    "message": "The server had an error processing your request. Sorry about that! You can retry your request, or contact us through our help center at help.openai.com if you keep seeing this error. (Please include the request ID af58d4a8c026da2b825d735784e28efa in your email.)",
    "type": "server_error",
    "param": null,
    "code": null
  }
}
 500 {'error': {'message': 'The server had an error processing your request. Sorry about that! You can retry your request, or contact us through our help center at help.openai.com if you keep seeing this error. (Please include the

[----------------------------------------->        ] 17/20

Chain failed for example e361b334-e698-40ae-a488-82b7734ce8d0 with inputs {'question': 'school'}
Error Type: Timeout, Message: Request timed out: HTTPSConnectionPool(host='api.openai.com', port=443): Read timed out. (read timeout=600)


[-------------------------------------------->     ] 18/20

Retrying langchain.chat_models.openai.ChatOpenAI.completion_with_retry.<locals>._completion_with_retry in 4.0 seconds as it raised Timeout: Request timed out: HTTPSConnectionPool(host='api.openai.com', port=443): Read timed out. (read timeout=600).
Retrying langchain.chat_models.openai.ChatOpenAI.completion_with_retry.<locals>._completion_with_retry in 4.0 seconds as it raised Timeout: Request timed out: HTTPSConnectionPool(host='api.openai.com', port=443): Read timed out. (read timeout=600).


[----------------------------------------------->  ] 19/20