# Your Guide to Agent-Factory!

Welcome to your entry point for using Agent-Factory!

This notebook will guide you through the process of creating, running and, perhaps most importantly, evaluating your first agent. As a "[hello-world](https://en.wikipedia.org/wiki/%22Hello,_World!%22_program)" example, we will build an agent that summarizes text content from a given webpage URL. But, before we begin:

### What is Agent-Factory?

Agent-Factory is a framework that enables users to build and deploy AI agents through natural language prompts, lowering the technical barrier of entry for non-AI experts.

### How does it work?

The main concepts you need to understand to get started are:

- **Task**: The high-level description of what you want to achieve with your agent. In this case, we want to summarize text content from a given webpage URL.

- **Manufacturing Agent**: The agent that will help you create your own agents. Think of it as your own personal alchemist-druid! You submit your vision of the ideal minion-assistant you need for a very specific task, and the Manufacturing Agent will fulfill your wish.

  - _In practice:_ A server, running in the background, that receives requests as natural language prompts and builds the Generated Agent.

- **Generated Agent**: The agent that the Manufacturing Agent creates for you, tailored to do the best it can to complete your task.
It is a fully functional agent that can be run independently, anywhere you like, with its own set of capabilities and requirements specific to the task you requested.

  - _In practice:_ A directory containing the Python code (`agent.py`) that implements the Generated Agent, along with all the files necessary for the agent to run successfully (`tools`, `requirements.txt`, `agent_parameters.json`, etc.).

- **Evaluation Case**: A set of criteria that the Generated Agent will be evaluated against to ensure that it can complete the task you requested as intended. This is a crucial step to ensure that the Generated Agent meets the requirements of the task you requested.
**NOTE:** these criteria can either be automatically generated by an agent (Evaluator Agent) or be manually defined by you.

  - _In practice:_ A structured JSON file that contains the evaluation criteria, such as the input data, expected output, and any other relevant information needed to evaluate the Generated Agent.

- **Agent Judge**: The agent that will execute the Generated Agent against the Evaluation Case and provide a score based on how well it performed. This is the final step to ensure that the Generated Agent is capable of completing the task you requested.
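To make the Evaluation Case more concrete, here is a minimal sketch of what such a structured file might look like. The field names (`input`, `criteria`) and the example URL are assumptions made purely for illustration and are not the actual Agent-Factory schema; the `evaluation_case.json` generated for your own agent is the authoritative reference.

```python
import json

# Hypothetical structure: these field names are assumptions for illustration,
# not the actual schema used by Agent-Factory.
evaluation_case = {
    "input": {"url": "https://example.com/article"},
    "criteria": [
        "The agent visits the exact URL provided by the user.",
        "The summary is based only on the text of the fetched page.",
    ],
}

# Serialize to JSON, the format in which the Evaluation Case is stored
evaluation_case_json = json.dumps(evaluation_case, indent=2)
print(evaluation_case_json)
```

Each criterion is a plain-language statement, which is what allows the criteria to be either written by hand or generated by the Evaluator Agent.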

### Installation

Before running this notebook, ensure that you have installed Agent-Factory by following the instructions in the [Installation Guide](getting-started/installation.md).


### Setup

In a terminal (not inside this notebook), move to the `src/agent_factory` directory and start the Agent-Factory server in `--nochat` mode:

```bash
cd src/agent_factory && uv run . --host 0.0.0.0 --port 8080 --nochat
```

This will prepare the Manufacturing Agent to receive requests and build any Generated Agent you ask.
The `--nochat` flag is used so that the Manufacturing Agent "one-shots" your request, meaning it will not engage in a back-and-forth conversation with you, but rather will try to fulfill your request in one go.
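Before sending a request, it can be handy to confirm that the server is actually listening. The following is a minimal sketch using only the Python standard library; the helper name `is_server_listening` is hypothetical and not part of Agent-Factory. It assumes the server was started with `--port 8080` as shown above. Because the server binds to `0.0.0.0` (all interfaces), it should be reachable on `127.0.0.1` from the same machine.

```python
import socket


def is_server_listening(host: str = "127.0.0.1", port: int = 8080, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to (host, port) succeeds."""
    try:
        # create_connection raises OSError if nothing accepts the connection
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


print("Server reachable:", is_server_listening())
```

This only checks that something accepts TCP connections on that port; it does not verify that the process behind it is the Manufacturing Agent.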


### Let's build!



In [None]:
# Necessary to run the Generated Agent within a notebook environment
import nest_asyncio

nest_asyncio.apply()

In [None]:
# This notebook is under the /docs directory, so we need to go up one level to run the agent-factory command
%cd ../

In [None]:
# Let's ask the Manufacturing Agent to create the Generated Agent for us
!uv run agent-factory "Summarize text content from a given webpage URL" --output_dir hello_world

In [None]:
# Now that the Generated Agent has been created, let's navigate to the directory
# where it was created and see the files it generated.
%cd generated_workflows/hello_world
%ls
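If you prefer a programmatic check over reading the `%ls` output, a small sketch like the one below reports which of the files mentioned earlier are missing. The helper name `missing_files` is hypothetical, and the list of expected files is an assumption based on the description of a Generated Agent above; the exact set produced for your task may differ.

```python
from pathlib import Path

# File names taken from the description of a Generated Agent above;
# the exact set produced for your task may differ.
EXPECTED_FILES = ["agent.py", "requirements.txt", "agent_parameters.json"]


def missing_files(agent_dir: str, expected: list[str] = EXPECTED_FILES) -> list[str]:
    """Return the expected file names that are not present in agent_dir."""
    root = Path(agent_dir)
    return [name for name in expected if not (root / name).exists()]


# Run from inside the Generated Agent's directory
print("Missing files:", missing_files("."))
```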

Now that our Generated Agent is ready, let's run it!

We use `uv` to specify three things explicitly: the Python version (here, Python 3.13), the packages to install (from the generated `requirements.txt` file), and,
finally, the input for the Generated Agent, i.e. the URL we want to summarize.

In [None]:
!uv run --with-requirements requirements.txt --python 3.13 python agent.py --url https://blog.mozilla.ai/introducing-any-llm-a-unified-api-to-access-any-llm-provider/

Great, this summary is looking pretty good! 🎉

But how do we know whether the Generated Agent actually did what we had in mind? For example, did it actually visit the URL we provided, or did it just hallucinate and completely make up the summary? 🤔

### Evaluation time!

In order to ensure that our Generated Agent acts according to our requirements, we will build an Evaluation Case with certain criteria the Generated Agent must meet in order to be considered successful.
As we mentioned at the beginning, this can be done either manually or automatically by the Evaluator Agent.
To simplify the process, we recommend first using the Evaluator Agent to auto-generate a few relevant criteria for us;
if we are not satisfied with the results, we can refine the criteria later.

In [None]:
# Let's navigate back to the root directory of the project, so we can run the Evaluator Agent.
%cd ../../

In [None]:
# And now let's call the Evaluator Agent
!uv run -m eval.generate_evaluation_case generated_workflows/hello_world

From the Evaluator Agent's trace, we can see some of the criteria it generated for us, such as:
- Ensure that the agent calls the visit_webpage tool with the exact URL provided by the user before any text extraction or summarization steps.
- Check that the agent extracts and isolates only the main body text from the fetched webpage, removing navigation menus, headers, footers, advertisements, and boilerplate links from the output.
- Verify that if no meaningful main body text is found or if the page cannot be reached, the agent returns an empty extracted_text and places an explanatory message in the summary field of the output.

If we want, we can open the `generated_workflows/hello_world/evaluation_case.json` file to see the full list of criteria and edit it accordingly. For now, these criteria are looking pretty good, so let's use them as is.
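To inspect the file without leaving the notebook, a sketch along these lines pretty-prints its content. It deliberately makes no assumption about the file's internal structure; the helper name `load_evaluation_case` is hypothetical, and the path is the one mentioned above, relative to the project root.

```python
import json
from pathlib import Path
from typing import Any


def load_evaluation_case(path: str) -> Any:
    """Read the Evaluation Case JSON file and return the parsed content."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


case_path = Path("generated_workflows/hello_world/evaluation_case.json")
if case_path.exists():
    # Pretty-print the whole file without assuming any particular schema
    print(json.dumps(load_evaluation_case(str(case_path)), indent=2))
else:
    print(f"{case_path} not found; run the Evaluator Agent first.")
```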

The next step is to run the Agent Judge, which will execute the Generated Agent against the Evaluation Case and provide a score based on how well it performed.

In [None]:
!uv run -m eval.run_generated_agent_evaluation generated_workflows/hello_world