# Type Filtering Pipeline Example

This notebook implements a [Gotaglio](https://github.com/MikeHopcroft/gotaglio) pipeline that uses
`ts-type-filter` to optimize LLM prompts for a fictitious restaurant ordering bot. The user can ask the bot for items from the [menu](./menu.ts) and the bot will update a shopping cart. The conversation can continue through multiple turns, each of which modifies the cart.

In this example we'll use the menu pipeline defined in [menu_pipeline.py](./menu_pipeline.py), which that module exports as `menu_pipeline_spec`.
First we create an instance of `Gotaglio` that knows about this pipeline.

- The first parameter is the list of pipelines to register. Here it contains only `sample.menu_pipeline_spec`.
- The second parameter overrides the default Gotaglio configuration. I want to keep my model configuration (`models.json`) and credentials (`.credentials.json`) files at the root of this repo so that they can be shared across all samples. Since this notebook is two levels down, in `samples/menu`, I set the `base_folder` property to `../..`. Alternatively, I could have put the configuration files in the same folder as this notebook.
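
To see why `../..` points at the repo root, the following sketch resolves it against the notebook's folder. It only illustrates the relative-path arithmetic, assuming a hypothetical checkout location `/work/repo`; it does not show how Gotaglio itself locates its files.

```python
import posixpath

# Hypothetical checkout location; only the relative layout matters.
repo_root = "/work/repo"
notebook_dir = posixpath.join(repo_root, "samples", "menu")

# Resolve base_folder relative to the folder that holds the notebook.
base_folder = posixpath.normpath(posixpath.join(notebook_dir, "../.."))

# The shared configuration files named in the text above.
models_path = posixpath.join(base_folder, "models.json")
credentials_path = posixpath.join(base_folder, ".credentials.json")

print(base_folder)
print(models_path)
print(credentials_path)
```

Both files end up directly under the repo root, so every sample that sets `base_folder` appropriately can share them.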


In [None]:
from gotaglio.gotag import Gotaglio, read_json_file
import samples.menu.menu_pipeline as sample

gt = Gotaglio([sample.menu_pipeline_spec], {"base_folder": "../.."})


We'd like to run the menu pipeline on the cases in [data/cases.json](./data/cases.json). These are multi-turn test cases that have an initial shopping `cart` property and an array of `turns`, each of which has a user `query` and an `expected` cart. Each test case also has a `uuid` and an optional `keywords` array.
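
As a rough illustration of that shape, here is a hand-written case as a Python dictionary, together with a small structural check. The `uuid`, `keywords`, `cart`, `turns`, `query`, and `expected` keys come from the description above. The cart layout (an `items` list), the sample values, and the `check_case` helper are hypothetical; they are not taken from `cases.json` or from Gotaglio.

```python
# Hypothetical test case. Only the key names come from the text above;
# the cart layout and all values are invented for illustration.
example_case = {
    "uuid": "00000000-0000-0000-0000-000000000001",
    "keywords": ["simple"],  # optional
    "cart": {"items": []},   # initial shopping cart
    "turns": [
        {
            "query": "one latte please",
            "expected": {"items": [{"name": "latte", "quantity": 1}]},
        },
        {
            "query": "make that two",
            "expected": {"items": [{"name": "latte", "quantity": 2}]},
        },
    ],
}


def check_case(case):
    """Return a list of structural problems; an empty list means the shape is OK."""
    problems = []
    # Required top-level keys ("keywords" is optional, so it is not checked).
    for key in ("uuid", "cart", "turns"):
        if key not in case:
            problems.append(f"missing '{key}'")
    # Every turn needs a user query and the cart expected after that turn.
    for i, turn in enumerate(case.get("turns", [])):
        for key in ("query", "expected"):
            if key not in turn:
                problems.append(f"turn {i}: missing '{key}'")
    return problems


print(check_case(example_case))
```

Each turn's `expected` cart describes the cart after that turn, which is how a single case can cover a whole conversation.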

In the following code, we read `cases.json` with `read_json_file()` and then use `gt.run()` to run the cases:
* The first parameter specifies the name of the pipeline we want to run, here `menu`. We have to specify a name because `Gotaglio` can be instantiated with multiple pipelines. The pipeline name is specified in the `_name` static member of the class implementing the pipeline.
* The second parameter provides the test cases. You can either pass the path to a JSON file with the test cases, or you can pass the cases as data that is already loaded. In this case we pass the data returned by `read_json_file()`.
* The third parameter is a dictionary that maps [glom](https://glom.readthedocs.io/en/latest/) paths to pipeline configuration value overrides. We specify `data/menu.ts` as the menu definition, [data/template.txt](./data/template.txt) as the [jinja2](https://pypi.org/project/Jinja2/) template for the LLM prompt, and `gpt4o` as the model. To use the built-in `perfect` model, which always gives the correct answer, specify `perfect` as the model name instead.
* The `save` parameter indicates that the run log should also be written to the default log location. Run logs are useful for analysis and as a starting point for new runs with some configuration changes. In this notebook, the default log location is the log folder in the folder where this notebook resides.

Finally, we use `gt.format()` to display a richly formatted transcript of each case.

In [None]:
# Load the JSON file
cases = read_json_file("data/cases.json")

# Run the menu pipeline on the loaded cases.
runlog1 = gt.run(
    "menu",
    cases,
    {
        "prepare.menu": "data/menu.ts",
        "prepare.template": "data/template.txt",
        "infer.model.name": "gpt4o",
    },
    save=True,
)

# Format the results.
gt.format(runlog1)
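
The dotted keys in the third argument to `gt.run()` are paths into a nested configuration. The sketch below shows the general idea with a hand-written helper and a made-up default configuration. `apply_overrides`, `default_config`, and its default values are hypothetical; Gotaglio's own handling of glom paths may differ in detail.

```python
import copy


def apply_overrides(config, overrides):
    """Return a copy of config with each dotted-path override applied."""
    result = copy.deepcopy(config)
    for path, value in overrides.items():
        *parents, leaf = path.split(".")
        node = result
        for key in parents:
            # Create intermediate dictionaries as needed.
            node = node.setdefault(key, {})
        node[leaf] = value
    return result


# Hypothetical defaults, for illustration only.
default_config = {
    "prepare": {"menu": None, "template": None},
    "infer": {"model": {"name": None, "settings": {"temperature": 0}}},
}

config = apply_overrides(
    default_config,
    {
        "prepare.menu": "data/menu.ts",
        "prepare.template": "data/template.txt",
        "infer.model.name": "gpt4o",
    },
)

print(config["prepare"])
print(config["infer"]["model"])
```

Only the addressed leaves change. Sibling values such as the hypothetical `settings` entry are left alone, which is what makes path-based overrides convenient for running the same pipeline with small configuration changes.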