TL;DR: BAML should make it possible for external contributors like me to build a rich tooling ecosystem around it.
This proposal is part of the BAML plugin ecosystem series.
API Name: BAML Build-Time Generation Hooks / BAML Build Plugin System
Explicit Purpose: To allow developers to programmatically inspect and modify the in-memory representation of the BAML schema during the `baml generate` process, after parsing the `.baml` source files but before the final client code is generated. This does not modify the original `.baml` source files; it only influences the generated client code for that specific build.
Mechanism:
- Configuration:
  ```baml
  generator target {
    output_type "python/pydantic"
    output_dir "../"
    build_hooks [
      "../baml_hooks/hook.sh" // will pass BamlSchema and BuildContext as a JSON arg
      "../baml_hooks/hooks/*.sh" // same as above, but for each matching file individually
      "uvx some-pypi-package==1.2.3" // will pass BamlSchema and BuildContext as a JSON arg
      "npx @org/some-npm-package@1.2.3"
    ]
    default_client_mode "sync"
    version "0.3592.0" // hopefully earlier
  }
  ```
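To make the intended handling of the `build_hooks` entries concrete, here is a minimal sketch of how a generator could turn them into an ordered list of commands. Everything in it is an assumption for illustration: the function name `expand_hook_entries` is hypothetical, glob entries are assumed to expand to one command per matching file in sorted order, and all other entries are assumed to be split like a shell command line.

```python
import glob
import shlex


def expand_hook_entries(entries):
    """Turn build_hooks entries into a flat, ordered list of argv lists.

    Hypothetical helper: the proposal does not define expansion rules.
    """
    commands = []
    for entry in entries:
        if any(ch in entry for ch in "*?["):
            # Glob entry: one command per matching file. Sorting is an
            # assumption, chosen so that hook order is deterministic.
            for path in sorted(glob.glob(entry)):
                commands.append([path])
        else:
            # Plain entry: a script path or a command line such as
            # "uvx some-pypi-package==1.2.3".
            commands.append(shlex.split(entry))
    return commands


if __name__ == "__main__":
    for argv in expand_hook_entries([
        "../baml_hooks/hook.sh",
        "../baml_hooks/hooks/*.sh",
        "uvx some-pypi-package==1.2.3",
        "npx @org/some-npm-package@1.2.3",
    ]):
        print(argv)
```

Each resulting command would then receive the serialized schema and build context as its final argument. For large schemas, passing the JSON on stdin instead of as an argument may be worth considering, since command-line length is limited on most platforms.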
- Hook Function Signature: Something like

  ```python
  # Example hook in baml_hooks/hooks.py
  import json
  import sys

  from baml_py.tooling import parse_schema  # proposed API
  from baml_py.tooling.types import BamlSchema, BuildContext

  def hook(s: BamlSchema, c: BuildContext) -> BamlSchema:
      ...

  if __name__ == '__main__':
      # The schema and build context arrive as a JSON argument.
      s, c = parse_schema(sys.argv[1])
      print(json.dumps(hook(s, c)))
  ```
- `MutableBamlSchema` Object: This object, passed to the hook, is the key.
  - It represents the entire parsed structure (functions, classes, enums, clients, metadata) from the `.baml` files.
  - Its structure would mirror the read-only schema from `get_baml_schema()` (Idea 1).
  - Crucially, it's mutable: It provides methods/attributes to change properties (e.g., `func.client = "..."`, `cls.fields['name'].description = "..."`), add elements (`schema.add_function(...)`), or remove elements.
  - Changes made to this object directly affect the subsequent code generation step for this build only.
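As an illustration of the kinds of changes a hook could make, here is a small sketch that works on a plain dictionary instead of the proposed `MutableBamlSchema` object. The JSON shape (`functions`, `classes`, `enums` keyed by name) and the names `CheapClient`, `Ping`, and `Deprecated` are assumptions made up for this example; the real schema layout is not specified in this proposal.

```python
import json


def hook(schema: dict, context: dict) -> dict:
    """Mutate the schema in place and return it (hypothetical JSON shape)."""
    # Change a property: route every function to a different client.
    for func in schema["functions"].values():
        func["client"] = "CheapClient"
    # Change a nested property: fill in missing field descriptions.
    for cls in schema["classes"].values():
        for name, field in cls["fields"].items():
            field.setdefault("description", f"The {name} field")
    # Add an element.
    schema["functions"]["Ping"] = {"client": "CheapClient", "output": "string"}
    # Remove an element (no error if it is absent).
    schema["enums"].pop("Deprecated", None)
    return schema


if __name__ == "__main__":
    example = {
        "functions": {"ExtractResume": {"client": "GPT4"}},
        "classes": {"Resume": {"fields": {"name": {"type": "string"}}}},
        "enums": {"Deprecated": {"values": ["A"]}},
    }
    print(json.dumps(hook(example, {"target_language": "python"}), indent=2))
```

The original `.baml` files never see these edits; only the schema handed to code generation changes.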
- `BuildContext` Object: Provides context about the current build operation.
  - `context.target_language`: e.g., "python", "typescript" (if BAML supports multiple targets).
  - `context.project_root`: Path to the project root.
  - `context.output_dir`: Path where the generated client will be written.
  - Other relevant build flags or options.
- Execution Lifecycle (During `baml build`):
  - Parse all `.baml` files -> Create initial `MutableBamlSchema`.
  - Load hook configuration.
  - For each registered hook function (in order):
    - Call `hook_function(schema, context)`.
    - The function modifies the `schema` object in memory.
  - Use the final, potentially modified `schema` object to generate the client code files.
  - Write generated files to the output directory.
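The lifecycle above can be sketched as a small driver. This is only an outline under stated assumptions: `run_build`, `parse`, and `generate` are hypothetical stand-ins for BAML internals, the schema and context are plain dictionaries, and hooks are in-process callables that mutate the schema in place.

```python
import tempfile
from pathlib import Path


def run_build(parse, hooks, generate, context):
    """Parse, apply hooks in order, generate, and write the output files."""
    schema = parse()                       # parse all .baml files
    for hook in hooks:                     # hooks were loaded from configuration
        hook(schema, context)              # each hook mutates the schema in memory
    files = generate(schema, context)      # generate from the final schema
    out_dir = Path(context["output_dir"])
    out_dir.mkdir(parents=True, exist_ok=True)
    for name, text in files.items():       # write generated files
        (out_dir / name).write_text(text)
    return schema


if __name__ == "__main__":
    # Stand-in parser, hook, and generator, for demonstration only.
    def parse():
        return {"functions": {"Extract": {"client": "Default"}}}

    def use_other_client(schema, context):
        schema["functions"]["Extract"]["client"] = "Other"

    def generate(schema, context):
        client = schema["functions"]["Extract"]["client"]
        return {"client.py": f"CLIENT = {client!r}\n"}

    with tempfile.TemporaryDirectory() as tmp:
        run_build(parse, [use_other_client], generate, {"output_dir": tmp})
        print(Path(tmp, "client.py").read_text())
```

Because the hooks run strictly in the configured order, a later hook always sees the changes made by earlier ones, which is worth keeping in mind when combining third-party hooks.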
Explicitness Summary:
- Configuration: Hooks are explicitly declared in build configuration.
- Naming: Terms like `build_hooks`, `BuildContext`, and `MutableBamlSchema` clearly signal a build-time process.
- Signature: The hook function signature takes build-specific objects.
- Effect: Modifications are explicitly made to an in-memory schema object, influencing only the generated code for that build.
- Separation: Distinct from runtime (`get_baml_schema`) and source file modification (`save_baml_schema`).