# Tutorial 3: Customizing the Runtime

In addition to a modular authoring surface, the runtime is also modular and can
be customized for your own platform, deployment environment, or product domain.
In this tutorial, we explore the following topics:

*   **Custom functions**. You can define an arbitrary function and make it
    available as a building block in GenC that we can reference from the IR,
    without the need for plumbing required to make it a full-blown reusable
    component included with the framework. This is often the way to go if
    you just need a one-off for a rarely used feature. For this example, we
    will define a custom formatter and parser for calling a JSON backend.

*   **Custom operators**. If the feature you want to add is broadly applicable
    to many scenarios, it's worth adding it to GenC as a reusable building block
    for ease of use by others. We will walk you through the process of defining
    a new custom operator that can be added to GenC's operator library. For
    this example, we will walk you through the embedding of Wolfram (an external
    service) as a reusable GenC component.

*   **Custom model backends**. Whereas GenC includes integration with a handful
    of on-device and cloud LLMs, you can also define your own model
    backends. We'll show you how to do this.

*   **Custom runtime**. We'll illustrate how the various customizations we have
    covered in this tutorial can all be combined together to define a customized
    runtime that you can use instead of the example runtime we provided to power
    your specialized use cases.

To motivate this reading, in the next tutorial, we'll use all these building
blocks to create LLM-powered agents.

NOTE: In addition to this tutorial, you might want to review, at minimum, the
documentation on the extensibility APIs in
[api.md](https://github.com/google/generative_computing/tree/master/generative_computing/docs/api.md),
and the overview of architecture in
[architecture.md](https://github.com/google/generative_computing/tree/master/generative_computing/docs/architecture.md).
(You might also find it useful to skim over an overview of the IR in
[ir.md](https://github.com/google/generative_computing/tree/master/generative_computing/docs/ir.md),
and internal runtime concepts in
[runtime.md](https://github.com/google/generative_computing/tree/master/generative_computing/docs/runtime.md),
although these shouldn't be essential.)

## Initial Setup

Before proceeding, please follow the instructions in
[Tutorial 1](https://github.com/google/generative_computing/tree/master/generative_computing/docs/tutorials/tutorial_1_simple_cascade.ipynb)
to set up your environment, connect Jupyter, and run the command below to run
the GenC imports we're going to use.

In [None]:
import generative_computing.python as genc
from generative_computing.python import authoring
from generative_computing.python import interop
from generative_computing.python import runtime
from generative_computing.python.examples import executor

## Custom functions

By a *custom function* we refer to a piece of code that you can provision to
become available and callable from within the IR (logic expressed in GenC) at
runtime, without needing to incorporate it as a reusable building block within
the ecosystem. This is the easiest, and the most lightweight way of extending
the platform to integrate your new function in a way that you can flexibly mix
and match with all other native building blocks provided by GenC.

For example, many LLM backends take a structured (such as JSON) input and
output. However, in a typical model cascade or a "chain", we want the call to
the model to be text-in-text-out, with only the raw prompt going in, and only
the generated text output going out, and without the associated model-specific
JSON boilerplate, such that it can be used as a part of a larger structure that
requires a backend-independent consistent contract. This means there are two
helper components that one may want to define to supplement your backend calls:

*   **Format JSON input**. Given text, format it into a JSON request that is
    well-formed for the specific model backend you want to integrate with.

*   **Parse JSON output**. Given a well-formed JSON response from your custom
    backend, parse it into a text, such that it can be plugged into a generic
    backend-independent model cascade or a chain.

Both of these can be expressed as simple stateless C++ functions, each of which
can exist and run on its own. To make them usable by GenC, we can use the
existing `CustomFunction` building block that's designed to support this very
pattern. Let's take a look at how this is done.

### Define the function interface

First, let's write the two plain C++ functions to represent the JSON formatting
and parsing. We won't worry about the GenC plumbing yet, but we'll write these
in a manner that makes it easy to integrate them later. Specifically, we'll make
two provisions:

1. First, we'll use the protocol buffer structure that GenC uses at runtime to
   represent both the argument and the result.
   This is the `Value` message (defined in `computation.proto`) that can carry
   either IR, or raw payloads (numbers, strings, tensors, multimodal data).
   This will facilitate easier integration into GenC runtime as well as make
   the function chainable to other operators regardless of your data type. The
   signatures will look as follows:
   
   * `static absl::StatusOr<v0::Value> GetTopCandidateAsText(v0::Value input);`

   * `static absl::StatusOr<v0::Value> WrapTextAsInputJson(v0::Value input);`

2. Second, we'll collect the code that registers these functions with the GenC
   runtime (under symbolic names that we can later reference from within the
   IR) into a single method, so that it's easy to call later as we put all the
   pieces together at the end of this tutorial.
   This is captured in a method named `SetCustomFunctions` with the signature
   shown below. The code (discussed below) relies on concepts discussed in the
   extensibility API documentation in
   [api.md](https://github.com/google/generative_computing/tree/master/generative_computing/docs/api.md)
   that you might wish to review before proceeding through the rest of this
   tutorial.

  `static absl::Status SetCustomFunctions(intrinsics::CustomFunction::FunctionMap& fn_map);`

The declarations mentioned above can be found under
[`gemini_parser.h`](https://github.com/google/generative_computing/tree/master/generative_computing/cc/modules/parsers/gemini_parser.h),
and the associated implementation in the associated C++ file
[`gemini_parser.cc`](https://github.com/google/generative_computing/tree/master/generative_computing/cc/modules/parsers/gemini_parser.cc)
in the same directory.

```c++
// Parsers for Gemini model.
class GeminiParser final {
 public:
  ~GeminiParser() = default;

  // Extract Top Candidate as Text.
  static absl::StatusOr<v0::Value> GetTopCandidateAsText(v0::Value input);

  // Wraps a text as Gemini request JSON.
  static absl::StatusOr<v0::Value> WrapTextAsInputJson(v0::Value input);

  // Make Parser functions visible to the runtime.
  static absl::Status SetCustomFunctions(
      intrinsics::CustomFunction::FunctionMap& fn_map);

  // Not copyable or movable.
  GeminiParser(const GeminiParser&) = delete;
  GeminiParser& operator=(const GeminiParser&) = delete;

 private:
  // Do not hold states in this class.
  GeminiParser() = default;
};
```

### Define the functions

Here are the bodies of the two custom converter functions mentioned above. As
you can see, there's nothing particularly GenC-specific here other than the use
of the protocol buffer message (`v0::Value` below) that GenC uses to represent
the IR as well as values in its runtime. As noted above, you can find the code under
[`gemini_parser.cc`](https://github.com/google/generative_computing/tree/master/generative_computing/cc/modules/parsers/gemini_parser.cc).

```c++
absl::StatusOr<v0::Value> GeminiParser::GetTopCandidateAsText(v0::Value input) {
  auto parsed_json = nlohmann::json::parse(input.str(), /*cb=*/nullptr,
                                           /*allow_exceptions=*/false);
  if (parsed_json.is_discarded()) {
    return absl::InternalError(absl::Substitute(
        "Failed parsing json output from Gemini: $0", input.DebugString()));
  }

  std::string extract_first_candidate_as_text =
      "{% if candidates %}{% for p in candidates.0.content.parts "
      "%}{{p.text}}{% endfor %}{%   endif %}";

  inja_status_or::Environment env;

  v0::Value result;
  std::string result_str =
      GENC_TRY(env.render(extract_first_candidate_as_text, parsed_json));
  result.set_str(result_str);
  return result;
}

absl::StatusOr<v0::Value> GeminiParser::WrapTextAsInputJson(v0::Value input) {
  std::string json_request = absl::Substitute(
      R"pb(
        {
          "contents":
          [ {
            "parts":
            [ { "text": "$0" }]
          }]
        }
      )pb",
      input.str());
  v0::Value result;
  result.set_str(json_request);
  return result;
}
```
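To make the text-in, JSON-out contract of these two helpers concrete, here is a minimal Python sketch of the same logic. It is not part of GenC: the function names are hypothetical Python stand-ins that mirror the C++ ones, and the response shape is assumed from the template used above (`candidates.0.content.parts`). One difference worth noting is that this sketch builds the request with `json.dumps`, so quotes and newlines in the prompt are escaped, whereas plain string substitution into a JSON template inserts the prompt as-is.

```python
import json


def wrap_text_as_input_json(text: str) -> str:
    """Wraps a raw prompt into a Gemini-style request body (assumed shape)."""
    request = {"contents": [{"parts": [{"text": text}]}]}
    # json.dumps escapes quotes, backslashes, and control characters.
    return json.dumps(request)


def get_top_candidate_as_text(response_json: str) -> str:
    """Concatenates the text parts of the first candidate, if there is one."""
    try:
        parsed = json.loads(response_json)
    except json.JSONDecodeError as err:
        raise ValueError(f"Failed parsing JSON output: {err}") from err
    # Sketch-level choice: anything without candidates yields an empty string.
    if not isinstance(parsed, dict):
        return ""
    candidates = parsed.get("candidates") or []
    if not candidates:
        return ""
    parts = candidates[0].get("content", {}).get("parts", [])
    return "".join(part.get("text", "") for part in parts)


# Round trip with a hand-written response; no network call is involved.
request_body = wrap_text_as_input_json('Say "hi"\nplease')
fake_response = json.dumps(
    {"candidates": [{"content": {"parts": [{"text": "Once "}, {"text": "upon"}]}}]}
)
story = get_top_candidate_as_text(fake_response)
```

Because both helpers are stateless and take and return a single value, they can be tested in isolation like this before being wired into a chain.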

### Make your functions visible to the runtime

Now, onto the registration (shown below, and also found in
[`gemini_parser.cc`](https://github.com/google/generative_computing/tree/master/generative_computing/cc/modules/parsers/gemini_parser.cc)).

The parameter `FunctionMap` is a data structure defined in
[`custom_function.h`](https://github.com/google/generative_computing/tree/master/generative_computing/cc/intrinsics/custom_function.h) and provided by GenC at runtime
construction time, where the keys are symbolic names of the custom functions
that you will reference in the IR, and values are C++ lambdas with their bodies.
The symbolic names (e.g., `/gemini_parser/get_top_candidate_as_text` below)
are totally up to you to define - you just have to make sure that the names
registered with the runtime match those that you will use later when authoring
your IR.

```c++
absl::Status GeminiParser::SetCustomFunctions(
    intrinsics::CustomFunction::FunctionMap& fn_map) {
  fn_map["/gemini_parser/get_top_candidate_as_text"] =
      [](const v0::Value& arg) {
        return GeminiParser::GetTopCandidateAsText(arg);
      };

  fn_map["/gemini_parser/wrap_text_as_input_json"] = [](const v0::Value& arg) {
    return GeminiParser::WrapTextAsInputJson(arg);
  };

  return absl::OkStatus();
}
```
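The registration pattern above boils down to a map from symbolic names to callables that the runtime consults when it encounters a name in the IR. The following Python sketch illustrates that idea only; `FunctionMap`, `set_custom_functions`, `call_custom_function`, and the `/demo/...` names are all hypothetical and do not correspond to GenC's Python API.

```python
from typing import Callable, Dict

# Hypothetical stand-in for the C++ FunctionMap: symbolic name -> function.
FunctionMap = Dict[str, Callable[[str], str]]


def set_custom_functions(fn_map: FunctionMap) -> None:
    """Registers demo functions under names that the IR would later reference."""
    fn_map["/demo/to_upper"] = lambda text: text.upper()
    fn_map["/demo/strip"] = lambda text: text.strip()


def call_custom_function(fn_map: FunctionMap, name: str, arg: str) -> str:
    """Looks up a function by its symbolic name and applies it to the argument."""
    try:
        fn = fn_map[name]
    except KeyError:
        raise LookupError(f"No custom function registered under {name!r}") from None
    return fn(arg)


fn_map: FunctionMap = {}
set_custom_functions(fn_map)
shouted = call_custom_function(fn_map, "/demo/to_upper", "hello")
```

The lookup fails for any name that was not registered, which is why the names used at authoring time have to match the registered ones exactly.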

Towards the end of this tutorial, we'll show you how to set up a custom runtime
where everything we define here comes together (you can find all that code in
[executor_stacks.cc](https://github.com/google/generative_computing/tree/master/generative_computing/cc/examples/executors/executor_stacks.cc)).

During the runtime setup, we'll be setting up a `config` object that, among a
handful of other things, contains a `custom_function_map`. This is the same
map that's declared as a formal parameter to the registration function you've
just defined. We will invoke it there to register custom functions as shown
below.

```c++
GENC_TRY(GeminiParser::SetCustomFunctions(config.custom_function_map));
```

As noted earlier, if you'd like to understand the concepts behind runtime
customization before proceeding, take a look at extensibility APIs in
[api.md](https://github.com/google/generative_computing/tree/master/generative_computing/docs/api.md).
Otherwise, continue on to see how the custom functions you've defined are used
when authoring the application logic in GenC.

### Authoring the IR that calls your custom functions

Assuming the custom runtime with your custom functions is set up (as noted,
we'll show you the full example at the end), here's how you can use them, along
with the REST calls to your custom backend and other building blocks, to
construct a chain that calls your backend and satisfies a simple
backend-independent text-in-text-out contract.

Since we have used Python as the authoring surface in the earlier tutorials,
we're going to stick to Python here as well, but keep in mind that you could do
this also in C++ (see
[api.md](https://github.com/google/generative_computing/tree/master/generative_computing/docs/api.md)
for details).

In order to make things easier for you, the example runtime we set up to support
the tutorials already has the custom functions defined above wired in, so that
you can test authoring with your functions right away, as below.

NOTE: You will need to fill in a correct Gemini endpoint address with API key
in the code below (see the instructions at the beginning of
[Tutorial 1](https://github.com/google/generative_computing/tree/master/generative_computing/docs/tutorials/tutorial_1_simple_cascade.ipynb)
for how to get those if you don't have them yet). We included an example of
what this may look like for an example model, but keep in mind this may change.
You may also want to look at
[math_tool_agent.py](https://github.com/google/generative_computing/tree/master/generative_computing/python/examples/math_tool_agent.py)
(Python) and
[run_gemini_on_ai_studio.cc](https://github.com/google/generative_computing/tree/master/generative_computing/cc/examples/run_gemini_on_ai_studio.cc)
(C++) for similar examples that involve the use of Gemini model and REST calls.

In [None]:
# An example endpoint may look like below, but please verify as this can change
# https://generativelanguage.googleapis.com/v1beta/models/gemini-pro:generateContent?key=XYZ
rest_call = genc.authoring.create_rest_call("<end point with api key>")
str_to_json_request = genc.authoring.create_custom_function(
    "/gemini_parser/wrap_text_as_input_json"
)
extract_top_candidate = genc.authoring.create_custom_function(
    "/gemini_parser/get_top_candidate_as_text"
)

my_backend_chain = (
    genc.interop.langchain.CustomChain()
    | str_to_json_request
    | rest_call
    | extract_top_candidate
)
)

portable_ir = genc.interop.langchain.create_computation(my_backend_chain)
executor = genc.examples.executor.create_default_executor()
runner = genc.runtime.Runner(portable_ir, executor)
runner("Tell me a story")

## Custom operators

As noted above, a custom operator is a great way to capture logic that's likely
to be used more than once, and worth capturing as a reusable component that you
can either contribute to GenC, or keep in your own repo (and use to set up your
specialized runtimes).

In a GenAI application, we often use tools such as WolframAlpha. This could
fit as a `CustomFunction`, but if you're going to use it often, you may
want to be able to write code like `genc.authoring.create_wolfram_alpha(appid)`
to make it more readily available, and to make it look and feel just like any
of the other operators included with GenC.

### Declare a handler

First, we need to define a handler - code that implements the custom operator
and plugs into the runtime. It's implemented in C++, and inherits from one of
two interfaces defined in
[intrinsic_handler.h](https://github.com/google/generative_computing/tree/master/generative_computing/cc/runtime/intrinsic_handler.h).
In this case we're going to derive from the base class
`InlineIntrinsicHandlerBase`
since the operator we're writing is a simple in-and-out type of processing,
not a new control flow abstraction (see the section on extensibility APIs in
[api.md](https://github.com/google/generative_computing/tree/master/generative_computing/docs/api.md)
and runtime documentation in
[runtime.md](https://github.com/google/generative_computing/tree/master/generative_computing/docs/runtime.md)
for a discussion of the different types of operators).

Here's what the declaration would look like (full code in
[wolfram_alpha.h](https://github.com/google/generative_computing/tree/master/generative_computing/cc/modules/tools/wolfram_alpha.h)):

```c++
// Tools for calling WolframAlpha API.
class WolframAlpha : public InlineIntrinsicHandlerBase {
 public:
  WolframAlpha() : InlineIntrinsicHandlerBase(kWolframAlpha) {}
  virtual ~WolframAlpha() {}

  absl::Status CheckWellFormed(const v0::Intrinsic& intrinsic_pb) const final;

  absl::Status ExecuteCall(const v0::Intrinsic& intrinsic_pb,
                           const v0::Value& arg, v0::Value* result) const final;
};
```

The method `ExecuteCall` is the one that defines the processing logic.

### Implement the handler logic

First, here's an example implementation you might write in C++ without GenC.

```c++
#include <curl/curl.h>

#include <string>

#include "absl/status/status.h"
#include "absl/status/statusor.h"

// Callback fn to write the response: appends each received chunk to `output`.
size_t WriteCallback(void* contents, size_t size, size_t nmemb,
                     std::string* output) {
  size_t totalSize = size * nmemb;
  output->append(static_cast<char*>(contents), totalSize);
  return totalSize;
}

// Calls Wolfram Alpha API to get a string response.
absl::StatusOr<std::string> CallShortAnswersAPI(const std::string& app_id,
                                                const std::string& query) {
  std::string readBuffer;

  CURL* curl = curl_easy_init();
  if (curl == nullptr) return absl::InternalError("Unable to init CURL");

  // URL-escape the query; the returned buffer must be released with curl_free.
  char* escaped_query = curl_easy_escape(curl, query.c_str(), 0);
  if (escaped_query == nullptr) {
    curl_easy_cleanup(curl);
    return absl::InternalError("Unable to escape the query");
  }
  std::string url =
      "http://api.wolframalpha.com/v1/result?i=" + std::string(escaped_query) +
      "&appid=" + app_id + "&output=json&format=plaintext";
  curl_free(escaped_query);

  curl_easy_setopt(curl, CURLOPT_URL, url.c_str());
  curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION, WriteCallback);
  curl_easy_setopt(curl, CURLOPT_WRITEDATA, &readBuffer);

  CURLcode curl_code = curl_easy_perform(curl);
  // Release the handle on both the success and the failure path.
  curl_easy_cleanup(curl);

  // Error out if the call fails.
  if (curl_code != CURLE_OK) {
    return absl::InternalError(curl_easy_strerror(curl_code));
  }
  return readBuffer;
}
```

Now, here's how you can plug this implementation into the handler class you've
defined earlier (full code in
[wolfram_alpha.cc](https://github.com/google/generative_computing/tree/master/generative_computing/cc/modules/tools/wolfram_alpha.cc)).

```c++
absl::Status WolframAlpha::CheckWellFormed(
    const v0::Intrinsic& intrinsic_pb) const {
  if (!intrinsic_pb.static_parameter().has_str()) {
    return absl::InvalidArgumentError("Expect template as appid, got none.");
  }
  return absl::OkStatus();
}

absl::Status WolframAlpha::ExecuteCall(const v0::Intrinsic& intrinsic_pb,
                                       const v0::Value& arg,
                                       v0::Value* result) const {
  const std::string& appid = intrinsic_pb.static_parameter().str();
  std::string result_str = GENC_TRY(CallShortAnswersAPI(appid, arg.str()));
  result->set_str(result_str);
  return absl::OkStatus();
}
```
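The part of `CallShortAnswersAPI` that is easiest to get wrong is the URL construction: a query such as `2^2-2-3+4*100` contains characters (`^`, `+`, `*`) that must be percent-encoded before they go into a query string. The sketch below shows the same step in Python using only the standard library. The helper name `build_short_answers_url` and the `DEMO-APPID` value are hypothetical, the parameters simply mirror the ones in the C++ snippet, and no request is actually sent.

```python
from urllib.parse import parse_qs, quote, urlencode, urlsplit

SHORT_ANSWERS_URL = "http://api.wolframalpha.com/v1/result"


def build_short_answers_url(app_id: str, query: str) -> str:
    """Builds the request URL, percent-encoding every parameter value."""
    params = {
        "i": query,
        "appid": app_id,
        "output": "json",
        "format": "plaintext",
    }
    # quote_via=quote with safe="" encodes spaces as %20 and leaves only
    # unreserved characters unescaped, similar in spirit to curl_easy_escape.
    return f"{SHORT_ANSWERS_URL}?{urlencode(params, safe='', quote_via=quote)}"


url = build_short_answers_url("DEMO-APPID", "what is 2^2-2-3+4*100")

# Decoding the query string recovers the original text unchanged.
decoded = parse_qs(urlsplit(url).query)
```

Unlike the C++ version, this sketch also encodes the `appid` value, which is harmless for well-formed ids and avoids surprises if the id ever contains reserved characters.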

### Make it known to the runtime

Now, we have a C++ handler for the new operator, but the runtime doesn't know
about it. Similarly to custom functions earlier, we need to include it during
the runtime construction process, and here again, we can do that by adding to
the runtime `config` object, this time to the `custom_intrinsics_list`. You
can do that as shown below (and as mentioned earlier, you will see this in the
full context at the end of this tutorial).

```c++
config.custom_intrinsics_list.push_back(new intrinsics::WolframAlpha());
```

### Write a constructor for authoring

Now that we have a handler, and we have it wired into the runtime, we still
need to create a corresponding piece of the authoring surface to make it into
a fully functional building block, symmetric with the existing ones. You can
do that by defining a *constructor* function in C++ (and subsequently lifting
it into Python via `pybind11`). Constructors construct a piece of IR that
represents your new operator by setting the `uri` field in the `Intrinsic`
message, and populating any additional static parameters that may go in the IR
along with your operator in the `static_parameter` field in the proto, as shown
in the example code below (see also
[ir.md](https://github.com/google/generative_computing/tree/master/generative_computing/docs/ir.md)
for a more complete explanation of how operators are represented in the IR).

```c++
absl::StatusOr<v0::Value> CreateWolframAlpha(absl::string_view appid) {
  v0::Value wolfram_alpha_pb;
  v0::Intrinsic* const intrinsic_pb = wolfram_alpha_pb.mutable_intrinsic();
  intrinsic_pb->set_uri("wolfram_alpha");
  intrinsic_pb->mutable_static_parameter()->set_str(std::string(appid));
  intrinsic_pb->mutable_static_parameter()->set_label("appid");
  return wolfram_alpha_pb;
}
```

Finally, to lift this constructor for use in Python, you need to augment the
appropriate section in `pybind11` in
[constructor_bindings.cc](https://github.com/google/generative_computing/tree/master/generative_computing/cc/authoring/constructor_bindings.cc), as shown in the snippet of code below:

```c++
m.def("create_wolfram_alpha", &CreateWolframAlpha,
      "Returns an operator that makes calls to WolframAlpha ShortAnswer API");
```

If you want to integrate it more tightly with GenC and include it by default,
you might also want to register it in
[intrinsic_uris.h](https://github.com/google/generative_computing/tree/master/generative_computing/cc/intrinsics/intrinsic_uris.h) and include it in
[handler_sets.cc](https://github.com/google/generative_computing/tree/master/generative_computing/cc/intrinsics/handler_sets.cc), but that's not required (and you wouldn't want to do this for
an operator that's more specific to your domain or environment).
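To see how the three pieces of a custom operator relate (the constructor that emits IR, the handler that validates and executes it, and the runtime that dispatches by `uri`), here is a small Python model. It uses plain dictionaries in place of the `Value` and `Intrinsic` protos and a stub handler in place of the real WolframAlpha call, so every name in it (`StubWolframAlphaHandler`, `run`, and so on) is hypothetical and only illustrates the flow.

```python
from typing import Dict


def create_wolfram_alpha(appid: str) -> dict:
    """Constructor: builds a dict that mimics the IR for the operator."""
    return {
        "intrinsic": {
            "uri": "wolfram_alpha",
            "static_parameter": {"label": "appid", "str": appid},
        }
    }


class StubWolframAlphaHandler:
    """Stand-in handler; a real one would call the WolframAlpha API."""

    uri = "wolfram_alpha"

    def check_well_formed(self, intrinsic: dict) -> None:
        # Mirrors CheckWellFormed: the appid must be a string static parameter.
        if "str" not in intrinsic.get("static_parameter", {}):
            raise ValueError("Expected appid as a string static parameter.")

    def execute_call(self, intrinsic: dict, arg: str) -> str:
        appid = intrinsic["static_parameter"]["str"]
        return f"[{appid}] stub answer for: {arg}"


def run(ir: dict, handlers: Dict[str, StubWolframAlphaHandler], arg: str) -> str:
    """Dispatches on the uri stored in the IR, then validates and executes."""
    intrinsic = ir["intrinsic"]
    handler = handlers.get(intrinsic["uri"])
    if handler is None:
        raise LookupError(f"No handler registered for {intrinsic['uri']!r}")
    handler.check_well_formed(intrinsic)
    return handler.execute_call(intrinsic, arg)


handlers = {StubWolframAlphaHandler.uri: StubWolframAlphaHandler()}
answer = run(create_wolfram_alpha("DEMO-APPID"), handlers, "2+2")
```

The `uri` is the only link between the authoring side and the runtime side, so the string set by the constructor has to match the one the handler is registered under.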

### Try it

Once defined as per above, your operator can be used like any other (note you
will need to populate the Wolfram appid to make this work):

In [None]:
portable_ir = authoring.create_wolfram_alpha("<your appid>")
executor = genc.examples.executor.create_default_executor()
runner = runtime.Runner(portable_ir, executor)
print(runner("what is the result of 2^2-2-3+4*100"))

## Model inference

GenC provides a few predefined ways to call popular model backends such as ChatGPT and Gemini to get you started. In many applications, you will want to
include custom model backends for your own specialized use cases.

Among the many model backends out there (Gemini, ChatGPT, self-hosted LLAMA,
and so on), not only the ways of calling them, but also the formats of inputs
and outputs may vary. For example, as noted earlier, many backends interact
with you through custom JSON blobs that vary across models.

In order to promote composability, we want to make these models usable through
a standardized protocol, such that, e.g., they can appear in a model cascade,
or as a part of chains. This section shows how to do that.

Recall that at the beginning of this tutorial, we introduced a model call chain
with parsers and formatters, and a custom REST backend call. Now, let's
consider how to package all this as a standardized reusable "model inference"
abstraction, by integrating the model input formatter, model call, and a model
output parser into one operator that accepts and returns just the pure generic
payloads (prompt or response string or multimodal data).

As in the previous examples, we're going to capture all this in a single class,
as shown below (and see
[google_ai.h](https://github.com/google/generative_computing/tree/master/generative_computing/cc/interop/backends/google_ai.h)
and
[google_ai.cc](https://github.com/google/generative_computing/tree/master/generative_computing/cc/interop/backends/google_ai.cc)
for the full example).

```c++
class GoogleAI final {
 public:
  ~GoogleAI() = default;

  // Not copyable or movable.
  GoogleAI(const GoogleAI&) = delete;
  GoogleAI& operator=(const GoogleAI&) = delete;

  // Sets the inference map to process model calls.
  static absl::Status SetInferenceMap(
      intrinsics::ModelInferenceWithConfig::InferenceMap& inference_map);

 private:
  // Do not hold states in this class.
  GoogleAI() = default;
};
```


The way this works is going to look, in many ways, somewhat similar to how we
implemented custom functions and operators earlier in this tutorial:

*   When constructing the runtime, we'll be setting up a `config` object in
    C++ that, in this case, will have a field named
    `model_inference_with_config_map` of interest to us. This is a map from
    model names to C++ lambdas that call the corresponding custom backends.
    This registration logic is captured in the `SetInferenceMap` method we've
    just declared above.

*   The C++ lambda will accept and return `v0::Value` messages, as well as take
    a message with the `static_parameter` that may contain additional config
    embedded in the IR along with the reference to your model.

See below a possible C++ implementation of the above.

The details of the implementation aren't terribly important for this tutorial,
but note how we use CURL (the model is sitting behind a REST endpoint); that's
something you might want to use as well.

Note also in this case, it's the Gemini model we used in earlier tutorials,
hence the use of the familiar `/cloud/gemini` key in the inference map.

```c++
absl::Status GoogleAI::SetInferenceMap(
    intrinsics::ModelInferenceWithConfig::InferenceMap& inference_map) {
  inference_map["/cloud/gemini"] =
      [](v0::Intrinsic intrinsic, v0::Value arg) -> absl::StatusOr<v0::Value> {
    // Construct input JSON
    std::string input_json = absl::Substitute(
        R"pb(
          {
            "contents":
            [ {
              "parts":
              [ { "text": "$0" }]
            }]
          }
        )pb",
        arg.str());

    // Construct the REST call parameters from config.
    const v0::Value& config = intrinsic.static_parameter().struct_().element(1);
    const std::string& endpoint =
        config.struct_().element(0).str() + config.struct_().element(1).str();
    std::string api_key = "";

    // Make a REST call.
    v0::Value response_json = GENC_TRY(Post(api_key, endpoint, input_json));

    // Extract text out of JSON
    return GeminiParser::GetTopCandidateAsText(response_json);
  };
  return absl::OkStatus();
}
```

Once the above is defined, a call like the one shown below is used during the
runtime construction to plug it into your custom runtime.

```c++
// Set model inference for Gemini backends.
GENC_TRY(GoogleAI::SetInferenceMap(config.model_inference_with_config_map));
```
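The value of the inference map is that formatting, the backend call, and parsing are hidden behind a single text-in, text-out callable keyed by a model name. The Python sketch below models that packaging with an injected `post` function so it can run without a network connection or an API key. All names here (`make_gemini_inference`, `fake_post`, the `endpoint` config key, the `example.invalid` URL) are hypothetical and are not GenC APIs, and the response shape is assumed to match the Gemini format used earlier in this tutorial.

```python
import json
from typing import Callable, Dict

# (endpoint, request_json) -> response_json; injected so tests can stub it.
PostFn = Callable[[str, str], str]
InferenceFn = Callable[[dict, str], str]


def make_gemini_inference(post: PostFn) -> InferenceFn:
    """Packages format -> call -> parse into one text-in, text-out function."""

    def infer(config: dict, prompt: str) -> str:
        request = json.dumps({"contents": [{"parts": [{"text": prompt}]}]})
        response = json.loads(post(config["endpoint"], request))
        candidates = response.get("candidates") or []
        if not candidates:
            return ""
        parts = candidates[0].get("content", {}).get("parts", [])
        return "".join(part.get("text", "") for part in parts)

    return infer


def fake_post(endpoint: str, request_json: str) -> str:
    """Stub backend that echoes the prompt back in a Gemini-style response."""
    prompt = json.loads(request_json)["contents"][0]["parts"][0]["text"]
    return json.dumps(
        {"candidates": [{"content": {"parts": [{"text": f"echo: {prompt}"}]}}]}
    )


# Model name -> callable, analogous to the C++ inference map.
inference_map: Dict[str, InferenceFn] = {
    "/cloud/gemini": make_gemini_inference(fake_post),
}
reply = inference_map["/cloud/gemini"](
    {"endpoint": "https://example.invalid/generate"}, "Tell me a short story"
)
```

Swapping in a different backend then means registering another callable under a different key, while callers keep passing and receiving plain text.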

### Try it

Notice that the model call becomes much cleaner here: it's text-in-text-out,
with no JSON formatting or output parsing involved.

With this cleaner approach, you can enjoy the benefits of composability with
other building blocks. Also, if you have multiple model backends set up the same
way, you can easily swap between model backends and leave the rest of the
application logic untouched.

In [None]:
model_config = genc.authoring.create_rest_model_config(
    "https://generativelanguage.googleapis.com/v1beta/models/gemini-pro:generateContent",
    "<your api key>",
)
model_call = genc.authoring.create_model_with_config(
    "/cloud/gemini", model_config
)
comp = runtime.Runner(comp_pb=model_call)
print(comp("Tell me a short story"))

## Putting it all together

Now that you've seen three different types of customizations, as promised,
here's a complete example of how it all fits together in a new runtime
constructor that you can use to power your own applications. The main function
of interest is the call to
`CreateLocalExecutor(intrinsics::CreateCompleteHandlerSet(config))` where the
runtime construction actually happens. All the code before that sets up the
custom `config`, as discussed above, to include the customizations you need.

You may also review examples of the included runtime constructors, like those
you've used. With the explanations above, the code should now be easier to
understand.

Now, the customization in GenC can run much deeper. For more advanced topics,
consult the extensibility API documentation in
[api.md](https://github.com/google/generative_computing/tree/master/generative_computing/docs/api.md)
and runtime documentation in
[runtime.md](https://github.com/google/generative_computing/tree/master/generative_computing/docs/runtime.md).

```c++
absl::StatusOr<std::shared_ptr<Executor>> CreateDefaultExecutor() {
  // This is where you can wire in all your components for the runtime.

  intrinsics::HandlerSetConfig config;

  // Register custom functions
  GENC_TRY(GeminiParser::SetCustomFunctions(config.custom_function_map));

  // Register your custom operator
  config.custom_intrinsics_list.push_back(new intrinsics::WolframAlpha());

  // Register your custom model backend
  GENC_TRY(GoogleAI::SetInferenceMap(config.model_inference_with_config_map));

  return CreateLocalExecutor(intrinsics::CreateCompleteHandlerSet(config));
}
```

### Don't forget to make it available in Python

Before we depart, it's worth noting that runtimes you define will often be used
in languages other than C++, and as such, it's worth lifting the code to Python
or Java. You've seen examples of usage of the default runtime constructor in
the tutorials (`create_default_executor`). You can make yours available in the
same way, e.g., in Python, by defining an appropriate bindings file, as shown
below (and in the example
[executor_bindings.cc](https://github.com/google/generative_computing/tree/master/generative_computing/cc/examples/executors/executor_bindings.cc)).

```c++
// Executor construction methods.
m.def("create_default_executor", &CreateDefaultExecutor,
      "Creates a default executor with predefined components.");
```

## Next tutorial: building modular agents!

Congratulations! You've just built enough modular components to create more
interesting LLM agents. In the next tutorial, we'll see all these building
blocks come together in action to create an LLM agent.