cocoindex-io · badmonster0 · Mar 17, 2025 · Mar 17, 2025
diff --git a/docs/docs/core/custom_function.mdx b/docs/docs/core/custom_function.mdx
@@ -6,14 +6,57 @@ description: Build Custom Functions
 import Tabs from '@theme/Tabs';
 import TabItem from '@theme/TabItem';
 
-# Build Custom Functions
+A custom function can be defined in one of the following ways:
 
-To build a custom function, you need to define a function spec and an executor.
+*   A standalone function. It's simpler, but doesn't allow additional configuration or setup logic.
+*   A function spec and an executor. It's more powerful, and allows additional configuration and setup logic.
 
-## Function Spec
+## Option 1: By a standalone function
 
-The function spec of a function defines the function's parameters.
-These parameters configures behavior of a specific instance of the function.
+This fits simple cases where the function doesn't need additional configuration or setup logic.
+
+<Tabs>
+<TabItem value="python" label="Python" default>
+
+The standalone function needs to be decorated by `@cocoindex.op.function()`, like this:
+
+```python
+@cocoindex.op.function(...)
+def compute_something(arg1: str, arg2: int | None = None) -> str:
+    """
+    Documentation for the function.
+    """
+    ...
+```
+
+Notes:
+
+*   The `cocoindex.op.function()` function decorator also takes optional parameters.
+    See [Parameters for custom functions](#parameters-for-custom-functions) for details.
+*   Types of arguments and the return value must be annotated, so that CocoIndex will have information about data types of the operation's output fields.
+    See [Data Types](/docs/core/data_types) for supported types.
+
+</TabItem>
+</Tabs>
+
+### Examples
+
+The cocoindex repository contains the following examples of custom functions defined in this way:
+
+*   In the [code_embedding](https://github.com/cocoindex-io/cocoindex/blob/main/examples/code_embedding/main.py) example,
+    `extract_extension` is a custom function to extract the extension of a file name.
+*   In the [manuals_llm_extraction](https://github.com/cocoindex-io/cocoindex/blob/main/examples/manuals_llm_extraction/main.py) example,
+    `summarize_manuals` is a custom function to summarize structured information of a manual page.
+
+
+## Option 2: By a function spec and an executor
+
+This is a more advanced and flexible way to define a custom function.
+It allows a function to be configured through the function spec, and allows preparation logic to run before execution, e.g. initializing a model based on the spec.
+
+### Function Spec
+
+The function spec of a function configures the behavior of a specific instance of the function.
 When you use this function in a flow (typically by a [`transform()`](/docs/core/flow_def#transform)), you instantiate this function spec, with specific parameter values.
 
 <Tabs>
@@ -22,8 +65,7 @@ When you use this function in a flow (typically by a [`transform()`](/docs/core/
 A function spec is defined as a class that inherits from `cocoindex.op.FunctionSpec`.
 
 ```python
-
-class DemoFunctionSpec(cocoindex.op.FunctionSpec):
+class ComputeSomething(cocoindex.op.FunctionSpec):
     """
     Documentation for the function.
     """
@@ -40,69 +82,97 @@ Notes:
 </Tabs>
 
 
-## Function Executor
+### Function Executor
 
 A function executor defines behavior of a function. It's initantiated for each operation that uses this function.
 
 The function executor is responsible for:
 
-*   *Prepare* for the function execution, based on the spec. It happens once and only once before execution. e.g. if the function calls a machine learning model, the model name can be a parameter as a field of the spec, and we may load the model in this phase.
-*   *Run* the function, for each specific input arguments. This happens multiple times, for each specific rows of data.
+*   *Prepare* for the function execution, based on the spec.
+    It happens once and only once before execution.
+    For example, if the function calls a machine learning model, the model name can be a field of the spec, and the model can be loaded in this phase.
+*   *Run* the function, for each specific set of input arguments. This happens multiple times, once for each row of data.
 
 <Tabs>
 <TabItem value="python" label="Python" default>
 
-A function executor is defined as a class annotated by `@cocoindex.op.executor_class()`.
+A function executor is defined as a class decorated by `@cocoindex.op.executor_class()`.
 
 
 ```python
 @cocoindex.op.executor_class(...)
-class DemoFunctionExecutor:
-    spec: DemoFunctionSpec
+class ComputeSomethingExecutor:
+    spec: ComputeSomething
     ...
 
     def prepare(self) -> None:
         ...
 
-    def __call__(self, input_value: str) -> str:
+    def __call__(self, arg1: str, arg2: int | None = None) -> str:
         ...
 ```
 
 Notes:
 
-*   The `cocoindex.op.executor_class()` class decorator also takes the following optional arguments:
-
-    *   `gpu: bool`: Whether the executor will use GPU. It will affect the way the function is scheduled.
-
-    *   `cache: bool`: Whether the executor will enable cache for this function.
-         When `True`, the executor will cache the result of the function for reuse during reprocessing.
-         We recommend to set this to `True` for any function that is computationally intensive.
-
-    *   `behavior_version: int`: The version of the behavior of the function.
-        When the version is changed, the function will be re-executed even if cache is enabled.
-        It's required to be set if `cache` is `True`.
-
-    For example, this enables cache for the function:
-
-    ```python
-    @cocoindex.op.executor_class(cache=True, behavior_version=1)
-    class DemoFunctionExecutor:
-        ...
-    ```
+*   The `cocoindex.op.executor_class()` class decorator also takes optional parameters.
+    See [Parameters for custom functions](#parameters-for-custom-functions) for details.
 
 *   A `spec` field must be present in the class, and must be annoated with the spec class name.
 *   The `prepare()` method is optional. It's executed once and only once before any `__call__` execution, to prepare the function execution.
 *   The `__call__()` method is required. It's executed for each specific rows of data.
-    Types of arugments and the return value must be annotated, so that CocoIndex will have information about data types of the operation's output fields. See [Data Types](/docs/core/data_types) for supported types.
+    Types of arguments and the return value must be annotated, so that CocoIndex will have information about data types of the operation's output fields.
+    See [Data Types](/docs/core/data_types) for supported types.
 
 </TabItem>
 </Tabs>
 
-## Examples
+### Examples
 
-The cocoindex repository contains the following examples of custom functions:
+The cocoindex repository contains the following examples of custom functions defined in this way:
 
-*   In the [pdf_embedding](https://github.com/cocoindex-io/cocoindex/blob/main/examples/pdf_embedding/pdf_embedding.py) example, we define a custom function `PdfToMarkdown`
+*   In the [pdf_embedding](https://github.com/cocoindex-io/cocoindex/blob/main/examples/pdf_embedding/main.py) example, we define a custom function `PdfToMarkdown`
 *   The `SentenceTransformerEmbed` function shipped with the CocoIndex Python package is defined by Python SDK.
-    Search for [`SentenceTransformerEmbedExecutor`](https://github.com/search?q=repo%3Acocoindex-io%2Fcocoindex+SentenceTransformerEmbedExecutor&type=code) to see the code.
-
+    Search for [`SentenceTransformerEmbedExecutor`](https://github.com/search?q=repo%3Acocoindex-io%2Fcocoindex+lang%3Apython+SentenceTransformerEmbedExecutor&type=code) to see the code.
+
+## Parameters for custom functions
+
+Custom functions take the following additional parameters:
+
+*   `gpu: bool`: Whether the executor will use GPU. It will affect the way the function is scheduled.
+
+*   `cache: bool`: Whether the executor will enable cache for this function.
+    When `True`, the executor will cache the result of the function for reuse during reprocessing.
+    We recommend setting this to `True` for any function that is computationally intensive.
+
+*   `behavior_version: int`: The version of the behavior of the function.
+    When the version is changed, the function will be re-executed even if cache is enabled.
+    It's required to be set if `cache` is `True`.
+
+For example:
+
+<Tabs>
+<TabItem value="python" label="Python" default>
+
+This enables cache for a standalone function:
+
+```python
+@cocoindex.op.function(cache=True, behavior_version=1)
+def compute_something(arg1: str, arg2: int | None = None) -> str:
+    ...
+```
+
+This enables cache for a function defined by a spec and an executor:
+
+```python
+class ComputeSomething(cocoindex.op.FunctionSpec):
+    ...
+
+@cocoindex.op.executor_class(cache=True, behavior_version=1)
+class ComputeSomethingExecutor:
+    spec: ComputeSomething
+
+    ...
+```
+
+</TabItem>
+</Tabs>
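
Review note: the `cache` / `behavior_version` rules described under "Parameters for custom functions" can be illustrated with a plain-Python toy model. This is only a sketch under stated assumptions and does not use or reflect CocoIndex's actual implementation: `toy_function`, `SHARED_STORE`, and `executions` are hypothetical names introduced here, and the body of `extract_extension` is a guess at what such a function might do, not the code from the `code_embedding` example.

```python
from __future__ import annotations

import os
from typing import Any, Callable

# Hypothetical stand-in for a cache that persists across runs of a flow.
SHARED_STORE: dict[tuple, Any] = {}


def toy_function(*, cache: bool = False, behavior_version: int | None = None) -> Callable:
    """Toy decorator mimicking the documented `cache` / `behavior_version` rules.

    NOT the CocoIndex implementation; it only models the described behavior.
    """
    # The docs state that behavior_version is required when cache is True.
    if cache and behavior_version is None:
        raise ValueError("behavior_version must be set when cache=True")

    def decorator(fn: Callable) -> Callable:
        def wrapper(*args: Any) -> Any:
            if not cache:
                return fn(*args)
            # The version is part of the key, so bumping it bypasses old entries.
            key = (fn.__name__, behavior_version, args)
            if key not in SHARED_STORE:
                SHARED_STORE[key] = fn(*args)
            return SHARED_STORE[key]

        return wrapper

    return decorator


executions: list[str] = []  # records every real (non-cached) execution


@toy_function(cache=True, behavior_version=1)
def extract_extension(filename: str) -> str:
    executions.append("v1")
    return os.path.splitext(filename)[1]


first = extract_extension("main.py")   # executes the function body
second = extract_extension("main.py")  # served from the cache


@toy_function(cache=True, behavior_version=2)
def extract_extension(filename: str) -> str:  # same name, new behavior version
    executions.append("v2")
    return os.path.splitext(filename)[1].lower()


third = extract_extension("main.py")   # version changed, so it executes again
```

In this model, the second call with `behavior_version=1` never reaches the function body, while the call after the bump to `behavior_version=2` does. That is the reason to bump the version whenever the function's logic changes: otherwise stale cached results would keep being reused.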