# Extending Bifrost

With the overview of Bifrost and pipeline construction out of the way, we can turn our attention to extending Bifrost's core functionality. There are currently three options:

1. **Pure Python implementation** within the low- or high-level APIs
2. **Just-in-time compilation** via `bifrost.map` for GPU-accelerated custom operations
3. **C/C++/CUDA module** with Python wrapper for maximum performance

This tutorial demonstrates each approach with working examples.

## Option 1: Pure Python Custom Block

The simplest way to extend Bifrost is to create a custom block using pure Python. This is ideal for operations that don't require GPU acceleration or where development speed is more important than performance.

Here's an example of a custom `TransformBlock` that applies a simple scaling operation:

In [None]:
import bifrost as bf
import bifrost.pipeline as bfp
import numpy as np

class ScaleBlock(bfp.TransformBlock):
    """A custom block that scales input data by a constant factor.
    
    Args:
        iring: Input ring
        scale_factor: Multiplicative scale factor to apply
    """
    def __init__(self, iring, scale_factor=1.0, *args, **kwargs):
        super(ScaleBlock, self).__init__(iring, *args, **kwargs)
        self.scale_factor = scale_factor
    
    def on_sequence(self, iseq):
        """Called when a new sequence starts. Return output header."""
        ihdr = iseq.header
        # Copy input header and add our scale factor for documentation
        ohdr = ihdr.copy()
        ohdr['scale_applied'] = self.scale_factor
        return ohdr
    
    def on_data(self, ispan, ospan):
        """Process each span of data."""
        idata = ispan.data
        odata = ospan.data
        # Scale on the CPU with NumPy, writing straight into the output span;
        # this assumes both spans are in system (host) memory
        np.multiply(idata, self.scale_factor, out=odata)

# Confirm that the class definition executed (the block is exercised in the testing section below)
print("ScaleBlock defined successfully!")
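
The calling pattern behind `on_sequence` and `on_data` can be tried out without Bifrost installed. The sketch below is a toy model, not Bifrost code: `MiniScale` and `run_sequence` are hypothetical names, the class does not derive from `TransformBlock`, and the hooks receive a plain header dict and plain NumPy slices instead of sequence and span objects. It only illustrates the idea that the header hook runs once per sequence and the data hook runs once per span.

```python
import numpy as np


class MiniScale:
    """Hypothetical stand-in for ScaleBlock (no Bifrost dependency)."""

    def __init__(self, scale_factor=1.0):
        self.scale_factor = scale_factor

    def on_sequence(self, ihdr):
        # Copy the input header so the caller's dict is left untouched
        ohdr = dict(ihdr)
        ohdr['scale_applied'] = self.scale_factor
        return ohdr

    def on_data(self, idata, odata):
        # Write the scaled values directly into the output buffer
        np.multiply(idata, self.scale_factor, out=odata)


def run_sequence(block, header, data, span_size):
    """Toy driver: one on_sequence call, then one on_data call per span."""
    ohdr = block.on_sequence(header)
    out = np.empty_like(data)
    for start in range(0, len(data), span_size):
        stop = start + span_size
        # Slices are views, so on_data fills `out` span by span
        block.on_data(data[start:stop], out[start:stop])
    return ohdr, out


header = {'name': 'demo'}
data = np.arange(10, dtype=np.float32)
ohdr, out = run_sequence(MiniScale(2.5), header, data, span_size=4)
print(ohdr)
print(out)
```

Because the last span is shorter than `span_size`, the example also shows that the data hook should not assume a fixed span length.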

## Option 2: Using bifrost.map for GPU Acceleration

For GPU-accelerated custom operations, `bifrost.map` provides just-in-time compilation of custom CUDA code. You write the per-element body of the operation as a string of CUDA C++; Bifrost wraps it in a kernel, compiles it at run time, and launches it over the arrays you pass in.

Here's an example that computes a custom transformation on the GPU:

In [None]:
# Example: Using bifrost.map for a custom GPU operation
# This computes: output = sqrt(abs(input)) * sign(input)

import bifrost.map as bf_map

# Define a custom transformation using CUDA code
# The 'a' and 'b' variables are automatically mapped to input/output arrays
custom_kernel = """
// Compute signed square root: preserves sign while taking sqrt of magnitude
b = sqrt(abs(a)) * (a >= 0 ? 1 : -1);
"""

def apply_signed_sqrt(input_array, output_array):
    """Apply a signed square root transformation on GPU.
    
    Args:
        input_array: Input bifrost ndarray on GPU
        output_array: Output bifrost ndarray on GPU (same shape)
    """
    # bf.map is the function re-exported at package level (bf was imported earlier)
    bf.map(custom_kernel, {'a': input_array, 'b': output_array})

print("bifrost.map example defined!")
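
A CPU reference makes it easier to check the GPU result. The sketch below reproduces the kernel expression `sqrt(abs(a)) * (a >= 0 ? 1 : -1)` with NumPy; `signed_sqrt_reference` is a hypothetical helper name, not part of Bifrost, and it assumes real-valued input.

```python
import numpy as np


def signed_sqrt_reference(a):
    """NumPy equivalent of the kernel: sqrt(|a|) with the sign of a restored."""
    a = np.asarray(a)
    # Mirror the kernel's ternary: +1 for a >= 0, -1 otherwise
    sign = np.where(a >= 0, 1, -1).astype(a.dtype)
    return np.sqrt(np.abs(a)) * sign


x = np.array([-4.0, 0.0, 9.0], dtype=np.float32)
y = signed_sqrt_reference(x)
print(y)  # values: -2, 0, 3
```

To validate `apply_signed_sqrt`, one option is to copy its output array back to host memory and compare it against this reference with `numpy.testing.assert_allclose`.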

### Creating a MapBlock for Pipeline Integration

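To use `bifrost.map` in a pipeline, wrap the call in a `TransformBlock`. The block's spans are passed straight to the kernel, so the input ring should already be in GPU memory: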

In [None]:
class SignedSqrtBlock(bfp.TransformBlock):
    """GPU-accelerated signed square root using bifrost.map.
    
    Computes: output = sqrt(abs(input)) * sign(input)
    """
    def __init__(self, iring, *args, **kwargs):
        super(SignedSqrtBlock, self).__init__(iring, *args, **kwargs)
        # Store the kernel source once so every span reuses the same string
        self.kernel = "b = sqrt(abs(a)) * (a >= 0 ? 1 : -1);"
    
    def on_sequence(self, iseq):
        return iseq.header
    
    def on_data(self, ispan, ospan):
        bf.map(self.kernel,
               {'a': ispan.data, 'b': ospan.data})

print("SignedSqrtBlock defined successfully!")

## Option 3: C/C++/CUDA Extension

For maximum performance or when wrapping existing libraries, you can create a native C/C++ extension. There are two approaches:

- **External extension** (recommended): Build your extension separately and link against installed Bifrost
- **In-tree extension**: Add your code directly to the Bifrost source tree

### Approach A: External Extension (Recommended)

This approach lets you develop extensions independently without modifying Bifrost. Use the `bifrost-config` tool to get the correct compiler flags.

#### Check bifrost-config

First, verify that `bifrost-config` is available:

```bash
$ bifrost-config --help
Usage: bifrost-config [OPTIONS]

Options:
  --help          Show this help message
  --version       Print Bifrost version
  --cflags        Print C/C++ compiler flags
  --ldflags       Print linker flags
  --libs          Print libraries to link
  --have-cuda     Print yes if CUDA support is enabled
  --nvccflags     Print NVCC compiler flags (if CUDA enabled)
  ...

$ bifrost-config --cflags --libs
-I/usr/local/include -O3 ... -L/usr/local/lib -lbifrost ...
```
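
The same flags can be collected from Python, for example in a build script. The sketch below relies only on the `--cflags` and `--libs` options listed in the help text above; `query_bifrost_config` is a hypothetical helper, not part of Bifrost, and it returns `None` when the tool is not on `PATH` so that the caller can decide how to fail.

```python
import shlex
import shutil
import subprocess


def query_bifrost_config(*flags, tool="bifrost-config"):
    """Run the config tool with the given flags and return its output as a token list.

    Returns None if the tool cannot be found on PATH.
    """
    exe = shutil.which(tool)
    if exe is None:
        return None
    result = subprocess.run([exe, *flags], check=True,
                            capture_output=True, text=True)
    # shlex.split keeps quoted paths with spaces together as one token
    return shlex.split(result.stdout)


cflags = query_bifrost_config("--cflags")
if cflags is None:
    print("bifrost-config not found on PATH")
else:
    print("compiler flags:", cflags)
```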

#### Create the Extension

A minimal extension needs:

**1. Header file (`src/bfscale.h`):**
```c
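#ifndef BFSCALE_H_INCLUDE_GUARD_
#define BFSCALE_H_INCLUDE_GUARD_

#include <bifrost/common.h>
#include <bifrost/array.h>

#ifdef __cplusplus
extern "C" {
#endif

// Multiply every element of `in` by `scale` and store the result in `out`.
BFstatus bfScale(BFarray const* in, BFarray* out, float scale);

#ifdef __cplusplus
} // extern "C"
#endif

#endif // BFSCALE_H_INCLUDE_GUARD_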
```

**2. Implementation (`src/bfscale.cpp`):**
```cpp
#include "bfscale.h"

BFstatus bfScale(BFarray const* in, BFarray* out, float scale) {
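    if (!in || !out) return BF_STATUS_INVALID_POINTER;
    // Both arrays must be float32 in system (host) memory.
    // The loop below assumes contiguous data; strides are not checked.
    if (in->dtype != BF_DTYPE_F32 || out->dtype != BF_DTYPE_F32)
        return BF_STATUS_UNSUPPORTED_DTYPE;
    if (in->space != BF_SPACE_SYSTEM || out->space != BF_SPACE_SYSTEM)
        return BF_STATUS_UNSUPPORTED_SPACE;
    if (in->ndim != out->ndim) return BF_STATUS_INVALID_SHAPE;

    float const* in_data = (float const*)in->data;
    float* out_data = (float*)out->data;

    // Total element count; the output must match the input in every dimension.
    long n = 1;
    for (int i = 0; i < in->ndim; ++i) {
        if (in->shape[i] != out->shape[i]) return BF_STATUS_INVALID_SHAPE;
        n *= in->shape[i];
    }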
    
    for (long i = 0; i < n; ++i) {
        out_data[i] = in_data[i] * scale;
    }
    return BF_STATUS_SUCCESS;
}
```

**3. Makefile using bifrost-config:**
```makefile
BIFROST_CONFIG ?= bifrost-config

CXX      ?= $(shell $(BIFROST_CONFIG) --cxx)
CXXFLAGS := $(shell $(BIFROST_CONFIG) --cflags) -fPIC
LDFLAGS  := $(shell $(BIFROST_CONFIG) --ldflags)
LIBS     := $(shell $(BIFROST_CONFIG) --libs)

libbfscale.so: src/bfscale.o
	$(CXX) -shared -o $@ $^ $(LDFLAGS) $(LIBS)

%.o: %.cpp
	$(CXX) $(CXXFLAGS) -Isrc -c -o $@ $<
```

**4. Python wrapper (`python/bfscale.py`):**
```python
import ctypes
from bifrost.libbifrost import _check
from bifrost.ndarray import asarray
import bifrost.libbifrost_generated as _bf

# NOTE: './' resolves against the current working directory, not this file's location
_lib = ctypes.CDLL('./libbfscale.so')
_lib.bfScale.argtypes = [
    ctypes.POINTER(_bf.BFarray),
    ctypes.POINTER(_bf.BFarray),
    ctypes.c_float,
]
_lib.bfScale.restype = _bf.BFstatus

def scale(input_array, output_array, scale_factor):
    """Scale array elements by a constant factor."""
    in_arr = asarray(input_array)
    out_arr = asarray(output_array)
    _check(_lib.bfScale(
        in_arr.as_BFarray(),
        out_arr.as_BFarray(),
        float(scale_factor)
    ))
    return output_array
```

A complete working example is available in `examples/external_extension/`.
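
When testing a compiled extension, a NumPy reference with the same calling convention is a convenient oracle. The sketch below mirrors the arithmetic of `bfScale` (a float32 multiply written into a caller-supplied output array); `reference_scale` is a hypothetical name, and the exceptions it raises are this sketch's own choice, not the behaviour of the wrapper.

```python
import numpy as np


def reference_scale(input_array, output_array, scale_factor):
    """NumPy stand-in for scale(): out[i] = in[i] * scale_factor in float32."""
    in_arr = np.asarray(input_array)
    if in_arr.dtype != np.float32 or output_array.dtype != np.float32:
        raise TypeError("reference_scale expects float32 arrays")
    if in_arr.shape != output_array.shape:
        raise ValueError("input and output shapes differ")
    # Cast the factor to float32 first, as the C signature takes a float
    np.multiply(in_arr, np.float32(scale_factor), out=output_array)
    return output_array


x = np.array([1.0, 2.0, 3.0, 4.0], dtype=np.float32)
y = np.empty_like(x)
reference_scale(x, y, 2.5)
print(y)  # values: 2.5, 5, 7.5, 10
```

With the extension built, the output of `scale` on the same input can be compared against `y` using `numpy.testing.assert_allclose`.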

### Approach B: In-Tree Extension

If you want to contribute your extension back to Bifrost, or need tight integration with Bifrost internals, you can add code directly to the source tree:

1. Create source file in `src/` (e.g., `src/my_function.cpp`)
2. Create header in `src/bifrost/` (e.g., `src/bifrost/my_function.h`)
3. Add object file to `LIBBIFROST_OBJS` in `src/Makefile`
4. Rebuild Bifrost (`make && make install`)
5. Create Python wrapper in `python/bifrost/`

The ctypesgen tool automatically creates low-level bindings in `bifrost.libbifrost_generated` for any functions declared in `src/bifrost/*.h`.

## Testing Your Extensions

Always test your custom blocks to ensure they work correctly:

In [None]:
# Test the pure Python ScaleBlock
def test_scale_block():
    """Unit test for ScaleBlock."""
    import numpy.testing as npt
    
    from types import SimpleNamespace

    # Create test data
    input_data = np.array([1.0, 2.0, 3.0, 4.0], dtype=np.float32)
    scale_factor = 2.5
    expected = input_data * scale_factor

    # Call on_data directly with lightweight stand-ins for the block and its
    # spans, so the block's arithmetic is tested without building a pipeline.
    # A full integration test would instead run ScaleBlock inside a
    # bfp.Pipeline between a source block and a sink block.
    stub_block = SimpleNamespace(scale_factor=scale_factor)
    ispan = SimpleNamespace(data=input_data)
    ospan = SimpleNamespace(data=np.empty_like(input_data))
    ScaleBlock.on_data(stub_block, ispan, ospan)

    npt.assert_array_almost_equal(ospan.data, expected)
    print("ScaleBlock test passed!")

test_scale_block()

## Summary

| Approach | Pros | Cons | Best For |
|----------|------|------|----------|
| **Pure Python** | Easy to write and debug | Slower for large data | Prototyping, simple operations |
| **bifrost.map** | GPU-accelerated, flexible | Requires CUDA knowledge | Custom GPU kernels |
| **External C/C++** | Maximum performance, independent development | Requires C knowledge | Production code, wrapping libraries |
| **In-tree C/C++** | Tight Bifrost integration, auto-generated bindings | Must modify Bifrost source | Contributing to Bifrost |

## Additional Resources

- `examples/external_extension/` - Complete working example of an external extension
- `bifrost-config --help` - Show all available build configuration options
- `src/bifrost/*.h` - Bifrost C API headers for reference