Merged
30 changes: 30 additions & 0 deletions .github/workflows/verify-generate.yaml
@@ -0,0 +1,30 @@
name: "Verify Generated Code"
on:
  pull_request:

jobs:
  verify-generate:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout code
        uses: actions/checkout@v6

      - name: Set up Go
        uses: actions/setup-go@v6
        with:
          go-version-file: 'go.mod'

      - name: Install Task
        uses: go-task/setup-task@v1
        with:
          version: 3.x

      - name: Install dependencies
        run: go mod download

      - name: Run task generate
        run: task generate

      - name: Check for diffs
        run: |
          git diff --exit-code || (echo "❌ Generated code is out of sync. Please run 'task generate' and commit the changes." && exit 1)
2 changes: 2 additions & 0 deletions README.md
@@ -6,3 +6,5 @@ cluster resources in real-time via audit log events consumed from NATS
JetStream, with declarative indexing policies controlling what gets indexed
using CEL-based filtering. The service integrates natively with kubectl/RBAC and
targets Meilisearch as the search backend.

![](./docs/diagrams/SearchServiceContext.png)
14 changes: 14 additions & 0 deletions Taskfile.yaml
@@ -1,5 +1,10 @@
version: '3'

includes:
  docs:
    taskfile: ./docs/Taskfile.yaml
    dir: ./docs

vars:
  TOOL_DIR: "{{.USER_WORKING_DIR}}/bin"
  IMAGE_NAME: "ghcr.io/datum-cloud/search"
@@ -72,6 +77,8 @@ tasks:
  # Code generation tasks
  generate:
    desc: Generate deepcopy and client code
    deps:
      - task: docs:generate
    cmds:
      - |
        set -e
@@ -116,3 +123,10 @@ tasks:
      - go vet ./...
      - echo "✅ Vet complete"
    silent: true

  # Architecture diagram tasks
  diagrams:
    desc: Generate architecture diagrams from PlantUML
    cmds:
      - task: docs:diagrams
    silent: true
68 changes: 68 additions & 0 deletions docs/Taskfile.yaml
@@ -0,0 +1,68 @@
version: '3'

vars:
  DIAGRAMS_DIR: "{{.USER_WORKING_DIR}}/docs/diagrams"
  OUTPUT_FORMAT: "png"

tasks:
  generate:
    desc: Generate all documentation artifacts (diagrams, etc.)
    cmds:
      - task: diagrams:render
    silent: true

  diagrams:
    desc: Generate all architecture diagrams from PlantUML
    cmds:
      - task: diagrams:render
    silent: true

  diagrams:render:
    desc: Render PlantUML diagrams to PNG format using Docker
    cmds:
      - |
        set -e
        echo "Rendering PlantUML diagrams..."
        echo ""

        # Check if PlantUML files exist
        if [ ! -f "{{.DIAGRAMS_DIR}}/context.puml" ] || [ ! -f "{{.DIAGRAMS_DIR}}/container.puml" ]; then
          echo "❌ Error: PlantUML source files not found in {{.DIAGRAMS_DIR}}"
          exit 1
        fi

        # Render using Docker (no local installation required)
        docker run --rm \
          -v "{{.DIAGRAMS_DIR}}":/data \
          plantuml/plantuml:latest \
          -t{{.OUTPUT_FORMAT}} \
          /data/*.puml

        echo ""
        echo "✅ Diagrams rendered in {{.DIAGRAMS_DIR}}"
        echo ""
        echo "Generated files:"
        ls -1 {{.DIAGRAMS_DIR}}/*.{{.OUTPUT_FORMAT}} 2>/dev/null | xargs -n1 basename || echo "No output files found"
    silent: true

  diagrams:clean:
    desc: Remove generated diagram files
    cmds:
      - |
        rm -f {{.DIAGRAMS_DIR}}/*.png {{.DIAGRAMS_DIR}}/*.svg
        echo "✅ Generated diagram files removed"
    silent: true

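  diagrams:validate:
    desc: Validate PlantUML syntax using Docker
    cmds:
      - |
        set -e
        echo "Validating PlantUML diagrams..."
        # -checkonly checks the syntax of the given files without generating
        # images; -syntax reads a diagram from standard input instead.
        docker run --rm \
          -v "{{.DIAGRAMS_DIR}}":/data \
          plantuml/plantuml:latest \
          -checkonly \
          /data/*.puml
        echo "✅ All diagrams are valid"
    silent: true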
90 changes: 88 additions & 2 deletions docs/architecture.md
@@ -1,3 +1,89 @@
# Architecture
# Search Architecture

TODO: Document service architecture
## Overview

The Search service is a Kubernetes-native API built on the [aggregated API
server framework][apiserver-aggregation] that provides advanced resource
discovery capabilities through field filtering and full-text search. It enables
platform users to efficiently query and locate resources across the cluster
using powerful indexing and real-time event processing.

## Architecture Diagram

> [!NOTE]
>
> Below is a [C4 container diagram][c4] of the service and its dependencies.
> This is meant to model individual components in the system and their
> responsibilities. It does not aim to provide visibility into external system
> components that may be a dependency of this system.

<p align="center">
<img src="./diagrams/SearchServiceContainers.png" alt="Search service component software architecture diagram">
</p>

[c4]: https://c4model.com

## Components

### Search API Server

**Purpose**: Expose search capabilities as native Kubernetes APIs

**Responsibilities**:
- Register custom API endpoints under `search.miloapis.com/v1alpha1`
- Handle authentication and authorization via Kubernetes RBAC
- Provide RESTful API for search queries
- Manage custom resource definitions for the search service

**Query Types**:
- **Field Filtering**: Exact match, prefix, range queries on structured fields
- **Full-Text Search**: Fuzzy matching, phrase queries, relevance scoring
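
The exact-match and prefix filters listed above can be illustrated with a small Go sketch. This is not the service's query API: the `Document` and `FieldFilter` types, the operator names, and the sample field values are all hypothetical and exist only to show how structured-field predicates might be combined with a logical AND.

```go
package main

import (
	"fmt"
	"strings"
)

// Document is a hypothetical flattened view of an indexed resource,
// keyed by field path. It is not a type from the search service.
type Document map[string]string

// FieldFilter is a hypothetical predicate over one structured field.
type FieldFilter struct {
	Field string
	Op    string // "exact" or "prefix" (assumed operator names)
	Value string
}

// Matches reports whether doc satisfies the filter. A missing field or an
// unknown operator never matches.
func (f FieldFilter) Matches(doc Document) bool {
	got, ok := doc[f.Field]
	if !ok {
		return false
	}
	switch f.Op {
	case "exact":
		return got == f.Value
	case "prefix":
		return strings.HasPrefix(got, f.Value)
	default:
		return false
	}
}

// MatchAll returns true only if every filter matches (logical AND).
func MatchAll(doc Document, filters []FieldFilter) bool {
	for _, f := range filters {
		if !f.Matches(doc) {
			return false
		}
	}
	return true
}

func main() {
	doc := Document{
		"metadata.namespace": "team-a",
		"metadata.name":      "web-frontend",
	}
	filters := []FieldFilter{
		{Field: "metadata.namespace", Op: "exact", Value: "team-a"},
		{Field: "metadata.name", Op: "prefix", Value: "web-"},
	}
	fmt.Println(MatchAll(doc, filters))
}
```

In a real deployment these predicates would presumably be translated into the index backend's own filter syntax rather than evaluated in process.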

### Resource Indexer

**Purpose**: Real-time indexing of platform resources from audit logs

**Responsibilities**:
- Subscribe to [NATS JetStream] audit log topic
- Filter events based on active index policies
- Evaluate [CEL expressions][CEL] for conditional indexing
- Extract and transform resource data for indexing
- Write to index backend with proper error handling and retries
- Manage index lifecycle (creation, updates, deletion)
- Bootstrap indexes from existing state

### Controller Manager

**Purpose**: Manage and validate resources for the search service

**Responsibilities**:
- Validate and activate index policies
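
The validate-then-activate step can be sketched in Go as follows. Everything here is an assumption made for illustration: the `IndexPolicy` struct and its fields are hypothetical and are not the API type served under `search.miloapis.com/v1alpha1`, and the checks are purely structural (no CEL compilation is attempted).

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// IndexPolicy is a hypothetical, simplified policy shape used only for
// this sketch; the real resource may look quite different.
type IndexPolicy struct {
	Name      string
	Kind      string // resource kind the policy targets (assumed field)
	Condition string // CEL expression source (assumed field); empty is treated here as "index everything"
	Active    bool
}

// Validate performs structural checks only and reports every problem found.
func Validate(p IndexPolicy) error {
	var problems []string
	if strings.TrimSpace(p.Name) == "" {
		problems = append(problems, "name is required")
	}
	if strings.TrimSpace(p.Kind) == "" {
		problems = append(problems, "kind is required")
	}
	if len(problems) > 0 {
		return errors.New(strings.Join(problems, "; "))
	}
	return nil
}

// Activate returns a copy of the policy marked active if it is valid,
// and the unchanged policy plus an error otherwise.
func Activate(p IndexPolicy) (IndexPolicy, error) {
	if err := Validate(p); err != nil {
		return p, fmt.Errorf("policy %q rejected: %w", p.Name, err)
	}
	p.Active = true
	return p, nil
}

func main() {
	ok, err := Activate(IndexPolicy{Name: "pods", Kind: "Pod"})
	fmt.Println(ok.Active, err)

	_, err = Activate(IndexPolicy{Name: "incomplete"})
	fmt.Println(err)
}
```

In a controller this logic would typically run inside a reconcile loop and record the outcome on the resource's status, which is omitted from the sketch.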

### Index Backend Storage

**Purpose**: High-performance full-text search and indexing

**Responsibilities**:
- Filter on structured metadata (namespace, name, labels, annotations)
- Store and serve full-text searchable content

> [!NOTE]
>
> We're targeting [Meilisearch] as our first integration backend for indexed
> storage.

[Meilisearch]: https://www.meilisearch.com
[apiserver-aggregation]:
https://kubernetes.io/docs/concepts/extend-kubernetes/api-extension/apiserver-aggregation/
[NATS JetStream]: https://nats.io
[CEL]: https://cel.dev

### etcd

**Purpose**: Distributed key-value store that provides reliable data storage

**Responsibilities**:
- Store control plane resources for the Search API server (e.g. index policies)

[etcd]: https://etcd.io
Binary file added docs/diagrams/SearchServiceContainers.png
Binary file added docs/diagrams/SearchServiceContext.png
27 changes: 27 additions & 0 deletions docs/diagrams/container.puml
@@ -0,0 +1,27 @@
@startuml SearchServiceContainers
!include https://raw.githubusercontent.com/plantuml-stdlib/C4-PlantUML/master/C4_Container.puml

Person(platformUser, "Platform User", "An operator who needs to query and locate platform resources")

System_Ext(platformAPI, "Platform API", "Manages cluster resources and provides aggregation layer for custom APIs")
System_Ext(eventBus, "Event Streaming", "Event stream of audit logs collected from the apiserver")

System_Boundary(searchService, "Search Service") {
Container(searchAPI, "Search APIServer", "Go, Kubernetes API Server Framework", "Provides a platform native API for searching resources")
Container(indexerService, "Resource Indexer", "Go", "Indexes platform resources based on index policies")
Container(controllerManager, "Controller Manager", "Go, Kubernetes Controller", "Manages index policies in the search service")
System(indexBackend, "Index Backend Storage", "High-performance full-text search engine for indexed resources")
System(etcd, "etcd", "Strongly consistent, distributed key-value store for storing metadata")
}

Rel_D(platformUser, searchAPI, "Submits search queries and manages index policies", "HTTPS/JSON/UI")
Rel(searchAPI, indexBackend, "Executes search queries", "HTTPS/JSON")
Rel_L(controllerManager, searchAPI, "Watches search resources", "HTTPS/Watch API")
Rel_L(searchAPI, etcd, "Stores control plane resources in", "gRPC")
Rel_L(indexerService, eventBus, "Consumes events from", "NATS Protocol")
Rel_U(indexerService, searchAPI, "Retrieves index policies from", "HTTPS")
Rel_D(platformAPI, eventBus, "Sends audit log events to", "HTTPS")
Rel_R(indexerService, platformAPI, "Bootstraps indexes from resources in", "HTTPS")
Rel_R(indexerService, indexBackend, "Writes indexed documents", "HTTPS/JSON")

@enduml
20 changes: 20 additions & 0 deletions docs/diagrams/context.puml
@@ -0,0 +1,20 @@
@startuml SearchServiceContext
!include https://raw.githubusercontent.com/plantuml-stdlib/C4-PlantUML/master/C4_Context.puml

left to right direction

Person(platformUser, "Platform User", "An operator who needs to query and locate platform resources")

System(searchService, "Search Service", "Provides advanced resource discovery through field filtering and full-text search of platform resources")

System_Ext(platformAPI, "Platform API", "Manages core resources, authn, authz, and provides aggregation layer for custom APIs")
System_Ext(eventStream, "Event Streaming System", "Streams audit log events from the platform API")

Rel_R(platformUser, searchService, "Searches resources using CLI, UI, or API clients", "HTTPS/JSON")
Rel(platformUser, platformAPI, "Manages platform resources", "kubectl/HTTPS")

Rel_D(searchService, platformAPI, "Indexes resources from", "HTTPS")
Rel(searchService, eventStream, "Subscribes to audit log events", "NATS Protocol")
Rel(platformAPI, eventStream, "Publishes audit logs to")

@enduml