📋 Epic: Custom Metadata Fields - Rich Extensible Metadata System #1359

@crivetimihai

Goal

Implement custom metadata fields that allow administrators to extend MCP servers, tools, resources, and A2A agents with rich, typed metadata beyond simple tags. This enables organizations to track cost centers, owners, SLAs, compliance requirements, business context, and other custom attributes without code changes.

Why Now?

As ContextForge deployments scale across organizations, different teams need different metadata:

  1. Business Context: Cost center, department, project code, budget tracking
  2. Operational Metadata: Owner email, on-call rotation, SLA tier, maintenance windows
  3. Compliance: Data classification, retention policy, audit requirements, regulatory framework
  4. Integration: External IDs (Jira ticket, ServiceNow CMDB, PagerDuty service)
  5. Lifecycle: Created date, last audit date, deprecation date, EOL timeline
  6. Custom Fields: Organization-specific attributes that vary by industry/use case

By making metadata fields configurable and typed, operators can extend ContextForge to fit their operational model without forking the codebase.


📖 User Stories

US-1: Platform Admin - Define Custom Metadata Schema

As a Platform Administrator
I want to define custom metadata fields with types and validation rules
So that users can attach rich, structured metadata to resources

Acceptance Criteria:

Given the configuration in .env:
  MCPGATEWAY_CUSTOM_METADATA_ENABLED=true
  MCPGATEWAY_METADATA_SCHEMA_FILE=metadata_schema.yaml

And metadata_schema.yaml contains:
  fields:
    - name: cost_center
      type: string
      required: true
      pattern: "^CC-[0-9]{4}$"
      description: "Cost center code (CC-XXXX)"
    
    - name: owner_email
      type: string
      required: true
      format: email
      description: "Resource owner email"
    
    - name: sla_tier
      type: enum
      required: false
      options: [gold, silver, bronze]
      default: bronze
      description: "Service level agreement tier"
    
    - name: monthly_budget
      type: number
      required: false
      min: 0
      max: 1000000
      description: "Monthly budget in USD"
    
    - name: is_production
      type: boolean
      required: true
      default: false
      description: "Production environment flag"
    
    - name: next_audit_date
      type: date
      required: false
      description: "Next scheduled audit date"

When a user creates a server with metadata:
  {
    "cost_center": "CC-1234",
    "owner_email": "team@example.com",
    "sla_tier": "gold",
    "is_production": true
  }
Then the server should be created successfully

When a user creates a server with invalid metadata:
  {
    "cost_center": "INVALID",
    "owner_email": "not-an-email"
  }
Then the API should return 422 Unprocessable Entity
And the error should describe validation failures

Technical Requirements:

  • YAML/JSON schema for metadata field definitions
  • Support types: string, number, boolean, date, enum, array, object
  • Validation: required, pattern, min/max, format, options
  • Per-entity-type schemas (server, tool, resource, agent)
  • Default values for optional fields
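
The rules listed above (required, default, pattern, options, min/max) could be applied roughly as follows. This is a minimal sketch, not the planned MetadataValidator: the function name `validate_metadata`, the `EXAMPLE_FIELDS` list, and the error message wording are all assumptions made for illustration, and `format` checks are left out.

```python
import re
from typing import Any

# Hypothetical field definitions mirroring part of the example schema above.
EXAMPLE_FIELDS: list[dict[str, Any]] = [
    {"name": "cost_center", "required": True, "pattern": "^CC-[0-9]{4}$"},
    {"name": "sla_tier", "options": ["gold", "silver", "bronze"], "default": "bronze"},
    {"name": "monthly_budget", "min": 0, "max": 1000000},
    {"name": "is_production", "required": True, "default": False},
]


def validate_metadata(
    metadata: dict[str, Any], fields: list[dict[str, Any]]
) -> tuple[dict[str, Any], list[str]]:
    """Return (metadata with defaults applied, list of error messages)."""
    validated = dict(metadata)
    errors: list[str] = []
    for field in fields:
        name = field["name"]
        if name not in validated:
            # A default satisfies a missing field, even a required one.
            if "default" in field:
                validated[name] = field["default"]
            elif field.get("required", False):
                errors.append(f"{name}: required field is missing")
            continue
        value = validated[name]
        pattern = field.get("pattern")
        if pattern and not (isinstance(value, str) and re.fullmatch(pattern, value)):
            errors.append(f"{name}: must match pattern {pattern}")
        if "options" in field and value not in field["options"]:
            errors.append(f"{name}: must be one of {field['options']}")
        if "min" in field or "max" in field:
            # bool is a subclass of int in Python, so exclude it explicitly.
            if isinstance(value, bool) or not isinstance(value, (int, float)):
                errors.append(f"{name}: must be a number")
            else:
                if "min" in field and value < field["min"]:
                    errors.append(f"{name}: must be >= {field['min']}")
                if "max" in field and value > field["max"]:
                    errors.append(f"{name}: must be <= {field['max']}")
    return validated, errors
```

Collecting every error before returning, rather than stopping at the first one, lets the API report all validation failures in a single response.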

US-2: Developer - Edit Custom Metadata in Admin UI

As a Developer using the Admin UI
I want to see and edit custom metadata fields with appropriate input controls
So that I can manage resource metadata without writing JSON

Acceptance Criteria:

Given custom metadata fields are configured:
  - cost_center (string, pattern)
  - owner_email (string, email)
  - sla_tier (enum: gold/silver/bronze)
  - monthly_budget (number)
  - is_production (boolean)
  - next_audit_date (date)

When I navigate to Create Server page
Then I should see a "Custom Metadata" section with:
  - Text input for cost_center (with pattern hint)
  - Email input for owner_email
  - Dropdown for sla_tier
  - Number input for monthly_budget
  - Checkbox for is_production
  - Date picker for next_audit_date

When I enter invalid data (e.g., cost_center: "INVALID")
Then I should see inline validation errors
And the form should not submit

When I submit valid metadata
Then the server should be created with custom_metadata populated

Technical Requirements:

  • Fetch metadata schema from GET /metadata/schema
  • Dynamically render form fields based on type
  • Client-side validation using schema rules
  • Support for all field types with appropriate controls
  • Show descriptions as tooltips/help text

US-3: Operations Engineer - Search by Custom Metadata

As an Operations Engineer
I want to search and filter resources by custom metadata
So that I can find all resources for a cost center, owner, or SLA tier

Acceptance Criteria:

Given servers exist with custom metadata:
  - Server A: cost_center="CC-1234", sla_tier="gold"
  - Server B: cost_center="CC-1234", sla_tier="silver"
  - Server C: cost_center="CC-5678", sla_tier="gold"

When I query: GET /servers?metadata.cost_center=CC-1234
Then I should receive: [Server A, Server B]

When I query: GET /servers?metadata.sla_tier=gold
Then I should receive: [Server A, Server C]

When I query: GET /servers?metadata.cost_center=CC-1234&metadata.sla_tier=gold
Then I should receive: [Server A]

When I query: GET /servers?metadata.is_production=true
Then I should receive servers where is_production is true

Technical Requirements:

  • Support filtering by custom metadata fields
  • Query parameter format: metadata.field_name=value
  • Support multiple metadata filters (AND logic)
  • Index custom_metadata JSONB column for performance
  • Support type-aware filtering (numbers, dates, booleans)
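
Query parameters arrive as strings, so type-aware filtering needs a coercion step before comparison. The sketch below shows one way to extract `metadata.field_name=value` pairs and combine them with AND logic. The helper names (`parse_metadata_filters`, `coerce_value`, `matches_filters`) and the accepted boolean spellings are assumptions, not part of the proposal.

```python
from datetime import date
from typing import Any

PREFIX = "metadata."


def coerce_value(raw: str, field_type: str) -> Any:
    """Convert a raw query-string value to the type declared in the schema."""
    if field_type == "boolean":
        lowered = raw.lower()
        if lowered in ("true", "1"):
            return True
        if lowered in ("false", "0"):
            return False
        raise ValueError(f"not a boolean: {raw}")
    if field_type == "number":
        try:
            return int(raw)
        except ValueError:
            return float(raw)  # raises ValueError if not numeric at all
    if field_type == "date":
        # Validate, then keep the ISO string form that JSON storage would hold.
        return date.fromisoformat(raw).isoformat()
    return raw


def parse_metadata_filters(params: dict[str, str], field_types: dict[str, str]) -> dict[str, Any]:
    """Pick out metadata.* parameters and coerce each value by field type."""
    filters: dict[str, Any] = {}
    for key, raw in params.items():
        if not key.startswith(PREFIX):
            continue  # unrelated parameter such as pagination
        name = key[len(PREFIX):]
        if name not in field_types:
            raise ValueError(f"unknown metadata field: {name}")
        filters[name] = coerce_value(raw, field_types[name])
    return filters


def matches_filters(metadata: dict[str, Any], filters: dict[str, Any]) -> bool:
    """AND logic: every filter must match exactly."""
    return all(metadata.get(name) == value for name, value in filters.items())
```

With the three servers from the acceptance criteria, the filter `{"cost_center": "CC-1234", "sla_tier": "gold"}` matches only Server A.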

US-4: Compliance Officer - Audit Metadata Changes

As a Compliance Officer
I want to track changes to custom metadata fields
So that I can audit who changed cost centers, owners, or compliance attributes

Acceptance Criteria:

Given a server exists with metadata:
  {
    "cost_center": "CC-1234",
    "owner_email": "team@example.com"
  }

When a user updates the server metadata to:
  {
    "cost_center": "CC-5678",
    "owner_email": "new-team@example.com"
  }

Then one audit log entry per changed field should be created, each with:
  - timestamp
  - user_id
  - resource_type: "server"
  - resource_id
And the field-level values should be:
  - field_name: "cost_center", old_value: "CC-1234", new_value: "CC-5678"
  - field_name: "owner_email", old_value: "team@example.com", new_value: "new-team@example.com"

And I can query: GET /admin/audit/metadata-changes
And export the audit log as CSV

Technical Requirements:

  • Track metadata changes in audit log table
  • Store before/after values for each field
  • Link to user who made the change
  • Filterable by entity type, field name, date range
  • Export audit trail for compliance reports
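
Storing before/after values per field implies diffing the old and new metadata on every update. A minimal sketch of that diff is shown below; `diff_metadata` is a hypothetical helper, and the returned dictionaries only cover the field-level columns (timestamp, user, and entity identifiers would be added by the caller).

```python
from typing import Any


def diff_metadata(old: dict[str, Any], new: dict[str, Any]) -> list[dict[str, Any]]:
    """Return one change record per field whose value differs.

    A field that is added or removed shows up with None on the missing side,
    so this sketch cannot distinguish "absent" from an explicit null.
    """
    changes: list[dict[str, Any]] = []
    for name in sorted(old.keys() | new.keys()):
        before, after = old.get(name), new.get(name)
        if before != after:
            changes.append({"field_name": name, "old_value": before, "new_value": after})
    return changes
```

For the update in the acceptance criteria this yields two records, one for cost_center and one for owner_email. Unchanged fields produce no record.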

US-5: SRE - Bulk Update Metadata

As an SRE
I want to bulk update custom metadata across multiple resources
So that I can migrate cost centers or update owners for entire teams

Acceptance Criteria:

Given 50 servers have metadata: {"cost_center": "CC-1234"}

When I execute bulk update:
  POST /servers/bulk-metadata-update
  {
    "filter": {"metadata.cost_center": "CC-1234"},
    "updates": {"cost_center": "CC-5678", "owner_email": "new-owner@example.com"}
  }

Then all 50 servers should have updated metadata
And each update should be audited
And a summary should be returned: {updated: 50, failed: 0}

When validation fails for some resources
Then partial success should be allowed
And failed resources should be listed in response

Technical Requirements:

  • Bulk update endpoint with filtering
  • Transactional updates where possible
  • Partial success handling with rollback options
  • Audit trail for bulk operations
  • Rate limiting to prevent abuse
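
Partial success means each resource is validated and updated independently, with failures collected into the summary. The following in-memory sketch illustrates that control flow only. The function name, the shape of `resources`, and the `validate` callback are assumptions; a real implementation would operate on database rows and write audit entries.

```python
from typing import Any, Callable


def bulk_update_metadata(
    resources: list[dict[str, Any]],
    updates: dict[str, Any],
    validate: Callable[[dict[str, Any]], list[str]],
) -> dict[str, Any]:
    """Apply updates to each resource; skip and report those that fail validation."""
    summary: dict[str, Any] = {"updated": 0, "failed": 0, "errors": []}
    for resource in resources:
        # Build the candidate first so a failed validation leaves the resource untouched.
        candidate = {**resource["custom_metadata"], **updates}
        problems = validate(candidate)
        if problems:
            summary["failed"] += 1
            summary["errors"].append({"id": resource["id"], "errors": problems})
            continue
        resource["custom_metadata"] = candidate
        summary["updated"] += 1
    return summary
```

Validating the merged candidate, rather than the update payload alone, catches cases where the update is valid in isolation but the resulting metadata is not.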

US-6: Integration Developer - Sync External Metadata

As an Integration Developer
I want to sync metadata from external systems (CMDB, ServiceNow)
So that ContextForge metadata stays synchronized with the source of truth

Acceptance Criteria:

Given metadata schema includes:
  - servicenow_ci_id (string, external ID)
  - pagerduty_service_id (string, external ID)

When external system updates occur:
  - ServiceNow CI updated
  - PagerDuty service created

Then I can call: POST /webhooks/metadata-sync
With payload:
  {
    "source": "servicenow",
    "resource_type": "server",
    "resource_id": "uuid",
    "metadata": {
      "servicenow_ci_id": "CI123456",
      "owner_email": "updated@example.com"
    }
  }

And the server metadata should be updated
And conflicts should be logged

Technical Requirements:

  • Webhook endpoint for external metadata sync
  • Conflict resolution strategy (external wins, internal wins, merge)
  • Metadata source tracking (system of record)
  • Sync history and conflict log
  • API for bidirectional sync
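
The three conflict resolution strategies are named but not defined in this issue, so the sketch below is one possible interpretation, stated as an assumption: `external_wins` and `internal_wins` are shallow key-level merges that differ in which side overrides, and `merge` combines nested objects recursively with the external value winning at leaf level. The function names are hypothetical.

```python
from typing import Any


def _deep_merge(base: dict[str, Any], incoming: dict[str, Any]) -> dict[str, Any]:
    """Recursively merge incoming into a copy of base; incoming wins at leaves."""
    result = dict(base)
    for key, value in incoming.items():
        if isinstance(value, dict) and isinstance(result.get(key), dict):
            result[key] = _deep_merge(result[key], value)
        else:
            result[key] = value
    return result


def resolve_metadata(
    internal: dict[str, Any], external: dict[str, Any], strategy: str = "external_wins"
) -> tuple[dict[str, Any], list[str]]:
    """Return (resolved metadata, names of top-level fields that conflicted)."""
    conflicts = [
        key for key in sorted(internal.keys() & external.keys())
        if internal[key] != external[key]
    ]
    if strategy == "external_wins":
        resolved = {**internal, **external}
    elif strategy == "internal_wins":
        resolved = {**external, **internal}
    elif strategy == "merge":
        resolved = _deep_merge(internal, external)
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    return resolved, conflicts
```

Returning the conflict list alongside the result gives the caller what it needs to write the conflict log regardless of which strategy was applied.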

🏗 Architecture

Metadata Schema Structure

# metadata_schema.yaml

# Global settings
version: "1.0"
allow_unknown_fields: false  # Reject undefined fields
merge_strategy: "replace"     # replace | merge | error on conflict

# Entity-specific schemas
entities:
  server:
    fields:
      - name: cost_center
        type: string
        required: true
        pattern: "^CC-[0-9]{4}$"
        description: "Cost center code"
        examples: ["CC-1234", "CC-5678"]
      
      - name: owner_email
        type: string
        required: true
        format: email
        description: "Primary owner email"
      
      - name: sla_tier
        type: enum
        required: false
        options: [gold, silver, bronze]
        default: bronze
        description: "Service level tier"
      
      - name: monthly_budget
        type: number
        required: false
        min: 0
        max: 1000000
        description: "Monthly budget USD"
      
      - name: is_production
        type: boolean
        required: true
        default: false
        description: "Production flag"
      
      - name: next_audit_date
        type: date
        required: false
        description: "Next audit date"
      
      - name: tags_extended
        type: array
        required: false
        item_type: string
        description: "Additional tags"
      
      - name: contacts
        type: object
        required: false
        description: "Contact information"
        properties:
          primary: 
            type: string
            format: email
          secondary:
            type: string
            format: email

  tool:
    fields:
      - name: execution_timeout
        type: number
        required: false
        min: 1
        max: 3600
        description: "Max execution time (seconds)"
      
      - name: rate_limit
        type: number
        required: false
        description: "Requests per minute"

  resource:
    fields:
      - name: data_classification
        type: enum
        required: true
        options: [public, internal, confidential, secret]
        default: internal
        description: "Data sensitivity level"

  a2a_agent:
    fields:
      - name: model_provider
        type: enum
        required: false
        options: [openai, anthropic, custom]
        description: "AI model provider"

Database Schema

-- Existing tables already have custom_metadata JSONB column
-- No migration needed for basic functionality

-- Add GIN index for JSONB queries
CREATE INDEX IF NOT EXISTS idx_servers_custom_metadata_gin 
ON servers USING gin(custom_metadata);

CREATE INDEX IF NOT EXISTS idx_tools_custom_metadata_gin 
ON tools USING gin(custom_metadata);

CREATE INDEX IF NOT EXISTS idx_resources_custom_metadata_gin 
ON resources USING gin(custom_metadata);

CREATE INDEX IF NOT EXISTS idx_a2a_agents_custom_metadata_gin 
ON a2a_agents USING gin(custom_metadata);

-- Metadata change audit log
CREATE TABLE metadata_audit_log (
    id UUID PRIMARY KEY,
    timestamp TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    user_id UUID REFERENCES users(id),
    entity_type VARCHAR(50) NOT NULL, -- 'server', 'tool', 'resource', 'a2a_agent'
    entity_id UUID NOT NULL,
    field_name VARCHAR(100) NOT NULL,
    old_value JSONB,
    new_value JSONB,
    source VARCHAR(50), -- 'ui', 'api', 'sync', 'bulk'
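    request_id VARCHAR(100)
);

-- PostgreSQL does not accept inline INDEX clauses inside CREATE TABLE,
-- so the audit log indexes are created as separate statements.
CREATE INDEX idx_metadata_audit_entity ON metadata_audit_log (entity_type, entity_id);
CREATE INDEX idx_metadata_audit_timestamp ON metadata_audit_log (timestamp);
CREATE INDEX idx_metadata_audit_user_id ON metadata_audit_log (user_id);
CREATE INDEX idx_metadata_audit_field_name ON metadata_audit_log (field_name);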

-- Metadata schema cache (for performance)
CREATE TABLE metadata_schema_cache (
    id INTEGER PRIMARY KEY,
    schema_version VARCHAR(50) NOT NULL,
    entity_type VARCHAR(50) NOT NULL,
    schema_json JSONB NOT NULL,
    loaded_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    UNIQUE(entity_type)
);
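
A GIN index with the default jsonb operator class accelerates containment (`@>`) and key-existence operators, but not equality on a value extracted with `->>`. Metadata filters are therefore best expressed as a single containment test. The sketch below builds such a query; the function name, the table allow-list, and the `%s` placeholder style (as used by psycopg) are assumptions for illustration.

```python
import json
from typing import Any

# Table names cannot be bound as parameters, so restrict them to a known set.
ALLOWED_TABLES = {"servers", "tools", "resources", "a2a_agents"}


def build_metadata_query(table: str, filters: dict[str, Any]) -> tuple[str, tuple[str]]:
    """Build a parameterized JSONB containment query for the given filters.

    Passing all filters as one JSON object gives AND semantics: a row matches
    only if custom_metadata contains every key/value pair.
    """
    if table not in ALLOWED_TABLES:
        raise ValueError(f"unsupported table: {table}")
    sql = f"SELECT * FROM {table} WHERE custom_metadata @> %s::jsonb"
    return sql, (json.dumps(filters, sort_keys=True),)
```

Containment compares JSON values by type, so the number 5 does not match the string "5". Filter values should be coerced to their schema types before this query is built.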

Validation Flow

sequenceDiagram
    participant Client as MCP Client
    participant API as FastAPI Router
    participant Schema as Pydantic Schema
    participant Validator as MetadataValidator
    participant Config as Schema File
    participant DB as Database

    Client->>API: POST /servers (name, custom_metadata)
    API->>Schema: Validate ServerCreate schema
    
    Schema->>Validator: validate_custom_metadata(metadata, entity_type="server")
    
    Validator->>Config: Load schema for entity_type="server"
    Config-->>Validator: field definitions
    
    loop For each metadata field
        Validator->>Validator: Check field defined in schema
        Validator->>Validator: Validate type (string/number/bool/date/enum)
        Validator->>Validator: Check required fields
        Validator->>Validator: Apply validation rules (pattern/min/max/format)
    end
    
    alt All validations pass
        Validator-->>Schema: Valid metadata
        Schema-->>API: Validation passed
        API->>DB: Create server with custom_metadata
        API->>DB: Log metadata in audit log (if configured)
        DB-->>API: Server created
        API-->>Client: 201 Created
    else Validation fails
        Validator-->>Schema: ValidationError(field, reason)
        Schema-->>API: 422 Unprocessable Entity
        API-->>Client: {detail: "Invalid metadata.cost_center: must match pattern ^CC-[0-9]{4}$"}
    end

Configuration Schema (Python)

# mcpgateway/config.py

from pydantic import Field
from pydantic_settings import BaseSettings

class Settings(BaseSettings):
    # ... existing fields ...
    
    # Custom metadata settings
    MCPGATEWAY_CUSTOM_METADATA_ENABLED: bool = Field(
        default=False,
        description="Enable custom metadata fields"
    )
    
    MCPGATEWAY_METADATA_SCHEMA_FILE: str = Field(
        default="metadata_schema.yaml",
        description="Path to metadata schema definition file"
    )
    
    MCPGATEWAY_METADATA_ALLOW_UNKNOWN: bool = Field(
        default=False,
        description="Allow undefined metadata fields"
    )
    
    MCPGATEWAY_METADATA_AUDIT_CHANGES: bool = Field(
        default=True,
        description="Audit metadata changes to database"
    )
    
    MCPGATEWAY_METADATA_CACHE_TTL: int = Field(
        default=3600,
        description="Metadata schema cache TTL (seconds)"
    )
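
The cache TTL setting implies that the parsed schema is held in memory and reloaded once it is older than the configured number of seconds. A minimal sketch of that behavior follows. The class name `TTLSchemaCache` and the injectable `clock` parameter are assumptions; the clock is injectable only so the expiry logic can be tested without sleeping.

```python
import time
from typing import Any, Callable


class TTLSchemaCache:
    """Cache the result of a loader function and reload it after ttl_seconds."""

    def __init__(
        self,
        loader: Callable[[], dict[str, Any]],
        ttl_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._loader = loader
        self._ttl = ttl_seconds
        self._clock = clock
        self._loaded = False
        self._loaded_at = 0.0
        self._value: dict[str, Any] = {}

    def get(self) -> dict[str, Any]:
        """Return the cached schema, reloading if it is missing or expired."""
        now = self._clock()
        if not self._loaded or now - self._loaded_at >= self._ttl:
            self._value = self._loader()
            self._loaded_at = now
            self._loaded = True
        return self._value
```

Using a monotonic clock rather than wall-clock time keeps expiry unaffected by system clock adjustments.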

Pydantic Schema Integration

# mcpgateway/schemas.py

from typing import Any

from pydantic import BaseModel, Field, field_validator

from mcpgateway.services.metadata_validator import validate_custom_metadata

class ServerCreate(BaseModel):
    name: str
    custom_metadata: dict[str, Any] = Field(default_factory=dict)
    # ... other fields ...
    
    @field_validator("custom_metadata")
    @classmethod
    def validate_custom_metadata_field(cls, metadata: dict[str, Any]) -> dict[str, Any]:
        """Validate custom metadata against schema if enabled."""
        from mcpgateway.config import get_settings
        
        settings = get_settings()
        
        if not settings.MCPGATEWAY_CUSTOM_METADATA_ENABLED:
            return metadata
        
        # Validate against schema
        validated = validate_custom_metadata(
            metadata=metadata,
            entity_type="server",
            schema_file=settings.MCPGATEWAY_METADATA_SCHEMA_FILE,
            allow_unknown=settings.MCPGATEWAY_METADATA_ALLOW_UNKNOWN
        )
        
        return validated

# Apply to: ServerUpdate, ToolCreate, ToolUpdate, ResourceCreate, ResourceUpdate, A2AAgentCreate, A2AAgentUpdate

📋 Implementation Tasks

  • Metadata Schema Loader

    • Create MetadataSchemaLoader class to parse YAML/JSON schema files
    • Support field types: string, number, boolean, date, enum, array, object
    • Parse validation rules: required, pattern, min, max, format, options, default
    • Cache parsed schemas in memory (TTL-based)
    • Support entity-specific schemas (server, tool, resource, agent)
    • Validate schema file on startup (fail fast)
    • Support schema versioning
    • Hot reload schema on file change (optional)
  • Metadata Validator

    • Create MetadataValidator class with validation logic
    • Type validation: string, number, boolean, date (ISO 8601), enum, array, object
    • Pattern validation: regex matching for strings
    • Range validation: min/max for numbers, dates
    • Format validation: email, url, uuid, ip
    • Required field checking
    • Default value application
    • Unknown field handling (reject or allow)
    • Comprehensive error messages with field path
    • Unit tests for all validation types (50+ tests)
  • Configuration

    • Add MCPGATEWAY_CUSTOM_METADATA_ENABLED to config.py
    • Add MCPGATEWAY_METADATA_SCHEMA_FILE (path to YAML)
    • Add MCPGATEWAY_METADATA_ALLOW_UNKNOWN (bool)
    • Add MCPGATEWAY_METADATA_AUDIT_CHANGES (bool)
    • Add MCPGATEWAY_METADATA_CACHE_TTL (int)
    • Update .env.example with metadata settings
    • Create example metadata_schema.yaml file
  • Pydantic Schema Integration

    • Add validate_custom_metadata_field() validator to ServerCreate
    • Add validator to ServerUpdate
    • Add validator to ToolCreate, ToolUpdate
    • Add validator to ResourceCreate, ResourceUpdate
    • Add validator to A2AAgentCreate, A2AAgentUpdate
    • Return clear validation errors (field path + reason)
    • Apply default values from schema
  • Database Indexes

    • Create GIN indexes on custom_metadata JSONB columns
    • Indexes for: servers, tools, resources, a2a_agents
    • Test query performance with metadata filtering
    • Document index usage in migration guide
  • Metadata Audit Log

    • Create Alembic migration for metadata_audit_log table
    • Create SQLAlchemy model for MetadataAuditLog
    • Implement audit logging service
    • Track before/after values for each field change
    • Log bulk operations with summary
    • Add indexes for common queries
  • API Endpoints

    • GET /metadata/schema - Return schemas for all entity types
    • GET /metadata/schema/{entity_type} - Entity-specific schema
    • GET /servers?metadata.field=value - Filter by custom metadata
    • POST /servers/bulk-metadata-update - Bulk update metadata
    • GET /admin/audit/metadata-changes - Query audit log
    • POST /webhooks/metadata-sync - External metadata sync
    • Add OpenAPI documentation for all endpoints
    • Add integration tests (10+ tests)
  • Query/Filter Support

    • Add metadata filtering to GET /servers
    • Add metadata filtering to GET /tools
    • Add metadata filtering to GET /resources
    • Add metadata filtering to GET /a2a
    • Support query format: metadata.field=value
    • Support multiple filters (AND logic)
    • Support type-aware filtering (numbers, booleans, dates)
    • Use GIN indexes for performance
    • Add tests for complex queries
  • Admin UI - Dynamic Forms

    • Fetch schema from GET /metadata/schema on page load
    • Dynamically render form fields based on type:
      • String → text input (with pattern validation)
      • Number → number input (with min/max)
      • Boolean → checkbox
      • Date → date picker
      • Enum → dropdown/select
      • Array → tag input (multi-value)
      • Object → nested fieldset
    • Show field descriptions as tooltips/help text
    • Client-side validation using schema rules
    • Display validation errors inline
    • Update Create/Edit forms: Server, Tool, Resource, A2A Agent
    • Show custom metadata in detail views
    • Playwright tests for UI (5+ tests)
  • Bulk Operations

    • Implement bulk update endpoint logic
    • Support filtering resources by metadata
    • Validate all updates before applying
    • Transactional updates where possible
    • Partial success handling
    • Return summary: {updated, failed, errors}
    • Audit bulk operations
    • Rate limiting protection
    • Add integration tests
  • External Sync

    • Create webhook endpoint for metadata sync
    • Support multiple sources (ServiceNow, PagerDuty, etc.)
    • Conflict resolution strategies: external_wins, internal_wins, merge
    • Track metadata source (system of record)
    • Log sync operations and conflicts
    • Add authentication for webhook endpoint
    • Document integration patterns
  • Testing

    • Unit tests: Schema loader (10+ tests)
    • Unit tests: Metadata validator (50+ tests covering all types)
    • Unit tests: Type validation (string, number, bool, date, enum, array, object)
    • Unit tests: Validation rules (required, pattern, min/max, format)
    • Integration tests: API endpoints with metadata (10+ tests)
    • Integration tests: Filtering by metadata (5+ tests)
    • Integration tests: Bulk operations (3+ tests)
    • Integration tests: Audit logging (3+ tests)
    • Playwright tests: Admin UI forms (5+ tests)
    • Performance tests: JSONB queries with GIN indexes
    • Test coverage: 90%+ for metadata validation code
  • Documentation

    • Update .env.example with metadata settings
    • Create example metadata_schema.yaml with all field types
    • Update CLAUDE.md with metadata section
    • Document schema file format and validation rules
    • Document API endpoints for metadata operations
    • Document filtering syntax (metadata.field=value)
    • Document bulk update operations
    • Document external sync webhook
    • Add migration guide for existing deployments
    • Add examples: common metadata schemas by use case
  • Code Quality

    • Run make autoflake isort black
    • Run make flake8 and fix issues
    • Run make pylint and address warnings
    • Run make doctest test htmlcov
    • Pass make verify checks

⚙️ Configuration Example

Minimal Configuration (.env)

# Enable custom metadata
MCPGATEWAY_CUSTOM_METADATA_ENABLED=true

# Path to schema file
MCPGATEWAY_METADATA_SCHEMA_FILE=metadata_schema.yaml

# Reject undefined fields (strict mode)
MCPGATEWAY_METADATA_ALLOW_UNKNOWN=false

# Audit all metadata changes
MCPGATEWAY_METADATA_AUDIT_CHANGES=true

# Schema cache TTL (1 hour)
MCPGATEWAY_METADATA_CACHE_TTL=3600

Flexible Configuration (Development)

# Enable custom metadata
MCPGATEWAY_CUSTOM_METADATA_ENABLED=true

# Allow any fields (flexible)
MCPGATEWAY_METADATA_ALLOW_UNKNOWN=true

# Don't audit changes (dev only)
MCPGATEWAY_METADATA_AUDIT_CHANGES=false

Example Metadata Schema (metadata_schema.yaml)

version: "1.0"
allow_unknown_fields: false

entities:
  server:
    fields:
      - name: cost_center
        type: string
        required: true
        pattern: "^CC-[0-9]{4}$"
        description: "Cost center code (CC-XXXX)"
        examples: ["CC-1234"]
      
      - name: owner_email
        type: string
        required: true
        format: email
        description: "Primary owner email"
      
      - name: sla_tier
        type: enum
        required: false
        options: [gold, silver, bronze]
        default: bronze
        description: "Service level tier"
      
      - name: monthly_budget
        type: number
        required: false
        min: 0
        max: 1000000
        description: "Monthly budget in USD"
      
      - name: is_production
        type: boolean
        required: true
        default: false
        description: "Production environment"
      
      - name: next_audit_date
        type: date
        required: false
        description: "Next audit date (ISO 8601)"
      
      - name: external_ids
        type: object
        required: false
        description: "External system IDs"
        properties:
          servicenow_ci:
            type: string
          pagerduty_service:
            type: string
          jira_project:
            type: string

  tool:
    fields:
      - name: execution_timeout
        type: number
        required: false
        min: 1
        max: 3600
        default: 30
        description: "Max execution time (seconds)"
      
      - name: rate_limit_rpm
        type: number
        required: false
        min: 0
        description: "Rate limit (requests per minute)"
      
      - name: allowed_environments
        type: array
        required: false
        item_type: string
        description: "Allowed execution environments"

  resource:
    fields:
      - name: data_classification
        type: enum
        required: true
        options: [public, internal, confidential, secret]
        default: internal
        description: "Data sensitivity"
      
      - name: retention_days
        type: number
        required: false
        min: 1
        max: 3650
        description: "Data retention (days)"

  a2a_agent:
    fields:
      - name: model_provider
        type: enum
        required: false
        options: [openai, anthropic, custom]
        description: "AI model provider"
      
      - name: max_tokens
        type: number
        required: false
        min: 1
        max: 100000
        description: "Max response tokens"

Enterprise Schema Example

version: "1.0"
allow_unknown_fields: false

entities:
  server:
    fields:
      # Financial
      - name: cost_center
        type: string
        required: true
        pattern: "^CC-[0-9]{4}$"
      
      - name: project_code
        type: string
        required: true
        pattern: "^PRJ-[A-Z0-9]{6}$"
      
      - name: monthly_budget_usd
        type: number
        required: true
        min: 0
      
      # Ownership
      - name: owner_email
        type: string
        required: true
        format: email
      
      - name: team_name
        type: string
        required: true
      
      - name: backup_contacts
        type: array
        required: false
        item_type: string
      
      # Compliance
      - name: data_classification
        type: enum
        required: true
        options: [public, internal, confidential, restricted]
      
      - name: compliance_frameworks
        type: array
        required: false
        item_type: string
        description: "SOC2, HIPAA, FedRAMP, etc."
      
      - name: retention_policy_days
        type: number
        required: true
        default: 90
      
      # Operations
      - name: sla_tier
        type: enum
        required: true
        options: [critical, high, standard, low]
        default: standard
      
      - name: maintenance_window
        type: string
        required: false
        description: "e.g., 'Sunday 02:00-04:00 UTC'"
      
      - name: on_call_rotation_url
        type: string
        required: false
        format: url
      
      # Lifecycle
      - name: deployment_date
        type: date
        required: false
      
      - name: last_audit_date
        type: date
        required: false
      
      - name: next_review_date
        type: date
        required: true
      
      - name: deprecation_date
        type: date
        required: false
      
      - name: eol_date
        type: date
        required: false
      
      # External Systems
      - name: external_ids
        type: object
        required: false
        properties:
          servicenow_ci_id:
            type: string
          jira_project_key:
            type: string
          pagerduty_service_id:
            type: string
          confluence_page_url:
            type: string
            format: url

✅ Success Criteria

  • Schema Loading: YAML/JSON schema files parsed and cached
  • Type Validation: Support string, number, boolean, date, enum, array, object
  • Validation Rules: Required, pattern, min/max, format, options, defaults
  • Configuration: Environment variables in .env and config.py
  • Pydantic Integration: Validators enforce schema on all entity types
  • Database: GIN indexes on custom_metadata JSONB columns
  • API Endpoints: Schema retrieval, filtering, bulk update, audit log
  • Query/Filter: Filter by custom metadata with type-aware queries
  • Admin UI: Dynamic forms with field type-specific controls
  • Audit Trail: Track metadata changes with before/after values
  • Bulk Operations: Update metadata across multiple resources
  • External Sync: Webhook endpoint for external system integration
  • Testing: 90%+ coverage; integration, unit, and UI tests pass
  • Documentation: Schema format, API docs, migration guide
  • Performance: JSONB queries with GIN indexes <50ms
  • Backward Compatible: Disabled by default, no breaking changes
  • Quality: Passes make verify checks

🏁 Definition of Done

  • MetadataSchemaLoader implemented with YAML/JSON parsing
  • MetadataValidator with all type validations and rules
  • Configuration settings in config.py
  • Environment variables documented in .env.example
  • Example metadata_schema.yaml created
  • Pydantic validators added to all schemas with custom_metadata
  • GIN indexes created via Alembic migration
  • Metadata audit log table and ORM model created
  • GET /metadata/schema endpoint implemented
  • Query filtering by custom metadata (GET /servers?metadata.field=value)
  • Bulk update endpoint (POST /servers/bulk-metadata-update)
  • Audit log query endpoint (GET /admin/audit/metadata-changes)
  • External sync webhook (POST /webhooks/metadata-sync)
  • Admin UI dynamic forms for all entity types
  • Unit tests: Schema loader (10+ tests)
  • Unit tests: Metadata validator (50+ tests)
  • Integration tests: API endpoints (15+ tests)
  • Playwright tests: Admin UI forms (5+ tests)
  • Performance tests: JSONB query benchmarks
  • Documentation: CLAUDE.md, schema format, API docs
  • Migration guide for existing deployments
  • Code passes make autoflake isort black pre-commit
  • Code passes make flake8 bandit interrogate pylint verify
  • Test coverage: 90%+ for new code
  • Backward compatible: disabled by default

📝 Additional Notes

🔹 Use Cases:

  • Financial Tracking: Cost centers, budgets, project codes
  • Operational Excellence: Owners, SLAs, maintenance windows, on-call rotations
  • Compliance: Data classification, retention policies, audit schedules
  • Lifecycle Management: Deployment dates, review cycles, deprecation timelines
  • External Integration: CMDB IDs, ticketing systems, monitoring services

🔹 Supported Field Types:

  • string: Text with optional pattern/format validation
  • number: Integers/floats with min/max
  • boolean: True/false flags
  • date: ISO 8601 dates (YYYY-MM-DD)
  • enum: Fixed set of allowed values
  • array: Lists of items (with item_type)
  • object: Nested structures with properties
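
The type list above maps onto Python type checks with two pitfalls worth noting: `bool` is a subclass of `int`, so a number check must exclude booleans, and a date check should verify both the YYYY-MM-DD shape and calendar validity. The sketch below is illustrative; `matches_type` is a hypothetical helper, and enum membership is assumed to be handled separately by the options rule.

```python
import re
from datetime import date
from typing import Any, Optional

_DATE_SHAPE = re.compile(r"\d{4}-\d{2}-\d{2}")


def matches_type(value: Any, field_type: str, item_type: Optional[str] = None) -> bool:
    """Return True if value is an acceptable instance of the schema field type."""
    if field_type in ("string", "enum"):
        return isinstance(value, str)
    if field_type == "number":
        # True and False are ints in Python, so reject them explicitly.
        return isinstance(value, (int, float)) and not isinstance(value, bool)
    if field_type == "boolean":
        return isinstance(value, bool)
    if field_type == "date":
        if not isinstance(value, str) or not _DATE_SHAPE.fullmatch(value):
            return False
        try:
            date.fromisoformat(value)  # rejects impossible dates such as 2025-02-30
        except ValueError:
            return False
        return True
    if field_type == "array":
        if not isinstance(value, list):
            return False
        return item_type is None or all(matches_type(item, item_type) for item in value)
    if field_type == "object":
        return isinstance(value, dict)
    return False
```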

🔹 Validation Rules:

  • required: Field must be present
  • pattern: Regex match for strings
  • format: email, url, uuid, ip validation
  • min/max: Range for numbers and dates
  • options: Allowed enum values
  • default: Default value if not provided
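
The `format` rule can be covered with the Python standard library alone. The following sketch is deliberately lenient and makes several assumptions: the email check is a simple shape test rather than full RFC validation, URLs are limited to http and https, and `check_format` is a hypothetical name.

```python
import ipaddress
import re
import uuid
from urllib.parse import urlparse

# Shape check only: something@something.something with no spaces or extra @.
_EMAIL_SHAPE = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")


def check_format(value: str, fmt: str) -> bool:
    """Return True if value satisfies the named format."""
    if fmt == "email":
        return _EMAIL_SHAPE.fullmatch(value) is not None
    if fmt == "url":
        try:
            parsed = urlparse(value)
        except ValueError:  # raised for some malformed inputs, e.g. bad IPv6 brackets
            return False
        return parsed.scheme in ("http", "https") and bool(parsed.netloc)
    if fmt == "uuid":
        try:
            uuid.UUID(value)  # also accepts forms without hyphens or with braces
        except ValueError:
            return False
        return True
    if fmt == "ip":
        try:
            ipaddress.ip_address(value)  # accepts both IPv4 and IPv6
        except ValueError:
            return False
        return True
    raise ValueError(f"unknown format: {fmt}")
```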

🔹 Performance Considerations:

  • Schema cached in memory (configurable TTL)
  • GIN indexes for JSONB queries (PostgreSQL)
  • Validation occurs at API layer (Pydantic)
  • Bulk operations optimized with transactions
  • Expected overhead: <5ms per request

🔹 Future Enhancements:

  • Tenant-Specific Schemas: Different schemas per tenant_id
  • Schema Versioning: Support multiple schema versions
  • Computed Fields: Derive values from other fields
  • Conditional Validation: Rules that depend on other field values
  • Metadata Templates: Predefined schemas for common use cases
  • UI Schema Builder: Visual editor for metadata schemas
  • Metadata Inheritance: Server metadata inherited by tools/resources
  • GraphQL Support: Query nested metadata with GraphQL

🔹 Migration Strategy:

  1. Deploy with metadata disabled (default)
  2. Create metadata_schema.yaml file
  3. Enable metadata and test validation
  4. Gradually populate metadata for existing resources
  5. Enable audit logging for production
  6. Integrate with external systems (CMDB, PagerDuty, etc.)

🔗 Related Issues

  • Tag management system
  • Custom metadata fields
  • Audit logging
  • Admin UI improvements
  • External system integration
