Research: Git/GitHub/GitOps vs ThemisDB version control comparison with existing YAML analysis #470

Merged
makr-code merged 7 commits into develop from copilot/compare-git-and-themis on Jan 14, 2026


Copilot AI commented Jan 14, 2026

Description

Research documentation comparing Git/GitHub/GitOps version control concepts against ThemisDB's MVCC system, with concrete recommendations for adopting YAML-based declarative configuration. Includes comprehensive analysis of ThemisDB's existing YAML usage across external (PII, compliance) and internal (configuration, deployment) systems.

Type of Change

  • 🐛 Bug fix (non-breaking change which fixes an issue)
  • ✨ New feature (non-breaking change which adds functionality)
  • 💥 Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • 📝 Documentation update
  • ♻️ Code refactoring (no functional changes)
  • ⚡ Performance improvement
  • ✅ Test addition or update
  • 🔧 Configuration change
  • 🎨 UI/UX change

Related Issues

Research request for Git/GitOps comparison and YAML adoption strategy for ThemisDB.

Changes Made

Research Documents (all in docs/research/)

  • git_gitops_themis_vergleich.md (875 lines)

    • Conceptual mapping: Git Commit ↔ Transaction, Git Branch ↔ Concurrent Transaction, Git Merge ↔ Transaction Commit, Git Conflict ↔ Write-Write Conflict
    • Detailed comparison of Git, GitHub, GitOps, and ThemisDB MVCC architectures
    • YAML adoption strategies: declarative schema definition, CI/CD integration, GitOps deployment, schema versioning
    • Implementation roadmap Q1-Q4 2026
  • bestehende_yaml_nutzung.md (480 lines) - NEW

    • Comprehensive analysis of ThemisDB's existing YAML usage
    • External YAML: PII patterns, retention policies (GDPR/eIDAS), ethical guidelines, document metadata schemas
    • Internal YAML: server config, Kubernetes CRDs, NLP config, LLM models, sharding, OpenAPI specs
    • Demonstrates that YAML is already established practice in ThemisDB
    • Shows proposed schema definition is evolutionary, not revolutionary
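
The conceptual mapping listed above (branch ↔ concurrent transaction, merge ↔ transaction commit, conflict ↔ write-write conflict) can be illustrated with a toy in-memory MVCC store. This is a minimal sketch under stated assumptions, not ThemisDB's implementation: the names `Store`, `Transaction`, and `WriteConflict` are hypothetical, and the first-committer-wins rule is one common conflict policy chosen here for illustration.

```python
from dataclasses import dataclass, field


class WriteConflict(Exception):
    """Raised when a concurrent transaction already committed the same key."""


@dataclass
class Store:
    # key -> list of (commit_ts, value), oldest first (hypothetical layout)
    versions: dict = field(default_factory=dict)
    clock: int = 0

    def begin(self) -> "Transaction":
        # Comparable to creating a branch: the transaction sees the state as of `clock`.
        return Transaction(store=self, start_ts=self.clock)

    def read(self, key, ts):
        # Return the newest value committed at or before timestamp `ts`.
        for commit_ts, value in reversed(self.versions.get(key, [])):
            if commit_ts <= ts:
                return value
        return None


@dataclass
class Transaction:
    store: Store
    start_ts: int
    writes: dict = field(default_factory=dict)

    def get(self, key):
        # A transaction sees its own uncommitted writes first, then its snapshot.
        if key in self.writes:
            return self.writes[key]
        return self.store.read(key, self.start_ts)

    def put(self, key, value):
        self.writes[key] = value

    def commit(self) -> int:
        # Comparable to a merge: it fails if another transaction committed
        # one of our keys after we started (write-write conflict).
        for key in self.writes:
            history = self.store.versions.get(key, [])
            if history and history[-1][0] > self.start_ts:
                raise WriteConflict(key)
        self.store.clock += 1
        for key, value in self.writes.items():
            self.store.versions.setdefault(key, []).append((self.store.clock, value))
        return self.store.clock


store = Store()
t1, t2 = store.begin(), store.begin()   # two "branches" from the same state
t1.put("user:1", "alice")
t2.put("user:1", "bob")
t1.commit()                             # first committer wins
try:
    t2.commit()
    conflict_key = None
except WriteConflict as exc:
    conflict_key = exc.args[0]          # the "merge conflict"
```

Unlike a Git merge conflict, which waits for manual resolution, the losing transaction in this sketch is simply rejected and would have to be retried against the newer state.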

YAML Schema Examples (docs/research/schema/)

  • themis-schema.example.yaml (152 lines)

    • Declarative schema definition for social media app (users, posts)
    • Index definitions: secondary, vector (HNSW), fulltext
    • Security/RBAC, backup/PITR configuration
  • README.md (297 lines)

    • Usage patterns and CLI commands (planned)
    • GitOps workflow integration examples
    • Migration strategies and best practices

Documentation Updates

  • Cross-referenced from README.md Core Concepts
  • Linked from architecture/README.md and architecture_mvcc.md
  • All documents moved to docs/research/ per maintainer feedback

Testing

Test Environment

  • OS: Documentation-only change
  • Compiler: N/A
  • Build Type: N/A

Test Results

  • All existing tests pass
  • New tests added for changes
  • Manual testing performed

Test Commands

# Verified documentation structure and links
grep -c "^#" docs/research/git_gitops_themis_vergleich.md  # 74 sections
wc -l docs/research/git_gitops_themis_vergleich.md          # 875 lines
wc -l docs/research/bestehende_yaml_nutzung.md              # 480 lines

# Verified existing YAML files referenced
find config -name "*.yaml" | wc -l  # 20+ YAML config files
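
One caveat with `grep -c "^#"` is that it also counts lines starting with `#` inside fenced code blocks, such as shell or YAML comments, so the reported number can overstate the heading count. A minimal Python sketch of a fence-aware count follows; the function name `count_headings` and the sample text are illustrative only and are not part of the PR.

```python
import re

# Opening/closing fence lines and ATX headings ("# " through "###### ").
FENCE = re.compile(r"^\s*(```|~~~)")
HEADING = re.compile(r"^#{1,6}\s")


def count_headings(markdown_text: str) -> int:
    """Count ATX headings, ignoring '#' lines inside fenced code blocks."""
    in_fence = False
    count = 0
    for line in markdown_text.splitlines():
        if FENCE.match(line):
            in_fence = not in_fence   # toggle on every fence line
            continue
        if not in_fence and HEADING.match(line):
            count += 1
    return count


sample = "# Title\n\n```bash\n# a shell comment, not a heading\n```\n\n## Section\n"
heading_count = count_headings(sample)
```

For the sample text, `grep -c "^#"` would report three matching lines, while the fence-aware count ignores the shell comment.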

Checklist

  • My code follows the coding standards
  • I have performed a self-review of my code
  • I have commented my code, particularly in hard-to-understand areas
  • I have updated the documentation accordingly
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes
  • Any dependent changes have been merged and published

Code Quality

  • Code builds without errors
  • Code builds without warnings
  • Static analysis (cppcheck) passes
  • No memory leaks detected
  • Code follows C++17 standards

Documentation

  • README.md updated (if applicable)
  • CHANGELOG.md updated
  • API documentation updated (if applicable)
  • Code comments added/updated

Branch Strategy Compliance

  • PR targets the correct branch (develop for features, main for releases/hotfixes)
  • Branch naming follows convention (e.g., feature/, bugfix/, hotfix/, release/)
  • No direct commits to main or develop

Performance Impact

  • No significant performance impact
  • Performance improvement (describe below)
  • Performance regression (justify below)

Performance Notes:
Documentation-only change with no performance impact.

Breaking Changes

No breaking changes.

Security Considerations

  • No security implications
  • Security review required
  • Dependencies updated to secure versions

Additional Notes

Key Finding: ThemisDB Already Uses YAML Extensively!

Analysis reveals that ThemisDB already uses YAML in more than 20 locations:

External (Compliance & Security):

  • PII pattern detection with confidence scoring
  • Retention policies (GDPR/eIDAS compliance)
  • Ethical guidelines (UN Human Rights-based)
  • Document metadata schemas (DocumentManager)

Internal (Operations):

  • Server configuration (RocksDB, LLM, security)
  • Kubernetes CRDs (declarative deployments)
  • NLP configuration (stopwords, tokenization)
  • LLM model definitions
  • Sharding/RAID configuration
  • OpenAPI specifications

Implication: The proposed schema definition is not new, but a logical extension of existing patterns. Implementation is evolutionary, not revolutionary.

YAML Adoption Roadmap

Q1 2026: Schema definition format, parser, validation engine
Q2 2026: CLI tools (themis schema apply/diff/export), GitHub Actions templates
Q3 2026: Data branching (named snapshots), time-travel queries
Q4 2026: Data pull requests, distributed sync
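
To make the planned `themis schema diff` step more concrete, here is a minimal sketch of how a diff between a current and a desired schema could be computed once both are loaded into dictionaries. The functions `index_map` and `diff_schemas` and the shape of the resulting plan are assumptions for illustration; the PR does not specify how the CLI will work.

```python
def index_map(schema: dict) -> dict:
    """Flatten a schema into {(table, index_name): index_definition}."""
    result = {}
    for table, spec in schema.get("tables", {}).items():
        for index in spec.get("indexes", []):
            result[(table, index["name"])] = index
    return result


def diff_schemas(current: dict, desired: dict) -> dict:
    """Return which indexes would be created, dropped, or changed."""
    cur, des = index_map(current), index_map(desired)
    return {
        "create": sorted(des.keys() - cur.keys()),
        "drop": sorted(cur.keys() - des.keys()),
        "change": sorted(k for k in cur.keys() & des.keys() if cur[k] != des[k]),
    }


current = {"tables": {"users": {"indexes": [
    {"name": "idx_users_email", "columns": ["email"], "type": "secondary", "unique": False},
]}}}
desired = {"tables": {"users": {"indexes": [
    {"name": "idx_users_email", "columns": ["email"], "type": "secondary", "unique": True},
    {"name": "idx_users_embedding", "columns": ["embedding"], "type": "vector", "algorithm": "hnsw"},
]}}}

plan = diff_schemas(current, desired)
```

In a GitOps workflow, a plan like this would typically be shown in a pull request before any change is applied to the database.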

Conceptual Benefits

  • Familiar Git semantics for database operations (branching, merging, conflicts)
  • Infrastructure as Code for database schemas
  • GitOps-compatible deployment workflows
  • Auditability through version control
  • Builds on established patterns (DocumentManager schemas, Kubernetes CRDs)

Example: Declarative Schema Definition

Current (imperative):

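# $THEMIS_URL is a placeholder for the server's base URL (not specified in this PR)
curl -X POST "$THEMIS_URL/index/create" -d '{"table":"users","column":"email","type":"secondary","unique":true}'
curl -X POST "$THEMIS_URL/index/create" -d '{"table":"users","column":"embedding","type":"vector","algorithm":"hnsw"}'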

Proposed (declarative YAML):

tables:
  users:
    primary_key: user_id
    indexes:
      - name: idx_users_email
        columns: [email]
        type: secondary
        unique: true
      
      - name: idx_users_embedding
        columns: [embedding]
        type: vector
        algorithm: hnsw
        config:
          m: 16
          ef_construction: 200

Apply with (planned CLI, targeted for Q2 2026): themis schema apply schema.yaml
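
To show how the declarative form relates to the imperative calls, the following sketch expands the schema above into the JSON payloads used in the `curl` example. It is an illustration under assumptions, not the planned `themis schema apply` implementation: the function `to_index_requests` is hypothetical, the schema is written as a Python dictionary to stay dependency-free, one request per column is assumed because the example endpoint takes a single `column`, and the `config` block is left out because the PR does not show how it would be passed imperatively.

```python
import json

# The YAML schema from the example, as it would look after parsing.
schema = {
    "tables": {
        "users": {
            "primary_key": "user_id",
            "indexes": [
                {"name": "idx_users_email", "columns": ["email"],
                 "type": "secondary", "unique": True},
                {"name": "idx_users_embedding", "columns": ["embedding"],
                 "type": "vector", "algorithm": "hnsw",
                 "config": {"m": 16, "ef_construction": 200}},
            ],
        }
    }
}


def to_index_requests(schema: dict) -> list:
    """Expand declarative index definitions into per-column request payloads."""
    requests = []
    for table, spec in schema["tables"].items():
        for index in spec.get("indexes", []):
            for column in index["columns"]:
                payload = {"table": table, "column": column, "type": index["type"]}
                # Copy only the optional fields that appear in the curl example.
                for key in ("unique", "algorithm"):
                    if key in index:
                        payload[key] = index[key]
                requests.append(payload)
    return requests


requests = to_index_requests(schema)
for payload in requests:
    print(json.dumps(payload))
```

Each printed line corresponds to the body of one `POST /index/create` call from the imperative example.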

Best Practices Identified

  1. Plugin Architecture: PII patterns support hot-reload
  2. Compliance-First: Legal references embedded in config
  3. Security-by-Default: Safe defaults, explicit opt-ins
  4. Multilingual: i18n directly in configuration
  5. Self-Documenting: Descriptions and comments embedded

Screenshots/Logs

N/A - Documentation update


For Maintainers:

Review Checklist

  • Code quality acceptable
  • Tests adequate
  • Documentation complete
  • No security concerns
  • Ready to merge

Merge Strategy

  • Squash and merge (✅ Recommended for feature/bugfix PRs - cleaner history)
  • Merge commit (Only for release/hotfix branches)
  • Rebase and merge
Original prompt

Describe Git, GitHub, and GitOps in comparison to Themis's version control. I think we can still adopt many elements from Git, in particular YAML.


Copilot AI and others added 3 commits January 14, 2026 06:35
Co-authored-by: makr-code <150588092+makr-code@users.noreply.github.com>
Co-authored-by: makr-code <150588092+makr-code@users.noreply.github.com>
Co-authored-by: makr-code <150588092+makr-code@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Compare git, github, and gitops with Themis version control" to "Add comprehensive Git/GitHub/GitOps vs ThemisDB version control comparison" on Jan 14, 2026
Copilot AI requested a review from makr-code January 14, 2026 06:39
@makr-code

@copilot Since these are research requests (without implementations), these documents should be stored in .\docs\research\

Co-authored-by: makr-code <150588092+makr-code@users.noreply.github.com>

Copilot AI commented Jan 14, 2026

@copilot Since these are research requests (without implementations), these documents should be stored in .\docs\research\

The documents have been moved to docs/research/ as requested (commit 3e1db5e). All references in README.md and the architecture documentation have been updated accordingly.

Co-authored-by: makr-code <150588092+makr-code@users.noreply.github.com>
Copilot AI changed the title from "Add comprehensive Git/GitHub/GitOps vs ThemisDB version control comparison" to "Research: Git/GitHub/GitOps vs ThemisDB version control comparison with existing YAML analysis" on Jan 14, 2026
makr-code marked this pull request as ready for review on January 14, 2026 08:39
makr-code merged commit e7165e9 into develop on Jan 14, 2026
makr-code added this to the v1.3.0 milestone on Mar 11, 2026