Skip to content

ElGap/edukaai

Folders and files

NameName
Last commit message
Last commit date

Latest commit

Β 

History

25 Commits
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 

Repository files navigation

EdukaAI

Privacy-first training data tool for LLM fine-tuning
From AI conversations to custom-trained models β€” 100% local, zero configuration

npm version CI npm downloads License: MIT

EdukaAI is a self-hosted tool that helps you collect, curate, and export training data for fine-tuning Large Language Models. It captures your best AI conversations β€” from OpenCode, OpenWebUI, or manual entry β€” and turns them into high-quality datasets, all while keeping your data local and private.

EdukaAI Screenshot


🎯 Why EdukaAI?

πŸ”’ Privacy First β€” Your data never leaves your machine. Local SQLite database, no cloud, no tracking.

⚑ Zero Configuration β€” Install and run. No complex setup. Start collecting in minutes.

🎣 Capture Anywhere β€” Manual entry, file imports, or live capture from your favorite AI tools.

βœ… Quality Control β€” Review workflow ensures only your best examples reach your training dataset.

πŸ“€ Multiple Export Formats β€” Alpaca, ShareGPT, JSONL, CSV β€” works with any training pipeline.


πŸš€ Quick Start

NPM (Recommended)

# One-time use
npx @elgap/edukaai

# Or install globally
npm install -g @elgap/edukaai
edukaai

Open http://localhost:3030 and start capturing.


✨ Four Ways to Capture Data

1️⃣ Manual Entry

Create samples directly in the web UI β€” perfect for human-crafted "golden" examples.

2️⃣ File Import

Upload existing datasets (Alpaca, ShareGPT, CSV) β€” ideal for migration.

3️⃣ OpenCode Plugin πŸ†•

Capture coding conversations with one click from OpenCode CLI.

Plugin: github.com/ElGap/edukaai-opencode

4️⃣ OpenWebUI Plugin πŸ†•

Export conversations from your self-hosted OpenWebUI instance.

Plugin: github.com/ElGap/edukaai-openwebui


πŸ“Š Core Features

Dataset Management

  • Create multiple datasets for different projects
  • Set goals and track progress with visual milestones
  • Organize by purpose: coding, creative writing, Q&A, roleplay

Training Sample Management

  • Core Fields: Instruction, Input, Output, System Prompt
  • Metadata: Category, Difficulty, Quality (1-5 stars), Tags
  • Review Workflow: Draft β†’ In Review β†’ Approved/Rejected
  • Bulk Operations: Approve, categorize, or delete multiple samples

Quality Control

  • Draft-First: All captures start in Draft for your review
  • Duplicate Detection: Automatic semantic similarity matching
  • Auto-Enrichment: Smart categorization and quality suggestions

Export Formats

  • Alpaca (JSON) β€” Industry standard
  • ShareGPT (JSON) β€” Conversation format
  • JSONL β€” For training pipelines
  • CSV β€” For analysis

πŸ”Œ Live Capture API

Any tool can send data to EdukaAI via the Universal Capture API:

curl -X POST http://localhost:3030/api/capture \
  -H "Content-Type: application/json" \
  -d '{
    "source": "my-plugin",
    "apiVersion": "1.0",
    "records": [{
      "instruction": "Explain quicksort",
      "output": "Quicksort is a divide-and-conquer algorithm...",
      "category": "coding",
      "qualityRating": 4
    }]
  }'

Endpoint: POST /api/capture
Docs: http://localhost:3030/docs


πŸ’» CLI Reference

Command Description
edukaai Start server (http://localhost:3030)
edukaai reset Reset database
edukaai help Show all commands

Environment Variables:

  • EDUKAAI_HOST (default: localhost)
  • EDUKAAI_PORT (default: 3030)
  • EDUKAAI_DATA_DIR (default: ~/.edukaai)

πŸ”’ Privacy & Security

  • 100% Local β€” SQLite database on your machine
  • No Cloud β€” No external API calls
  • No Tracking β€” Zero analytics or telemetry
  • MIT License β€” Full transparency

πŸ› οΈ For Developers

Tech Stack

  • Frontend: Vue 3 + Nuxt 4 + Tailwind CSS
  • Backend: Nuxt 4 API routes
  • Database: SQLite (Drizzle ORM)

Build from Source

git clone https://github.com/elgap/edukaai.git
cd edukaai
npm install
npm run dev

Commands

npm run db:reset      # Reset database
npm run test          # Run tests
npm run typecheck     # Type checking
npm run build         # Production build

Project Structure

edukaai/
β”œβ”€β”€ app/              # Nuxt frontend
β”œβ”€β”€ server/           # Backend API
β”œβ”€β”€ bin/              # CLI scripts
└── docs/             # Documentation

πŸ“– Documentation


🀝 Contributing

Contributions welcome:

  • Plugins β€” Build integrations for your favorite tools
  • Documentation β€” Tutorials and examples
  • Bug Reports β€” Help us improve

Contribution guidelines will be added soon.


πŸ“„ License

MIT License β€” see LICENSE


πŸ™ Acknowledgments

  • Inspired by the need for simple, private LLM training tools
  • Built with Nuxt, Vue, and Tailwind
  • Icons by Lucide

Built with ❀️ for the AI community

⬆ Back to Top

About

Dataset Management for LLM Fine-Tuning. Import from files or live capture, organize samples, manage quality, and export for fine-tuning.

Topics

Resources

License

Stars

Watchers

Forks

Packages

 
 
 

Contributors