Skip to content

Complete database design project: LLM platform with 10 tables, 6000+ records, SQL scripts, and comprehensive documentation

Notifications You must be signed in to change notification settings

softkleenex/llm-database-project

Folders and files

NameName
Last commit message
Last commit date

Latest commit

Β 

History

1 Commit
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 

Repository files navigation

LLM Database Platform Project

A comprehensive database design and implementation project for a Large Language Model (LLM) management platform. This project demonstrates enterprise-level database architecture, SQL programming, and data modeling skills.

πŸ“‹ Project Overview

This project implements a complete database system for managing:

  • Users & Departments: Multi-user organization management
  • AI Projects: Project tracking and team collaboration
  • ML Models: Model lifecycle and version management
  • Datasets: Training and validation data management
  • Deployments: Production environment tracking
  • Sessions & Logs: User activity and system auditing
  • Prompt Templates: Reusable AI prompt library

Key Features

  • βœ… 10 interconnected tables with proper referential integrity
  • βœ… 6,118 test records across all tables
  • βœ… 43 validation queries testing all relationships
  • βœ… Complete documentation with execution guides
  • βœ… Automated testing with comprehensive error handling

πŸ—„οΈ Database Schema

Entity-Relationship Overview

DEPARTMENT (10 records)
    ↓ manages
USER (150 records) ← supervises each other
    ↓ creates
PROJECT (100 records)
    ↓ uses
MODEL (22 records)
    ↓ configured via
MODEL_CONFIG (132 records)
    ↓ trained on
DATASET (30 records)
    ↓ deployed to
DEPLOYMENTS (44 records)
    ↓ generates
SESSIONS (650 records)
    ↓ logs
SESSION_LOGS (4,860 records)
    ↑ references
PROMPT_TEMPLATE (120 records)

Table Summary

Table Records Purpose
DEPARTMENT 10 Organization structure
USER 150 System users
PROJECT 100 AI/ML projects
MODEL 22 ML model registry
MODEL_CONFIG 132 Model configurations
DATASET 30 Training datasets
DEPLOYMENTS 44 Production deployments
PROMPT_TEMPLATE 120 AI prompt templates
SESSIONS 650 User sessions
SESSION_LOGS 4,860 Activity logs
TOTAL 6,118

πŸš€ Quick Start

Prerequisites

  • Oracle Database 19c or later
  • SQL*Plus or SQL Developer
  • User account with CREATE TABLE, INSERT, SELECT privileges

Installation

  1. Clone the repository

    git clone https://github.com/softkleenex/llm-database-project.git
    cd llm-database-project
  2. Connect to Oracle

    sqlplus your_username/your_password@your_database
  3. Run the automated setup

    @run_all.sql

That's it! The script will:

  • Create all 10 tables
  • Insert 6,118 test records
  • Run 43 validation queries
  • Generate a log file with results

Manual Installation

If you prefer step-by-step execution:

-- 1. Create tables
@0_create_table.sql

-- 2. Insert data (in order)
@1_insert_department.sql
@2_insert_user.sql
@3_insert_project.sql
@4_insert_model.sql
@5_insert_model_config.sql
@6_insert_dataset.sql
@7_insert_deployments.sql
@8_insert_prompt_template.sql
@9_insert_sessions.sql
@10_insert_session_logs.sql

-- 3. Run tests
@test_all_queries.sql

πŸ“ Project Structure

database-project/
β”œβ”€β”€ run_all.sql                     # ⭐ Automated setup script
β”œβ”€β”€ test_all_queries.sql            # ⭐ Comprehensive test suite
β”‚
β”œβ”€β”€ SQL Scripts/
β”‚   β”œβ”€β”€ 0_create_table.sql          # Table definitions
β”‚   β”œβ”€β”€ 1_insert_department.sql     # Department data
β”‚   β”œβ”€β”€ 2_insert_user.sql           # User data
β”‚   β”œβ”€β”€ 3_insert_project.sql        # Project data
β”‚   β”œβ”€β”€ 4_insert_model.sql          # Model data
β”‚   β”œβ”€β”€ 5_insert_model_config.sql   # Config data
β”‚   β”œβ”€β”€ 6_insert_dataset.sql        # Dataset data
β”‚   β”œβ”€β”€ 7_insert_deployments.sql    # Deployment data
β”‚   β”œβ”€β”€ 8_insert_prompt_template.sql # Template data
β”‚   β”œβ”€β”€ 9_insert_sessions.sql       # Session data
β”‚   └── 10_insert_session_logs.sql  # Log data
β”‚
└── Documentation/
    β”œβ”€β”€ README.md                   # ⭐ This file
    β”œβ”€β”€ README_EXECUTION.md         # Detailed execution guide
    β”œβ”€β”€ TEST_SUMMARY.md             # Test results summary
    β”œβ”€β”€ FINAL_REPORT.md             # Project final report
    β”œβ”€β”€ EXECUTION_PLAN.md           # Implementation plan
    β”œβ”€β”€ ERROR_LOG.md                # Known issues & solutions
    β”œβ”€β”€ FINAL_CHECKLIST.md          # Quality assurance
    β”œβ”€β”€ ERD_SQL_MISMATCH_REPORT.md  # Schema validation
    └── database_team_proj_erd.pdf  # ER diagram

πŸ§ͺ Testing

The project includes comprehensive testing:

Running Tests

@test_all_queries.sql

Test Coverage

  • βœ… Table creation and constraints
  • βœ… Foreign key relationships
  • βœ… Data insertion validation
  • βœ… Query performance (JOINs, aggregations)
  • βœ… Complex queries (nested, correlated)
  • βœ… Edge cases and error handling

Expected Test Results

-- Verify record counts
SELECT 'DEPARTMENT' as table_name, COUNT(*) as records FROM DEPARTMENT
UNION ALL
SELECT 'USER', COUNT(*) FROM USER_ACCOUNT
-- ... (should match the table above)

πŸ“Š Sample Queries

1. Active Users by Department

SELECT d.department_name, COUNT(u.user_id) as user_count
FROM DEPARTMENT d
LEFT JOIN USER_ACCOUNT u ON d.department_id = u.department_id
WHERE u.status = 'active'
GROUP BY d.department_name
ORDER BY user_count DESC;

2. Project Models with Deployments

SELECT p.project_name, m.model_name, m.model_type,
       COUNT(dep.deployment_id) as deployment_count
FROM PROJECT p
JOIN MODEL m ON p.project_id = m.project_id
LEFT JOIN DEPLOYMENTS dep ON m.model_id = dep.model_id
GROUP BY p.project_name, m.model_name, m.model_type
ORDER BY deployment_count DESC;

3. Most Active Users (by Session Logs)

SELECT u.username, u.full_name, COUNT(sl.log_id) as activity_count
FROM USER_ACCOUNT u
JOIN SESSIONS s ON u.user_id = s.user_id
JOIN SESSION_LOGS sl ON s.session_id = sl.session_id
WHERE s.session_date >= ADD_MONTHS(SYSDATE, -1)
GROUP BY u.username, u.full_name
ORDER BY activity_count DESC
FETCH FIRST 10 ROWS ONLY;

πŸ“ˆ Database Design Highlights

Normalization

  • 3NF (Third Normal Form) compliance
  • No transitive dependencies
  • Minimal data redundancy

Constraints

  • Primary Keys: All tables
  • Foreign Keys: 15+ relationships
  • Check Constraints: Data validation
  • NOT NULL: Critical fields
  • UNIQUE: Usernames, emails

Indexing Strategy

  • Primary key indexes (automatic)
  • Foreign key indexes (recommended)
  • Suggested indexes for frequent queries

πŸ› οΈ Technology Stack

  • Database: Oracle Database 19c
  • Language: SQL (PL/SQL)
  • Tools: SQL*Plus, SQL Developer
  • Testing: Custom SQL test suite
  • Documentation: Markdown, PDF

πŸ“š Documentation

Detailed documentation is available in the following files:

  1. README_EXECUTION.md - Step-by-step execution guide with troubleshooting
  2. FINAL_REPORT.md - Complete project report with analysis
  3. TEST_SUMMARY.md - Test results and validation
  4. database_team_proj_erd.pdf - Visual ER diagram

🀝 Contributing

This is an academic project for educational purposes. Feel free to:

  • Fork the repository
  • Experiment with the schema
  • Add new queries
  • Improve documentation

πŸ“„ License

This project is available for educational and non-commercial use.

πŸ‘€ Author

SoftKleenex

  • GitHub: @softkleenex
  • Project: LLM Database Platform
  • Course: Database Systems (COMP322)

πŸ™ Acknowledgments

  • Database Systems course instructors
  • Oracle Database documentation
  • Team members who contributed to the original design

⭐ Star this repository if you find it helpful for learning database design!

πŸ“§ For questions or feedback, please open an issue on GitHub.

About

Complete database design project: LLM platform with 10 tables, 6000+ records, SQL scripts, and comprehensive documentation

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages