# Description
This PR introduces the primary entry point for the QUADTRIX engine in src/main.cpp. It establishes a unified workflow that handles model lifecycle management without relying on TorchScript, using our custom internal headers for the model architecture.

# Key Features
- Dual-Mode Execution: Integrated support for both a training loop and an interactive chat mode.
- Infinite Generation: Implemented an unconstrained inference loop for continuous text generation.
- C++ Architecture: Bypasses TorchScript to use custom-defined layers and headers, ensuring direct control over the execution graph.
- Resource Management: Runs on CPU only.
# Description
This PR synchronizes the model interaction logic across both the Python backend utilities and the web frontend. It establishes a consistent way to interface with the model weights and the C++ engine.

## Python Backend (inference.py)
- Goal: Refactor the standalone inference script to support modern weight loading.
- Weight Mapping: Updated to load and map .pt files directly using the refactored architecture.
- Chat Mode: Implemented a robust interactive loop for rapid model testing and verification.

## Frontend Layer (frontend/src/api)
- Goal: Establish the bridge between the UI and the Quadtrix engine.
- Service Definition: Created the base API client to handle requests to the C++ backend.
- Dual-Path Logic: Added handlers for both Training control and Inference/Chat endpoints.
- Stream Support: Prepared the API layer to handle "generation" data chunks for real-time UI updates.

## Other merged PRs
#7, #6, #5, #4, #3
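The interactive chat mode described for inference.py can be pictured roughly as follows. This is a minimal sketch, not the script's actual code: `chat_loop`, `echo_model`, and the exit words are hypothetical names chosen for illustration, and the model call is replaced by a stub so the loop logic can be read (and tested) on its own.

```python
from typing import Callable, Iterable, List, Sequence


def chat_loop(
    generate: Callable[[str], str],
    prompts: Iterable[str],
    exit_words: Sequence[str] = ("quit", "exit"),
) -> List[str]:
    """Feed each prompt to `generate` until an exit word or end of input.

    `generate` stands in for the real model call (hypothetical here).
    Blank lines are skipped; replies are collected and returned.
    """
    replies: List[str] = []
    for prompt in prompts:
        text = prompt.strip()
        if not text:
            continue  # ignore empty input instead of querying the model
        if text.lower() in exit_words:
            break  # user asked to leave the chat
        replies.append(generate(text))
    return replies


def echo_model(prompt: str) -> str:
    """Stub model: a real script would run the loaded weights here."""
    return f"[model reply to: {prompt}]"
```

Passing `sys.stdin` as `prompts` would turn this into a terminal chat; taking the input as an iterable keeps the loop easy to smoke-test with a plain list.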
## Summary
<img width="2185" height="829" alt="run_20260430_192930" src="https://github.com/user-attachments/assets/420ebbb4-cadf-4408-bc69-fc32ad081c6f" />

## Model Configuration
| Parameter | Value |
|---|---|
| Layers | 6 |
| Heads | 6 |
| Embedding dim | 100 |
| Block size | 190 |
| Batch size | 64 |
| Dropout | 0.2 |
| Learning rate | 3e-4 |
| Total parameters | **10,837,257** |

## Training Details
| Field | Value |
|---|---|
| Steps | 8,000 |
| Eval every | 200 steps |
| Optimizer seed | 1337 |
| Train tokens | 14,080,249 |
| Val tokens | 1,564,473 |
| Precision | bf16 |
| MFU | 60.0% |

## Results
| Metric | Value |
|---|---|
| Best val loss | **2.3918** |
| Final train loss | 2.2825 |
| Total loss drop | 8.57 |
| Peak throughput | 19,602 tok/s |
| Mean throughput | 18,756 tok/s |
| Peak grad norm | 2.2504 |
| Mean grad norm | 1.6894 |
| Training time | **82m 43s** |
| Checkpoint | `best_model.pt` |
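To put the configuration and training tables in perspective, the short calculation below derives how many tokens the run processed and how the data was split. It is a sketch under two assumptions that the tables do not state: each step consumes exactly batch size × block size tokens (no gradient accumulation), and every sequence is full length. The variable names are illustrative only.

```python
# Values copied from the Model Configuration and Training Details tables.
batch_size = 64
block_size = 190
steps = 8_000
train_tokens = 14_080_249
val_tokens = 1_564_473

# Assumption: one optimizer step sees batch_size full-length sequences.
tokens_per_step = batch_size * block_size          # 12,160 tokens
tokens_seen = tokens_per_step * steps              # 97,280,000 tokens

# How many times the run cycles through the training set (about 6.9).
passes_over_train = tokens_seen / train_tokens

# Share of the corpus held out for validation (about 10%).
val_fraction = val_tokens / (train_tokens + val_tokens)

print(f"tokens per step:    {tokens_per_step:,}")
print(f"tokens seen:        {tokens_seen:,}")
print(f"passes over train:  {passes_over_train:.2f}")
print(f"validation share:   {val_fraction:.1%}")
```

Under these assumptions the run covers the training set roughly seven times, with a train/validation split close to 90/10.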
…#30)

## Summary
Publish GitHub Package using npm

## Checks
- [ ] C++ build still works
- [ ] Backend changes were smoke-tested locally
- [ ] Frontend build still passes
## Summary
Docs improvement with chat images

## Checks
- C++ build still works
- Backend changes were smoke-tested locally
- Frontend build still passes
- Docs or screenshots were updated if needed