Implement core engine entry point and refactor Python inference #37

Merged
Eamon2009 merged 5 commits into master from exp
May 15, 2026

@Eamon2009
Owner

Summary

Documentation improvements, including chat images.

Checks

C++ build still works
Backend changes were smoke-tested locally
Frontend build still passes
Docs or screenshots were updated if needed

codeaddict-119 and others added 5 commits May 1, 2026 10:08
# Description
This PR introduces the primary entry point for the QUADTRIX engine in
src/main.cpp. It establishes a unified workflow that handles model
lifecycle management without relying on TorchScript, utilizing our
custom internal headers for model architecture.

# Key Features
- Dual-Mode Execution: Integrated support for both a training loop and
an interactive chat mode.

- Infinite Generation: Implemented an unconstrained inference loop for
continuous text generation.

- C++ Architecture: Bypasses TorchScript to use custom-defined layers
and headers, ensuring direct control over the execution graph.

- Resource Management: Runs on CPU only.
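
The "Infinite Generation" item can be pictured with a short sketch. The PR implements this in C++ in src/main.cpp; the Python below is only an illustration, and `generate_forever`, `next_token`, and `toy_model` are hypothetical names that do not come from the PR. It assumes the loop feeds the model a sliding window no longer than the block size (190 in the configuration reported further down) and leaves stopping to the caller.

```python
from itertools import islice
from typing import Callable, Iterator, List


def generate_forever(
    next_token: Callable[[List[int]], int],
    prompt: List[int],
    block_size: int = 190,
) -> Iterator[int]:
    """Yield tokens indefinitely, passing at most `block_size` tokens of context."""
    context = list(prompt)
    while True:  # no stop condition here: the caller decides when to stop
        window = context[-block_size:]  # crop to the model's context length
        token = next_token(window)
        # Only the last `block_size` tokens can influence the next step,
        # so older tokens are dropped to keep memory bounded.
        context = (context + [token])[-block_size:]
        yield token


def toy_model(window: List[int]) -> int:
    """Stand-in for a real model: returns (last token + 1) modulo 5."""
    return (window[-1] + 1) % 5


# Take ten tokens from the otherwise endless stream.
first_ten = list(islice(generate_forever(toy_model, [0]), 10))
print(first_ten)
```

Writing the loop as a generator keeps "unconstrained" generation safe to consume: a chat front end can stop pulling tokens at any point without the loop needing its own length limit.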
# Description
This PR synchronizes the model interaction logic across both the Python
backend utilities and the web frontend. It establishes a consistent way
to interface with the model weights and the C++ engine.

## Python Backend (inference.py)
- Goal: Refactor the standalone inference script to support modern
weight loading.

- Weight Mapping: Updated to load and map .pt files directly using the
refactored architecture.

- Chat Mode: Implemented a robust interactive loop for rapid model
testing and verification.
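
As a rough picture of the chat mode described above, here is a minimal sketch of an interactive loop. It is not the code in inference.py: `chat_loop`, the `generate` callable, and the quit words are assumptions made for illustration, and the model call is replaced by a stand-in so the example runs without any weights.

```python
from typing import Callable, Iterable, List, Tuple


def chat_loop(
    generate: Callable[[str], str],
    prompts: Iterable[str],
    write: Callable[[str], None] = print,
    quit_words: Tuple[str, ...] = ("quit", "exit"),
) -> int:
    """Answer each prompt until a quit word or end of input; return turns served."""
    turns = 0
    for line in prompts:
        text = line.strip()
        if not text:
            continue  # ignore blank lines instead of sending them to the model
        if text.lower() in quit_words:
            break
        write(generate(text))
        turns += 1
    return turns


# Stand-in "model" that upper-cases the prompt, with replies captured in a list.
replies: List[str] = []
served = chat_loop(
    lambda prompt: prompt.upper(),
    ["hello", "   ", "exit", "never reached"],
    write=replies.append,
)
print(served, replies)
```

For a real terminal session, `prompts` could be `sys.stdin` and `generate` a function that wraps the loaded model. Passing both in as arguments keeps the loop testable without loading a checkpoint.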

## Frontend Layer (frontend/src/api)
- Goal: Establish the bridge between the UI and the Quadtrix engine.

- Service Definition: Created the base API client to handle requests to
the C++ backend.

- Dual-Path Logic: Added handlers for both Training control and
Inference/Chat endpoints.

- Stream Support: Prepared the API layer to handle "generation" data
chunks for real-time UI updates.
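
One detail that chunked generation usually has to handle is a multi-byte character split across two chunks. The PR does not describe the wire format, so the sketch below assumes plain UTF-8 byte chunks, and `decode_chunks` is a hypothetical helper name. It is written in Python for consistency with inference.py; the frontend would need the equivalent in its own language.

```python
import codecs
from typing import Iterable, Iterator


def decode_chunks(chunks: Iterable[bytes]) -> Iterator[str]:
    """Decode UTF-8 byte chunks incrementally, yielding text as soon as it is complete."""
    decoder = codecs.getincrementaldecoder("utf-8")(errors="replace")
    for chunk in chunks:
        text = decoder.decode(chunk)  # incomplete trailing bytes stay buffered
        if text:
            yield text
    tail = decoder.decode(b"", final=True)  # flush anything left at end of stream
    if tail:
        yield tail


# "é" is two bytes in UTF-8 (0xC3 0xA9) and is split across the first two chunks.
pieces = list(decode_chunks([b"h\xc3", b"\xa9l", b"lo"]))
print(pieces)
```

Decoding each chunk independently would turn the split character into replacement glyphs; the incremental decoder holds the partial bytes until the rest arrives.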

## Other merged PRs

#7, #6, #5, #4, #3
## Summary
<img width="2185" height="829" alt="run_20260430_192930"
src="https://github.com/user-attachments/assets/420ebbb4-cadf-4408-bc69-fc32ad081c6f"
/>

 
## Model Configuration
 
| Parameter | Value |
|---|---|
| Layers | 6 |
| Heads | 6 |
| Embedding dim | 100 |
| Block size | 190 |
| Batch size | 64 |
| Dropout | 0.2 |
| Learning rate | 3e-4 |
| Total parameters | **10,837,257** |
 
## Training Details
 
| Field | Value |
|---|---|
| Steps | 8,000 |
| Eval every | 200 steps |
| Optimizer seed | 1337 |
| Train tokens | 14,080,249 |
| Val tokens | 1,564,473 |
| Precision | bf16 |
| MFU | 60.0% |
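
The figures in the two tables above can be combined into a rough token budget. The arithmetic below assumes each step processes one batch of `batch size × block size` tokens with no gradient accumulation, which the PR does not state; the variable names are illustrative.

```python
# Values copied from the Model Configuration and Training Details tables.
batch_size, block_size = 64, 190
steps, eval_every = 8_000, 200
train_tokens, val_tokens = 14_080_249, 1_564_473

tokens_per_step = batch_size * block_size        # tokens in one batch
tokens_seen = tokens_per_step * steps            # tokens processed over the run
passes_over_train = tokens_seen / train_tokens   # approximate number of epochs
val_fraction = val_tokens / (train_tokens + val_tokens)
eval_points = steps // eval_every                # evaluations, not counting step 0

print(f"tokens per step:   {tokens_per_step:,}")
print(f"tokens seen:       {tokens_seen:,}")
print(f"passes over train: {passes_over_train:.2f}")
print(f"validation split:  {val_fraction:.1%}")
print(f"eval points:       {eval_points}")
```

Under that assumption the run sees 12,160 tokens per step and 97,280,000 tokens in total, a little under seven passes over the training set, with the validation set making up about 10% of all tokens.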
 
## Results
 
| Metric | Value |
|---|---|
| Best val loss | **2.3918** |
| Final train loss | 2.2825 |
| Total loss drop | 8.57 |
| Peak throughput | 19,602 tok/s |
| Mean throughput | 18,756 tok/s |
| Peak grad norm | 2.2504 |
| Mean grad norm | 1.6894 |
| Training time | **82m 43s** |
| Checkpoint | `best_model.pt` |
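
Two quick derived numbers can make the results table easier to read. The sketch below assumes the reported losses are mean cross-entropy in nats (natural logarithm), which is the usual convention but is not stated in the PR, so the perplexities are indicative only.

```python
import math

# Values copied from the Results table.
best_val_loss = 2.3918
final_train_loss = 2.2825
mean_tok_per_s = 18_756
training_seconds = 82 * 60 + 43  # "82m 43s"

# Perplexity is exp(loss) when the loss is cross-entropy measured in nats.
val_perplexity = math.exp(best_val_loss)
train_perplexity = math.exp(final_train_loss)

# Tokens processed if the mean throughput held for the whole training time.
tokens_at_mean_rate = mean_tok_per_s * training_seconds

print(f"val perplexity:   {val_perplexity:.2f}")
print(f"train perplexity: {train_perplexity:.2f}")
print(f"tokens at mean throughput: {tokens_at_mean_rate:,}")
```

Under that assumption the best validation loss corresponds to a perplexity of about 11, and the mean throughput sustained over 4,963 seconds amounts to roughly 93 million tokens.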
…#30)

## Summary
Publish GitHub Package using npm

## Checks

- [ ] C++ build still works
- [ ] Backend changes were smoke-tested locally
- [ ] Frontend build still passes
@Eamon2009 Eamon2009 merged commit 3062917 into master May 15, 2026
6 checks passed

2 participants