feat: implement AI training ban and major CLI/Build pipeline overhaul #308
Merged
LunaStev merged 1 commit on Mar 27, 2026
This commit enforces a strict No-AI-Training policy across the entire repository and introduces a massive redesign of the `wavec` CLI and compilation pipeline.

[Details]

1. AI/ML Training Prohibition
- Added `ai.txt` to explicitly disallow crawling, scraping, pretraining, fine-tuning, and dataset construction for AI models without prior written permission (includes multilingual notices).
- Injected an `AI TRAINING NOTICE` header into almost all source files (`.rs`, `.wave`, `.py`, `.sh`, `Dockerfile`).

2. CLI & Build System Overhaul (`src/cli.rs`)
- Replaced basic run/build logic with a highly configurable `BuildRequest` and `BuildPlan` architecture.
- Introduced `--emit` flag targeting specific outputs: `ast`, `ir`, `bc`, `asm`, `obj`, `bin`, and `check`.
- Added `check` command (alias for `--emit=check`) for rapid semantic validation.
- Added advanced linking and compilation flags: `--shared`, `--static`, `--pie`, `--no-pie`, `--freestanding`, `--entry`, `--linker-script`, `--no-start-files`.
- Added support for lowering and linking non-Wave inputs (IR, Bitcode, Assembly, Object files) using Clang as a backend tool via `--input-type`.
- Added `--dry-run` execution mode with both Human and JSON (`--error-format=json`) output formats.
- Added `print` command to query compiler details (e.g., `target-list`, `sysroot`).

3. LLVM Backend Enhancements
- Plumbed `code_model` and `relocation_model` options through LLVM configuration flags (`-C code-model`, `-C relocation-model`).

4. Error Handling & Runner Refactoring
- Updated `CliError` to return specific exit codes (1, 2, 3) depending on the error type.
- Refactored `runner.rs` to expose modular AST/IR generation functions (`frontend_prepare_wave_ast`, `emit_wave_ast_text`, etc.) to support the new build plans.

Signed-off-by: LunaStev <luna@lunastev.org>
This PR introduces a strict "No-AI-Training" policy across the repository and executes a foundational redesign of the `wavec` CLI and compilation pipeline. By moving to a structured `BuildPlan` architecture and adding professional-grade compiler flags, Wave is now a much more capable toolchain for systems-level development and CI/CD integration.

Key Changes
1. AI/ML Training Prohibition & Policy Enforcement
- Added `ai.txt` to the repository root. This file explicitly prohibits the use of the codebase for crawling, scraping, pretraining, fine-tuning, or dataset construction without prior written permission.
- Injected an `AI TRAINING NOTICE` header across all source files, including `.rs`, `.wave`, `.py`, `.sh`, and `Dockerfile`, to ensure the policy is visible at the source level.

2. Professional CLI & Build System Redesign
The CLI has been completely refactored from an imperative argument loop to a structured `BuildRequest` and `BuildPlan` architecture:
- Introduced the `--emit=<kinds>` flag. Supported outputs include `ast`, `ir`, `bc`, `asm`, `obj`, `bin`, and `check`.
- Added the `check` command (alias for `build --emit=check`) to perform rapid syntax and semantic validation without the overhead of code generation.
- Added the linking and compilation flags `--shared`, `--static`, `--pie`, `--no-pie`, and `--freestanding`.
- Added `--entry`, `--linker-script`, and `--no-start-files`.
- Added `--input-type=<wave|ir|bc|asm|obj>`, allowing `wavec` to act as a wrapper for linking LLVM IR or assembly files.
- Added the `print` command to query toolchain capabilities such as `target-list`, `sysroot`, `cpu-list`, and `target-features`.
- Added `--dry-run` mode and support for `--error-format=json` for better integration with IDEs and automation tools.

3. LLVM Backend & Runner Enhancements
- Plumbed `code-model` and `relocation-model` through the `-C` (codegen) flags.
- Refactored `src/runner.rs` to expose modular functions like `frontend_prepare_wave_ast` and `emit_wave_ast_text`. This allows the compiler to handle complex multi-step build plans more efficiently.
- Updated `CliError` to return structured exit codes (1 for general, 2 for syntax/semantic, 3 for backend errors) to simplify pipeline logic.

Example Usage
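The exit-code scheme described for `CliError` (1 for general, 2 for syntax/semantic, 3 for backend errors) can be illustrated with a minimal Rust sketch. The type `CliErrorSketch`, its variant names, and the helper `exit_code_for` are hypothetical and chosen for illustration only; the actual definition in `src/cli.rs` may look quite different.

```rust
// Hypothetical sketch: names and variants are assumptions, not the
// actual `CliError` type from `src/cli.rs`.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum CliErrorSketch {
    /// General errors.
    General(String),
    /// Syntax or semantic errors.
    Frontend(String),
    /// Backend errors.
    Backend(String),
}

impl CliErrorSketch {
    /// Map each error category to the exit code listed in the PR text.
    pub fn exit_code(&self) -> i32 {
        match self {
            CliErrorSketch::General(_) => 1,
            CliErrorSketch::Frontend(_) => 2,
            CliErrorSketch::Backend(_) => 3,
        }
    }
}

/// Convert a command result into a process exit code.
/// Returning 0 on success is an assumption of this sketch.
pub fn exit_code_for(result: &Result<(), CliErrorSketch>) -> i32 {
    match result {
        Ok(()) => 0,
        Err(err) => err.exit_code(),
    }
}
```

A `main` function could pass the value from `exit_code_for` to `std::process::exit`, so that a CI script can tell a failed check apart from a backend failure by inspecting the exit status.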
Emitting LLVM IR and Assembly:
Rapid Semantic Checking:
Linking with a custom linker script for a freestanding target:
Querying supported targets:
Benefits
- The `--emit` and linking flags bring `wavec` closer to the capabilities of `clang` or `rustc`.
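To make the `--emit=<kinds>` flag from the CLI redesign more concrete, here is a minimal Rust sketch of how such a value might be parsed into the seven documented output kinds. The names `EmitKind` and `parse_emit_kinds` are hypothetical, and both the comma-separated syntax and the de-duplication of repeated kinds are assumptions; the PR text does not show the real parser.

```rust
use std::str::FromStr;

// Hypothetical sketch: `EmitKind` and `parse_emit_kinds` are illustrative
// names, not identifiers confirmed to exist in `src/cli.rs`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum EmitKind {
    Ast,
    Ir,
    Bc,
    Asm,
    Obj,
    Bin,
    Check,
}

impl FromStr for EmitKind {
    type Err = String;

    /// Accept exactly the kind names listed in the PR description.
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "ast" => Ok(EmitKind::Ast),
            "ir" => Ok(EmitKind::Ir),
            "bc" => Ok(EmitKind::Bc),
            "asm" => Ok(EmitKind::Asm),
            "obj" => Ok(EmitKind::Obj),
            "bin" => Ok(EmitKind::Bin),
            "check" => Ok(EmitKind::Check),
            other => Err(format!("unknown emit kind: {}", other)),
        }
    }
}

/// Parse the value part of `--emit=<kinds>`.
/// Assumption: kinds are comma-separated and duplicates are ignored.
pub fn parse_emit_kinds(value: &str) -> Result<Vec<EmitKind>, String> {
    let mut kinds = Vec::new();
    for part in value.split(',') {
        let kind: EmitKind = part.trim().parse()?;
        if !kinds.contains(&kind) {
            kinds.push(kind);
        }
    }
    Ok(kinds)
}
```

Under these assumptions, a value such as `ir,asm` yields two kinds in the order given, while an unrecognized name is rejected with an error message rather than silently dropped.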