hypr-llm deployment groundwork #1094
Conversation
📝 Walkthrough

The changes introduce support for a new model variant, "HyprLLM", across the Rust and TypeScript codebases, including grammar selection logic and model metadata handling. Grammar constants and enum variants are refactored for finer control, template prompts and shell scripts are updated with new directives and paths, and a new GBNF grammar file is added.
Sequence Diagram(s)

Model Loading and Grammar Selection

```mermaid
sequenceDiagram
    participant Client
    participant Server
    participant Llama
    participant GBNF
    Client->>Server: Request with model and grammar metadata
    Server->>Llama: Load model
    Llama->>Llama: Read metadata, set ModelName (HyprLLM or Other)
    Server->>Server: select_grammar(model_name, task)
    Server->>GBNF: build grammar for model/task
    GBNF-->>Server: Return grammar string
    Server-->>Client: Response using selected grammar
```
Supported Model Handling in Local LLM

```mermaid
sequenceDiagram
    participant User
    participant LocalLLM
    participant Model
    User->>LocalLLM: Select model (e.g., "HyprLLM")
    LocalLLM->>Model: file_name(), model_url(), model_size()
    Model-->>LocalLLM: Return model-specific info
    LocalLLM-->>User: Model loaded and ready
```
Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 Clippy (1.86.0)

error: failed to load source for dependency
Caused by: …
Caused by: …
Caused by: …
Actionable comments posted: 1
🔭 Outside diff range comments (1)
plugins/local-llm/src/local/model.rs (1)
1-1: Critical: Update SUPPORTED_MODELS array.

The array size is still 1 and only includes `Llama3p2_3bQ4`, but now there are two supported models. This will cause the new `HyprLLM` model to be unavailable. Apply this fix:

```diff
-pub static SUPPORTED_MODELS: &[SupportedModel; 1] = &[SupportedModel::Llama3p2_3bQ4];
+pub static SUPPORTED_MODELS: &[SupportedModel; 2] = &[SupportedModel::Llama3p2_3bQ4, SupportedModel::HyprLLM];
```
🧹 Nitpick comments (2)
scripts/s3/upload.sh (1)
9-12: `--log trace` is useful for debugging but noisy in CI; gate it behind a flag.

Full trace logging can produce very large logs and slow down jobs. Expose it via an env var so the default remains concise:

```diff
-  --log trace \
+  ${S3CMD_LOG_LEVEL:+--log "$S3CMD_LOG_LEVEL"} \
```

Callers can then opt in with `S3CMD_LOG_LEVEL=trace`.

crates/llama/src/lib.rs (1)
235-239: Consider the performance impact of debug token printing.

The debug build prints each token immediately and flushes stdout. While useful for debugging, this could slow generation during development. Consider whether this level of output is necessary, or gate it behind an opt-in switch as sketched below.
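A minimal sketch of one way to gate the printing; the `HYPR_DEBUG_TOKENS` variable and the function name are illustrative assumptions, not the crate's actual API:

```rust
use std::io::Write;

// Hypothetical per-token hook; the real code at lib.rs:235-239 may differ.
fn print_debug_token(token_text: &str) {
    // Opt in via env var so even debug builds stay quiet by default.
    if cfg!(debug_assertions) && std::env::var_os("HYPR_DEBUG_TOKENS").is_some() {
        print!("{token_text}");
        // Flush so partial output appears immediately while streaming.
        let _ = std::io::stdout().flush();
    }
}
```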
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (9)
- crates/gbnf/assets/enhance-other.gbnf (1 hunks)
- crates/gbnf/src/lib.rs (1 hunks)
- crates/llama/src/lib.rs (8 hunks)
- crates/template/assets/create_title.user.jinja (1 hunks)
- crates/template/assets/enhance.user.jinja (1 hunks)
- plugins/local-llm/js/bindings.gen.ts (1 hunks)
- plugins/local-llm/src/local/model.rs (1 hunks)
- plugins/local-llm/src/server.rs (2 hunks)
- scripts/s3/upload.sh (1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
`**/*.{js,ts,tsx,rs}`:
1. No error handling.
2. No unused imports, variables, or functions.
3. For comments, keep it minimal. It should be about "Why", not "What".
⚙️ Source: CodeRabbit Configuration File
List of files the instruction was applied to:
- plugins/local-llm/js/bindings.gen.ts
- plugins/local-llm/src/local/model.rs
- plugins/local-llm/src/server.rs
- crates/gbnf/src/lib.rs
- crates/llama/src/lib.rs
🧬 Code Graph Analysis (1)
plugins/local-llm/js/bindings.gen.ts (1)
plugins/local-stt/js/bindings.gen.ts (1)
`SupportedModel` (69-69)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: ci (macos, macos-latest)
- GitHub Check: ci (windows, windows-latest)
- GitHub Check: ci
🔇 Additional comments (18)
crates/template/assets/create_title.user.jinja (1)
6-7: LGTM: Clean directive addition.

The `/no_think` directive is appropriately placed after the title generation prompt.

crates/template/assets/enhance.user.jinja (1)
23-24: LGTM: Appropriate directive placement.

The `/think` directive complements the conditional thinking logic for the HyprLocal type.

plugins/local-llm/js/bindings.gen.ts (1)
49-49: LGTM: Consistent type definition.

The `SupportedModel` type union correctly includes the new `HyprLLM` variant, maintaining consistency with the Rust backend.
plugins/local-llm/src/local/model.rs (4)
6-6: LGTM: Clean enum extension.

The `HyprLLM` variant is properly added to the enum.

13-13: LGTM: Appropriate filename.

The filename "hypr-llm.gguf" follows the naming convention.

20-20: LGTM: Valid model URL.

The URL points to the correct HyprLLM model file.

27-27: LGTM: Reasonable model size.

The model size (1107409056 bytes ≈ 1.03 GB) is appropriate for the model. The sketch below pulls these four observations together.
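Taken together, these comments imply accessor arms roughly like the following sketch. Only the `HyprLLM` file name and size come from the review; the URLs and the `Llama3p2_3bQ4` values are placeholders, not the repository's actual data:

```rust
pub enum SupportedModel {
    Llama3p2_3bQ4,
    HyprLLM,
}

impl SupportedModel {
    pub fn file_name(&self) -> &'static str {
        match self {
            Self::Llama3p2_3bQ4 => "llama-3.2-3b-q4.gguf", // placeholder
            Self::HyprLLM => "hypr-llm.gguf",              // from the review
        }
    }

    pub fn model_url(&self) -> &'static str {
        match self {
            Self::Llama3p2_3bQ4 => "https://example.com/llama-3.2-3b-q4.gguf", // placeholder
            Self::HyprLLM => "https://example.com/hypr-llm.gguf",              // placeholder
        }
    }

    pub fn model_size(&self) -> u64 {
        match self {
            Self::Llama3p2_3bQ4 => 2_000_000_000, // placeholder
            Self::HyprLLM => 1_107_409_056,       // ≈ 1.03 GB, per the review
        }
    }
}
```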
crates/gbnf/assets/enhance-other.gbnf (1)
1-6: LGTM: Well-formed grammar definition.

The GBNF grammar correctly defines the structure for `<think>` blocks with 1-4 formatted lines followed by arbitrary content.

crates/gbnf/src/lib.rs (3)
1-2: LGTM! Clear separation of grammar constants.

The addition of separate constants for the `EnhanceOther` and `EnhanceHypr` grammars follows the established pattern and provides a clear distinction between model-specific grammar handling.

7-8: LGTM! Enum variants match the grammar separation.

The `EnhanceOther` and `EnhanceHypr` variants appropriately represent the split grammar functionality and maintain consistency with the constant naming convention.

16-17: LGTM! Correct enum-to-constant mapping.

The match arms correctly map each enum variant to its corresponding constant, maintaining the established pattern for grammar retrieval; the sketch below illustrates the shape.
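A minimal sketch of the constant/enum/match shape these three comments describe; the enum and method names and the `include_str!` paths are assumptions, and only `EnhanceOther`/`EnhanceHypr` come from the review:

```rust
// Grammar text bundled at compile time; asset paths are assumed.
const ENHANCE_OTHER: &str = include_str!("../assets/enhance-other.gbnf");
const ENHANCE_HYPR: &str = include_str!("../assets/enhance-hypr.gbnf");

pub enum GrammarTask {
    EnhanceOther,
    EnhanceHypr,
}

impl GrammarTask {
    // Each variant resolves to its matching grammar constant.
    pub fn grammar(&self) -> &'static str {
        match self {
            Self::EnhanceOther => ENHANCE_OTHER,
            Self::EnhanceHypr => ENHANCE_HYPR,
        }
    }
}
```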
plugins/local-llm/src/server.rs (2)
270-276: LGTM! Clean refactoring of grammar selection logic.

The extraction of grammar selection into a dedicated helper function improves code organization and readability. The parameter mapping correctly passes the model name and grammar task from the request metadata.
333-343: LGTM! Well-structured grammar selection logic.

The `select_grammar` function provides clear model-specific grammar selection:

- Distinguishes between HyprLLM and other models for the "enhance" task
- Maps standard tasks to their corresponding grammar variants
- Returns `None` for unknown tasks, allowing fallback behavior

The implementation aligns with the grammar separation introduced in the GBNF module; a sketch of this shape follows.
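As a sketch, the described behavior maps onto a match like the one below; the signature and the task string are assumptions inferred from the bullets above, and the `ModelName`/`GrammarTask` definitions are repeated so the snippet stands alone:

```rust
pub enum ModelName {
    HyprLLM,
    Other(Option<String>),
}

pub enum GrammarTask {
    EnhanceOther,
    EnhanceHypr,
}

// Resolve a grammar from the model identity plus the requested task.
// Unknown tasks yield None so the server falls back to unconstrained
// generation.
fn select_grammar(model_name: &ModelName, task: &str) -> Option<GrammarTask> {
    match (model_name, task) {
        (ModelName::HyprLLM, "enhance") => Some(GrammarTask::EnhanceHypr),
        (ModelName::Other(_), "enhance") => Some(GrammarTask::EnhanceOther),
        _ => None,
    }
}
```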
crates/llama/src/lib.rs (5)
26-29: LGTM! Well-designed enum for model type distinction.

The `ModelName` enum provides a clear, type-safe distinction between HyprLLM and other models. The `Other` variant with `Option<String>` allows for potential future model name tracking while maintaining flexibility.
32-32: LGTM! Appropriate struct extension.

Adding the `name` field to the `Llama` struct enables model-specific behavior and aligns with the grammar selection requirements introduced in the server module.
265-269: LGTM! Robust model name detection logic.

The model name detection correctly:

- Checks the `general.name` metadata field
- Maps "hypr-llm" to `ModelName::HyprLLM`
- Handles both valid names and missing metadata gracefully
- Provides fallback behavior for unknown cases

A sketch of this mapping follows the list.
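A minimal sketch of that detection as a pure function; the function name is hypothetical, and the actual GGUF metadata accessor is left to the caller since the review does not quote it:

```rust
pub enum ModelName {
    HyprLLM,
    Other(Option<String>),
}

// Classify a model from its `general.name` GGUF metadata value,
// which may be missing entirely. Unknown names are preserved for
// later diagnostics.
fn model_name_from_metadata(general_name: Option<&str>) -> ModelName {
    match general_name {
        Some("hypr-llm") => ModelName::HyprLLM,
        Some(other) => ModelName::Other(Some(other.to_string())),
        None => ModelName::Other(None),
    }
}
```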
383-383: LGTM! Test updated to use new grammar variant.

The test correctly uses `EnhanceOther` instead of the previous `Enhance` variant, aligning with the grammar separation changes.
88-93: Confirm sampler configuration for Qwen3-1.7B-GGUF.

The model `Qwen/Qwen3-1.7B-GGUF` is valid (pipeline: text-generation), but the chosen sampling hyperparameters are highly task-dependent. Please verify that these settings produce the desired balance of creativity, coherence, and generation speed for your use case.

- File: `crates/llama/src/lib.rs` (lines 88-93)
- Current settings:
  - Temperature: 0.6
  - Penalties: (repeat_last_n=0, repeat_penalty=1.4, repeat_penalty_slope=0.1, repeat_penalty_scale=1.3)
  - Mirostat V2: (seed=1234, tau=3.0, eta=0.2)

If you need a different generation profile (e.g., higher diversity or stricter repetition control), adjust these values accordingly or benchmark against sample prompts.
No description provided.