Add RTX-focused fast path preset via config profile #11

@vecnode

Description

Add an opt‑in “fast path” configuration preset tuned for typical RTX GPUs that reuses the existing configuration system but applies performance‑oriented defaults such as autocast, recommended batch sizes, and CUDA/cuDNN flags. This should complement and build on top of the work tracked in #5 (Adaptive OS-based support for RTX series).

Why

  • Many users will have “standard” RTX setups where a curated set of defaults can provide an immediate speed boost.
  • Centralizing these choices in a named profile makes it easier to test, reproduce, and iterate on performance improvements.
  • Keeping this as a preset ensures existing conservative defaults remain available.

What to do

  • Define a named profile (e.g., rtx-fast) in the existing config system.
  • For that profile, specify:
    • Autocast / mixed precision defaults appropriate for RTX.
    • Recommended batch size / gradient accumulation settings.
    • Any CUDA/cuDNN knobs or environment variables that are safe and beneficial.
  • Expose this profile via the CLI/API (for example, --profile rtx-fast), reusing the central config object from #2 (Centralize config via Pydantic/dataclass and env overrides).
  • Add a short section in the README explaining when to use this preset and its trade-offs.
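
A minimal sketch of how the steps above could fit together, assuming a dataclass-based config. `RunConfig`, `PROFILES`, `load_config`, `apply_backend_flags`, and the `--batch-size` flag are hypothetical names, and every preset value is a placeholder to be replaced by benchmarked numbers. The central config object from #2 does not exist in this issue, so a stand-in is used here.

```python
import argparse
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class RunConfig:
    """Hypothetical stand-in for the central config object planned in #2."""
    autocast: bool = False
    autocast_dtype: str = "float32"
    batch_size: int = 8
    grad_accum_steps: int = 1
    cudnn_benchmark: bool = False
    allow_tf32: bool = False


# Each profile is a set of overrides on top of the conservative defaults.
# The rtx-fast values are illustrative placeholders, not tuned numbers.
PROFILES = {
    "default": {},
    "rtx-fast": {
        "autocast": True,
        "autocast_dtype": "float16",
        "batch_size": 32,
        "grad_accum_steps": 1,
        "cudnn_benchmark": True,
        # TF32 only has an effect on Ampere-class GPUs and newer.
        "allow_tf32": True,
    },
}


def load_config(profile="default", **overrides):
    """Build a config: defaults, then profile values, then explicit overrides."""
    if profile not in PROFILES:
        raise ValueError(f"unknown profile: {profile!r}")
    return replace(RunConfig(), **{**PROFILES[profile], **overrides})


def apply_backend_flags(cfg):
    """Push the CUDA/cuDNN knobs into PyTorch (imported lazily)."""
    import torch

    torch.backends.cudnn.benchmark = cfg.cudnn_benchmark
    torch.backends.cuda.matmul.allow_tf32 = cfg.allow_tf32
    torch.backends.cudnn.allow_tf32 = cfg.allow_tf32


def parse_args(argv=None):
    """Expose the profile as a single CLI flag, with optional overrides."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--profile", choices=sorted(PROFILES), default="default")
    parser.add_argument("--batch-size", type=int, default=None)
    args = parser.parse_args(argv)
    overrides = {} if args.batch_size is None else {"batch_size": args.batch_size}
    return load_config(args.profile, **overrides)
```

With this layout, `parse_args(["--profile", "rtx-fast"])` returns the fast preset, while omitting the flag keeps the conservative defaults. Explicit flags such as `--batch-size` take precedence over the profile, so users on GPUs with less memory can still opt in.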

Acceptance criteria

  • Users with common RTX GPUs can enable the fast preset with a single flag or config option.
  • The fast preset delivers a measurable speedup over the default profile on typical RTX hardware, with no regressions in correctness.
  • This issue is linked conceptually (and in description) to Adaptive OS-based support for RTX series #5 so that future GPU work has a clear place to plug in.
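
One way the speed criterion above could be checked is a small timing helper that compares the same step under both profiles. This is only a sketch: `median_step_time` and `speedup` are hypothetical helpers, and the warmup and iteration counts are arbitrary. On a GPU, `sync` would typically be `torch.cuda.synchronize` so that asynchronous kernels are included in the measurement.

```python
import statistics
import time


def median_step_time(step_fn, warmup=3, iters=10, sync=None):
    """Return the median wall-clock time of step_fn in seconds.

    sync is an optional callable that blocks until device work finishes.
    """
    for _ in range(warmup):
        step_fn()
    if sync is not None:
        sync()

    samples = []
    for _ in range(iters):
        start = time.perf_counter()
        step_fn()
        if sync is not None:
            sync()
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


def speedup(baseline_seconds, candidate_seconds):
    """Ratio above 1.0 means the candidate profile is faster."""
    return baseline_seconds / candidate_seconds
```

Correctness would still need its own check, for example comparing outputs or loss values between the two profiles within a tolerance suited to mixed precision.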

Labels

enhancement (New feature or request)