Add Temperature and Model Selection Parameters to xAI Realtime Plugin #4470
Description
Feature Type
Would make my life easier
Feature Description
The current xAI Realtime plugin (livekit.plugins.xai.realtime.RealtimeModel) has the model hardcoded to "grok-4-1-fast-non-reasoning" with no way to configure temperature or switch between available xAI models. This limits flexibility for developers who need deterministic responses or want to use other Grok model variants.
Current Limitation
In livekit/plugins/xai/realtime/realtime_model.py:
```python
super().__init__(
    base_url=base_url if is_given(base_url) else XAI_BASE_URL,
    model="grok-4-1-fast-non-reasoning",  # hardcoded - no override option
    voice=voice,
    api_key=api_key,
    # ...
)
```
Requested Features
- Temperature Parameter
Add a temperature parameter to control response randomness:
```python
class RealtimeModel(openai.realtime.RealtimeModel):
    def __init__(
        self,
        *,
        voice: NotGivenOr[GrokVoices | str | None] = "Ara",
        temperature: NotGivenOr[float] = NOT_GIVEN,  # NEW: temperature control
        api_key: str | None = None,
        # ... other parameters
    ) -> None:
        ...
```
- Model Selection Parameter
Allow developers to choose between available xAI realtime models:
```python
class RealtimeModel(openai.realtime.RealtimeModel):
    def __init__(
        self,
        *,
        model: str = "grok-4-1-fast-non-reasoning",  # NEW: configurable model
        voice: NotGivenOr[GrokVoices | str | None] = "Ara",
        temperature: NotGivenOr[float] = NOT_GIVEN,
        api_key: str | None = None,
        # ... other parameters
    ) -> None:
        ...
```
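The `NOT_GIVEN` default in the signatures above lets the plugin distinguish "parameter not supplied" (defer to the API default) from an explicitly passed value, including an explicit `None`. A minimal self-contained sketch of that sentinel pattern (the `NotGiven` class, `is_given` helper, and `session_config` function here are illustrative stand-ins for livekit's own utilities, not the real implementation):

```python
class NotGiven:
    """Sentinel type marking a parameter the caller did not supply."""

    def __repr__(self) -> str:
        return "NOT_GIVEN"


NOT_GIVEN = NotGiven()


def is_given(value) -> bool:
    # An explicit None or 0.0 is still "given"; only the sentinel is not.
    return not isinstance(value, NotGiven)


def session_config(temperature=NOT_GIVEN) -> dict:
    config = {"model": "grok-4-1-fast-non-reasoning"}
    if is_given(temperature):
        config["temperature"] = temperature  # only sent when supplied
    return config
```

With this pattern, `session_config()` omits `temperature` entirely (letting the API choose its default), while `session_config(temperature=0.0)` sends an explicit `0.0` rather than being silently dropped.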
Use Cases
- Deterministic medical/financial applications: healthcare and financial voice agents need a low temperature (0.1-0.3) for consistent, reliable responses.
- Model experimentation: access to different Grok model variants (e.g., grok-4-1-fast-reasoning vs. grok-4-1-fast-non-reasoning) to trade off speed against quality.
- Creative applications: higher temperature settings (0.8-1.2) for conversational AI that needs more varied, creative responses.
- Production control: the ability to tune response behavior without switching to a separate STT-LLM-TTS pipeline, preserving the low-latency benefits of the realtime API.
Proposed API Usage
```python
from livekit.agents import AgentSession
from livekit.plugins import xai

# Example 1: deterministic responses for medical diagnosis
session = AgentSession(
    llm=xai.realtime.RealtimeModel(
        model="grok-4-1-fast-non-reasoning",
        voice="ara",
        temperature=0.2,  # low randomness
        api_key=api_key,
    )
)

# Example 2: use reasoning model with balanced creativity
session = AgentSession(
    llm=xai.realtime.RealtimeModel(
        model="grok-4-1-fast-reasoning",
        voice="sal",
        temperature=0.7,  # balanced
        api_key=api_key,
    )
)
```
Implementation Notes
The temperature and model parameters should be passed through to the xAI Realtime API session configuration:
```python
def _create_session_update_event(self) -> SessionUpdateEvent:
    event = super()._create_session_update_event()
    # add temperature if provided
    if is_given(self._temperature):
        event["session"]["temperature"] = self._temperature
    return event
```
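Putting the pieces together, one way the proposed constructor could store both parameters and merge them into the session update. This is a hedged sketch: the base class is stubbed out so the snippet runs standalone, whereas the real plugin would subclass openai.realtime.RealtimeModel and use livekit's NOT_GIVEN sentinel:

```python
NOT_GIVEN = object()  # stand-in for livekit's NOT_GIVEN sentinel


class _StubBase:
    """Stub for openai.realtime.RealtimeModel, for illustration only."""

    def _create_session_update_event(self) -> dict:
        return {"session": {"voice": "Ara"}}


class RealtimeModel(_StubBase):
    def __init__(
        self,
        *,
        model: str = "grok-4-1-fast-non-reasoning",
        temperature=NOT_GIVEN,
    ) -> None:
        self._model = model
        self._temperature = temperature

    def _create_session_update_event(self) -> dict:
        event = super()._create_session_update_event()
        event["session"]["model"] = self._model
        # temperature is only included when explicitly supplied
        if self._temperature is not NOT_GIVEN:
            event["session"]["temperature"] = self._temperature
        return event
```

A caller passing `RealtimeModel(model="grok-4-1-fast-reasoning", temperature=0.2)` would get both values in the session payload, while the no-argument form preserves today's behavior.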
Compatibility
This change would maintain backward compatibility since:
- The default model remains "grok-4-1-fast-non-reasoning"
- temperature defaults to NOT_GIVEN, so the API default is used
- Existing code continues to work without modification
References
- xAI API documentation: https://docs.x.ai/docs/guides/voice/agent
- xAI supports a temperature parameter (0-2 range) in its text models
- The OpenAI Realtime plugin already supports temperature configuration
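Since xAI documents a 0-2 temperature range for its text models, the plugin could reject out-of-range values early rather than surfacing an API error. A hypothetical guard, assuming the realtime API uses the same range (validate_temperature is not part of the actual plugin):

```python
def validate_temperature(temperature: float) -> float:
    # xAI's text models document a 0-2 temperature range; assuming the
    # realtime API matches, fail fast on out-of-range values.
    if not 0.0 <= temperature <= 2.0:
        raise ValueError(
            f"temperature must be between 0 and 2, got {temperature}"
        )
    return temperature
```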
Workarounds / Alternatives
Currently, developers must use separate STT-LLM-TTS pipeline to control temperature:
```python
# Current workaround - loses realtime API benefits
session = AgentSession(
    stt=deepgram.STT(model="nova-3"),
    llm=xai.LLM(model="grok-4-1-fast-non-reasoning", temperature=0.2),
    tts=cartesia.TTS(voice="sonic-3"),
)
```
This workaround sacrifices the low-latency benefits and integrated speech-to-speech capabilities of the xAI Realtime API.