A Python SDK for connecting to avatar services via WebSocket, supporting audio streaming and receiving animation frames.

    import asyncio
    from datetime import datetime, timedelta, timezone

    from avatar_sdk_python import new_avatar_session


    async def main():
        # Create session
        session = new_avatar_session(
            api_key="your-api-key",
            app_id="your-app-id",
            console_endpoint_url="https://console.us-west.spatialwalk.cloud/v1/console",
            ingress_endpoint_url="https://api.us-west.spatialwalk.cloud/v2/driveningress",
            avatar_id="your-avatar-id",
            expire_at=datetime.now(timezone.utc) + timedelta(minutes=5),
            transport_frames=lambda frame, last: print(f"Received frame: {len(frame)} bytes"),
            on_error=lambda err: print(f"Error: {err}"),
            on_close=lambda: print("Session closed"),
        )

        # Initialize and connect
        await session.init()
        connection_id = await session.start()
        print(f"Connected: {connection_id}")

        # Send audio
        audio_data = b"..."  # Your PCM audio data
        request_id = await session.send_audio(audio_data, end=True)
        print(f"Sent audio: {request_id}")

        # Wait for frames...
        await asyncio.sleep(10)

        # Close
        await session.close()


    if __name__ == "__main__":
        asyncio.run(main())

The SDK provides two ways to configure a session:

The first passes every option to new_avatar_session:

    from datetime import datetime, timedelta, timezone

    from avatar_sdk_python import new_avatar_session

    session = new_avatar_session(
        avatar_id="avatar-123",
        api_key="your-api-key",
        app_id="your-app-id",
        # For web-style auth, set use_query_auth=True to put (appId, sessionKey)
        # in the websocket URL query params instead of headers.
        use_query_auth=False,
        expire_at=datetime.now(timezone.utc) + timedelta(minutes=5),
        console_endpoint_url="https://console.us-west.spatialwalk.cloud/v1/console",
        ingress_endpoint_url="https://api.us-west.spatialwalk.cloud/v2/driveningress",
        sample_rate=16000,  # Default: 16000 Hz
        # Callbacks defined by your application
        transport_frames=on_frame_received,
        on_error=on_error,
        on_close=on_close,
    )

The second builds a SessionConfig with SessionConfigBuilder and passes it to AvatarSession:

    from datetime import datetime, timedelta, timezone

    from avatar_sdk_python import SessionConfigBuilder, AvatarSession

    config = (
        SessionConfigBuilder()
        .with_avatar_id("avatar-123")
        .with_api_key("your-api-key")
        .with_app_id("your-app-id")
        .with_console_endpoint_url("https://console.us-west.spatialwalk.cloud/v1/console")
        .with_ingress_endpoint_url("https://api.us-west.spatialwalk.cloud/v2/driveningress")
        .with_expire_at(datetime.now(timezone.utc) + timedelta(minutes=5))
        .with_transport_frames(on_frame_received)
        .build()
    )
    session = AvatarSession(config)

A session then goes through the following steps:

    # 1. Initialize (get session token)
    await session.init()

    # 2. Start WebSocket connection
    connection_id = await session.start()

    # 3. Send audio data
    request_id = await session.send_audio(audio_bytes, end=True)

    # 4. Receive frames via callback
    # (automatically handled in background)

    # 5. Close session
    await session.close()

The SDK currently supports mono 16-bit PCM (s16le) audio:
- Sample Rate: one of [8000, 16000, 22050, 24000, 32000, 44100, 48000]
- Channels: 1 (mono)
- Bit Depth: 16-bit
- Format: Raw PCM bytes
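
Since the format is fixed (mono, 2 bytes per sample), the byte size of a chunk follows directly from the sample rate and the chunk duration. The sketch below splits a PCM buffer into sample-aligned chunks and marks the final one. The helper name chunk_pcm and the 100 ms default are illustrative assumptions, not part of the SDK.

```python
from typing import Iterator, Tuple

BYTES_PER_SAMPLE = 2  # 16-bit mono PCM
SUPPORTED_SAMPLE_RATES = (8000, 16000, 22050, 24000, 32000, 44100, 48000)


def chunk_pcm(
    audio: bytes, sample_rate: int = 16000, chunk_ms: int = 100
) -> Iterator[Tuple[bytes, bool]]:
    """Yield (chunk, is_last) pairs, each chunk about chunk_ms long.

    Hypothetical helper: not provided by the SDK.
    """
    if sample_rate not in SUPPORTED_SAMPLE_RATES:
        raise ValueError(f"unsupported sample rate: {sample_rate}")
    if len(audio) % BYTES_PER_SAMPLE:
        raise ValueError("16-bit PCM must have an even number of bytes")

    # Whole samples per chunk, then converted to bytes, so no sample is split.
    chunk_size = sample_rate * chunk_ms // 1000 * BYTES_PER_SAMPLE
    if chunk_size <= 0:
        raise ValueError("chunk_ms is too small for this sample rate")

    for offset in range(0, len(audio), chunk_size):
        chunk = audio[offset:offset + chunk_size]
        yield chunk, offset + chunk_size >= len(audio)
```

Each pair could then be passed on as `await session.send_audio(chunk, end=is_last)`, assuming the end flag marks the final chunk of a request as the send_audio signature suggests.
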

    # Example: Load PCM audio file
    with open("audio.pcm", "rb") as f:
        audio_data = f.read()

    # Send in chunks or all at once
    await session.send_audio(audio_data, end=True)

The transport_frames callback receives animation frames from the server:

    def on_frame_received(frame_data: bytes, is_last: bool):
        print(f"Received frame: {len(frame_data)} bytes")
        if is_last:
            print("This is the last frame")
        # Process frame_data (contains serialized Message protobuf)

The on_error callback handles errors from the session:

    def on_error(error: Exception):
        print(f"Session error: {error}")

The on_close callback is called when the session closes:

    def on_close():
        print("Session has been closed")

AvatarSession is the main class for managing avatar sessions.

- async init() - Initialize session and obtain token
- async start() -> str - Start WebSocket connection, returns connection ID
- async send_audio(audio: bytes, end: bool = False) -> str - Send audio data, returns request ID
- async close() - Close the session and clean up resources
- config -> SessionConfig - Get session configuration (property)

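
The quick start waits for frames with a fixed asyncio.sleep(10). Because send_audio returns before the frames arrive, one alternative is a callable that records frames and signals when the last one is seen. The following is a minimal sketch, assuming only the (frame_data, is_last) callback signature documented here. FrameCollector is a hypothetical name, not an SDK class, and since the document does not say which thread invokes the callback, the completion signal is scheduled with call_soon_threadsafe.

```python
import asyncio
from typing import List


class FrameCollector:
    """Collects frames from a transport_frames-style callback.

    Hypothetical helper: must be created inside a running event loop.
    """

    def __init__(self) -> None:
        self.frames: List[bytes] = []
        self._loop = asyncio.get_running_loop()
        self._done = asyncio.Event()

    def __call__(self, frame_data: bytes, is_last: bool) -> None:
        self.frames.append(frame_data)
        if is_last:
            # Safe whether the callback runs on the loop thread or another one.
            self._loop.call_soon_threadsafe(self._done.set)

    async def wait(self, timeout: float) -> List[bytes]:
        """Return all frames once the last one arrives, or raise on timeout."""
        await asyncio.wait_for(self._done.wait(), timeout)
        return self.frames
```

An instance would be passed as transport_frames when the session is created, and `await collector.wait(timeout=10)` would replace the fixed sleep.
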

SessionConfig is the configuration dataclass for avatar sessions.

- avatar_id: str - Avatar identifier
- api_key: str - API key for authentication
- app_id: str - Application identifier
- use_query_auth: bool - Send websocket auth via query params (web) instead of headers (mobile)
- expire_at: datetime - Session expiration time
- sample_rate: int - Audio sample rate (default: 16000)
- bitrate: int - Audio bitrate (default: 0; PCM typically uses 0)
- transport_frames: Callable[[bytes, bool], None] - Frame callback
- on_error: Callable[[Exception], None] - Error callback
- on_close: Callable[[], None] - Close callback
- console_endpoint_url: str - Console API URL
- ingress_endpoint_url: str - Ingress WebSocket URL

SessionConfigBuilder is a builder for constructing SessionConfig with a fluent interface.
Every with_* method returns self for chaining; build() returns the finished SessionConfig:

- with_avatar_id(avatar_id: str)
- with_api_key(api_key: str)
- with_app_id(app_id: str)
- with_expire_at(expire_at: datetime)
- with_sample_rate(sample_rate: int)
- with_transport_frames(handler: Callable)
- with_on_error(handler: Callable)
- with_on_close(handler: Callable)
- with_console_endpoint_url(url: str)
- with_ingress_endpoint_url(url: str)
- build() -> SessionConfig - Build the configuration

generate_log_id() -> str - Generate a unique log ID in the format "YYYYMMDDHHMMSS_<nanoid>"
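
To make the ID format concrete, here is a sketch that produces a string of the same shape. It is not the SDK's implementation: the function name make_log_id, the use of UTC, the 12-character suffix length, and the URL-safe alphabet are all assumptions made for illustration.

```python
import secrets
import string
from datetime import datetime, timezone

# URL-safe alphabet in the style of nanoid (assumed, not taken from the SDK).
_ALPHABET = string.ascii_letters + string.digits + "_-"


def make_log_id(suffix_length: int = 12) -> str:
    """Return an ID shaped like 'YYYYMMDDHHMMSS_<random suffix>'."""
    timestamp = datetime.now(timezone.utc).strftime("%Y%m%d%H%M%S")
    suffix = "".join(secrets.choice(_ALPHABET) for _ in range(suffix_length))
    return f"{timestamp}_{suffix}"
```

The timestamp prefix makes IDs sort roughly by creation time, while the random suffix keeps IDs generated within the same second distinct.
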
SessionTokenError- Raised when session token request fails
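
A failed token request may be transient, so a caller might want to retry it a few times before giving up. The sketch below is a generic retry helper with exponential backoff. The name retry_async and its defaults are hypothetical, and whether calling init() again after a SessionTokenError is safe is an assumption the SDK documentation does not confirm.

```python
import asyncio
from typing import Awaitable, Callable, Type, TypeVar

T = TypeVar("T")


async def retry_async(
    operation: Callable[[], Awaitable[T]],
    retry_on: Type[BaseException],
    attempts: int = 3,
    base_delay: float = 0.5,
) -> T:
    """Await operation(), retrying on retry_on with exponential backoff."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return await operation()
        except retry_on:
            if attempt == attempts:
                raise  # Out of attempts: surface the original error.
            await asyncio.sleep(base_delay * 2 ** (attempt - 1))
    raise AssertionError("unreachable")
```

With the SDK installed, usage might look like `await retry_async(session.init, retry_on=SessionTokenError)`.
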
See the examples directory for complete working examples:
- single_audio_clip - Basic usage with a single audio file
- http_service - Simple HTTP API that returns PCM audio (by sample rate) and generated animation Message binaries

The SDK uses Protocol Buffers for efficient serialization. The proto definitions are in proto/message.proto.
Proto code is generated using buf:

    cd proto
    buf generate

The generated Python code is placed in src/avatar_sdk_python/proto/generated/.

The protocol uses the following message types:

- MESSAGE_CLIENT_CONFIGURE_SESSION (1) - Client session negotiation parameters
- MESSAGE_SERVER_CONFIRM_SESSION (2) - Server confirms and returns connection_id
- MESSAGE_CLIENT_AUDIO_INPUT (3) - Client audio input
- MESSAGE_SERVER_ERROR (4) - Server-side error message
- MESSAGE_SERVER_RESPONSE_ANIMATION (5) - Server animation response (end indicates final)
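
For logging or debugging it can help to map the numeric values above back to names. The enum below is an illustrative mirror of that list, not the generated protobuf code. The authoritative definitions remain in proto/message.proto, and the names MessageType and is_client_message are assumptions.

```python
from enum import IntEnum


class MessageType(IntEnum):
    """Illustrative mirror of the message type values listed above."""

    MESSAGE_CLIENT_CONFIGURE_SESSION = 1
    MESSAGE_SERVER_CONFIRM_SESSION = 2
    MESSAGE_CLIENT_AUDIO_INPUT = 3
    MESSAGE_SERVER_ERROR = 4
    MESSAGE_SERVER_RESPONSE_ANIMATION = 5


def is_client_message(message_type: MessageType) -> bool:
    """Return True for messages sent by the client, based on the name prefix."""
    return message_type.name.startswith("MESSAGE_CLIENT_")
```

Types 1 and 3 travel from client to server, while types 2, 4 and 5 are sent by the server.
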

    # Install uv if not already installed
    curl -LsSf https://astral.sh/uv/install.sh | sh

    # Clone and setup
    git clone <repository-url>
    cd avatar-sdk-python
    uv sync

See LICENSE for details.

Contributions are welcome! Please feel free to submit a Pull Request.