[Bug]: Pydantic Validation Error for speaker_confidence in v5.0.0 #590

@ron-adc

Description

Summary

When using the Deepgram Python SDK v5.0.0 with diarization enabled, the SDK throws a Pydantic validation error because speaker_confidence is typed as int in the response models, but the API returns it as a float.

What happened?

Actual:

  • When making a transcription request with diarization enabled using SDK v5.0.0, the HTTP request succeeds (200 OK) but the SDK fails to parse the response
  • Pydantic raises validation errors for every word because speaker_confidence is returned as a float (e.g., 0.6162807) but the SDK expects an integer
  • The error prevents the response from being returned, making the SDK unusable with diarization

Expected:

  • The SDK should successfully parse the API response and return a valid ListenV1Response object
  • speaker_confidence should be accepted as a float, since confidence scores naturally fall between 0.0 and 1.0
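The mismatch can be illustrated in isolation with Pydantic v2, without calling the API. This is only a minimal sketch: the two model classes below are hypothetical stand-ins, not the SDK's actual generated models, and the payload value is copied from the traceback in this report.

```python
from pydantic import BaseModel, ValidationError


class WordWithIntConfidence(BaseModel):
    # Hypothetical stand-in for the typing this report describes in v5.0.0.
    speaker_confidence: int


class WordWithFloatConfidence(BaseModel):
    # Hypothetical stand-in for the expected typing.
    speaker_confidence: float


# Value copied from the traceback in this issue.
payload = {"speaker_confidence": 0.6162807}

try:
    WordWithIntConfidence.model_validate(payload)
    int_error_type = None
except ValidationError as exc:
    # Pydantic reports the same error type that appears in the traceback.
    int_error_type = exc.errors()[0]["type"]

# The float-typed model accepts the same payload unchanged.
parsed = WordWithFloatConfidence.model_validate(payload)

print(int_error_type)
print(parsed.speaker_confidence)
```

If the SDK's field is indeed declared as int, changing it to float should be enough for values like this one to validate.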

Steps to reproduce

  1. Install deepgram-sdk==5.0.0
  2. Run the following code with any audio file that has multiple speakers
  3. Observe the type validation error

Minimal code sample

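from deepgram import DeepgramClient

client = DeepgramClient(api_key="YOUR_API_KEY")

# Any audio file with multiple speakers reproduces the error.
with open("deepgram_test.opus", "rb") as audio_file:
    # The HTTP request returns 200 OK; the validation error is raised
    # while the SDK parses the response (diarize=True is set).
    response = client.listen.v1.media.transcribe_file(
        request=audio_file.read(),
        model="nova-3",
        detect_language=True,
        smart_format=True,
        punctuate=True,
        diarize=True,
        multichannel=True,
        utterances=True,
    )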

Logs / traceback

DEBUG:httpcore.connection:connect_tcp.started host='api.deepgram.com' port=443 local_address=None timeout=5.0 socket_options=None
DEBUG:httpcore.connection:connect_tcp.complete return_value=<httpcore._backends.sync.SyncStream object at 0x7f8ae87bc8e0>
DEBUG:httpcore.connection:start_tls.started ssl_context=<ssl.SSLContext object at 0x7f8aea427240> server_hostname='api.deepgram.com' timeout=5.0
DEBUG:httpcore.connection:start_tls.complete return_value=<httpcore._backends.sync.SyncStream object at 0x7f8ae87bc640>
DEBUG:httpcore.http11:send_request_headers.started request=<Request [b'POST']>
DEBUG:httpcore.http11:send_request_headers.complete
DEBUG:httpcore.http11:send_request_body.started request=<Request [b'POST']>
DEBUG:httpcore.http11:send_request_body.complete
DEBUG:httpcore.http11:receive_response_headers.started request=<Request [b'POST']>
DEBUG:httpcore.http11:receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'content-type', b'application/json'), (b'dg-project-id', b'REDACTED'), (b'vary', b'origin, access-control-request-method, access-control-request-headers'), (b'vary', b'accept-encoding'), (b'access-control-allow-credentials', b'true'), (b'access-control-expose-headers', b'dg-model-name,dg-model-uuid,dg-char-count,dg-request-id,dg-project-id,dg-error'), (b'content-encoding', b'zstd'), (b'dg-request-id', b'REDACTED'), (b'transfer-encoding', b'chunked'), (b'date', b'Tue, 07 Oct 2025 13:19:02 GMT')])
INFO:httpx:HTTP Request: POST https://api.deepgram.com/v1/listen?detect_language=true&diarize=true&model=nova-3&multichannel=true&punctuate=true&smart_format=true&utterances=true "HTTP/1.1 200 OK"
DEBUG:httpcore.http11:receive_response_body.started request=<Request [b'POST']>
DEBUG:httpcore.http11:receive_response_body.complete
DEBUG:httpcore.http11:response_closed.started
DEBUG:httpcore.http11:response_closed.complete

❌ Error: 1318 validation errors for union[ListenV1Response,ListenV1AcceptedResponse]
ListenV1Response.results.utterances.0.words.0.speaker_confidence
  Input should be a valid integer, got a number with a fractional part [type=int_from_float, input_value=0.6162807, input_type=float]
    For further information visit https://errors.pydantic.dev/2.11/v/int_from_float
ListenV1Response.results.utterances.0.words.1.speaker_confidence
  Input should be a valid integer, got a number with a fractional part [type=int_from_float, input_value=0.6162807, input_type=float]
    For further information visit https://errors.pydantic.dev/2.11/v/int_from_float
ListenV1Response.results.utterances.0.words.2.speaker_confidence
  Input should be a valid integer, got a number with a fractional part [type=int_from_float, input_value=0.6162807, input_type=float]
    For further information visit https://errors.pydantic.dev/2.11/v/int_from_float
ListenV1Response.results.utterances.1.words.0.speaker_confidence
  Input should be a valid integer, got a number with a fractional part [type=int_from_float, input_value=0.6162807, input_type=float]
    For further information visit https://errors.pydantic.dev/2.11/v/int_from_float
[... error repeats for all 1318 words in the transcription ...]
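To confirm that the fractional values come from the API and not from the SDK, the raw JSON body can be inspected directly. The following is a rough sketch using only the standard library. It assumes the body has already been captured as a string by some other means (for example a direct HTTP call), and it follows the results.utterances[].words[].speaker_confidence path shown in the traceback. The helper name and the sample payload are made up for illustration.

```python
import json


def fractional_confidences(body: str) -> list[tuple[int, int, float]]:
    """Return (utterance index, word index, value) for each word whose
    speaker_confidence is a float with a fractional part."""
    data = json.loads(body)
    found = []
    utterances = data.get("results", {}).get("utterances", [])
    for u_idx, utterance in enumerate(utterances):
        for w_idx, word in enumerate(utterance.get("words", [])):
            value = word.get("speaker_confidence")
            if isinstance(value, float) and not value.is_integer():
                found.append((u_idx, w_idx, value))
    return found


# Illustrative payload only; the real response contains many more fields.
sample_body = json.dumps(
    {
        "results": {
            "utterances": [
                {
                    "words": [
                        {"word": "hello", "speaker_confidence": 0.6162807},
                        {"word": "there", "speaker_confidence": 1},
                    ]
                }
            ]
        }
    }
)

hits = fractional_confidences(sample_body)
print(hits)
```

Every entry returned by this helper corresponds to one word that an int-typed field would reject.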

Transport

HTTP

API endpoint / path

/v1/listen

Model(s) used

nova-3

How often?

Always

Is this a regression?

  • Yes, it worked in an earlier version

Last working SDK version (if known)

4.8.1

SDK version

5.0.0

Python version

3.10.18

Install method

pip

OS

Linux (x86_64)

Environment details


Link to minimal repro (optional)

No response

Session ID (optional)

No response

Project ID (optional)

No response

Request ID (optional)

No response

Code of Conduct

  • I agree to follow this project’s Code of Conduct

Metadata

Assignees

No one assigned

    Labels

    bug: Something isn't working

