Closed
Labels
bug: Something isn't working
Description
Summary
When using the Deepgram Python SDK v5.0.0 with diarization enabled, the SDK throws a Pydantic validation error because speaker_confidence is typed as int in the response models, but the API returns it as a float.
What happened?
Actual:
- When making a transcription request with diarization enabled using SDK v5.0.0, the HTTP request succeeds (200 OK) but the SDK fails to parse the response
- Pydantic raises validation errors for every word because speaker_confidence is returned as a float (e.g., 0.6162807) but the SDK expects an integer
- The error prevents the response from being returned, making the SDK unusable with diarization
Expected:
- The SDK should successfully parse the API response and return a valid ListenV1Response object
- speaker_confidence should be accepted as a float since confidence scores are naturally between 0.0 and 1.0
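The mismatch described above can be shown in isolation with Pydantic v2 (the traceback links to the 2.11 error docs). This is a minimal sketch: `WordAsInt` and `WordAsFloat` are hypothetical stand-ins, not the SDK's actual model classes, and only the `speaker_confidence` field and the sample value are taken from this report.

```python
from pydantic import BaseModel, ValidationError


class WordAsInt(BaseModel):
    # Hypothetical stand-in for a word model with the field typed as int,
    # which is how this issue describes the SDK v5.0.0 response models.
    word: str
    speaker_confidence: int


class WordAsFloat(BaseModel):
    # Same shape, with the field typed the way the API returns it.
    word: str
    speaker_confidence: float


# Sample value copied from the traceback in this issue.
payload = {"word": "hello", "speaker_confidence": 0.6162807}

try:
    WordAsInt.model_validate(payload)
    int_error_type = None
except ValidationError as exc:
    # Pydantic refuses to truncate a float with a fractional part to int.
    int_error_type = exc.errors()[0]["type"]

parsed = WordAsFloat.model_validate(payload)

print(int_error_type)
print(parsed.speaker_confidence)
```

With the field typed as `float`, the same payload validates without changes to the API response.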
Steps to reproduce
- Install deepgram-sdk==5.0.0
- Run the following code with any audio file that has multiple speakers
- Observe the type validation error
Minimal code sample
from deepgram import DeepgramClient

client = DeepgramClient(api_key="YOUR_API_KEY")

with open("deepgram_test.opus", "rb") as audio_file:
    response = client.listen.v1.media.transcribe_file(
        request=audio_file.read(),
        model="nova-3",
        detect_language=True,
        smart_format=True,
        punctuate=True,
        diarize=True,
        multichannel=True,
        utterances=True,
    )

Logs / traceback
DEBUG:httpcore.connection:connect_tcp.started host='api.deepgram.com' port=443 local_address=None timeout=5.0 socket_options=None
DEBUG:httpcore.connection:connect_tcp.complete return_value=<httpcore._backends.sync.SyncStream object at 0x7f8ae87bc8e0>
DEBUG:httpcore.connection:start_tls.started ssl_context=<ssl.SSLContext object at 0x7f8aea427240> server_hostname='api.deepgram.com' timeout=5.0
DEBUG:httpcore.connection:start_tls.complete return_value=<httpcore._backends.sync.SyncStream object at 0x7f8ae87bc640>
DEBUG:httpcore.http11:send_request_headers.started request=<Request [b'POST']>
DEBUG:httpcore.http11:send_request_headers.complete
DEBUG:httpcore.http11:send_request_body.started request=<Request [b'POST']>
DEBUG:httpcore.http11:send_request_body.complete
DEBUG:httpcore.http11:receive_response_headers.started request=<Request [b'POST']>
DEBUG:httpcore.http11:receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'content-type', b'application/json'), (b'dg-project-id', b'REDACTED'), (b'vary', b'origin, access-control-request-method, access-control-request-headers'), (b'vary', b'accept-encoding'), (b'access-control-allow-credentials', b'true'), (b'access-control-expose-headers', b'dg-model-name,dg-model-uuid,dg-char-count,dg-request-id,dg-project-id,dg-error'), (b'content-encoding', b'zstd'), (b'dg-request-id', b'REDACTED'), (b'transfer-encoding', b'chunked'), (b'date', b'Tue, 07 Oct 2025 13:19:02 GMT')])
INFO:httpx:HTTP Request: POST https://api.deepgram.com/v1/listen?detect_language=true&diarize=true&model=nova-3&multichannel=true&punctuate=true&smart_format=true&utterances=true "HTTP/1.1 200 OK"
DEBUG:httpcore.http11:receive_response_body.started request=<Request [b'POST']>
DEBUG:httpcore.http11:receive_response_body.complete
DEBUG:httpcore.http11:response_closed.started
DEBUG:httpcore.http11:response_closed.complete
❌ Error: 1318 validation errors for union[ListenV1Response,ListenV1AcceptedResponse]
ListenV1Response.results.utterances.0.words.0.speaker_confidence
  Input should be a valid integer, got a number with a fractional part [type=int_from_float, input_value=0.6162807, input_type=float]
    For further information visit https://errors.pydantic.dev/2.11/v/int_from_float
ListenV1Response.results.utterances.0.words.1.speaker_confidence
  Input should be a valid integer, got a number with a fractional part [type=int_from_float, input_value=0.6162807, input_type=float]
    For further information visit https://errors.pydantic.dev/2.11/v/int_from_float
ListenV1Response.results.utterances.0.words.2.speaker_confidence
  Input should be a valid integer, got a number with a fractional part [type=int_from_float, input_value=0.6162807, input_type=float]
    For further information visit https://errors.pydantic.dev/2.11/v/int_from_float
ListenV1Response.results.utterances.1.words.0.speaker_confidence
  Input should be a valid integer, got a number with a fractional part [type=int_from_float, input_value=0.6162807, input_type=float]
    For further information visit https://errors.pydantic.dev/2.11/v/int_from_float
[... error repeats for all 1318 words in the transcription ...]
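Because the same error repeats once per word, it can help to collapse the traceback into a count per field and error type. The following is a rough sketch using only the standard library; `summarize` is a hypothetical helper, and it assumes the error text has the layout shown in the log above (a location line followed by a message containing `[type=...]`).

```python
import re
from collections import Counter

# A short excerpt in the same layout as the traceback in this issue.
ERROR_TEXT = """\
ListenV1Response.results.utterances.0.words.0.speaker_confidence
  Input should be a valid integer, got a number with a fractional part [type=int_from_float, input_value=0.6162807, input_type=float]
    For further information visit https://errors.pydantic.dev/2.11/v/int_from_float
ListenV1Response.results.utterances.0.words.1.speaker_confidence
  Input should be a valid integer, got a number with a fractional part [type=int_from_float, input_value=0.6162807, input_type=float]
    For further information visit https://errors.pydantic.dev/2.11/v/int_from_float
ListenV1Response.results.utterances.1.words.0.speaker_confidence
  Input should be a valid integer, got a number with a fractional part [type=int_from_float, input_value=0.6162807, input_type=float]
    For further information visit https://errors.pydantic.dev/2.11/v/int_from_float
"""


def summarize(text):
    """Count errors per (field path with indices collapsed, error type)."""
    counts = Counter()
    field = None
    for line in text.splitlines():
        stripped = line.strip()
        match = re.search(r"\[type=([a-z_]+),", stripped)
        if match and field is not None:
            # Message line: pair the error type with the preceding location.
            counts[(field, match.group(1))] += 1
            field = None
        elif stripped and not stripped.startswith("For further information"):
            # Location line: replace numeric list indices such as ".0" with "[]".
            field = re.sub(r"\.\d+(?=\.|$)", "[]", stripped)
    return counts


summary = summarize(ERROR_TEXT)
for (path, error_type), count in summary.items():
    print(f"{count} x {error_type} at {path}")
```

Applied to the full traceback, a single entry for `speaker_confidence` would indicate that all 1318 errors share one cause.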
Transport
HTTP
API endpoint / path
/v1/listen
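Since the HTTP request itself succeeds, one possible stopgap is to call this endpoint directly and read the JSON without the SDK's response models. The sketch below uses only the standard library. The URL and query parameters mirror the logged request; the `Authorization: Token ...` header scheme, the `audio/ogg` content type, and the helper names `build_url` and `transcribe_raw` are assumptions not stated in this report.

```python
import json
import urllib.request
from urllib.parse import urlencode

BASE_URL = "https://api.deepgram.com/v1/listen"


def build_url(**options):
    # Booleans are sent as lowercase "true"/"false", matching the logged request.
    query = {
        key: (str(value).lower() if isinstance(value, bool) else value)
        for key, value in sorted(options.items())
    }
    return f"{BASE_URL}?{urlencode(query)}"


def transcribe_raw(path, api_key, content_type="audio/ogg", **options):
    """POST the audio bytes and return the response as a plain dict.

    Assumption: the API key is passed as "Authorization: Token <key>".
    """
    with open(path, "rb") as audio_file:
        body = audio_file.read()
    request = urllib.request.Request(
        build_url(**options),
        data=body,
        headers={
            "Authorization": f"Token {api_key}",
            "Content-Type": content_type,
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


url = build_url(
    model="nova-3",
    detect_language=True,
    smart_format=True,
    punctuate=True,
    diarize=True,
    multichannel=True,
    utterances=True,
)
print(url)

# Example call (requires a real key and audio file):
# data = transcribe_raw("deepgram_test.opus", "YOUR_API_KEY", model="nova-3", diarize=True, utterances=True)
# data["results"]["utterances"][0]["words"][0]["speaker_confidence"]  # a float, per the traceback
```

The returned dict keeps `speaker_confidence` as a float, at the cost of losing the SDK's typed response objects.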
Model(s) used
nova-3
How often?
Always
Is this a regression?
- Yes, it worked in an earlier version
Last working SDK version (if known)
4.8.1
SDK version
5.0.0
Python version
3.10.18
Install method
pip
OS
Linux (x86_64)
Environment details
Link to minimal repro (optional)
No response
Session ID (optional)
No response
Project ID (optional)
No response
Request ID (optional)
No response
Code of Conduct
- I agree to follow this project’s Code of Conduct
eburgers81