Transcription API response is missing diarized data for response_format=diarized_json

## Issue

When calling `openAiClient.audio().transcriptions().create(createParams)` with the response format `AudioResponseFormat.DIARIZED_JSON`, the returned `TranscriptionCreateResponse` has no value for the `diarized` field. Instead, the entire raw JSON response ends up in the `text` field of `transcription`.

## Expected behavior

`TranscriptionCreateResponse#diarized()` returns a non-empty Optional with the contents of the diarized response.

## Workaround

Read the raw JSON string from the `text` field and parse it manually:

```java
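// Parse the raw JSON that ended up in `text`; readValue throws a checked JsonProcessingException on invalid input.
TranscriptionDiarized diarized = new ObjectMapper().readValue(response.transcription().get().text(), TranscriptionDiarized.class);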
```

## Possible cause

From a bit of debugging, the issue might be in the `AudioResponseFormat#isJson` function, which has no case for `DIARIZED_JSON`. As a result, the parser treats the response as plain text.

> https://github.com/openai/openai-java/blob/5729c58d66faa09cde2ea8dc6293411b200cbd86/openai-java-core/src/main/kotlin/com/openai/models/audio/AudioResponseFormat.kt#L155-L162

<img width="1427" height="1090" alt="Image" src="https://github.com/user-attachments/assets/5a164593-003f-4465-b9a7-0d0cff0d695d" />
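
To illustrate the suspected cause, here is a minimal self-contained sketch in Java. It does not use the SDK: the `Format` enum and both methods are hypothetical stand-ins that only mirror the branch structure of `isJson` at the linked lines. The second method shows what adding a `DIARIZED_JSON` case might look like; whether this is the right fix for the SDK is an assumption on my part.

```java
public class IsJsonSketch {
    /** Hypothetical stand-in for the SDK's response format values; not the real AudioResponseFormat. */
    enum Format { JSON, TEXT, SRT, VERBOSE_JSON, VTT, DIARIZED_JSON }

    /** Mirrors the branches at the linked lines: DIARIZED_JSON hits the default branch. */
    static boolean isJsonAsObserved(Format format) {
        switch (format) {
            case JSON: return true;
            case TEXT: return false;
            case SRT: return false;
            case VERBOSE_JSON: return true;
            case VTT: return false;
            default: return false;
        }
    }

    /** Suspected fix (an assumption): treat DIARIZED_JSON as a JSON format too. */
    static boolean isJsonWithDiarized(Format format) {
        switch (format) {
            case JSON:
            case VERBOSE_JSON:
            case DIARIZED_JSON:
                return true;
            default:
                return false;
        }
    }

    public static void main(String[] args) {
        // Print both classifications side by side for every format.
        for (Format f : Format.values()) {
            System.out.println(f + ": observed=" + isJsonAsObserved(f)
                    + ", proposed=" + isJsonWithDiarized(f));
        }
    }
}
```

The two methods differ only for `DIARIZED_JSON`, which would explain why the other JSON formats deserialize correctly while the diarized response is handled as plain text.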

## Example

An example input/output where I observed the issue:

```input
TranscriptionCreateParams{body=Body{file=MultipartField{value=sun.nio.ch.ChannelInputStream@967d60f, contentType=audio/mpeg, filename=sousei_no_onmyouji_short.mp3}, model=MultipartField{value=gpt-4o-transcribe-diarize, contentType=text/plain; charset=utf-8, filename=null}, chunkingStrategy=MultipartField{value=ChunkingStrategy{auto=auto}, contentType=text/plain; charset=utf-8, filename=null}, include=MultipartField{value=null, contentType=text/plain; charset=utf-8, filename=null}, knownSpeakerNames=MultipartField{value=null, contentType=text/plain; charset=utf-8, filename=null}, knownSpeakerReferences=MultipartField{value=null, contentType=text/plain; charset=utf-8, filename=null}, language=MultipartField{value=ja, contentType=text/plain; charset=utf-8, filename=null}, prompt=MultipartField{value=null, contentType=text/plain; charset=utf-8, filename=null}, responseFormat=MultipartField{value=diarized_json, contentType=text/plain; charset=utf-8, filename=null}, temperature=MultipartField{value=null, contentType=text/plain; charset=utf-8, filename=null}, timestampGranularities=MultipartField{value=null, contentType=text/plain; charset=utf-8, filename=null}, additionalProperties={}}, additionalHeaders=Headers{map={}}, additionalQueryParams=QueryParams{map={}}}
```

This results in the following. Note that the `text` field of `transcription` contains the entire JSON string, but `diarized` is missing / null.

```output
TranscriptionCreateResponse{transcription=Transcription{text={"text":"彼女の名はアダ シノベリオ 強力な怨霊を排出 してきた京都の名家ア ダシノ家の筆頭","segments":[{"type":"transcript.text.segment","text":"彼女の名はアダシノベリオ","speaker":"A","start":1.0000000000000002,"end":3.3,"id":"seg_0"},{"type":"transcript.text.segment","text":"強力な怨霊を排出してきた京都の名家アダシノ家の筆頭","speaker":"A","start":3.8,"end":9.4,"id":"seg_1"}],"usage":{"type":"tokens","total_tokens":405,"input_tokens":97,"input_token_details":{"text_tokens":0,"audio_tokens":97},"output_tokens":308}}, logprobs=, usage=, additionalProperties={}}}
```

## Remark

The JSON data itself seems correct: when I parse the raw JSON manually into an instance of `TranscriptionDiarized`, it works.

<img width="1834" height="1090" alt="Image" src="https://github.com/user-attachments/assets/a5d25f89-3b5d-42fd-a8e3-24ba49d40791" />

For reference, the `when` expression at the lines linked under "Possible cause":

	when (this) {
	    JSON -> true
	    TEXT -> false
	    SRT -> false
	    VERBOSE_JSON -> true
	    VTT -> false
	    else -> false
	}
