Open
Labels: bug (Something isn't working)
Description
Confirm this is a Node library issue and not an underlying OpenAI API issue
- [x] This is an issue with the Node library
Describe the bug
When making a chat completions request to gpt-4o-audio-preview or gpt-4o-mini-audio-preview, the usage response contains text_tokens and image_tokens under prompt_tokens_details:
"prompt_tokens_details": {
"cached_tokens": 0,
"audio_tokens": 21,
"text_tokens": 11,
"image_tokens": 0
},
However, in openai-node/src/resources/completions.ts, `text_tokens` and `image_tokens` are not declared on the prompt-tokens-details type. As a result:

```ts
const details = response.usage.prompt_tokens_details;
details.text_       // ← no autocomplete!
details.text_tokens // ← type checker error!
```

After my code changes:

```ts
const details = response.usage.prompt_tokens_details;
details.text_       // ← IDE autocompletes to "text_tokens"! ✅
details.text_tokens // ← type checker OK! ✅
```
To Reproduce
Make a request to chat completions API with model gpt-4o-audio-preview or gpt-4o-mini-audio-preview
Screen recording: Screen.Recording.2025-12-11.at.12.04.49.AM.mov
Code snippet to reproduce the issue
```js
import OpenAI from "openai";
import fs from "fs/promises";
import path, { dirname } from "path";
import { fileURLToPath } from "url";
import dotenv from "dotenv";

// Get the directory name of the current module
const __filename = fileURLToPath(import.meta.url);
const __dirname = dirname(__filename);

// Load environment variables from the .env file in the script's directory
dotenv.config({ path: path.join(__dirname, ".env") });

const openai = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY,
});

async function analyzeAudio(audioFilePath) {
  try {
    // Read the audio file and convert it to a base64 string
    const audioBuffer = await fs.readFile(audioFilePath);
    const base64str = audioBuffer.toString("base64");

    const response = await openai.chat.completions.create({
      model: "gpt-4o-audio-preview",
      modalities: ["text", "audio"],
      audio: { voice: "alloy", format: "wav" },
      messages: [
        {
          role: "user",
          content: [
            { type: "text", text: "Describe this sound?" },
            { type: "input_audio", input_audio: { data: base64str, format: "wav" } },
          ],
        },
      ],
    });

    // Extract and print the transcript
    const transcript = response.choices[0].message.audio.transcript;
    console.log("Transcript:", transcript);

    // Log the full response for debugging
    console.log("\nFull Response:", JSON.stringify(response, null, 2));

    // Log token usage details: blocked today because text_tokens and
    // image_tokens are missing from the prompt-tokens-details type.
    // const usage = response.usage.prompt_tokens_details;
    // console.log("\nText Tokens:", usage.text_tokens);
    // console.log("Image Tokens:", usage.image_tokens);
  } catch (error) {
    console.error("Error:", error);
  }
}

// Use the BAK.wav file from the script's directory
const audioFilePath = path.join(__dirname, "BAK.wav");
analyzeAudio(audioFilePath);
```
To get BAK.wav, download it from the Kaggle sample WAV audio dataset: https://www.kaggle.com/datasets/crischir/sample-wav-audio-files

OS
macOS
Node version
v24.11.1
Library version
2.9.0