text_tokens and image_tokens not documented or typed #1718

@NikkiAung

Description

Confirm this is a Node library issue and not an underlying OpenAI API issue

  • This is an issue with the Node library

Describe the bug

When making a chat completions request to gpt-4o-audio-preview or gpt-4o-mini-audio-preview, the usage object in the response contains text_tokens and image_tokens under prompt_tokens_details:

"prompt_tokens_details": {
  "cached_tokens": 0,
  "audio_tokens": 21,
  "text_tokens": 11,
  "image_tokens": 0
}

However, in openai-node/src/resources/completions.ts, image_tokens and text_tokens are not defined on the prompt_tokens_details type. As a result:

const details = response.usage.prompt_tokens_details;
details.text_ // ← No autocomplete!
details.text_tokens // ← Type checker error!

After my code changes

const details = response.usage.prompt_tokens_details;
details.text_ // ← IDE autocompletes to "text_tokens"! ✅
details.text_tokens // ← Type checker OK! ✅
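A sketch of what the added fields could look like. The field names are taken from the API response above; the interface shown here is a simplified stand-in for the real type in completions.ts, not the exact patch:

```typescript
// Illustrative sketch only: a simplified version of the usage-details
// type with the two missing fields declared. Field names mirror the
// JSON response shown above.
interface PromptTokensDetails {
  cached_tokens?: number;
  audio_tokens?: number;
  /** Text input tokens present in the prompt. */
  text_tokens?: number;
  /** Image input tokens present in the prompt. */
  image_tokens?: number;
}

// With the fields declared, an object shaped like the API response
// type-checks and the fields are readable without errors:
const details: PromptTokensDetails = {
  cached_tokens: 0,
  audio_tokens: 21,
  text_tokens: 11,
  image_tokens: 0,
};

console.log(details.text_tokens, details.image_tokens);
```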

To Reproduce

Make a request to the chat completions API with model gpt-4o-audio-preview or gpt-4o-mini-audio-preview.

Screen.Recording.2025-12-11.at.12.04.49.AM.mov

Code snippets

import OpenAI from "openai";
import fs from 'fs/promises';
import path from 'path';
import { fileURLToPath } from 'url';
import { dirname } from 'path';
import dotenv from 'dotenv';

// Get the directory name of the current module
const __filename = fileURLToPath(import.meta.url);
const __dirname = dirname(__filename);

// Load environment variables from .env file in the script's directory
dotenv.config({ path: path.join(__dirname, '.env') });

const openai = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY
});

async function analyzeAudio(audioFilePath) {
  try {
    // Read the audio file and convert it to a base64 string
    const audioBuffer = await fs.readFile(audioFilePath);
    const base64str = audioBuffer.toString('base64');

    const response = await openai.chat.completions.create({
      model: "gpt-4o-audio-preview",
      modalities: ["text", "audio"],
      audio: { voice: "alloy", format: "wav" },
      messages: [
        {
          role: "user",
          content: [
            { type: "text", text: "Describe this sound?" },
            { type: "input_audio", input_audio: { data: base64str, format: "wav" }}
          ]
        }
      ]
    });

    // Extract and print the transcript
    const transcript = response.choices[0].message.audio.transcript;
    console.log('Transcript:', transcript);
    
    // Log the full response for debugging
    console.log('\nFull Response:', JSON.stringify(response, null, 2));
    
    // Token usage details — these lines demonstrate the bug:
    // text_tokens and image_tokens exist in the JSON response but are
    // missing from the prompt_tokens_details type, so they fail to
    // type-check and are commented out here.
    // const usage = response.usage.prompt_tokens_details;
    // console.log('\nText Tokens:', usage.text_tokens);   // ← type error
    // console.log('Image Tokens:', usage.image_tokens);   // ← type error

  } catch (error) {
    console.error('Error:', error);
  }
}

// Use the BAK.wav file from the script's directory
const audioFilePath = path.join(__dirname, 'BAK.wav');
analyzeAudio(audioFilePath);

To get BAK.wav, download it from the Kaggle sample WAV audio dataset: https://www.kaggle.com/datasets/crischir/sample-wav-audio-files
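Until the types are updated, the fields can still be read with a structural cast. This is a hypothetical workaround sketch, not an official API; the helper type and the sample object below are illustrations, with the object's shape copied from the response in this report:

```typescript
// Hypothetical workaround: widen the typed details object so the
// undocumented fields can be read without a type error.
type ExtraPromptTokenDetails = {
  text_tokens?: number;
  image_tokens?: number;
};

// Stand-in for response.usage.prompt_tokens_details; values copied
// from the API response shown earlier in this report.
const promptTokensDetails = {
  cached_tokens: 0,
  audio_tokens: 21,
  text_tokens: 11,
  image_tokens: 0,
} as { cached_tokens?: number; audio_tokens?: number } & ExtraPromptTokenDetails;

console.log('Text Tokens:', promptTokensDetails.text_tokens);
console.log('Image Tokens:', promptTokensDetails.image_tokens);
```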

OS

macOS

Node version

v24.11.1

Library version

2.9.0
