Unexpected field name timestamp_granularities[] #245

Open
kayhantolga opened this issue Apr 27, 2024 · 0 comments


Does this field name really include [], or is it a mistake?

CreateTranscriptionRequest:
      type: object
      additionalProperties: false
      properties:
        file:
          description: |
            The audio file object (not file name) to transcribe, in one of these formats: flac, mp3, mp4, mpeg, mpga, m4a, ogg, wav, or webm.
          type: string
          x-oaiTypeLabel: file
          format: binary
        model:
          description: |
            ID of the model to use. Only `whisper-1` (which is powered by our open source Whisper V2 model) is currently available.
          example: whisper-1
          anyOf:
            - type: string
            - type: string
              enum: ["whisper-1"]
          x-oaiTypeLabel: string
        language:
          description: |
            The language of the input audio. Supplying the input language in [ISO-639-1](https://en.wikipedia.org/wiki/List_of_ISO_639-1_codes) format will improve accuracy and latency.
          type: string
        prompt:
          description: |
            An optional text to guide the model's style or continue a previous audio segment. The [prompt](/docs/guides/speech-to-text/prompting) should match the audio language.
          type: string
        response_format:
          description: |
            The format of the transcript output, in one of these options: `json`, `text`, `srt`, `verbose_json`, or `vtt`.
          type: string
          enum:
            - json
            - text
            - srt
            - verbose_json
            - vtt
          default: json
        temperature:
          description: |
            The sampling temperature, between 0 and 1. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. If set to 0, the model will use [log probability](https://en.wikipedia.org/wiki/Log_probability) to automatically increase the temperature until certain thresholds are hit.
          type: number
          default: 0
===>timestamp_granularities[]:<=============================
          description: |
            The timestamp granularities to populate for this transcription. `response_format` must be set `verbose_json` to use timestamp granularities. Either or both of these options are supported: `word`, or `segment`. Note: There is no additional latency for segment timestamps, but generating word timestamps incurs additional latency.
          type: array
          items:
            type: string
            enum:
              - word
              - segment
          default: [segment]
      required:
        - file
        - model
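
One possible reading, not confirmed anywhere in this thread: the `[]` suffix follows a common `multipart/form-data` convention for array-valued fields, where the same key is repeated once per item. Under that assumption, a minimal Python sketch of how such a request's form fields would be laid out looks like this. `expand_form_fields` is a hypothetical helper written for illustration, not part of any OpenAI library.

```python
def expand_form_fields(params):
    """Flatten a dict into (name, value) pairs.

    List or tuple values are expanded into one pair per item, all sharing
    the same key, which is how repeated form fields are usually sent.
    """
    pairs = []
    for name, value in params.items():
        if isinstance(value, (list, tuple)):
            for item in value:
                pairs.append((name, str(item)))
        else:
            pairs.append((name, str(value)))
    return pairs


# Assumption: the literal key sent on the wire includes the "[]" suffix,
# matching the property name in the schema excerpt above.
fields = expand_form_fields({
    "model": "whisper-1",
    "response_format": "verbose_json",
    "timestamp_granularities[]": ["word", "segment"],
})

for name, value in fields:
    print(f"{name}={value}")
```

If that reading is right, the brackets are part of the wire-level field name rather than a typo, and a generated client would need to map its array property to that literal key.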