[!] Error: 'language' when attempting to use srt_writer #658

Open
knuurr opened this issue Jan 5, 2024 · 1 comment

knuurr commented Jan 5, 2024

I'm getting this very brief error when attempting to save a transcription using get_writer from utils.py.

I'm using the free version of Google Colab. This is the core logic of my code for handling transcription.

There are of course some other variables, and the models are loaded in a different part of the code, but I assume the issue is somewhere here.

    try:
      print("[*] Loading audio...")
      audio = whisperx.load_audio(source_path)

      print("[*] Attempting to transcribe...")
      result = model.transcribe(source_path, language="pl", batch_size=16)

      # 2. Align whisper output
      # model_a, metadata = whisperx.load_align_model(language_code="pl", device=device)
      print("[*] Attempting to align...")
      result = whisperx.align(result["segments"], model_a, metadata, audio, device, return_char_alignments=False)

      # 3. Assign speaker labels
      print("[*] Attempting to assign speaker labels...")
      diarize_model = whisperx.DiarizationPipeline(use_auth_token=HF_TOKEN, device=device)
      diarize_segments = diarize_model(audio)

      result = whisperx.assign_word_speakers(diarize_segments, result)
      print("result:", result)

    except Exception as e:
        print_colored(f"[!] Error during transcription: {mp3_file}", color='red')
        print(f"[!] Error: {str(e)}")
        success = False

    # Save as an SRT file
    if success:
      try:
        print("[*] Attempting Save as an SRT file...")
        srt_writer = get_writer("srt", transcript_path)
        
        # fix for "missing 1 required positional argument: 'options'"
        word_options = {
          "highlight_words": True,
          "max_line_count": 50,
          "max_line_width": 3
          }

        srt_writer(result, mp3_file, word_options)

      except Exception as e:
          print_colored(f"[!] Error saving subtitles: {mp3_file}" , color='red')
          print(f"[!] Error: {str(e)}")
          success = False

Output:

[*] Processing and moving files:
[*] Now processing: <file>mp3
[*] Loading audio...
[*] Attempting to transcribe...
Detected language: pl (0.97) in first 30s of audio...
[*] Attempting to align...
[*] Attempting to assign speaker labels...
[*] Attempting Save as an SRT file...
[!] Error saving subtitles: <file>mp3
[!] Error: 'language'
[!] Failed to process: <file>.mp3
An exception has occurred, use %tb to see the full traceback.

I must say I'm having a hard time debugging this. I don't know what the issue could be. I have a similar pipeline for vanilla OpenAI Whisper, and I got it to work fine there.
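
A note on why the message is so brief: a bare `'language'`, quotes included, is consistent with what `str(e)` gives for a `KeyError`, whose string form is just the quoted missing key. Below is a minimal sketch of how the `except` handler could report the exception type and the full traceback instead. The `result` dictionary is a hypothetical stand-in with no `"language"` entry, not real WhisperX output, and `describe_error` is an illustrative helper, not part of any library.

```python
import traceback


def describe_error(exc: Exception) -> str:
    """Return the exception type together with its message."""
    return f"{type(exc).__name__}: {exc}"


# Hypothetical stand-in for a result dict that lacks a "language" key.
result = {"segments": [], "word_segments": []}

try:
    result["language"]  # raises KeyError('language')
except Exception as e:
    short = str(e)                 # what the handler in the issue prints
    detailed = describe_error(e)   # adds the exception class name
    full = traceback.format_exc()  # file, line number, and call stack
    print(f"[!] Error: {short}")
    print(f"[!] Error: {detailed}")
    print(full)
```

With the traceback printed, the line that performs the failing lookup is visible directly, which should narrow down which dictionary is missing the key.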

@metheofanis

Exactly the same problem here! Is there any progress?
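
One possible cause, offered as an assumption rather than a confirmed diagnosis: in the code above, `result` is reassigned by `whisperx.align`, so any `"language"` entry from `model.transcribe` would be lost if the aligned output does not carry it, and the subtitle writer may then fail when it looks that key up. The sketch below illustrates the idea of copying the language over, using plain dictionaries as stand-ins. The dictionary shapes and the `with_language` helper are hypothetical and have not been checked against the WhisperX version used in this issue.

```python
def with_language(aligned: dict, language: str) -> dict:
    """Return a copy of `aligned` that has a "language" entry.

    An existing "language" value is kept; otherwise `language` is used.
    """
    patched = dict(aligned)
    patched.setdefault("language", language)
    return patched


# Hypothetical stand-ins for the transcription and alignment results.
transcribed = {"segments": [{"text": "Dzień dobry"}], "language": "pl"}
aligned = {"segments": [{"text": "Dzień dobry"}], "word_segments": []}

patched = with_language(aligned, transcribed["language"])
print(sorted(patched))
```

Applied to the code in the issue, this would mean saving the language (the literal `"pl"`, or the value from the transcription result) before `result` is overwritten by the alignment step, and setting it on `result` again before calling `srt_writer`.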
