Add on_progress callback for transcribe and align #620

Open
wants to merge 1 commit into
base: main


@stex stex commented Dec 9, 2023

This adds an optional `on_progress` argument to both the `align` and `transcribe` methods.

This allows callers to handle progress updates via a callback. Until now, progress could only be printed to `STDOUT`.

Example:

```python
transcribe(
  my_audio,
  batch_size=8,
  on_progress=lambda state, current=0, total=0: print(f"state: {state}, current: {current}, t: {total}")
)
```

States are defined as

```python
from enum import Enum

class TranscriptionState(Enum):
  LOADING_AUDIO = "loading_audio"
  GENERATING_VAD_SEGMENTS = "generating_vad_segments"
  TRANSCRIBING = "transcribing"
  FINISHED = "finished"
```
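
A callback does not have to be a lambda. As a minimal sketch, assuming the callback is invoked with the `(state, current=0, total=0)` signature used in the example above, a small callable object could record progress for later display. `ProgressRecorder` and `format_event` are hypothetical names that are not part of this PR, the enum is a local copy so the snippet runs without the library, and the calls at the bottom are simulated rather than actual output from `transcribe`:

```python
from enum import Enum


class TranscriptionState(Enum):
    # Local copy of the enum from the PR description, so this sketch is self-contained.
    LOADING_AUDIO = "loading_audio"
    GENERATING_VAD_SEGMENTS = "generating_vad_segments"
    TRANSCRIBING = "transcribing"
    FINISHED = "finished"


class ProgressRecorder:
    """Hypothetical callback object that stores each progress event."""

    def __init__(self):
        self.events = []

    def __call__(self, state, current=0, total=0):
        # A percentage only makes sense when a non-zero total is reported.
        percent = round(100 * current / total) if total else None
        self.events.append((state, percent))


def format_event(state, percent):
    """Render one recorded event as a short status string."""
    if percent is None:
        return state.value
    return f"{state.value}: {percent}%"


recorder = ProgressRecorder()

# Simulated invocations, standing in for what
# transcribe(my_audio, on_progress=recorder) might emit (an assumption).
recorder(TranscriptionState.LOADING_AUDIO)
recorder(TranscriptionState.TRANSCRIBING, current=2, total=8)
recorder(TranscriptionState.FINISHED)

for state, percent in recorder.events:
    print(format_event(state, percent))
```

Keeping the default values for `current` and `total` means the same object works for states that report no counts, such as `LOADING_AUDIO` in the simulated calls.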

Signed-off-by: Stefan Exner <stex@stex.codes>
@matheusbach

nice work, great feature. keep going

@IgorTavcar

this feature is exactly what we need at the moment

@stex stex marked this pull request as ready for review January 19, 2024 16:37
@stex
Author

stex commented Jan 19, 2024

@matheusbach @IgorTavcar is there something missing in this PR that you'd need for your work?
I removed the "draft" state, so it's now up to @m-bain

@pluja

pluja commented Mar 5, 2024

this would be very useful if merged!

@casic

casic commented Mar 6, 2024 via email

@stex
Author

stex commented Mar 13, 2024

@casic what doesn't work regarding alignment?

@casic

casic commented Mar 13, 2024 via email

@matheusbach

> @matheusbach @IgorTavcar is there something missing in this PR that you'd need for your work? I removed the "draft" state, so it's now up to @m-bain

I would appreciate it if the function parameter were standardized between the transcribe and align steps. Also, the enumerated status used in the transcribe step is very useful.


5 participants