
Backend import for 'as-a-module' use #10

Merged: 4 commits into ufal:main on Nov 28, 2023

Conversation

Luca-Pozzi (Contributor):

With reference to issue #9, this PR implements an import_backend method in ASRBase. The method is then overridden in both WhisperTimestampedASR and FasterWhisperASR to import the libraries each backend requires.

To make usage more flexible, I have also exposed the output arg in OnlineASRProcessor. It receives a file-like object, so one could pass (see the sketch after this list):

  • sys.stderr (as before), which in most cases prints to the terminal
  • a file handle (open("/path/to/file.txt", "a")) to log the output to a text file
  • open(os.devnull, "w") to send the output to /dev/null, i.e. to discard it
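A minimal sketch of the pattern described above; the class and method names follow the PR, while the method bodies are illustrative assumptions, not the repository's exact code:

    import sys

    class ASRBase:
        def __init__(self, lan, modelsize=None, cache_dir=None, model_dir=None):
            self.transcribe_kargs = {}
            self.original_language = lan
            self.import_backend()  # each subclass imports only what it needs

        def import_backend(self):
            raise NotImplementedError("must be overridden in a backend subclass")

    class FasterWhisperASR(ASRBase):
        def import_backend(self):
            # declared global so the lazily imported module is visible module-wide
            global faster_whisper
            import faster_whisper

    class WhisperTimestampedASR(ASRBase):
        def import_backend(self):
            global whisper, whisper_timestamped
            import whisper
            import whisper_timestamped

    class OnlineASRProcessor:
        def __init__(self, asr, output=sys.stderr):
            self.asr = asr
            self.output = output  # any file-like object, as in the list above

        def log(self, *args):
            print(*args, file=self.output, flush=True)

This keeps heavy backend libraries out of the import path, so callers that use whisper_online as a module only need the one backend they actually select.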

@Gldkslfmsd (Collaborator) left a comment:

I forgot to "submit review".

 self.commited_in_buffer = []
 self.buffer = []
 self.new = []

 self.last_commited_time = 0
 self.last_commited_word = None

+self.output = output
@Gldkslfmsd (Collaborator):

Let's rename self.output to self.logfile. Otherwise I agree, good idea.

Btw., the logging module should be used, but I am too lazy for that. The Python code should stay simple.
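For reference, a minimal sketch of the logging-based alternative mentioned here (purely illustrative; the PR keeps the plain file-like logfile object instead):

    import logging
    import sys

    logger = logging.getLogger("whisper_online")
    logging.basicConfig(stream=sys.stderr, level=logging.DEBUG,
                        format="%(levelname)s %(message)s")

    # instead of: print(msg, file=self.logfile)
    logger.debug("incomplete segment discarded")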

@@ -465,11 +476,11 @@ def split(self, text):
 #asr = WhisperASR(lan=language, modelsize=size)

 if args.backend == "faster-whisper":
-    from faster_whisper import WhisperModel
+    #from faster_whisper import WhisperModel
@Gldkslfmsd (Collaborator):

ok

     asr_cls = FasterWhisperASR
 else:
-    import whisper
-    import whisper_timestamped
+    #import whisper
@Gldkslfmsd (Collaborator):

ok

-    import whisper
-    import whisper_timestamped
+    #import whisper
+    #import whisper_timestamped
@Gldkslfmsd (Collaborator):

ok

-# join transcribe words with this character (" " for whisper_timestamped, "" for faster-whisper because it emits the spaces when neeeded)
-sep = " "
+sep = " " # join transcribe words with this character (" " for whisper_timestamped,
+          # "" for faster-whisper because it emits the spaces when neeeded)
@Gldkslfmsd (Collaborator):

ok
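For context, a short illustration of why sep differs per backend (illustrative strings, not repository code):

    # whisper_timestamped emits bare words, so they are joined with a space:
    " ".join(["Hello", "world"])    # -> "Hello world"   (sep = " ")

    # faster-whisper already includes leading spaces where needed, so sep is empty:
    "".join(["Hello", " world"])    # -> "Hello world"   (sep = "")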


 def __init__(self, lan, modelsize=None, cache_dir=None, model_dir=None):
     self.transcribe_kargs = {}
     self.original_language = lan

+    self.import_backend()
@Gldkslfmsd (Collaborator):

Too complicated code. Let's just do the imports in load_model in the child classes that need it.
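A minimal sketch of this suggestion, building on the ASRBase sketch above: the import moves into load_model of the child class that needs it. The method body is an assumption for illustration; download_root is an existing faster_whisper parameter:

    class FasterWhisperASR(ASRBase):
        sep = ""  # faster-whisper emits spaces itself

        def load_model(self, modelsize=None, cache_dir=None, model_dir=None):
            # imported here, so the dependency is only required when this backend is chosen
            from faster_whisper import WhisperModel
            return WhisperModel(modelsize, download_root=cache_dir)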

@lrq3000 commented Oct 30, 2023:

@Luca-Pozzi Great work! Do you plan on finishing this PR? I would really like to use whisper_online as a module; being unable to do so greatly limits composability into more end-user-facing apps.

@Luca-Pozzi (Contributor, Author):

@lrq3000 It is really a minor contribution, but thank you! I plan to finish the PR this week.

@lrq3000 commented Oct 31, 2023:

@Luca-Pozzi Awesome, thank you very much! Please let me know if I can help; I am experienced in Python coding but not in deep learning modules.

@Luca-Pozzi (Contributor, Author):

@Gldkslfmsd I have edited the PR as per your suggestions! Thank you very much, and sorry it took so long on my side.

@Gldkslfmsd (Collaborator):

Thanks! No worries about the delay. I'm busy until 25.11.2023 at least; I hope I'll look at this later.

@Gldkslfmsd merged commit 39e06b5 into ufal:main on Nov 28, 2023, and added a commit that referenced this pull request the same day.
@lrq3000 commented Nov 28, 2023:

Thank you both for your great work on this! :D

@umaryasin33 commented Dec 8, 2023:

How can this be used to allow multiple clients to connect when hosting a server, or to create an API for live transcription?

@Gldkslfmsd (Collaborator):

> How can this be used to allow multiple clients to connect when hosting a server, or to create an API for live transcription?

I don't know; it's a topic that requires a separate issue. But first, there must be a Whisper backend that enables batching -- processing more inputs at once. If there is not, then use one GPU with one server per client.

@umaryasin33:

> I don't know; it's a topic that requires a separate issue. But first, there must be a Whisper backend that enables batching -- processing more inputs at once. If there is not, then use one GPU with one server per client.

Thank you. Using one GPU for each client is a tall ask for me, as there could be up to a dozen clients active at any given time in my use case. I think there are a few backends that do support batched processing, e.g. https://github.com/Blair-Johnson/batch-whisper. It would help if you have any references, or if you can point me to the parts where changes are needed to implement this. Or is it alright if I create a new issue for this?
