Also catch comma doc sentences
evamaxfield committed Feb 27, 2023
1 parent 7727bcc commit e63c25d
Showing 1 changed file with 2 additions and 2 deletions.
4 changes: 2 additions & 2 deletions cdp_backend/sr_models/whisper.py
@@ -230,11 +230,11 @@ def transcribe(
 log.info("Constructing sentences with word metadata")
 for doc_sent in doc.sents:
     doc_sent_text = doc_sent.text.strip()
-    # Sometimes spacy produces a doc sentence that is just a period
+    # Sometimes spacy produces a doc sentence that is just a period or comma.
     # This sentence is attached to the end of the word
     # in the timestamped words with metas list
     # We can simply ignore those odd sentences
-    if doc_sent_text == ".":
+    if any([c == doc_sent_text for c in [".", ","]]):
         continue

     log.info(f"Doc sent: '{doc_sent_text}'")
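
For readers following the change, the new condition skips a sentence only when its stripped text is exactly "." or ",". A minimal sketch of the same check as a set membership test is shown below. The names `PUNCTUATION_ONLY`, `is_punctuation_only`, and `keep_real_sentences` are hypothetical and do not appear in cdp_backend, and plain strings stand in for spaCy sentence spans so the example runs on its own.

```python
from typing import Iterable, List

# Hypothetical constant, not part of cdp_backend: the sentence texts
# that the commit treats as noise.
PUNCTUATION_ONLY = frozenset({".", ","})


def is_punctuation_only(sentence_text: str) -> bool:
    """Return True when the stripped text is exactly "." or ","."""
    return sentence_text.strip() in PUNCTUATION_ONLY


def keep_real_sentences(sentence_texts: Iterable[str]) -> List[str]:
    """Drop punctuation-only sentences, mirroring the `continue` in the diff."""
    return [
        text.strip()
        for text in sentence_texts
        if not is_punctuation_only(text)
    ]


if __name__ == "__main__":
    # Plain strings stand in for `doc_sent.text` values from spaCy.
    print(keep_real_sentences(["Hello there.", ".", " , ", "Thanks, all."]))
```

The membership test behaves the same as `any([c == doc_sent_text for c in [".", ","]])` for these two characters and avoids building an intermediate list. Whether the project would prefer that form is a style choice this commit does not settle.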