Add NemoAsrReader handling of UTF8 text #2358
Signed-off-by: Joaquin Anton <janton@nvidia.com>
!build
CI MESSAGE: [1700932]: BUILD STARTED
CI MESSAGE: [1700932]: BUILD FAILED
8b9bce1 to c8b7f53
CI MESSAGE: [1703805]: BUILD STARTED
Signed-off-by: Joaquin Anton <janton@nvidia.com>
d9d5db1 to 863eef4
{
  std::stringstream ss;
  ss << R"code({"audio_filepath": "path/to/audio1.wav", "duration": 1.45, "text": "这是一个测试"})code" << std::endl;
Perhaps we should get the UTF-8 bytes and store them as a char array (providing the characters' numerical values); otherwise the test is prone to breaking when the file is transcoded.
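One way to obtain those numerical values is sketched below. This is only an illustration of the reviewer's suggestion, not code from the PR; the array name `kText` is hypothetical, and the snippet merely prints a C-style initializer that could be pasted into a test.

```python
# Text used in the test manifest quoted above.
text = "这是一个测试"

# Encode to UTF-8 and collect each byte's numeric value.
utf8_bytes = text.encode("utf-8")
byte_values = list(utf8_bytes)

# Render the values as a C-style array initializer.
# The name `kText` is a hypothetical placeholder.
c_initializer = (
    "const unsigned char kText[] = {"
    + ", ".join(f"0x{b:02x}" for b in byte_values)
    + "};"
)
print(c_initializer)

# The numeric form round-trips to the same string, independent of
# how the source file itself is encoded.
assert bytes(byte_values).decode("utf-8") == text
```

Because the initializer contains only ASCII characters, re-saving the source file in another encoding would not change the bytes the test compares against.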
@@ -85,14 +85,6 @@ in seconds, of the audio samples.

Samples with a duration longer than this value will be ignored.)code",
      0.0f)
  .AddOptionalArg("normalize_text",
Deprecate this?
Done
nemo_asr_manifest = os.path.join(tmp_dir.name, "nemo_asr_manifest.json")
create_manifest_file(nemo_asr_manifest, names, lengths, rates, ref_text_literal)
ref_text = [np.frombuffer(bytes(s, "utf8"), dtype=np.uint8).reshape(-1) for s in ref_text_literal]
What is .reshape(-1) for?
flattening
Checked, not needed
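The conclusion of this thread can be checked with a short sketch: `np.frombuffer` already returns a one-dimensional array, so `.reshape(-1)` changes nothing. The sample strings here are hypothetical stand-ins for `ref_text_literal`.

```python
import numpy as np

# Hypothetical sample texts standing in for ref_text_literal.
ref_text_literal = ["hello", "这是一个测试"]

for s in ref_text_literal:
    arr = np.frombuffer(bytes(s, "utf8"), dtype=np.uint8)
    flat = arr.reshape(-1)

    # frombuffer yields a 1-D array, so flattening is a no-op.
    assert arr.ndim == 1
    assert arr.shape == flat.shape
    assert np.array_equal(arr, flat)
```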
98bf028 to 81937a1
Signed-off-by: Joaquin Anton <janton@nvidia.com>
81937a1 to 6f073f8
!build
CI MESSAGE: [1707805]: BUILD STARTED
CI MESSAGE: [1707805]: BUILD PASSED
Signed-off-by: Joaquin Anton <janton@nvidia.com>
Why we need this PR?
What happened in this PR?
Fill relevant points, put NA otherwise. Replace anything inside []
NeMo manifests are encoded in UTF-8. This PR makes sure that the reader outputs the expected UTF-8 encoded string
Tests added
Removed the trailing 0 from text outputs
Removed text normalization, which is now left to the Python code
NemoAsrReader
N/A
New tests added
Docstring updated
JIRA TASK: [DALI-1635]
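As a rough sketch of what the description above implies for user code: the text output is a sequence of UTF-8 bytes with no trailing 0, so it can be decoded directly, and any normalization is applied afterwards in Python. The `reader_output` bytes are a stand-in for the reader's text output rather than a real DALI call, and `normalize` is a hypothetical example, not the normalization that was removed.

```python
def decode_text(raw: bytes) -> str:
    """Decode UTF-8 bytes (a stand-in for the reader's text output)."""
    return raw.decode("utf-8")


def normalize(text: str) -> str:
    """Hypothetical normalization: lowercase and collapse whitespace."""
    return " ".join(text.lower().split())


# Stand-in for the text output of one sample (assumption: raw UTF-8 bytes).
reader_output = "  Hello  WORLD 这是一个测试 ".encode("utf-8")

decoded = decode_text(reader_output)
normalized = normalize(decoded)
print(normalized)

# With a trailing 0 byte, the decoded string would carry a stray NUL
# character that the caller would have to strip.
with_trailing_zero = reader_output + b"\x00"
assert decode_text(with_trailing_zero).endswith("\x00")
```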