Fixes + Channel Selection for CHiME-7 Task #4934
from torch.utils.data import DataLoader, Dataset


class EnvelopeVariance(torch.nn.Module):
Channel selection is currently based on envelope variance.
It is not guaranteed to work, because of overlapped speech.
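For readers unfamiliar with the measure, here is a minimal NumPy sketch of envelope-variance channel ranking. It is not the `EnvelopeVariance` module from this PR: the function names, the uniform frequency bands (used here instead of a mel filterbank), and the frame sizes are all assumptions chosen for illustration. The idea is that a channel whose sub-band energy envelope fluctuates more over time is likely less reverberant or noisy, which is also why overlapped speech can mislead it.

```python
import numpy as np


def envelope_variance_scores(signals, frame_len=400, hop=160, n_bands=20, eps=1e-10):
    """Score each channel of `signals` (n_channels, n_samples); higher is better.

    Illustrative simplification: uniform bands instead of a mel filterbank.
    """
    signals = np.asarray(signals, dtype=np.float64)
    n_samples = signals.shape[1]
    n_frames = 1 + (n_samples - frame_len) // hop
    # Frame index matrix of shape (n_frames, frame_len).
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signals[:, idx] * np.hanning(frame_len)       # (C, T, frame_len)
    power = np.abs(np.fft.rfft(frames, axis=-1)) ** 2       # (C, T, F)
    bands = np.array_split(np.arange(power.shape[-1]), n_bands)
    energy = np.stack([power[..., b].sum(-1) for b in bands], axis=-1)  # (C, T, B)
    # Remove the per-band mean in the log domain (cancels channel gain).
    log_e = np.log(energy + eps)
    log_e -= log_e.mean(axis=1, keepdims=True)
    # Cube-root compression, then variance of the envelope over time.
    var = np.exp(log_e / 3.0).var(axis=1)                   # (C, B)
    # Normalize each band by its best channel, then average over bands.
    var /= var.max(axis=0, keepdims=True) + eps
    return var.mean(axis=1)


def select_channels(signals, top_k=1):
    """Return indices of the `top_k` highest-scoring channels and all scores."""
    scores = envelope_variance_scores(signals)
    return np.argsort(scores)[::-1][:top_k], scores
```

With two channels where one is a copy of the other plus stationary noise, the noisier channel gets the lower score, so `select_channels(x, top_k=1)` would keep the cleaner one.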
fi

sox_conda=`command -v ../../../tools/venv/bin/sox 2>/dev/null`
sox_conda=`command -v $(dirname $(which python))/sox 2>/dev/null`
Hopefully this fixes the sox issue.
FYI, conda sets up some useful shell environment variables: CONDA_PREFIX, CONDA_EXE, etc.
If sox was installed by conda, the path should be ${CONDA_PREFIX}/bin/sox
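As a rough illustration of this lookup order (the recipe itself does this in shell; the Python version and the `find_sox` name below are only a hypothetical sketch), one can prefer `${CONDA_PREFIX}/bin/sox` and fall back to whatever `sox` is on `PATH`:

```python
import os
import shutil
from pathlib import Path


def find_sox(env=None):
    """Return a path to sox, preferring the active conda environment.

    `env` defaults to os.environ; passing a dict makes the lookup testable.
    """
    env = os.environ if env is None else env
    prefix = env.get("CONDA_PREFIX")
    if prefix:
        candidate = Path(prefix) / "bin" / "sox"
        if candidate.is_file() and os.access(candidate, os.X_OK):
            return str(candidate)
    # Fall back to a regular PATH search; returns None if nothing is found.
    return shutil.which("sox", path=env.get("PATH"))
```

Compared with deriving the directory from `which python`, this does not depend on which interpreter happens to be first on `PATH`.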
Many thanks, I was not aware of CONDA_PREFIX. It seems much better to use that; it is cleaner.
Updated. Also updated [local/generate_chime6_data.sh](https://github.com/espnet/espnet/pull/4934/files#diff-83abe1d71c47a59774fbbdc7721c39519004e681d12764a55c81c66f9ffaceae) to use the same variable.
@kamo-naoyuki I followed your suggestion and added a script + JSON file to check the MD5 checksum of each file: https://github.com/espnet/espnet/blob/cfbb957d9c71c5c7aed27a1d4b2b85b62721381a/egs2/chime7_task1/asr1/local/check_data_gen.py. However, this means I also had to add a .json file to this recipe. Is that OK?
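For context, a check of this kind can be sketched as follows. This is not the content of `local/check_data_gen.py`; the function names and the assumed manifest layout (a JSON object mapping relative paths to MD5 hex digests) are hypothetical.

```python
import hashlib
import json
from pathlib import Path


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_mismatches(root, manifest_path):
    """Return relative paths that are missing or whose MD5 differs."""
    with open(manifest_path, encoding="utf-8") as f:
        # Assumed layout: {"relative/path": "md5hex", ...}
        expected = json.load(f)
    root = Path(root)
    bad = []
    for rel_path, md5 in sorted(expected.items()):
        target = root / rel_path
        if not target.is_file() or md5_of_file(target) != md5:
            bad.append(rel_path)
    return bad
```

An empty list from `find_mismatches` would mean that every file listed in the manifest exists and matches its recorded checksum.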
Codecov Report
@@ Coverage Diff @@
## master #4934 +/- ##
=======================================
Coverage 76.56% 76.56%
=======================================
Files 603 603
Lines 53756 53756
=======================================
Hits 41158 41158
Misses 12598 12598
This is ready for review.
vs (classic way) using all outer mics on CHiME-6
@@ -139,12 +139,9 @@ def get_feats(self, data, ref_len=None):
x = torch.from_numpy(x).float().to(self.device)
x = x.view(1, -1)

feat = self.model.wav2vec2.extract_features(
I had to apply black here; otherwise the tests were failing.
@@ -12,6 +12,8 @@ def make_history_mask(xp, block):
"""
batch, length = block.shape
arange = xp.arange(length)
history_mask = (arange[None] <= arange[:, None])[None,] |
ditto
Some tests are failing because of a broken download.
@sw005320 @simpleoier can we merge this PR?
I quickly scanned the changes, and they look good to me.
It would be great if somebody could test it.
I checked local/check_data_gen.py, but I got
(Maybe all JSON files under train_call failed.) I think some problems still exist in this recipe.
For a quick check, I'll paste the head of my JSON file here: mixer6/transcriptions/train_call/20090921_130118_LDC_120501.json
I'm suspicious about the
For sure, @kamo-naoyuki can you give me your path to
\u2019 is correct if it appears in the original data as provided by LDC. However, it should disappear after text normalization, as this is applied
Do you still have after running the data generation step?
[
{
"start_time": "1956.580",
"end_time": "1957.290",
"words": "hi",
"speaker": "120501"
},
{
"start_time": "1957.720",
"end_time": "1959.310",
"words": "i can\u2019t really hear",
"speaker": "120501"
},
{
"start_time": "1960.420",
"end_time": "1962.030",
"words": "lena",
"speaker": "120501"
},
splits/train_call
I have the same now. But this does not really explain why the MD5 checksum is not the same.
UPDATE: the annotation I have differs from the one LDC gave to participants (in a non-significant way, but it changes the hash).
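A small sketch of the two points raised in this thread: `\u2019` in the JSON is just the escaped right single quotation mark, and any change to the text, however minor, yields a different MD5. The `replace` call below is a hypothetical stand-in for the recipe's text normalization, not its actual rule.

```python
import hashlib
import json

raw = '{"words": "i can\\u2019t really hear"}'

# json.loads decodes the escape into the actual character U+2019.
words = json.loads(raw)["words"]

# Hypothetical normalization: map the typographic apostrophe to ASCII.
normalized = words.replace("\u2019", "'")


def md5_text(text):
    """MD5 hex digest of a string encoded as UTF-8."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


# A one-character difference is enough to change the checksum.
hashes_differ = md5_text(words) != md5_text(normalized)
```

Python's `json.dumps` escapes non-ASCII characters by default, which is one way such a file can end up showing `\u2019` on disk even though the underlying text is identical.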
@kamo-naoyuki Did the check fail for other files too?
[gpu_gss]:
[gss]:

We would like to thank Dr. Naoyuki Kamo for his precious help.
Thank you for mentioning me, but sorry, I don't have a Ph.D. :->
Oh, I am sorry. I can remove it. Honestly, you deserve an honorary one ;)