waveform-compare fails on pause #17

Closed
ssinyagin opened this issue Mar 28, 2015 · 6 comments

@ssinyagin

hi,

It looks like waveform-compare does not handle pauses correctly. See the WAV files at http://www.k-open.com/s/94bbcb9a.zip

In my test environment, a call is made to a telephony provider, then lands at another VoIP telephony provider, and terminates on a test server which echoes the signal back to the caller. The audio is recorded at the calling side. There are some distortions related to voice activity detection and comfort noise, which waveform-compare handles correctly. But then there is a pause of about 3 seconds of silence in the source audio, with slight noise in the received audio, and waveform-compare fails to match the frames.

94bbcb9a-3ee6-41e6-b6a6-34814367e650.wav is the original recording from the telephone system: the first (left) channel is the received audio, and the second (right) channel is the sent audio. The pause is around the 18th second. left.wav and right.wav are the extracted channels used as the comparison input:

# waveform-compare --verbose --threshold=0 --pad-short-block left.wav right.wav 
block 1: 0.927277 1674
block 2: 0.947752 1674
block 3: 0.900476 1674
block 4: 0.937155 1674
block 5: 0.930716 1674
block 6: 0.976387 1674
block 7: 0.892142 1674
block 8: 0.823834 1674
block 9: 0.969748 1674
block 10: 0.903771 1674
block 11: 0.810379 1514
block 12: 0.838841 1514
block 13: 0.783041 1514
block 14: 0.70482 1674
block 15: 0.599987 1674
block 16: 0.488953 1674
block 17: 0.471326 1674
block 18: 0.259636 1674
failed: offset distance exceeded (4714)
block 19: 0.00525391 6388
failed: offset distance exceeded (5195)
block 20: 0.00792493 6869
block 21: 0.134894 1514
block 22: 0.906908 1514
block 23: 0.982338 1514
block 24: 0.966458 1514
block 25: 0.979316 1514
block 26: 0.98211 1514
block 27: 0.977999 1514
block 28: 0.709532 1514
finished: reached end of both samples
Failure
Block: 20
Value in block: 0.00792493
Offset in block: 6869 (normal: 1674)
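
For readers who want to reproduce the input preparation, the channel extraction described above could be done along these lines. This is only a sketch using Python's standard `wave` module; the function name `split_stereo` is hypothetical, it assumes 16-bit stereo PCM input, and the thread does not say which tool was actually used to produce left.wav and right.wav.

```python
import wave


def split_stereo(src_path, left_path, right_path):
    """Write each channel of a 16-bit stereo WAV file to its own mono file."""
    with wave.open(src_path, "rb") as src:
        # Assumption: 16-bit stereo PCM; anything else is rejected.
        if src.getnchannels() != 2 or src.getsampwidth() != 2:
            raise ValueError("expected 16-bit stereo input")
        rate = src.getframerate()
        frames = src.readframes(src.getnframes())

    # Samples are interleaved as L0 R0 L1 R1 ..., 2 bytes per sample.
    left = bytearray()
    right = bytearray()
    for i in range(0, len(frames), 4):
        left += frames[i:i + 2]
        right += frames[i + 2:i + 4]

    for path, data in ((left_path, left), (right_path, right)):
        with wave.open(path, "wb") as dst:
            dst.setnchannels(1)
            dst.setsampwidth(2)
            dst.setframerate(rate)
            dst.writeframes(bytes(data))
```

With the recording from this issue, the call would be `split_stereo("94bbcb9a-3ee6-41e6-b6a6-34814367e650.wav", "left.wav", "right.wav")`.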
@ssinyagin
Author

Here's another example. These two sound files were received under the same test conditions at different times, so they have a slight difference in random noise. waveform-compare fails at the beginning of the files:
http://www.k-open.com/s/c329cfc3df2.zip

@jasn
Contributor

jasn commented Mar 30, 2015

I will find time to figure out what is going on sometime this week.

@ssinyagin
Author

I noticed that a 20 ms frame is sometimes lost due to jitter buffer overflow -- I guess that's where the algorithm fails. At some point the samples become misaligned, and from then on the matching fails completely.
What do you think, would it be possible to add jitter compensation to the tool?

@ssinyagin
Author

It definitely seems to be related to slipping frames, and not to the silence. But it fails on the silence blocks for some reason.

@jasn
Contributor

jasn commented Apr 10, 2015

I just had a look at your files. It seems that the silence was the reason for the failure in this particular instance. The silence detection is quite naive: it merely checks whether the average of the absolute values in the signal is close to 0. This is a knob that you should feel free to turn up and down. I just changed it from 5.0 to 10.0 (each value of the signal is a 16-bit integer, so 10.0 is still really close to 0, I would say), and with that change this particular set of files works out fine. You can have a look at commit d0eda87 to see how to modify the threshold.
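
To illustrate the kind of check described above, here is a minimal sketch in Python. It is not the tool's actual code (see commit d0eda87 for that); the function name `is_silent`, the strict `<` comparison, and treating an empty block as silent are all assumptions made for the example.

```python
def is_silent(samples, threshold=10.0):
    """Return True if the mean absolute amplitude of a block is below threshold.

    `samples` is a sequence of signed 16-bit sample values (-32768..32767).
    """
    if not samples:
        # Assumption: an empty block counts as silence.
        return True
    mean_abs = sum(abs(s) for s in samples) / len(samples)
    return mean_abs < threshold
```

For example, a block of low-level noise alternating between +7 and -7 has a mean absolute value of 7, so it would not count as silence with a threshold of 5.0 but would with 10.0. On a 16-bit scale, 10 is roughly 0.03% of full range.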

We are not going to compensate for jitter, since that would actually work against what we originally developed this tool to do. If you have a problem with jitter, I would suggest doing a bit of processing on the verbose output to check whether the failure is caused by jitter.
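
One way to do that processing is sketched below in Python. It assumes the `block N: value offset` line format shown in the log at the top of this issue and reports every block whose offset differs from the most common one. The function names and the default of 160 samples per frame are hypothetical: 160 samples would be one 20 ms frame at an 8 kHz sample rate, which this thread does not state. It is at least consistent with the log above, where the offset moves between 1674 and 1514, a difference of exactly 160.

```python
import re
from collections import Counter

# Matches lines such as "block 11: 0.810379 1514".
BLOCK_RE = re.compile(r"^block (\d+): (\S+) (\d+)$")


def parse_blocks(verbose_output):
    """Return a list of (block_number, value, offset) tuples."""
    blocks = []
    for line in verbose_output.splitlines():
        m = BLOCK_RE.match(line.strip())
        if m:
            blocks.append((int(m.group(1)), float(m.group(2)), int(m.group(3))))
    return blocks


def offset_shifts(blocks, frame_samples=160):
    """Return (normal_offset, shifts) for the parsed blocks.

    `shifts` lists (block_number, delta_samples, delta_frames) for every
    block whose offset differs from the most common offset.
    """
    if not blocks:
        return None, []
    normal = Counter(off for _, _, off in blocks).most_common(1)[0][0]
    shifts = []
    for number, _value, off in blocks:
        delta = off - normal
        if delta != 0:
            shifts.append((number, delta, delta / frame_samples))
    return normal, shifts
```

A shift that is a small whole number of frames (such as -1.0) hints at a dropped or inserted frame, while a large, non-integral shift on a low-value block looks more like a failed match on silence.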

@jasn jasn closed this as completed Apr 10, 2015
@ssinyagin
Author

I found a tool that is recommended by the ITU and takes jitter into account: https://github.com/dennisguse/ITU-T_pesq
Of course, your project started with different goals, and telephony is a bit of a different world :-)
