add cmvn #540

Merged
merged 8 commits into from Apr 17, 2020

Conversation

wanglong001
Contributor

@wanglong001 wanglong001 commented Apr 14, 2020

Hi, this PR adds the Kaldi CMVN feature requested in the issue.

Thanks.

Closes #535.

@mthrok
Collaborator

mthrok commented Apr 14, 2020

Hi @wanglong001

Thanks for the PR. We love this addition to torchaudio.
To merge this PR, we need a few things, in addition to fixing the current test failure.

Assuming this corresponds to the apply-cmvn-sliding command from Kaldi, we need to add a compatibility check to the tests.

  1. We need to create a set of sample data that represents what Kaldi produces, which can be used in the test.
    Which commands (options for apply-cmvn-sliding) is this Python implementation compatible with?
  2. Can we add a test suite in test/test_functional.py that reads the input data, passes it to sliding_window_cmn_internal, and compares the result with the expected result?
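
The comparison requested above could be sketched roughly as follows. This is only an illustration in plain Python: `parse_kaldi_text_matrix` and `assert_matrices_close` are hypothetical helpers, not torchaudio or Kaldi APIs, and the parser assumes the text archive (`ark,t`) holds a single matrix laid out as `key [ rows ]`. In a real test, the second matrix would come from the Python implementation under test rather than a hand-written list.

```python
import math


def parse_kaldi_text_matrix(text):
    """Parse one matrix in the assumed text layout: `key [ row ... ]`."""
    key, _, body = text.partition("[")
    body = body.rsplit("]", 1)[0]
    rows = [
        [float(value) for value in line.split()]
        for line in body.splitlines()
        if line.strip()
    ]
    return key.strip(), rows


def assert_matrices_close(expected, actual, abs_tol=1e-4):
    """Fail if the shapes differ or any element differs by more than abs_tol."""
    assert len(expected) == len(actual), "row count mismatch"
    for row_e, row_a in zip(expected, actual):
        assert len(row_e) == len(row_a), "column count mismatch"
        for e, a in zip(row_e, row_a):
            assert math.isclose(e, a, rel_tol=0.0, abs_tol=abs_tol), (e, a)


# Hypothetical usage with a tiny, made-up matrix (not real Kaldi output).
kaldi_text = """test  [
  1.0 2.0
  3.0 4.0 ]
"""
key, expected = parse_kaldi_text_matrix(kaldi_text)
assert_matrices_close(expected, [[1.00001, 2.0], [3.0, 3.99999]])
```

An absolute tolerance is used here because Kaldi's text output and a printed tensor are rounded to different numbers of digits; the right tolerance for a real test is a judgment call.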

@mthrok
Collaborator

mthrok commented Apr 14, 2020

@vincentqb

We have torchaudio.compliance.kaldi module which has a collection of Kaldi compatible functions, and it is not immediately clear to me if newly added Kaldi compatible features should go there as a simple function, or to functional and/or transforms as in the current state of this PR. What is your thought?

Collaborator

@mthrok mthrok left a comment

Looks mostly good.

@@ -26,6 +26,7 @@
'Fade',
'FrequencyMasking',
'TimeMasking',
"SlidingWindowCmn",
Collaborator

Could you use single quotes so that it looks consistent with the others?

Contributor Author

Thanks

Contributor Author

@wanglong001 wanglong001 Apr 15, 2020

Hi @mthrok, I tested it, and the CMVN results are consistent with what Kaldi produces.

Kaldi produces:
test [ -0.2978237 -3.957721 -1.691367 -0.8017386 -0.1020617 0.7112818 1.032329 -0.07302935 -0.364126 -1.116751 -1.52251 0.008113492 0.6255074 0.2103512 -2.065425 -3.816057 -4.010491 -5.313768 -4.361475 -3.638874 -3.174215 -3.710273 -2.913771 -1.385493 -0.05104687 -0.3916942 -0.1137991 -0.4392033 0.1804291 -0.9255709 -1.088666 -1.524429 -1.000161 -1.45257 -1.327303 -0.7260036 -0.2312086 0.4410149 0.2573555 0.4030849 0.4006264 -0.3765712 0.4998532 0.3766024 1.700068 1.545152 2.894449 2.408592 2.028765 0.5500979 1.357298 1.732662 2.102398 3.12371 1.658175 -0.07140766 -1.650993 -2.856858 -2.421155 -2.578214 -0.9135854 -1.476693 -1.476091 -0.08946308 1.441404 0.8711346 0.9737606 2.655646 2.822439 0.4792782 -0.3406159 0.1678409 0.8757191 0.1234999 -0.6352221 0.4807768 1.081321 1.441535 1.359365 1.144065 0.4634563 0.04009861 0.4686528 -0.03200705 1.544268 1.596862 2.588459 1.933711 1.560324 0.4208693 0.2876002 1.620412 2.402887 2.347301 0.4475754 -1.333086 -2.693162 -3.084388 -2.819175 -3.223354 -2.203945 -1.836033 -1.859041 -0.7428044 0.8772426 0.02372437 0.3243809 1.992488 2.41502 0.08470878 -1.248146 -0.1552487 0.415659 -0.3751296 -2.249642 -1.400303 -0.9902093 -1.121125 -0.2403948 -0.02057447 -0.3575848 0.3077492 0.4317036 0.3063223 0.984347 0.8403406 1.598718 0.8940803 -0.1595858 -0.8741823 -0.6935915 0.09939348 0.4506874 -0.0337093 -1.115825 -2.622287 -4.711492 -4.556859 -4.668654 -5.444584 -3.740125 -3.011553 -2.973071 -1.986133 -0.1995359 -0.06490441 -0.2038088 0.9230375 1.61049 -0.9130806 -2.337496 -0.8909388 0.3564282 -1.37235 -2.773633 -2.053204 -2.349449 -1.753065 -0.4463646 -0.8244952

This implementation produces:
tensor([[-0.2978, -3.9577, -1.6914, ..., 0.4410, 0.2574, 0.4031], [ 0.4006, -0.3766, 0.4999, ..., 1.4415, 1.3594, 1.1441], [ 0.4635, 0.0401, 0.4686, ..., -1.1211, -0.2404, -0.0206], ..., [-4.7276, -3.1761, -2.5138, ..., -1.3461, -2.5248, -2.8029], [-4.8128, -2.6665, -3.4212, ..., -2.8407, -2.7148, -2.8011], [-4.5950, -3.4546, -3.6095, ..., -3.4438, -2.6217, -2.7752]])

So this Python implementation is compatible with Kaldi CMVN.

Collaborator

Hi @wanglong001

Thanks for checking the compatibility. It's great that it's compatible with Kaldi's output. Can you provide me with the Kaldi command you used? I would like to reproduce your result on my end, just to make sure. We would also like to add it to our test suite so that, if someone changes the code in the future, we can maintain the integrity of the functionality.
We do not have a Kaldi compatibility test that is simple enough for an external contributor to write, so once I get the command line from you, I can add it to the test suite so that this function keeps working as intended.

Contributor Author

@wanglong001 wanglong001 Apr 15, 2020

Hi @mthrok

Very professional !!!

Example

PYTHON
feats = torch.randn(40, 10)
with WriteHelper('ark,t:/tmp/t1.txt') as writer:
    writer("test", feats.numpy())
feats = slidingWindowCmnInternal(feats, center=False, cmvn_window=600, norm_vars=False)

KALDI COMMAND

apply-cmvn-sliding ark,t:/tmp/t1.txt ark,t:/tmp/t1_cmvn.txt

/tmp/t1.txt

test [
-1.2092187404632568 -0.2452489286661148 -0.10354653745889664 -1.055932879447937 0.6074185371398926 1.2994794845581055 -0.9786869287490845 0.4241405725479126 -1.7869689464569092 -0.3930174708366394
1.3758515119552612 2.271512269973755 1.547525405883789 0.09755882620811462 0.6663862466812134 -0.14491936564445496 0.010599859058856964 0.3704547882080078 -0.026905754581093788 0.9940603375434875
0.3229313790798187 0.45287033915519714 -1.9792391061782837 -0.46716856956481934 -0.3847292363643646 1.2056677341461182 -0.5216332077980042 0.4214261472225189 0.9459046125411987 -0.6201861500740051
-0.6755410432815552 0.18953222036361694 -2.453876256942749 -0.4136284291744232 -1.6230759620666504 1.6343038082122803 -0.2785346806049347 0.6952822208404541 1.5572190284729004 -0.15103335678577423
0.7448609471321106 0.9225212931632996 0.08566032350063324 0.30956557393074036 -0.5720630884170532 -0.27443450689315796 -0.5298489332199097 -0.2935234308242798 -0.7056423425674438 2.207134246826172
0.303416907787323 1.2786098718643188 -0.12336374819278717 -0.9351986646652222 0.3644144535064697 -0.7463241815567017 1.4035923480987549 0.27524638175964355 -0.0467388853430748 0.9172043800354004
-0.7640166282653809 -1.3739436864852905 2.007477283477783 0.9553800225257874 -0.597091794013977 -0.6707907319068909 0.10798247158527374 1.2496613264083862 0.7424683570861816 1.5249921083450317
0.3980518877506256 -0.999061107635498 -0.9398044943809509 0.40981823205947876 0.6541479825973511 0.8159498572349548 -0.10985129326581955 1.022978663444519 -1.6536064147949219 0.6017708778381348
0.3204551637172699 -0.8455250859260559 -0.6287296414375305 -0.06087707728147507 -0.3673917353153229 1.2744295597076416 0.4158554673194885 0.5116289258003235 -0.8029524087905884 0.423404723405838
-0.47009897232055664 0.5491663217544556 -2.34041690826416 0.5639355182647705 0.8848018050193787 1.0180824995040894 1.2150442600250244 1.3364535570144653 -0.19417907297611237 0.24204368889331818
0.6246862411499023 0.39906013011932373 -0.7155704498291016 0.006444863975048065 0.3321879208087921 1.678318738937378 1.1174615621566772 -0.2640886902809143 -0.15615008771419525 -0.9037182331085205
0.7920187711715698 0.5783515572547913 -1.188904047012329 -0.4041592478752136 -0.7321376204490662 0.059093113988637924 -1.246921181678772 -0.6501733660697937 1.715835690498352 2.0756399631500244
-0.17813552916049957 -0.44768399000167847 -1.1387429237365723 0.2032758593559265 -1.7872635126113892 0.21826878190040588 -3.098574638366699 -1.1429184675216675 0.7085928320884705 -1.168757438659668
0.07852610945701599 -1.3787460327148438 0.5427438020706177 0.07594803720712662 0.550636351108551 1.2864513397216797 -1.4941885471343994 -0.9298434257507324 1.833871841430664 0.24510546028614044
0.34600722789764404 1.4709651470184326 -0.19937828183174133 0.09141021966934204 1.5486915111541748 -0.4029368460178375 0.8056784272193909 0.053414445370435715 -1.7138036489486694 -0.19163841009140015
-0.6733019948005676 0.6173000931739807 -0.14437831938266754 -0.2538401782512665 0.47160375118255615 -1.6405439376831055 -0.8501981496810913 -0.6651865839958191 0.662636399269104 0.21473029255867004
-0.6396039724349976 0.2518253028392792 -1.3881373405456543 -0.04230407997965813 0.5668030977249146 -0.15674689412117004 -1.7664361000061035 1.01475191116333 -0.04886173456907272 0.8592840433120728
-0.4908730387687683 1.6541380882263184 -0.5007355213165283 -0.5055148005485535 1.3684736490249634 -1.6074591875076294 0.12517008185386658 -0.6648843884468079 1.1866555213928223 0.1114572212100029
-0.7519335746765137 -0.49858492612838745 2.091794013977051 -0.42229363322257996 -1.4436630010604858 0.5567572712898254 -0.1677234023809433 0.05117766559123993 0.731659471988678 -0.20385245978832245
0.32324153184890747 -0.10049401968717575 1.579226016998291 0.7150881290435791 0.20265425741672516 -0.29284852743148804 0.6828510761260986 2.0449142456054688 -2.7959678173065186 2.9863431453704834
-0.9678743481636047 -1.7277352809906006 -0.6008355021476746 -0.6325292587280273 -1.2347840070724487 -0.38209405541419983 -0.38333404064178467 0.5440695285797119 1.1540812253952026 0.392869234085083
0.1663489192724228 0.12225817888975143 -1.6994049549102783 -0.9291478395462036 -0.6442187428474426 3.121890068054199 0.9056876301765442 -0.9223313331604004 -0.8339102268218994 -0.17442043125629425
-0.013405872508883476 -0.8903205990791321 -0.37287694215774536 1.3863939046859741 1.0475677251815796 0.6418140530586243 1.437076210975647 1.3522521257400513 -0.28179964423179626 -1.4045377969741821
-0.778570830821991 0.5407215356826782 1.703031063079834 1.4699496030807495 0.7289568185806274 -0.5236309766769409 -0.1954544484615326 -0.18744006752967834 0.8091910481452942 -0.9062354564666748
-1.0162575244903564 -0.8560374975204468 -1.056389331817627 0.3531910181045532 0.1619015336036682 -0.28672996163368225 -1.1198407411575317 0.04759075120091438 -0.4830250144004822 -0.48468032479286194
1.1461893320083618 0.20625630021095276 0.20927461981773376 0.2557945251464844 0.47662121057510376 -0.6418120265007019 -0.2985978126525879 1.0127915143966675 -0.398916631937027 -1.4456056356430054
0.2262343019247055 -0.8553360104560852 0.10127707570791245 0.18299366533756256 0.31477582454681396 0.15762250125408173 -1.6080231666564941 1.5433456897735596 -1.0751574039459229 1.2443770170211792
-0.0771719440817833 0.10441994667053223 0.7341253161430359 1.2573859691619873 0.6562386751174927 -0.6382467150688171 -0.05928630009293556 -1.0132511854171753 -0.5974369645118713 0.40967586636543274
0.004415785428136587 -0.4210253357887268 0.7442784905433655 -0.5725408792495728 0.5462195873260498 -1.0933852195739746 -1.1129332780838013 -0.6032747626304626 1.8345623016357422 1.1984893083572388
0.6331589818000793 1.3837802410125732 0.5318757891654968 -0.6840939521789551 0.8321081399917603 0.25050145387649536 -1.1622778177261353 -0.7050056457519531 -0.06549245119094849 -1.4602961540222168
0.3204069435596466 -0.8191430568695068 0.4573107361793518 -1.2136025428771973 -0.5675181746482849 -0.7266758680343628 -0.031212201341986656 -0.12012416124343872 -0.29710572957992554 0.9852219223976135
1.4262821674346924 -0.2672862410545349 1.0378899574279785 -0.7522720694541931 -0.6533710956573486 -1.3227550983428955 -0.44243836402893066 -1.5958774089813232 -1.5162904262542725 1.209925889968872
-0.11565983295440674 -0.35410958528518677 -0.5972486734390259 -0.4434187114238739 2.4614715576171875 0.03463190793991089 1.1184158325195312 0.6207089424133301 0.5573611259460449 0.6690876483917236
-0.6895540356636047 -0.3429434597492218 1.1756212711334229 -0.09356377273797989 0.4086296856403351 -0.8162238001823425 1.1158862113952637 0.6267895102500916 -2.8943655490875244 -0.132553830742836
0.2839597165584564 1.3107786178588867 -0.5829496383666992 1.0912151336669922 1.2699445486068726 -0.9851419925689697 0.0008430131711065769 -0.6517494320869446 -0.2845141589641571 -1.7359248399734497
-1.0942730903625488 -0.23687131702899933 -1.069266438484192 0.5593739748001099 0.8239039182662964 -0.9064173698425293 0.8374661803245544 -1.6963270902633667 1.6548678874969482 0.2801225781440735
1.8101776838302612 -0.9648396968841553 -0.6196779608726501 -1.1564671993255615 -0.07477502524852753 1.193432092666626 1.2495510578155518 -1.7690165042877197 -1.1298855543136597 0.9360997676849365
1.523394227027893 -0.9413297176361084 -0.8621751666069031 0.9638739824295044 -1.8268489837646484 -0.3881881535053253 -0.7219868302345276 -0.4678535461425781 0.484478622674942 -2.668909788131714
0.2339848130941391 0.03133305534720421 0.27131256461143494 -0.009103471413254738 0.1773773729801178 -0.5420222878456116 -0.15730808675289154 -0.5197650194168091 -1.948490023612976 1.046708106994629
1.2467955350875854 1.1625522375106812 -0.14371293783187866 0.19678127765655518 0.1502665877342224 -1.8948307037353516 -1.5621964931488037 -1.504028081893921 0.46749237179756165 -0.2627536356449127 ]

KALDI CMVN
test [
-1.310366 -0.2685411 0.0621769 -1.058376 0.4632868 1.315441 -0.7949788 0.4528302 -1.669687 -0.5797082
1.274704 2.24822 1.713249 0.0951158 0.5222545 -0.1289578 0.194308 0.3991444 0.09037646 0.8073696
0.2217838 0.4295782 -1.813516 -0.4696116 -0.528861 1.221629 -0.3379251 0.4501157 1.063187 -0.8068768
-0.7766887 0.16624 -2.288153 -0.4160714 -1.767208 1.650265 -0.09482656 0.7239718 1.674501 -0.337724
0.6437133 0.8992291 0.2513838 0.3071226 -0.7161949 -0.2584729 -0.3461408 -0.2648338 -0.5883601 2.020444
0.2022693 1.255318 0.04235969 -0.9376417 0.2202827 -0.7303626 1.5873 0.303936 0.07054333 0.7305137
-0.8651643 -1.397236 2.173201 0.952937 -0.7412236 -0.6548291 0.2916906 1.278351 0.8597506 1.338301
0.2969043 -1.022353 -0.7740811 0.4073752 0.5100162 0.8319114 0.07385683 1.051668 -1.536324 0.4150802
0.2193075 -0.8688173 -0.4630062 -0.06332011 -0.5115235 1.290391 0.5995636 0.5403185 -0.6856702 0.2367141
-0.5712466 0.5258741 -2.174694 0.5614925 0.74067 1.034044 1.398752 1.365143 -0.07689686 0.05535303
0.5235386 0.3757679 -0.549847 0.004001837 0.1880562 1.69428 1.30117 -0.2353991 -0.03886787 -1.090409
0.6908711 0.5550594 -1.023181 -0.4066023 -0.8762694 0.07505472 -1.063213 -0.6214838 1.833118 1.888949
-0.2792832 -0.4709762 -0.9730195 0.2008328 -1.931395 0.2342304 -2.914866 -1.114229 0.825875 -1.355448
-0.02262152 -1.402038 0.7084672 0.07350501 0.4065046 1.302413 -1.31048 -0.9011539 1.951154 0.0584148
0.2448596 1.447673 -0.03365485 0.08896719 1.40456 -0.3869752 0.9893866 0.08210403 -1.596521 -0.3783291
-0.7744496 0.5940079 0.02134512 -0.2562832 0.327472 -1.624582 -0.66649 -0.636497 0.7799186 0.02803963
-0.7407516 0.2285331 -1.222414 -0.04474711 0.4226713 -0.1407853 -1.582728 1.043442 0.06842048 0.6725934
-0.5920207 1.630846 -0.3350121 -0.5079578 1.224342 -1.591498 0.3088782 -0.6361948 1.303938 -0.07523344
-0.8530812 -0.5218771 2.257517 -0.4247366 -1.587795 0.5727189 0.01598472 0.07986726 0.8489417 -0.3905431
0.2220939 -0.1237862 1.744949 0.7126451 0.05852249 -0.2768869 0.8665592 2.073604 -2.678686 2.799653
-1.069022 -1.751027 -0.4351121 -0.6349723 -1.378916 -0.3661324 -0.1996259 0.5727591 1.271363 0.2061786
0.06520129 0.098966 -1.533682 -0.9315909 -0.7883505 3.137852 1.089396 -0.8936418 -0.716628 -0.3611111
-0.1145535 -0.9136128 -0.2071535 1.383951 0.9034359 0.6577756 1.620784 1.380942 -0.1645174 -1.591228
-0.8797185 0.5174294 1.868755 1.467507 0.584825 -0.5076694 -0.01174632 -0.1587505 0.9264733 -1.092926
-1.117405 -0.8793297 -0.8906659 0.350748 0.01776977 -0.2707683 -0.9361326 0.07628034 -0.3657428 -0.671371
1.045042 0.1829641 0.3749981 0.2533515 0.3324894 -0.6258504 -0.1148897 1.041481 -0.2816344 -1.632296
0.1250867 -0.8786282 0.2670005 0.1805506 0.1706441 0.1735841 -1.424315 1.572035 -0.9578752 1.057686
-0.1783196 0.08112777 0.8998488 1.254943 0.5121069 -0.6222851 0.1244218 -0.9845616 -0.4801548 0.2229852
-0.09673184 -0.4443175 0.9100019 -0.5749839 0.4020878 -1.077424 -0.9292251 -0.5745852 1.951845 1.011799
0.5320113 1.360488 0.6975992 -0.686537 0.6879764 0.2664631 -0.9785697 -0.6763161 0.05178976 -1.646987
0.2192593 -0.8424352 0.6230342 -1.216046 -0.71165 -0.7107143 0.1524959 -0.09143457 -0.1798235 0.7985312
1.325135 -0.2905784 1.203613 -0.7547151 -0.7975029 -1.306793 -0.2587302 -1.567188 -1.399008 1.023235
-0.2168075 -0.3774018 -0.4315252 -0.4458617 2.31734 0.05059351 1.302124 0.6493985 0.6746433 0.482397
-0.7907017 -0.3662356 1.341345 -0.0960068 0.2644979 -0.8002622 1.299594 0.6554791 -2.777083 -0.3192445
0.1828121 1.287486 -0.4172262 1.088772 1.125813 -0.9691804 0.1845511 -0.6230599 -0.1672319 -1.922616
-1.195421 -0.2601635 -0.903543 0.556931 0.6797721 -0.8904558 1.021174 -1.667637 1.77215 0.09343192
1.70903 -0.9881319 -0.4539545 -1.15891 -0.2189068 1.209394 1.433259 -1.740327 -1.012603 0.7494091
1.422247 -0.9646219 -0.6964517 0.961431 -1.970981 -0.3722265 -0.5382787 -0.439164 0.6017609 -2.8556
0.1328372 0.008040876 0.437036 -0.0115465 0.0332456 -0.5260607 0.02640004 -0.4910754 -1.831208 0.8600174
1.145648 1.13926 0.0220105 0.1943382 0.006134818 -1.878869 -1.378488 -1.475338 0.5847746 -0.4494443 ]

THIS FUNCTION
tensor([[-1.3104, -0.2685, 0.0622, -1.0584, 0.4633, 1.3154, -0.7950, 0.4528,
-1.6697, -0.5797],
[ 1.2747, 2.2482, 1.7132, 0.0951, 0.5223, -0.1290, 0.1943, 0.3991,
0.0904, 0.8074],
[ 0.2218, 0.4296, -1.8135, -0.4696, -0.5289, 1.2216, -0.3379, 0.4501,
1.0632, -0.8069],
[-0.7767, 0.1662, -2.2882, -0.4161, -1.7672, 1.6503, -0.0948, 0.7240,
1.6745, -0.3377],
[ 0.6437, 0.8992, 0.2514, 0.3071, -0.7162, -0.2585, -0.3461, -0.2648,
-0.5884, 2.0204],
[ 0.2023, 1.2553, 0.0424, -0.9376, 0.2203, -0.7304, 1.5873, 0.3039,
0.0705, 0.7305],
[-0.8652, -1.3972, 2.1732, 0.9529, -0.7412, -0.6548, 0.2917, 1.2784,
0.8598, 1.3383],
[ 0.2969, -1.0224, -0.7741, 0.4074, 0.5100, 0.8319, 0.0739, 1.0517,
-1.5363, 0.4151],
[ 0.2193, -0.8688, -0.4630, -0.0633, -0.5115, 1.2904, 0.5996, 0.5403,
-0.6857, 0.2367],
[-0.5712, 0.5259, -2.1747, 0.5615, 0.7407, 1.0340, 1.3988, 1.3651,
-0.0769, 0.0554],
[ 0.5235, 0.3758, -0.5498, 0.0040, 0.1881, 1.6943, 1.3012, -0.2354,
-0.0389, -1.0904],
[ 0.6909, 0.5551, -1.0232, -0.4066, -0.8763, 0.0751, -1.0632, -0.6215,
1.8331, 1.8889],
[-0.2793, -0.4710, -0.9730, 0.2008, -1.9314, 0.2342, -2.9149, -1.1142,
0.8259, -1.3554],
[-0.0226, -1.4020, 0.7085, 0.0735, 0.4065, 1.3024, -1.3105, -0.9012,
1.9512, 0.0584],
[ 0.2449, 1.4477, -0.0337, 0.0890, 1.4046, -0.3870, 0.9894, 0.0821,
-1.5965, -0.3783],
[-0.7744, 0.5940, 0.0213, -0.2563, 0.3275, -1.6246, -0.6665, -0.6365,
0.7799, 0.0280],
[-0.7408, 0.2285, -1.2224, -0.0447, 0.4227, -0.1408, -1.5827, 1.0434,
0.0684, 0.6726],
[-0.5920, 1.6308, -0.3350, -0.5080, 1.2243, -1.5915, 0.3089, -0.6362,
1.3039, -0.0752],
[-0.8531, -0.5219, 2.2575, -0.4247, -1.5878, 0.5727, 0.0160, 0.0799,
0.8489, -0.3905],
[ 0.2221, -0.1238, 1.7449, 0.7126, 0.0585, -0.2769, 0.8666, 2.0736,
-2.6787, 2.7997],
[-1.0690, -1.7510, -0.4351, -0.6350, -1.3789, -0.3661, -0.1996, 0.5728,
1.2714, 0.2062],
[ 0.0652, 0.0990, -1.5337, -0.9316, -0.7884, 3.1379, 1.0894, -0.8936,
-0.7166, -0.3611],
[-0.1146, -0.9136, -0.2072, 1.3840, 0.9034, 0.6578, 1.6208, 1.3809,
-0.1645, -1.5912],
[-0.8797, 0.5174, 1.8688, 1.4675, 0.5848, -0.5077, -0.0117, -0.1588,
0.9265, -1.0929],
[-1.1174, -0.8793, -0.8907, 0.3507, 0.0178, -0.2708, -0.9361, 0.0763,
-0.3657, -0.6714],
[ 1.0450, 0.1830, 0.3750, 0.2534, 0.3325, -0.6259, -0.1149, 1.0415,
-0.2816, -1.6323],
[ 0.1251, -0.8786, 0.2670, 0.1806, 0.1706, 0.1736, -1.4243, 1.5720,
-0.9579, 1.0577],
[-0.1783, 0.0811, 0.8998, 1.2549, 0.5121, -0.6223, 0.1244, -0.9846,
-0.4802, 0.2230],
[-0.0967, -0.4443, 0.9100, -0.5750, 0.4021, -1.0774, -0.9292, -0.5746,
1.9518, 1.0118],
[ 0.5320, 1.3605, 0.6976, -0.6865, 0.6880, 0.2665, -0.9786, -0.6763,
0.0518, -1.6470],
[ 0.2193, -0.8424, 0.6230, -1.2160, -0.7116, -0.7107, 0.1525, -0.0914,
-0.1798, 0.7985],
[ 1.3251, -0.2906, 1.2036, -0.7547, -0.7975, -1.3068, -0.2587, -1.5672,
-1.3990, 1.0232],
[-0.2168, -0.3774, -0.4315, -0.4459, 2.3173, 0.0506, 1.3021, 0.6494,
0.6746, 0.4824],
[-0.7907, -0.3662, 1.3413, -0.0960, 0.2645, -0.8003, 1.2996, 0.6555,
-2.7771, -0.3192],
[ 0.1828, 1.2875, -0.4172, 1.0888, 1.1258, -0.9692, 0.1846, -0.6231,
-0.1672, -1.9226],
[-1.1954, -0.2602, -0.9035, 0.5569, 0.6798, -0.8905, 1.0212, -1.6676,
1.7722, 0.0934],
[ 1.7090, -0.9881, -0.4540, -1.1589, -0.2189, 1.2094, 1.4333, -1.7403,
-1.0126, 0.7494],
[ 1.4222, -0.9646, -0.6965, 0.9614, -1.9710, -0.3722, -0.5383, -0.4392,
0.6018, -2.8556],
[ 0.1328, 0.0080, 0.4370, -0.0115, 0.0332, -0.5261, 0.0264, -0.4911,
-1.8312, 0.8600],
[ 1.1456, 1.1393, 0.0220, 0.1943, 0.0061, -1.8789, -1.3785, -1.4753,
0.5848, -0.4494]])
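
The numbers above shift each column by a constant (in the first column, -1.2092 becomes -1.3104 and 1.3759 becomes 1.2747, a shift of about 0.1011 in both cases), which is what one would expect if every frame is normalized by the mean of the whole 40-frame utterance. A minimal sketch of why, assuming the window selection follows Kaldi's SlidingWindowCmnInternal and a minimum window of 100 frames (the `min_window` value is an assumption, it is not stated in this thread); `window_bounds` and `sliding_cmn` are hypothetical names, not the PR's code:

```python
def window_bounds(t, num_frames, cmn_window=600, min_window=100, center=False):
    """Return the half-open frame range [start, end) used to normalize frame t."""
    if center:
        start = t - cmn_window // 2
        end = start + cmn_window
    else:
        start = t - cmn_window
        end = t + 1
    if start < 0:
        # Shift the window right when it would begin before the first frame.
        end -= start
        start = 0
    if not center and end > t:
        end = max(t + 1, min_window)
    if end > num_frames:
        # Shift the window left when it would run past the last frame.
        start -= end - num_frames
        end = num_frames
        if start < 0:
            start = 0
    return start, end


def sliding_cmn(frames, cmn_window=600, min_window=100, center=False):
    """Subtract the per-dimension window mean from every frame (mean only)."""
    num_frames = len(frames)
    dim = len(frames[0]) if frames else 0
    output = []
    for t, frame in enumerate(frames):
        start, end = window_bounds(t, num_frames, cmn_window, min_window, center)
        count = end - start
        means = [
            sum(frames[i][d] for i in range(start, end)) / count
            for d in range(dim)
        ]
        output.append([x - m for x, m in zip(frame, means)])
    return output


# With 40 frames and a 600-frame window, every frame gets the same window.
print({window_bounds(t, 40) for t in range(40)})  # {(0, 40)}
```

The loop recomputes each window mean from scratch for readability; a running sum would be the obvious optimization for long utterances.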

Thanks!!!

Collaborator

Thanks I will work on testing.

@vincentqb
Contributor

Let's keep transforms and functionals in those files, and let's not add functionality to torchaudio.compliance.kaldi for now.

If we are to keep torchaudio.compliance.kaldi, it should be only a thin wrapper around our transforms anyway. We could instead consider deprecating this module completely. We can have another discussion around this, independent of this PR.

@mthrok
Collaborator

mthrok commented Apr 15, 2020

@wanglong001 It seems to me that you use cmn and cmvn interchangeably.
I looked at the Kaldi code, and Kaldi only uses cmn (SlidingWindowCmnInternal and SlidingWindowCmnOptions), except in the command name apply-cmvn-sliding.

I do not know much about the definitions but, unless you have a strong opinion,
I would like to use cmn in our code base. This will also make it easy to write the test I am working on.
Your thoughts?

Collaborator

@mthrok mthrok left a comment

Please see my comments.

@wanglong001
Contributor Author

@mthrok Hi, the v stands for variance, which is optional. It would be better to use cmn in our code base.
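
To illustrate the distinction: mean normalization (CMN) subtracts the per-dimension mean, and the optional variance step (the "v" in CMVN) additionally scales each dimension to unit variance. A minimal sketch over a whole utterance, without any sliding window; `normalize` is a hypothetical helper, and the use of the population variance with a small floor is an assumption made for this example:

```python
import math


def normalize(frames, norm_vars=False, floor=1e-10):
    """Mean-normalize each column of a non-empty matrix; optionally scale to unit variance."""
    num_frames = len(frames)
    dim = len(frames[0])
    output = [list(frame) for frame in frames]
    for d in range(dim):
        column = [frame[d] for frame in frames]
        mean = sum(column) / num_frames
        scale = 1.0
        if norm_vars:
            # Population variance (divide by N), floored to avoid division by zero.
            variance = sum((x - mean) ** 2 for x in column) / num_frames
            scale = 1.0 / math.sqrt(max(variance, floor))
        for t in range(num_frames):
            output[t][d] = (frames[t][d] - mean) * scale
    return output


frames = [[1.0, 10.0], [3.0, 30.0]]
cmn = normalize(frames)                   # mean only
cmvn = normalize(frames, norm_vars=True)  # mean and variance
```

With `norm_vars=False` the two columns keep their different spreads; with `norm_vars=True` both end up on the same scale.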

@mthrok
Collaborator

mthrok commented Apr 16, 2020

@wanglong001

Thanks!

If you are interested, we can also add a batch consistency test for sliding_window_cmn similar to this.
If you do not have time for that, let me know. I can do that alongside the Kaldi compatibility test I am working on.

FYI: The test you added passed on GPU too.
$ pytest test/test_torchscript_consistency.py -vv -k cmn
============================================================================================================ test session starts =============================================================================================================
platform linux -- Python 3.8.2, pytest-5.4.1, py-1.8.1, pluggy-0.13.1 -- /home/moto/conda/envs/PY3.8-cuda101/bin/python
cachedir: .pytest_cache
hypothesis profile 'default' -> database=DirectoryBasedExampleDatabase('/scratch/moto/torchaudio/.hypothesis/examples')
rootdir: /scratch/moto/torchaudio
plugins: hypothesis-5.8.3
collected 96 items / 92 deselected / 4 selected

test/test_torchscript_consistency.py::TestFunctionalCPU::test_sliding_window_cmn PASSED                                                                                                                                                [ 25%]
test/test_torchscript_consistency.py::TestFunctionalCUDA::test_sliding_window_cmn PASSED                                                                                                                                               [ 50%]
test/test_torchscript_consistency.py::TestTransformsCPU::test_SlidingWindowCmn PASSED                                                                                                                                                  [ 75%]
test/test_torchscript_consistency.py::TestTransformsCUDA::test_SlidingWindowCmn PASSED                                                                                                                                                 [100%]

====================================================================================================== 4 passed, 92 deselected in 3.41s ======================================================================================================

@wanglong001
Contributor Author

@mthrok Hi, I do not have time at the moment. Thanks.

@mthrok
Collaborator

mthrok commented Apr 17, 2020

@wanglong001 Thanks for letting us know.

@vincentqb This PR is ready to merge. I can follow up on other types of tests (and batching) in another PR.

@vincentqb
Contributor

Rebased, and merging. Thanks!

@vincentqb vincentqb merged commit b42d610 into pytorch:master Apr 17, 2020

Successfully merging this pull request may close these issues.

KALDI:apply-cmvn-sliding
3 participants