
Implement specaug as preprocessing and unify egs/wsj/asr1 and asr2 #745

Merged: 15 commits into espnet:v.0.4.0 on May 29, 2019

Conversation

@ShigekiKarita (Contributor) commented May 23, 2019

@kan-bayashi (Contributor) commented May 23, 2019

Thanks Shigeki.
I think your specaug is now a numpy array operation, so it can be integrated with @kamo-naoyuki's preprocess-conf. Why don't you do that?

@ShigekiKarita (Contributor, Author) commented May 23, 2019

Cool. My impl is still buggy and WIP. Let me summarize my thoughts:

  • @bobchennan impl pros: differentiable w.r.t. fbank (maybe?), similar to TF in terms of the sparse_image_warp reimplementation (see this discussion: https://github.com/mozilla/DeepSpeech/pull/2090/files#r281919349)
  • @bobchennan impl cons: slow with Transformer (4.0 iter/sec -> 3.3 iter/sec); maybe it doesn't matter for RNN
  • My impl pros: faster with Transformer (4.0 iter/sec -> 4.0 iter/sec); see the sketch after this list
  • My impl cons: non-differentiable w.r.t. fbank, only similar to TF up to second-order spline interpolation
  • Integration with preprocess (transform)? I'm not sure, because currently the transform does not distinguish between training and evaluation. Maybe we need a big change there.
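
For context, here is a rough sketch of the PIL-resize-based time warp mentioned above (my own illustration, not the exact code in this PR; the function name and the (time, freq) layout are assumptions):

import numpy as np
from PIL import Image

def pil_time_warp(x: np.ndarray, warp: int) -> np.ndarray:
    """Warp a (time, freq) spectrogram so the center frame moves by `warp` frames."""
    t, f = x.shape
    center = t // 2
    x = x.astype(np.float32)  # PIL mode 'F' expects float32
    # Stretch the left half to `center + warp` frames and squeeze the right half,
    # so the total number of frames stays `t`.
    left = Image.fromarray(x[:center]).resize((f, center + warp), Image.BICUBIC)
    right = Image.fromarray(x[center:]).resize((f, t - center - warp), Image.BICUBIC)
    return np.concatenate([np.array(left), np.array(right)], axis=0)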
@kan-bayashi (Contributor) commented May 23, 2019

Integration with preprocess (transform)? I'm not sure, because currently the transform does not distinguish between training and evaluation. Maybe we need a big change there.

load_tr = LoadInputsAndTargets(
    mode='asr', load_output=True, preprocess_conf=args.preprocess_conf,
    preprocess_args={'train': True}  # Switch the mode of preprocessing
)
load_cv = LoadInputsAndTargets(
    mode='asr', load_output=True, preprocess_conf=args.preprocess_conf,
    preprocess_args={'train': False}  # Switch the mode of preprocessing
)

I think there is already a train flag option.
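
For illustration, a transform that honors this flag could look like the following minimal sketch (the class name is hypothetical, and I'm assuming the train value from preprocess_args is forwarded to the transform as a keyword argument):

class MyAugment:
    """Hypothetical train-aware transform."""

    def __init__(self, scale=1.0):
        self.scale = scale

    def __call__(self, x, train=True):
        # Only perturb features during training; leave them untouched at eval time.
        if not train:
            return x
        return x * self.scale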

@ShigekiKarita (Contributor, Author) commented May 23, 2019

I did not know that. Great! I will work on that.

@ShigekiKarita (Contributor, Author) commented May 23, 2019

So I will create something like:

class SpecAugment(TransformBase):
    """SpecAugment.

    Apply random time warping and time/freq masking.
    The default setting is based on LD (LibriSpeech double) in Table 2 of
    https://arxiv.org/pdf/1904.08779.pdf

    :param str resize_mode: "PIL" (fast, non-differentiable) or "sparse_image_warp" (slow, differentiable)
    :param int max_time_warp: maximum frames to warp the center frame in the spectrogram (W)
    :param int max_freq_mask: maximum width of each random freq mask (F)
    :param int n_freq_mask: the number of random freq masks (m_F)
    :param int max_time_mask: maximum width of each random time mask (T)
    :param int n_time_mask: the number of random time masks (m_T)
    """

    def __init__(
            self, resize_mode="PIL", max_time_warp=80,
            max_freq_mask=27, n_freq_mask=2,
            max_time_mask=100, n_time_mask=2):
        ...
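
For reference, a minimal numpy sketch of the freq/time masking described in the docstring above (a hypothetical helper of my own, not the code added by this PR):

import numpy as np

def mask_along_axis(x, max_width, n_mask, axis):
    """Zero out `n_mask` random bands of width < `max_width` along `axis` of a (time, freq) array."""
    x = x.copy()
    size = x.shape[axis]
    for _ in range(n_mask):
        width = np.random.randint(0, max_width)
        start = np.random.randint(0, max(1, size - width))
        if axis == 0:
            x[start:start + width, :] = 0.0
        else:
            x[:, start:start + width] = 0.0
    return x

# LD setting: freq masking (F=27, m_F=2) followed by time masking (T=100, m_T=2)
# x_aug = mask_along_axis(mask_along_axis(x, 27, 2, axis=1), 100, 2, axis=0)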
@kan-bayashi (Contributor) commented May 23, 2019

Nice :)

@ShigekiKarita (Contributor, Author) commented May 23, 2019

This is a tensorboard plot. The Y-axis is train or valid acc, and the X-axis is time (hours). The blue line is the base recipe, the red line is my specaug impl, and the cyan line is the previous specaug impl.
At this point, my impl seems to have less overhead. In any case, convergence does not get much slower and accuracy does not drop much in either setting. Maybe I need to tweak something from the RNN config?
(two tensorboard screenshots)

@ShigekiKarita (Contributor, Author) commented May 23, 2019

Hmm, Transformer + specaug performs worse than the baseline. Please wait a moment for the @bobchennan impl numbers because that run is still in progress.

| recipe | WER dev93 | WER eval92 |
| --- | --- | --- |
| Transformer (run_pytorch.sh) | 8.2 | 4.8 |
| +specaug LD setting (run_pytorch_specaug_new.sh) | 8.2 | 5.5 |
| +specaug @bobchennan setting (run_pytorch_specaug_prev.sh) | 7.6 | 5.4 |
| +specaug @bobchennan setting (run.sh --preprocess-conf conf/preprocess.conf) | 8.0 | 5.1 |

@ShigekiKarita changed the title from "[WIP] Update Transformer WSJ recipe with specaug" to "[WIP] Implement specaug as preprocessing" on May 23, 2019

@ShigekiKarita (Contributor, Author) commented May 23, 2019

After refactoring some espnet.transform modules and the Transformer recipe, I find the name "transform" very annoying. In fact, I sometimes confuse "transform" with "Transformer" 🤒

I suspect that espnet.transform is derived from torchvision. Maybe we can rename it to something like:

  1. espnet.preprocess (most relevant to the --preprocess-conf arg)
  2. espnet.signal (like scipy?)
  3. espnet.feat (like kaldi?)
  4. espnet.frontend

Which one is best? Suggestions for a better name are welcome.

@ShigekiKarita added this to the v.0.4.0 milestone on May 23, 2019

@kamo-naoyuki (Contributor) commented May 23, 2019

I don't care much about this confusion, but how about espnet.processings? It's not necessary to be pre.

@ShigekiKarita (Contributor, Author) commented May 23, 2019

In my opinion, at least espnet.transform and --preprocess-conf should be made consistent. I mean:

  • If we keep espnet.transform, we should change the cmd arg to --transform-conf
  • If we keep --preprocess-conf, we should change the module name to espnet.preprocess
@ShigekiKarita (Contributor, Author) commented May 23, 2019

It's not necessary to be pre.

Hmm. I'm not sure what you mean. I expect this transform module to be used mainly inside the data loader/iterator. That makes it data "pre"paration, i.e., preprocessing. Is that true?

@kan-bayashi (Contributor) commented May 23, 2019

I agree with espnet.preprocess.

@ShigekiKarita (Contributor, Author) commented May 23, 2019

I'm now testing @kamo-naoyuki's YAML config. This is amazing. In fact, we can merge asr1 and asr2 easily because their only differences are the YAML and the model averaging stage. What do you think about it?

@kamo-naoyuki (Contributor) commented May 24, 2019

Basically, I'm not particular about the naming of this module.

I named it transform instead of preprocessing because these modules are just a set of transformation functions, and they do not necessarily have to be applied before the main processing.
For example, someone might want to apply them after asr_enhance.py. I didn't want the naming to limit future development.

@ShigekiKarita (Contributor, Author) commented May 24, 2019

gotcha

@ShigekiKarita (Contributor, Author) commented May 24, 2019

I updated the Transformer result in #745 (comment).
Maybe we need more tweaks there, but I'm going back to the asr1 recipe to check that RNN + specaug can reproduce @bobchennan's result.

@ShigekiKarita changed the title from "[WIP] Implement specaug as preprocessing" to "Implement specaug as preprocessing" on May 24, 2019

@sw005320 (Contributor) left a comment

I'll review again once you get a result.

Review comments (outdated, resolved) on: egs/wsj/asr2/conf/decode_chainer.yaml, egs/wsj/asr2/conf/decode_pytorch.yaml, egs/wsj/asr2/conf/preprocess.json
@ShigekiKarita (Contributor, Author) commented May 26, 2019

For a differentiable specaug, we can use torch.nn.functional.interpolate in torch 1.1.0 (pytorch/pytorch#9849).
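
A minimal sketch of what a differentiable time warp could look like with torch.nn.functional.interpolate (the function name and the (time, freq) layout are my assumptions, not code from this PR):

import torch
import torch.nn.functional as F

def time_warp_differentiable(spec: torch.Tensor, warp: int) -> torch.Tensor:
    """spec: (time, freq) tensor; warp: shift of the center frame in frames."""
    t = spec.size(0)
    center = t // 2
    # Stretch the left part to `center + warp` frames and squeeze the right part
    # to `t - center - warp` frames, so the total length stays `t`.
    left = spec[:center].t().unsqueeze(0)    # (1, freq, center)
    right = spec[center:].t().unsqueeze(0)   # (1, freq, t - center)
    left = F.interpolate(left, size=center + warp, mode="linear", align_corners=False)
    right = F.interpolate(right, size=t - center - warp, mode="linear", align_corners=False)
    # Gradients flow through interpolate, so this stays differentiable w.r.t. `spec`.
    return torch.cat([left, right], dim=2).squeeze(0).t()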

@ShigekiKarita (Contributor, Author) commented May 27, 2019

Hmm, I cannot reproduce the result from #734 (comment).

| recipe | dev CER | dev WER | eval CER | eval WER | training time (sec) |
| --- | --- | --- | --- | --- | --- |
| RNN baseline | 3.9 | 8.6 | 2.8 | 5.1 | 28625 |
| RNN specaug (sparse_image_warp @bobchennan) | 3.9 | 8.6 | 2.8 | 6.1 | 30448 |
| RNN specaug (PIL resize) | 3.9 | 8.6 | 2.8 | 6.1 | 30050 |
@ShigekiKarita (Contributor, Author) commented May 27, 2019

I used the default values in @bobchennan's specaug. Were they tweaked for that result?

@ShigekiKarita changed the title from "Implement specaug as preprocessing" to "[WIP] Implement specaug as preprocessing" on May 27, 2019

@ShigekiKarita changed the title from "[WIP] Implement specaug as preprocessing" to "Implement specaug as preprocessing and unify egs/wsj/asr1 and asr2" on May 27, 2019

@ShigekiKarita (Contributor, Author) commented May 27, 2019

To compare the Transformer and RNN results above, I made a unified run.sh as discussed in #758 with @sw005320 @kan-bayashi @kamo-naoyuki.

@ShigekiKarita requested review from kan-bayashi and bobchennan on May 27, 2019

@ShigekiKarita (Contributor, Author) commented May 27, 2019

In fact, this unification is great when searching for a new method across both RNN and Transformer:

for net in pytorch_transformer rnn; do
  for pre in no_preprocess specaug; do
    # I'm using slurm. Do not try this without such a job system.
    ./run.sh --ngpu 1 --stage 4 --backend pytorch \
      --train-config conf/tuning/train_${net}.yaml \
      --decode-config conf/tuning/decode_${net}.yaml \
      --preprocess-config conf/${pre}.yaml &
  done
done
wait

In addition, you can edit params in the yaml files with utils/change_yaml.py inside this script. COOL.

@bobchennan (Contributor) commented May 27, 2019

Sorry for the late reply. I am on vacation for these two weeks.

You should be able to reproduce my results with the commits before this one. In this commit I fixed a bug in the previous implementation: the specaug implementation we used takes 1 * feature_dim * feature_len as input, but we usually use 1 * feature_len * feature_dim. I tried to fix this problem in that commit; however, I didn't verify its results.

Please change it back and test it.
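
To illustrate the layout issue (my assumption about the shapes, not the exact code in the commit): if an implementation expects (1, feature_dim, feature_len) but the features come as (feature_len, feature_dim), the input has to be transposed before and after the call.

import numpy as np

feat = np.random.randn(300, 80)   # (feature_len, feature_dim), as usually loaded
x = feat.T[None, :, :]            # (1, feature_dim, feature_len), the old layout
# x = specaug(x)                  # apply the old-layout implementation here (hypothetical call)
feat_aug = x[0].T                 # back to (feature_len, feature_dim)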

@ShigekiKarita (Contributor, Author) commented May 28, 2019

Thanks @bobchennan! I will check that.

BTW, I have almost no time to check the details and work on the reproduction this week. Maybe it is better to merge this PR with the unified asr1 and the draft preprocessing specaug. I will rework the specaug part in a later PR.

@sw005320 (Contributor) commented May 29, 2019

OK for me. @bobchennan, do you have time to review this PR?

@sw005320 (Contributor) commented May 29, 2019

I'll merge it, but please make sure to reproduce @bobchennan's results with later PRs.

@sw005320 merged commit 0c88bba into espnet:v.0.4.0 on May 29, 2019

1 check passed: continuous-integration/travis-ci/pr (The Travis CI build passed)
@kamo-naoyuki (Contributor) commented on espnet/transform/transformation.py in b4262c2, Jun 15, 2019

@ShigekiKarita I encountered the following error:

  File "/data/work/espnet_0427/egs/an4/asr1/../../../espnet/bin/asr_train.py", line 371, in <module>
    main(sys.argv[1:])
  File "/data/work/espnet_0427/egs/an4/asr1/../../../espnet/bin/asr_train.py", line 359, in main
    train(args)
  File "/data/work/espnet_0427/espnet/asr/pytorch_backend/asr.py", line 335, in train
    preprocess_args={'train': True}  # Switch the mode of preprocessing
  File "/data/work/espnet_0427/espnet/utils/io_utils.py", line 54, in __init__
    self.preprocessing = Transformation(preprocess_conf)
  File "/data/work/espnet_0427/espnet/transform/transformation.py", line 86, in __init__
    check_kwargs(class_obj, opts)
  File "/data/work/espnet_0427/espnet/utils/check_kwargs.py", line 20, in check_kwargs
    raise TypeError(f"{name}() got an unexpected keyword argument '{k}'")
TypeError: TimeWarp() got an unexpected keyword argument 'max_time_warp'
# Accounting: time=4 threads=1
# Ended (code 1) at Sat Jun 15 18:16:46 JST 2019, elapsed time 4 seconds

The TimeWarp class itself has only a **kwargs argument, and check_kwargs can't handle such arguments.
I think we don't need to check the arguments here; we should just show the original Python errors.
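
A small repro of the failure mode (a sketch of my own, not the actual espnet code): signature-based keyword checking sees no named parameters on a class whose __init__ only takes **kwargs, so every option looks "unexpected" even though the call itself would succeed.

import inspect

class TimeWarpLike:
    def __init__(self, **kwargs):
        self.opts = kwargs

params = inspect.signature(TimeWarpLike.__init__).parameters
print(list(params))            # prints ['self', 'kwargs']: no 'max_time_warp' here
print(params['kwargs'].kind)   # VAR_KEYWORD
# A naive check that only accepts named parameters would therefore raise
# TypeError for max_time_warp, although the call below is perfectly valid.
TimeWarpLike(max_time_warp=80)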

@ShigekiKarita (Contributor, Author) replied Jun 15, 2019

I hadn't expected this. OK.

@@ -21,6 +21,8 @@ seed=1
# feature configuration
do_delta=false

# config files
preprocess_config=conf/no_preprocess.yaml # use conf/specaug.yaml for data augmentation

@jiayu2B commented Jul 10, 2019

Does this comment mean that if I change this line to preprocess_config=conf/specaug.yaml, the spec_augment module will work? If not, how can I use the specaug function?

@bobchennan (Contributor) replied Jul 10, 2019

check #830

@ShigekiKarita (Author, Contributor) replied Jul 18, 2019

@jiayu2B yes
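
For example (assuming the unified run.sh from this PR), you can either edit the variable in run.sh or override it on the command line:

# in run.sh
preprocess_config=conf/specaug.yaml

# or as a command-line override
./run.sh --preprocess-config conf/specaug.yaml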
