TensorFlow implementation of Conformer - Transformer-based model for Speech Recognition

thanhtvt/conformer

Conformer

About

This is an implementation of Conformer [1] in TensorFlow 2.

Note: This repository is still in development and constantly evolving. New features and updates will appear over time.

Motivation

I have seen many Conformer implementations ([2], [3]), but none of them is in TensorFlow. Therefore, I wanted to challenge myself to implement this model in my favorite framework.

Installation

You should have Python 3.7 or higher. I highly recommend creating a virtual environment with a tool such as venv or conda.

The main part of this project uses only TensorFlow 2. However, I also use TensorFlow I/O for feature augmentation (which is not needed right now).

Commands for installing the dependencies (a setuptools-based install is not available right now):

pip install tensorflow
pip install tensorflow-io   # optional
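
To confirm the requirements above before running anything, a small check like the following may help. It is only a sketch: `check_environment` is a hypothetical helper that is not part of this repository, and it only tests whether the packages can be located, not which versions are installed.

```python
import sys
from importlib import util


def check_environment(min_python=(3, 7)):
    """Report whether the requirements listed in this README appear to be met.

    Hypothetical helper for illustration; it is not part of the repository.
    """
    return {
        # Python 3.7 or higher is required.
        "python_ok": sys.version_info >= min_python,
        # TensorFlow 2 is the only hard dependency.
        "tensorflow": util.find_spec("tensorflow") is not None,
        # TensorFlow I/O is optional (installed as tensorflow-io, imported as tensorflow_io).
        "tensorflow_io": util.find_spec("tensorflow_io") is not None,
    }


if __name__ == "__main__":
    for name, ok in check_environment().items():
        print(f"{name}: {'ok' if ok else 'missing'}")
```

A missing `tensorflow_io` entry is fine for now, since feature augmentation is not needed yet.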

Usage

import tensorflow as tf
from conformer import Conformer

batch_size, seq_len, input_dim = 3, 15, 256

model = Conformer(
    num_conv_filters=[512, 512], 
    num_blocks=1, 
    encoder_dim=512, 
    num_heads=8, 
    dropout_rate=0.4, 
    num_classes=10, 
    include_top=True
)

# Get sample input
inputs = tf.random.uniform((batch_size, seq_len, input_dim),
                            minval=-40,
                            maxval=40)

# Convert to 4-dimensional tensor to fit Conv2D
inputs = tf.expand_dims(inputs, axis=1)  

# Get output
outputs = model(inputs)     # [batch_size, 1, seq_len, num_classes]
outputs = tf.squeeze(outputs, axis=1)
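
The reshaping steps in the example above can be checked without the model. The sketch below uses a random stand-in tensor in place of the real model output, so its values are meaningless. It assumes the last axis holds per-class scores, as the shape comment above suggests. The per-frame `argmax` is only an illustration of reading that axis, not a decoder provided by this repository.

```python
import tensorflow as tf

batch_size, seq_len, input_dim, num_classes = 3, 15, 256, 10

# Same sample input as in the usage example.
inputs = tf.random.uniform((batch_size, seq_len, input_dim), minval=-40, maxval=40)

# Adding an axis at position 1 gives the 4-dimensional tensor the model expects.
inputs_4d = tf.expand_dims(inputs, axis=1)
print(inputs_4d.shape)  # (3, 1, 15, 256)

# Stand-in for the model output (assumption: [batch_size, 1, seq_len, num_classes]).
fake_outputs = tf.random.uniform((batch_size, 1, seq_len, num_classes))

# Remove the singleton axis again, as in the usage example.
squeezed = tf.squeeze(fake_outputs, axis=1)
print(squeezed.shape)  # (3, 15, 10)

# Illustration only: index of the highest-scoring class for each frame.
frame_ids = tf.argmax(squeezed, axis=-1)
print(frame_ids.shape)  # (3, 15)
```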

References

[1] Conformer: Convolution-augmented Transformer for Speech Recognition 🔗

[2] @sooftware's PyTorch implementation 🔗

[3] @jaketae's PyTorch implementation 🔗
