# EasyMMS notebook example
[EasyMMS](https://github.com/abdeladim-s/easymms) is a simple Python package that makes it easy to use [Meta's Massively Multilingual Speech (MMS) project](https://github.com/facebookresearch/fairseq/tree/main/examples/mms).


# Dependencies

In [None]:
import os

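# Audio command-line tools (-y answers the install prompt automatically)
!apt install -y ffmpeg
!apt install -y sox
!pip install -U --pre torchaudio --index-url https://download.pytorch.org/whl/nightly/cu118
!pip install git+https://github.com/abdeladim-s/easymms

# Force a UTF-8 preferred encoding (a common Colab workaround for shell
# commands that require a UTF-8 locale)
import locale
locale.getpreferredencoding = lambda: "UTF-8"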

In [None]:
# @title Download Model { display-mode: "form" }

model = 'mms1b_fl102' #@param ["mms1b_fl102", "mms1b_l1107", "mms1b_all"] {allow-input: true}

if model == "mms1b_fl102": 
  !wget -P ./models 'https://dl.fbaipublicfiles.com/mms/asr/mms1b_fl102.pt'

elif model == "mms1b_l1107":
  !wget -P ./models 'https://dl.fbaipublicfiles.com/mms/asr/mms1b_l1107.pt'

elif model == "mms1b_all":
  !wget -P ./models 'https://dl.fbaipublicfiles.com/mms/asr/mms1b_all.pt'
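
The three branches above differ only in the checkpoint name, so the URL and the local path can be derived from `model` directly. The following is a minimal sketch of that idea, not part of EasyMMS: `KNOWN_MODELS`, `model_url` and `local_model_path` are hypothetical names introduced here, and the base URL is taken from the `wget` commands above.

```python
# Hypothetical helpers (not part of EasyMMS): derive the download URL and
# the local path of an MMS ASR checkpoint from its name.
BASE_URL = "https://dl.fbaipublicfiles.com/mms/asr"
KNOWN_MODELS = ("mms1b_fl102", "mms1b_l1107", "mms1b_all")


def model_url(name: str) -> str:
    """Return the download URL for a known checkpoint name."""
    if name not in KNOWN_MODELS:
        # Fail early instead of silently downloading nothing.
        raise ValueError(f"Unknown model {name!r}; expected one of {KNOWN_MODELS}")
    return f"{BASE_URL}/{name}.pt"


def local_model_path(name: str, models_dir: str = "./models") -> str:
    """Return the path where `wget -P <models_dir>` would store the checkpoint."""
    return f"{models_dir}/{name}.pt"


print(model_url("mms1b_fl102"))
print(local_model_path("mms1b_fl102"))
```

Raising on an unknown name makes a mistyped form value visible immediately, rather than at model load time.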

# Upload Media files

In [None]:
# Replace these placeholders with the paths of the files you uploaded
files = ['/content/media_file1.wav', '/content/media_file2.mp3']
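
Before running inference, it can help to check that every listed path actually exists. The sketch below uses only the standard library. `split_media_files` and `AUDIO_EXTENSIONS` are hypothetical names, and the extension set is an assumption made for illustration, not a list of formats that EasyMMS documents as supported.

```python
from pathlib import Path

# Assumption: a few common audio extensions, chosen for illustration only.
AUDIO_EXTENSIONS = {".wav", ".mp3", ".flac", ".ogg", ".m4a"}


def split_media_files(paths):
    """Split paths into (usable, rejected).

    A path is usable if it points to an existing file whose extension
    is in AUDIO_EXTENSIONS (compared case-insensitively).
    """
    usable, rejected = [], []
    for p in paths:
        path = Path(p)
        if path.is_file() and path.suffix.lower() in AUDIO_EXTENSIONS:
            usable.append(p)
        else:
            rejected.append(p)
    return usable, rejected


candidates = ['/content/media_file1.wav', '/content/media_file2.mp3']
usable, rejected = split_media_files(candidates)
print("usable:", usable)
print("missing or unsupported:", rejected)
```

Passing only `usable` to the transcription step avoids a failure partway through a batch.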

# ASR Model inference

In [None]:
from easymms.models.asr import ASRModel

asr = ASRModel(model=f'./models/{model}.pt')

transcriptions = asr.transcribe(files, lang='eng', align=False)
for i, transcription in enumerate(transcriptions):
    print(f">>> file {files[i]}")
    print(transcription)

# ASR Model inference + Alignment

In [None]:
from easymms.models.asr import ASRModel

# Use the checkpoint selected in the download cell instead of a hard-coded name
asr = ASRModel(model=f'./models/{model}.pt')
files = ['/content/talk_10s.wav']
transcriptions = asr.transcribe(files, lang='eng', align=True, device='cpu')
for i, transcription in enumerate(transcriptions):
    print(f">>> file {files[i]}")
    for segment in transcription:
        print(f"{segment['start_time']} -> {segment['end_time']}: {segment['text']}")
    print("----")
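
The aligned segments printed above can also be turned into subtitles. The following is a minimal sketch that formats a list of segments as SubRip (SRT) text. It assumes each segment is a dict with the `start_time`, `end_time` and `text` keys used above, and that the times are numbers expressed in seconds. The unit is an assumption, not something this notebook confirms. `format_timestamp` and `segments_to_srt` are hypothetical helper names.

```python
def format_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments) -> str:
    """Build SRT text from segments shaped like the aligned output above.

    Assumption: start_time and end_time are in seconds.
    """
    blocks = []
    for index, seg in enumerate(segments, start=1):
        start = format_timestamp(seg["start_time"])
        end = format_timestamp(seg["end_time"])
        blocks.append(f"{index}\n{start} --> {end}\n{seg['text']}\n")
    # Subtitle entries are separated by a blank line.
    return "\n".join(blocks)


# Made-up segments, for illustration only.
example = [
    {"start_time": 0.0, "end_time": 1.25, "text": "hello"},
    {"start_time": 1.25, "end_time": 2.5, "text": "world"},
]
print(segments_to_srt(example))
```

With real output, each `transcription` from the loop above could be passed to `segments_to_srt` and the result written to a `.srt` file.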

# TTS Model

In [4]:
from easymms.models.tts import TTSModel
from IPython.display import Audio

text = "This is a simple example"

tts = TTSModel('eng')
res = tts.synthesize(text)
tts.save(res)

Audio(res[0], rate=res[1])

[INFO] File 'eng.tar.gz' already exists in /home/su/.local/share/easymms/tts
[INFO] Extracting /home/su/.local/share/easymms/tts/eng.tar.gz to /home/su/.local/share/easymms/tts
[INFO] /home/su/.local/share/easymms/uroman already exists
[INFO] /home/su/.local/share/easymms/tts/vits already exists
[INFO] /home/su/.local/share/easymms/fairseq already exists
[INFO] loading /home/su/.local/share/easymms/tts/eng/G_100000.pth ...
[INFO] Loaded checkpoint '/home/su/.local/share/easymms/tts/eng/G_100000.pth' (iteration 6251)
[INFO] text: This is a simple example
[INFO] Saving audio file to out.wav


text after filtering OOV: this is a simple example
