Audio Waveform and Word Alignments Visualization

Task Description

This project generates waveform plots of audio files with word alignments taken from a CTM (time-marked conversation) file. The script reads the CTM file and the associated audio files, then plots each waveform with the word intervals marked.

Task Details

Read the supplied CTM file and audio files.
Generate waveforms for each audio file.
Display word alignments on the waveform plots.
Annotate words on the waveform plot.
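
As a rough illustration of the first step, the sketch below parses CTM rows into per-recording word intervals. It assumes the conventional CTM column order (recording ID, channel, start time, duration, word, optional confidence); the layout of ctm.ctm in this repository is not shown here, and `WordInterval` and `parse_ctm` are hypothetical names, not taken from Pinnacle_ASR_Task_Script.py.

```python
from collections import defaultdict
from typing import NamedTuple


class WordInterval(NamedTuple):
    word: str
    start: float  # seconds from the start of the recording
    end: float    # seconds from the start of the recording


def parse_ctm(lines):
    """Group CTM rows by recording ID.

    Assumed column order (conventional CTM layout):
    <recording-id> <channel> <start> <duration> <word> [<confidence>]
    """
    alignments = defaultdict(list)
    for line in lines:
        line = line.strip()
        # Skip blank lines and ";;" comment lines.
        if not line or line.startswith(";;"):
            continue
        fields = line.split()
        if len(fields) < 5:
            continue  # malformed row, ignored in this sketch
        rec_id, _channel, start, duration, word = fields[:5]
        start = float(start)
        alignments[rec_id].append(
            WordInterval(word, start, start + float(duration))
        )
    return dict(alignments)


# Made-up rows for illustration only (not taken from ctm.ctm).
sample_rows = [
    "utt1 1 0.5 0.25 hello",
    "utt1 1 1.0 0.5 world 0.98",
]
for rec_id, intervals in parse_ctm(sample_rows).items():
    print(rec_id, intervals)
```

With a real file, the same function could be called as `parse_ctm(open("ctm.ctm", encoding="utf-8"))`, since it only needs an iterable of lines.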

Sample output file plot_64de11fdd1033954ee0031f4.png is provided for reference.

All reference data is available in the Google Drive folder.

Deliverables

A fully functional script (Pinnacle_ASR_Task_Script.py) that generates the waveform plots, along with the waveform output images.

Implementation

The script Pinnacle_ASR_Task_Script.py processes the CTM file and audio files to create the visualizations. It uses Matplotlib to plot the waveforms, highlight word intervals, and annotate words on the plot.
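
The plotting step described above could look roughly like the following. This is a minimal sketch, not the code in Pinnacle_ASR_Task_Script.py: `plot_alignment` and its parameters are hypothetical, the signal is a synthetic tone, and the optional `font_path` argument is only an assumption about how a bundled font such as Mangal Regular.ttf might be used for non-Latin word labels.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # render off-screen so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.font_manager import FontProperties


def plot_alignment(samples, sample_rate, words, out_path, font_path=None):
    """Plot a waveform, shade each word interval, and label it.

    `words` is a list of (word, start_sec, end_sec) tuples.
    `font_path` (assumption) points to a .ttf file used for the labels.
    """
    times = np.arange(len(samples)) / sample_rate
    peak = float(np.max(np.abs(samples))) if len(samples) else 0.0
    peak = peak or 1.0  # avoid a zero-height axis for silent audio
    font = FontProperties(fname=font_path) if font_path else None

    fig, ax = plt.subplots(figsize=(12, 4))
    ax.plot(times, samples, linewidth=0.5, color="steelblue")
    for word, start, end in words:
        # Shade the word interval and centre the label above it.
        ax.axvspan(start, end, alpha=0.2, color="orange")
        ax.text((start + end) / 2, peak, word,
                ha="center", va="bottom", fontproperties=font)
    ax.set_ylim(-1.2 * peak, 1.2 * peak)
    ax.set_xlabel("Time (s)")
    ax.set_ylabel("Amplitude")
    fig.savefig(out_path, dpi=150, bbox_inches="tight")
    plt.close(fig)
    return ax


# Synthetic one-second tone and made-up word intervals, for illustration only.
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
samples = 0.5 * np.sin(2 * np.pi * 220 * t)
words = [("hello", 0.10, 0.40), ("world", 0.50, 0.90)]
out_path = os.path.join(tempfile.gettempdir(), "alignment_sketch.png")
ax = plot_alignment(samples, sample_rate, words, out_path)
```

`axvspan` shades the full height of the axes between the start and end times, so the word boundaries stay visible regardless of the signal amplitude.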

Usage

Download the script Pinnacle_ASR_Task_Script.py.
Install the required libraries if not already installed:
```
pip install matplotlib numpy scipy
```
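
Once the libraries are installed, loading an audio file and building its time axis might look like the sketch below. It assumes the files in audio_data are WAV files readable by `scipy.io.wavfile`, which the source does not confirm; `load_waveform` is a hypothetical helper, and the demo writes its own synthetic file rather than touching the repository data.

```python
import os
import tempfile

import numpy as np
from scipy.io import wavfile


def load_waveform(path):
    """Read a WAV file and return (sample_rate, mono_samples, times_in_seconds)."""
    sample_rate, samples = wavfile.read(path)
    if samples.ndim > 1:
        samples = samples.mean(axis=1)  # mix multi-channel audio down to mono
    samples = samples.astype(np.float32)
    times = np.arange(samples.shape[0]) / sample_rate
    return sample_rate, samples, times


# Write a synthetic one-second tone so the sketch is self-contained.
wav_path = os.path.join(tempfile.gettempdir(), "waveform_sketch.wav")
tone = (0.5 * np.sin(2 * np.pi * 220 * np.arange(16000) / 16000) * 32767).astype(np.int16)
wavfile.write(wav_path, 16000, tone)

rate, samples, times = load_waveform(wav_path)
print(rate, samples.shape, times[-1])
```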

Outputs

The waveform and the corresponding words for Audio 1 are marked in the plot Output_Audio_1.png.
The waveform and the corresponding words for Audio 2 are marked in the plot Output_Audio_2.png.

Deus1704/Identification-of-Word-Inside-Audio-Waveform

Folders and files

Latest commit

History

Repository files navigation

Audio Waveform and Word Alignments Visualization

Task Description

Task Details

Deliverables

Implementation

Usage

Outputs

About

Resources

Stars

Watchers

Forks

Languages