Add tfio.experimental.audio.decode_mp3 support #865

yongtang · 2020-03-20T04:06:06Z

This PR:

- Adds tfio.experimental.audio.decode_mp3 support
- Updates minimp3 to 55da78c
- Merges audio and video into one library, as minimp4 serves both audio and video

This PR also adds encode_mp3 through lame on Linux (macOS and Windows are not supported).

This PR closes #815

/cc @jjedele @faroit

Signed-off-by: Yong Tang <yong.tang.github@outlook.com>

jjedele · 2020-03-20T10:28:46Z

@yongtang Looks great, thx for taking over!

jjedele · 2020-03-20T10:44:26Z

I think the only thing left in #815 is the function we talked about for ensuring a fixed output shape (io/tensorflow_io/core/python/ops/audio_ops.py, lines 75 to 100 in cba8bf7):

    def _fix_shape(data, desired_channels, desired_samples):
      """Fix shape of decoded audio to desired channels and samples."""
      org_shape = tf.shape(data)
      org_samples = org_shape[1]
      if desired_samples >= 0:
        # truncate if necessary
        # ignored if org_samples <= desired_samples
        data = data[:, :desired_samples]
        # pad if necessary
        right_pad = desired_samples - org_samples
        if right_pad > 0:
          data = tf.pad(data, [[0, 0], [0, right_pad]], "CONSTANT")
      if desired_channels >= 0:
        out_samples = desired_samples if desired_samples >= 0 else org_samples
        # truncate if necessary
        # ignored if org_channels <= desired_channels
        data = data[:desired_channels, :]
        # convert mono to stereo if necessary
        data = tf.broadcast_to(data, [desired_channels, out_samples])
      return data

Not sure if it's helpful/general enough to add this here, or if it's simple enough to write on your own when you need it.
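To make the intended behaviour of _fix_shape concrete, here is a minimal pure-Python sketch of the same truncate, pad, and broadcast steps. It is not part of this PR: the name fix_shape, the nested-list [channels][samples] layout, and the explicit ValueError are assumptions made for illustration so the logic can be checked without TensorFlow installed.

```python
def fix_shape(data, desired_channels=-1, desired_samples=-1):
    """Illustrative mirror of _fix_shape for a [channels][samples] nested list.

    A negative desired_channels or desired_samples leaves that axis unchanged.
    """
    rows = [list(channel) for channel in data]

    if desired_samples >= 0:
        # Truncate to desired_samples, then zero-pad on the right if too short.
        rows = [
            channel[:desired_samples]
            + [0.0] * max(0, desired_samples - len(channel))
            for channel in rows
        ]

    if desired_channels >= 0:
        # Drop extra channels; a no-op if there are not more than requested.
        rows = rows[:desired_channels]
        if len(rows) == 1 and desired_channels > 1:
            # Mono input: copy the single channel, as a broadcast would.
            rows = [list(rows[0]) for _ in range(desired_channels)]
        elif len(rows) != desired_channels:
            # A broadcast can only expand an axis of size 1.
            raise ValueError(
                f"cannot expand {len(rows)} channels to {desired_channels}"
            )

    return rows


mono = [[1.0, 2.0, 3.0]]
stereo_padded = fix_shape(mono, desired_channels=2, desired_samples=5)
```

Here a mono clip of three samples becomes two identical channels of five samples each, with zeros appended on the right. Input that has more than one channel but fewer than requested is rejected rather than guessed at, which matches the limits of a broadcast.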

yongtang · 2020-03-20T15:55:25Z

@jjedele As we discussed, I think _fix_shape could be a separate API, something like tfio.experimental.audio.normalize. Would you like to work on it?


yongtang mentioned this pull request Mar 20, 2020

Audio Processing API and tfio.audio #839

Open

yongtang added 3 commits March 20, 2020 17:17

Add tfio.experimental.audio.decode_mp3 support

c5bc134

This PR adds tfio.experimental.audio.decode_mp3 support Signed-off-by: Yong Tang <yong.tang.github@outlook.com>

Update minimp3 to 55da78c

0590fe0

Signed-off-by: Yong Tang <yong.tang.github@outlook.com>

Merge audio and video into one library, as minimp4 serves both audio …

ea27e51

…and video Signed-off-by: Yong Tang <yong.tang.github@outlook.com>

yongtang force-pushed the mp3 branch 3 times, most recently from 3322f60 to 4371f56 Compare March 21, 2020 02:16

Support encode_mp3 in Linux with lame

9483769

Signed-off-by: Yong Tang <yong.tang.github@outlook.com>

yongtang force-pushed the mp3 branch from 4371f56 to 9483769 Compare March 21, 2020 03:45

yongtang mentioned this pull request Mar 21, 2020

Add mp4 video dataset support on macOS #870

Merged

terrytangyuan approved these changes Mar 21, 2020


terrytangyuan merged commit 0740cdf into tensorflow:master Mar 21, 2020

yongtang deleted the mp3 branch March 21, 2020 20:39
