Add tfio.experimental.audio.decode_mp3 support #865

Merged
merged 4 commits into from
Mar 21, 2020


yongtang
Member

@yongtang yongtang commented Mar 20, 2020

This PR

  • adds tfio.experimental.audio.decode_mp3 support,
  • updates minimp3 to 55da78c,
  • merges audio and video into one library, as minimp4 serves both audio and video,
  • adds encode_mp3 through lame on Linux (macOS and Windows are not supported).

This PR closes #815

/cc @jjedele @faroit

Signed-off-by: Yong Tang <yong.tang.github@outlook.com>

@jjedele

jjedele commented Mar 20, 2020

@yongtang Looks great, thx for taking over!

@jjedele

jjedele commented Mar 20, 2020

I think the only thing left in #815 is this function to ensure a fixed output shape, which we talked about:

import tensorflow as tf


def _fix_shape(data, desired_channels, desired_samples):
    """Fix shape of decoded audio to desired channels and samples."""
    org_shape = tf.shape(data)
    org_samples = org_shape[1]
    if desired_samples >= 0:
        # truncate if necessary
        # ignored if org_samples <= desired_samples
        data = data[:, :desired_samples]
        # pad if necessary
        right_pad = desired_samples - org_samples
        if right_pad > 0:
            data = tf.pad(data, [[0, 0], [0, right_pad]], "CONSTANT")
    if desired_channels >= 0:
        out_samples = desired_samples if desired_samples >= 0 else org_samples
        # truncate if necessary
        # ignored if org_channels <= desired_channels
        data = data[:desired_channels, :]
        # convert mono to stereo if necessary
        data = tf.broadcast_to(data, [desired_channels, out_samples])
    return data

Not sure if it's helpful/general enough to add this here, or if it's simple enough to write on your own when you need it.
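
To make the intended semantics of the helper quoted above easier to check, here is a minimal NumPy analogue. It is only a sketch: the name `fix_shape_np` and the default values of -1 are assumptions made for illustration, it is not part of tensorflow-io, and it assumes the decoded audio is laid out as [channels, samples] as in the snippet.

```python
import numpy as np


def fix_shape_np(data, desired_channels=-1, desired_samples=-1):
    """NumPy analogue of _fix_shape for a [channels, samples] array.

    A negative desired value leaves that dimension unchanged.
    (Hypothetical helper, for illustration only.)
    """
    data = np.asarray(data)
    if desired_samples >= 0:
        # Truncate on the right; a no-op when the clip is already short enough.
        data = data[:, :desired_samples]
        # Zero-pad on the right up to the requested length.
        right_pad = desired_samples - data.shape[1]
        if right_pad > 0:
            data = np.pad(data, [(0, 0), (0, right_pad)], mode="constant")
    if desired_channels >= 0:
        # Drop surplus channels; a no-op when there are few enough already.
        data = data[:desired_channels, :]
        # Replicate mono across channels. Like tf.broadcast_to, this raises
        # unless the array now has exactly 1 or desired_channels channels.
        data = np.broadcast_to(data, (desired_channels, data.shape[1]))
    return data


mono = np.arange(5, dtype=np.float32).reshape(1, 5)
stereo = np.ones((2, 3), dtype=np.float32)

# Mono, 5 samples -> 2 channels, 8 samples (padded with zeros).
padded = fix_shape_np(mono, desired_channels=2, desired_samples=8)
# Stereo, 3 samples -> 1 channel, 2 samples (both dimensions truncated).
trimmed = fix_shape_np(stereo, desired_channels=1, desired_samples=2)
print(padded.shape, trimmed.shape)
```

One thing the sketch makes visible: broadcasting only covers the mono case. Asking for 3 channels from a stereo clip raises an error instead of upmixing, so a public API would probably need to define that case explicitly.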

@yongtang
Member Author

@jjedele As we discussed, I think _fix_shape could be a separate API, something like tfio.experimental.audio.normalize. Would you like to work on it?

This PR adds tfio.experimental.audio.decode_mp3 support

Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
…and video

Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
@terrytangyuan terrytangyuan merged commit 0740cdf into tensorflow:master Mar 21, 2020
@yongtang yongtang deleted the mp3 branch March 21, 2020 20:39
i-ony pushed a commit to i-ony/io that referenced this pull request Feb 8, 2021
* Add tfio.experimental.audio.decode_mp3 support

This PR adds tfio.experimental.audio.decode_mp3 support

Signed-off-by: Yong Tang <yong.tang.github@outlook.com>

* Update minimp3 to 55da78c

Signed-off-by: Yong Tang <yong.tang.github@outlook.com>

* Merge audio and video into one library, as minimp4 serves both audio and video

Signed-off-by: Yong Tang <yong.tang.github@outlook.com>

* Support encode_mp3 in Linux with lame

Signed-off-by: Yong Tang <yong.tang.github@outlook.com>

3 participants