stempeg 2.0 #28

faroit · 2019-11-05T14:28:27Z

This addresses #27 and implements a new ffmpeg backend. I chose ffmpeg-python for reading and writing. Here the audio is piped directly to stdin instead of writing temporary files with pysoundfile and converting them in a separate process call.

Part of the code was copied from spleeter's audio backend. First benchmarks of the input piping indicate that this method is twice as fast as my previous "tmpfile based method".
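As a rough illustration of the piping idea, the sketch below sends raw PCM bytes to a child process through stdin instead of going through a temporary file. The helper names `ffmpeg_pipe_args` and `run_with_stdin` are hypothetical and are not part of stempeg or ffmpeg-python. The ffmpeg flags assume 32-bit float little-endian samples, and a small Python child process is used as a stand-in so the example runs without ffmpeg installed.

```python
import subprocess
import sys


def ffmpeg_pipe_args(dst, sample_rate, channels):
    """Build an ffmpeg command that reads raw float32 PCM from stdin.

    Hypothetical helper: it only assembles the argument list.
    """
    return [
        "ffmpeg", "-y",
        "-f", "f32le",            # raw 32-bit float, little endian
        "-ar", str(sample_rate),  # input sample rate
        "-ac", str(channels),     # input channel count
        "-i", "pipe:0",           # read the input from stdin
        dst,
    ]


def run_with_stdin(cmd, payload):
    """Run `cmd`, feed `payload` (bytes) to its stdin, return its stdout."""
    proc = subprocess.run(
        cmd, input=payload, stdout=subprocess.PIPE, check=True
    )
    return proc.stdout


# Stand-in child process that echoes stdin to stdout (no ffmpeg needed).
echo_cmd = [
    sys.executable, "-c",
    "import sys; sys.stdout.buffer.write(sys.stdin.buffer.read())",
]
echoed = run_with_stdin(echo_cmd, b"\x00\x01\x02\x03")
```

With ffmpeg available, `run_with_stdin(ffmpeg_pipe_args("out.wav", 44100, 2), pcm_bytes)` would hand the bytes to the encoder in the same way.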

Saving stems still requires writing temporary files, since the complex filter cannot be carried out using ffmpeg-python. This enabled a new API. The idea was not to come up with presets and checks that cover every use case, but to let users handle this themselves. This means more errors for users, but it is much easier to maintain. E.g., if a user wants to write multistream audio as .wav files, an error will be thrown, since this container does not support multiple streams. The user would instead have to use streams_as_multichannel.

This PR furthermore introduces a significant number of new features:

Audio Loading

Loading audio now uses the same API as spleeter's audio loading backend.
A target sample rate can be specified to resample audio on the fly and return the resampled audio.
An option stems_from_multichannel was added to load stems that are aggregated into multichannel audio (concatenation of pairs of stereo channels); see the section on audio writing for more info.
Substream titles can be read from the Info object.

Audio Writing

Stems can now be saved as substreams, aggregated into channels, or saved as multiple files.
Titles for each substream can now be embedded into the metadata.
In addition to write_stems (a preset for compatibility with NI stems), there is also write_streams (which supports writing as multichannel or multiple files). And in case stempeg is used for just stereo files, write_audio can be used (again, this is API compatible with spleeter).

The procedure for writing stream files can be quite complex, as it depends on the
specified output container format. There are two cases:

1.) container supports multiple streams (mp4/m4a, opus, mka)
2.) container does not support multiple streams (wav, mp3, flac)

For 1.) we provide three options:

1a.) streams are saved as substreams when streams_as_multichannel=False (default)
1b.) streams are aggregated into channels and saved as a multichannel file.
Here the audio tensor of shape=(streams, samples, 2)
is converted to single-stream multichannel audio of
shape=(samples, streams*2). This option is activated using
streams_as_multichannel=True
1c.) streams are saved as multiple files when streams_as_files is active
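The shape conversion described in 1b.) can be sketched with NumPy. This is a minimal illustration, assuming that channels are ordered stream by stream (the stereo pair of stream 0, then that of stream 1, and so on). The function names are hypothetical and are not stempeg's API.

```python
import numpy as np


def streams_to_multichannel(stems):
    """(streams, samples, channels) -> (samples, streams * channels)."""
    n_streams, n_samples, n_channels = stems.shape
    # Put the sample axis first, then flatten (stream, channel) into one axis.
    return stems.transpose(1, 0, 2).reshape(n_samples, n_streams * n_channels)


def multichannel_to_streams(audio, nb_channels=2):
    """(samples, streams * nb_channels) -> (streams, samples, nb_channels)."""
    n_samples, total_channels = audio.shape
    if total_channels % nb_channels != 0:
        raise ValueError("channel count is not a multiple of nb_channels")
    n_streams = total_channels // nb_channels
    # Split the channel axis back into (stream, channel), streams first.
    return audio.reshape(n_samples, n_streams, nb_channels).transpose(1, 0, 2)


# Five stereo stems with four samples each.
stems = np.arange(5 * 4 * 2, dtype=np.float32).reshape(5, 4, 2)
merged = streams_to_multichannel(stems)
restored = multichannel_to_streams(merged, nb_channels=2)
```

Here `merged` has shape `(4, 10)`, columns 2 and 3 hold the stereo pair of stem 1, and `restored` equals the original tensor.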

For 2.), when the container does not support multiple streams, there
are two options:

2a.) streams_as_multichannel has to be set to True (see 1b.), otherwise an
error will be raised. Note that this only works for wav and flac.
* The file ending of the path determines the container (but not the codec!).
2b.) streams_as_files is active, so that multiple files will be created
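The cases above can be summarised as a small decision function. This is only a sketch of the rules as listed here, not stempeg's implementation: `choose_write_mode` and the two extension sets are hypothetical names, and the three return values loosely correspond to the `StreamsWriter`, `ChannelsWriter` and `FilesWriter` classes used in the example below.

```python
from pathlib import Path

# Containers that can hold several audio streams (case 1).
MULTISTREAM = {".mp4", ".m4a", ".opus", ".mka"}
# Single-stream containers that still accept many channels (case 2a).
MULTICHANNEL_ONLY = {".wav", ".flac"}


def choose_write_mode(path, streams_as_multichannel=False,
                      streams_as_files=False):
    """Return "substreams", "multichannel" or "files" for a target path."""
    ext = Path(path).suffix.lower()  # the file ending selects the container
    if streams_as_files:
        return "files"  # 1c / 2b: one file per stream, any container
    if streams_as_multichannel:
        if ext in MULTISTREAM or ext in MULTICHANNEL_ONLY:
            return "multichannel"  # 1b / 2a
        raise ValueError(f"{ext} cannot hold multichannel streams")
    if ext in MULTISTREAM:
        return "substreams"  # 1a, the default
    raise ValueError(
        f"{ext} does not support multiple streams; "
        "use streams_as_multichannel or streams_as_files"
    )
```

For example, `choose_write_mode("test.stem.m4a")` yields `"substreams"`, while `choose_write_mode("test.wav")` raises a `ValueError` unless one of the two flags is set.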

Example / Use Cases

"""Opens a stem file and saves (re-encodes) it back to a stem file
"""
import argparse
import stempeg


if __name__ == '__main__':
    parser = argparse.ArgumentParser()
    parser.add_argument(
        'input',
    )
    args = parser.parse_args()

    # load stems
    stems, rate = stempeg.read_stems(args.input)

    # load stems,
    # resample to 96000 Hz,
    # use multiprocessing
    stems, rate = stempeg.read_stems(
        args.input,
        sample_rate=96000,
        multiprocess=True
    )

    # --> stems now has `shape=(stems, samples, channels)`

    # save stems from tensor as multi-stream mp4
    stempeg.write_stems(
        "test.stem.m4a",
        stems,
        sample_rate=96000
    )

    # stems can also be passed as a dict for convenience
    stems = {
        "mix": stems[0],
        "drums": stems[1],
        "bass": stems[2],
        "other": stems[3],
        "vocals": stems[4],
    }
    # the keys are automatically used as stem names

    # write stems from the dict
    stempeg.write_stems(
        "test.stem.m4a",
        data=stems,
        sample_rate=96000
    )

    # `write_stems` is a preset for the following settings
    # here the output signal is resampled to 44100 Hz and AAC codec is used
    stempeg.write_stems(
        "test.stem.m4a",
        stems,
        sample_rate=96000,
        writer=stempeg.StreamsWriter(
            codec="aac",
            output_sample_rate=44100,
            bitrate="256000",
            stem_names=['mix', 'drums', 'bass', 'other', 'vocals']
        )
    )

    # Native Instruments compatible stems
    stempeg.write_stems(
        "test_traktor.stem.m4a",
        stems,
        sample_rate=96000,
        writer=stempeg.NIStemsWriter(
            stems_metadata=[
                {"color": "#009E73", "name": "Drums"},
                {"color": "#D55E00", "name": "Bass"},
                {"color": "#CC79A7", "name": "Other"},
                {"color": "#56B4E9", "name": "Vocals"}
            ]
        )
    )

    # let's write as multistream opus
    # (Opus does not support 96000 Hz, so resample to 48000 Hz)
    stempeg.write_stems(
        "test.stem.opus",
        stems,
        sample_rate=96000,
        writer=stempeg.StreamsWriter(
            output_sample_rate=48000,
            codec="opus"
        )
    )

    # writing to wav requires converting streams to multichannel
    stempeg.write_stems(
        "test.wav",
        stems,
        sample_rate=96000,
        writer=stempeg.ChannelsWriter(
            output_sample_rate=48000
        )
    )

    # stempeg also supports loading merged multichannel streams using `ChannelsReader`
    stems, rate = stempeg.read_stems(
        "test.wav",
        reader=stempeg.ChannelsReader(nb_channels=2)
    )

    # mp3 supports neither multiple streams nor more than two channels,
    # therefore we have to use `stempeg.FilesWriter`
    # outputs are named ["output/0.mp3", "output/1.mp3"]
    # for named files, provide a dict or use `stem_names`
    # also apply multiprocessing
    stempeg.write_stems(
        ("output", ".mp3"),
        stems,
        sample_rate=rate,
        writer=stempeg.FilesWriter(
            multiprocess=True,
            output_sample_rate=48000,
            stem_names=["mix", "drums", "bass", "other", "vocals"]
        )
    )

faroit · 2020-05-02T14:31:15Z

@romi1502 @mmoussallam I really like the simple ffmpeg adapter you implemented for spleeter. I took some code from spleeter and moved it into stempeg. I extended the function to also support reading and writing multistream/stem files. The basic (stereo) read/write is still API compatible with the ffmpeg adapter you have in spleeter. Therefore I would love your feedback on the following:

Are you okay with copying these parts? I can add credits in the docstring if you like.
Would you be interested in replacing your code and using stempeg directly? stempeg is already a dependency for spleeter, so you won't change much for users. Spleeter users would then benefit from being able to save into stem format.
If yes to the previous questions, it would make sense (and would be great) if you could review this PR.

mmoussallam · 2020-05-03T12:51:56Z

Hi @faroit, hope you're fine and safe.

Thanks for the suggestion, it would definitely make sense to allow writing stems as output in spleeter. Give us a few days to look into it and come back to you.

Best

faroit · 2020-05-03T14:02:24Z

@mmoussallam 👍 sounds good.

Just a few more notes:

stempeg was actually not a requirement for spleeter; I thought you had musdb there. So yes, please evaluate whether this would justify adding another dependency.
I quickly hacked the spleeter ffmpeg audio adapter to use stempeg instead for loading and writing. See here: Test stempeg 0.2.0 adapter deezer/spleeter#357. As you can see, writing stems is quite simple, as you can directly pass the estimate dictionary.
I noticed that the spleeter AudioAdapter does not differentiate between audio containers/extensions and the actual codec. That is an issue, as e.g. mp4/m4a is a container but not a codec; the codec is aac. FFmpeg does select a default codec when selecting a container/extension, but I would suggest extending spleeter's code to enable extensive control over this. I can add an issue for this if you agree.
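To make the container/codec distinction concrete, here is a small sketch that builds an ffmpeg command line where the output extension selects the container and the codec is passed separately. `ffmpeg_encode_args` is a hypothetical helper, not part of spleeter or stempeg; it only assembles arguments and does not run ffmpeg.

```python
def ffmpeg_encode_args(src, dst, codec=None, bitrate=None):
    """Assemble an ffmpeg command for re-encoding `src` into `dst`.

    The extension of `dst` picks the container. If `codec` is None,
    ffmpeg falls back to its default codec for that container.
    """
    args = ["ffmpeg", "-y", "-i", src]
    if codec is not None:
        args += ["-c:a", codec]  # explicit audio codec, e.g. "aac"
    if bitrate is not None:
        args += ["-b:a", str(bitrate)]  # audio bitrate, e.g. "256k"
    args.append(dst)
    return args


# Container chosen by extension only; codec left to ffmpeg's default.
implicit = ffmpeg_encode_args("mix.wav", "mix.m4a")
# Same container, but with the codec and bitrate stated explicitly.
explicit = ffmpeg_encode_args("mix.wav", "mix.m4a", codec="aac", bitrate="256k")
```

Keeping the two parameters separate is what allows a caller to ask for a non-default codec inside a given container.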

faroit · 2020-11-07T23:32:08Z

Hey @faroit! Good job! Unfortunately Traktor cannot read the stem metadata:

This should be fixed now. Test using

import stempeg
stems, rate = stempeg.read_stems(stempeg.example_stem_path())
stempeg.write_stems(
    "test_traktor.stem.m4a",
    stems,
    sample_rate=rate,
    writer=stempeg.NIStemsWriter()
)

axeldelafosse · 2020-11-08T10:26:23Z

Cool! However I get this error:

Traceback (most recent call last):
  File "test.py", line 7, in <module>
    writer=stempeg.NIStemsWriter()
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/stempeg/write.py", line 726, in write_stems
    sample_rate=sample_rate
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/stempeg/write.py", line 530, in __call__
    stem_names=['Mix'] + [d['name'] for d in self.stems_metadata]
TypeError: 'NoneType' object is not iterable

Am I missing something? I'm testing with the latest changes and re-installed stempeg via pip install .

faroit · 2020-11-08T13:24:51Z

Am I missing something? I'm testing with the latest changes and re-installed stempeg via pip install .

@axeldelafosse looks like your installation doesn't contain the default_metadata.json... it's part of the MANIFEST so it should work... 🤷

Here is simple colab notebook to test that https://colab.research.google.com/drive/1cuTrBnjuBWANiW_fnseT1pfhPzlcGigX?usp=sharing
Let me know if you find the issue.
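The traceback indicates that `self.stems_metadata` was `None` when the stem names were built. Purely as an illustration of that failure mode, and not as the actual stempeg code, a defensive fallback could look like the sketch below. `resolve_stem_names` and `DEFAULT_STEMS_METADATA` are hypothetical names; the default entries are copied from the `NIStemsWriter` example in the PR description.

```python
# Hypothetical in-code defaults, taken from the example in the description.
DEFAULT_STEMS_METADATA = [
    {"color": "#009E73", "name": "Drums"},
    {"color": "#D55E00", "name": "Bass"},
    {"color": "#CC79A7", "name": "Other"},
    {"color": "#56B4E9", "name": "Vocals"},
]


def resolve_stem_names(stems_metadata=None):
    """Return the stem names, falling back to defaults if none were given."""
    if stems_metadata is None:
        # Without this guard, iterating over None raises
        # "TypeError: 'NoneType' object is not iterable".
        stems_metadata = DEFAULT_STEMS_METADATA
    return ["Mix"] + [d["name"] for d in stems_metadata]
```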

faroit · 2020-11-08T15:51:01Z

@mmoussallam @pseeth this is ready for a review.

axeldelafosse · 2020-11-08T16:58:13Z

@faroit I have the default_metadata.json so I don't know what's the issue... Anyway. Thanks for the colab. It works! Great job.

faroit · 2020-11-08T17:06:09Z

@faroit I have the default_metadata.json so I don't know what's the issue... Anyway. Thanks for the colab. It works! Great job.

@axeldelafosse Can you try with a clean environment? Also can you run the unit tests?

faroit · 2020-11-18T23:29:59Z

Ping @mmoussallam

mmoussallam · 2020-11-19T09:12:14Z

Hi @faroit

Thanks for this and great work. I'll hopefully find some time next week to look at it carefully.

faroit · 2020-11-19T09:29:38Z

@mmoussallam great. This will also be used for the next version of open-unmix, so it would be great to have this unblocked soon ;-)

faroit · 2020-11-23T20:45:53Z

@mmoussallam thanks for the checks, these are corrected now. Did you check out deezer/spleeter#357 to see if the new stempeg API could be useful in spleeter? If there are minor things to be changed later, that's fine as long as the API looks good to you. Let me know if this can be merged then.

mmoussallam · 2020-11-24T08:47:24Z

Hi @faroit

Sorry it took me some time to finish reviewing this. It all seems good to me. Congrats on the rework, I think the API looks really great now!

I'm planning on doing some tests on the spleeter integration later this week.

faroit · 2020-11-24T09:34:00Z

I'm planning on doing some tests on the spleeter integration later this week.

@mmoussallam sounds great. Let me know if there is anything left to do. Now let's create some REAL stems! ;-)

first draft on new loading backend from spleeter

ef0c97d

faroit added the enhancement label Nov 5, 2019

Fabian-Robert Stöter added 10 commits February 17, 2020 17:21

adopt read from spleeter

270d757

adopt write from spleeter

d423593

update requirements

3e1aa61

implement multichannel option

5d2e01f

return to tmpfile version of writer

cbbdb26

remove soundfile

3e7212a

additional functionality

bfdaaf7

update example

72c7d5e

update docstrings

4ef1f58

add unit tests

e33eccd

faroit marked this pull request as ready for review May 1, 2020 18:51

faroit changed the title ~~[WIP]: first draft on new loading backend from spleeter~~ New loading and writing backend from spleeter May 1, 2020

Fabian-Robert Stöter added 7 commits May 2, 2020 10:42

rename container to mp4

63bcbd9

update example

bc11df1

fix warnings, add more tests

c446f37

allow to read titles from metadata

76f18be

only check metadata for mp4 files, as matroska does not seem to support the same metdata

11b89ce

pep8

f36feb9

update example

c1758d2

This was referenced May 2, 2020

Centralize detection of ffmpeg executables #24

Closed

Reading is too slow #25

Closed

update cli conversion tool

13a8175

add probe

aeb4773

faroit mentioned this pull request May 3, 2020

Test stempeg 0.2.0 adapter deezer/spleeter#357

Draft

for compatbility reasons, lets also have Info.rate

f0e9837

Fabian-Robert Stöter added 2 commits November 8, 2020 00:24

rewrite NI stems writer to use temporary directories

ea75b03

fix mp4box cli

fa8ec72

Fabian-Robert Stöter added 3 commits November 8, 2020 10:50

add mp4box unit test

66dd43c

add ffmpeg 4.3 test

3b8538a

remove debug outputs

032147e

make search for MP4Box case-sensitive

0bb2e40

update mp4box executable name

0c46255

Fabian-Robert Stöter added 2 commits November 10, 2020 15:22

attempt unit test fixes

121b075

shorten tests

47503b5

mmoussallam reviewed Nov 23, 2020

README.md (outdated, resolved)

mmoussallam reviewed Nov 23, 2020

stempeg/write.py (outdated, resolved)

faroit and others added 2 commits November 23, 2020 21:42

Update README.md

98228c1

stems writer was updated Co-authored-by: Moussallam <manuel.moussallam@gmail.com>

Update stempeg/write.py

7b99567

Co-authored-by: Moussallam <manuel.moussallam@gmail.com>

faroit merged commit ed80514 into master Nov 24, 2020

This was referenced Dec 5, 2020

Native Instruments (Traktor) Format #31

Closed

Add a check for mono files #20

Closed

add audio2stem cli #26

Closed

faroit deleted the add_new_ffmpegprocess branch January 31, 2021 23:04
