stempeg 2.0 #28

Merged
merged 71 commits into from
Nov 24, 2020

Conversation

faroit
Owner

@faroit faroit commented Nov 5, 2019

This addresses #27 and implements a new ffmpeg backend. I chose ffmpeg-python for reading and writing. The audio is now piped directly to stdin instead of writing temporary files with pysoundfile and converting them in a separate process call.

Part of the code was copied from spleeter's audio backend. First benchmarks of the input piping indicate that this method is twice as fast as my previous tmpfile-based method.
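
The piping idea can be sketched roughly as follows. This is only an illustration that uses plain `subprocess` instead of ffmpeg-python and assumes an `ffmpeg` binary is available on the PATH; `build_pipe_command` and `write_via_pipe` are hypothetical helper names, not part of stempeg or spleeter.

```python
import subprocess


def build_pipe_command(out_path, sample_rate, channels):
    """Build an ffmpeg command line that reads raw float32 PCM from stdin."""
    return [
        "ffmpeg", "-y",
        "-f", "f32le",            # raw 32-bit float, little-endian samples
        "-ar", str(sample_rate),  # sample rate of the piped input
        "-ac", str(channels),     # channel count of the piped input
        "-i", "pipe:0",           # read the input from stdin
        out_path,                 # container is inferred from the extension
    ]


def write_via_pipe(pcm_bytes, out_path, sample_rate, channels):
    """Encode raw PCM bytes by piping them to ffmpeg (no temporary file)."""
    cmd = build_pipe_command(out_path, sample_rate, channels)
    proc = subprocess.run(
        cmd,
        input=pcm_bytes,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.PIPE,
    )
    if proc.returncode != 0:
        raise RuntimeError(proc.stderr.decode(errors="replace"))
```

With a NumPy array `audio` of shape `(samples, channels)`, the bytes to pipe would be something like `audio.astype("<f4").tobytes()`.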

Saving stems still requires writing temporary files since the complex filter cannot be carried out using ffmpeg-python. This enabled a new API. The idea was not to come up with presets and do all the checks to cover all use cases, but instead to let users do this themselves. This means more errors for users, but it's way easier to maintain. E.g. if a user wants to write multistream audio as .wav files, an error will be thrown, since this container does not support multiple streams. The user would instead have to use streams_as_multichannel.

This PR furthermore introduces a significant number of new features:

Audio Loading

  • Loading audio now uses the same API as spleeter's audio loading backend
  • A target samplerate can be specified to resample audio on-the-fly and return the resampled audio
  • An option stems_from_multichannel was added to load stems that are aggregated into multichannel audio (concatenation of pairs of stereo channels), see more info on audio writing
  • Substream titles can be read from the Info object

Audio Writing

  • Stems can now be saved as substreams, aggregated into channels or saved as multiple files.
  • Titles for each substream can now be embedded into metadata.
  • In addition to write_stems (which is a preset to achieve compatibility with NI stems), we also have write_streams (supports writing as multichannel or multiple files). If stempeg is used for plain stereo files, write_audio can be used (again, this is API compatible with spleeter).

The procedure for writing stream files may be quite complex as it varies depending on the
specified output container format. Basically there are two possible stream saving options:

1.) container supports multiple streams (mp4/m4a, opus, mka)
2.) container does not support multiple streams (wav, mp3, flac)

For 1.) we provide three options:

1a.) streams will be saved as substreams, i.e.
when streams_as_multichannel=False (default)
1b.) streams will be aggregated into channels and saved as
multichannel file.
Here the audio tensor of shape=(streams, samples, 2)
will be converted to a single-stream multichannel audio
(samples, streams*2). This option is activated using
streams_as_multichannel=True
1c.) streams will be saved as multiple files when streams_as_files is active
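
The reshaping described in 1b can be sketched with NumPy as below. The channel ordering (stereo pairs concatenated stream by stream) follows the description above; the function names are hypothetical and this is not necessarily how stempeg implements it internally.

```python
import numpy as np


def streams_to_multichannel(stems):
    """Convert (streams, samples, channels) to (samples, streams * channels)."""
    nb_streams, nb_samples, nb_channels = stems.shape
    # Put samples first, then lay the channel pairs of each stream side by side.
    return stems.transpose(1, 0, 2).reshape(nb_samples, nb_streams * nb_channels)


def multichannel_to_streams(audio, nb_channels=2):
    """Inverse operation: split (samples, streams * channels) back into streams."""
    nb_samples, total_channels = audio.shape
    nb_streams = total_channels // nb_channels
    return audio.reshape(nb_samples, nb_streams, nb_channels).transpose(1, 0, 2)
```

For five stereo stems this yields a single ten-channel signal, where columns 0-1 hold the first stem, columns 2-3 the second, and so on.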

For 2.), when the container does not support multiple streams, there
are two options:

2a.) streams_as_multichannel has to be set to True (see 1b), otherwise an
error will be raised. Note that this only works for wav and flac.
2b.) streams will be saved as multiple files when streams_as_files is active.

Note that the file ending of the path determines the container (but not the codec!).
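
The decision logic above can be summarized in a small sketch. The extension sets are taken from the lists in this description; `choose_strategy` and its return values are hypothetical names used for illustration only and do not mirror the actual stempeg code.

```python
import os

# Containers as listed in the description above.
MULTISTREAM_EXTENSIONS = {".mp4", ".m4a", ".opus", ".mka"}
MULTICHANNEL_ONLY_EXTENSIONS = {".wav", ".flac"}


def choose_strategy(path, streams_as_multichannel=False, streams_as_files=False):
    """Return 'substreams', 'multichannel' or 'files' for a given output path."""
    ext = os.path.splitext(path)[1].lower()
    if streams_as_files:
        # 1c / 2b: one file per stream works for every container.
        return "files"
    if streams_as_multichannel:
        # 1b / 2a: aggregate streams into channels of a single stream.
        if ext in MULTISTREAM_EXTENSIONS or ext in MULTICHANNEL_ONLY_EXTENSIONS:
            return "multichannel"
        raise ValueError(f"{ext} cannot hold aggregated multichannel audio")
    if ext in MULTISTREAM_EXTENSIONS:
        # 1a: default, each stem becomes a substream.
        return "substreams"
    raise ValueError(
        f"{ext} does not support multiple streams; "
        "use streams_as_multichannel or streams_as_files"
    )
```

For example, `choose_strategy("test.wav")` raises an error, while `choose_strategy("test.wav", streams_as_multichannel=True)` selects the multichannel path.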

Example / Use Cases

"""Opens a stem file and saves (re-encodes) back to a stem file
"""
import argparse
import stempeg


if __name__ == '__main__':
    parser = argparse.ArgumentParser()
    parser.add_argument(
        'input',
    )
    args = parser.parse_args()

    # load stems
    stems, rate = stempeg.read_stems(args.input)

    # load stems,
    # resample to 96000 Hz,
    # use multiprocessing
    stems, rate = stempeg.read_stems(
        args.input,
        sample_rate=96000,
        multiprocess=True
    )

    # --> stems now has shape (stems, samples, channels)

    # save stems from tensor as multi-stream mp4
    stempeg.write_stems(
        "test.stem.m4a",
        stems,
        sample_rate=96000
    )

    # collect stems in a dict for convenience
    stems = {
        "mix": stems[0],
        "drums": stems[1],
        "bass": stems[2],
        "other": stems[3],
        "vocals": stems[4],
    }
    # dict keys are automatically used as stem names

    # write stems from the dict
    stempeg.write_stems(
        "test.stem.m4a",
        data=stems,
        sample_rate=96000
    )

    # `write_stems` is a preset for the following settings
    # here the output signal is resampled to 44100 Hz and AAC codec is used
    stempeg.write_stems(
        "test.stem.m4a",
        stems,
        sample_rate=96000,
        writer=stempeg.StreamsWriter(
            codec="aac",
            output_sample_rate=44100,
            bitrate="256000",
            stem_names=['mix', 'drums', 'bass', 'other', 'vocals']
        )
    )

    # Native Instruments compatible stems
    stempeg.write_stems(
        "test_traktor.stem.m4a",
        stems,
        sample_rate=96000,
        writer=stempeg.NIStemsWriter(
            stems_metadata=[
                {"color": "#009E73", "name": "Drums"},
                {"color": "#D55E00", "name": "Bass"},
                {"color": "#CC79A7", "name": "Other"},
                {"color": "#56B4E9", "name": "Vocals"}
            ]
        )
    )

    # let's write as multistream opus (supports only 48000 Hz)
    stempeg.write_stems(
        "test.stem.opus",
        stems,
        sample_rate=96000,
        writer=stempeg.StreamsWriter(
            output_sample_rate=48000,
            codec="opus"
        )
    )

    # writing to wav requires converting streams to multichannel
    stempeg.write_stems(
        "test.wav",
        stems,
        sample_rate=96000,
        writer=stempeg.ChannelsWriter(
            output_sample_rate=48000
        )
    )

    # stempeg also supports loading merged multichannel streams
    stems, rate = stempeg.read_stems(
        "test.wav",
        reader=stempeg.ChannelsReader(nb_channels=2)
    )

    # mp3 supports neither multiple streams nor multichannel aggregation,
    # therefore we have to use `stempeg.FilesWriter`;
    # outputs are named ["output/0.mp3", "output/1.mp3", ...];
    # for named files, provide a dict or use `stem_names`;
    # multiprocessing is applied as well
    stempeg.write_stems(
        ("output", ".mp3"),
        stems,
        sample_rate=rate,
        writer=stempeg.FilesWriter(
            multiprocess=True,
            output_sample_rate=48000,
            stem_names=["mix", "drums", "bass", "other", "vocals"]
        )
    )

@faroit faroit marked this pull request as ready for review May 1, 2020 18:51
@faroit faroit changed the title [WIP]: first draft on new loading backend from spleeter New loading and writing backend from spleeter May 1, 2020
@faroit
Owner Author

faroit commented May 2, 2020

@romi1502 @mmoussallam I really like the simple ffmpeg adapter you implemented for spleeter. I took some code from spleeter and moved it into stempeg. I extended the function to also support reading and writing multistream/stem files. The basic (stereo) read/write is still API compatible with the ffmpeg adapter you have in spleeter. Therefore I would love your feedback on the following:

  • Are you okay with copying these parts? I can add credits in the docstring if you like
  • Would you be interested in replacing your code and using stempeg directly? stempeg is already a dependency for spleeter, so you won't change much for users. Spleeter users would then benefit from being able to save into stem format.
  • If yes to the previous questions, it would make sense (and would be great) if you could review this PR.

@mmoussallam
Contributor

Hi @faroit, hope you're fine and safe.

Thanks for the suggestion, it would definitely make sense to allow writing stems as output in spleeter. Give us a few days to look into it and come back to you.

Best

@faroit
Owner Author

faroit commented May 3, 2020

@mmoussallam 👍 sounds good.

Just a few more notes:

  • stempeg was actually not a requirement for spleeter; I thought you had musdb there. So yes, please evaluate if this would justify adding another dependency.
  • I quickly hacked the spleeter ffmpeg audio adapter to use stempeg instead for loading and writing. See here: Test stempeg 0.2.0 adapter deezer/spleeter#357. As you can see, writing stems is quite simple as you can directly pass the estimate dictionary.
  • I noticed that the spleeter AudioAdapter does not differentiate between audio containers/extensions and the actual codec. That is an issue as e.g. mp4/m4a is a container but not a codec. The codec is aac. FFMPEG does select a default codec when selecting a container/extension. But I would suggest extending spleeter's code to enable explicit control over this. I can add an issue for this if you agree.
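
To illustrate the container/codec distinction, here is a minimal sketch that derives the container from the file extension and lets the caller override the codec. `resolve_output` is a hypothetical helper, and the default codec table is an assumption for illustration; the defaults FFMPEG actually picks depend on how it was built.

```python
import os

# Illustrative defaults only; real FFMPEG defaults depend on the build.
DEFAULT_CODECS = {
    ".m4a": "aac",
    ".mp4": "aac",
    ".wav": "pcm_s16le",
    ".flac": "flac",
    ".mp3": "libmp3lame",
    ".opus": "libopus",
}


def resolve_output(path, codec=None):
    """Return (container, codec): the extension picks the container only."""
    ext = os.path.splitext(path)[1].lower()
    container = ext.lstrip(".")
    # An explicitly requested codec always wins over the container default.
    chosen_codec = codec if codec is not None else DEFAULT_CODECS.get(ext)
    return container, chosen_codec
```

With this split, `resolve_output("out.m4a")` falls back to aac, while `resolve_output("out.m4a", codec="alac")` keeps the same container but selects a different codec.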

@faroit
Owner Author

faroit commented Nov 7, 2020

Hey @faroit! Good job! Unfortunately Traktor cannot read the stem metadata:

This should be fixed now. Test using

import stempeg
stems, rate = stempeg.read_stems(stempeg.example_stem_path())
stempeg.write_stems(
    "test_traktor.stem.m4a",
    stems,
    sample_rate=rate,
    writer=stempeg.NIStemsWriter()
)


@axeldelafosse

Cool! However I get this error:

Traceback (most recent call last):
  File "test.py", line 7, in <module>
    writer=stempeg.NIStemsWriter()
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/stempeg/write.py", line 726, in write_stems
    sample_rate=sample_rate
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/stempeg/write.py", line 530, in __call__
    stem_names=['Mix'] + [d['name'] for d in self.stems_metadata]
TypeError: 'NoneType' object is not iterable

Am I missing something? I'm testing with the latest changes and re-installed stempeg via pip install .

@faroit
Owner Author

faroit commented Nov 8, 2020

Am I missing something? I'm testing with the latest changes and re-installed stempeg via pip install .

@axeldelafosse looks like your repo doesn't contain the default_metadata.json... it's part of the MANIFEST so it should work... 🤷

Here is simple colab notebook to test that https://colab.research.google.com/drive/1cuTrBnjuBWANiW_fnseT1pfhPzlcGigX?usp=sharing
Let me know if you find the issue.
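
For context, the TypeError in the traceback above comes from iterating over `stems_metadata` while it is `None`. A defensive variant of that expression could look like the sketch below; `stem_names_from_metadata` is a hypothetical helper and not the actual code in stempeg/write.py.

```python
def stem_names_from_metadata(stems_metadata):
    """Build the list of stem names, failing early if metadata is missing."""
    if stems_metadata is None:
        # Fail with a clear message instead of "'NoneType' object is not iterable".
        raise ValueError(
            "stems_metadata is None; the default metadata could not be loaded"
        )
    return ["Mix"] + [d["name"] for d in stems_metadata]
```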

@faroit
Owner Author

faroit commented Nov 8, 2020

@mmoussallam @pseeth this is ready for a review.

@axeldelafosse

@faroit I have the default_metadata.json so I don't know what's the issue... Anyway. Thanks for the colab. It works! Great job.

@faroit
Owner Author

faroit commented Nov 8, 2020

@faroit I have the default_metadata.json so I don't know what's the issue... Anyway. Thanks for the colab. It works! Great job.

@axeldelafosse Can you try with a clean environment? Also can you run the unit tests?

@faroit
Owner Author

faroit commented Nov 18, 2020

Ping @mmoussallam

@mmoussallam
Contributor

Hi @faroit

Thanks for this and great work. I'll hopefully find some time next week to look at it carefully.

@faroit
Owner Author

faroit commented Nov 19, 2020

@mmoussallam great. This will also be used for the next version of open-unmix so it would be great to have this unblocked soon ;-)

faroit and others added 2 commits November 23, 2020 21:42
stems writer was updated

Co-authored-by: Moussallam <manuel.moussallam@gmail.com>
@faroit
Owner Author

faroit commented Nov 23, 2020

@mmoussallam thanks for the checks, these are corrected now. Did you check out deezer/spleeter#357 to see if the new stempeg API could be useful in spleeter? If there are minor things to be changed later, that's fine as long as the API looks good to you. Let me know if this can be merged then.

@mmoussallam
Contributor

Hi @faroit

Sorry it took me some time to finish reviewing this. It all seems good to me. Congrats on the rework, I think the API looks really great now!

I'm planning on doing some tests on the spleeter integration later this week.

@faroit
Owner Author

faroit commented Nov 24, 2020

I'm planning on doing some tests on the spleeter integration later this week.

@mmoussallam sounds great. Let me know if there is anything left to do. Now let's create some REAL stems! ;-)

@faroit faroit merged commit ed80514 into master Nov 24, 2020
@faroit faroit deleted the add_new_ffmpegprocess branch January 31, 2021 23:04