New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
EBU R 128 loudness normalization #1216
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting. Looks mostly good to me, mostly code style changes.
General:
- For consistency with the current code style. Add a newline after the function name, before function code, and at the end of the function code before }. For example.:
void function() {
line1
line2
line3
}
- Add curly braces even on one-line if's.
Can you add libebur128 to the windows dependencies too?
-
https://github.com/strawberrymusicplayer/strawberry-mxe
Copy an existing .mk file in src/ for example taglib.mk and adjust it for libebur128 -
https://github.com/strawberrymusicplayer/strawberry-msvc-dependencies/blob/master/.github/workflows/build.yml
Add libebur128 to build.yml between libbs2b and ffmpeg.
Thanks for taking a look!
Yeah, i knew this was coming. I have initially tried to follow my normal path of
strawberrymusicplayer/strawberry-mxe#235 and strawberrymusicplayer/strawberry-msvc-dependencies#182. |
@jonaski thank you for taking a look! I've tried to hopefully address review nits, but i suspect i only handled them in the letter, not the spirit, |
I'm not sure if the windows failures are because the builds started |
There is a check that all libraries get's added to the windows installer that currently fails now. |
I think you can ignore the codacy errors, I don't see any settings for configuring what to build, so not sure how it works. |
Aha! Looks like i didn't search for that hard-enough, |
There need to be a |
Unless i messed up the benchmark (https://godbolt.org/z/xKTfMn9x5), I'm not sure how many frames we get per $ ./ebur128_benchmark
|
The most juicy bit! This is based on Song Fingerprint Analysis, but here we must know the actual song, and not just the file. The library supports only interleaved S16/S32/F32/F64, so we must be sure we insert `audioconvert` into pipeline. One point of contention here for me, is whether we should feed the frames to the library the moment we get them in `NewBufferCallback`, or collect them in a buffer and pass them all at once. I've gone with the former, because it seems like that is not the worst choice: strawberrymusicplayer#1216 (comment) In principle, the analysis *could* fail, so we want to handle that gracefully.
@jonaski thank you for nits! Hopefully addressed. |
The most juicy bit! This is based on Song Fingerprint Analysis, but here we must know the actual song, and not just the file. The library supports only interleaved S16/S32/F32/F64, so we must be sure we insert `audioconvert` into pipeline. One point of contention here for me, is whether we should feed the frames to the library the moment we get them in `NewBufferCallback`, or collect them in a buffer and pass them all at once. I've gone with the former, because it seems like that is not the worst choice: strawberrymusicplayer#1216 (comment) In principle, the analysis *could* fail, so we want to handle that gracefully.
The most juicy bit! This is based on Song Fingerprint Analysis, but here we must know the actual song, and not just the file. The library supports only interleaved S16/S32/F32/F64, so we must be sure we insert `audioconvert` into pipeline. One point of contention here for me, is whether we should feed the frames to the library the moment we get them in `NewBufferCallback`, or collect them in a buffer and pass them all at once. I've gone with the former, because it seems like that is not the worst choice: strawberrymusicplayer#1216 (comment) In principle, the analysis *could* fail, so we want to handle that gracefully.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good except for some minor pointer alignment style issues. Thanks for addressing these minor issues.
Pointer alignments should be right, except where there is no variable name and in casts. I'll check with clang-format community if it's possible to create an option for this.
Other than the pointer alignments, clang format is also removing newlines at the end of the function, which I didn't find an option to leave as is.
The most juicy bit! This is based on Song Fingerprint Analysis, but here we must know the actual song, and not just the file. The library supports only interleaved S16/S32/F32/F64, so we must be sure we insert `audioconvert` into pipeline. One point of contention here for me, is whether we should feed the frames to the library the moment we get them in `NewBufferCallback`, or collect them in a buffer and pass them all at once. I've gone with the former, because it seems like that is not the worst choice: strawberrymusicplayer#1216 (comment) In principle, the analysis *could* fail, so we want to handle that gracefully.
The most juicy bit! This is based on Song Fingerprint Analysis, but here we must know the actual song, and not just the file. The library supports only interleaved S16/S32/F32/F64, so we must be sure we insert `audioconvert` into pipeline. One point of contention here for me, is whether we should feed the frames to the library the moment we get them in `NewBufferCallback`, or collect them in a buffer and pass them all at once. I've gone with the former, because it seems like that is not the worst choice: strawberrymusicplayer#1216 (comment) In principle, the analysis *could* fail, so we want to handle that gracefully.
Thank you!
My main points is that doing so manually is ..., very much not 202x-like, |
Again, somewhat pretty similar to the existing fingerprint analysis, we must support performing it both for the new files, and re-performing it on (some of) already-existing songs, because it might have been disabled before. Admittedly, i quite don't like some of this code, maybe this can be done in a more concise way. NOTE: this only supports scanning each separate songs. Should we ever want to support per-album loudness normalization, this will need massive changes...
This improves the performance of the analysis (by 2x!), by offloading non-`libebur128`-computations (i.e. decode-convert) to a separate thread, thus reducing walltime.
The idea is that Integrated Loudness is an integral part of the song, much like knowing it's beginning / ending in the file, and we must handle it the exact same way, and pipe it through all the way. At the same time, `EngineBase` knows Target Level (from settings), and these two combined tell us the Gain needed to normalize the Loudness of the particular Song (`EngineBase::Load()` does that). So the actual backend only needs to handle the Volume. We don't currently support changing Target Level on the fly. We don't currently support changing Loudness-normalizing Gain on the fly. This does not handle the case when the song is loaded from URL and thus the EBU R 128 measures, that exist, are not nessesairly correct.
…peline This is a bit of a gotcha, there should be a point (where we seek?) where we'd be able to change said gain, but for now this is a simple[r] stop-gap fix.
The magic: if EBU R 128 loudness normalization is enabled, just insert `volume` GST element into the pipeline (where ReplayGain would be inserted) and configure it. We currently don't support changing said gain after the pipeline was created. We might need to, though, for a number of reasons.
Loudness measurement is channel-dependent. This perhaps matters most for mono audio.
…rmalizing gain Ok, it does appear that it is that simple. In principle this (even the non-update case) results in volume jumps, so maybe we'll want gradual gain change... Notably, i thought we'd always seek if the pipeline was already operating on the same URL as the new one, but apparently only for adjacent songs?
The most juicy bit! This is based on Song Fingerprint Analysis, but here we must know the actual song, and not just the file. The library supports only interleaved S16/S32/F32/F64, so we must be sure we insert `audioconvert` into pipeline. One point of contention here for me, is whether we should feed the frames to the library the moment we get them in `NewBufferCallback`, or collect them in a buffer and pass them all at once. I've gone with the former, because it seems like that is not the worst choice: #1216 (comment) In principle, the analysis *could* fail, so we want to handle that gracefully.
@jonaski thank you! |
Ok, some observations after some use:
|
#1238 as a half-measure.
https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/5063
|
The current limit is `x10`, which allows just `+20 dB` of gain. While it may seem sufficient, this came up as a problem in a real-world, non-specially-engineered situation, in strawberry's EBU R 128 loudness normalization. (strawberrymusicplayer/strawberry#1216) There is an audio track (that was not intentionally engineered that way), that has integrated loudness of `-38 LUFS`, and if we want to normalize it's loudness to e.g. `-16 LUFS`, which is a very reasonable thing to do, we need to apply gain of `+22 dB`, which is larger than `+20 dB`, and we fail... I think it should allow at least `+96 dB` of gain, and therefore should be at `10^(96/20) ~= 63096`. But, i don't see why we need to put any specific restriction on that parameter in the first place, so let's just remove it?
The current limit is `x10`, which allows just `+20 dB` of gain. While it may seem sufficient, this came up as a problem in a real-world, non-specially-engineered situation, in strawberry's EBU R 128 loudness normalization. (strawberrymusicplayer/strawberry#1216) There is an audio track (that was not intentionally engineered that way), that has integrated loudness of `-38 LUFS`, and if we want to normalize it's loudness to e.g. `-16 LUFS`, which is a very reasonable thing to do, we need to apply gain of `+22 dB`, which is larger than `+20 dB`, and we fail... I think it should allow at least `+96 dB` of gain, and therefore should be at `10^(96/20) ~= 63096`. But, i don't see why we need to put any specific restriction on that parameter in the first place, other than the fact that the fixed-point multiplication scheme does not support volume larger than 15x-ish. So let's just implement a floating-point fall-back path that does not involve fixed-point multiplication and lift the restriction altogether?
The current limit is `x10`, which allows just `+20 dB` of gain. While it may seem sufficient, this came up as a problem in a real-world, non-specially-engineered situation, in strawberry's EBU R 128 loudness normalization. (strawberrymusicplayer/strawberry#1216) There is an audio track (that was not intentionally engineered that way), that has integrated loudness of `-38 LUFS`, and if we want to normalize it's loudness to e.g. `-16 LUFS`, which is a very reasonable thing to do, we need to apply gain of `+22 dB`, which is larger than `+20 dB`, and we fail... I think it should allow at least `+96 dB` of gain, and therefore should be at `10^(96/20) ~= 63096`. But, i don't see why we need to put any specific restriction on that parameter in the first place, other than the fact that the fixed-point multiplication scheme does not support volume larger than 15x-ish. So let's just implement a floating-point fall-back path that does not involve fixed-point multiplication and lift the restriction altogether?
The current limit is `x10`, which allows just `+20 dB` of gain. While it may seem sufficient, this came up as a problem in a real-world, non-specially-engineered situation, in strawberry's EBU R 128 loudness normalization. (strawberrymusicplayer/strawberry#1216) There is an audio track (that was not intentionally engineered that way), that has integrated loudness of `-38 LUFS`, and if we want to normalize it's loudness to e.g. `-16 LUFS`, which is a very reasonable thing to do, we need to apply gain of `+22 dB`, which is larger than `+20 dB`, and we fail... I think it should allow at least `+96 dB` of gain, and therefore should be at `10^(96/20) ~= 63096`. But, i don't see why we need to put any specific restriction on that parameter in the first place, other than the fact that the fixed-point multiplication scheme does not support volume larger than 15x-ish. So let's just implement a floating-point fall-back path that does not involve fixed-point multiplication and lift the restriction altogether?
The current limit is `x10`, which allows just `+20 dB` of gain. While it may seem sufficient, this came up as a problem in a real-world, non-specially-engineered situation, in strawberry's EBU R 128 loudness normalization. (strawberrymusicplayer/strawberry#1216) There is an audio track (that was not intentionally engineered that way), that has integrated loudness of `-38 LUFS`, and if we want to normalize it's loudness to e.g. `-16 LUFS`, which is a very reasonable thing to do, we need to apply gain of `+22 dB`, which is larger than `+20 dB`, and we fail... I think it should allow at least `+96 dB` of gain, and therefore should be at `10^(96/20) ~= 63096`. But, i don't see why we need to put any specific restriction on that parameter in the first place, other than the fact that the fixed-point multiplication scheme does not support volume larger than 15x-ish. So let's just implement a floating-point fall-back path that does not involve fixed-point multiplication and lift the restriction altogether? Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/5063>
Hurray, GStreamer/gstreamer@8b1500d!
|
The current limit is `x10`, which allows just `+20 dB` of gain. While it may seem sufficient, this came up as a problem in a real-world, non-specially-engineered situation, in strawberry's EBU R 128 loudness normalization. (strawberrymusicplayer/strawberry#1216) There is an audio track (that was not intentionally engineered that way), that has integrated loudness of `-38 LUFS`, and if we want to normalize it's loudness to e.g. `-16 LUFS`, which is a very reasonable thing to do, we need to apply gain of `+22 dB`, which is larger than `+20 dB`, and we fail... I think it should allow at least `+96 dB` of gain, and therefore should be at `10^(96/20) ~= 63096`. But, i don't see why we need to put any specific restriction on that parameter in the first place, other than the fact that the fixed-point multiplication scheme does not support volume larger than 15x-ish. So let's just implement a floating-point fall-back path that does not involve fixed-point multiplication and lift the restriction altogether? Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/5063>
I'm happy that i have finally stumbled upon this software!
Finally, a audio player that supports CUE's.
One pain point remaining for me, is audio volume.
While there is Replay Gain support, i'm not sure it cuts it for my taste,
for a number of reasons:
loudgain
reuses them for non-RG values.EBU R 128
This PR implements support for EBU R 128-based loudness normalization (to the specified Target Loudness)
based on an EBU R 128 song analysis, performed the same way the song fingerprinting is done,
automatically, without needing to touch the source files in any way.
Note that analysis, and normalization, are separate steps,
and normalization does not require the presence of
libebur128
,while analysis does.
Two things i want to call out explicitly that this does not deal with are:
While i have thought extensively about that, at least currently nothing is done for that.
First, with any luck, all of these filters are done in
FP32
(might be good to force that?)because that is what the audio playback consumes, so clipping should not be an issue anyway.
Secondly, i'm not sure where we'd want to clip, presumably after all filters.
But clipping is lossy, and is unoptimal to those who further process (e.g. via
easyeffects
) audio.In my limited testing, this is a working proof of concept,
although some further changes may be warranted.
I've hopefully did sufficiently granular commits.
Thoughts? :)