Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New waveforms have much higher visual gain than legacy ones #12448

Closed
fwcd opened this issue Dec 18, 2023 · 25 comments
Closed

New waveforms have much higher visual gain than legacy ones #12448

fwcd opened this issue Dec 18, 2023 · 25 comments

Comments

@fwcd
Copy link
Member

fwcd commented Dec 18, 2023

Bug Description

Screen.Recording.2023-12-18.at.21.32.45.mov

Is this intended?

cc @m0dB

Version

2.5-alpha-164-g6acfe63a39 (HEAD)

OS

macOS 14.2

@uklotzde
Copy link
Contributor

Same here after merging the latest changes from 2.4/main:

image

Doesn't look as before.

@uklotzde
Copy link
Contributor

For comparison the same waveform rendered with an older 2.4 beta version:

image

If this occurs for the current 2.4 version than it should be a release blocker!

@fwcd
Copy link
Member Author

fwcd commented Dec 18, 2023

Did a quick bisect over the main snapshots, the last good is 2c2e706, the first bad 0f0deb1, here's the diff. I assume it is this PR:

@m0dB
Copy link
Contributor

m0dB commented Dec 19, 2023

Well, either this is the bug or #12115 is the bug.

My changes were to have the colored waveforms have the same amplitude as the simple waveform. As reported in 12115 by @ronso0 , it was strange that that wasn't the case.

You will also find my analysis there that shows that the combined high-mid-low is not the same as all.

Anyway, I am fine with either approach but note that there are now two contradicting bug reports.

@JoergAtGithub
Copy link
Member

My expectation as user is, that 100% amplitude is the same as the threshold for the clipping LED in the VU-Meter - accross all waveform types.

@m0dB
Copy link
Contributor

m0dB commented Dec 19, 2023

I agree that that's the reasonable expectation. Now, why the waveform
data filtered.all exceeds 100%, I have no idea. I can look into that but not today. But maybe somebody knows?

In any case, if we want the same waveform amplitude across waveform types, which I think makes sense, the current glsl waveform types are correct.

@fwcd
Copy link
Member Author

fwcd commented Dec 20, 2023

I agree, the current waveforms look a bit "clipped", despite not extending to the full height. For example, here's Thunder by Imagine Dragons using the current GLSL waveforms and a visual gain of 1.0:

image

There's certainly compression going on, but it doesn't feel that heavy. Compare the legacy waveforms:

image

Legacy waveforms with a visual gain of 2.0:

image

The preview doesn't look as chunky as the current GLSL waveforms either:

image

@m0dB
Copy link
Contributor

m0dB commented Dec 25, 2023

I have been investigating this and I have come to the following conclusion:

  • As discussed and shown with plots in bug-report Why do waveform types have different visual gains? #12115 , the waveform amplitude calculated by combining the low + mid + high components is different from the overall waveform amplitude. This issue is addressed in make scaling of GLSL RGB and RGB L/R waveform amplitudes consistent with simple waveform #12205 , which scales the combined component amplitude to the overall amplitude, this ensuring that the simple waveform and the coloured waveforms have the same shape.

  • As the waveform amplitude calculated by combining the low + mid + high components is different from the overall waveform amplitude, that change means that the waveform looks slightly different. But I stand behind it, it is more correct than the combined low + mid + high components.

  • The waveform is scaled by a combination of the visual gain, the pre-fader gain and the ReplayGain

  • Now, when logging, I noticed that the applied scaling was 2x what I expected. You can see this easily: calculate the amplitude form of the ReplayGain (as shown track properties): pow(10.0, -replayGain / 20.0), and use this as the visual gain: The waveform goes out of the screen. Use half, and the peak waveform has the expected height.

  • I traced this down to the following line in src/waveform/renderers/waveformwidgetrenderer.cpp:

    // This gain adjustment compensates for an arbitrary /2 gain chop in
    // EnginePregain. See the comment there.
    m_gain = m_pGainControlObject->get() * 2;

I think this is wrong. Also I don't see any relevant comment in EnginePregain. So maybe this "arbitrary /2 gain chop" was fixed there, but this gain adjustment compensation was not removed?

If you agree this is indeed wrong, I will create a PR to remove this * 2 multiplication.

@m0dB
Copy link
Contributor

m0dB commented Dec 25, 2023

Hmmm, investigating a bit more, there is definitely something strange with the overall amplitude data.

Top is combined sqrt(filtered.low^2+ filtered.mid^2+ filtered.height^2)

Middle is filtered.all (aka simple)

Bottom is lines from abs(peakR) to -(abs(peakL), with exact number of samples per line, using the recording output from mixxx.

Maybe filtered.all is not simply the peaks?

Screenshot 2023-12-25 at 18 33 48

@m0dB
Copy link
Contributor

m0dB commented Dec 25, 2023

This is with at the top pow(filtered.all, 2.0), which seems to result in the better match than filtered.all. I am going to look at the analyser code to see if can understand why.

Screenshot 2023-12-25 at 18 58 34

@m0dB
Copy link
Contributor

m0dB commented Dec 25, 2023

Found it:

 20 inline CSAMPLE scaleSignal(CSAMPLE invalue, FilterIndex index = FilterCount) {
 21     if (invalue == 0.0) {
 22         return 0;
 23     } else if (index == Low || index == Mid) {
 24         //return std::pow(invalue, 2 * 0.5);
 25         return invalue;
 26     } else {
 27         return std::pow(invalue, 2.0f * 0.316f);
 28     }           
 29 }                       
 30       

Does anybody know the reasoning behind this code?

@m0dB
Copy link
Contributor

m0dB commented Dec 25, 2023

So, if I undo that scaling I get what I expect:

Screenshot 2023-12-25 at 20 21 49

I will create a PR, but it would be great if somebody with more insight in the waveform analyser and ReplayGain code code shed their light on this...

@m0dB m0dB mentioned this issue Dec 25, 2023
@Swiftb0y
Copy link
Member

Does anybody know the reasoning behind this code?

investigating using the commit history pointed me to this:
#6352

@m0dB
Copy link
Contributor

m0dB commented Dec 27, 2023

Thanks for digging that up. So from reading those comments it looks like the pow(invalue, 2.0f * 0.316f) scaling is intentional. I think it was a bad idea to do that in the analyzer code rather than in the waveform render code. It would be much nicer to have options for the waveform rendering (e.g. scaling: "linear" "low-level-enhanced") and the code would have been less confusing too.

Anyway, I will put the unscaling code behind a condition. At least the next time someone wonders why the waveforms look differently in mixxx than in any audio editor it will be more clear.

@m0dB
Copy link
Contributor

m0dB commented Dec 27, 2023

We still need an explanation for that strange *2 scaling to compensate for the "arbitrary /2 gain chop" though...

@Swiftb0y
Copy link
Member

I think it was a bad idea to do that in the analyzer code rather than in the waveform render code.

I agree. probably was just easier back then... mixxx has always been a pile of hacks ontop of each other.

I'd highly appreciate if that was getting moved to the waveforms. making it optional could be done later. The only issue I see then is that scaling would get applied twice (once already in the cached waveform data and once in the render)...

@m0dB
Copy link
Contributor

m0dB commented Dec 28, 2023

Well, the problem is that this would require reanalysing all tracks. Or maybe there is a versioning mechanism?

Also, maintaining all the different waveform types (or accepting they diverge) will be a pain. Post 2.4 I would really like to do a big clean-up.

@Swiftb0y
Copy link
Member

Well, the problem is that this would require reanalysing all tracks. Or maybe there is a versioning mechanism?

Unfortunately not that I'm aware of (though there are heuristics somewhere the invalidate cached data such as waveforms info, though idk where). Reanalysing the waveform data is not a big deal IMO since its fairly quick and fortunately doesn't cause the other analyzers to rerun too. The problem is that I don't know how to automatically invalidate the previous waveforms.

Also, maintaining all the different waveform types (or accepting they diverge) will be a pain.

I assume you don't mean the different renderers but the cached analyzer data? If so, then I agree.

@m0dB
Copy link
Contributor

m0dB commented Dec 30, 2023

I assume you don't mean the different renderers but the cached analyzer data? If so, then I agree.

No, I mean that if we decide to "un-scale" the filtered.all data in the waveform renderers, we need to change all of them. Unless we decide we can remove all of them :-)

@Swiftb0y
Copy link
Member

yup, eventually I'd be nice to remove them. IMO since we declared them as legacy anyways, I'd propose to ignore them (feature-wise) and only work on the non-legacy ones.

Fyi, this discussion also reminds me of #11833. Maybe a more general waveform-data preprocessing facility is in order anyways...

@Swiftb0y
Copy link
Member

Swiftb0y commented Jan 2, 2024

is this resolved by #12466?

@fwcd
Copy link
Member Author

fwcd commented Jan 2, 2024

It hasn't made its way to main yet, so I'm afraid my build isn't new enough to tell 😄 Hopefully I'll find some time to build 2.4 soon...

@m0dB
Copy link
Contributor

m0dB commented Jan 3, 2024

Yes. Fixed.

@m0dB m0dB closed this as completed Jan 3, 2024
@fwcd
Copy link
Member Author

fwcd commented Jan 3, 2024

Sorry for being so nitpicky here... the scaling is fixed, but the new waveforms still lack a fair bit of detail. Compare GLSL:

Screenshot 2024-01-03 at 04 10 02

to GLSL (legacy):

Screenshot 2024-01-03 at 04 09 51

The peaks feel a fair bit more accentuated. I remember seeing something about scaling frequency bands differently, perhaps that is somehow related?

(Though this should probably be tracked in a separate issue, since the overall visual gain is fixed)

@mxmilkiib
Copy link
Contributor

mxmilkiib commented Jan 3, 2024

#8406 and #8367 are related, no?

Also, why does changing ReplayGain setting levels not change waveform heights?

Thirdly; I think I mentioned in Zulip maybe previously, but it's hard to tell when a track is clipping past it's own 0dB, or whether is or will clip past the channels 0dB, aka redline, given the current gain(s), or if the waveform calculation is so broken as to chop the top off (as has been discovered and fixed, to a hopefully large degree, though the comment directly prior shows there is still some kind of discrepancy). Imho, because there is no visual 0dB delimitation (for track, or channel, or main gain, or I guess ReplayGain also); it just disappears into black. Displaying in advance when the waveform at the current gain(s) would redline the channel could be done with, say, a red line at top and bottom at those points, but that certainly deserves a separate issue - I'm just musing here because there appears a conflux of multiple things that have been making it really confusing to understand what is literally happening to the signal, and so much ambiguity added together is, to me, fairly scary.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

7 participants