
24 bit sample support #301

Merged
merged 19 commits on Dec 14, 2017

Conversation

derselbst (Member)

This adds support for soundfonts with 24 bit samples. It has been implemented by adding a separate pointer data24 to fluid_sample_t that points to the lower byte counterpart of each sample. This pointer can be NULL if a soundfont only has 16 bit samples. Implementing it this way ensures two important aspects:

  1. No use case related to the public fluid_sample_t struct is broken, and
  2. memory consumption stays the same when dealing with the usual 16 bit SF2s.

Putting the 16 bit samples together with a potentially present 8 bit counterpart to create a 24 bit sample is done as late as possible, specifically in the interpolation routines in rvoice_dsp. But even if there is no 8 bit counterpart, we now always use 24 bit samples (in fact int32_t) for mixing in rvoice_dsp (in that case the lower byte is assumed to be zero, see fluid_rvoice_get_sample()).
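
To illustrate the idea, here is a rough sketch of the struct layout (not the literal header; everything except data and data24 is omitted):

    /* Sketch only: the existing 16 bit pointer stays as-is, the new optional
       pointer holds the lower byte of each sample, or NULL for 16 bit SF2s. */
    struct _fluid_sample_t
    {
        /* ... existing fields (name, start, end, loop points, ...) ... */
        short *data;    /* upper 16 bits of each sample */
        char  *data24;  /* optional lower 8 bits, NULL if the soundfont has no sm24 data */
    };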

This leads to the interesting question: Is there even any gain in precision when using 24 bit samples?

The synthesizing pipeline looks as follows:

  1. Read integer sample data from the soundfont
  2. Access those integer samples in rvoice_dsp during interpolation and create a 32 bit integer sample from them (with a value range of [-2^23 ; 2^23 - 1])
  3. Implicitly convert these 32 bit integer samples to floating point samples upon interpolating
  4. Do the interpolation calculation
  5. Scale the floating point samples by 1 / INT24_MAX to bring them into the normalized range [-1.0 ; 1.0)

The only potential precision loss due to 32 bit integers that I can see could occur at step 3. However, even with single precision floating point we can accurately represent integer values in the range [-2^24 ; 2^24] when promoting them to float. The 32 bit integer samples are guaranteed to be in that range, so there should be no rounding errors that weren't there before.
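
A tiny standalone check of that claim (not part of the PR, just an illustration):

    #include <stdio.h>
    #include <stdint.h>

    int main(void)
    {
        int32_t in_range  = (1 << 24);      /* 2^24 = 16777216, exactly representable as float */
        int32_t out_range = (1 << 24) + 1;  /* first integer a float can no longer hold exactly */

        /* prints "16777216.0 16777216.0": the second value was rounded */
        printf("%.1f %.1f\n", (float)in_range, (float)out_range);
        return 0;
    }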

I tested the synthesized output with the OmegaGMGS2 soundfont. As expected, I could not hear any difference. However, the resulting 24 bit waveform is slightly different from the one synthesized with fluidsynth 1.1.8. Also, as far as I can tell, there doesn't seem to be any measurable drop in performance.

Any feedback welcome.

@derselbst derselbst added this to the 2.0 milestone Dec 9, 2017
@derselbst derselbst self-assigned this Dec 9, 2017
@jjceresa (Collaborator) commented Dec 9, 2017

I tested the synthesized output with the OmegaGMGS2 soundfont. As expected, I could not hear any difference.

I suppose the test was done using an audio card with a 24 bit DAC?
Even in this case, it is difficult to hear a difference between 16 bits and 24 bits. 16 bit samples provide 96 dB of dynamic range, which is already enormous for our ears.
The only thing I see is that with 24 bits the quantization noise is 48 dB lower than with 16 bits. Quantization noise produces an unwanted harmonic distortion that is sometimes audible (because it is periodic). Dithering is often used to cancel this harmonic distortion. Currently, dithering is used for 16 bits. Maybe with 24 bits dithering will not be necessary anymore?

However the resulting 24 bit waveform is slightly different from the one synthesized with fluidsynth 1.1.8.

What do you mean?

Also as far as I can tell there doesnt seem to be any measurable drop in performance.

Do you mean, lost in computation time ?


I suppose the test was done using an audio card with a 24 bit DAC?

Good point, I can't even tell right now.

Even in this case, it is difficult to hear a difference between 16 bits and 24 bits.

Yes, that's why I was expecting none; otherwise it would probably have meant that I had introduced some strange bug.

Maybe with 24 bits dithering will not be necessary anymore?

Not sure. But since most soundfonts are only 16 bit, dithering will still be useful.

What do you mean?

See the attachment. Comparing the two files in e.g. Audacity you can see a slight difference for some samples, so at least it seems to have some effect.
sm24_test.zip

Do you mean, lost in computation time?

I don't think "lost" is the correct word in this context, but yes, I was referring to computation time, which stayed the same.

@jjceresa (Collaborator) commented Dec 11, 2017

Comparing the two files you can see a slight difference for some samples, so at least it seems to have some effect.

Using Sonic Visualiser the two files look similar.
Of course, I suppose the two files were built using the file rendering functionality of FluidSynth.

In the two files, the absolute maximum level is around 0.2 (normalized to 1.0f). For example, let us compare the same sample number near this level (i.e. sample number 4581140 at time 1:43:880).
The result is:

  • Test16:left 0.1873 (-7.274 dB) -- Test24: left 0.1865 (-7.292 dB) difference: 0.18 cB!
  • Test16:right 0.1532 (-8.147 dB) -- Test24: right 0.1508 (-8.216 dB) difference: 0.69 cB!

The average difference is 0.45 cB (i.e. 0.6 %), so this is an inaudible difference.


I was referring to computation time which stayed the same.

Please, how was the time measurement done, and which interpolation method was used?

@mawe42 (Member) commented Dec 11, 2017

If you load both files in Audacity, invert one of them and mix them together, you get the difference. It does look like the changes are minuscule and not audible.

@mawe42 (Member) commented Dec 11, 2017

Also, as far as I can tell, there doesn't seem to be any measurable drop in performance.

I think it would be great to add more instrumentation (like jjc's profiling patch) to objectively measure performance. And maybe keep a wiki page updated with the numbers for different platforms for each release. That way we can be sure that there are no performance regressions.

@derselbst (Member, Author)

I just noticed that the demo files I provided used the default s16 for the "audio.file.format" setting, so that waveform analysis was quite meaningless, I guess.

And maybe keep a wiki page updated with the numbers for different platforms for each release.

Even if we manage to set up a fully automated test suite, I don't think gathering that performance information across multiple platforms is something we can keep up in the long term.

Anyway, here is the demo program I used:
benchmark.txt
Compiled with gcc benchmark.c -lfluidsynth -o ben, libfluidsynth.so built with RelWithDebInfo.
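
For reference, the gist of such a benchmark could look roughly like the sketch below (the attached benchmark.txt is authoritative; this version simply renders the MIDI file as fast as possible through fluid_synth_write_float and prints the elapsed CPU time):

    /* build: gcc benchmark.c -lfluidsynth -o ben */
    #include <stdio.h>
    #include <time.h>
    #include <fluidsynth.h>

    int main(int argc, char *argv[])
    {
        float left[1024], right[1024];
        fluid_settings_t *settings = new_fluid_settings();
        fluid_synth_t *synth = new_fluid_synth(settings);
        fluid_player_t *player = new_fluid_player(synth);
        clock_t start;

        if (argc < 3)
        {
            fprintf(stderr, "usage: ben soundfont.sf2 file.mid\n");
            return 1;
        }

        fluid_synth_sfload(synth, argv[1], 1);
        /* e.g. 4th order interpolation; change this to measure the other methods */
        fluid_synth_set_interp_method(synth, -1, FLUID_INTERP_4THORDER);

        fluid_player_add(player, argv[2]);
        fluid_player_play(player);

        start = clock();
        /* pull audio out of the synth as fast as possible until the song ends */
        while (fluid_player_get_status(player) == FLUID_PLAYER_PLAYING)
        {
            fluid_synth_write_float(synth, 1024, left, 0, 1, right, 0, 1);
        }
        printf("%f ms\n", (clock() - start) * 1000.0 / CLOCKS_PER_SEC);

        delete_fluid_player(player);
        delete_fluid_synth(synth);
        delete_fluid_settings(settings);
        return 0;
    }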

And these are my test results, depending on the interp. method used:

Test with OmegaGMGS2.sf2

./ben OmegaGMGS2/OmegaGMGS2.sf2 OmegaGMGS2/Demo\ Midi\ Files/Video\ Game/Zelda/DARKWRL2.MID

fluidsynth-master (without sm24)

interp = 0 : 3983.905100 ms
interp = 1 : 4734.673700 ms
interp = 4 : 5635.398300 ms
interp = 7 : 8316.970300 ms

fluidsynth-1.1.8

interp = 0 : 4091.934600 ms
interp = 1 : 4866.325400 ms
interp = 4 : 5733.814400 ms
interp = 7 : 8409.062700 ms

fluidsynth-sm24 (this PR)

interp = 0 : 3987.057500 ms
interp = 1 : 4380.970100 ms
interp = 4 : 5094.222800 ms
interp = 7 : 6235.854200 ms

Test with FluidR3_GM.sf2

./ben /usr/share/sounds/sf2/FluidR3_GM.sf2 OmegaGMGS2/Demo\ Midi\ Files/Video\ Game/Zelda/DARKWRL2.MID

fluidsynth-master (without sm24)

interp = 0 : 4220.572200 ms
interp = 1 : 5006.859700 ms
interp = 4 : 5940.045100 ms
interp = 7 : 8673.536500 ms

fluidsynth-1.1.8

interp = 0 : 4422.360800 ms
interp = 1 : 5208.928800 ms
interp = 4 : 6130.983500 ms
interp = 7 : 8904.197700 ms

fluidsynth-sm24 (this PR)

interp = 0 : 4201.633800 ms
interp = 1 : 4664.054000 ms
interp = 4 : 5289.579800 ms
interp = 7 : 6197.322800 ms

I'm not quite sure why sm24 is even faster now; I didn't experience that two days ago...

@jjceresa (Collaborator)

If you load both files in audacity, invert one of them and mix them together you get the difference.

Good point. I used another available tool because I don't have Audacity installed yet.

In fact, the differences are mainly in small amplitude samples (for 16 bits the smallest level is 1/32768.0f against 1/8388608.0f for 24 bits, so the worst case difference could be -48 dB).
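
(For reference, that figure follows from the ratio of the two step sizes: (1/32768) / (1/8388608) = 1/256, and 20·log10(1/256) ≈ -48.2 dB.)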


I think it would be great to add more instrumentation like the profiling patch...

As an example, I have used this patch to compare different reverb CPU loads. It was possible to see a notable difference between code implemented inline and the same code implemented with macros.

Another example: using a Raspberry Pi 2, it was possible to see that when playing 50 voices the result was:

 ------ cpu loads(%)(sample rate:44100 Hz) and estimated maximum voices ------
  nVoices |total(%) |voices(%) |reverb(%) |chorus(%)  | voice(%)|estimated maxVoices
        50|    73.44|     38.25|     27.90|      7.29 |     0.73|    137

On this hardware the reverb CPU load is high, 27.90% (probably due to the lack of a floating point unit?).
When used properly, the profiling patch gives precise data. I don't regret the time it took to build it.

@jjceresa (Collaborator) commented Dec 12, 2017

To get a behaviour similar to this sm24 PR, I have implemented fluid_rvoice_get_sample() and fluid_voice_calculate_gain_amplitude().

These are interesting test results obtained with the profiling commands (interpolation method 4 only).

  • Test 1: without the sm24 PR behaviour.
  • Test 2: with fluid_rvoice_get_sample() and fluid_voice_calculate_gain_amplitude().
  • Test 3: like Test 2, but with the functions implemented inline.

The measurement conditions:

  • interpolation method 4 only.
  • 1 CPU core.
  • chorus, reverb: off
  • number of voices generated: 250
  • Duration of the measurement: 1 second.

Definitions:

  • cpu loads (%) are the sample processing time (in percent) relative to the current sample period.
  • nVoices the number of generated voices (here 250).
  • voices(%) the cpu load of all voices (here 250).
  • reverb(%) the cpu load of reverb only.
  • chorus(%) the cpu load of chorus only.
  • voice(%): the cpu load of only one voice.
  • estimated maxVoices gives the maximum number of voices the current hardware is capable of (at 100% cpu load).

In those tests, because we are only interested in the impact on interpolation, only the cpu load of one voice (i.e. voice(%)) is relevant.
For example, a voice(%) value of 0.22 means that processing one voice sample takes 0.22 % of the sample period.
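To put that in absolute terms (assuming the 44100 Hz sample rate shown in the tables): one sample period is 1/44100 s ≈ 22.7 µs, so 0.22 % of it is roughly 50 ns per voice and sample.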

Test 1: without the sm24 PR behaviour

------ cpu loads(%)(sample rate:44100 Hz) and estimated maximum voices ------
  nVoices |total(%) |voices(%) |reverb(%) |chorus(%)  | voice(%)|estimated maxVoices
       250|    40.80|     40.80|      0.00|      0.00 |     0.16|    620

The duration of one voice sample is 0.16% of the sample period.

Test 2: with fluid_rvoice_get_sample() and fluid_voice_calculate_gain_amplitude()

------ cpu loads(%)(sample rate:44100 Hz) and estimated maximum voices ------
  nVoices |total(%) |voices(%) |reverb(%) |chorus(%)  | voice(%)|estimated maxVoices
       250|    55.23|     40.80|      0.00|      0.00 |     0.22|    458

The duration of one voice sample is 0.22% of the sample period. This corresponds to an increase of 35 % compared to Test 1.

Test 3: with fluid_rvoice_get_sample() and fluid_voice_calculate_gain_amplitude() now implemented inline

------ cpu loads(%)(sample rate:44100 Hz) and estimated maximum voices ------
  nVoices |total(%) |voices(%) |reverb(%) |chorus(%)  | voice(%)|estimated maxVoices
       250|    40.92|     40.80|      0.00|      0.00 |     0.16|    620

The duration of one voice sample is 0.16% of the sample period. This is the same result as Test 1.

So, it is worth implementing those functions inline:

  • static inline fluid_real_t fluid_voice_calculate_gain_amplitude().
  • static inline int32_t fluid_rvoice_get_sample().
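
For illustration, a rough sketch of what such inline helpers could look like (hypothetical bodies; the actual code in the PR may differ in names and details, and fluid_real_t is FluidSynth's internal float type):

    /* Sketch: combine the 16 bit sample with its optional lower byte counterpart
       into a 24 bit value; dsp_lsb is NULL for plain 16 bit soundfonts. */
    static inline int32_t
    fluid_rvoice_get_sample(const short *dsp_msb, const unsigned char *dsp_lsb, unsigned int i)
    {
        int32_t msb = dsp_msb[i];
        int32_t lsb = (dsp_lsb != NULL) ? dsp_lsb[i] : 0;
        return msb * 256 + lsb;   /* value range: [-2^23 ; 2^23 - 1] */
    }

    /* Sketch: fold the 1 / INT24_MAX normalization from step 5 of the pipeline
       into the gain, so the mixed output ends up in roughly [-1.0 ; 1.0). */
    static inline fluid_real_t
    fluid_voice_calculate_gain_amplitude(fluid_real_t gain)
    {
        return gain / 8388607.0f;   /* INT24_MAX = 2^23 - 1 */
    }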

@derselbst (Member, Author)

So, it worth to implement those functions inline.

I don't experience any performance boost when marking those functions inline explicitly. What platform are you on? Which compiler and version did you use?

@mawe42 (Member) commented Dec 12, 2017

I don't experience any performance boost when marking those functions inline explicitly.

That "inline" is only a suggestion to the compiler anyway. The function might be inlined even when not marked as inline explicitly, or the compiler might ignore the suggestion. I think there's a -Winline switch for gcc where you get a warning if a function was suggested for inlining but wasn't inlined.

@derselbst (Member, Author) commented Dec 12, 2017

That "inline" is only a suggestion to the compiler anyway.

Yes, and when compiled with RelWithDebInfo, a modern compiler should be smart enough to inline it implicitly. I'm still curious about jjc's compiler version and platform.

@jjceresa (Collaborator)

What platform are you on? What compiler + version did you use?

The OS is Win XP. The CPU is an AMD.
The compiler is Visual C++ 6.0.

I chose the MinSizeRel configuration rather than the Debug configuration, because in Debug inlining is not in effect.
In MinSizeRel the optimisation is minimum size, but inlining is in effect as well.
The equivalent command line options are /O1 (for "minimum size") and /Ob1 (for "expand only functions marked as inline").

with RelWithDebInfo a nowadays compiler should be smart enough to inline it implicitly.

Maybe inline code is not appropriate for Debug?

@derselbst (Member, Author)

Maybe inline code is not appropriate for Debug?

RelWithDebInfo is a release build that enables compiler optimizations and should inline appropriately.

I think the problem is your compiler. Marking functions as inline really helped compilers from the good old 90s, and Visual C++ 6.0 was released in '98. Although I think explicitly marking functions as inline is outdated, in this case it seems to be useful:

  • fluid_voice_calculate_gain_amplitude()
  • fluid_rvoice_get_sample()
@derselbst (Member, Author)

@mawe42 Are you all right with this?

@mawe42 (Member) left a review comment:

Yes, looks good!
