Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Pitch shifting #567

Merged
merged 14 commits into from Jul 9, 2015
Merged

Pitch shifting #567

merged 14 commits into from Jul 9, 2015

Conversation

@jmtd
Copy link
Contributor

@jmtd jmtd commented Jul 1, 2015

This is my initial PR for issue #274. It's useable enough to play with but there's a fair bit of clean-up and assurance to do. I'm opening the PR now so I can get some feedback, and hopefully we can use this space to track what needs to improve in the patch-set.

Unless you object, I'd tend to treat a PR branch as a patch-set that can be rebased. My intention is for each patch to be logically complete, so you can read and understand each patch in isolation, and the game works at any commit. As I get feedback and work more on this, I'll rebase the merge branch accordingly.

jmtd added 3 commits Jun 27, 2015
Some debug code in CacheSFX is #ifdef'd out by default and hadn't
been updated to reflect more recent refactorings in the sound code.
The four game engines and the back-end sound driver need to agree
on what NORM_PITCH is, so define it in i_sound.h, which is the one
header they all have in common. Remove other definitions and use
it in a few places where a bare constant was used instead.
@bradharding
Copy link
Contributor

@bradharding bradharding commented Jul 1, 2015

Just checked this out, integrating the last commit (jmtd@137486c) into Doom Retro, and it sounds great so far! Great work @jmtd!

@@ -390,12 +391,26 @@ static int S_AdjustSoundParams(mobj_t *listener, mobj_t *source,
return (*vol > 0);
}

int clamp(int x)
{
if (x<0)

This comment has been minimized.

@fragglet

fragglet Jul 1, 2015
Member

Please reformat all code to use coding style as defined in HACKING.

@@ -878,13 +927,34 @@ static int I_SDL_StartSound(sfxinfo_t *sfxinfo, int channel, int vol, int sep)
return -1;
}

snd = sfxinfo->driver_data;
if(NULL == (snd = GetAllocatedSoundBySfxInfoAndPitch(sfxinfo, pitch)))

This comment has been minimized.

@fragglet

fragglet Jul 1, 2015
Member

Split into separate statements please. Don't do this.

@@ -390,12 +391,26 @@ static int S_AdjustSoundParams(mobj_t *listener, mobj_t *source,
return (*vol > 0);
}

int clamp(int x)

This comment has been minimized.

@fragglet

fragglet Jul 1, 2015
Member

Should be static and named consistently with other functions:

static int Clamp(int x)
Sint16 *srcbuf, *dstbuf;
Uint32 srclen, dstlen;

dstlen = (insnd->chunk.alen)*((pitch+1)*100/NORM_PITCH)/100; // rounds up

This comment has been minimized.

@fragglet

fragglet Jul 1, 2015
Member

These two lines are messy and hard to follow. Reformat with spaces, improve comments to explain more and don't use end of line comments

This comment has been minimized.

@jmtd

jmtd Jul 1, 2015
Author Contributor

Which two? I guess one is #304. Is the other #.305?

I don't disagree. It's probably the oldest line in the patch-set too. I might try a slightly different approach. It's also the one place at the moment to determine how much the pitch gets shifted, so I might need to calibrate it a bit based on more experiments with vanilla. (I've got some hacked EXEs and IWADs with a tuned sine wave of known pitch and length for all sfx and known RNG table to figure out the extent of the original shifting, I just need to do the recording and comparisons).

if(!newsnd)
{
return -1;
// XXX: unlock the original sound?

This comment has been minimized.

@fragglet

fragglet Jul 1, 2015
Member

XXX should be TODO?

@jmtd
Copy link
Contributor Author

@jmtd jmtd commented Jul 1, 2015

Thanks for the review! Those should be easy to fix. I'll get back to you soon.

@fragglet
Copy link
Member

@fragglet fragglet commented Jul 1, 2015

@jmtd a few things -

  • My initial comments are mostly just formatting, I need to read through in more detail to understand/check the logic here. I understand that you're allocating new sounds for each sound pitch, correct? What's the potential extra memory impact of something like this?
  • I have a feeling this is going to interact really badly with the libsamplerate integration. That works by preconverting all the sound effects at startup, as high quality upsampling can be slow and can cause the game to stall.
  • The biggest thing that seems to be missing here at first glance is a configuration option to turn this on or off. As you've seen from Linguica's comments on Twitter, some people really don't like pitched sound effects. They're also not in Doom v1.9, so I think this should be off by default.
@jmtd
Copy link
Contributor Author

@jmtd jmtd commented Jul 1, 2015

I understand that you're allocating new sounds for each sound pitch, correct? What's the potential extra memory impact of something like this?

There are either 31 or 32 possible pitch variations per sound effect, and I think you've fixed the sfx buffer size to 60M, so worst-case is 32x what you would have with no pitch shifting, with stuff being pushed out of the buffer. But because I link into the allocated_sounds_t list, "hot" sound effects will stay at the head. (need to check whether we promote a "base" sfx when we play a pitched variation: I think so.) I haven't done any instrumentation yet to see what the memory usage looks like with this on.

(another thing I wanted to check was, what was the audible difference between the 31 or 32 possible variations. If you can't detect a difference between one step, but you can for two; then the expansion could practically drop to 16x. But I suspect you can tell. We'll know better when more calibration is done, see below)

I was thinking about memory usage, and I was meaning to check, with the up sampling, are you converting the sfx to stereo? The low-pass filter you've got assumes the sample is stereo so I assumed so (haven't confirmed). If you either a) kept them mono or b) had it optional you'd drop memory usage a lot. Either way this should be an option, not hard-coded on (see below).

I have a feeling this is going to interact really badly with the libsamplerate integration. That works by preconverting all the sound effects at startup, as high quality upsampling can be slow and can cause the game to stall.

I haven't tested with libsamplerate enabled yet, but you upsample via other methods at start-up even with libsamplerate turned off, right? The sound effects are the same bit-depth and sample rate either way?

The biggest thing that seems to be missing here at first glance is a configuration option to turn this on or off. As you've seen from Linguica's comments on Twitter, some people really don't like pitched sound effects. They're also not in Doom v1.9, so I think this should be off by default.

Yes they should definitely be optional. I don't have a preference whether it should be on or off by default. In terms of the Doom most people are used to, then that is likely post-1.2 for the majority of people (my own experience differs a bit; I spent a lot of time with 1.2 SW so pitch shifting is part of my earliest doom experiences, but then I used doom2 1.666 for most of the rest of the time until the source-port era). So if platonic Doom is the most widely experienced, then off by default fits that. However, it's clear that ID intended the effects to be pitched, that this broke by accident and was never fixed, and likely worked for some/most of the doom2 design period (it was broken with 1.4 which was the first EXE that could run Doom 2). So if platonic Doom is what ID intended, then one could argue that it should be on. But as I said, I have no preference :)

Back to memory usage again; I'd probably argue for upping the static sfx buffer if its on, subject to some benchmarking to determine sensible figures, and (if the default was chosen to be on) disabling it for low-power or memory platforms if that turns out to actually be a problem / the performance is not great. But apart from my rockbox MP3 player, I can't think of a platform slow enough to try that out on (even my raspberry pi is unlikely to notice).

@AXDOOMER
Copy link
Contributor

@AXDOOMER AXDOOMER commented Jul 1, 2015

Is it going to be possible to use the sound pitch effect from Doom v1.2 ? What is missing in every source port is an option to use the v1.2 sound pitch. There must be support for the v1.2 sound pitch, because that's what is used in Heretic, but I'd like to be able to use it in Doom.

@jmtd
Copy link
Contributor Author

@jmtd jmtd commented Jul 1, 2015

@AXDOOMER that's the intention of this patch-set, to re-create pitches as they were in Doom 1.2. They were randomly varied up and down from the base pitch. All versions of Heretic and Hexen also varied pitch up and down (but more subtly than Doom 1.2.). At present, chocolate-heretic/hexen do not do this either (although this patch set should eventually resolve that -- I haven't tested them yet though). Other ports that do pitch shifting are Eternity and Strawberry Doom, at least (haven't listened to check how close to 1.2. behaviour they are. I know Classic Doom on PS3 does it, but is more extreme than 1.2 was.)

@fragglet
Copy link
Member

@fragglet fragglet commented Jul 1, 2015

I was thinking about memory usage, and I was meaning to check, with the up sampling, are you converting the sfx to stereo?

They're in stereo yes. I believe this is mandatory because the sample format has to match the output format, but it's possible I'm wrong. If I am wrong and SDL_mixer can play mono samples to a stereo output then we can probably fix that and save a bunch of memory.

I haven't tested with libsamplerate enabled yet, but you upsample via other methods at start-up even with libsamplerate turned off, right?

Not sure what you mean. It only converts each sound effect once at startup. If you enable pitch shifting then the samples cached at startup won't be used. Fixing it requires increasing the startup time by 32x. Maybe it's best to just disable pitch shifting if libsamplerate is being used.

So if platonic Doom is the most widely experienced, then off by default fits that.

I might argue for it to be a trinary option: always on / always off / follow -gameversion setting. We don't yet support 1.2 demo playback but I'd like to in the future and it would make sense that the setting by default should follow what version we're emulating.

For Heretic it should be on by default to match vanilla, with the option to turn it off.

Back to memory usage again; I'd probably argue for upping the static sfx buffer if its on, subject to some benchmarking to determine sensible figures, and (if the default was chosen to be on) disabling it for low-power or memory platforms if that turns out to actually be a problem / the performance is not great. But apart from my rockbox MP3 player, I can't think of a platform slow enough to try that out on (even my raspberry pi is unlikely to notice).

Might be easier said than done. I don't think LibSDL has any API to determine the amount of memory on the system. The lowest spec Raspberry Pis have only 256MB of RAM, half of which can be eaten up by the GPU, so this is quite important.

On a side note, this feature is more evidence that the SDL_mixer-based mixing loop is probably a dead end and we should be doing the mixing ourselves in real time.

@jmtd
Copy link
Contributor Author

@jmtd jmtd commented Jul 2, 2015

They're in stereo yes. I believe this is mandatory because the sample format has to match the output format, but it's possible I'm wrong. If I am wrong and SDL_mixer can play mono samples to a stereo output then we can probably fix that and save a bunch of memory.

I'll investigate and check. If we're lucky, SDL_mixer can handle mixing mono samples into a stereo-configured mixer, since it's technically easy to do compared to adjusting sample rate or bit depth. I also want to confirm Doom's behaviour with stereo patches, that is if it was possible at all with later builds which were more forgiving about sfx format IIRC. I'm thinking in particular about always-centre-balanced sfx, such as your own weapon sounds, where you could conceivably wanted a stereo patch (so the shell ejection noise always was a bit to the left or whatever).

Not sure what you mean. It only converts each sound effect once at startup. If you enable pitch shifting then the samples cached at startup won't be used. Fixing it requires increasing the startup time by 32x. Maybe it's best to just disable pitch shifting if libsamplerate is being used.

I mean, when libsamplerate is not configured, you still upsample at start-up time, just using non-libsamplerate routines. At least that was my reading of the source code. So my pitch-shifter is encountering up sampled, uniform sound data by the time it sees it, but the quality of that up sampling depends on whether libsamplerate was linked in.

I don't think it's necessary to pre-shift the sound effects at start up time as my routine is much faster than a proper resampling, and if there are any platforms for which it was still too slow they are likely to be the ones that couldn't handle an inflated cache size anyway.

I might argue for it to be a trinary option: always on / always off / follow -gameversion setting. We don't yet support 1.2 demo playback but I'd like to in the future and it would make sense that the setting by default should follow what version we're emulating.

For Heretic it should be on by default to match vanilla, with the option to turn it off.

Sounds good to me.

Might be easier said than done. I don't think LibSDL has any API to determine the amount of memory on the system. The lowest spec Raspberry Pis have only 256MB of RAM, half of which can be eaten up by the GPU, so this is quite important.

60M * 32 in the context of 256M/2 is quite a lot, yes. I'm going to do a lot of stats-gathering to see how the cache table performs for certain playback situations. I've mostly been doing vanilla doom and doom2 demo loops at the moment, are there other any particularly good demos/tests/etc you could recommend to check out? (I'll look at commit logs from around when you introduced the sfx cache to see if you mention any there)

On a side note, this feature is more evidence that the SDL_mixer-based mixing loop is probably a dead end and we should be doing the mixing ourselves in real time.

I have no strong opinion on that; SDL_mixer is pretty basic but I don't know how much work it would be to rewrite it. Most of the other doom engines seem to have built their own mixers.

I've done all the style fix-ups you point out above; I just need to rebase and push them. I'm also going to compile and post my grand to-do list to this discussion so I can easily see/tick-off the things that need doing (e.g.:
code-fix-ups
config param
☐ benchmarking for cache size;
☐ test heretic/hexen/strife
…)

@jmtd jmtd force-pushed the jmtd:pitch-shifting branch from 137486c to efbf723 Jul 2, 2015
@jmtd
Copy link
Contributor Author

@jmtd jmtd commented Jul 2, 2015

OK I’ve now pushed the corrections and an initial attempt at a configuration variable. I put a parameter on there too but I’m not sure whether that’s necessary (and it introduces another bug, see below). Here’s my first cut at the TODO list:

first review code-fix-ups
configuration parameter
☐ make parameter a trinary value
☐ benchmarking/tuning for cache size
can we keep sfx lumps as mono and SDL_Mixer stereo? sadly not
☐ test heretic: not audible to me on first listen
☐ test hexen: not audible to me on first listen
☐ test strife (should be off)
☐ calibrate for doom
☐ calibrate for heretic
☐ calibrate for hexen
☐ test with libsamplerate enabled
investigate whether any doom supported stereo samples in any way nope
bug: if pitch configured off and you use -pitchshift, flipped to on in config removed, to push
☐ bug: configuration variable not written to default.cfg (and equivs)
☐ bug: off by default in heretic
☐ bug: off by default in hexen

@fragglet
Copy link
Member

@fragglet fragglet commented Jul 2, 2015

Doom never supported stereo samples as far as I know, and the at-startup upsampling should only take place when using libsamplerate.

I wonder if it's possible to have some scheme where we keep track of the number of allocated pitch-shifted samples and impose a limit where eg. no more than 2-3 are allocated at a time, and we free the old ones when they've finished playing. Then we only need to cache the "base" upsampled sound effect (Maybe this is what you've done already; I haven't got round to looking at the logic in depth yet)

@jmtd
Copy link
Contributor Author

@jmtd jmtd commented Jul 3, 2015

Doom never supported stereo samples as far as I know

OK yes. (the variation I was trying to recall was the variable sample rate)

and the at-startup upsampling should only take place when using libsamplerate.

SDL_Mixer upsamples to the mixer's configuration when the samples are loaded in. This means that sadly yes, the samples have to be converted and stored in memory as stereo, but also they're getting resampled to 44100 rate, 16 bit depth too. Doom has just over 1M of raw sound data in its IWAD (doom2 a little more), but the in memory representation is roughly 20M when up sampled (11050kHz, 8bit, mono -> 44100, 16bit, stereo => *2 * 2 * 4).

The highest sample rate sfx in vanilla doom was 22050, so we could half that by changing the mixer to 22050, but I don't know what the rate limit was in vanilla doom, nor whether a higher-rate was used in notable PWADs. (I suppose we could determine that at run time).

Most modern machines won't care about 20M of RAM for up sampled sfx though. (32x that would be more of an issue ;)

I wonder if it's possible to have some scheme where we keep track of the number of allocated pitch-shifted samples and impose a limit where eg. no more than 2-3 are allocated at a time, and we free the old ones when they've finished playing. Then we only need to cache the "base" upsampled sound effect (Maybe this is what you've done already; I haven't got round to looking at the logic in depth yet)

We could do that, sure. My code generates new allocated_sound_t structures and sticks them into the existing linked list. IF a pitch-varied sound effect is being played, it looks up the "base" pitch version first, so that is also promoted to the top of the list (or should be). Your code already sorts that by number of uses, so rarely used-sound effects are purged first. The same logic applies to pitch shifted samples automatically at the moment.

I'd like to do some more thorough instrumentation of how that is performing before looking at any tweaks to it.

@jmtd
Copy link
Contributor Author

@jmtd jmtd commented Jul 3, 2015

OK I did a little bit of instrumenting this evening. Firstly, I can't get any meaningful difference in execution time with -timedemo on my machine, varying pitch on/off and different sample rates. I might try again on a slower machine (took an rpi home to try).

I also did some memory benchmarking. This is the memory usage in MB on a per-second interval for 44100 (default) and 22050 mixer sample rates, pitch shifting on or off, for Doom registered v1.9 DEMO1

demo1

and Doom 2 v1.9 DEMO3.

d2 demo3

I now want to try a much longer demo, probably with some level transitions, to see what the usage looks like.

@fabiangreffrath
Copy link
Member

@fabiangreffrath fabiangreffrath commented Jul 5, 2015

Excuse me if this is too naive, I have no real idea of the internals of SDL-Mixer. But, shouldn't it be possible to randomly lie to SDL-Mixer about the real sample rate of the sound samples and thus receive a pitch shifting effect when they are mixed together?

@jmtd
Copy link
Contributor Author

@jmtd jmtd commented Jul 6, 2015

@fabiangreffrath it might be possible, but I haven't found a way yet. SDL_Mixer does not store the sample rate of a sample in memory; it assumes that in-memory samples are the same rate as the configured Mixer. I've tried a) opening mixer at u8 mono 11kHz, b) loading the samples, c) closing the mixer and re-opening at s16/stereo 44.1kHz, and the samples play back at roughly 4x as fast. This was an attempt to avoid having to store up-sampled routines in memory and save some of it. We can't open/close the mixer during the game as it would stop all existing music and sound playback.

What might be possible would to to construct a fake in memory WAV file, perhaps a struct representing a WAV header then the Mixer_chunk as the data field, and lie in the WAV header about the sample rate, then use one of the SDL mixer load-from-memory routines to try and load from the WAV structure and see if that fools it. Some variation on Mix_LoadWAV_RW(SDL_RWFromMem(abuf, asize))...

But all this would get you would be an in-memory sample that is effectively pitch-shifted by SDL_Mixer, we'd still need to track it and decide whether or not to free it, and we'd still need the whole sample shifted at once, we can't do it as part of a Mix_RegisterEffect filter. It would essentially get us to where we are already, albeit with SDL Mixer doing the "shifting".

It will be pretty easy to adjust my current patch set to just not cache shifted samples. They are very cheap to recompute at run-time and that would result in no real bump in memory usage with the feature switched on.

@fabiangreffrath
Copy link
Member

@fabiangreffrath fabiangreffrath commented Jul 6, 2015

It would essentially get us to where we are already, albeit with SDL Mixer doing the "shifting".

Yes, it would be more or less what we already have now if we use libsamplerate to cache the sound samples in memory. But instead of caching them as raw PCM data and SDL_Mixer "knowing" their sample rate we would attach a WAV header to them and randomly vary their sample rates by a certain degree each time we play them. Freeing of the samples could be handled just as before.

@jmtd
Copy link
Contributor Author

@jmtd jmtd commented Jul 6, 2015

OK I've just pushed a commit which immediately deletes any pitch-shifted sound effects, assuming their use-count is 0, at stop-time. Here's Doom DEMO1:

demo1_nocache

and Doom 2 DEMO3:

demo3

(in both cases the green line is the current state of this PR).

@jmtd
Copy link
Contributor Author

@jmtd jmtd commented Jul 6, 2015

Hi Fabian,

Yes, it would be more or less what we already have now if we use libsamplerate to cache the sound samples in memory.

I still don't see how libsamplerate impacts this stuff particularly. What am I missing? With libsamplerate disabled, chocolate-doom still reads in and resamples all the sound effects at start-up, just via SDL_Mixer's routines instead of libsamplerate. The sample depth and size are identical by the time they get to PitchShift() in either case.

But instead of caching them as raw PCM data and SDL_Mixer "knowing" their sample rate we would attach a WAV header to them and randomly vary their sample rates by a certain degree each time we play them. Freeing of the samples could be handled just as before.

I'm unsure what problem here you're trying to solve. Simon's concerns seem to be about memory usage, so pitch-shifted sound effects need special handling (which d5ae189 now resolves). Do you think SDL_Mixer's resampling will be either more efficient, or higher quality than the routine in this patch set?

The advantage of the current approach, imho, is we can play with the pitch-shifting code (the core of which is about 5 lines atm) to see whether we can improve the quality without impacting the performance relatively easier. If we try to (ab)use SDL_Mixer we lose that flexibility. I think we'd also be sensitive to any behavioural changes in SDL_Mixer since we'd be using it in an unorthodox fashion.

dstlen = (insnd->chunk.alen) * ((pitch + 1) * 100 / NORM_PITCH) / 100;

// ensure that the new buffer is an even length
if( (dstlen & 1) == 1)

This comment has been minimized.

@fragglet

fragglet Jul 6, 2015
Member

(dstlen % 2) == 0 is more readable.


static int Clamp(int x)
{
if (x < 0)

This comment has been minimized.

@fragglet

fragglet Jul 6, 2015
Member

Still not indented right.

}

outsnd = AllocateSound(insnd->sfxinfo, dstlen);
if(!outsnd)

This comment has been minimized.

@fragglet

fragglet Jul 6, 2015
Member

Should be spaces between 'if', 'for', 'while', etc. and the following paren. Please fix.

return -1;
}

if(snd_pitchshift == true)

This comment has been minimized.

@fragglet

fragglet Jul 6, 2015
Member

Don't compare to true. Just:

if (snd_pitchshift)

// determine ratio of pitch to NORM_PITCH and apply to srclen to get dstlen
// (pitch + 1) is a crude way to rounds up the division
dstlen = (insnd->chunk.alen) * ((pitch + 1) * 100 / NORM_PITCH) / 100;

This comment has been minimized.

@fragglet

fragglet Jul 6, 2015
Member

This isn't really a big deal but I'm curious: how did you come up with this formula and how certain are we that it matches vanilla?

@fragglet
Copy link
Member

@fragglet fragglet commented Jul 6, 2015

Just a few more comments left and then I think this is ready to merge. Thanks, this has been a great code review - really appreciate all the graphs etc. Very professional.

jmtd added 5 commits Jun 23, 2015
Re-introduce removed code for pitch shifting. In the case of Doom,
bring back the code from the linux doom sources. In the case of
Heretic and Hexen, uncomment existing code.

The original implementation passed a pitch value around through
several of the sound routines, including I_StartSound and
I_UpdateSoundParams, likely because DMX required it. We won't
need it in I_UpdateSoundParams, so I haven't brought those bits
back.

Adjust the sound driver prototypes for the pitch variable.
channels_playing is an internal array in i_sdlsound which
used to track sfxinfo_t. The callers often had to look up
the allocated_sound_t from the sfxinfo_t. Track the
allocated_sound_t instead.
GetAllocatedSoundBySfxinfo is a linked-list look-up on the
list of allocated sounds, rather than the direct-access of the
driver_data pointer.

Previously we used the ->driver_data field of sfxinfo_t to point at the
corresponding allocated_sound_t, but this assumes a 1:1 mapping which
won't be true with pitch-shifted sounds.

The lookups are still cheap, but I did some instrumenting to see
how many iterations this results in. Note that when I ran these tests
there were a few more calls that I've since eliminated, so this is
slightly worse than current. For Doom 2 v1.9 demos:

        ____demo:___
 iter     1   2    3
    0   296 636 1691
    1   104 205 620
    2   57  134 488
    3   55  90  287
    4   58  56  204
    5   28  48  146
    6   18  33  119
    7   6   44  99
    8   4   33  86
    9   8   57  72
    10  6   30  62
    11  6   18  68
    12  6   22  46
    13  2   18  42
    14  4   18  48
    15  0   6   22
    16  4   12  16
    17  0   12  14
    18  0   8   12
    19  0   12  14
    20  0   16  12
    21  0   10  10
    22  0   2   20
    23  0   10  10
    24  0   2   6
    25  0   2   8
    26  0   0   10
    27  0   0   4
    28  0   0   6
    29  0   0   2
    30  0   0   12
    31  0   0   8
    32  0   0   4
    33  0   0   2
    34  0   0   4
    35  0   0   0
    36  0   0   0
    37  0   0   0
Necessary so I can use AllocateSound as part of a new routine that
will clone an existing allocated_sound_t.
Rework GetAllocatedSoundBySfxInfo to search by sfxinfo_t and pitch,
with all callers looking for NORM_PITCH.
@jmtd jmtd force-pushed the jmtd:pitch-shifting branch from d5ae189 to 83ee83e Jul 6, 2015
@jmtd
Copy link
Contributor Author

@jmtd jmtd commented Jul 6, 2015

Thanks for the review and the kind comments! I've pushed fixes for those issues you have highlighted. Here's an updated version of my TODO. With regards your question of how I settled on the value: It's guess work. I assumed that the pitch number was a linear thing, and mapping the ratio of the random pitch number to buffer size seemed natural. That's what "calibrate X" on the todo list means; it means I need to ensure that this does match the behaviour of Doom. I should manage to finish that tonight; I have my hacked 1.2 EXE with a tuned sine wave ready to go :) Here's where I'm at, then:

first review code-fix-ups
second review code-fix-ups
configuration parameter
benchmarking/tuning for cache size don't cache pitch-shifted sfx
can we keep sfx lumps as mono and SDL_Mixer stereo? sadly not
test with libsamplerate enabled
investigate whether any doom supported stereo samples in any way nope
bug: if pitch configured off and you use -pitchshift, flipped to on in config removed
☐ make parameter a trinary value
☐ bug: configuration variable not written to default.cfg (and equivs)
☐ bug: off by default in heretic
☐ bug: off by default in hexen
☐ test heretic: not audible to me on first listen
☐ test hexen: not audible to me on first listen
☐ test strife (should be off)
☐ calibrate for doom
☐ calibrate for heretic
☐ calibrate for hexen

@fragglet
Copy link
Member

@fragglet fragglet commented Jul 6, 2015

I'm assuming that for Strife we don't even want to provide the option, as none of the DOS versions of Strife ever had pitch shifting. Unless @haleyjd particularly wants it?

@haleyjd
Copy link
Contributor

@haleyjd haleyjd commented Jul 6, 2015

Strife shouldn't have it since it never worked. It doesn't bother me if it's a "hidden option" or what have you, but there's no way to judge what the results "should" sound like.

@jmtd
Copy link
Contributor Author

@jmtd jmtd commented Jul 6, 2015

I've ran out of time to finish calibrating Doom tonight, but here's what I've done. I replaced every sound effect in the IWAD with a 1 second long sine wave, tuned to middle C, that starts and stops abruptly. I then hacked the RNG in Doom 1.2 Shareware so that it was either all 00s or all FFs and recorded the sound of starting the game and exiting it (hitting a few menu items on the way). I then isolated the resulting sound effects, one sample, and clipped manually in Audacity to as close to the start and finish of the sample as possible. I then ran a pitch detector over the results. The results:

input was 1s sine wave middle c (C4, 262Hz)
doom FF (lowest pitch/longest): 01.063s, B3 (245Hz)
doom 00 (highest pitch/shortest): 0.0883s, D4 (297Hz)

If we just look at the lengths, the longest is 106.3% the base and the shortest 88.3%.

Doom's source originally had the mid point as 128, I changed it to 127 to match Heretic/Hexen (and to have a consistent value across chocolate). The ratios:

Max pitch value is 144, lowest is 113

144.0/128.0 = 112.5% (too high)
113/128.0 = 88.28% (matches!)

144.0/127 = 113% (too high)
113.0/127 = 88.97% (too high)

Since I manually clipped the samples there's likely some error in that. I guess NORM_PITCH should be a game-specific variable rather than a constant across all games, then it can be 128 for Doom and 127 for Heretic/Hexen.

Here's the TODO list again

benchmarking/tuning for cache size don't cache pitch-shifted sfx
can we keep sfx lumps as mono and SDL_Mixer stereo? sadly not
test with libsamplerate enabled
investigate whether any doom supported stereo samples in any way nope
bug: if pitch configured off and you use -pitchshift, flipped to on in config removed
bug: off by default in heretic
bug: off by default in hexen
test strife (should be off)
☐ make parameter a trinary value
☐ bug: configuration variable not written to default.cfg (and equivs)
☐ test heretic: not audible to me on first listen
☐ test hexen: not audible to me on first listen
☐ calibrate for doom probably means NORM_PITCH = 128
☐ calibrate for heretic
☐ calibrate for hexen

@fabiangreffrath
Copy link
Member

@fabiangreffrath fabiangreffrath commented Jul 7, 2015

Hi Jonathan,

I still don't see how libsamplerate impacts this stuff particularly.

probably not at all. You are right, I haven't seen that sounds are resampled regardless of libsamplerate being enabled or not.

I'm unsure what problem here you're trying to solve.

My idea was that it would be better to have the samples in memory once and play them with different pitches instead of having multiple copies with different pitches in memory. By immediately freeing the pitch-shifted samples you are mitigating the memory consumption issue. But still, each sound sample needs to undergo the same pitch-shifting routines over and over again. And somehow, I expected SDL-Mixer's routines to be more suitable ("specialized"?) for that. Not sure why, sorry.

@jmtd
Copy link
Contributor Author

@jmtd jmtd commented Jul 7, 2015

Oops, my possible-pitch values above were off-by-one, the actual ratios are

144.0/128 = 112% (too long / low pitched)
113.0/128 = 88.3% (correct)

143.0/127 = 112% (too long / low pitched)
112.0/127 = 88.1% (very close)

I'll fiddle with the ratio-calculating line tonight.

@jmtd
Copy link
Contributor Author

@jmtd jmtd commented Jul 7, 2015

Sorry, still not done calibrating this. For what it's worth, I think it's pretty close, and the infrastructure side of things is sound, so feel free to merge it, I'll continue work on finessing the exactness. I'm busy hex-editing the 1.2 EXE and measuring the results and comparing to c-d. I'm updating this spreadsheet as I go
https://docs.google.com/spreadsheets/d/1kYURp2cwtHLc5TZ63GY-5Kg17t81zwZHgpIs1Tu9so4/edit?usp=sharing

@jmtd
Copy link
Contributor Author

@jmtd jmtd commented Jul 8, 2015

OK just pushed an improvement in the approximation based on measurements of Doom 1.2 shareware. See the spreadsheet to see how close it is. I also reworked the ratio line to be more readable at the same time.

@jmtd
Copy link
Contributor Author

@jmtd jmtd commented Jul 8, 2015

Oh, christ, Heretic/Hexen...

The HH brothers pull two random numbers per pitch shift, which makes observing the original EXE behaviour much harder. You can't just blast the RNG with a single random value, you need to blast it with a pair, and hope that you have hit it at the right stride. I've added a sheet to that google docs spreadsheet with some measurements from Heretic SOTSW and c-d as of 4f46d3 (latest push to this PR) which show how close we match at the moment; I'd say not as close as Doom but pretty close non-the-less.

Frustratingly the error is greatest at the high pitch end, which is the opposite of Doom.

The number of distinct pitch values is half that of Doom (for most sfx, at least. Different rules for ambient sounds); and the spread is much smaller. In short the results are subtler, and so we don't sound too far off.

(I wonder if the weird round-off we see in both games is a result of bit-bias caused by masking against low value integers?)

jmtd added 4 commits Jun 26, 2015
Calculate a destination sound buffer size by working out the ratio
of pitch:NORM_PITCH, inverting and applying that to the source buf
size. This approximates Doom/DMX based on measurements.

Really simple algorithm that maps dest-cell to source-cell by the
ratio of offset from the start of each chunk, with no interpolation.
I've got better results with interpolation but that needs a low-pass
filter applied afterwards and I haven't finished that part yet.
When we stop playing a sound effect on a channel, check to see
whether it is a pitch-shifted sound effect and whether it's not
used on any other channel. If so, immediately free it.
Tweak snd_pitchshift to clearly default to off and switch the
default to on only for Heretic and Hexen (since all versions of
Heretic and Hexen did pitch-shifting).

Pitch shifting was never implemented for Strife, so hide the option
in chocolate-strife-setup entirely.
@jmtd jmtd force-pushed the jmtd:pitch-shifting branch from 4cf46d3 to a8c4c51 Jul 8, 2015
@jmtd
Copy link
Contributor Author

@jmtd jmtd commented Jul 8, 2015

Quite a long evening! Finally everything pretty much done. Updated TODO:

first review code-fix-ups
second review code-fix-ups
configuration parameter
benchmarking/tuning for cache size don't cache pitch-shifted sfx
can we keep sfx lumps as mono and SDL_Mixer stereo? sadly not
test with libsamplerate enabled
investigate whether any doom supported stereo samples in any way nope
bug: if pitch configured off and you use -pitchshift, flipped to on in config removed
test strife (should be off)
test heretic
calibrate for doom
calibrate for heretic
calibrate for hexen
bug: configuration variable not written to default.cfg (and equivs) it shouldn't be anyway
Hexen support
test Hexen
bug: off by default in heretic/hexen
☐ make parameter a trinary value

I've just had a run through all of the games and they sound good to me!

I'm not sure what you'll think of that (snd_pitchshift a trinary value: on/off/unconfigured), or overloading the pitch member in sfxinfo_t in Hexen.

jmtd added 2 commits Jul 8, 2015
Initialize snd_pitchshift to -1, to signal "not initialised".
Each game engine sets a different default.
Hexen modified the sfxinfo_t structure from Doom, removing the
pitch member (which was used for sfx links, which aren't used
in Hexen anyway) and adding 'changePitch', a flag which indicated
whether a given sound effect should be shifted, or not.

Rather than add the member back, overload the 'pitch' member to
mean the same thing.

The table of sfxinfo_t definitions was already writing the
changePitch value into the pitch member, so no further changes
are needed there.
@fragglet
Copy link
Member

@fragglet fragglet commented Jul 9, 2015

Looks good, thanks.

There are still a few formatting issues that haven't been resolved but I'll fix them myself.

fragglet added a commit that referenced this pull request Jul 9, 2015
Implementation of pitch shifting.

This appeared in early versions of Doom and (more importantly) in Heretic
and Hexen. Sound effects get pitched up and down to sound less repetitive.
@fragglet fragglet merged commit 93fe112 into chocolate-doom:master Jul 9, 2015
@jmtd jmtd deleted the jmtd:pitch-shifting branch Jul 22, 2015
Azarien pushed a commit to Azarien/chocolate-doom that referenced this pull request Apr 11, 2020
Now, instead of extending the background_buffer in R_FillBackScreen
and filling it with the bezel flat pattern, we draw the pattern to the
status bar's own st_backing_screen buffer before the actual status bar
background patch is drawn. Then, we copy over the entire SCREENWIDTH
in ST_refreshBackground().

This also reverts some questionable design decisions during the 5.7
release cycle. Fixes chocolate-doom#566 and fixes chocolate-doom#567.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked issues

Successfully merging this pull request may close these issues.

None yet

6 participants
You can’t perform that action at this time.