zipper noise with moving source and binaural effect #37

Open
grrrwaaa opened this issue Jul 6, 2017 · 3 comments

@grrrwaaa

grrrwaaa commented Jul 6, 2017

Hi,

I'm using the C API (steamaudio_api_2.0-beta.6) on OSX, with IPL_HRTFDATABASETYPE_DEFAULT and IPL_CONVOLUTIONTYPE_PHONON.

I hear consistent zipper noise (quiet clicks) when changing the direction of a sound source. The effect is very noticeable with smaller frame sizes (e.g. 64), but is present at a lower frequency at larger frame sizes (e.g. 1024). I imagine this is because the source direction is sampled once per frame via iplApplyBinauralEffect(), and not being interpolated over time? For a VR HMD, nearly all sources' relative directions are dynamic, because the listener's head is moving, so this seems like a big issue. Am I missing something in the SDK? Is the undocumented iplApplyBinauralEffectWithParameters() helpful to deal with this?

Relevant code:

#include "steamaudio_api_2.0-beta.6/include/phonon.h"

struct {
	IPLContext context;
	IPLRenderingSettings settings;
	IPLHrtfParams hrtfParams;
	IPLAudioFormat source_format;
	IPLAudioFormat output_format;
	
	IPLhandle renderer = 0;
	IPLhandle binaural = 0;
	
	// pre-allocated buffers at maximum vector size
	IPLfloat32 source_buffer[4096];
	IPLfloat32 output_buffer[4096 * 2];
} phonon;


void init(double samplerate = 44100, int framesize = 64) {

	// default position in front of listener, to avoid 0,0,0
	direction.x = 0;
	direction.y = 0;
	direction.z = -1;

	phonon.context.allocateCallback = 0;
	phonon.context.freeCallback = 0;
	phonon.context.logCallback = phonon_log_function;
	phonon.settings.convolutionType = IPL_CONVOLUTIONTYPE_PHONON;

	// various options:
	phonon.hrtfParams.type = IPL_HRTFDATABASETYPE_DEFAULT; // or CUSTOM
	phonon.hrtfParams.hrtfData = 0;	// Reserved. Must be NULL.
	// TODO: allow custom HRTFs; implement these:
	phonon.hrtfParams.numHrirSamples = 0;
	phonon.hrtfParams.loadCallback = 0;
	phonon.hrtfParams.unloadCallback = 0;
	phonon.hrtfParams.lookupCallback = 0;

	phonon.settings.samplingRate = samplerate;
	phonon.settings.frameSize = framesize;

	iplCreateBinauralRenderer(phonon.context, phonon.settings, phonon.hrtfParams, &phonon.renderer);
	
	// a single mono source
	phonon.source_format.channelLayoutType  = IPL_CHANNELLAYOUTTYPE_SPEAKERS;
	phonon.source_format.channelLayout      = IPL_CHANNELLAYOUT_MONO;
	phonon.source_format.numSpeakers		= 1;
	phonon.source_format.channelOrder       = IPL_CHANNELORDER_INTERLEAVED;
	
	phonon.output_format.channelLayoutType  = IPL_CHANNELLAYOUTTYPE_SPEAKERS;
	phonon.output_format.channelLayout      = IPL_CHANNELLAYOUT_STEREO;
	phonon.output_format.numSpeakers		= 2;
	phonon.output_format.channelOrder       = IPL_CHANNELORDER_INTERLEAVED;

	iplCreateBinauralEffect(phonon.renderer, phonon.source_format, phonon.output_format, &phonon.binaural);

}		

void perform(double **ins, long numins, double **outs, long numouts, long sampleframes) {
		
	// phonon uses float32 processing, so we need to copy :-(
	
	IPLAudioBuffer outbuffer;
	outbuffer.format = phonon.output_format;
	outbuffer.numSamples = sampleframes;
	outbuffer.interleavedBuffer = phonon.output_buffer;
	
	IPLAudioBuffer inbuffer;
	inbuffer.format = phonon.source_format;
	inbuffer.numSamples = sampleframes;
	inbuffer.interleavedBuffer = phonon.source_buffer;
	
	// copy input:
	{
		t_double * src = ins[0];
		IPLfloat32 * dst = phonon.source_buffer;
		int n = sampleframes;
		while (n--) { *dst++ = *src++; }
	}
	
	// rotate at 3 hz:
	static float t = 0.f;
	t += M_PI * 2. * 3. * sampleframes/(44100.);
	// Unit vector from the listener to the point source,
	// relative to the listener's coordinate system.
	IPLVector3 dir;
	dir.x = sin(t);
	dir.y = 0.;
	dir.z = cos(t);

	iplApplyBinauralEffect(phonon.binaural,
						   inbuffer,
						   dir,
						   IPL_HRTFINTERPOLATION_BILINEAR,
						   outbuffer);

	// copy output:
	{
		IPLfloat32 * src = phonon.output_buffer;
		t_double * dst0 = outs[0];
		t_double * dst1 = outs[1];
		int n = sampleframes;
		while (n--) {
			*dst0++ = *src++;
			*dst1++ = *src++;
		}
	}
}
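
A side note on the rotation in the snippet above: it hardcodes 44100 even though init() accepts a sample rate, and it accumulates the angle in a float that grows without bound. A minimal sketch of an alternative follows, assuming the caller knows the actual sample rate. `Vec3` and `Rotator` are hypothetical names for illustration and are not part of the Steam Audio API. This is unlikely to be the cause of the zipper noise; it only rules out the test signal itself.

```cpp
#include <cmath>

// Hypothetical stand-in for IPLVector3, so the sketch compiles on its own.
struct Vec3 { float x, y, z; };

// Hypothetical helper: keeps the rotation angle wrapped to [0, 2*pi)
// and derives it from the real sample rate instead of a constant.
struct Rotator {
    double phase = 0.0; // radians

    // Advance by `frames` samples at `samplerate`, rotating at `hz`
    // revolutions per second; returns a unit vector in the horizontal plane.
    Vec3 advance(long frames, double samplerate, double hz) {
        const double twoPi = 6.283185307179586;
        phase += twoPi * hz * static_cast<double>(frames) / samplerate;
        phase = std::fmod(phase, twoPi); // wrap so precision does not degrade
        return Vec3{ static_cast<float>(std::sin(phase)),
                     0.f,
                     static_cast<float>(std::cos(phase)) };
    }
};
```

In perform(), the result of `advance(sampleframes, samplerate, 3.0)` would be copied into `dir` before calling iplApplyBinauralEffect().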
@lakulish
Collaborator

@grrrwaaa Do you encounter this issue when using IPL_HRTFINTERPOLATION_NEAREST? At first glance, things seem to be configured correctly in your code.

@grrrwaaa
Author

grrrwaaa commented Jul 13, 2017

Hi there, thanks for looking at this.

Using IPL_HRTFINTERPOLATION_NEAREST doesn't make any difference as far as I can tell.

I've taken some screenshots of the waveforms I'm seeing, uploaded here:

https://www.dropbox.com/sh/023oy5r1a2swy1m/AADgDfkgN73X1QASZ1ZgqALea?dl=0

These were captured with a 2000 Hz sine wave as the source signal, panned around the listener a little faster than once per second. The sampling rate is 44.1 kHz and IPL_HRTFINTERPOLATION_BILINEAR is enabled. Some were taken with a frame size of 64, some with a frame size of 1024 (indicated in the filename). OSX 64-bit.

In the most zoomed-out picture there's very clearly linear stepping of the position, not interpolation. These artifacts seem to occur about every 2048 samples, regardless of my frame size (I'm speculating that this is a function of the HRTF size, so there's nothing I can do about it?). Still, in the most zoomed-in pictures at a frame size of 64 there's a very abrupt cut in both amplitude and phase, causing the crackle. In the zoomed-in pictures at a frame size of 1024, this manifests instead as a wobble over roughly 200 samples, presumably a crossfade between signals at different phases.
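
To locate the cuts numerically rather than by eye, one rough approach is to flag sample-to-sample jumps that a clean sine could not produce. The sketch below assumes a single deinterleaved output channel and that its peak amplitude is known; `findClicks` and its parameters are hypothetical names, not anything from the SDK. A sine A·sin(2πfn/fs) can change by at most 2A·sin(πf/fs) between adjacent samples, so anything well above that is a discontinuity.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical diagnostic: returns every index n where |x[n] - x[n-1]|
// exceeds `margin` times the largest step a clean sine of the given
// frequency and amplitude can take between adjacent samples.
std::vector<std::size_t> findClicks(const std::vector<float>& x,
                                    double freqHz, double samplerate,
                                    double amplitude, double margin = 1.5) {
    const double pi = 3.141592653589793;
    // Max per-sample difference of A*sin(2*pi*f*n/fs) is 2*A*sin(pi*f/fs).
    const double maxStep = 2.0 * amplitude * std::sin(pi * freqHz / samplerate);
    std::vector<std::size_t> clicks;
    for (std::size_t n = 1; n < x.size(); ++n) {
        const double step = std::fabs(double(x[n]) - double(x[n - 1]));
        if (step > margin * maxStep) {
            clicks.push_back(n);
        }
    }
    return clicks;
}
```

The spacing between the returned indices would show whether the artifacts really recur every 2048 samples. Smooth wobbles like the ones at a frame size of 1024 would not be caught, since this only detects abrupt jumps.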

It's clear that there's going to be some kind of approximation for motion, since the direction vector is only updated as a step signal, at most once per frame. But is there any way I can reduce the crackle? If the HRTFs are only updated every 2048 samples (which seems to be the case), then perhaps I can run two HRTFs in parallel and crossfade their outputs (perhaps interlaced, so that the direction can be updated as often as every 1024 samples, i.e. at a rate of samplerate/1024)?
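
The crossfade part of that idea could look roughly like the sketch below. It assumes two binaural effects can be created from the same renderer and fed the same input, one with the previous direction and one with the new direction; that arrangement is untested here. `crossfadeInterleaved` is a hypothetical helper, not an SDK function.

```cpp
#include <cstddef>

// Hypothetical helper: linear crossfade from `oldOut` to `newOut` into `dst`.
// Each buffer holds `frames` frames of interleaved audio with `channels`
// channels per frame (2 for the stereo output used above).
void crossfadeInterleaved(const float* oldOut, const float* newOut, float* dst,
                          std::size_t frames, std::size_t channels) {
    for (std::size_t n = 0; n < frames; ++n) {
        // Weight ramps from 0 toward 1, so the block starts on oldOut
        // and approaches newOut by its end.
        const float w = static_cast<float>(n) / static_cast<float>(frames);
        for (std::size_t c = 0; c < channels; ++c) {
            const std::size_t i = n * channels + c;
            dst[i] = (1.0f - w) * oldOut[i] + w * newOut[i];
        }
    }
}
```

A linear crossfade between two signals at different phases can itself dip in amplitude, which may be the wobble already visible at a frame size of 1024, so this might trade clicks for a softer artifact rather than remove it.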

Graham

@grrrwaaa
Author

I've updated to beta 10, and I still hear the zipper noise. It seems like the interpolation is now working correctly (with interpolation disabled I hear stepping as before). However, even with interpolation enabled I am still seeing abrupt cuts at frame sizes below 2048 samples, leading to crackle. The lower the frame size, the more the crackle.

Is this library not intended to be used with frame sizes other than 2048 samples?


2 participants