Allowing for arbitrarily large blocksizes #388

Open
giuliomoro opened this issue Mar 23, 2018 · 3 comments

@giuliomoro
Contributor

commented Mar 23, 2018

New idea: keep whatever the current hardware block size is, but add FIFOs between the I/O code and a separate thread that runs the audio callback. Basically, render() would now be called from within an AuxiliaryTask, in a way that is totally transparent to the user. If Bela is using RTDM (as it does on v0.3.x images), this should also come pretty cheap and still run at Xenomai-guaranteed priority: I estimate the penalty is one context switch (~17us) every time the hardware buffer is full, that is about 0.5% CPU time with a blocksize of 128.

The other solution is to use DDR instead of PRU RAM, but that is much more complicated.
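
The overhead estimate above can be reproduced with a few lines of arithmetic. This is only a sketch: the function name is made up for illustration, the 17us context-switch cost is the figure quoted above, and the 44.1kHz sample rate is an assumption that the comment does not state.

```cpp
#include <cassert>
#include <cstdio>

// Fraction of CPU time spent on one context switch per hardware block.
// contextSwitchUs: assumed cost of a single switch (the ~17us quoted above).
// sampleRate: assumed to be 44100 Hz, which is not stated in the comment.
double contextSwitchOverhead(unsigned int blockSize, double contextSwitchUs,
                             double sampleRate = 44100.0)
{
  assert(blockSize > 0 && sampleRate > 0);
  double blockPeriodUs = 1e6 * blockSize / sampleRate; // duration of one block
  return contextSwitchUs / blockPeriodUs;
}

// Hypothetical helper that prints the overhead as a percentage.
void printOverhead(unsigned int blockSize)
{
  std::printf("blocksize %u: %.2f%% CPU\n", blockSize,
              100.0 * contextSwitchOverhead(blockSize, 17.0));
}
```

Under these assumptions a blocksize of 128 comes out a little under 0.6%, in the same ballpark as the estimate, and the cost grows in inverse proportion as the hardware block gets smaller.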

@giuliomoro added this to the ctag merge milestone Mar 23, 2018

@apmcpherson


commented Mar 24, 2018

This seems reasonable from a performance standpoint, and certainly easier than the PRU-DDR approach. You could make the internal PRU.cpp run at a higher priority than 95, so the priority of render() doesn't have to be changed. For small block sizes I would keep things as they are.

My only objection is philosophical, in that I think good code should be written to not need big block sizes, but I realise that this is often practically impossible when using legacy code bases or even when in a hurry to mock something up.

@giuliomoro

Contributor Author

commented Mar 24, 2018

Blocksizes achievable via hardware buffers will run in the standard mode; larger blocksizes will have to run FIFOed. Xenomai's RT message queues could be useful here, as they block until a message is received (no need for polling!); see an example implementation here and docs here.
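
A minimal sketch of that decision, assuming the largest block size that fits in the hardware buffers is known at startup. `AudioMode`, `chooseAudioMode` and `maxHardwareBlockSize` are hypothetical names, not part of the Bela API.

```cpp
#include <cassert>

enum class AudioMode { Standard, Fifoed };

// maxHardwareBlockSize is a hypothetical parameter: the largest block size
// that the hardware (PRU) buffers can hold.
AudioMode chooseAudioMode(unsigned int requestedBlockSize,
                          unsigned int maxHardwareBlockSize)
{
  assert(requestedBlockSize > 0 && maxHardwareBlockSize > 0);
  // Anything the hardware can buffer directly keeps the existing code path.
  return requestedBlockSize <= maxHardwareBlockSize ? AudioMode::Standard
                                                    : AudioMode::Fifoed;
}
```

Keeping the standard path for small blocks means existing projects would not pay the extra context switch.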

@giuliomoro

Contributor Author

commented Apr 7, 2018

The data going through the message queue should be the raw int/uint from the PRU: this will minimize the amount of data that needs to be passed around (floats would take twice as much space!).
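
To make the size argument concrete, here is a small sketch that computes the message size for one block. It assumes the raw PRU samples are 16 bits wide, which is what makes a float twice as large; `blockBytes` and its parameters are illustrative names only.

```cpp
#include <cstddef>
#include <cstdint>

// Bytes needed to carry one block of samples of type T.
// frames and channels are illustrative parameters.
template <typename T>
constexpr std::size_t blockBytes(std::size_t frames, std::size_t channels)
{
  return frames * channels * sizeof(T);
}

// Assumption: raw samples are 16-bit, so a float is twice the size.
static_assert(sizeof(float) == 2 * sizeof(int16_t),
              "float is expected to be twice the size of a 16-bit sample");
```

For example, 16 frames of 2 channels take 64 bytes as 16-bit integers and 128 bytes as floats.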

almost-non-pseudocode:

// ioToComputation and computationToIo are descriptors for two message queues
// that need to be opened ahead of time

// in the IO thread
size_t sizeOfComputationToIo = /* number of bytes sent from Computation to Io per hardware block */;
size_t sizeOfIoToComputation = /* number of bytes sent from Io to Computation per hardware block */;
unsigned int prio;

while(!gShouldStop)
{
  __wrap_read(...); // wait for the PRU to signal a buffer is available
  // copy from PRU memory to hardwareInBuffer
  __wrap_mq_send(ioToComputation, hardwareInBuffer, sizeOfIoToComputation, 0);
  // note: computationToIo has to be pre-filled (e.g. with silence) before starting,
  // otherwise this blocks before the Computation thread has received a full buffer
  __wrap_mq_receive(computationToIo, hardwareOutBuffer, sizeOfComputationToIo, &prio);
  // copy from hardwareOutBuffer to PRU memory
}

// in the Computation thread
int computationBuffersPerAudioBuffer = computationBlockSize / hardwareBlockSize; // needs to be an integer ratio!
char fromIoBuffer[sizeOfIoToComputation * computationBuffersPerAudioBuffer];
char toIoBuffer[sizeOfComputationToIo * computationBuffersPerAudioBuffer];
unsigned int prio;
while(!gShouldStop)
{
  // collect enough hardware blocks to fill one computation block
  for(int n = 0; n < computationBuffersPerAudioBuffer; ++n)
  {
    __wrap_mq_receive(ioToComputation, &fromIoBuffer[n * sizeOfIoToComputation], sizeOfIoToComputation, &prio);
  }
  // convert ints to floats (stuff that is in PRU::loop())
  render();
  // convert floats to ints (stuff that is in PRU::loop())
  // send the result back one hardware block at a time
  for(int n = 0; n < computationBuffersPerAudioBuffer; ++n)
  {
    __wrap_mq_send(computationToIo, &toIoBuffer[n * sizeOfComputationToIo], sizeOfComputationToIo, 0);
  }
}
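
The two loops above can be tried out on a desktop machine with a portable stand-in. The sketch below replaces the Xenomai message queues with a small blocking queue built on std::mutex and std::condition_variable, fakes the PRU input with a counter, and uses sample negation in place of render(). All names (`BlockQueue`, `runFifoed`) are made up for illustration. It also assumes that the return queue is primed with silence before starting, so that the IO loop is never stuck waiting for output that the computation thread cannot produce yet.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstdint>
#include <deque>
#include <mutex>
#include <thread>
#include <utility>
#include <vector>

using Block = std::vector<int16_t>;

// Stand-in for a blocking message queue: receive() sleeps until a block arrives.
class BlockQueue {
public:
  void send(Block b)
  {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      queue_.push_back(std::move(b));
    }
    cond_.notify_one();
  }
  Block receive()
  {
    std::unique_lock<std::mutex> lock(mutex_);
    cond_.wait(lock, [this] { return !queue_.empty(); });
    Block b = std::move(queue_.front());
    queue_.pop_front();
    return b;
  }
private:
  std::mutex mutex_;
  std::condition_variable cond_;
  std::deque<Block> queue_;
};

// Runs hardwareBlocks iterations of the IO loop against a computation thread
// that processes `ratio` hardware blocks at a time. Returns every sample the
// "hardware" would have played back.
std::vector<int16_t> runFifoed(int hardwareBlocks, int ratio, int hwBlockSize)
{
  assert(ratio > 0 && hardwareBlocks % ratio == 0);
  BlockQueue ioToComputation, computationToIo;
  // Prime the return queue with `ratio` blocks of silence.
  for(int n = 0; n < ratio; ++n)
    computationToIo.send(Block(hwBlockSize, 0));

  std::thread computation([&] {
    for(int done = 0; done < hardwareBlocks; done += ratio) {
      std::vector<Block> in;
      for(int n = 0; n < ratio; ++n)
        in.push_back(ioToComputation.receive());
      for(Block& b : in) {
        for(int16_t& s : b)
          s = -s; // placeholder for render()
        computationToIo.send(std::move(b));
      }
    }
  });

  std::vector<int16_t> played;
  int16_t counter = 1;
  for(int i = 0; i < hardwareBlocks; ++i) {
    Block in(hwBlockSize);
    for(int16_t& s : in)
      s = counter++; // fake input from the "PRU"
    ioToComputation.send(std::move(in));
    Block out = computationToIo.receive();
    played.insert(played.end(), out.begin(), out.end());
  }
  computation.join();
  return played;
}
```

In this sketch the output is delayed by `ratio` hardware blocks, which is the price of the priming step: the first `ratio` blocks played back are silence, and processed audio follows after that.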

Right now the audio thread looks like:

while(!gShouldStop)
{
  __wrap_read(...); // wait for the PRU to signal a buffer is available
  // copy from PRU memory to hardwareInBuffer
  // convert ints to floats (stuff that is in PRU::loop())
  render();
  // convert floats to ints (stuff that is in PRU::loop())
  // copy from hardwareOutBuffer to PRU memory
}
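
For reference, the "convert ints to floats" and "convert floats to ints" steps could look roughly like the sketch below. It assumes signed 16-bit samples scaled by 32768, which may not match what PRU::loop() actually does for every channel type; `rawToFloat` and `floatToRaw` are hypothetical names.

```cpp
#include <algorithm>
#include <cstdint>

// Assumption: signed 16-bit raw samples mapped to the range [-1, 1).
float rawToFloat(int16_t raw)
{
  return raw / 32768.0f;
}

// Scales back to 16 bits, clamping so that out-of-range floats do not wrap.
int16_t floatToRaw(float value)
{
  float scaled = value * 32768.0f;
  scaled = std::min(32767.0f, std::max(-32768.0f, scaled));
  return static_cast<int16_t>(scaled);
}
```

In the FIFOed scheme these conversions would move to the computation thread, so that only the raw integers travel through the queues.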