Digidesign Digi 003R clicks & pops while recording #26

Open
LGCW opened this issue Feb 10, 2020 · 23 comments

@LGCW LGCW commented Feb 10, 2020

Picked up a Digidesign Digi 003R to play around with, and for the most part it works. It connects at 48000/44100 Hz with a buffer size of 256, and all of the routing appears to work in Ardour 5.12.

Unfortunately it generates a little pop/click every 5 to 7 seconds in the recording and output. This was using kernel 4.19 and 5.4 with and without snd-firewire-improve from.

I was wondering if there was any helpful output / logs I could provide to help track this issue?


@takaswie takaswie commented Feb 10, 2020

Hi,

Thanks for the report.

> Unfortunately it generates a little pop/click every 5 to 7 seconds in the recording and output.
> This was using kernel 4.19 and 5.4 with and without snd-firewire-improve from.
>
> I was wondering if there was any helpful output / logs I could provide to help track this issue?

I've been aware of this issue for a long time. It's the kind of issue that depends on the particular device.

The IEC 61883-1/6 packet streaming engine in the ALSA firewire stack transfers a series of isochronous packets that carry exactly as many PCM frames per second as the configured sampling rate[1]; e.g. 44100. This implementation relies on a feature called 'clock recovery' on the device side, which is part of IEC 61883-1/6.

On the other hand, actual devices on the market don't have the feature (sigh...). Some of them expect to receive a number of PCM frames adjusted to their internal sampling clock; e.g. 44092. Furthermore, the number varies over time; e.g. 44092-44104. The mismatch between the 'ideal' throughput of the engine and the actual frequency of the internal clock is the reason for the clicks and pops, in my understanding.

Therefore such devices expect the engine to perform the 'clock recovery': to analyze the series of isochronous packets from the device in order to decide the number of PCM frames in each isochronous packet to the device. I'll tackle this issue this year, however I have no idea yet how to achieve 'clock recovery' in the engine.

(Note that due to the above design it's not possible to synchronize the sampling clocks of devices on the same IEEE 1394 bus. They run on their own internal clocks, contrary to some 'myths' about FireWire. They can't adjust their internal sampling clock from the series of timestamps in received isochronous packets.)

[1] https://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound.git/tree/sound/firewire/amdtp-stream.c#n741
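
To make the mismatch concrete, here is a minimal Python sketch (not the driver's code) of how an ideal sender at the nominal rate spreads PCM frames over the 8000 isochronous cycles per second, and how quickly a surplus builds up if the device only consumes 44092 frames per second. The device rate is the example figure from the comment above; the 48-frame slack on the device side is purely an assumed number for illustration, since the real buffering of the 003 is not stated in this thread.

```python
from fractions import Fraction

CYCLES_PER_SECOND = 8000  # IEEE 1394 isochronous cycles (125 us each)


def frames_per_cycle(rate, cycles):
    """Yield the number of PCM frames per packet for an ideal sender at `rate`."""
    step = Fraction(rate, CYCLES_PER_SECOND)  # e.g. 44100/8000 = 5.5125
    sent = 0
    for n in range(1, cycles + 1):
        due = int(step * n)  # frames that should have left by the end of cycle n
        yield due - sent
        sent = due


def seconds_until_overflow(nominal_rate, device_rate, slack_frames):
    """Time until the surplus of frames exceeds an assumed device-side slack."""
    surplus_per_second = nominal_rate - device_rate
    if surplus_per_second <= 0:
        raise ValueError("no surplus: device consumes at least the nominal rate")
    return Fraction(slack_frames, surplus_per_second)


one_second = list(frames_per_cycle(44100, CYCLES_PER_SECOND))
print(sum(one_second), sorted(set(one_second)))          # 44100 [5, 6]
print(float(seconds_until_overflow(44100, 44092, 48)))   # 6.0 (assumed slack)
```

With a surplus of only 8 frames per second, any small fixed slack is used up within seconds, which is at least consistent with a glitch that recurs every few seconds rather than continuously.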

@takaswie takaswie self-assigned this Feb 10, 2020
@takaswie takaswie added the bug label Feb 10, 2020

@LGCW LGCW commented Feb 10, 2020

Thanks for taking a look at it. That seems like quite a daunting task but I wish you the best of luck. I will keep the 003 in the rack since it makes a good shelf.

I don't know if it's of any value but I did notice the 003 generates an inaudible xrun in Ardour every 2.5 minutes or so. https://i.imgur.com/outPJR7.jpg

Cheers,
Venn


@takaswie takaswie commented Feb 11, 2020

Hi,

> I don't know if it's of any value but I did notice the 003 generates an inaudible xrun in Ardour every 2.5 minutes or so.

Hm. I'd like to get some parameters of the intermediate PCM ring buffer at the time the XRUN occurs. Could you tell me the period size and the buffer size of the PCM ring buffer?


@LGCW LGCW commented Feb 11, 2020

Sample rate: 48000
Buffer Size: 256
Periods: 3

I think that is the information you requested? If not let me know and I will provide it.


@takaswie takaswie commented Feb 11, 2020

> Sample rate: 48000
> Buffer Size: 256
> Periods: 3
>
> I think that is the information you requested? If not let me know and I will provide it.

They're what I expected. Additionally, which version of the Linux kernel were you using when you saw the periodic XRUNs: v4.19, v5.4, or both? Which backend did you use for Ardour: jackd or direct ALSA?


@LGCW LGCW commented Feb 11, 2020

I observed periodical XRUNs in v4.19 and v5.4. Right now the box is running the stock RT kernel.

OS: Debian 10.3
Kernel: 4.19.0-8-rt-amd64
Ardour: Jackd backend.


@takaswie takaswie commented Feb 13, 2020

Hi,

> Sample rate: 48000
> Buffer Size: 256
> Periods: 3
> OS: Debian 10.3
> Kernel: 4.19.0-8-rt-amd64
> Ardour: Jackd backend.

Thanks for the report. I tested my Digi 003 Rack and got a different result in this environment:

  • Ubuntu 19.10 (amd64, 5.3.0-29-generic)
  • Backported kernel modules from d56d8e6 (almost the same as you used for v5.4)
  • Ardour from Ubuntu official repository (1:5.12.0-3)
  • Sample rate: 48000
  • Buffer size: 256
  • Periods: 3
  • jackd with '-r' option (no-realtime)

Theoretically, if the XRUNs were caused by a kernel driver problem, the periodic XRUNs would also occur in the above environment. Actually they don't. I cannot see such periodic XRUNs. XRUNs do occur, but at random intervals. Furthermore, I can't find any log indicating that the ALSA driver encountered an XRUN state.


@zamaudio zamaudio commented Feb 15, 2020

This issue is an instance of #24 (or at least related).


@dylofpoke dylofpoke commented Feb 15, 2020

I have just got myself a Digi 002 rack and will be trying it in Ardour (5.12, I think) with Ubuntu Studio (1604, I think) when I get hold of a firewire card and cable, so I'm very interested in getting it working and up for testing and giving anyone any info that will help...


@dylofpoke dylofpoke commented Feb 15, 2020

Also, I love that this is still actively being developed, as it will hopefully save a lot of high quality, cheaply available audio interfaces from obsolescence, which is a very noble thing to be doing!


@zamaudio zamaudio commented Feb 16, 2020

@dylofpoke I highly recommend getting a FW800 PCIe controller and using a FW800 -> FW400 cable to connect to a 002/003; it just seems to work better than a FW400 controller. I use one from TI with a XIO22I3BZAY chip. Something like this: https://eiratek.com/product/pci-e-to-1394-card-firewire-b-800/ Having said that, the driver still needs work to fix the xruns.


@takaswie takaswie commented Feb 16, 2020

> Having said that, the driver still needs work to fix the xruns.

No. The issue reported in this bug involves no XRUNs. The most important point is that the periodic pop/click comes from the device design, in which the total number of PCM frames processed per second is not exactly the same as the sampling rate. The ALSA firewire-digi00x driver, on the other hand, transfers exactly as many PCM frames per second as the sampling rate (= sampling transfer frequency).

The click/pop noise isn't related to XRUNs detected by any software.


@zamaudio zamaudio commented Feb 16, 2020

@takaswie ok, so it's the device design fault. Regardless of the cause of the fault, we still want to have the device functioning without clicks and pops. That means a workaround is needed in the driver, that is what I meant. For example, on windows and mac the device works without any problem.


@takaswie takaswie commented Feb 16, 2020

> For example, on windows and mac the device works without any problem.

This is natural. The drivers for Windows/macOS were developed by the vendor itself, which is responsible for the design.

> Regardless of the cause of the fault, we still want to have the device functioning without clicks and pops. That means a workaround is needed in the driver, that is what I meant.

Please work for it.


@zamaudio zamaudio commented Feb 16, 2020

> Please work for it.

Is there anything I can provide that would make it easier for you to do it? I don't have much time these days for writing kernel code. I have some idea of how to fix it though.


@takaswie takaswie commented Feb 16, 2020

Hi @dylofpoke ,

Although the device itself is obsolete, the FLOSS driver (snd-firewire-digi00x) is still under development. This is because no one in the FLOSS world is the original designer of the device.

So, if you use the device, it's better to pay close attention to the kernel version.

v4.1
      5      32     271
v4.2
     25     258    2012
v4.3
      4      31     277
v4.4
     75     743    5997
v4.5
     31     304    2298
v4.6
     24     249    1826
v4.7
     28     303    2319
v4.8
      2      18     148
v4.9
      4      36     302
v4.10
      6      61     459
v4.11
     15     133    1149
v4.12
     56     536    4314
v4.13
      8      78     648
v4.14
     10      98     788
v4.15
      2      18     192
v4.16
      2      13     102
v4.17
      4      42     308
v4.18
     27     261    1913
v4.19
     17     179    1282
v5.0
     41     452    3429
v5.1
     22     240    1854
v5.2
     12     102     955
v5.3
    116    1229   10120
v5.4
     54     541    4297
v5.5
     54     624    4390

These statistics show, for each version, the number of patches applied to the upstream kernel for the ALSA firewire stack (the leftmost number on each line; the other two numbers are the word and byte counts printed by wc). You can see that the stack sometimes received big changes. I'd like you to use the latest kernel version possible.

Ubuntu Studio 16.04 is an LTS (= Long Term Support) release and HWE (hardware enablement) kernels are available for it. I don't know exactly which kernel version is the latest in xenial/HWE (v4.15 or so, perhaps), but I recommend using the latest kernel available for the LTS release because it carries the enhanced version of the driver.

P.S. this output comes from the script below:

#!/bin/bash
# Count commits touching the ALSA firewire stack for each kernel release.
# Run inside a Linux kernel git tree; wc prints lines (= commits), words, bytes.

paths="sound/firewire include/uapi/sound/firewire.h"

for curr in $(seq 1 19); do
  prev=$((curr - 1))
  echo "v4.$curr"
  git log --oneline "v4.$prev..v4.$curr" -- $paths | wc
done

# v5.0 directly follows v4.19
echo v5.0
git log --oneline v4.19..v5.0 -- $paths | wc

for curr in $(seq 1 5); do
  prev=$((curr - 1))
  echo "v5.$curr"
  git log --oneline "v5.$prev..v5.$curr" -- $paths | wc
done

@zamaudio zamaudio commented Feb 16, 2020

@takaswie As I mentioned in #24, and as you know, the SYT value is always zero for this device, so you cannot do clock recovery from the packets alone. Instead, you can take a system timestamp when each packet arrives and use that for clock recovery with a delay-locked loop. This is not my idea alone; it works in ffado for other Oxford devices that have SYT = 0.


@takaswie takaswie commented Feb 16, 2020

@zamaudio An idea alone accomplishes nothing.


@zamaudio zamaudio commented Feb 16, 2020

@takaswie https://kokkinizita.linuxaudio.org/papers/usingdll.pdf I have used this algorithm before to implement clock recovery. It includes a C code example.
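
For readers who don't want to open the paper, the following is a minimal Python sketch of a second-order delay-locked loop of the kind it describes, as I understand it. The simulation around it is entirely assumed: blocks of 441 frames, a true device rate of 44092 Hz, and up to 0.5 ms of timestamp jitter are illustrative figures, not measurements from a Digi 003. It only shows that the filter can pull a period estimate out of jittery timestamps; it says nothing about whether such timestamps are good enough inside the kernel.

```python
import math
import random


class DelayLockedLoop:
    """Second-order DLL that smooths jittery timestamps of periodic events."""

    def __init__(self, nominal_period, bandwidth_hz):
        omega = 2.0 * math.pi * bandwidth_hz * nominal_period
        self.b = math.sqrt(2.0) * omega  # gain applied to the predicted time
        self.c = omega * omega           # gain applied to the period estimate
        self.period = nominal_period     # filtered period estimate (seconds)
        self.next_time = None            # predicted time of the next event

    def update(self, timestamp):
        """Feed one raw timestamp; return the filtered period estimate."""
        if self.next_time is None:       # first event: only initialise
            self.next_time = timestamp + self.period
            return self.period
        error = timestamp - self.next_time
        self.next_time += self.b * error + self.period
        self.period += self.c * error
        return self.period


# Assumed scenario: the loop starts from the nominal 44100 Hz and has to
# discover that the device really runs at 44092 Hz.
FRAMES_PER_BLOCK = 441
TRUE_RATE = 44092.0
rng = random.Random(0)
dll = DelayLockedLoop(FRAMES_PER_BLOCK / 44100.0, bandwidth_hz=0.1)
for n in range(20000):
    ideal = n * FRAMES_PER_BLOCK / TRUE_RATE
    dll.update(ideal + rng.uniform(-0.0005, 0.0005))  # jittery observation

estimated_rate = FRAMES_PER_BLOCK / dll.period
print(f"estimated device rate: {estimated_rate:.1f} Hz")
```

A lower bandwidth rejects more jitter but reacts more slowly when the device clock drifts, so the 0.1 Hz used here is a tuning choice rather than a recommendation.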


@zamaudio zamaudio commented Feb 16, 2020

See https://github.com/janosvitok/ffado/blob/master/trunk/libffado/src/libstreaming/amdtp-oxford/AmdtpOxfordReceiveStreamProcessor.cpp#L40 for ffado implementation of pseudo-converting non-blocking to blocking transmission.

/* The issues with the FCA202 (and possibly other oxford FW92x devices) are:
 * - they transmit in non-blocking mode
 * - the timestamps are not correct
 *
 * This requires some workarounds.
 *
 * The approach is to 'convert' non-blocking into blocking so
 * that we can use the existing streaming system.
 *
 * The streamprocessor has a "pseudo-packet" buffer that will collect
 * all frames present. Once one SYT_INTERVAL of frames is present, we indicate
 * that a packet has been received.
 *
 * To overcome the timestamping issue we use the time-of-arrival as a timestamp.
 *
 */
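
The 'pseudo-packet' idea in that comment can be sketched in a few lines of Python. This is an illustration of the approach as described, not a transcription of the ffado code: the class name and method are hypothetical, and the SYT_INTERVAL of 8 frames is the value used at 44.1/48 kHz.

```python
SYT_INTERVAL = 8  # frames per blocking packet at 44.1/48 kHz


class PseudoPacketBuffer:
    """Collects frames from non-blocking packets into fixed-size blocks."""

    def __init__(self, interval=SYT_INTERVAL):
        self.interval = interval
        self.pending = []

    def push(self, frames, arrival_time):
        """Add the frames of one received packet.

        Returns a list of (timestamp, frames) blocks that became complete,
        each stamped with the arrival time of the packet that completed it.
        """
        self.pending.extend(frames)
        completed = []
        while len(self.pending) >= self.interval:
            block = self.pending[:self.interval]
            del self.pending[:self.interval]
            completed.append((arrival_time, block))
        return completed


buf = PseudoPacketBuffer()
first = buf.push(list(range(0, 5)), arrival_time=0.000125)    # 5 frames
second = buf.push(list(range(5, 11)), arrival_time=0.000250)  # 6 more frames
print(len(first), len(second), len(buf.pending))              # 0 1 3
```

The first packet is not enough to fill a block, the second completes one block of 8 frames, and the remaining 3 frames wait for the next packet.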

@takaswie takaswie commented Feb 17, 2020

@zamaudio In IEEE 1394, isochronous packet transmission is governed by the bus clock at 24.576 MHz. A crystal oscillator on the 1394 OHCI controller provides the clock signal; it generates 49.152 MHz, which is divided down. The controller also generates hardware IRQs according to this clock. In this respect, the device has enough accuracy for PCM frame processing.

However, in a modern operating system a hardware IRQ is not handled immediately, since the system implements lock primitives such as spin-locks. Such lock primitives use the CPU's IRQ masking feature. The software handler for the IRQ can therefore run late, and the sequence of IRQ handling includes jitter. Device driver developers, at least, understand this and take care to account for the jitter.

I don't know exactly what you mean by a system timestamp. As long as it's separate from the bus clock, it's not suitable for timestamp processing of isochronous packets on IEEE 1394. Reading the current system time on each call of the IRQ handler doesn't make sense. In the IRQ handler, calculations should be done using only the isochronous cycle that each packet carries, since the cycle comes from the more reliable 24.576 MHz bus clock.
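
As an illustration of what 'calculating by isochronous cycle' means in terms of arithmetic, here is a small Python sketch that decodes a 32-bit cycle timer value (7 bits of seconds, 13 bits of cycle count, 12 bits of offset within the cycle) into 24.576 MHz ticks and converts an elapsed tick count into PCM frames. The helper names and the sample values are made up for the example; this is not code from the ALSA firewire stack.

```python
TICKS_PER_SECOND = 24_576_000   # IEEE 1394 bus clock
CYCLES_PER_SECOND = 8000        # one isochronous cycle every 125 us
TICKS_PER_CYCLE = TICKS_PER_SECOND // CYCLES_PER_SECOND  # 3072


def cycle_time_to_ticks(value):
    """Decode a cycle timer value into ticks since its last 128 s wrap."""
    seconds = (value >> 25) & 0x7F
    cycles = (value >> 12) & 0x1FFF
    offset = value & 0xFFF
    return (seconds * CYCLES_PER_SECOND + cycles) * TICKS_PER_CYCLE + offset


def elapsed_ticks(earlier, later):
    """Ticks between two readings, allowing for the 128-second wrap-around."""
    wrap = 128 * TICKS_PER_SECOND
    return (cycle_time_to_ticks(later) - cycle_time_to_ticks(earlier)) % wrap


def frames_in_ticks(ticks, rate):
    """Number of PCM frames that a clock at `rate` Hz produces in `ticks`."""
    return ticks * rate / TICKS_PER_SECOND


# Hypothetical readings: second 1, cycle 100, offset 0 and
# second 1, cycle 108, offset 1536 (8.5 cycles later).
a = (1 << 25) | (100 << 12) | 0
b = (1 << 25) | (108 << 12) | 1536
print(elapsed_ticks(a, b), frames_in_ticks(elapsed_ticks(a, b), 48000))  # 26112 51.0
```

Because every quantity here derives from the bus clock, the result does not depend on when the CPU got around to running the handler.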

I stand by the above reasoning and have little interest in your idea.


@takaswie takaswie commented Feb 18, 2020

> In IEEE 1394, isochronous packet transmission is governed by the bus clock at 24.576 MHz. A crystal oscillator on the 1394 OHCI controller provides the clock signal; it generates 49.152 MHz, which is divided down. The controller also generates hardware IRQs according to this clock. In this respect, the device has enough accuracy for PCM frame processing.
>
> However, in a modern operating system a hardware IRQ is not handled immediately, since the system implements lock primitives such as spin-locks. Such lock primitives use the CPU's IRQ masking feature. The software handler for the IRQ can therefore run late, and the sequence of IRQ handling includes jitter. Device driver developers, at least, understand this and take care to account for the jitter.
>
> I don't know exactly what you mean by a system timestamp. As long as it's separate from the bus clock, it's not suitable for timestamp processing of isochronous packets on IEEE 1394. Reading the current system time on each call of the IRQ handler doesn't make sense. In the IRQ handler, calculations should be done using only the isochronous cycle that each packet carries, since the cycle comes from the more reliable 24.576 MHz bus clock.
>
> I stand by the above reasoning and have little interest in your idea.

@zamaudio For example, the Linux kernel has the isolcpus command-line option. CPUs whose IDs are given to the option are not used by the task scheduler, which means that no user processes are scheduled on those CPUs.

The Linux kernel also exposes an smp_affinity node per IRQ (under /proc/irq/) that lets users configure which CPUs may run the handler for that IRQ. When the IRQ handler of the Linux driver for the 1394 OHCI controller (the firewire-ohci kernel module) is assigned to an isolated CPU, we can control the load on that CPU without interference from processes or from other drivers taking CPU locks. In short, we can dedicate some CPUs to handling IRQs from the 1394 OHCI controller.

With the above options configured, you can see that the controller works with less jitter (<= 1 ms). For example, the firewire-ohci module has a debug module option, and the value 4 makes the module dump a timestamped entry when it runs the IRQ handler (this is not the same as the time of the IRQ request itself). You can see the timestamps of the dumps in syslog, and the jitter is quite low. This is thanks to the 24.576 MHz clock, which is separate from the system.

Without the above configuration, the dump includes large jitter (several milliseconds, up to several dozen milliseconds or so). This is due to the CPU locks taken by processes or other drivers, as I mentioned.

So you can see that reading the current system time while running the IRQ handler is not a good idea at all. It can include large jitter, unsuitable for timestamp processing.
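
Two small helpers can make this experiment easier to repeat; both are a rough Python sketch rather than an established tool. The first builds the hexadecimal bitmask that an smp_affinity node expects for a set of CPU IDs. The second takes handler timestamps that have already been extracted from the log (the extraction is left out because the exact log format is not shown in this thread) and reports the largest deviation of the intervals from their median. The sample timestamps are invented.

```python
import statistics


def smp_affinity_mask(cpus):
    """Hex bitmask string selecting the given CPU IDs (0..31 only)."""
    mask = 0
    for cpu in cpus:
        if not 0 <= cpu < 32:
            raise ValueError("this sketch handles CPU IDs 0..31 only")
        mask |= 1 << cpu
    return format(mask, "x")


def interval_jitter(timestamps):
    """Largest deviation of consecutive intervals from their median."""
    intervals = [b - a for a, b in zip(timestamps, timestamps[1:])]
    typical = statistics.median(intervals)
    return max(abs(i - typical) for i in intervals)


print(smp_affinity_mask([3]))                       # 8
# Invented handler timestamps in microseconds: one handler ran 500 us late.
print(interval_jitter([0, 1000, 2000, 3500, 4000]))  # 500.0
```

Comparing the second figure with and without the isolated CPU gives a single number for the effect described above.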


@zamaudio zamaudio commented Feb 19, 2020

> I don't know exactly what you call as system timestamp. As long as it's apart from the bus clock, it's not suitable for timestamp processing for isochronous packet on IEEE 1394.

As you know, the problem with the device is that the timestamps are missing from the packets, so you cannot do clock recovery from the packet information. We already agree on this. Therefore, any jittery timestamp marking packet arrival is better than no timestamp. The idea is to use (number of received frames in cycle / 8000) as a kind of timestamp and to update a DLL on top of that to filter out the jitter periodically. With this in place, you can collect all the frames into a pseudo-packet until SYT_INTERVAL frames have accumulated, then copy them to the real packet and mark the real packet as "arrived". Then you can use blocking transmission mode.
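
One possible reading of the frame-counting part of this proposal, sketched in Python: since exactly 8000 isochronous cycles occur per second, counting the frames received over a known number of cycles gives the device's average rate without reading any system clock. The function name is hypothetical, and the sketch assumes one packet per cycle with no dropped cycles; the packet mix below is invented to match the 44092 Hz example from earlier in the thread.

```python
CYCLES_PER_SECOND = 8000  # isochronous cycles per second on IEEE 1394


def estimate_rate(frames_per_packet):
    """Average device rate in Hz from per-cycle frame counts."""
    cycles = len(frames_per_packet)
    if cycles == 0:
        raise ValueError("need at least one packet")
    return sum(frames_per_packet) * CYCLES_PER_SECOND / cycles


# One second of invented traffic: 4092 packets of 6 frames, 3908 of 5 frames.
observed = [6] * 4092 + [5] * 3908
print(estimate_rate(observed))  # 44092.0
```

A raw average like this only updates slowly, which is where a DLL or another smoothing filter on top would come in.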
