Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Mytek Stereo192-DSD-DAC crackling since duplex stream synchronisation removal CIP_SYNC_TO_DEVICE in 2016 #34

Closed
vermeeren opened this issue Mar 15, 2021 · 20 comments
Assignees
Labels

Comments

@vermeeren
Copy link
Contributor

vermeeren commented Mar 15, 2021

Hi Takashi,

With Mytek Stereo192-DSD-DAC there is consistent crackling in audio stream with post-2016 driver stack, I believe due to quirk in firmware or hardware of Stereo192-DSD-DAC with Dice Mini (TCD2210) possibly in combination with ALSA firewire driver, as Focusrite Saffire Pro 24 microphone interface works well with both old and new driver stack. Here are some loopback recordings with following configuration: MPD -> Jack -> Stereo192 -> Headphones -> Microphone -> Pro24 -> Jack -> Audacity. Audio signal is 1khz test tone, there are no XRUNs in driver, except possibly 96000hz example.

demonstration.tar.gz (file 192000_olddrv.flac demonstrates correct playback with old ALSA firewire driver stack.)

I originally discovered this in 2016, and reported the issue to alsa-devel in 2017[1]. At the time, changes in ALSA firewire stack were major, causing not only crackling but also failure to change sampling frequency rate. I note current driver already re-implemented all such features in presumably a much better way, as it's working very well besides crackling issue. Later I submitted patch for hard-coded parameters for Stereo192-DSD-DAC in 2018[2], which is required for this model. Finally we had some brief contact in 2020 about another user attempting to use this device[3].

Since 2016 I have been running old ALSA firewire stack for DICE (snd_firewire_lib and snd_dice), which is based upon:

  • 2eb65d67afbf9364b525b657f1475d1a2cbc27de ALSA: dice: expand timeout to wait for Dice notification

Various minor changes and cherry picks over the years to keep it compatible with more modern kernel and backports for important bugs. Recently I upgraded from Debian buster to Debian bullseye, which is from kernel 4.19 to 5.10. It appears changes in ALSA have been significant and not fully backwards compatible, as with old modules suspend is broken, poweroff occasionally has hard CPU lockup and there are stability issues when doing insmod rmmod at times, especially rmmod firewire_ohci. Current kernel is 5.10.19-1 with Debian version 5.10.0-4-rt-amd64.

As running with old modules appears to be no longer feasible due to above issues, which do not occur with new modules, it was time to fully research the exact issue and determine which exact commit introduces this regression for Stereo192-DSD-DAC. This meant reading through the commit messages and code changes of ALSA firewire stack between last known good version 2eb65d67afbf9364b525b657f1475d1a2cbc27de and known broken version kernel v4.6, as well as general reading of all later commits to see if there may have been relevant changes.

After a tedious process of merging in changes per commit, making it build with modern kernel, and module removal and insertion I was able to identify the commit resulting in the crackling regression for Stereo192-DSD-DAC:

  • 1387e3eafa94837342b044b429b79830998009ac ALSA: dice: drop duplex streams synchronization to transfer own time stamps

This commit effectively disables usage of CIP_SYNC_TO_DEVICE, resulting in loss of duplex stream synchronisation as described in commit message. This unexpectedly causes the crackling issue on Stereo192-DSD-DAC. However, from commit message:

As long as I experienced, this causes the units to generate no sounds at
first time to receive packets. This issue occurs only with Dice II. I guess
this is due to a quirk of the PLL. In short, the PLL cannot generate firm
signals to ADCs/DACs or the other ICs when no packets are received in the
beginning of packet streaming. While, on second time or later, the unit
generates sound correctly. I guess that starting packet streaming at first
time sets the PLL correctly.

This is the exact behaviour change noted for Stereo 192-DSD-DAC, even though it is Dice Mini (TCD2210) per firmware research you did in 2018[4]. With old modules playback did almost always not work the first time at a new sampling rate, but when it plays audio the second time the lock is perfect and playback is good. With new modules playback always works first time, but also always with crackling.

Later commits however remove CIP_SYNC_TO_DEVICE feature completely from AMDTP stream implementation, as it was thought this feature was not needed.

  • dec63cc8b65b04c0ebb5d82b6b399665d6d44dca ALSA: firewire-lib: handle IT/IR contexts in each software interrupt context
  • d9a16fc926a950c9e481cb7e89c554593b8e29e2 ALSA: firewire-lib: code cleanup for incoming packet handling
  • 390a1512e6ccda2ec32ea1395814f36cf4d30e48 ALSA: firewire-lib: code cleanup for outgoing packet handling

Although I am somewhat familiar with concepts because of research to determine the issue, I am not familiar enough to re-implement duplex stream synchronisation in modern ALSA firewire driver. I also cannot determine the real problem very well, it could be duplex stream synchronisation was masking the real bug in either device or ALSA firewire driver.

What are your thoughts on this issue? If there is any further testing I can do to help feel free to let me know.

[1] https://mailman.alsa-project.org/pipermail/alsa-devel/2017-September/125524.html
[2] https://mailman.alsa-project.org/pipermail/alsa-devel/2018-May/136194.html
[3] https://mailman.alsa-project.org/pipermail/alsa-devel/2020-May/168114.html
[4] https://mailman.alsa-project.org/pipermail/alsa-devel/2018-May/136198.html

Thanks,

Melvin.

@takaswie
Copy link
Owner

takaswie commented Mar 16, 2021

@vermeeren

This is the exact behaviour change noted for Stereo 192-DSD-DAC, even though
it is Dice Mini (TCD2210) per firmware research you did in 2018[4]. With old
modules playback did almost always not work the first time at a new sampling
rate, but when it plays audio the second time the lock is perfect and playback
is good. With new modules playback always works first time, but also always
with crackling.

Another negative side of the implementation for CIP_SYNC_TO_DEVICE is to
handle one 1394 OHCI isochronous transmit (IT) context (=PCM playback direction)
in IRQ handler for one 1394 OHCI isochronous receive (IR) context (=PCM capture
direction). The implementation aims to reuse information in the packet for IR
context for packet IT context; e.g. the number of PCM frames in packet,
timestamp of the packet.

However, DICE ASICs can support two pairs of isochronous packet streams as maximum
for each direction. The implementation can't support all of models based on DICE
ASICs. For example, models by Focusrite supports:

name num. of isoc. streams for capture direction num. of isoc. streams for playback direction
Saffire Pro 14 1 1
Saffire Pro 24 1 1
Saffire Pro 24 DSP 1 1
Saffire Pro 26 2 1 (yes...)
Saffire Pro 40 2 2
Liquid Saffire 56 2 2

Furthermore, when the target device skips some isochronous cycle to transfer
packets, hardware IRQ of 1394 OHCI controller for IR context is postponed. As
a result, the implementation of CIP_SYNC_TO_DEVICE can not keep constant
throughput for IT context.
the sequence of IR packets has the lack of
information to process packets for IT context every isochronous cycle. Although
the lack of information should be complemented to process packets in IT
context, the implementation of CIP_SYNC_TO_DEVICE has no mechanism for
the above. OXFW and Fireface are the kind of device and the implementation
brings troubles to them.

For the above reasons, the implementation was dropped.

Although I am somewhat familiar with concepts because of research to determine
the issue, I am not familiar enough to re-implement duplex stream
synchronisation in modern ALSA firewire driver. I also cannot determine the
real problem very well, it could be duplex stream synchronisation was masking
the real bug in either device or ALSA firewire driver.

The main of underlying issue is that some models (and yours) require software
driver to replay the sequence of packets for IT context in a point of the number
of PCM frames per packet, but IEC 61883-1/6 packet stream engine in ALSA
firewire stack doesn't. The engine still use pre-computed table to process
packets in IT context. Although the table gives throughput exactly the same as
sampling frequency for the number of PCM frames per packet, it is not suitable
to the models. The next underlying issue is timestamp per packet.

Between 2019 and 2020, I've implemented 'AMDTP domain'[1][2]. The concept is:

  • process several IR/IT context in one hardware IRQ handler
  • the handler runs for IT context (=irq_target) for constant throughput

My work for it is still in intermediate. I have a plan to integrate it for:

  • one of the IR contexts is selected as the context in which timing
    information is reused to process for the others (not implemented yet)
  • during no packet in IR context is available, process packets for IT context
    with pre-computed timing information. (not implemented yet)

Currently, the timing information for IT context is still based on pre-computed
table which implemented in the beginning of ALSA firewire stack[3] therefore
some models still generate sound with crackling noise.

In my plan, I use this spring and summer for the above tasks.
I expect it will be tough work since ALSA firewire stack supports different 8
types of devices. The sequence of packet for the types are different each other.

(our work look to regain the items from land of that which is forgotten.)

[1]https://mailman.alsa-project.org/pipermail/alsa-devel/2019-August/153388.html
[2]https://mailman.alsa-project.org/pipermail/alsa-devel/2020-May/167447.html
[3]https://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound.git/commit/?id=31ef9134eb52

@takaswie takaswie self-assigned this Mar 16, 2021
@takaswie takaswie added the bug label Mar 16, 2021
@takaswie
Copy link
Owner

@vermeeren Anyway I respect your work to bisect patches and find the positive side of CIP_SYNC_TO_DEVICE. I apologize not to give enough information in advance.

@vermeeren
Copy link
Contributor Author

vermeeren commented Mar 19, 2021

@takaswie Thanks a lot for the detailed reply, I appreciate it. No worries about a lack of information at all.

I have done some reading in the other issues which I think is effectively the same problem as with this DAC, I see there is a design task at hand with the comment in #26 (comment). If I understand correctly, the problem is dynamic runtime determination of SYT interval?

Really I only have some crude, basic ideas for this, as I don't know enough of the technical side so perhaps there are hardware constraints. But abstractly speaking, would something like that be suitable? Assuming inbound packets from device is always device's clock.

  1. Start streaming at pre-computed table rate.
  2. Count number of inbound packets for some interval, perhaps some milliseconds or a second?
  3. Calculate packets/second or similar with above information and set SYT interval based upon newly calculated value.
  4. Reset counter, go to 2.
    • Alternatively, further refine calculations using the average of the past X intervals instead of only the latest. Probably over-engineering.

With this type of approach, sound may only have crackling the first counter interval period, which is not really a problem in practice as this type of device is used for long playback/recording session typically with Jack or perhaps ALSA directly.

Edit: I now realise this is most likely the exact same idea you proposed in earlier comment, it sounds good to me!


Unrelated: I have confirmed that there is a incredibly small chance that DAC works properly with new driver on stream start. I feel like the chance for this is a 5% at best. Presumably this is when you are lucky and the pre-computed table matches exactly the signal lock. If it does work like this, it will work the entire lock duration perfectly for many hours.

@vermeeren
Copy link
Contributor Author

@takaswie By the way, I recall the Windows driver for this device takes multiple real seconds when changing the lock. I suppose their approach is to forbid any playback/capture from userspace until the timing has stabilised which takes those few seconds. For ALSA I think such an additional forced delay has more negatives than positives, as the timing will stabilise very quickly once streams are started.

Also, feel free to CC me in any relevant devel mails, if it would help for me to do some hacking or patch reviewing I'd be glad to do so too.

@takaswie
Copy link
Owner

@vermeeren As another topic, I'm currently working for systemd to support hardware database for units in IEEE 1394[1]. I mostly generate these entries from my collection of images for configuration rom, retrieved from actual devices. I realized to have no image of your Mytek Stereo192-DSD-DAC. Would I request you to make image file and send it to me, or MR to my repository?

[1] systemd/systemd#19124
[2] https://github.com/takaswie/am-config-roms/

Thanks

@vermeeren
Copy link
Contributor Author

@takaswie Yes of course I can do that, I tried reading a bit but cannot find how to make such an image. Could you tell me how to do it?

@takaswie
Copy link
Owner

@vermeeren

Yes of course I can do that, I tried reading a bit but cannot find how to make such an image. Could you tell me how to do it?

Sorry to request without any instruction...

Linux kernel has sysfs to export attributes of kobjects, which represent any instance maintained in kernel space. Linux FireWire subsystem maintains nodes and units in IEEE 1394 bus as kobject, and adds an attribute for content of configuration ROM into the kobject corresponding to the node.

When your Mytek Stereo192-DSD DAC is detected as fw1, the image can be created by:

$ cat /sys/bus/firewire/devices/fw1/config_rom > mytek-stereo192-dsd.img

Cherrs

@vermeeren
Copy link
Contributor Author

@takaswie I have created MR to your repository, see takaswie/am-config-roms#2. Thanks!

@takaswie
Copy link
Owner

Hi @vermeeren ,

Just now I pushed patches to topic/media-clock-recovery remote branch (f3bdd2c).
Would I ask you to test it with your device?

With this patchset, ALSA IEC 61883-1/6 packet streaming engine processes incoming and
outgoing packets by:

  1. Transfer outgoing packet with pre-computed table rate in its beginning.
  2. Pool presentation timestamp in incoming packets after starting receiving them
  3. Then transfer outgoing packet by media clock recovered from the presentation timestamp

In a prompt test, it looks well with TC Electronic Konnekt 24d at 44.1/48.0 kHz. I wish it also
works well in your device.

Cheers.

@vermeeren
Copy link
Contributor Author

vermeeren commented Apr 19, 2021

Hi @takaswie

Had some time today to review and test patches. I did testing with Debian bullseye kernel 5.10.28 and can confirm it works perfectly. Tried various settings, period/buffer size and frequencies, the lock always works in one go and there are no pops or other problems in audio signal. Patches themselves I read too and it all looks very good to me.

I also tested combination with Saffire Pro 24, which worked before, and this unit still works perfectly, so no change there. We can conclude it fully fixes problem for Stereo 192-DSD DAC and presumably all Dice-based audio interfaces requiring clock recovery.

Thanks a lot for your work, I really appreciate it! Feel free to let me know if you need help with something in the future, I'll gladly take a look.

@takaswie
Copy link
Owner

Hi @vermeeren ,

Thanks for your test, and it's nice to work well in your side, too.

Although I still find some issues in the other devices (e.g. TC Electronic Studio Konnekt 48, Alesis iO 26 and iO 14, Focusrite Saffire Pro 40, Focusrite Liquid Saffire 56), it's a good start to improve ALSA IEC 61883-1/6 packet streaming engine against the long-standing problem.

The patchset is still a skeleton since it doesn't include any check against unexpected situation (e.g. the incoming packets from device is discontinued) and I continue to work this week. Later I will contact to you for further testing. Additionally, I will post to mailing lists (e.g. LAD) for further testing. Anyway I'm glad to get your cooperation again.

Cheers.

@vermeeren
Copy link
Contributor Author

@takaswie Yeah I noticed today that if a discontinuity occurs it appears to hang, freezing the userspace process holding the ALSA device also. With the old driver from 2016 on discontinuity with DAC the DAC would lose audio playback (Saffire Pro24 would still work) but packet streaming would continue, requiring restart of Jack in practice. With current patchset SIGKILL to Jack and restart Jack makes it work again.

Having a recovery mechanism for packet discontinuity would be very nice, I await with anticipation!

@takaswie
Copy link
Owner

takaswie commented Apr 27, 2021

Hi @vermeeren ,

Having a recovery mechanism for packet discontinuity would be very nice, I await with anticipation!

I implement it and move HEAD of topic/media-clock-recovery remote branch into 86a686e d54ba5d.
Would I request you to test it with your devices?

In this trial, I implements:

  • cycle discontinuity check and let applications recover ALSA PCM substream by receiving XRUN state
  • transfer isoc packets to device at the same timing even if they are transferred by several IT context
    • for dualwire function of DICE ASICs

Especially, I'm successful at last to get clear sound from below models at 44.1/48.0 kHz:

  • Focusrite Saffire Pro 40 (supporting two pairs of isoc streams)
  • Focusrite Liquid Saffire 56 (supporting two pairs of isoc streams)
  • TC Electronic Studio Konnekt 48 (supporting two pairs of isoc streams)

However, I still found some troubles at higher sampling transfer frequency.
If you attempt to test at higher sampling rate, I'm appreciate to receive your report as well.

Thanks

@vermeeren
Copy link
Contributor Author

Hi @takaswie,

Today I review new patches and test them with Stereo 192-DSD DAC. The XRUN handling appears to work nicely, as intentionally causing XRUN Jack reports it as expected but the stream recovers and keeps playing audio successfully.

About sampling rate, that is actually unexpected. Even with the previous patchset 192kHz S32_LE playback worked correctly on Stereo 192-DSD DAC, I think I also tried 176.4kHz which also worked fine. Current patchset also handles it nicely, I don't notice any changes here. However from formation file:

Output stream from unit:
        low     middle  high    MIDI
Tx 0:   8       8       8       0
Tx 1:   0       0       0       0
Input stream to unit:
        low     middle  high
Rx 0:   4       4       4       0
Rx 1:   0       0       0       0

If I understand correctly, Stereo 192-DSD DAC only uses a single pair of isoc streams even for 192kHz sampling frequency, so presumably that is why high sampling rate works correctly on this unit.

As for Focusrite Saffire Pro 24, it works well like with previous patches. This device has maximum 96kHz sampling frequency and I think also a single pair of isoc streams. The formation for this unit:

Output stream from unit:
	low	middle	high	MIDI
Tx 0:	16	12	0	1
Tx 1:	0	0	0	0
Input stream to unit:
	low	middle	high
Rx 0:	8	8	0	1
Rx 1:	0	0	0	0

Note that from userspace point of view I always open DAC in playback-only direction mode and Saffire in bidirectional mode, but I do not think this matters for kernel driver.


Unrelated to the above, sometimes I cannot open streams and following shows up in kernel ring buffer:

snd_dice fw2.0: isochronous resources exhausted

It always happens after turning on/off a device, never while they are on even for hours. Sometimes I can fix this by power cycling DAC and/or Saffire, but sometimes I feel like a reboot is needed to correct this problem, not sure. Is this an issue you are familiar with? It could be an issue with firewire_ohci too maybe?

Cheers.

@takaswie
Copy link
Owner

takaswie commented May 3, 2021

About sampling rate, that is actually unexpected. Even with the previous patchset 192kHz S32_LE playback worked correctly on Stereo 192-DSD DAC, I think I also tried 176.4kHz which also worked fine. Current patchset also handles it nicely, I don't notice any changes here. However from formation file:

Hm. You mean your device was not under locked state at 192.0 kHz?

Note that from userspace point of view I always open DAC in playback-only direction mode and Saffire in bidirectional mode, but I do not think this matters for kernel driver.

Yep.

Unrelated to the above, sometimes I cannot open streams and following shows up in kernel ring buffer:

snd_dice fw2.0: isochronous resources exhausted

It always happens after turning on/off a device, never while they are on even for hours. Sometimes I can fix this by power cycling DAC and/or Saffire, but sometimes I feel like a reboot is needed to correct this problem, not sure. Is this an issue you are familiar with? It could be an issue with firewire_ohci too maybe?

Thanks for the report. The issue seems to come from bugs in current implementation of ALSA dice driver. I'll investigate further later.

The message mean that all of available isochronous context in 1394 OHCI controller under used are reserved. In IEEE 1394, when using isochronous communication, any node should reserve isochronous resources such as channel and bandwidth, maintained by isochronous resource manager (=typically 1394 OHCI controller).

@vermeeren
Copy link
Contributor Author

Hm. You mean your device was not under locked state at 192.0 kHz?

@takaswie I mean the device (Stereo 192-DSD DAC) locks properly at 192kHz and playback works perfectly, both with current topic/media-clock-recovery and the previous version from #34 (comment).

So it is a bit strange that on the devices you test in #34 (comment) there are troubles with higher sampling frequency. At a glance the difference is that my device has only one pair of isoc streams, but yours have two pairs?

Cheers.

@takaswie
Copy link
Owner

takaswie commented May 4, 2021

I mean the device (Stereo 192-DSD DAC) locks properly at 192kHz and playback works perfectly, both with current topic/media-clock-recovery and the previous version from #34 (comment).

Now I got it. Thanks for your indication.

So it is a bit strange that on the devices you test in #34 (comment) there are troubles with higher sampling frequency. At a glance the difference is that my device has only one pair of isoc streams, but yours have two pairs?

At present, I think so. The mechanism for device to process PCM frames in received packets has difference somehow between with one pair and two pairs. I have some hypothesis that they require received packets at the same isoc cycle with the same SYT value, and they require received packets with sequence replayed from transferred packets at the beginning of the packet streaming already. I'm currently working for the hypothesis.

Thanks

@takaswie
Copy link
Owner

takaswie commented Jun 1, 2021

I note that patchset for the issue is posted to upstream:

@takaswie
Copy link
Owner

takaswie commented Jun 1, 2021

The patchset is merged to upstream. The patches are also available in master branch of the repository.
Let me close the issue. Thanks for your cooperation.

@takaswie takaswie closed this as completed Jun 1, 2021
@vermeeren
Copy link
Contributor Author

@takaswie I have updated modules with DKMS some time ago and it has been working fantastically the past week or two, thank you a lot for the work! Happy to see it merged upstream.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants