dai: fix xrun handling #7044

Merged
merged 1 commit into from
Feb 15, 2023

Contributor

@makarukp makarukp commented Feb 6, 2023

dai copy should continue with flow if status returns xrun warning

Signed-off-by: Piotr Makaruk <piotr.makaruk@intel.com>

@softwarecki
Collaborator

The DW DMA driver also reports -EPIPE when it detects an xrun (https://github.com/zephyrproject-rtos/zephyr/blob/main/drivers/dma/dma_dw_common.c#L804). Shouldn't we call dai_report_xrun for this driver? In my opinion, the default case should be general status read error handling, not xrun handling.

@softwarecki
Collaborator

Similar PR: #6965

@kv2019i kv2019i requested a review from juimonen February 6, 2023 11:35
Collaborator

@kv2019i kv2019i left a comment

Some comments inline, added @juimonen to review.

comp_warn(dev, "dai_copy(): dma_get_status() overrun occurred, ret = %u",
ret);
break;
default:
dai_report_xrun(dev, 0);
Collaborator

Hmm, something is still off here. It doesn't make sense to not call "dai_report_xrun()" when "-EPIPE" is returned.

I was looking at the old non-Zephyr driver: hda_dma_data_size() in sof/src/drivers/intel/hda/hda-dma.c checks for overrun/underrun exactly the same way as the new Zephyr driver (checking DGCS_BUR and DGCS_BOR).
And in dai-legacy.c:dai_copy(), the logic is similar: dma_get_data_size_legacy() is called and, if an xrun is detected, an error is reported.

And with this stack, tests are passing. So I think we need to understand why the difference here.

Member

@kv2019i IIRC calling dai_report_xrun(dev, 0); would cause the XRUN handler in the SOF Linux driver to run (and stop/restart the stream)? Should this not be something to enable/disable (default) at runtime?

@kv2019i
Collaborator

kv2019i commented Feb 6, 2023

@makarukp @softwarecki I think I now understand why we have native dai-zephyr.c.

If I run a test against a non-connected HDMI PCM node, with the non-native HD-DMA driver I get:

[   47.628023] <wrn> dai_comp: comp:1 0x40001 dai_copy(): nothing to copy
[   47.629000] <err> hda_dma: hda_dma_link_check_xrun(): underrun detected
[   47.729056] <err> hda_dma: hda_dma_link_check_xrun(): underrun detected
[   48.648050] <inf> ll_schedule: ll timer avg 2251, max 5396, overruns 0

With native Zephyr driver I get:

[    4.125471] <wrn> dai_comp: comp:1 0x40001 dai_copy(): nothing to copy
[    4.225553] <err> dai_comp: comp:1 0x40001 dai_report_xrun(): underrun due to no data available
[    4.225565] <err> component: comp:1 0x40001 comp_underrun(): dev->comp.id = 262145, source->avail = 384, copy_bytes = 0
[    4.225573] <inf> pipe: pipe:1 0x0 pipe trigger cmd 6
[    4.225588] <err> pipe: pipe:1 0x0 pipeline_copy(): ret = -32, start->comp.id = 262145, dir = 1

So in both cases the xrun is observed (which makes sense, as the display side is not connected in this case), but in the non-native case the pipeline is not stopped. It took a while, but then I realized that hda_dma_link_check_xrun() in the old src/drivers/intel/hda/hda-dma.c always returns 0 for success, even if an xrun is found.

I traced this back to this old commit from 2020:

commit 293dfe2469cb2f29895847cb3dd2bc831e3ea7c3
Author: Tomasz Lauda <tomasz.lauda@linux.intel.com>
Date:   Tue Mar 10 10:09:30 2020 +0100

    hda-dma: refactor xrun handling
    
    Changes behaviour of HDA Link overruns and underruns handling.
    Let's no longer stop the stream, but just report an error.
    It might happen that just after the release buffer is still
    full/not yet empty after the previous run, but it shouldn't
    affect the data, since stream has been paused anyway.

The Linux stack is still depending on this behaviour (the Linux audio servers open all the PCM nodes and for the HDMI nodes, the expectation is that PCM streaming can be started without an error even if no display is connected at the time).

So I think we need to modify xrun handling so that for the HDA link, we just log a warning, but do not stop the pipeline. Other xrun handling should be as before.
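
The proposal above can be sketched as a small decision function. This is only an illustration under stated assumptions, not the merged SOF code: the enums `dma_kind` and `xrun_action` and the function `xrun_policy()` are hypothetical names invented here. The only detail taken from the thread is that the DMA status query returns -EPIPE when an xrun is detected.

```c
#include <errno.h>

/* Hypothetical DMA categories; the real firmware has no such enum. */
enum dma_kind { DMA_KIND_HDA_LINK, DMA_KIND_DW };

/* Hypothetical outcomes of a status check in the DAI copy path. */
enum xrun_action {
	XRUN_ACTION_CONTINUE,	/* no problem, keep copying */
	XRUN_ACTION_WARN,	/* log a warning, keep the pipeline running */
	XRUN_ACTION_REPORT,	/* escalate to the pipeline, which may stop it */
	XRUN_ACTION_FAIL,	/* generic status read error */
};

/* Map a DMA status return code to an action, following the proposal. */
static enum xrun_action xrun_policy(enum dma_kind kind, int ret)
{
	if (ret == 0)
		return XRUN_ACTION_CONTINUE;

	if (ret != -EPIPE)
		return XRUN_ACTION_FAIL;

	/* HDA link xruns are expected, e.g. HDMI with no display attached. */
	if (kind == DMA_KIND_HDA_LINK)
		return XRUN_ACTION_WARN;

	return XRUN_ACTION_REPORT;
}
```

With this mapping, an HDA link xrun yields a warning only, while the same -EPIPE from a DW DMA channel is still escalated as before.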

@makarukp
Contributor Author

makarukp commented Feb 6, 2023

@kv2019i @softwarecki
I have reached a similar conclusion. However, I am now not 100% sure that the xrun reported by the DW DMA driver means the same thing as DGCS_BUR and DGCS_BOR in HDA DMA. Is it really an xrun, or an error?

#if CONFIG_DMA_DW_HW_LLI
	if (!(dw_read(dev_cfg->base, DW_DMA_CHAN_EN) & DW_CHAN(channel))) {
		LOG_ERR("xrun detected");
		return -EPIPE;
	}
#endif

I see that in the legacy code it was only a warning for HDA but an error for DW DMA. Maybe cases other than a standard xrun caused by a delay should be reported as errors.

What do you think?

@kv2019i
Collaborator

kv2019i commented Feb 6, 2023

@makarukp Ack, I can confirm that. With the non-native DW-DMA driver, xruns are reported and will stop the pipe. So this special handling is only for the HD-DMA link DMA. For Linux, we do need to keep the special handling, and moving the drivers to Zephyr poses a new challenge: how to maintain it. Hmm, maybe we put the xrun-raises-an-error logic behind a Kconfig option and otherwise just emit a warning to the FW log...?
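
A rough sketch of the Kconfig idea, assuming a build-time switch. The macro `DAI_XRUN_IS_ERROR` and the function `handle_dma_status()` are hypothetical stand-ins made up for this illustration; a real build would take the symbol from Kconfig, and the firmware would log through its own trace macros rather than `fprintf()`.

```c
#include <errno.h>
#include <stdio.h>

/* Stand-in for a Kconfig-generated symbol; the name is hypothetical. */
#ifndef DAI_XRUN_IS_ERROR
#define DAI_XRUN_IS_ERROR 0
#endif

/*
 * Handle the result of a DMA status query.
 * Returns 0 if the copy may continue, or a negative errno to stop it.
 */
static int handle_dma_status(int ret)
{
	if (ret == -EPIPE) {
		if (DAI_XRUN_IS_ERROR) {
			/* Option enabled: the caller treats the xrun as an error. */
			return ret;
		}
		/* Default: warn only and let the copy continue. */
		fprintf(stderr, "warning: DMA xrun detected, continuing\n");
		return 0;
	}

	/* 0 on success; any other error propagates unchanged. */
	return ret;
}
```

Keeping the switch inside one braced `if` avoids stacking unbraced statements under preprocessor conditionals.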

Member

@lgirdwood lgirdwood left a comment

Just some minor comments

/* DMA status can return -EPIPE and current status content if xrun occurs */
if (ret == -EPIPE)
#if XRUN_RAISES_AN_ERROR
dai_report_xrun(dev, 0);
Member

and this will eventually call the IPC3/4 version of the notification in subsequent PRs.

Member

no action needed here, btw, just making an observation.

Collaborator

It would be good to enclose the contents of the if in {}. We have several levels of if that depend on #if. It's easy to make a mistake.

Collaborator

Hmm. @makarukp @softwarecki @lgirdwood I wonder if we should start by just making the xrun handling here emit warnings and not call dai_report_xrun() yet. That would allow us to unblock Zephyr upstream updates.

The CONFIG_NOTIFY_HOST_ON_XRUN name is very misleading, as no notifications are sent to the host in the current code. What dai_report_xrun() does is escalate the problem to the pipeline level, which will try to restart the pipeline. There is also another layer of conditional logic after dai_report_xrun(), e.g. the "NO_XRUN_RECOVERY" option in src/audio/pipeline/pipeline-xrun.c at the pipeline level.

So seems more time is needed to sort out how we introduce the IPC4 notification mechanisms (which are new) and how we tackle the problem that HDA link xruns happen in common use-cases and in the past we have ignored them.

We still have xrun logic at the pipeline level driven by buffering status, so it's not like we are disabling xrun handling completely.

@makarukp makarukp force-pushed the dai_xrun_handling branch 3 times, most recently from 7c26530 to 2f9e198 Compare February 7, 2023 15:10
Collaborator

@kv2019i kv2019i left a comment

I propose we simplify even further and just emit warnings and do not call dai_report_xrun() yet. There seem to be more open issues (see inline).

@lgirdwood
Member

I propose we simplify even further and just emit warnings and do not call dai_report_xrun() yet. There seem to be more open issues (see inline).

Fine by me.

"ret = %u", ret);
else
comp_warn(dev, "dai_copy(): dma_get_status() overrun occurred, "
"ret = %u", ret);
Collaborator

I think you need braces here. GCC used to complain about constructs like

if (a)
	if (b)
		do_x();
	else
		do_y();

because of the potential ambiguity of the else (this ain't no Python). This probably wasn't compiled with CONFIG_NOTIFY_HOST_ON_XRUN disabled.
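
To make the ambiguity concrete, here is a small sketch of the two possible readings of the unbraced snippet, each written with explicit braces. The function names are made up for this illustration. In C, an else always pairs with the nearest unmatched if, so the first function shows what the compiler actually does with the unbraced form.

```c
#include <stdbool.h>

/* How C parses the unbraced form: the else pairs with the inner if. */
static int else_binds_inner(bool a, bool b)
{
	int r = 0;

	if (a) {
		if (b) {
			r = 1;	/* do_x() */
		} else {
			r = 2;	/* do_y() */
		}
	}
	return r;
}

/* The other reading a human might intend: the else pairs with the outer if. */
static int else_binds_outer(bool a, bool b)
{
	int r = 0;

	if (a) {
		if (b) {
			r = 1;	/* do_x() */
		}
	} else {
		r = 2;	/* do_y() */
	}
	return r;
}
```

The two versions differ whenever `b` is false or `a` is false, which is why explicit braces matter once a preprocessor conditional can remove one of the branches.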

Collaborator

@kv2019i kv2019i left a comment

Ok with the fix now, but if you are refreshing, a few style comments to the Kconfig option to make it more understandable.

help
Select for handling DMA xruns and reporting them as errors.
Otherwise, only warnings are emitted.

Collaborator

I won't block on this, but if we have the Kconfig, I'd do:

  • put it at the end of audio/Kconfig after WRAP_ACTUAL_POSITION (now it's in middle of SRC related options)
  • name it DAI_REPORT_XRUNS
  • description: "Report DAI overrun/underruns to pipeline"

This would better reflect what this does.

@makarukp
Contributor Author

makarukp commented Feb 8, 2023

I propose we simplify even further and just emit warnings and do not call dai_report_xrun() yet. There seem to be more open issues (see inline).
@lgirdwood @kv2019i
Does it mean I could remove the Kconfig dependency and simplify it to the example below?

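switch (ret) {
case 0:
        break;
case -EPIPE:
        /* xrun: warn only and continue with the copy */
        comp_warn(dev, "dai_copy(): dma_get_status() xrun occurred, ret = %d", ret);
        break;
default:
        /* any other status read error */
        return ret;
}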

@marc-hb marc-hb added bug Something isn't working as expected P1 Blocker bugs or important features labels Feb 8, 2023
@makarukp makarukp force-pushed the dai_xrun_handling branch 2 times, most recently from 27b1f4c to ade5593 Compare February 9, 2023 08:30
Collaborator

@kv2019i kv2019i left a comment

@makarukp You have unrelated rimage changes in the updated commit.

Otherwise looks good to go, thanks!

@lgirdwood
Member

@makarukp can you check the CI. Thanks !

lyakh previously requested changes Feb 10, 2023
@lyakh lyakh dismissed their stale review February 10, 2023 15:38

comment addressed. Since I personally haven't followed closely the decision to remove xrun reporting completely in that case, I won't explicitly approve, but just remove my request for change

@lgirdwood
Member

@makarukp can you check CI, thanks !

@kv2019i
Collaborator

kv2019i commented Feb 13, 2023

@wszypelt This one seems to have hit unexpected errors as well.

dai copy should continue with flow if status returns xrun warning

Signed-off-by: Piotr Makaruk <piotr.makaruk@intel.com>
@kv2019i
Collaborator

kv2019i commented Feb 15, 2023

@wszypelt @makarukp This time it seems to be an unrelated failure running "git clone". Can you check and agree on how best to handle it (re-push or re-run of CI)?

@kv2019i kv2019i merged commit 5b8c622 into thesofproject:main Feb 15, 2023
@kv2019i
Collaborator

kv2019i commented Feb 15, 2023

Now merged! Thank you @wszypelt @makarukp @lyakh for helping to get this across the finish line.
