Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SOF: Multicore support failed when _widget_free introduced. #2824

Closed
keyonjie opened this issue Apr 7, 2021 · 9 comments
Closed

SOF: Multicore support failed when _widget_free introduced. #2824

keyonjie opened this issue Apr 7, 2021 · 9 comments
Assignees
Labels
bug Something isn't working P1 Blocker bugs or important features

Comments

@keyonjie
Copy link

keyonjie commented Apr 7, 2021

We call sof_pipeline_core_enable() in sof_widget_setup() to power up the corresponding DSP cores,

ret = sof_pipeline_core_enable(sdev, swidget);

but not do the corresponding powering off Cores at sof_widget_free(), e.g. at runtime suspending,
int sof_widget_free(struct snd_sof_dev *sdev, struct snd_sof_widget *swidget)

this will lead to the subsequent run of the pipeline/widget after the runtime suspend/resume cycle on non-0 DSP core fail as the driver will keep think core 1 is already enabled according to the sdev->enabled_cores_mask,
which is not aligned with the FW side, the FW will complain with "ipc_process_on_core(): core 1 is disable" and reject to run it.
https://github.com/thesofproject/sof/blob/0b0733749c55b39c9e8699538b77c65f0c322397/src/ipc/ipc-common.c#L44

[  272.220926] sof-audio-pci-intel-tgl 0000:00:1f.3: ipc tx succeeded: 0x30100000: GLB_TPLG_MSG: PIPE_NEW
[  272.220929] sof-audio-pci-intel-tgl 0000:00:1f.3: widget PIPELINE.4.SSP1.IN setup complete
[  272.220931] sof-audio-pci-intel-tgl 0000:00:1f.3: core_enable widget: PCM2P swidget core: 1
[  272.220933] sof-audio-pci-intel-tgl 0000:00:1f.3: sdev->enabled_cores_mask: 0x3
[  272.220935] sof-audio-pci-intel-tgl 0000:00:1f.3: sdev->enabled_cores_mask: 0x3
[  272.220941] sof-audio-pci-intel-tgl 0000:00:1f.3: ipc tx: 0x30010000: GLB_TPLG_MSG: COMP_NEW
[  272.221036] sof-audio-pci-intel-tgl 0000:00:1f.3: error: ipc error for 0x30010000 size 20
[  272.221041] sof-audio-pci-intel-tgl 0000:00:1f.3: error: failed to load widget PCM2P
[  272.221044] sof-audio-pci-intel-tgl 0000:00:1f.3: error: failed to restore pipeline after resume -13
[  272.221050] sof-audio-pci-intel-tgl 0000:00:1f.3: ASoC: error at snd_soc_pcm_component_pm_runtime_get on 0000:00:1f.3: -5
[  272.221113]  NoCodec-2: soc_pcm_open() failed (-5)
[  272.221118]  smart-nocodec: ASoC: dpcm_be_dai_startup() failed at NoCodec-2 (-5)
[  272.221122]  smart-nocodec: dpcm_fe_dai_startup() failed (-5)
[  272.233943] sof-audio-pci-intel-tgl 0000:00:1f.3: ipc rx: 0x90020000: GLB_TRACE_MSG

This is actually SOF common issue, not TGL specific, and happen in all scenarios where widgets free is used, including runtime suspend/resume, not dedicated for dynamic pipelines only.

Proposed solution:
add refcount to each DSP core, increase/decrease the refcount when a widget/pipeline using the Core is created/freed, and do the core power up/down and core enable/disable IPC only when needed.

@plbossart
Copy link
Member

@keyonjie with all due respect, this is a terrible bug description...

Can you please reword it with code pointers so that we don't have to reverse-engineer your line of thought, or why it's TGL-specific, and whether this is for dynamic pipelines or not.

@keyonjie keyonjie changed the title TGL: Multicore support failed SOF: Multicore support failed when _widget_free introduced. Apr 8, 2021
@keyonjie
Copy link
Author

keyonjie commented Apr 8, 2021

@keyonjie with all due respect, this is a terrible bug description...

Can you please reword it with code pointers so that we don't have to reverse-engineer your line of thought, or why it's TGL-specific, and whether this is for dynamic pipelines or not.

Thanks for feedback, only had 3 minutes to heads up this on github last night, now revised.

@ranj063
Copy link
Collaborator

ranj063 commented Apr 8, 2021

@keyonjie I agree we should power off cores properly depending on refcount with multicore. But we must be missing something in our suspend sequence too. We must not be updating the enabled cores mask when poering up secondary cores via IPC. Something to look into

@keyonjie
Copy link
Author

keyonjie commented Apr 8, 2021

@keyonjie I agree we should power off cores properly depending on refcount with multicore. But we must be missing something in our suspend sequence too. We must not be updating the enabled cores mask when poering up secondary cores via IPC. Something to look into

It looks to me that the powering up of the secondary cores works fine (the enabled cores mask set correctly), but the widget free or runtime suspending not clearing this mask for us. Sorry I assigned this to you as I think you know this part best.

@keyonjie
Copy link
Author

keyonjie commented Apr 14, 2021

@plbossart since @ranj063 is in leave, I may take my old PR #888 back to address this issue.

@keyonjie
Copy link
Author

Hi @plbossart @ranj063 I managed to figure out solution for this in my PR #2850, please help to review.

@kv2019i kv2019i added bug Something isn't working P1 Blocker bugs or important features labels Apr 27, 2021
@kv2019i
Copy link
Collaborator

kv2019i commented Apr 28, 2021

This is impacting a key feature in sof-dev, so applying P1.

kv2019i added a commit to kv2019i/linux that referenced this issue Apr 28, 2021
The recent changes to use common code to power up/down DSP cores
also removed initialization of core status at suspend. It turns
this is still needed.

BugLink: thesofproject#2824
Fixes: 42077f0 ("ASoC: SOF: update dsp core power status in common APIs")
Signed-off-by: Kai Vehmanen <kai.vehmanen@linux.intel.com>
kv2019i added a commit to kv2019i/linux that referenced this issue Apr 29, 2021
The recent changes to use common code to power up/down DSP cores also
removed the reset of the core state at suspend. It turns out this is
still needed. When the firmware state is reset to
SOF_FW_BOOT_NOT_STARTED, also enabled_cores should be reset, and
existing DSP drivers depend on this.

BugLink: thesofproject#2824
Fixes: 42077f0 ("ASoC: SOF: update dsp core power status in common APIs")
Signed-off-by: Kai Vehmanen <kai.vehmanen@linux.intel.com>
plbossart pushed a commit that referenced this issue Apr 29, 2021
The recent changes to use common code to power up/down DSP cores also
removed the reset of the core state at suspend. It turns out this is
still needed. When the firmware state is reset to
SOF_FW_BOOT_NOT_STARTED, also enabled_cores should be reset, and
existing DSP drivers depend on this.

BugLink: #2824
Fixes: 42077f0 ("ASoC: SOF: update dsp core power status in common APIs")
Signed-off-by: Kai Vehmanen <kai.vehmanen@linux.intel.com>
plbossart pushed a commit that referenced this issue May 5, 2021
The recent changes to use common code to power up/down DSP cores also
removed the reset of the core state at suspend. It turns out this is
still needed. When the firmware state is reset to
SOF_FW_BOOT_NOT_STARTED, also enabled_cores should be reset, and
existing DSP drivers depend on this.

BugLink: #2824
Fixes: 42077f0 ("ASoC: SOF: update dsp core power status in common APIs")
Signed-off-by: Kai Vehmanen <kai.vehmanen@linux.intel.com>
Reviewed-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
Reviewed-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
@plbossart
Copy link
Member

@keyonjie isn't this issue solved by PR #2880 already? Please close if this is the case, thank you.

@keyonjie
Copy link
Author

keyonjie commented May 7, 2021

@keyonjie isn't this issue solved by PR #2880 already? Please close if this is the case, thank you.

Yes, let's close it, and file new multi-core issues if observed by CI after topology change thesofproject/sof#4153 merged.

@keyonjie keyonjie closed this as completed May 7, 2021
plbossart pushed a commit that referenced this issue May 11, 2021
The recent changes to use common code to power up/down DSP cores also
removed the reset of the core state at suspend. It turns out this is
still needed. When the firmware state is reset to
SOF_FW_BOOT_NOT_STARTED, also enabled_cores should be reset, and
existing DSP drivers depend on this.

BugLink: #2824
Fixes: 42077f0 ("ASoC: SOF: update dsp core power status in common APIs")
Signed-off-by: Kai Vehmanen <kai.vehmanen@linux.intel.com>
Reviewed-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
Reviewed-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
kv2019i added a commit that referenced this issue May 14, 2021
The recent changes to use common code to power up/down DSP cores also
removed the reset of the core state at suspend. It turns out this is
still needed. When the firmware state is reset to
SOF_FW_BOOT_NOT_STARTED, also enabled_cores should be reset, and
existing DSP drivers depend on this.

BugLink: #2824
Fixes: 42077f0 ("ASoC: SOF: update dsp core power status in common APIs")
Signed-off-by: Kai Vehmanen <kai.vehmanen@linux.intel.com>
Reviewed-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
Reviewed-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
kv2019i added a commit that referenced this issue May 28, 2021
The recent changes to use common code to power up/down DSP cores also
removed the reset of the core state at suspend. It turns out this is
still needed. When the firmware state is reset to
SOF_FW_BOOT_NOT_STARTED, also enabled_cores should be reset, and
existing DSP drivers depend on this.

BugLink: #2824
Fixes: 42077f0 ("ASoC: SOF: update dsp core power status in common APIs")
Signed-off-by: Kai Vehmanen <kai.vehmanen@linux.intel.com>
Reviewed-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
Reviewed-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
fengguang pushed a commit to 0day-ci/linux that referenced this issue May 28, 2021
The recent changes to use common code to power up/down DSP cores also
removed the reset of the core state at suspend. It turns out this is
still needed. When the firmware state is reset to
SOF_FW_BOOT_NOT_STARTED, also enabled_cores should be reset, and
existing DSP drivers depend on this.

BugLink: thesofproject#2824
Fixes: 42077f0 ("ASoC: SOF: update dsp core power status in common APIs")
Signed-off-by: Kai Vehmanen <kai.vehmanen@linux.intel.com>
Reviewed-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
Reviewed-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
fengguang pushed a commit to 0day-ci/linux that referenced this issue Jun 1, 2021
The recent changes to use common code to power up/down DSP cores also
removed the reset of the core state at suspend. It turns out this is
still needed. When the firmware state is reset to
SOF_FW_BOOT_NOT_STARTED, also enabled_cores should be reset, and
existing DSP drivers depend on this.

BugLink: thesofproject#2824
Fixes: 42077f0 ("ASoC: SOF: update dsp core power status in common APIs")
Signed-off-by: Kai Vehmanen <kai.vehmanen@linux.intel.com>
Reviewed-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
Reviewed-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
Link: https://lore.kernel.org/r/20210528144330.2551-1-kai.vehmanen@linux.intel.com
Signed-off-by: Mark Brown <broonie@kernel.org>
woodsts pushed a commit to woodsts/linux-stable that referenced this issue Jun 16, 2021
commit b640e8a upstream.

The recent changes to use common code to power up/down DSP cores also
removed the reset of the core state at suspend. It turns out this is
still needed. When the firmware state is reset to
SOF_FW_BOOT_NOT_STARTED, also enabled_cores should be reset, and
existing DSP drivers depend on this.

BugLink: thesofproject/linux#2824
Fixes: 42077f0 ("ASoC: SOF: update dsp core power status in common APIs")
Signed-off-by: Kai Vehmanen <kai.vehmanen@linux.intel.com>
Reviewed-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
Reviewed-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
Link: https://lore.kernel.org/r/20210528144330.2551-1-kai.vehmanen@linux.intel.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working P1 Blocker bugs or important features
Projects
None yet
Development

No branches or pull requests

4 participants