Skip to content

optimize copy times and remove dai device from copier#7229

Closed
btian1 wants to merge 12 commits intothesofproject:mainfrom
btian1:daiopt
Closed

optimize copy times and remove dai device from copier#7229
btian1 wants to merge 12 commits intothesofproject:mainfrom
btian1:daiopt

Conversation

@btian1
Copy link
Contributor

@btian1 btian1 commented Mar 8, 2023

After SOF upgraded to IPC4, copier actually created 2 devices for
data input(capture) and output(playback), it is
dai and copier,
hence there are two times copy from DMA buffer
to endpoint buffer then to component buffer.

This PR intend to remove dai device and one more time copy
through use dai data to handle host interface calling.

Basically, there are several steps to go there:

Add dai data to copier data structure.
Share dai data structure to copier by add a dai copier common header file.
In copier, all dai related ops calling should be replaced with new dai exposed interface.
Remove one more time copy and dai endpoint buffer.
Remove dai device.
Performance data:

mcps, saved 1.6mcps, 5.4 --> 3.8, around 30% saved for copier.
Before: 0x40001 perf comp_copy samples 48 period 1000 cpu avg 533 peak 553
After: 0x40001 perf comp_copy samples 48 period 1000 cpu avg 378 peak 397

memory, since host endpoint buffer and host device got removed, for each copier,
16bit/32bit 384/768 bytes saved for buffer, 112 saved for dai device.
As a summary, 16/32 bit playback/capture totally saved 496/884 bytes for each dai copier usage.
For the cases that have more than 10 copier usage, memory saving is quite remarkable.

Note:multiple sink dai copier is not tested and will be fixed in follow up PR.

Signed-off-by: Baofeng Tian baofeng.tian@intel.com

@btian1 btian1 force-pushed the daiopt branch 6 times, most recently from 3dedd75 to 5f93a1f Compare March 8, 2023 07:10
@btian1 btian1 changed the title [DONT REVIEW] DRAFT PR FOR DAI OPTIMIZATION [DON'T REVIEW] DRAFT PR FOR DAI OPTIMIZATION Mar 8, 2023
@btian1 btian1 force-pushed the daiopt branch 23 times, most recently from 0806524 to 0d5f6d0 Compare March 13, 2023 14:02
@gkbldcig
Copy link
Collaborator

Can one of the admins verify this patch?

@btian1
Copy link
Contributor Author

btian1 commented Mar 29, 2023

Can one of the admins verify this patch?

this is still under development, please stay tuned.

Copy link
Member

@lgirdwood lgirdwood left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment on lines 370 to 368
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can probably make some of these light function static inline in the header.

Copy link
Contributor Author

@btian1 btian1 Apr 25, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agree, just because this is an formal interface for driver, if move ipc4 to inline, then legacy part also need inline, it is not a performance function, so overhead is not a problem.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Needs some inline commenst to explain what this is doing.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this could be an inline in the header.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it is part of drv interface, and copier actually does not use dai_copy, it is using dai_zephyr_copy.

@lgirdwood
Copy link
Member

mcps, saved 1.6mcps, 5.4 --> 3.8, around 30% saved for copier.
Before: 0x40001 perf comp_copy samples 48 period 1000 cpu avg 533 peak 553
After: 0x40001 perf comp_copy samples 48 period 1000 cpu avg 378 peak 397

@btian1 this is a good saving for link copier. Good work.

@btian1
Copy link
Contributor Author

btian1 commented Apr 10, 2023

pier. Good work.

this is with log print data, actual saving for MCPS should be larger, due to zephyr log print always have impact on performance measure.

Copy link
Collaborator

@RanderWang RanderWang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do we need to create endpoint buffer for dai ? when copier is bound to another module, the buffer is created. Dai can reuse copier buffer. please check host part.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

return -ENODEV; error label does nothing now

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

else if. This time cd->ipc_gtw is unimportant? Or you could use switch (dev->ipc_config.type)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

config->type = SOF_COMP_HOST;

gtw assigned type as HOST, so for DAI case, there is no possible with ipc_gtw == true.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this would look better:

switch (dev->ipc_config.type) {
case SOF_COMP_HOST:
	if (!cd->ipc_gtw) {
		host_zephyr_free(cd->hd);
		rfree(cd->hd);
	}
	break;
case SOF_COMP_DAI:
	ai_zephyr_free(cd->dd, cd->endpoint[0]);
	rfree(cd->endpoint[0]);
	buffer_free(cd->endpoint_buffer[0]);
}

similar in other cases

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, I will change to this style, previously focused on resolve multisink and other CI related errors.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

switch()

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

switch will cover 3 cases, host/dai/gtw, anything not covered?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

switch ()

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

switch ()

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this related?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

already merged PR: #7303

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

space missing

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the indentation was correct

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, changed aligned back.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

actually both these lines aren't needed

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removed

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

switch ()

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like this "else" processing here is unnecessary. In case of capture, dev->bsink_list should not be empty but should contain one or more regular sink buffers. That case is already handled below in a loop.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

keep it for now, I tried with remove this part, I saw some input/output error with capture:
arecord: pcm_read:2221: read error: Input/output error

Copy link
Collaborator

@ranj063 ranj063 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm going to stop at this commit @btian1. I think you need to fix the multiple endpoint case right from the begininng and also add a common header

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this even possible for a copier to not be of type host or DAI? If not, please remove the error

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, this part should not be error, I observed NONE with endpoint_num = 0 case, left a warning is more suitable.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we please leave dai-legacy alone and not touch it at all?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also don't want to touch, but I remember dai is different, if not touch, I can't make CI pass.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will do it with a separate patch after this PR merged.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if DAI merge with copier is any similar to what previously was done to host, then we do need to touch legacy: 69deb6f

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we need a dai_put() here isnt it? I know it doesnt exist in the current code but can we please fix it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, like new/free right? if dma_get failed, need dma_put.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, like new / free, that's why - no, not dma_put(), but dai_put(), because dai_get() was successful above

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this comment is confusing? Are you trying to say this needs to be changed? If so, can you adda TODO here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should change or delete, it is a self-remind comments when I worked on it.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we please drop the "through data_data" in all the comments. I think its pretty self-explanatory given that it is one of the args

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do you mean delete those comments? please confirm, if so, I will drop.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why are you not passing dev here instead of cd->endpoint[0]?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is just change to zephyr reset, for replace, I made it in patch 11 together.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and isn't this going to break multitple endpoint DAIs? why remove the loop?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

endpoint_num always 1, right? is there a case for endpoint_num >= 2?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, there is a case when endpoint_num is >= 2. It's called "multi-channel ALH" or "ALH multi-gateway" feature. Basically, there could be multiple ALH endpoints audio data from/to which should be multiplexed/demultiplexed into/from single stream. So, for example, stereo audio could come from two ALH gateways, one for left channel, second for right, and then that audio is multiplexed into single stereo stream. See mux_into_multi_endpoint_buffer() and demux_from_multi_endpoint_buffer().

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

common header please her eand all other common function

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ahh i forgot a module copier is not of type DAI or HOST. Please remove all errors as there's just nothing to do for that case.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, remove error is fine, for moduler copier, what's the assigned type, or it does not need a type?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why not dev instead of cd->endpoint[0]? and why remove the loop?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

change dev with patch 11.

@serhiy-katsyuba-intel
Copy link
Contributor

@btian1 , before merge could you ask @wszypelt to run a full scope of Windows MTL tests on firmware with this PR. Unfortunately, CI is only done on some subset of tests. After similar Host copy optimizations we had some glitches/distorted audio regressions. Full scope tests take several hours, I guess, that is why they are not all added to CI.

I'll put "request changes" just to block merge until @wszypelt is back with the tests results.

Copy link
Contributor

@serhiy-katsyuba-intel serhiy-katsyuba-intel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Blocking merge until @wszypelt gets back with full scope tests results.

@wszypelt
Copy link

@serhiy-katsyuba-intel @btian1 Sure thing, I can set up full scope, within 24 hours we should have test results for quick comparison
For now PR also blocks CI(Internal Intel CI System/merge/build— Tests failed ), unfortunately the sdw tests don't pass properly

btian1 added 5 commits April 28, 2023 20:55
Create and expose two functions for dai allocation and
free along with struct dai_data,these functions will be shared by both
dai and copier components.
This is in preparation to remove the creation of dai component and
endpoint buffer in init_dai().

Below is the preparation work:
Add dai data structure to copier data structure
Currently, copier call dai interface need through ops driver.
However, with dedicated work, all dai interface can be called
through dai_data, this commit add dai data into copier data.

Signed-off-by: Baofeng Tian <baofeng.tian@intel.com>
Create and expose dai common functions for further copier usage.
Those functions does not depend on dai device, and it can work with
dai data directly, this PR is for DAI prepare zephyr usage.

Signed-off-by: Baofeng Tian <baofeng.tian@intel.com>
Create and expose a couple of functions for host DMA reset along with
struct dai_data,these functions will be shared by both dai and
copier component.
This is in preparation to remove the creation of dai component and
endpoint buffer in init_dai().

Signed-off-by: Baofeng Tian <baofeng.tian@intel.com>
Create and expose a couple of functions for dai comp trigger along with
struct dai_data,these functions will be shared by both dai and
copier component.
This is in preparation to remove the creation of dai component and
endpoint buffer in init_dai().

Signed-off-by: Baofeng Tian <baofeng.tian@intel.com>
Implement dai data processed data and dai position in copier to
avoid use dai ops driver call it.

Signed-off-by: Baofeng Tian <baofeng.tian@intel.com>
@btian1
Copy link
Contributor Author

btian1 commented Apr 28, 2023

unfortuna

please stay tuned, will continue on ww17, need more time to work on this PR.

btian1 added 7 commits April 28, 2023 21:10
Create and expose a couple of functions for dai parameters
along with struct dai_data,these functions will be shared by both
dai and copier component.
This is in preparation to remove the creation of dai component and
endpoint buffer in init_dai().

Signed-off-by: Baofeng Tian <baofeng.tian@intel.com>
Create and expose a couple of functions for dai DMA copy
along with struct dai_data,these functions will be shared by both
dai and copier component.
This is in preparation to remove the creation of a dai component and
endpoint buffer in creat_dai().

Signed-off-by: Baofeng Tian <baofeng.tian@intel.com>
Dai get hw parameters only used for ipc3 case, there is no need
keep it for dai zephyr.

Signed-off-by: Baofeng Tian <baofeng.tian@intel.com>
Current code works with ipc3 only, for ipc4 with zephyr, current
implementation will cause finding device is copier device instead of
dai device, in order to fix this, all dai related zephyr api moved
to asrc module, then dai device usage will be moved out for zephyr case.

Also, with this change, asrc will have no dependency on dai zephyr.
dai zephyr interface are fully aligned with module adapter.

Signed-off-by: Baofeng Tian <baofeng.tian@intel.com>
With this change, dai config was removed from dai zephyr driver.
and instead of use dai device, dai config is using copier device.
This is preparation for remove dai device.

Signed-off-by: Baofeng Tian <baofeng.tian@intel.com>
Replace all dai device usage with copier device, also remove the
converter function in dai_params, assign it in copier with
initial converter function.

Signed-off-by: Baofeng Tian <baofeng.tian@intel.com>
Current implementation only support one source and one sink buffer
dai data copy, however, after remove endpoint buffer, there is a need
to copy multiple times data from source buffer to multiple endpoint
buffer, loop all the sink device and copy it, after all done, consume
source buffer accordingly.

Signed-off-by: Baofeng Tian <baofeng.tian@intel.com>
@ranj063 ranj063 mentioned this pull request May 2, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants