
Improve voice loading times #85

Merged: 14 commits into MycroftAI:development on Apr 13, 2017

Conversation

@forslund (Collaborator) commented Sep 29, 2016

This PR intends to improve the loading times of voice files (.flitevox). The new code is not active by default; the option --enable-voice-load-opt needs to be passed to configure for it to take effect. With this flag enabled I see an improvement in loading speed of 20-25% (when running a series of 100 loads). I've added time_voice_load to the testsuite to check the load times (roughly).

My approach has been to try to reduce the number of calls to fread() and to limit the number of context switches into kernel mode.

The two most notable changes are:

  • reducing calls to fread() in cst_read_tree_nodes() (10% improvement)
  • increasing the vbuf for reads to 64k, which reduces context switches for reads by 90% (an additional 5% improvement in time)

Many other tiny changes, each reducing the load time by a couple of percent, contribute the rest of the improvement.

Currently I'm working on a block allocator to reduce context switches for memory allocation, but I still have to fix some issues with it (making sure alignments are correct).

I haven't had the opportunity to try this on a Raspberry Pi; the increased vbuf might make a bigger difference on such a system.

@forslund (Collaborator, Author)

@el-tocino took the time to run this on a device similar to the Raspberry Pi and, after some issues with swapping, observed an improvement similar to mine.

However, when the voice file (in this case mycroft_voice_4.0.flitevox) is completely uncached by the kernel, the delay for the file operations is an order of magnitude larger than the improvements, reducing the effectiveness of the optimization attempt to a couple of percent.

@forslund (Collaborator, Author)

Now that Travis has run the OS X build there is an actual issue. I'll see if I can fix it.

@forslund forslund force-pushed the optimize-load branch 2 times, most recently from e0bc047 to 95fdb94 Compare October 28, 2016 12:52
@codecov-io commented Oct 28, 2016

Codecov Report

Merging #85 into development will increase coverage by 0.05%.
The diff coverage is 90.32%.


@@               Coverage Diff               @@
##           development      #85      +/-   ##
===============================================
+ Coverage        35.48%   35.53%   +0.05%     
===============================================
  Files               97       97              
  Lines            10255    10262       +7     
===============================================
+ Hits              3639     3647       +8     
+ Misses            6616     6615       -1
Impacted Files Coverage Δ
src/cg/cst_cg_map.c 92.57% <100%> (+2.28%) ⬆️
src/cg/cst_cg_load_voice.c 77.19% <57.14%> (-2.81%) ⬇️

Continue to review full report at Codecov.

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 20e45c0...7a6419e.

@forslund (Collaborator, Author)

There, it's passing. I used the workaround described in https://gist.github.com/jbenet/1087739

I haven't been able to test it properly since I have no computer running OS X.

@zeehio (Contributor) commented Mar 20, 2017

Sorry for not replying for a long time.

To me this looks good and can be merged. My only question is that I don't see why the optimization should be hidden behind an --enable option, since performance does not seem to decrease in any case. Why don't we drop that option and always use your optimization by default? Either way, this looks ready to merge to me.

I don't know how far you got with the block allocator and the alignment issues, but I bet it can provide much larger improvements. Another option, if the alignment issues are annoying, would be to use different memory blocks per data type to ensure alignment.

@forslund (Collaborator, Author)

Yeah, the flag was mainly added to test the differences. I'll make an update and remove it.

I know I did a block allocator at one point, and I have a separate branch for that (somewhere)... I think I wanted this merged separately for some reason, but I can't quite remember.

@@ -77,16 +77,22 @@ cst_cg_db *cst_cg_load_db(cst_voice *vox, cst_file fd)
cst_cg_db *db = cst_alloc(cst_cg_db, 1);
int i;
uint32_t elements[2];
uint32_t load_buff[4];
struct load_buff_s {
Contributor

Relying on sizeof(a structure) is risky. We don't know where mimic may end up being used and what alignment issues we may face. Given that this happens just once (not in any inner loop), what do you think about using two cst_fread calls, one for the integers and one for the floats?

Collaborator Author

A good idea.

I was mainly testing my options with this push. Frankly, I didn't think this would pass Travis.

@forslund (Collaborator, Author)

Updated previous commit according to your suggestion.

@zeehio zeehio merged commit 660c5ec into MycroftAI:development Apr 13, 2017
@zeehio (Contributor) commented Apr 13, 2017

Merged! :-)

@forslund (Collaborator, Author)

Then I'll move on to the block allocator :)
