Improve voice loading times #85

forslund · 2016-09-29T05:53:10Z

This PR intends to improve the loading times of voice files (.flitevox). The new code is not active by default; the option --enable-voice-load-opt must be passed to configure for it to take effect. With this flag enabled I see a 20-25% improvement in loading speed (measured over a series of 100 loads). I've added time_voice_load to the testsuite to check the load times (roughly).

My approach has been to try to reduce the number of calls to fread() and to limit the number of context switches into kernel mode.
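As a rough illustration of the "fewer fread() calls" idea, the sketch below reads the same data two ways: one call per field, and one call for the whole run of values. The `demo_node` layout and both function names are hypothetical and are not the actual flitevox tree-node format or the code in this PR.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical record: two 32-bit fields stored back to back on disk. */
typedef struct {
    uint32_t feat;
    uint32_t op;
} demo_node;

/* Baseline: one fread() per field, so 2 * n library calls. */
size_t demo_read_nodes_per_field(FILE *fp, demo_node *out, size_t n)
{
    size_t i;
    assert(fp != NULL && out != NULL);
    for (i = 0; i < n; i++) {
        if (fread(&out[i].feat, sizeof(uint32_t), 1, fp) != 1) break;
        if (fread(&out[i].op, sizeof(uint32_t), 1, fp) != 1) break;
    }
    return i; /* number of complete nodes read */
}

/* Batched: one fread() into a flat uint32_t scratch buffer, then unpack.
 * Reading into a plain array avoids depending on sizeof(demo_node). */
size_t demo_read_nodes_bulk(FILE *fp, demo_node *out, size_t n)
{
    uint32_t *raw;
    size_t got, i;
    assert(fp != NULL && out != NULL);
    raw = malloc(2 * n * sizeof(uint32_t));
    if (raw == NULL) return 0;
    got = fread(raw, sizeof(uint32_t), 2 * n, fp) / 2;
    for (i = 0; i < got; i++) {
        out[i].feat = raw[2 * i];
        out[i].op = raw[2 * i + 1];
    }
    free(raw);
    return got; /* number of complete nodes read */
}
```

Both functions return the same nodes for the same input; the batched version trades a temporary buffer for far fewer calls into the stdio layer.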

The two most notable changes are:

- Reducing calls to fread() in cst_read_tree_nodes() (10% improvement).
- Increasing the vbuf for reads to 64k, which reduces context switches for reads by 90% (an additional 5% improvement in time).
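For readers unfamiliar with the vbuf change, a minimal sketch using the standard setvbuf() call follows. The helper name `demo_open_with_large_buffer` and the `DEMO_VBUF_SIZE` constant are assumptions for illustration; the PR's actual code may set the buffer differently.

```c
#include <assert.h>
#include <stdio.h>

#define DEMO_VBUF_SIZE (64 * 1024) /* 64 KiB, the size mentioned above */

/* Open a file for binary reading with a 64 KiB fully buffered stream.
 * Returns NULL if the file cannot be opened or the buffer cannot be set. */
FILE *demo_open_with_large_buffer(const char *path)
{
    FILE *fp;
    assert(path != NULL);
    fp = fopen(path, "rb");
    if (fp == NULL) return NULL;
    /* setvbuf() must be called before any other operation on the stream.
     * Passing NULL lets the C library allocate the buffer itself. */
    if (setvbuf(fp, NULL, _IOFBF, DEMO_VBUF_SIZE) != 0) {
        fclose(fp);
        return NULL;
    }
    return fp;
}
```

With a larger stream buffer, each underlying read() system call fetches more data, so the same sequence of small fread() calls enters the kernel less often.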

Many other tiny changes, each reducing the load time by a couple of percent, account for the rest of the improvement.

Currently I'm working on a block allocator to reduce context switches for memory allocation, but I still have to fix some issues with it (making sure alignments are correct).
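The alignment concern can be illustrated with a small bump ("arena") allocator. This is only a sketch of the general technique, not the block allocator from the author's branch; the `demo_arena` type and its functions are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical arena: one big malloc(), handed out in aligned slices. */
typedef struct {
    unsigned char *base;
    size_t size;
    size_t used;
} demo_arena;

/* Returns 1 on success, 0 if the backing block could not be allocated. */
int demo_arena_init(demo_arena *a, size_t size)
{
    a->base = malloc(size);
    a->size = size;
    a->used = 0;
    return a->base != NULL;
}

/* Return n bytes aligned to `align` (a power of two), or NULL if full. */
void *demo_arena_alloc(demo_arena *a, size_t n, size_t align)
{
    uintptr_t cur, aligned;
    size_t pad, remaining;
    void *p;
    assert(align != 0 && (align & (align - 1)) == 0);
    cur = (uintptr_t)(a->base + a->used);
    /* Round the current address up to the next multiple of align. */
    aligned = (cur + (align - 1)) & ~(uintptr_t)(align - 1);
    pad = (size_t)(aligned - cur);
    remaining = a->size - a->used;
    if (pad > remaining || n > remaining - pad) return NULL;
    p = a->base + a->used + pad;
    a->used += pad + n;
    return p;
}

/* Release the whole arena at once; individual slices are never freed. */
void demo_arena_free(demo_arena *a)
{
    free(a->base);
    a->base = NULL;
    a->size = a->used = 0;
}
```

Because every slice comes from one malloc(), loading many small objects costs a single trip to the system allocator, and the padding step is what keeps a double placed after a char correctly aligned.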

I haven't had the chance to try this on a Raspberry Pi; the increased vbuf might make a bigger difference on such a system.

forslund · 2016-09-30T05:10:23Z

@el-tocino took the time to run this on a device similar to the Raspberry Pi and, after some issues with swapping, observed an improvement similar to mine.

However, when the voice file (in this case mycroft_voice_4.0.flitevox) is completely uncached by the kernel, the delay for the file operations is an order of magnitude larger than the improvements, reducing the effectiveness of the optimization attempt to a couple of percent.

forslund · 2016-10-26T06:19:49Z

Now that Travis has run the OS X build, there is an actual issue. I'll see if I can fix it.

codecov-io · 2016-10-28T13:15:12Z

Codecov Report

Merging #85 into development will increase coverage by 0.05%.
The diff coverage is 90.32%.

@@               Coverage Diff               @@
##           development      #85      +/-   ##
===============================================
+ Coverage        35.48%   35.53%   +0.05%     
===============================================
  Files               97       97              
  Lines            10255    10262       +7     
===============================================
+ Hits              3639     3647       +8     
+ Misses            6616     6615       -1

Impacted Files	Coverage Δ
src/cg/cst_cg_map.c	`92.57% <100%> (+2.28%)`	⬆️
src/cg/cst_cg_load_voice.c	`77.19% <57.14%> (-2.81%)`	⬇️

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 20e45c0...7a6419e. Read the comment docs.

forslund · 2016-10-28T13:18:18Z

There, it's passing. I used the workaround described in https://gist.github.com/jbenet/1087739

I haven't been able to test it properly since I have no computer running OS X.
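For context, the workaround in that gist substitutes the Mach clock service for clock_gettime() on OS X. The sketch below follows that pattern as I understand it; the names `demo_get_time` and `demo_elapsed_seconds` are hypothetical, the Mach branch is untested here, and the code in time_voice_load.c may differ.

```c
#if !defined(__MACH__) && !defined(_POSIX_C_SOURCE)
#define _POSIX_C_SOURCE 199309L /* expose clock_gettime() under strict C modes */
#endif

#include <assert.h>
#include <time.h>

#ifdef __MACH__
#include <mach/clock.h>
#include <mach/mach.h>
#endif

/* Fill *ts with the current wall-clock time. Returns 0 on success. */
int demo_get_time(struct timespec *ts)
{
    assert(ts != NULL);
#ifdef __MACH__
    /* OS X path: ask the Mach calendar clock instead of clock_gettime(). */
    clock_serv_t cclock;
    mach_timespec_t mts;
    host_get_clock_service(mach_host_self(), CALENDAR_CLOCK, &cclock);
    clock_get_time(cclock, &mts);
    mach_port_deallocate(mach_task_self(), cclock);
    ts->tv_sec = mts.tv_sec;
    ts->tv_nsec = mts.tv_nsec;
    return 0;
#else
    return clock_gettime(CLOCK_REALTIME, ts);
#endif
}

/* Difference end - start in seconds, as a double. */
double demo_elapsed_seconds(const struct timespec *start,
                            const struct timespec *end)
{
    return (double)(end->tv_sec - start->tv_sec)
         + (double)(end->tv_nsec - start->tv_nsec) / 1e9;
}
```

A load-time test would call demo_get_time() before and after the series of voice loads and report demo_elapsed_seconds() for the pair.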

zeehio · 2017-03-20T11:19:38Z

Sorry for not replying for a long time.

To me this looks good and can be merged. My only question is that I don't see why the optimization should be hidden behind a --enable option, since it seems that performance does not decrease in any case. Why don't we drop that option and always use your optimization by default? In any case, this looks good to be merged to me.

I don't know how far you got with the block allocator and the alignment issues, but I bet it can provide much larger improvements. Another option, if the alignment issues are annoying, would be to have different memory blocks per data type to ensure alignment.

forslund · 2017-03-20T18:54:13Z

Yeah, the flag was mainly added to test the differences, I'll make an update and remove it.

I know I did a block-allocator at one point, and I have a separate branch for that (somewhere)...I think I wanted this merged separately for some reason but I can't quite remember.

zeehio · 2017-04-13T14:50:55Z

src/cg/cst_cg_map.c

@@ -77,16 +77,22 @@ cst_cg_db *cst_cg_load_db(cst_voice *vox, cst_file fd)
    cst_cg_db *db = cst_alloc(cst_cg_db, 1);
    int i;
    uint32_t elements[2];
-    uint32_t load_buff[4];
+    struct load_buff_s {


Relying on sizeof(a structure) is risky. We don't know where mimic may end up being used and what alignment issues we may face. Given that this happens just once (not in any inner loop), what do you think about using two cst_fread calls, one for the integers and one for the floats?

A good idea.

I was mainly testing my options with this push. Frankly, I didn't think this would pass Travis.
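The two-read suggestion in this review thread can be sketched as follows. Plain fread() stands in for cst_fread here, and the `demo_header` layout (four integers, two floats) is an assumption for illustration, not the real header read by cst_cg_load_db().

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical header: four 32-bit integers followed by two floats. */
typedef struct {
    uint32_t ints[4];
    float floats[2];
} demo_header;

/* Read the header with one call per element type, so the on-disk layout
 * never depends on sizeof(demo_header) or on compiler-inserted padding.
 * Returns 1 on success, 0 on a short read. */
int demo_read_header(FILE *fp, demo_header *h)
{
    assert(fp != NULL && h != NULL);
    if (fread(h->ints, sizeof(uint32_t), 4, fp) != 4) return 0;
    if (fread(h->floats, sizeof(float), 2, fp) != 2) return 0;
    return 1;
}
```

Since the header is read once per voice, the extra call costs almost nothing compared with the batched reads in the hot paths.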

forslund · 2017-04-13T17:45:43Z

Updated previous commit according to your suggestion.

zeehio · 2017-04-13T18:30:57Z

Merged! :-)

forslund · 2017-04-13T19:11:34Z

Then I'll move on to the block allocator :)

forslund added the in progress label Sep 29, 2016

forslund force-pushed the optimize-load branch from d0320a8 to 456fff5 Compare October 26, 2016 05:08

forslund force-pushed the optimize-load branch 2 times, most recently from e0bc047 to 95fdb94 Compare October 28, 2016 12:52

forslund force-pushed the optimize-load branch from f6fc83c to 545511d Compare April 9, 2017 10:19

forslund added 13 commits April 12, 2017 07:29

cg_map optimizations

58d856d

Optimize cg_load_voice using read buffer

8623aff

Create test executable

f8cdf3f

Move voice loading time test to testsuite

3fe5d47

Enable optimization code with configure

66528bf

--enable-voice-load-opt to enable optimization

Fix compilation of cst_cg_map.c

4ccb142

- Fix errors due to unused variables when running unoptimized - Remove malloc-optimization to remove crash when unloading voice

Make time_voice_load more userfriendly.

559bcbe

Remove unnecessary changes from build scripts.

91dd6fd

Cleanup before PR.

e685e1a

* Removed experimental malloc options * Removed minor unintended code styling changes

Check that two arguments are given to voice load test

456f8a3

Add os X specific get time function in time_voice_load.c

605085f

Make optimize loading default

c76f18b

Enable Posix features for time_voice_load.c.

cbecc82

forslund force-pushed the optimize-load branch from 545511d to cbecc82 Compare April 13, 2017 05:17

zeehio reviewed Apr 13, 2017

forslund force-pushed the optimize-load branch from fc216db to 7a6419e Compare April 13, 2017 15:05

Fix strict-aliasing problem on windows

7a6419e

zeehio merged commit 660c5ec into MycroftAI:development Apr 13, 2017

LongBoolean removed the in progress label Apr 13, 2017

zeehio mentioned this pull request Jun 12, 2017

Request: Test on a raspberry pi unit needed (with faster compilation) #122

Open
