Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize String Parsing routines #273

Merged
merged 10 commits into from Dec 1, 2018

Conversation

Projects
None yet
4 participants
@fmatthew5876
Copy link
Contributor

fmatthew5876 commented Oct 31, 2018

Avoids memory allocations in many cases, especially when string is small.

@fmatthew5876

This comment has been minimized.

Copy link
Contributor Author

fmatthew5876 commented Oct 31, 2018

Some data on this change: Loading Heros Realm ldb and then exiting.

Master does 10243455 calls to malloc(). This PR does 7501537 calls to malloc(). A 27% reduction in the number of allocations done when loading.

@fmatthew5876

This comment has been minimized.

Copy link
Contributor Author

fmatthew5876 commented Nov 1, 2018

I tested the new ucnv_convertEx() implementation with Violated Heroine and forcing the encoding to Shift-JIS. The program loads the ldb and exits.

Metric c9cf16a~1 c9cf16a % improve
# mallocs() 2884146 2561057 11%
runtime 0.771s 0.747s 3%
mem usage 76.6 76.6 0%

I did not test write performance, but the libicu website has this to say:

ucnv_convertEx() can convert between UTF-8 and another charset, if one of the two UConverters is a
UTF-8 converter. The conversion from UTF-8 to most other charsets uses a dedicated, optimized code
path, avoiding the pivot through UTF-16. (Conversion from other charsets to UTF-8 could be optimized as
well, but that has not been implemented yet as of ICU 4.4.)

http://userguide.icu-project.org/strings/utf-8

@fmatthew5876 fmatthew5876 changed the title Optimize ReaderLcf::ReadString() Optimize String Parsing routines Nov 1, 2018

@fmatthew5876 fmatthew5876 force-pushed the fmatthew5876:opt_str branch 2 times, most recently from 4c75a94 to a8eabd5 Nov 1, 2018

@fmatthew5876

This comment has been minimized.

Copy link
Contributor Author

fmatthew5876 commented Nov 1, 2018

The next one creates an Encoder class and stores the converters there. The result is that we only create 1 set of converters for the entire ldb read operation. The get opened once when we start and closed when we finish.

Metric a8eabd5~1 a8eabd5 % improve
# mallocs() 2561057 2020067 21%
runtime 0.747s 0.718s 4%
mem usage 76.6 76.6 0%

@fmatthew5876 fmatthew5876 force-pushed the fmatthew5876:opt_str branch 7 times, most recently from 5dca324 to e561952 Nov 1, 2018

@fmatthew5876

This comment has been minimized.

Copy link
Contributor Author

fmatthew5876 commented Nov 1, 2018

Can any one help volunteer to test encodings aren't broken for icu and iconv? I've done some testing on my side, but I wouldn't feel comfortable merging this PR unless its been verified by someone else.

@Ghabry Ghabry self-requested a review Nov 5, 2018

@Ghabry

This comment has been minimized.

Copy link
Member

Ghabry commented Nov 5, 2018

Removing ReaderUtil::Recode is a huge breaking change.
Will need PRs in Player, Editor and Tools.

@fdelapena fdelapena added the Refactor label Nov 5, 2018

@fmatthew5876 fmatthew5876 force-pushed the fmatthew5876:opt_str branch from 1242bc1 to e23b302 Nov 28, 2018

@fmatthew5876

This comment has been minimized.

Copy link
Contributor Author

fmatthew5876 commented Nov 28, 2018

I've removed the change that takes out ReaderUtil::Recode(). I would like to merge this first, get the other downstream stuff to use the new Encoder, and the remove Recode() later.

Once this and the other liblcf PRs are merged I will start working on refactoring and performance, but I'm blocked until all the current liblcf the PRs get merged.

Show resolved Hide resolved src/encoder.cpp
@fmatthew5876

This comment has been minimized.

Copy link
Contributor Author

fmatthew5876 commented Nov 28, 2018

I removed the exceptions and tied the error handling into the IsOk() mechanism that we currently use.

@fmatthew5876 fmatthew5876 force-pushed the fmatthew5876:opt_str branch from bd732eb to 4a5d80c Nov 29, 2018

@fmatthew5876

This comment has been minimized.

Copy link
Contributor Author

fmatthew5876 commented Nov 29, 2018

rebased to master to avoid jenkins build issues in Player

Show resolved Hide resolved src/encoder.cpp Outdated
@Ghabry

This comment has been minimized.

Copy link
Member

Ghabry commented Dec 1, 2018

Jenkins: test this please

@fmatthew5876 fmatthew5876 force-pushed the fmatthew5876:opt_str branch from 4a5d80c to d8ed54e Dec 1, 2018

@fmatthew5876

This comment has been minimized.

Copy link
Contributor Author

fmatthew5876 commented Dec 1, 2018

Rebased to master and fixed the macro whitespace issue.

@Ghabry

Ghabry approved these changes Dec 1, 2018

@fdelapena fdelapena merged commit 037dfe6 into EasyRPG:master Dec 1, 2018

5 checks passed

GNU/Linux Build finished.
Details
OSX Build finished.
Details
Wii Build finished.
Details
Windows Build finished.
Details
web Build finished.
Details
@Ghabry

This comment has been minimized.

Copy link
Member

Ghabry commented Dec 2, 2018

@fmatthew5876
I have no idea how that happened but the stuff we just merged contains the std::exception code again.
I'm totally certain I saw a commit which removed this before and I even checked the last force-push before merging, so wtf, how did that happen o.O.

Also this causes mass breakage in Player because of unresolved iconv references, some incorrect preprocessor macro?

@fmatthew5876

This comment has been minimized.

Copy link
Contributor Author

fmatthew5876 commented Dec 2, 2018

I've been using different computers. Maybe I messed up and lost the exception change. I can make a fix PR tonight.

@Ghabry

This comment has been minimized.

Copy link
Member

Ghabry commented Dec 2, 2018

Instead of "-f" better use "--force-with-lease" (totally obvious name...) with git push this will prevent the force push when the state of the remote repo changed since your last git fetch.

The other thing, I already know: You must include lcf_options.h which pulls in config.h for autotools...

@fmatthew5876 fmatthew5876 deleted the fmatthew5876:opt_str branch Dec 2, 2018

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.
You signed in with another tab or window. Reload to refresh your session. You signed out in another tab or window. Reload to refresh your session.