Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

COMMON: Rewrite Encoder and drop dependency on iconv #2586

Merged
merged 1 commit into from Nov 15, 2020

Conversation

@phcoder
Copy link
Contributor

@phcoder phcoder commented Nov 1, 2020

Different platforms have different levels of support of encodings and
often have slight variations. We already have tables for most encoding
with only CJK missing. Full transcoding inclusion allows us to get reliable
encoding results independently of platform. The biggest con is the size:

text data bss dec hex filename
3584 0 0 3584 e00 common/encodings/singlebyte.o
22272 0 0 22272 5700 common/encodings/windows932.o
44856 0 0 44856 af38 common/encodings/windows949.o
28260 0 0 28260 6e64 common/encodings/windows950.o

The total is 96K. The tables can be moved to external file if this is a
problem at a risk of this file being missing.

It removes a duplicate table for korean in graphics/korfont.cpp

@lephilousophe
Copy link
Member

@lephilousophe lephilousophe commented Nov 1, 2020

I must admit that what disturbs me the most in this PR are the various (automatic and implicit) conversions between String and U32String.
I would really prefer we drop these and use encode/decode functions in the place. That's way clearer.

@phcoder
Copy link
Contributor Author

@phcoder phcoder commented Nov 1, 2020

I must admit that what disturbs me the most in this PR are the various (automatic and implicit) conversions between String and U32String.
I would really prefer we drop these and use encode/decode functions in the place. That's way clearer.

I agree in principle but the code is peppered with implicit conversions before this PR, so I think such refactoring merits a separate PR

@athrxx
Copy link
Member

@athrxx athrxx commented Nov 1, 2020

IMHO this PR is a vast improvement over the exisiting code. Great job!

@phcoder phcoder force-pushed the phcoder:strsquash branch 3 times, most recently from d429e78 to ecc0f34 Nov 1, 2020
@phcoder phcoder requested a review from sev- Nov 7, 2020
@phcoder phcoder force-pushed the phcoder:strsquash branch from ecc0f34 to e230385 Nov 7, 2020
@phcoder phcoder requested a review from criezy Nov 8, 2020
@phcoder phcoder force-pushed the phcoder:strsquash branch from e230385 to 8382403 Nov 8, 2020
engines/testbed/encoding.cpp Show resolved Hide resolved
configure Outdated Show resolved Hide resolved
common/enc-internal.h Show resolved Hide resolved
backends/platform/sdl/win32/win32.cpp Show resolved Hide resolved
Copy link
Member

@criezy criezy left a comment

I think you are missing changes in ports.mk.
Also I think some of your changes in configure are incorrect and I added a comment there (we still need to link with iconv when using fluidsynth, because the fluidsynth library itself has a dependency on iconv).

There is also a suggestion to move the encoding tests out of the testbed engine and into test/ since we no longer need a OSystem instance for those.

And finally I am a bit worried with the added static conversion tables. We might hit issues on some platforms if we reach the executable size limit. I think it might be better to deport those tables to an encoding.dat file loaded at runtime (possibly in a lazy manner, i.e. when first needed).

But overall I think this is a good change to simplify the code and ensure consistency. So this is a thumb up from me.

configure Outdated Show resolved Hide resolved
engines/testbed/encoding.h Outdated Show resolved Hide resolved
graphics/korfont.cpp Outdated Show resolved Hide resolved
common/str-enc.cpp Outdated Show resolved Hide resolved
@phcoder phcoder force-pushed the phcoder:strsquash branch 7 times, most recently from 0e5940d to 9b6095e Nov 9, 2020
@deepcode-ci-bot
Copy link

@deepcode-ci-bot deepcode-ci-bot bot commented Nov 10, 2020

Congratulations 🎉. DeepCode analyzed your code in 4.774 seconds and we found no issues. Enjoy a moment of no bugs ☀️.

👉 View analysis in DeepCode’s Dashboard | Configure the bot

@phcoder phcoder force-pushed the phcoder:strsquash branch 3 times, most recently from 7eb3351 to a72100b Nov 10, 2020
Copy link
Member

@criezy criezy left a comment

Great work.
I am a bit worried about the apparent possibility to have uninitialised table pointers when failing to load the CJK tables in str-enc.cpp.
And I also made a few other minor comments.

common/str-enc.cpp Show resolved Hide resolved
test/mock_osystem.cpp Outdated Show resolved Hide resolved
test/mock_osystem.cpp Outdated Show resolved Hide resolved
test/common/encoding.h Outdated Show resolved Hide resolved
test/common/encoding.h Outdated Show resolved Hide resolved
@phcoder phcoder force-pushed the phcoder:strsquash branch from a72100b to 9add900 Nov 15, 2020
@phcoder
Copy link
Contributor Author

@phcoder phcoder commented Nov 15, 2020

Great work.
I am a bit worried about the apparent possibility to have uninitialised table pointers when failing to load the CJK tables in str-enc.cpp.
They're static so they're initialized to NULL
And I also made a few other minor comments.

test/common/encoding.h Outdated Show resolved Hide resolved
test/common/encoding.h Outdated Show resolved Hide resolved
test/module.mk Show resolved Hide resolved
@phcoder phcoder force-pushed the phcoder:strsquash branch 3 times, most recently from 60317c1 to f5b32c3 Nov 15, 2020
@phcoder phcoder requested a review from criezy Nov 15, 2020
@phcoder phcoder force-pushed the phcoder:strsquash branch from f5b32c3 to 1640d1e Nov 15, 2020
Different platforms have different levels of support of encodings and
often have slight variations. We already have tables for most encoding
with only CJK missing. Full transcoding inclusion allows us to get reliable
encoding results independently of platform. The biggest con is the need for
external tables encoding.dat.

It removes a duplicate table for korean in graphics/korfont.cpp
@phcoder phcoder force-pushed the phcoder:strsquash branch from 1640d1e to d77b5fd Nov 15, 2020
@criezy
criezy approved these changes Nov 15, 2020
Copy link
Member

@criezy criezy left a comment

It looks good to me now.
Thanks a lot for your work on this!

@phcoder phcoder merged commit 68a9136 into scummvm:master Nov 15, 2020
2 checks passed
2 checks passed
continuous-integration/travis-ci/pr The Travis CI build passed
Details
deepcode-ci-bot Well done, no issues found!
Details
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked issues

Successfully merging this pull request may close these issues.

None yet

5 participants
You can’t perform that action at this time.