Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Redo fix cjk fullwidth character display #1668

Merged

Conversation

SlySven
Copy link
Member

@SlySven SlySven commented May 16, 2018

Brief overview of PR changes/additions

This is a clone of #1632 combined with my https://github.com/novload/Mudlet/pull/1 which had some sort of failure when @novload tried to merge the latter into the former.

In addition it also has a separate commit that I have inserted to give/record novload's copyright on the files they modified.

Finally I have also inserted the GPL template into the new files wcwidth.cpp and wcwidth.h. It should be noted/checked by someone else but it is my belief that we can/should relicense wcwidth.cpp as GPL despite however 99% (I guess) is the work of another person and carries their licencing details:

Markus Kuhn -- 2007-05-26 (Unicode 5.0)

Permission to use, copy, modify, and distribute this software
for any purpose and without fee is hereby granted. The author
disclaims all warranties with regard to this software.

@vadi2 please advise if you think this is wrong.

novload and others added 7 commits May 15, 2018 05:37
Signed-off-by: Stephen Lyons <slysven@virginmedia.com>
…jk()

This is a quick-fix to accommodate the fact that some locales (not
currently fully identified) require graphemes with the informative
"East Asian Width" Unicode property of type "Ambiguous" to be drawn as
narrow characters (the default case) and others (probably East Asian) as
wide ones.

The option is controlled (on a per profile basis) by a new checkbox on the
Profile preference's "Main Display" tab - it is stored within the profile
data so will persist between sessions.

NOTE: This commit finally puts the Q_OBJECT macro into play in the Host
class - so it is important to ensure the qmake part of the build process
is re-run after a build of a different branch of the code so that the moc
correctly records that that class is one that it has to process.

Signed-off-by: Stephen Lyons <slysven@virginmedia.com>
Leaving the profile preference control, now labelled:
"Make 'Ambiguous' E. Asian width characters wide"
in the tri-stated (partially-checked) default setting means that GBK and
GB18030 MUD server encodings will use the 'wide' setting and all others the
'narrow' one, but allows the user to manually force the setting to be
'narrow' (unchecked) or 'wide' (checked) if required.

Signed-off-by: Stephen Lyons <slysven@virginmedia.com>
…Mudlet

Something went wrong in trying to apply a Pull-Request from myself onto a
PR Huadong Qi wanted to made to the Mudlet codebase which made it
un-applyible.  To solve this I have taken their original code, appended my
extra code and intend to apply the combination as a second PR which should
merge into the main repository in a straightforward manner.

I have also, in this commit, filled out the copyright lines on the files
that Huadong Qi modified and added the GPL template with their name onto
the two new files that has been inserted into the code base.  The main one
wcwidth.cpp is a modified copy of Markus Knuth (C) 2007 wcwidth.c which
bears the following licencing terms:
"Permission to use, copy, modify, and distribute this software for any
purpose and without fee is hereby granted. The author disclaims all
warranties with regard to this software.

Latest version: http://www.cl.cam.ac.uk/~mgk25/ucs/wcwidth.c"

As such this means that we can (need?) to re-license it under a GPL but it
is my considered opinion that this is approprate for our usage.

Signed-off-by: Stephen Lyons <slysven@virginmedia.com>
@SlySven SlySven requested review from a team as code owners May 16, 2018 02:29
Conflicts resolved in:
* src/CMakeLists.txt
* src/Host.h
* src/XMLexport.cpp

Signed-off-by: Stephen Lyons <slysven@virginmedia.com>
Copy link
Member

@vadi2 vadi2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some initial feedback on the code - have not been able to QA this in practice yet. Thanks for merging the PRs together! 😄

@@ -0,0 +1,326 @@
/***************************************************************************
* Copyright (C) 2018 by Huadong Qi - novload@outlook.com *
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is incorrect - this entire GPL section needs to be taken out. Line 72+ provides copyright information for this file.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

However @novload has made modifications to this file (changing a type from a wchar_t to a uint) - you can compare it to the original here and they have also changed it from a C language to, technically C++ as far as Qt Creator identifies it - although I am not entirely sure that is anything more than the file extension change. As he has modified it to include it in our project (even though the changes are small) and we ask for material to be GPL 2+ licenced I think we have to relicense it. The original code licence does seem to be compatible with this, IMHO, but IANAL.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK!

#define WCWIDTH_H

/***************************************************************************
* Copyright (C) 2018 by Huadong Qi - novload@outlook.com *
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a new file created by @novload - fair enough it is little more than a stub but it contains #include <QtGlobal> so it is definitely new code and is not from the original material of Markus Kuhn.

src/Host.cpp Outdated
@@ -172,6 +173,8 @@ Host::Host(int port, const QString& hostname, const QString& login, const QStrin
, mRetries(5)
, mSaveProfileOnExit(false)
, mHaveMapperScript(false)
, mIsAmbigousWidthGlyphsSettingAutomatic(true)
, mIsAmbigousWidthGlyphsToBeWide(false)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mIsAmbigousWidthGlyphsSettingAutomatic and mIsAmbigousWidthGlyphsToBeWide are not human-friendly variable names. Maybe for robots they're ok 🤖

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They are long - I admit 37 and 33 characters I think, but to what I laughingly call my mind they are human friendly in that they do say what they mean. I can't track it down but I believe the ISO standards say that at least the first 31 characters of an identifier are significant to a C/C++ compiler (and with the name mangling of the latter I suspect longer ones are still distinguishable).

Do you have suitable replacements you would like me to consider?

As for robots - they don't care - they (and the compiler) only worry (AFAICT) about the memory location that is assigned for that identifier...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mAmbigiousWidth enum - 3 values, auto, wide, narrow.

src/Host.cpp Outdated
@@ -1186,3 +1189,57 @@ QString Host::readProfileData(const QString& item)

return ret;
}

void Host::setUseWideAmbiguousEAsianGlyphs(const Qt::CheckState state)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"set use" - this function sounds confused - pick one, either "set" or "use" is good!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll have a 'set' please, Bob! *Oops, thinking back to a TV quiz from my youth and replacing the word 'set' with the letter 'P' - in that quiz the school-pupil contestants requested a question with an answer beginning with the letter asked for - and due to an unfortunate slang name collision there was always a titter when there was a request for the letter between 'O' and 'Q'*

@@ -66,6 +69,7 @@ TTextEdit::TTextEdit(TConsole* pC, QWidget* pW, TBuffer* pB, Host* pH, bool isDe
, mpConsole(pC)
, mpHost(pH)
, mpScrollBar(nullptr)
, mIsAmbigousWidthGlyphsToBeWide(pH->getUseWideAmbiguousEAsianGlyphs())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

src/Host.cpp Outdated
// Set things automatically
mIsAmbigousWidthGlyphsSettingAutomatic = true;

if ( encoding == QLatin1String("GBK")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

"the grapheme (what is to be represented). Clearing this checkbox will allow "
"the best alternative glyph from another font to be used to draw that grapheme.</p>"));
checkBox_useWideAmbiguousEastAsianGlyphs->setToolTip(QStringLiteral("<html><head/><body>%1</body></html>")
.arg("<p>Some, generally East Asian, MUDs may use glyphs (characters) that are classified in Unicode "
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this needs to be simpler if we want non-native English speakers to be able to understand this.

Provide an example and say "should this character be drawn as 1 or 2 spaces wide"

src/Host.h Outdated
// Uses PartiallyChecked to set the automatic mode, otherwise Checked/Unchecked means use wide/narrow ambiguous glyphs
void setUseWideAmbiguousEAsianGlyphs( const Qt::CheckState state );
// Is used to set preference dialog control directly:
Qt::CheckState getUseWideAmbiguousEAsianGlyphsControlState() { QMutexLocker locker(& mLock);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

44-character function names are not OK!

@vadi2
Copy link
Member

vadi2 commented May 16, 2018

I think the question mark threw it off:

image

    ∷∷  也不是什么愉快的过程,但是,我想我并不后悔。         ∷∷   
    ∷∷                                                          ∷∷
    ∷∷     人的一生,有几个能够脱身事外,神交千里的,又有几   ∷∷
    ∷∷  个素未谋面,可借此头的?我喜欢偶尔在这个世界里出现,   ∷∷
    ∷∷  因为在这里,我们也许欢笑一堂,也许睚眦必报,但是我们   ∷∷
    ∷∷  都没有那么虚伪,我们活得不累。娱乐存在的意义,不在于   ∷∷
    ∷∷  它给我欢乐,而在于它给了我空间,尽管这空间是虚拟的,   ∷∷
    ∷∷  但感情是真实的,这恰恰与现实世界相反——岂不怪哉?       ∷∷
    ∷∷                                                          ∷∷
    ∷∷                   -- by guodalu@pkuxkx  ∷∷
    ∷∷∷∷∷∷∷∷∷∷∷∷∷∷∷∷∷∷∷∷∷∷∷∷∷∷∷∷∷∷∷∷∷      

Wow, the browser does a terrible job.

@SlySven
Copy link
Member Author

SlySven commented May 16, 2018

It is not necessarily your browser but could be the font you are using - and the glyphs from another font that is being used because they are not in the primary (requested) font - which depends on just which fonts are on your system and which are likely to be different for other people. If you are using a mono-spaced font it is very likely going to render wrongly because the text is duo-spaced and if you are using a proportional font then of course it is not going to look the same...

... hence the problem we are trying to solve to massage the text to line up! 😉

@SlySven
Copy link
Member Author

SlySven commented May 16, 2018

Also, replacing the normal spaces with an 'S' and the ideographic spaces with 'I' you get:

SSSS∷∷SSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSS∷∷
SSSS∷∷IISSS人的一生,有几个能够脱身事外,神交千里的,又有几SSS∷∷
SSSS∷∷IS个素未谋面,可借此头的?我喜欢偶尔在这个世界里出现,SSS∷∷
SSSS∷∷IS因为在这里,我们也许欢笑一堂,也许睚眦必报,但是我们SSS∷∷
SSSS∷∷IS都没有那么虚伪,我们活得不累。娱乐存在的意义,不在于SSS∷∷
SSSS∷∷IS它给我欢乐,而在于它给了我空间,尽管这空间是虚拟的,SSS∷∷
SSSS∷∷IS但感情是真实的,这恰恰与现实世界相反——岂不怪哉?SSSSSSS∷∷
SSSS∷∷SSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSS∷∷
SSSS∷∷IIISIIIIIIIIIIIIISS--SbySguodalu@pkuxkxIS∷∷
SSSS∷∷∷∷∷∷∷∷∷∷∷∷∷∷∷∷∷∷∷∷∷∷∷∷∷∷∷∷∷∷∷∷∷SSSSSS

Not sure what conclusions can be drawn from that - other than the Pkuxkx MUD must have some weird logic behind it (or purely someone typing at an East Asian keyboard) - to come up with the choices for spacing characters that they did, IMHO.

@vadi2
Copy link
Member

vadi2 commented May 17, 2018

OK - we don't have to obsess over getting the rendering perfect. Selection doesn't work right either, and I think this is something you'll get fixed with your PR.

@vadi2 vadi2 assigned SlySven and unassigned vadi2 May 17, 2018
Signed-off-by: Stephen Lyons <slysven@virginmedia.com>
@vadi2
Copy link
Member

vadi2 commented May 21, 2018

Hey, how's progress on this? There's only a few minor changes left to do.

@SlySven
Copy link
Member Author

SlySven commented May 22, 2018

Just pushed up some changes to some names to try and shorten them a little. If there is anything else please point me at it as I've gotten a little distracted by the number of balls I am trying to keep in the air. 🙁

Copy link
Member

@vadi2 vadi2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great, just fix that one typo

Signed-off-by: Stephen Lyons <slysven@virginmedia.com>
@SlySven SlySven merged commit 37eeb41 into Mudlet:development May 22, 2018
@SlySven SlySven deleted the redo_fix_cjk_fullwidth_character_display branch May 22, 2018 21:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants