Syntax: support for extended characters #212

Closed
UWN opened this issue Apr 29, 2022 · 6 comments


UWN commented Apr 29, 2022

Section 6.5 of the standard defines the processor character set (PCS). It includes not only the characters defined by char (6.5) but may also include additional members, known as extended characters. Note that this is not an extension in the sense of 5.5.1, but part of the actual implementation-defined character set. The standard also gives examples in NOTE 2:

... Examples of extended small letter char (6.5.2) are small letters with grave or acute accent and Japanese Kanji characters.

Certainly, an implementation does not have to support additional characters at all, but doing so makes a lot of sense in the age of UTF-8.

So I would expect that

?- 改善.
2022/04/29 09:27:57 failed to query: unexpected token: <invalid 改> % unexpected

would rather be treated like

?- '改善'.
2022/04/29 09:28:03 error(existence_error(procedure, '改善'/0), 'Unknown procedure.') % expected

which is what many existing implementations also do.

Owner

ichiban commented Jul 16, 2022

As of v0.10.0, characters in unicode.Unified_Ideograph, unicode.Hiragana, and unicode.Katakana are accepted as extended small letter char.

$ $(go env GOPATH)/bin/1pl
Top level for ichiban/prolog v0.10.0
This is for testing purposes only!
See https://github.com/ichiban/prolog for more details.
Type Ctrl-C or 'halt.' to exit.
?- 改善.
2022/07/16 13:43:08 error(existence_error(procedure,改善/0),root)
?- プロログ.
2022/07/16 13:43:15 error(existence_error(procedure,プロログ/0),root)
?- はい.
2022/07/16 13:44:34 error(existence_error(procedure,はい/0),root)
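
To make the classification above concrete, here is a minimal Go sketch of a predicate that accepts a rune when it falls in any of the three range tables named in the comment. The function name isExtendedSmallLetter is hypothetical and chosen only for illustration; this is not the library's actual lexer code.

```go
package main

import (
	"fmt"
	"unicode"
)

// isExtendedSmallLetter is a hypothetical helper, not part of ichiban/prolog.
// It reports whether r is in one of the three standard-library range tables
// mentioned in the comment above.
func isExtendedSmallLetter(r rune) bool {
	return unicode.In(r, unicode.Unified_Ideograph, unicode.Hiragana, unicode.Katakana)
}

func main() {
	// One rune from each table, followed by an ASCII letter for contrast.
	for _, r := range []rune{'改', 'プ', 'は', 'a'} {
		fmt.Printf("%U %c extended small letter: %t\n", r, r, isExtendedSmallLetter(r))
	}
}
```

Under this sketch, ASCII letters are deliberately not matched, since they are already covered by the ordinary small letter char and capital letter char classes.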

@UWN UWN closed this as completed Jul 16, 2022
Author

UWN commented Jul 18, 2022

For Korean letters (and likewise Arabic and Hebrew ones) I get:

?- char_code(Ch, 0x314b).
Ch = 'ㅋ', unexpected. % Expected without quotes
?- Ch = 'ㅋ'.
2022/07/18 08:02:44 failed to query: unexpected token: {invalid 'ㅋ}
?- X='\x628\'.
X = 'ب', unexpected. % Expected without quotes
?- X='\x5d0\'.
X = 'א', unexpected. % Expected without quotes

(Warning: I am really not an expert in Unicode at all.) It seems that you can treat all

@UWN UWN reopened this Jul 18, 2022
Author

UWN commented Jul 18, 2022

... It seems that you can treat all letters in the Unicode general category Lo (Other Letter) just like those in Ll (Lowercase Letter), that is, as lowercase letters.
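
The suggestion can be sketched in Go with the standard unicode package, which exposes the general categories as range tables. The name isSmallLetterLike is hypothetical, and whether the project adopted exactly this rule is not stated in the thread.

```go
package main

import (
	"fmt"
	"unicode"
)

// isSmallLetterLike is a hypothetical predicate illustrating the proposal:
// a rune counts as a small letter if its general category is Ll or Lo.
func isSmallLetterLike(r rune) bool {
	return unicode.In(r, unicode.Ll, unicode.Lo)
}

func main() {
	// The code points from the queries above, plus a few for contrast.
	for _, r := range []rune{0x314b, 0x628, 0x5d0, '改', 'a', 'A'} {
		fmt.Printf("%U %c small-letter-like: %t\n", r, r, isSmallLetterLike(r))
	}
}
```

With this rule, the Korean, Arabic, and Hebrew letters from the queries above are all accepted, while an uppercase letter such as A (category Lu) is not.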

Contributor

guregu commented Jul 31, 2022

Seems like v0.10.0 broke something with ー (長音符, the prolonged sound mark)

?- write('あ').
2022/08/01 05:22:09 failed to query: unexpected token: {invalid 'あ}

Works in v0.9.1.
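
For context, here is a small Go sketch that inspects how the standard unicode package classifies ー (U+30FC) compared with あ. The helper name classify is hypothetical. This only shows the character's classification; the thread does not state the cause of the regression, so it should not be read as a diagnosis.

```go
package main

import (
	"fmt"
	"unicode"
)

// classify is a hypothetical inspection helper. It reports membership of r
// in the Hiragana and Katakana script tables and in the general category Lm
// (Modifier Letter).
func classify(r rune) (hiragana, katakana, modifierLetter bool) {
	return unicode.Is(unicode.Hiragana, r),
		unicode.Is(unicode.Katakana, r),
		unicode.Is(unicode.Lm, r)
}

func main() {
	for _, r := range []rune{'あ', 'ー'} {
		h, k, lm := classify(r)
		fmt.Printf("%U %c Hiragana=%t Katakana=%t Lm=%t\n", r, r, h, k, lm)
	}
}
```

In the Unicode data, ー is assigned to the Common script and the Lm category, so a check based only on the Hiragana and Katakana script tables or on Lo would not cover it.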

Owner

ichiban commented Aug 7, 2022

Fixed in v0.10.4!

$ $(go env GOPATH)/bin/1pl
Top level for ichiban/prolog v0.10.4
This is for testing purposes only!
See https://github.com/ichiban/prolog for more details.
Type Ctrl-C or 'halt.' to exit.
?- write('あー').
あーtrue.
?- 

Author

UWN commented Aug 8, 2022

Even the following works:

?-  writeq('𱍊').
𱍊true.

@UWN UWN closed this as completed Aug 8, 2022