Unicode symbols? #40

Open
MuthaX opened this issue Mar 27, 2022 · 2 comments


MuthaX commented Mar 27, 2022

Does this plugin support strings in which every cell holds a separate Unicode symbol (UTF-32 encoded)?

Y-Less (Owner) commented Mar 28, 2022

I honestly don't know. That should be tested.

MuthaX (Author) commented Mar 29, 2022

> I honestly don't know. That should be tested.

So I tested this and got a negative result.
I used my own script to generate the UTF-8 and Unicode strings (https://github.com/MuthaX/PawnUTF).

// The source file is saved as UTF-8.
#include <a_samp>
#include <sscanf2>
#include <PawnUtfConverter>
stock PrintChars(const header_msg[], const array[]) {
	printf("printing: %s", header_msg);
	new len = strlen(array);
	for(new i = 0; i < len; ++i) {
		printf("%d|%x (%d)", i, array[i], array[i]);
	}
}
main() {
	new
		utf8_stream[] = "some смешанный 123 теxt",
		utf8_string[sizeof(utf8_stream)],
		unicode_string[sizeof(utf8_string)],
		uword_1[16],
		uword_2[16],
		u_number,
		uword_3[16]
	;
	//	The compiler stores the content of utf8_stream as "1 byte = 1 cell", so each Cyrillic (non-ASCII, >= 128) symbol occupies two cells.
	//	Merge the bytes from those separate cells so that each UTF-8 encoded symbol occupies one cell ("1 symbol = 1 cell").
	PawnUTF_StreamToUTF(utf8_stream, sizeof(utf8_stream), utf8_string, sizeof(utf8_string), false);
	PrintChars("utf8_string", utf8_string);
	//	Now convert the UTF-8 string into a Unicode string (one code point per cell, equivalent to UTF-32).
	PawnUTF_StringUTF_ToUnicode(utf8_string, sizeof(utf8_string), unicode_string, sizeof(unicode_string));
	PrintChars("unicode_string", unicode_string);
	sscanf(unicode_string, "p< >s[16]s[16]ds[16]", uword_1, uword_2, u_number, uword_3);
	PrintChars("uword_1", uword_1);
	PrintChars("uword_2", uword_2);
	PrintChars("uword_3", uword_3);
	return 1;
}

The output is:

printing: utf8_string
0|73 (115)
1|6F (111)
2|6D (109)
3|65 (101)
4|20 (32)
5|D181 (53633)
6|D0BC (53436)
7|D0B5 (53429)
8|D188 (53640)
9|D0B0 (53424)
10|D0BD (53437)
11|D0BD (53437)
12|D18B (53643)
13|D0B9 (53433)
14|20 (32)
15|31 (49)
16|32 (50)
17|33 (51)
18|20 (32)
19|D182 (53634)
20|D0B5 (53429)
21|78 (120)
22|74 (116)
printing: unicode_string
0|73 (115)
1|6F (111)
2|6D (109)
3|65 (101)
4|20 (32)
5|441 (1089)
6|43C (1084)
7|435 (1077)
8|448 (1096)
9|430 (1072)
10|43D (1085)
11|43D (1085)
12|44B (1099)
13|439 (1081)
14|20 (32)
15|31 (49)
16|32 (50)
17|33 (51)
18|20 (32)
19|442 (1090)
20|435 (1077)
21|78 (120)
22|74 (116)
printing: uword_1
0|73 (115)
1|6F (111)
2|6D (109)
3|65 (101)
printing: uword_2
0|41 (65)
1|3C (60)
2|35 (53)
3|48 (72)
4|30 (48)
5|3D (61)
6|3D (61)
7|4B (75)
8|39 (57)
printing: uword_3
0|42 (66)
1|35 (53)
2|78 (120)
3|74 (116)

As you can see, the symbols of the Unicode string are simply truncated to a width of one byte (for example, 0x441 comes back as 0x41).
