split encoding.h #4909

shyouhei · 2021-09-29T07:34:46Z

After I organised the header files, this is now the largest one. I think it's a bit too large to read through. Why not split it into several files?

string.c

nobu · 2021-10-05T04:28:32Z

I meant

diff --git a/string.c b/string.c
index 78e2ba923f3..7c2235450d6 100644
--- a/string.c
+++ b/string.c
@@ -5706,7 +5706,7 @@ rb_str_setbyte(VALUE str, VALUE index, VALUE value)
     long pos = NUM2LONG(index);
     long len = RSTRING_LEN(str);
     char *head, *left = 0;
-    unsigned char *ptr;
+    char *ptr;
     rb_encoding *enc;
     int cr = ENC_CODERANGE_UNKNOWN, width, nlen;
 
@@ -5717,18 +5717,18 @@ rb_str_setbyte(VALUE str, VALUE index, VALUE value)
 
     VALUE v = rb_to_int(value);
     VALUE w = rb_int_and(v, INT2FIX(0xff));
-    unsigned char byte = NUM2INT(w) & 0xFF;
+    char byte = (char)(NUM2INT(w) & 0xFF);
 
     if (!str_independent(str))
 	str_make_independent(str);
     enc = STR_ENC_GET(str);
     head = RSTRING_PTR(str);
-    ptr = (unsigned char *)&head[pos];
+    ptr = &head[pos];
     if (!STR_EMBED_P(str)) {
 	cr = ENC_CODERANGE(str);
 	switch (cr) {
 	  case ENC_CODERANGE_7BIT:
-            left = (char *)ptr;
+            left = ptr;
 	    *ptr = byte;
 	    if (ISASCII(byte)) goto end;
 	    nlen = rb_enc_precise_mbclen(left, head+len, enc);

shyouhei · 2021-10-05T05:20:50Z

@nobu That's also okay. But ultimately, ONIGENC_LEFT_ADJUST_CHAR_HEAD operates on OnigUChar*. Using unsigned char here seems a bit more natural to me.

nobu · 2021-10-05T06:08:29Z

But rb_enc_left_char_head does not use unsigned char.
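
For readers following the char vs. unsigned char exchange above, here is a minimal sketch of the trade-off. It is not code from this pull request: demo_is_ascii_byte and demo_set_byte are hypothetical names. It assumes the buffer stays a plain char* (as in nobu's diff) and that the byte is viewed as unsigned only where its numeric value matters.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: true when the byte is in the 7-bit ASCII range. */
static bool
demo_is_ascii_byte(unsigned char b)
{
    return b < 0x80;
}

/* Hypothetical helper: stores the low 8 bits of `value` at buf[pos] and
 * reports whether the stored byte is ASCII.  The buffer keeps its plain
 * char* type; the byte is treated as unsigned only for the comparison. */
static bool
demo_set_byte(char *buf, size_t pos, int value)
{
    assert(buf != NULL);
    unsigned char byte = (unsigned char)(value & 0xFF);
    buf[pos] = (char)byte;
    return demo_is_ascii_byte(byte);
}
```

Either spelling stores the same bits. The difference is only where the conversion between char and unsigned char is written down.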

shyouhei added 5 commits September 29, 2021 11:16

add undeclared variables

5fb8db6

Why did they even exist?

rb_ractor_shareable_p(): fix doxygen

2c6d620

My bad. The document is clearly broken. Maybe I pressed my delete key too much. [ci skip]

split include/ruby/encoding.h

15f02ef

2,291 lines are too much! include/ruby/encoding.h became the biggest header file once it had doxygen comments. Let us split it into smaller parts, so that we can better organise their contents.

include/ruby/encoding.h: convert macros into inline functions

4032d2d

Less macros == huge win.
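
As an illustration of why this conversion can be a win, here is a small sketch that is not taken from encoding.h. All names (DEMO_IS_ASCII_MACRO, demo_is_ascii, demo_next, demo_compare) are hypothetical. It shows one common argument for inline functions: the argument is evaluated once and its type is checked.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical macro form: the argument is pasted in textually, so an
 * expression with side effects may be evaluated more than once. */
#define DEMO_IS_ASCII_MACRO(c) ((c) >= 0 && (c) < 128)

/* Hypothetical inline-function form: the argument is evaluated exactly
 * once and the parameter type is checked by the compiler. */
static inline bool
demo_is_ascii(int c)
{
    return c >= 0 && c < 128;
}

/* Call counter used to expose double evaluation. */
static int demo_calls = 0;

static int
demo_next(void)
{
    demo_calls++;
    return 'a';
}

/* Runs both forms against the same side-effecting argument. */
static void
demo_compare(void)
{
    demo_calls = 0;
    (void)DEMO_IS_ASCII_MACRO(demo_next());
    assert(demo_calls == 2);    /* the macro called demo_next() twice */

    demo_calls = 0;
    (void)demo_is_ascii(demo_next());
    assert(demo_calls == 1);    /* the function called it once */
}
```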

ruby tool/update-deps --fix

3978a2a

nobu reviewed Sep 29, 2021

string.c (outdated)

nobu reviewed Sep 29, 2021

string.c (outdated)

shyouhei added 2 commits September 30, 2021 15:32

downcase_single/upcase_single: assume ASCII

91f2fc5

These functions assume ASCII compatibility. That has to be ensured in their caller.
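
To make that precondition concrete, here is a rough sketch of what an ASCII-only case conversion looks like. It is not the actual upcase_single from string.c; demo_upcase_ascii is a hypothetical name. The function touches only the bytes 'a' to 'z', so the result is meaningful only if the caller has already checked that the encoding is ASCII compatible.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: upcases ASCII letters in place and returns how
 * many bytes were changed.  Bytes outside 'a'..'z' are left alone.
 * Precondition (caller's job): the buffer is in an ASCII-compatible
 * encoding, so these byte values really are the letters a-z. */
static size_t
demo_upcase_ascii(unsigned char *buf, size_t len)
{
    size_t changed = 0;

    assert(buf != NULL || len == 0);
    for (size_t i = 0; i < len; i++) {
        if (buf[i] >= 'a' && buf[i] <= 'z') {
            buf[i] = (unsigned char)(buf[i] - 'a' + 'A');
            changed++;
        }
    }
    return changed;
}
```

In an encoding that is not ASCII compatible (UTF-16, for example) the same byte values can be parts of unrelated characters, which is why the check belongs in the caller.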

rb_enc_left_char_head(): take void*

6f0acaa

Nobu doesn't like (char*) cast.
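
The idea behind taking void* can be sketched as follows. This is not the real rb_enc_left_char_head: demo_left_char_head is a hypothetical, UTF-8-only stand-in with a simplified signature. Because any object pointer converts to const void* implicitly in C, callers holding either char* or unsigned char* need no cast.

```c
#include <assert.h>

/* Hypothetical stand-in: returns the first byte of the UTF-8 character
 * that contains `p`, never scanning back past `start`.  Taking
 * const void* lets callers pass char* or unsigned char* uncast. */
static char *
demo_left_char_head(const void *start, const void *p)
{
    const unsigned char *s = start;   /* implicit conversion from void* */
    const unsigned char *q = p;

    assert(s <= q);
    while (q > s && (*q & 0xC0) == 0x80) {
        q--;                          /* step over UTF-8 continuation bytes */
    }
    return (char *)q;                 /* the one cast lives in the callee */
}
```

The cast does not disappear; it moves into a single place inside the function instead of being repeated at every call site.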

shyouhei merged commit f032c09 into ruby:master Oct 5, 2021

shyouhei deleted the encoding branch October 5, 2021 05:18
