Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

crash at class instance value access #5201

Closed
yamori813 opened this issue Nov 28, 2020 · 34 comments
Closed

crash at class instance value access #5201

yamori813 opened this issue Nov 28, 2020 · 34 comments

Comments

@yamori813
Copy link
Contributor

yamori813 commented Nov 28, 2020

I use mruby on realtek mips soc by bare metal.

I found problem at 2e30c68.

That is crash on modified mruby-simplehttp class at line of @uri[:path] = path in request method.

https://github.com/yamori813/mruby-simplehttp

I do workaround for this problem by not use @uri[:path].

I don't have crash but @uri[:port] is half value. I send double value in script then work is fine.

@yamori813 yamori813 changed the title crash at class global value access crash at class instance value access Nov 29, 2020
@yamori813
Copy link
Contributor Author

I tested mruby-simplehttp by host binary FreeBSD/i386. This is no problem.

Realtek soc is Big Endian. Is this relation problem ?

@dearblue
Copy link
Contributor

dearblue commented Dec 3, 2020

I have FreeBSD 12.1 running on qemu-system-mips -M malta (mips32 r3000? Big-endian).
No issues were found with gembox "full-core" (only the UDPSocket test failed because ipv6 is unavailable).

@yamori813, Can you show me the specific build settings and the Ruby code you tried?
I don't own a real MIPS machine, so the FreeBSD version is good.

For FreeBSD/mips on qemu, I referred to https://wiki.freebsd.org/MipsEmulation.
The cross-compiler used at this time was "gcc version 4.2.1 20070831 patched [FreeBSD]".

@yamori813
Copy link
Contributor Author

@yamori813
Copy link
Contributor Author

I try simple code.

This is crash

a = {}
a[:aaa] = "."

This is no crash.

a = {}
a[:aaa] = 0

@yamori813
Copy link
Contributor Author

This is mrbc message by crash script on FreeBSD 12.2R i386.

https://gist.github.com/yamori813/cb6cd857d85d5f59b1e588feaf7b956c

Is this relation the problem?

@dearblue
Copy link
Contributor

dearblue commented Dec 4, 2020

I try simple code.

This is crash

a = {}
a[:aaa] = "."

This is no crash.

a = {}
a[:aaa] = 0

This issue has already been reported and fixed.
But I don't think it has anything to do with the problem you first presented.

I got caught a little, so I saw the code of mruby-yabm.
Well, it uses mrb_get_args(), but there are some bad things. For example:
https://github.com/yamori813/mruby-yabm/blob/8de59864d95b8fcd5b05bf3b9102f8592c385a71/src/mrb_yabm.c#L215-L216

The receiving types corresponding to the i specifier is correctly mrb_int, but incorrectly mrb_value.
In addition, there are other places that try to receive with the type int.

Is it possible to fix these first and check again?
😀

@yamori813
Copy link
Contributor Author

yamori813 commented Dec 5, 2020

I update mruby code to 00d1fd0.

I understand half value cause of mrb_value in mruby-yabm.

I found fix that is hash access not in class.

But I still have crash(hangup) by hash access at request method in mruby-simplehttp.

@matz
Copy link
Member

matz commented Dec 5, 2020

Did you fix all the bugs (mis-uses) of mrb_get_args with specifier i? As of yamori813/mruby-yabm@8de5986, I found at least 8 mis-uses. i takes mrb_int, not mrb_value nor int.

@yamori813
Copy link
Contributor Author

I did fix all. yamori813/mruby-yabm@30e2fd6 But still hang up at @uri[:path] = path in mruby-simplehttp.

@dearblue
Copy link
Contributor

dearblue commented Dec 5, 2020

I thought after seeing #5214, what about adding conf.cc.defines << "MRB_USE_METHOD_T_STRUCT" to build_config.rb?

@dearblue
Copy link
Contributor

dearblue commented Dec 5, 2020

@yamori813, I say extra meddling.
Let's enclose the Ruby code in back quotes (`~~~`). For example, `@uri[:path] = path`
In particular, if you write the instance variable of Ruby as it is, it will be mentioned to user on github.
uri-san on github has been called upon so many times that he may be surprised each time.

@dearblue
Copy link
Contributor

dearblue commented Dec 5, 2020

I'm sorry, I made a mistake.
Try MRB_USE_METHOD_T_STRUCT instead of MRB_METHOD_T_STRUCT.

@yamori813
Copy link
Contributor Author

yamori813 commented Dec 5, 2020

I think MRB_USE_METHOD_T_STRUCT is default at 32bit cpu in mrbconf.h.

Now addtion hang up SimpleHttpResponse.new(response_text). This is my work aound.

  def request(method, path, req, &b)
#    @uri[:path] = path
#    if @uri[:path].nil?
#      @uri[:path] = "/"
#    elsif @uri[:path][0] != "/"
#      @uri[:path] = "/" + @uri[:path]
#    end
    request_header = create_request_header(method.upcase.to_s, path, req)
    response_text = send_request(request_header, &b)
#    SimpleHttpResponse.new(response_text)
  end

This is work.

@shuujii
Copy link
Contributor

shuujii commented Dec 6, 2020

@yamori813 Could you tell us a few things?

  • Will it be reproduced even if you build after rake MRUBY_CONFIG=... clean?
  • Is it possible to get the C source location, backtrace, and so on when hung up by debugger etc.?
  • Is it only Hash access that doesn't work? If so, could you please tell us the following as well?
    • What is the content of Hash that does not work?
    • If Hash has the same content as above, is it reproduced with a local variable instead of an instance variable?
    • Is it reproduced without mruby-simplehttp?

@yamori813
Copy link
Contributor Author

I do clean some time but I have problem.
This is Realtek mips soc bear metal enviroment. I can't use debugger. I call Kani-san,
https://qiita.com/yamori813/items/2070538f5a95c2d31893
I check simple test class with instance value hash that is work.

shuujii added a commit to shuujii/mruby that referenced this issue Dec 9, 2020
@yamori813
Copy link
Contributor Author

I seem hang up is case of CPU exception. If this is true UART have message. I don't to access to UART now. I can't check it.

@yamori813
Copy link
Contributor Author

I get UART access now. Exception is TLBMissRd(2). I seem memory access problem.

@matz
Copy link
Member

matz commented Dec 16, 2020

Recently merged #5223 may improve the situation. Could you try it?

@shuujii
Copy link
Contributor

shuujii commented Dec 16, 2020

I asked @yamori813 san on Twitter before to check the behavior with the same fix as #5223, but the situation seems to be the same.

@matz
Copy link
Member

matz commented Dec 16, 2020

Sigh.

@dearblue
Copy link
Contributor

@yamori813, Could you try conf.cc.defines << "MRB_NO_BOXING"?

Since mruby3, MRB_WORD_BOXING has become the standard, so I would like to know if it is affecting it.

And how about disabling optimization?
conf.cc.flags << "-O0"

@yamori813
Copy link
Contributor Author

I try MRB_NO_BOXING and -O0. But Still have exception.

@dearblue
Copy link
Contributor

Thank you very much.
I have no idea what the cause is...

@yamori813
Copy link
Contributor Author

only hash code back to 2.1.2. Then CPU exception is not occur.

debian % git status
HEAD detached from 81fc5ff4
Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git checkout -- <file>..." to discard changes in working directory)

        modified:   include/mruby/hash.h
        modified:   mrblib/hash.rb
        modified:   src/hash.c
        modified:   test/t/hash.rb

no changes added to commit (use "git add" and/or "git commit -a")

But I still have port number is half problem.

@yamori813
Copy link
Contributor Author

port number problem fixed by delete build directory.

@yamori813
Copy link
Contributor Author

yamori813 commented Jan 1, 2021

I find fix.

diff --git a/src/hash.c b/src/hash.c
index b800a251..d70e6d15 100644
--- a/src/hash.c
+++ b/src/hash.c
@@ -360,7 +360,7 @@ ea_next_capa_for(uint32_t size, uint32_t max_capa)
     return AR_DEFAULT_CAPA;
   }
   else {
-    uint64_t capa = U64(size) * EA_INCREASE_RATIO, inc = capa - size;
+    uint64_t capa = size * EA_INCREASE_RATIO, inc = capa - size;
     if (EA_MAX_INCREASE < inc) capa = size + EA_MAX_INCREASE;
     return capa <= max_capa ? U32(capa) : max_capa;
   }

@yamori813
Copy link
Contributor Author

I seem this is gcc4-mips-realtek bug.

Work code is this.

000013d8 <ea_next_capa_for>:
    13d8:	27bdffe8 	addiu	sp,sp,-24
    13dc:	afbe0014 	sw	s8,20(sp)
    13e0:	03a0f021 	move	s8,sp
    13e4:	afc40018 	sw	a0,24(s8)
    13e8:	afc5001c 	sw	a1,28(s8)
    13ec:	8fc20018 	lw	v0,24(s8)
    13f0:	00000000 	nop
    13f4:	2c420004 	sltiu	v0,v0,4
    13f8:	10400004 	beqz	v0,140c <ea_next_capa_for+0x34>
    13fc:	00000000 	nop
    1400:	24020004 	li	v0,4
    1404:	08000541 	j	1504 <ea_next_capa_for+0x12c>
    1408:	00000000 	nop
    140c:	8fc20018 	lw	v0,24(s8)
    1410:	00000000 	nop
    1414:	00021040 	sll	v0,v0,0x1
    1418:	00021880 	sll	v1,v0,0x2
    141c:	00621823 	subu	v1,v1,v0
    1420:	3c02cccc 	lui	v0,0xcccc
    1424:	3442cccd 	ori	v0,v0,0xcccd
    1428:	00620019 	multu	v1,v0
    142c:	00001010 	mfhi	v0
    1430:	00021082 	srl	v0,v0,0x2
    1434:	24420006 	addiu	v0,v0,6
    1438:	afc2000c 	sw	v0,12(s8)
    143c:	afc00008 	sw	zero,8(s8)
    1440:	8fc90018 	lw	t1,24(s8)
    1444:	00004021 	move	t0,zero
    1448:	8fc5000c 	lw	a1,12(s8)
    144c:	8fc40008 	lw	a0,8(s8)
    1450:	00a91823 	subu	v1,a1,t1
    1454:	00a3502b 	sltu	t2,a1,v1
    1458:	00881023 	subu	v0,a0,t0
    145c:	004a2023 	subu	a0,v0,t2
    1460:	00801021 	move	v0,a0
    1464:	afc30004 	sw	v1,4(s8)
    1468:	afc20000 	sw	v0,0(s8)
    146c:	8fc20000 	lw	v0,0(s8)
    1470:	00000000 	nop
    1474:	1440000a 	bnez	v0,14a0 <ea_next_capa_for+0xc8>
    1478:	00000000 	nop
    147c:	8fc20000 	lw	v0,0(s8)
    1480:	00000000 	nop
    1484:	1440000b 	bnez	v0,14b4 <ea_next_capa_for+0xdc>
    1488:	00000000 	nop
    148c:	8fc30004 	lw	v1,4(s8)
    1490:	3c020001 	lui	v0,0x1
    1494:	0062102b 	sltu	v0,v1,v0
    1498:	14400006 	bnez	v0,14b4 <ea_next_capa_for+0xdc>
    149c:	00000000 	nop
    14a0:	8fc30018 	lw	v1,24(s8)
    14a4:	3402ffff 	li	v0,0xffff
    14a8:	00621021 	addu	v0,v1,v0
    14ac:	afc2000c 	sw	v0,12(s8)
    14b0:	afc00008 	sw	zero,8(s8)
    14b4:	8fc7001c 	lw	a3,28(s8)
    14b8:	00003021 	move	a2,zero
    14bc:	8fc20008 	lw	v0,8(s8)
    14c0:	00000000 	nop
    14c4:	00c2102b 	sltu	v0,a2,v0
    14c8:	1440000d 	bnez	v0,1500 <ea_next_capa_for+0x128>
    14cc:	00000000 	nop
    14d0:	8fc30008 	lw	v1,8(s8)
    14d4:	00c01021 	move	v0,a2
    14d8:	14620006 	bne	v1,v0,14f4 <ea_next_capa_for+0x11c>
    14dc:	00000000 	nop
    14e0:	8fc2000c 	lw	v0,12(s8)
    14e4:	00000000 	nop
    14e8:	00e2102b 	sltu	v0,a3,v0
    14ec:	14400004 	bnez	v0,1500 <ea_next_capa_for+0x128>
    14f0:	00000000 	nop
    14f4:	8fc2000c 	lw	v0,12(s8)
    14f8:	08000541 	j	1504 <ea_next_capa_for+0x12c>
    14fc:	00000000 	nop
    1500:	8fc2001c 	lw	v0,28(s8)
    1504:	03c0e821 	move	sp,s8
    1508:	8fbe0014 	lw	s8,20(sp)
    150c:	27bd0018 	addiu	sp,sp,24
    1510:	03e00008 	jr	ra
    1514:	00000000 	nop

Crash code is this.

000013d8 <ea_next_capa_for>:
    13d8:	27bdffc8 	addiu	sp,sp,-56
    13dc:	afbf0034 	sw	ra,52(sp)
    13e0:	afbe0030 	sw	s8,48(sp)
    13e4:	afb3002c 	sw	s3,44(sp)
    13e8:	afb20028 	sw	s2,40(sp)
    13ec:	afb10024 	sw	s1,36(sp)
    13f0:	afb00020 	sw	s0,32(sp)
    13f4:	03a0f021 	move	s8,sp
    13f8:	afc40038 	sw	a0,56(s8)
    13fc:	afc5003c 	sw	a1,60(s8)
    1400:	8fc40038 	lw	a0,56(s8)
    1404:	00000000 	nop
    1408:	2c840004 	sltiu	a0,a0,4
    140c:	10800004 	beqz	a0,1420 <ea_next_capa_for+0x48>
    1410:	00000000 	nop
    1414:	24020004 	li	v0,4
    1418:	0800055f 	j	157c <ea_next_capa_for+0x1a4>
    141c:	00000000 	nop
    1420:	8fc30038 	lw	v1,56(s8)
    1424:	00001021 	move	v0,zero
    1428:	000327c2 	srl	a0,v1,0x1f
    142c:	00024040 	sll	t0,v0,0x1
    1430:	00884025 	or	t0,a0,t0
    1434:	00034840 	sll	t1,v1,0x1
    1438:	01201821 	move	v1,t1
    143c:	01001021 	move	v0,t0
    1440:	00032782 	srl	a0,v1,0x1e
    1444:	00023080 	sll	a2,v0,0x2
    1448:	00863025 	or	a2,a0,a2
    144c:	00033880 	sll	a3,v1,0x2
    1450:	00e32823 	subu	a1,a3,v1
    1454:	00e5402b 	sltu	t0,a3,a1
    1458:	00c22023 	subu	a0,a2,v0
    145c:	00881023 	subu	v0,a0,t0
    1460:	00402021 	move	a0,v0
    1464:	00a01821 	move	v1,a1
    1468:	00801021 	move	v0,a0
    146c:	3c040000 	lui	a0,0x0
    1470:	24880000 	addiu	t0,a0,0
    1474:	00602821 	move	a1,v1
    1478:	00402021 	move	a0,v0
    147c:	24070005 	li	a3,5
    1480:	00003021 	move	a2,zero
    1484:	0100f809 	jalr	t0
    1488:	00000000 	nop
    148c:	00602821 	move	a1,v1
    1490:	00402021 	move	a0,v0
    1494:	24070006 	li	a3,6
    1498:	00003021 	move	a2,zero
    149c:	00a71821 	addu	v1,a1,a3
    14a0:	0065402b 	sltu	t0,v1,a1
    14a4:	00861021 	addu	v0,a0,a2
    14a8:	01022021 	addu	a0,t0,v0
    14ac:	00801021 	move	v0,a0
    14b0:	afc3001c 	sw	v1,28(s8)
    14b4:	afc20018 	sw	v0,24(s8)
    14b8:	8fd30038 	lw	s3,56(s8)
    14bc:	00009021 	move	s2,zero
    14c0:	8fc5001c 	lw	a1,28(s8)
    14c4:	8fc40018 	lw	a0,24(s8)
    14c8:	00b31823 	subu	v1,a1,s3
    14cc:	00a3302b 	sltu	a2,a1,v1
    14d0:	00921023 	subu	v0,a0,s2
    14d4:	00462023 	subu	a0,v0,a2
    14d8:	00801021 	move	v0,a0
    14dc:	afc30014 	sw	v1,20(s8)
    14e0:	afc20010 	sw	v0,16(s8)
    14e4:	8fc20010 	lw	v0,16(s8)
    14e8:	00000000 	nop
    14ec:	1440000a 	bnez	v0,1518 <ea_next_capa_for+0x140>
    14f0:	00000000 	nop
    14f4:	8fc20010 	lw	v0,16(s8)
    14f8:	00000000 	nop
    14fc:	1440000b 	bnez	v0,152c <ea_next_capa_for+0x154>
    1500:	00000000 	nop
    1504:	8fc30014 	lw	v1,20(s8)
    1508:	3c020001 	lui	v0,0x1
    150c:	0062102b 	sltu	v0,v1,v0
    1510:	14400006 	bnez	v0,152c <ea_next_capa_for+0x154>
    1514:	00000000 	nop
    1518:	8fc30038 	lw	v1,56(s8)
    151c:	3402ffff 	li	v0,0xffff
    1520:	00621021 	addu	v0,v1,v0
    1524:	afc2001c 	sw	v0,28(s8)
    1528:	afc00018 	sw	zero,24(s8)
    152c:	8fd1003c 	lw	s1,60(s8)
    1530:	00008021 	move	s0,zero
    1534:	8fc20018 	lw	v0,24(s8)
    1538:	00000000 	nop
    153c:	0202102b 	sltu	v0,s0,v0
    1540:	1440000d 	bnez	v0,1578 <ea_next_capa_for+0x1a0>
    1544:	00000000 	nop
    1548:	8fc30018 	lw	v1,24(s8)
    154c:	02001021 	move	v0,s0
    1550:	14620006 	bne	v1,v0,156c <ea_next_capa_for+0x194>
    1554:	00000000 	nop
    1558:	8fc2001c 	lw	v0,28(s8)
    155c:	00000000 	nop
    1560:	0222102b 	sltu	v0,s1,v0
    1564:	14400004 	bnez	v0,1578 <ea_next_capa_for+0x1a0>
    1568:	00000000 	nop
    156c:	8fc2001c 	lw	v0,28(s8)
    1570:	0800055f 	j	157c <ea_next_capa_for+0x1a4>
    1574:	00000000 	nop
    1578:	8fc2003c 	lw	v0,60(s8)
    157c:	03c0e821 	move	sp,s8
    1580:	8fbf0034 	lw	ra,52(sp)
    1584:	8fbe0030 	lw	s8,48(sp)
    1588:	8fb3002c 	lw	s3,44(sp)
    158c:	8fb20028 	lw	s2,40(sp)
    1590:	8fb10024 	lw	s1,36(sp)
    1594:	8fb00020 	lw	s0,32(sp)
    1598:	27bd0038 	addiu	sp,sp,56
    159c:	03e00008 	jr	ra
    15a0:	00000000 	nop

@shuujii
Copy link
Contributor

shuujii commented Jan 1, 2021

Thank you for the report.

I find fix.

However, this fix looks inadequate. The original code casts size to uint64_t because a large size would exceed 32-bit range during the calculation.

I seem this is gcc4-mips-realtek bug.

If it is a compiler bug for 64-bit operation, mruby uses 64-bit operation elsewhere, so I think it is difficult for mruby to handle it.

@dearblue
Copy link
Contributor

dearblue commented Jan 1, 2021

In 32-bit CPU mode, I don't think the computational range of ea_next_capa_for() will exceed uint32_t even if the entire 32-bit address space is available as a heap.

sizeof(struct hash_entry)
# => 8 (with MRB_WORD_BOXING)

UINT32_MAX / sizeof(struct hash_entry)
# => 536870911 (0x1fffffff)

UINT32_MAX / sizeof(struct hash_entry) * 6
# => 3221225466 (0xbffffffa)

If the result of the calculation exceeds UINT32_MAX / sizeof(struct hash_entry), subsequent resizes should fail.
# It would be useful to have an mrb_reallocarray() to mimic the OpenBSD originated reallocarray().

How about replacing uint32_t with uintptr_t (or size_t), considering 64-bit CPU mode?

@yamori813
Copy link
Contributor Author

yamori813 commented Jan 1, 2021

Sorry I use Japanese.

リアルテックのSDKに含まれるgccはLexraアーキテクチャに修正されたもので、gccの
メインラインにはフィードバックされてないものになります。LexraとはMIPSアーキテ
クチャの亜種になります。今使っているgccは4.4.5を元にリアルテックが修正したバー
ジョンです。このため独自でgccをビルドすることはできません。いくつかのgccのバ
ージョンがSDKにはありますが、他のライブラリにも影響を与える事や、また複数の
Lexraのアーキテクチャをビルドしていて環境が煩雑になるため、また闇雲に確認する
のはとろうになる可能性が極めて高いと思います。

私としてはマクロよるifdefのような解決策でも良いかと思っています。

@matz
Copy link
Member

matz commented Jan 2, 2021

OK. I understand the situation. I try to remove (or at least reduce) int64_t operations on 32-bit architectures.
@yamori813 could you tell us if there's a platform macro that specifies your platform?

@shuujii
Copy link
Contributor

shuujii commented Jan 2, 2021

In 32-bit CPU mode, I don't think the computational range of ea_next_capa_for() will exceed uint32_t even if the entire 32-bit address space is available as a heap.

Certainly! It doesn't exceed 32-bit range if EA_INCREASE_RATIO is current value.

However, even on 32-bit CPU, It exceeds 32-bit range in the following parts.

mruby/src/hash.c

Lines 625 to 627 in 1fbabe2

bit_pos = U64(it->bit) * (it->pos + 1) - 1;
it->ary_index = U32(bit_pos / IB_TYPE_BIT);
it->shift2 = U32((U64(it->ary_index) + 1) * IB_TYPE_BIT - bit_pos - 1);

@yamori813 When you insert more than 17 entries into Hash, the above part is executed, does it work correctly?

I think there is a way to calculate within 32-bit range, but I can't think of it.

@shuujii
Copy link
Contributor

shuujii commented Jan 2, 2021

私としてはマクロよるifdefのような解決策でも良いかと思っています。

その方針でうまく解決できる方法が分かっているのでしたら、とりあえずパッチを書いていただけないでしょうか。

matz added a commit that referenced this issue Jan 2, 2021
Also avoid using `uint64_t`.
@matz matz closed this as completed in eb9d9e3 Jan 3, 2021
matz added a commit that referenced this issue Jan 3, 2021
…hash.c

Avoid 64-bit operations in `src/hash.c`; close #5201
@yamori813
Copy link
Contributor Author

I find fix this issue by 7c9d9b1. Thanks for all.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants