Rbx 2.2.7 sigsegv on "load 'file' #3058

Closed
slaught opened this issue Jun 5, 2014 · 11 comments

Comments

@slaught
Contributor

slaught commented Jun 5, 2014

https://gist.github.com/slaught/ef91508e2a4124962719

Crash while doing

rbx/irb
load 'out3'

out3 is 97 MB and consists of a single assignment of an array to a variable. The array contains a series of complex hashes.

@yorickpeterse
Contributor

Is it possible to provide some form of subset of the data that reproduces the problem? An array with a series of complex hashes could be anything.

@slaught
Contributor Author

slaught commented Jun 5, 2014

It is proprietary data, but I am working on making a smaller example. I think the issue is just the size, but I will try to write a generator to create sample data.

@yorickpeterse
Contributor

Thanks!

@slaught
Contributor Author

slaught commented Jun 6, 2014

I tried it with -Xhash.hamt and it didn't crash.

% time rbx -Xhash.hamt out3
184.280u 8.138s 3:07.15 102.8% 0+0k 310+44io 197pf+0w

@yorickpeterse
Contributor

@slaught Are you still having this problem without using -Xhash.hamt?

@slaught
Contributor Author

slaught commented Jan 5, 2015

In rbx 2.4.1, -Xhash.hamt used to finish in about 3 minutes. Now it doesn't finish after about 20 minutes, and neither does a run without it. I'll try to let it run long enough to see if it still segfaults.

@brixen
Member

brixen commented Jan 21, 2015

@slaught any chance you could anonymize the data so we can test both the performance issue and the possible segv using the non-HAMT Hash?

@slaught
Contributor Author

slaught commented Jan 22, 2015

Still segv's in 2.5.0, and it takes 22m to get there on my crappy laptop. I took a pass at anonymizing the data. I need to check with the data owner first to see if they are OK with what I did.

@slaught
Contributor Author

slaught commented Jan 23, 2015

@brixen

Here is a first attempt at the file: https://github.com/slaught/rbx-issues-3058
I just did a simple substitution, so the keys and values are not as unique as in the original data set. I am working on a file with unique strings and random values.
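
A generator for a file with unique strings and random values could look roughly like the sketch below. It is only an illustration under stated assumptions: the method names (`random_string`, `random_hash`, `sample_source`), the `data` variable, and the key/value shapes are hypothetical and not taken from the original out3 file.

```ruby
# Hypothetical generator for a synthetic reproduction file.
# All names and value shapes here are assumptions made for illustration.

# Return a lowercase ASCII string of the given length.
def random_string(rng, length)
  Array.new(length) { (rng.rand(26) + 97).chr }.join
end

# Build one hash with unique string keys and a mix of numeric strings,
# floats, and (at depth > 0) arrays holding a nested hash.
def random_hash(rng, keys: 20, depth: 1)
  hash = {}
  keys.times do |i|
    key = "#{random_string(rng, 8)} #{i}" # the index suffix keeps keys unique
    hash[key] =
      if depth > 0 && i % 5 == 0
        [random_hash(rng, keys: 5, depth: depth - 1)]
      elsif i.even?
        format("%.4f", rng.rand * 1000)
      else
        (rng.rand * 10).round(1)
      end
  end
  hash
end

# Return Ruby source that assigns an array of `count` hashes to a variable.
# A fixed seed makes the output reproducible between runs.
def sample_source(count:, seed: 42)
  rng = Random.new(seed)
  lines = Array.new(count) { "  #{random_hash(rng).inspect}," }
  "data = [\n#{lines.join("\n")}\n]\n"
end

# Usage (writes a file that can then be passed to `load`):
#   File.write("sample_data.rb", sample_source(count: 100_000))
```

Raising `count` grows the file roughly linearly, which would make it possible to look for the size at which the crash starts to appear.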

@brixen
Member

brixen commented Jan 24, 2015

With the following patch so the code runs, I get the same time (~11m) both with and without HAMT (and no segv, obviously).

Since I plan to remove the chained-bucket implementation as soon as I address a couple issues with HAMT Hash, I don't think there's much value trying to debug this. However, if I can get a case that repros the segv, I'd like to look at it.

diff --git a/hash-file.rb b/hash-file.rb
index f634ce3..96e2efc 100644
--- a/hash-file.rb
+++ b/hash-file.rb
@@ -237,7 +237,7 @@
  "xxxxxxx xxx xxxxx"=>"4",
  "xxxxxxxx xxx xxxxx"=>"4",
  "xxxxxxx xxxxxx (xx)"=>"44.44",
- "xxxxxxx xxxx xxxxxx (xx)"=>"444.44"}], "xxxxxxxxxxxxxxxxx"=>xxxx, "xxxx"=>"444.4444",
+ "xxxxxxx xxxx xxxxxx (xx)"=>"444.44"}], "xxxxxxxxxxxxxxxxx"=>4.4, "xxxx"=>"444.4444",
  "xxxxxxxxxxxxxxxx"=>"44/44/4444",
  "xxxxxxxxxxx"=>"444.4444",
  "xxxx"=>"444.4444",
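
For context on why the patch is needed for the code to run: an unquoted `xxxx` in value position is parsed as an identifier, so evaluating the file raises a NameError unless something by that name is defined. The snippet below is a minimal sketch of that behavior; the helper name `loads_cleanly?` and the one-element sample strings are hypothetical, and how the bare `xxxx` ended up in the file (presumably the substitution pass hitting an unquoted literal) is an assumption.

```ruby
# Hypothetical helper: report whether a snippet of Ruby source evaluates
# without hitting an undefined identifier.
def loads_cleanly?(source)
  eval(source)
  true
rescue NameError
  false
end

broken  = '[{"key" => xxxx}]' # bare identifier, as in the removed diff line
patched = '[{"key" => 4.4}]'  # float literal, as in the added diff line

puts loads_cleanly?(broken)   # prints false
puts loads_cleanly?(patched)  # prints true
```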

@brixen
Member

brixen commented Jan 4, 2020

Much of the internals of Rubinius have been completely or mostly rewritten in the past couple years. This includes the garbage collector, concurrency facilities, Fibers, much of the instruction set, and a migration away from "primitive" functions that implement Ruby features.

Since a number of segfaults or process hangs have occurred in these features over time, this issue may be fixed.

The focus for Rubinius in the near term is on the following capabilities:

  1. Instruction set
  2. Debugger
  3. Profiler
  4. Just-in-time compiler
  5. Concurrency
  6. Garbage collector

Contributions in the form of PRs for any of the areas of focus above are appreciated. Once these capabilities are more robust, it will be possible to more efficiently debug and fix any process crashes.

Other than these core capabilities, PRs to fix any specific issue are always welcome.

brixen closed this as completed Jan 4, 2020