-
Notifications
You must be signed in to change notification settings - Fork 110
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add a spec-runner configuration that tests all of Ruby Core #1088
Conversation
It runs to completion! ``` Passed 4500, skipped 1097, not implemented 774, failed 5188 specs. ``` This change adds a self-referential Hash check to `Hash#inspect` to prevent a stack overflow. This change modifes `mrb_convert_type` to prevent a recursive attempt to raise a `TypeError` when attempting to convert a type to `String` that doesn't implement `#to_s`. The `all-core-specs.toml` configuration enables all top-level specs in `spec-runner/vendor/spec/core` except: - `IO` (attempt to instantiate a binary `Regexp` that aborts the spec-runner. - `Kernel` (infinite loop somewhere in there) - `Mutex` - `ObjectSpace` - `Process` - `Thread` - `ThreadGroup` - `TracePoint` This config also copies the permitted spec list from `enforced-specs.toml` for `Array`. `Array#initialize` specs have an exception that crashes the spec-runner.
Hi @eregon 👋 This PR adds a configuration profile for Artichoke's spec runner that tests a wider swath or Ruby Core APIs. I was curious if you'd be willing to add a note to https://eregon.me/blog/2020/06/27/ruby-spec-compatibility-report.html that points folks at this PR for a significant update. |
In the output above, does it mean 4500 spec examples ( It'd be great to have this run in CI if possible, and ideally separate for language/core/library specs. |
@eregon yes that's correct.
Artichoke vendors the
Yes this is something I'd like to do, but haven't gotten around to it. Linking #101 and #142. |
Tweaking the config a bit, I get these numbers
but also a reliable segfault on interpreter teardown.
|
I added a sentence in the blog post:
I think all other Rubies actually run specs in CI (important to avoid regressions), and in that blog post I relied on Ruby implementations untagging passing specs rather than trying to run all specs blindly and ignore errors, so I wanted to clarify how these numbers differ from the rest. |
Thanks @eregon! I know Artichoke doesn't integrate with ruby/spec and MSpec the way it's intended. It's still early days for Artichoke; when the time is right, these bits of polish will get prioritized. |
Hi @eregon 👋 I'm excited to share that Artichoke runs ruby/spec on every commit as of #1328, #1329, #1330, #1331, #1332, #1333. I've filed ruby/spec#844 to reflect this upstream in the ruby/spec README. 🚀 Artichoke doesn't have |
That's great! |
Not yet. I'm planning on turning the raw data into a webpage at some point. The suite currently only runs a subset of I've never run any Note: this isn't my biggest priority, but I am excited that this bit of infrastructure is in place to do things automatically. The hardest part is done I think 😄 |
Implement `String` in Rust **Implement the `String` class from Ruby core in Rust with `spinoso-string` in `artichoke-backend`.** 💎 🎉 💎 🎉 💎 [All 144 APIs available on `String`][ruby-core-string] in MRI 3.0.2 are either implemented or stubbed to raise `NotImplementedError`. [ruby-core-string]: https://ruby-doc.org/core-3.0.2/String.html ## Design Artichoke's `String` is based on [`spinoso_string::String`[spinoso-string-doc]. `String` is a byte buffer and associated encoding. `String` supports UTF-8, ASCII, and binary encodings. Encodings are conventional (a UTF-8 `String` may contain invalid UTF-8 byte sequences, an ASCII `String` may contain bytes greater than `0x7F`). Character oriented APIs like `String::center`, `String::get_char` and `String::char_len` are implemented according to the `String`'s conventional encoding. Unlike `String` in MRI, Artichoke's `String` does not cache whether a `String`'s encoding is valid, nor does it cache whether the `String` is `ascii_only`. These enhancements are expected to come in a future set of changes. I've [discussed how I would like to restructure `String`'s internals][twitter-string-nirvana] to make this happen on Twitter `spinoso-string` uses state of the art Rust dependencies for encoding-oriented operations with first-class SIMD implementations. `spinoso-string` is built on top of `bstr` (ASCII and UTF-8 decoding, `find`-oriented APIs), `simdutf8` (UTF-8 validity checks), and `bytecount` (UTF-8 character length). [spinoso-string-doc]: https://artichoke.github.io/artichoke/spinoso_string/struct.String.html [twitter-string-nirvana]: https://twitter.com/artichokeruby/status/1464117285002440707 ### mruby FFI Compatibility Much like earlier work on `Array`, the `MRB_API` functions for strings are used heavily in the VM in addition to being exposed to Ruby code. The mruby VM treats strings as memory managed byte buffers and uses them extensively in its internals. It is not sufficient to `#ifdef` out the C API since it is used outside of the `string.c` compilation unit. Artichoke includes several reimplementations of mruby C APIs, mostly in Rust and some in the `mruby-sys` C extension. During the development of these FFI compat functions, I encountered 1 segfault and 1 buffer overrun, both documented in the commits that address them: - 7a13a30 - 279c698 See https://twitter.com/artichokeruby/status/1463024367948812293 for some more color around these bugs. ## API Coverage Artichoke implements the following String APIs in mostly native code with some pure Ruby implementations and the `mruby-pack` mrbgem: ### Native Rust Implementations - `String#*` - `String#+` - `String#<<` - `String#<=>` - `String#==` - `String#[]` - `String#[]=` (stubbed, raises `NotImplementedError`) - `String#ascii_only?` - `String#b` - `String#bytes` - `String#bytesize` - `String#byteslice` - `String#capitalize` - `String#capitalize!` - `String#casecmp` - `String#casecmp?` - `String#center` - `String#chars` - `String#chomp` - `String#chomp!` - `String#chop` - `String#chop!` - `String#chr` - `String#clear` - `String#codepoints` - `String#concat` (stubbed, raises `NotImplementedError`) - `String#downcase` - `String#downcase!` - `String#empty?` - `String#eql?` - `String#getbyte` - `String#hash` - `String#include?` - `String#index` (only supports `String` patterns) - `String#initialize` - `String#initialize_copy` - `String#inspect` - `String#intern` - `String#length` - `String#ord` - `String#replace` - `String#reverse` - `String#reverse!` - `String#rindex` (only supports `String` patterns) - `String#scan` - `String#setbyte` - `String#size` - `String#slice` - `String#slice!` (stubbed, raises `NotImplementedError`) - `String#to_f` (stubbed, raises `NotImplementedError`) - `String#to_i` - `String#to_s` - `String#to_sym` - `String#upcase` - `String#upcase!` - `String#valid_encoding?` ### Pure Ruby Implementations - `String::try_convert` - `String#%` - `String#+@` - `String#-@` - `String#/` - `String#===` - `String#=~` - `String#count` (stubbed, raises `NotImplementedError`) - `String#crypt` (stubbed, raises `NotImplementedError`, not intended to be implemented for security considerations) - `String#delete` - `String#delete!` - `String#delete_prefix` - `String#delete_suffix` - `String#delete_suffix!` - `String#dump` (stubbed, raises `NotImplementedError`) - `String#each_byte` - `String#each_char` - `String#each_codepoint` - `String#each_grapheme_cluster` (stubbed, raises `NotImplementedError`) - `String#each_line` - `String#encode` (stubbed, pending completion of #183) - `String#encode!` (stubbed, pending completion of #183) - `String#encoding` (stubbed, pending completion of #183) - `String#end_with?` - `String#force_encoding` (stubbed, pending completion of #183) - `String#grapheme_clusters` (stubbed, raises `NotImplementedError`) - `String#gsub` - `String#gsub!` - `String#hex` (stubbed, raises `NotImplementedError`) - `String#insert` - `String#lines` - `String#ljust` - `String#lstrip` - `String#lstrip!` - `String#match` - `String#match?` - `String#next` (stubbed, raises `NotImplementedError`) - `String#next!` (stubbed, raises `NotImplementedError`) - `String#oct` (stubbed, raises `NotImplementedError`) - `String#partition` - `String#prepend` - `String#rjust` - `String#rpartition` (stubbed, raises `NotImplementedError`) - `String#rstrip` - `String#rstrip!` - `String#scrub` (stubbed, raises `NotImplementedError`, this should be implemented in native code) - `String#scrub!` (stubbed, raises `NotImplementedError`, this should be implemented in native code) - `String#split` (partial implementation, buggy, does not support block arguments) - `String#squeeze` (partial implementation, does not support character set arguments) - `String#squeeze!` (partial implementation, does not support character set arguments) - `String#start_with?` - `String#strip` - `String#strip!` - `String#sub` - `String#sub!` - `String#succ` (stubbed, raises `NotImplementedError`) - `String#succ!` (stubbed, raises `NotImplementedError`) - `String#sum` (stubbed, raises `NotImplementedError`) - `String#swapcase` (stubbed, raises `NotImplementedError`) - `String#swapcase!` (stubbed, raises `NotImplementedError`) - `String#to_a` - `String#to_c` (stubbed, raises `NotImplementedError`, Artichoke does not support `Complex` numbers) - `String#to_r` (stubbed, raises `NotImplementedError`, Artichoke does not support `Rational` numbers) - `String#to_str` - `String#tr` - `String#tr!` - `String#tr_s` - `String#tr_s!` - `String#undump` (stubbed, raises `NotImplementedError`) - `String#unicode_normalize` (stubbed, raises `NotImplementedError`) - `String#unicode_normalize!` (stubbed, raises `NotImplementedError`) - `String#unicode_normalized?` (stubbed, raises `NotImplementedError`) - `String#upto` ### mruby mrbgem C extension - `String#unpack` - `String#unpack1` ## ruby/spec compliance This PR improves ruby/spec compliance for Artichoke 📈 🚀 On current trunk, when running `rake spec`, Artichoke has the following stats: ``` Passed 1837, skipped 238, not implemented 14, failed 0 specs. ``` On the tip of #1222, when running `rake spec`, Artichoke has the following stats: ``` Passed 1845, skipped 238, not implemented 14, failed 0 specs. ``` This PR additionally allows for `SecureRandom#random_bytes` specs to pass since the returned `String` can properly declare their binary encoding. Previously, the returned `String`s had UTF-8 encoding which meant that if the random bytes happened to contain a multi-byte UTF-8 character byte sequence, the reported length would be less than `bytesize`. See: - fac36b9 ## Future Work - Implement `String#[]=`. - Add fast paths when `String`s of any encoding are valid per their encoding. - Add fast paths when `String`s of any encoding are ASCII only. - Remove dependency on `mruby-pack`. - Expose `Encoding` in Ruby, see #183. - Use encoding-aware `String` constructors in native APIs where possible/required per ruby/spec. ## Linked Issues Closes #184. Partially addresses #183. These PRs were foundational to getting this PR to a mergeable state: - #1012, thanks @ekroon - #1045 - #1047 - #1048 - #1049 - #1051 - #1088 - #1141 - #1271 - #1348 - #1349 - #1353 - #1357 - #1402 - #1403 - #1449 - #1450 (comment) and #1510, thanks @rurban - #1451 - #1488, thanks @masonforest - #1489 - #1490 - #1491 - #1505, thanks @briankung - #1512 - #1513 - #1514 - #1515 - #1516 - #1517 - #1519
It runs to completion!
Gist with the full error log: https://gist.github.com/lopopolo/d7f758a4482a64ede6bb8a77918baba6.
There are still a few segfaults in there, but as configured, the specs reliably execute to completion.
This change adds a self-referential Hash check to
Hash#inspect
to prevent a stack overflow.This change modifes
mrb_convert_type
to prevent a recursive attempt to raise aTypeError
when attempting to convert a type toString
that doesn't implement#to_s
. This was fixed upstream in mruby/mruby@01a8d84.The
all-core-specs.toml
configuration enables all top-level specs inspec-runner/vendor/spec/core
except:IO
(attempts to instantiate a binaryRegexp
that aborts the spec-runner)Kernel
(infinite loop somewhere in there)Mutex
ObjectSpace
Process
Thread
ThreadGroup
TracePoint
This config also copies the permitted spec list from
enforced-specs.toml
forArray
.Array#initialize
specs have an exception that crashes the spec-runner.