Skip to content
Permalink
Branch: master
Find file Copy path
Find file Copy path
Fetching contributors…
Cannot retrieve contributors at this time
235 lines (213 sloc) 9.69 KB
-----BEGIN CREV PACKAGE REVIEW-----
version: -1
date: "2019-08-22T19:10:07.491610585-04:00"
from:
id-type: crev
id: VylyTuk8CMGqIxgHixWaqfiUn3xZyzOA1wFrQ0sR1As
url: "https://github.com/BurntSushi/crev-proofs"
package:
source: "https://crates.io"
name: memchr
version: 2.2.1
revision: 75155594dd6c130d47c944bbf31813f454d94341
digest: ae5zOlbfexMaCjagJcotoowq1Ql1mDNxtrIcHuGakjM
review:
thoroughness: high
understanding: high
rating: strong
comment: |
I wrote this crate, so this review is a reflection as a result of writing
the code and then reviewing it again for this review.
The entire purpose of the memchr crate is to do this:
haystack.iter().position(|&b| b == needle)
... but really fast. As a result, this crate uses SIMD via CPU specific
vendor intrinsics. Consequently, there is a lot of unsafe code in this
crate. There is really no way to avoid this, other than perhaps using a
higher level platform independent SIMD API. But no such thing of sufficient
quality exists for stable Rust at the time of writing.
The testing strategy is the most important bit here. In particular, every
public API item is tested using a permutation of tests that exercise all
possible alignments found in a haystack. (Because the implementations used
aligned loads/stores, which are only correct if the address arithmetic is
correct.) Additionally, there are quickcheck tests that act as a sort of
fuzzer guaranteeing that the implementation is correct for a range of inputs.
I gave the highest rating possible because of the extensive use this crate
has seen, in addition to its level of testing. In particular, memchr
underlies a significant chunk of all text search in the Rust ecosystem.
-----BEGIN CREV PACKAGE REVIEW SIGNATURE-----
k9OhqAzmXsrZVASDN_k2H7l-RQm4sasBEtCn1qGl12txvkCDNgAfIipJbhiJxxajMq2Cx9PhZfBXCHxjqe6-AA
-----END CREV PACKAGE REVIEW-----
-----BEGIN CREV PACKAGE REVIEW-----
version: -1
date: "2019-08-22T19:39:15.333196905-04:00"
from:
id-type: crev
id: VylyTuk8CMGqIxgHixWaqfiUn3xZyzOA1wFrQ0sR1As
url: "https://github.com/BurntSushi/crev-proofs"
package:
source: "https://crates.io"
name: aho-corasick
version: 0.7.6
revision: f1bf13c9ba179733a172794507dcdd0539fa4269
digest: 4dvyEm-kmxXzGE2_36dIyFFH9y2Q47kF3Ae6ceXhdrA
review:
thoroughness: high
understanding: high
rating: positive
comment: |
I wrote this crate, so this review is a reflection as a result of writing the
code and then reviewing it again for this review.
While the aho-corasick crate is not often used directly, it is a key
optimization technique used in the `regex` crate for quickly finding
potential matches by searching literals.
I gave this crate a rating of `positive` instead of the highest `strong`
because it was somewhat recently rewritten. So it hasn't been thoroughly
vetted yet.
At a higher level, one concern point of this crate is that it has a lot of
`unsafe` usage. While a small number of those `unsafe` uses are for the
Aho-Corasick algorithm itself---mostly for explicitly eliding bounds checks
for performance reasons---the vast majority of all `unsafe` uses are for the
implementation of the Teddy algorithm, which makes heavy use of SIMD through
explicit CPU specific vendor intrinsics. The Teddy algorithm is used as a
fast prefilter to quickly find potential matches when searching for a smaller
number of patterns. The speedups can be an order of magnitude, so the extra
code complexity is worth it.
As with the memchr crate, both the Aho-Corasick algorithm and the Teddy
algorithm are thoroughly tested. Both are tested independently of one another
and when they are used together. Like the memchr crate, the Teddy
algorithm is tested on a wide variety of haystack configurations to test
different haystack lengths and match positions, all of which can exercise
different aspects of the Teddy algorithm. If one counted the total number of
tests for the entire crate (including variations on each), it would easily
be in the tens of thousands.
-----BEGIN CREV PACKAGE REVIEW SIGNATURE-----
pwSedX-V7MFo_7UPSK40cvJaFcLffTpVv9xknp2kVOQyREK3g4XBhLCERxWELxKXw1VnwguHxJQaJtGK187MAA
-----END CREV PACKAGE REVIEW-----
-----BEGIN CREV PACKAGE REVIEW-----
version: -1
date: "2019-08-22T19:53:17.672001511-04:00"
from:
id-type: crev
id: VylyTuk8CMGqIxgHixWaqfiUn3xZyzOA1wFrQ0sR1As
url: "https://github.com/BurntSushi/crev-proofs"
package:
source: "https://crates.io"
name: regex-syntax
version: 0.6.11
revision: 169783c1d6e045ff2ee418ea3556d09ed97e8f3c
digest: jpb-7eZGGg3VD4Qv8s64grLU-tFlXore5J-p4xG9JKg
review:
thoroughness: high
understanding: high
rating: strong
comment: |
I wrote this crate, so this review is a reflection as a result of writing
the code and then reviewing it again for this review.
The regex-syntax crate is mostly an internal implementation detail of the
regex crate. It is exposed as a stand-alone crate for the occasional use
case where it is convenient to analyze the syntax of a regex. The primary
thing that the regex-syntax crate provides is a parser for the concrete
syntax supported by the `regex` crate.
The regex-syntax crate is very large, but there is no unsafe used anywhere.
Moreover, it has no dependencies. That means that no matter what concrete
syntax is given, the worst that can happen is a panic. Memory safety should
be preserved (modulo bugs in the compiler or the standard library). On top of
that, the parser does not use explicit recursion and enforces a nest depth
by default, so it should generally be safe with respect to arbitrary input.
-----BEGIN CREV PACKAGE REVIEW SIGNATURE-----
pMl9VoBusVzqWKrDWdY7EkB93IJO7n-qws5-AzYHlFIvoYvBJIerm8xCAASYau1Vruilo2stJOCKmzIUcxeADA
-----END CREV PACKAGE REVIEW-----
-----BEGIN CREV PACKAGE REVIEW-----
version: -1
date: "2019-08-22T20:08:43.486032665-04:00"
from:
id-type: crev
id: VylyTuk8CMGqIxgHixWaqfiUn3xZyzOA1wFrQ0sR1As
url: "https://github.com/BurntSushi/crev-proofs"
package:
source: "https://crates.io"
name: thread_local
version: 0.3.6
digest: ZFqLfovJbiB-qdGhcZ-cIqJMauR-pbY0UyJr0ds2H54
review:
thoroughness: medium
understanding: medium
rating: positive
comment: |
This crate was written by Amanieu d'Antras in response to my efforts to make
thread safe caching in the regex crate faster. The way to think about it is
"dynamic" thread local storage.
The crate does use a fair amount of unsafe, and the reason for this is to
minimize latency as much as is possible. Each use of unsafe isn't
particularly tricky---it's mostly managing atomic access to a hash table.
With that said, it would be nice if each use of unsafe had a comment
explaining/justifying it. There are some comments scattered about making
reasonable arguments, but more would be good.
This crate did have an issue a while back where if a program used the same
regex (for example) many times from many short lived threads, then the
cached data could grow without bound. Amanieu patched that issue a while
back by reusing thread ids. A thread id is tracked using normal TLS, which
is dropped when the thread is destroyed.
I gave this crate a rating of positive because I cannot identify any issues
after reviewing the code, although, I am not completely certain that there
are no issues because of the use of unsafe code.
-----BEGIN CREV PACKAGE REVIEW SIGNATURE-----
Ddmx5eKdL3h3YmEJJCVym02iS0InUFOQH4nVDCrbdN-XcKFLSu_0mIPs6bmLCo33UF_e9aosvpRl6B2lpfSPDA
-----END CREV PACKAGE REVIEW-----
-----BEGIN CREV PACKAGE REVIEW-----
version: -1
date: "2019-08-23T06:40:48.892265615-04:00"
from:
id-type: crev
id: VylyTuk8CMGqIxgHixWaqfiUn3xZyzOA1wFrQ0sR1As
url: "https://github.com/BurntSushi/crev-proofs"
package:
source: "https://crates.io"
name: regex
version: 1.2.1
revision: 0b1e1e8c54287d382e505acc68d983801daddeee
digest: ocH3h6Mkhx80kLs5NaA8LnoHj-5vO7Esk27gJBwXEt8
review:
thoroughness: high
understanding: high
rating: positive
comment: |
I wrote this crate, so this review is a reflection as a result of writing the
code and then reviewing it again for this review.
I gave a rating of positive because the implementation in the regex crate has
remain mostly unchanged for a long period of time, so it has matured and been
battle tested. Moreover, the crate uses very little unsafe directly. The
only uses are in the DFA regex engine for explicitly eliding bounds checks
in very hot loops.
A better rating is likely deserved, but there are still a few outstanding
bugs that can produce incorrect matches. In many cases, these bugs are a
result of optimizations using literal searches, which can be subtle and
difficult to get right.
-----BEGIN CREV PACKAGE REVIEW SIGNATURE-----
B2ytIApfPvGEtYKD4JSGNxi5aSVdUbTWA55rMJoZoDfeaA8uj4cfIWoXdtk6SmptxjcfbkFfOdj-98slpYWeCQ
-----END CREV PACKAGE REVIEW-----
-----BEGIN CREV PACKAGE REVIEW-----
version: -1
date: "2019-08-28T20:36:24.694340048-04:00"
from:
id-type: crev
id: VylyTuk8CMGqIxgHixWaqfiUn3xZyzOA1wFrQ0sR1As
url: "https://github.com/BurntSushi/crev-proofs"
package:
source: "https://crates.io"
name: lazy_static
version: 1.4.0
revision: 421669662b35fcb455f2902daed2e20bbbba79b6
digest: 95REZRx8ItDao_V9l7sOsW9-OB6Jc5C7azc9mNC07fI
review:
thoroughness: medium
understanding: high
rating: strong
comment: |
The 1.4.0 release removes the hand-rolled unreachable hint in favor of
`unreachable_unchecked`, which was a new API introduced in Rust 1.27.
Otherwise, nothing substantial was changed, other than allowing some
deprecated APIs to support older versions of Rust.
-----BEGIN CREV PACKAGE REVIEW SIGNATURE-----
MbvH99FJXvZAWhto_bjaDAuI-Xw3nG8F-uKpDxsuPfLXuscHlPc7GD_IIIKWu033ZqfEEilytHY9v4lp5NkfBw
-----END CREV PACKAGE REVIEW-----
You can’t perform that action at this time.