Skip to content

Commit

Permalink
Rollup merge of rust-lang#119676 - notriddle:notriddle/rustdoc-search…
Browse files Browse the repository at this point in the history
…-hof, r=GuillaumeGomez

rustdoc-search: search types by higher-order functions

This feature extends rustdoc with syntax and search index information for searching function pointers and closures (Higher-Order Functions, or HOF). Part of rust-lang#60485

This PR adds two syntaxes: a high-level one for finding any kind of HOF, and a direct implementation of the parenthesized path syntax that Rust itself uses.

## Preview pages

| Query | Results |
|-------|---------|
| [`option<T>, (fnonce (T) -> bool) -> option<T>`][optionfilter] | `Option::filter` |
| [`option<T>, (T -> bool) -> option<T>`][optionfilter2] | `Option::filter` |

Updated chapter of the book: https://notriddle.com/rustdoc-html-demo-9/search-hof/rustdoc/read-documentation/search.html

[optionfilter]: https://notriddle.com/rustdoc-html-demo-9/search-hof/std/vec/struct.Vec.html?search=option<T>%2C+(fnonce+(T)+->+bool)+->+option<T>&filter-crate=std
[optionfilter2]: https://notriddle.com/rustdoc-html-demo-9/search-hof/std/vec/struct.Vec.html?search=option<T>%2C+(T+->+bool)+->+option<T>&filter-crate=std

## Motivation

When type-based search was first landed, it was directly [described as incomplete][a comment].

[a comment]: rust-lang#23289 (comment)

Filling out the missing functionality is going to mean adding support for more of Rust's [type expression] syntax, such as references, raw pointers, function pointers, and closures. This PR adds function pointers and closures.

[type expression]: https://doc.rust-lang.org/reference/types.html#type-expressions

There's been demand for something "like Hoogle, but for Rust" expressed a few times [1](https://www.reddit.com/r/rust/comments/y8sbid/is_there_a_website_like_haskells_hoogle_for_rust/) [2](https://users.rust-lang.org/t/rust-equivalent-of-haskells-hoogle/102280) [3](https://internals.rust-lang.org/t/std-library-inclusion-policy/6852/2) [4](https://discord.com/channels/442252698964721669/448238009733742612/1109502307495858216). Some of them just don't realize what functionality already exists ([`Duration -> u64`](https://doc.rust-lang.org/nightly/std/?search=duration%20-%3E%20u64) already works), but a lot of them specifically want to search for higher-order functions like option combinators.

## Guide-level explanation (from the Rustdoc book)

To search for a function that accepts a function as a parameter, like `Iterator::all`, wrap the nested signature in parenthesis, as in [`Iterator<T>, (T -> bool) -> bool`][iterator-all]. You can also search for a specific closure trait, such as `Iterator<T>, (FnMut(T) -> bool) -> bool`, but you need to know which one you want.

[iterator-all]: https://notriddle.com/rustdoc-html-demo-9/search-hof/std/vec/struct.Vec.html?search=Iterator<T>%2C+(T+->+bool)+->+bool&filter-crate=std

## Reference-level description (also from the Rustdoc book)

### Primitives with Special Syntax

<table>
<thead>
  <tr>
    <th>Shorthand</th>
    <th>Explicit names</th>
  </tr>
</thead>
<tbody>
  <tr><td colspan="2">Before this PR</td></tr>
  <tr>
    <td><code>[]</code></td>
    <td><code>primitive:slice</code> and/or <code>primitive:array</code></td>
  </tr>
  <tr>
    <td><code>[T]</code></td>
    <td><code>primitive:slice&lt;T&gt;</code> and/or <code>primitive:array&lt;T&gt;</code></td>
  </tr>
  <tr>
    <td><code>!</code></td>
    <td><code>primitive:never</code></td>
  </tr>
  <tr>
    <td><code>()</code></td>
    <td><code>primitive:unit</code> and/or <code>primitive:tuple</code></td>
  </tr>
  <tr>
    <td><code>(T)</code></td>
    <td><code>T</code></td>
  </tr>
  <tr>
    <td><code>(T,)</code></td>
    <td><code>primitive:tuple&lt;T&gt;</code></td>
  </tr>
  <tr><td colspan="2">After this PR</td></tr>
  <tr>
    <td><code>(T, U -> V, W)</code></td>
    <td><code>fn(T, U) -> (V, W)</code>, Fn, FnMut, and FnOnce</td>
  </tr>
</tbody>
</table>

The `->` operator has lower precedence than comma. If it's not wrapped in brackets, it delimits the return value for the function being searched for. To search for functions that take functions as parameters, use parenthesis.

### Search query grammar

```ebnf
ident = *(ALPHA / DIGIT / "_")
path = ident *(DOUBLE-COLON ident) [BANG]
slice-like = OPEN-SQUARE-BRACKET [ nonempty-arg-list ] CLOSE-SQUARE-BRACKET
tuple-like = OPEN-PAREN [ nonempty-arg-list ] CLOSE-PAREN
arg = [type-filter *WS COLON *WS] (path [generics] / slice-like / tuple-like)
type-sep = COMMA/WS *(COMMA/WS)
nonempty-arg-list = *(type-sep) arg *(type-sep arg) *(type-sep) [ return-args ]
generic-arg-list = *(type-sep) arg [ EQUAL arg ] *(type-sep arg [ EQUAL arg ]) *(type-sep)
normal-generics = OPEN-ANGLE-BRACKET [ generic-arg-list ] *(type-sep)
            CLOSE-ANGLE-BRACKET
fn-like-generics = OPEN-PAREN [ nonempty-arg-list ] CLOSE-PAREN [ RETURN-ARROW arg ]
generics = normal-generics / fn-like-generics
return-args = RETURN-ARROW *(type-sep) nonempty-arg-list

exact-search = [type-filter *WS COLON] [ RETURN-ARROW ] *WS QUOTE ident QUOTE [ generics ]
type-search = [ nonempty-arg-list ]

query = *WS (exact-search / type-search) *WS

; unchanged parts of the grammar, like the full list of type filters, are omitted
```

## Future direction

### The remaining type expression grammar

As described in rust-lang#118194, this is another step in the type expression grammar: BareFunction, and the function-like mode of TypePath, are now supported.

* RawPointerType and ReferenceType actually are a priority.
* ImplTraitType and TraitObjectType (and ImplTraitTypeOneBound and TraitObjectTypeOneBound) aren't as much of a priority, since they desugar pretty easily.

### Search subtyping and traits

This is the other major factor that makes it less useful than it should be.

* `iterator<result<t>> -> result<t>` doesn't find `Result::from_iter`. You have to search [`intoiterator<result<t>> -> result<t>`](https://notriddle.com/rustdoc-html-demo-9/search-hof/std/vec/struct.Vec.html?search=intoiterator%3Cresult%3Ct%3E%3E%20-%3E%20result%3Ct%3E&filter-crate=std). Nobody's going to search for IntoIterator unless they basically already know about it and don't need the search engine anyway.

* Iterator combinators are usually structs that happen to implement Iterator, like `std::iter::Map`.

To solve these cases, it needs to look at trait implementations, knowing that Iterator is a "subtype of" IntoIterator, and Map is a "subtype of" Iterator, so `iterator -> result` is a subtype of `intoiterator -> result` and `iterator<t>, (t -> u) -> iterator<u>` is a subtype of [`iterator<t>, (t -> u) -> map<t -> u>`](https://notriddle.com/rustdoc-html-demo-9/search-hof/std/vec/struct.Vec.html?search=iterator%3Ct%3E%2C%20(t%20-%3E%20u)%20-%3E%20map%3Ct%20-%3E%20u%3E&filter-crate=std).
  • Loading branch information
matthiaskrgr committed Mar 11, 2024
2 parents 7ec3516 + 84d7a2c commit c87410e
Show file tree
Hide file tree
Showing 8 changed files with 1,177 additions and 148 deletions.
54 changes: 35 additions & 19 deletions src/doc/rustdoc/src/read-documentation/search.md
Original file line number Diff line number Diff line change
Expand Up @@ -63,11 +63,12 @@ Before describing the syntax in more detail, here's a few sample searches of
the standard library and functions that are included in the results list:

| Query | Results |
|-------|--------|
|-------|---------|
| [`usize -> vec`][] | `slice::repeat` and `Vec::with_capacity` |
| [`vec, vec -> bool`][] | `Vec::eq` |
| [`option<T>, fnonce -> option<U>`][] | `Option::map` and `Option::and_then` |
| [`option<T>, fnonce -> option<T>`][] | `Option::filter` and `Option::inspect` |
| [`option<T>, (fnonce (T) -> bool) -> option<T>`][optionfilter] | `Option::filter` |
| [`option<T>, (T -> bool) -> option<T>`][optionfilter2] | `Option::filter` |
| [`option -> default`][] | `Option::unwrap_or_default` |
| [`stdout, [u8]`][stdoutu8] | `Stdout::write` |
| [`any -> !`][] | `panic::panic_any` |
Expand All @@ -77,7 +78,8 @@ the standard library and functions that are included in the results list:
[`usize -> vec`]: ../../std/vec/struct.Vec.html?search=usize%20-%3E%20vec&filter-crate=std
[`vec, vec -> bool`]: ../../std/vec/struct.Vec.html?search=vec,%20vec%20-%3E%20bool&filter-crate=std
[`option<T>, fnonce -> option<U>`]: ../../std/vec/struct.Vec.html?search=option<T>%2C%20fnonce%20->%20option<U>&filter-crate=std
[`option<T>, fnonce -> option<T>`]: ../../std/vec/struct.Vec.html?search=option<T>%2C%20fnonce%20->%20option<T>&filter-crate=std
[optionfilter]: ../../std/vec/struct.Vec.html?search=option<T>%2C+(fnonce+(T)+->+bool)+->+option<T>&filter-crate=std
[optionfilter2]: ../../std/vec/struct.Vec.html?search=option<T>%2C+(T+->+bool)+->+option<T>&filter-crate=std
[`option -> default`]: ../../std/vec/struct.Vec.html?search=option%20-%3E%20default&filter-crate=std
[`any -> !`]: ../../std/vec/struct.Vec.html?search=any%20-%3E%20!&filter-crate=std
[stdoutu8]: ../../std/vec/struct.Vec.html?search=stdout%2C%20[u8]&filter-crate=std
Expand Down Expand Up @@ -151,16 +153,26 @@ will match these queries:

But it *does not* match `Result<Vec, u8>` or `Result<u8<Vec>>`.

To search for a function that accepts a function as a parameter,
like `Iterator::all`, wrap the nested signature in parenthesis,
as in [`Iterator<T>, (T -> bool) -> bool`][iterator-all].
You can also search for a specific closure trait,
such as `Iterator<T>, (FnMut(T) -> bool) -> bool`,
but you need to know which one you want.

[iterator-all]: ../../std/vec/struct.Vec.html?search=Iterator<T>%2C+(T+->+bool)+->+bool&filter-crate=std

### Primitives with Special Syntax

| Shorthand | Explicit names |
| --------- | ------------------------------------------------ |
| `[]` | `primitive:slice` and/or `primitive:array` |
| `[T]` | `primitive:slice<T>` and/or `primitive:array<T>` |
| `()` | `primitive:unit` and/or `primitive:tuple` |
| `(T)` | `T` |
| `(T,)` | `primitive:tuple<T>` |
| `!` | `primitive:never` |
| Shorthand | Explicit names |
| ---------------- | ------------------------------------------------- |
| `[]` | `primitive:slice` and/or `primitive:array` |
| `[T]` | `primitive:slice<T>` and/or `primitive:array<T>` |
| `()` | `primitive:unit` and/or `primitive:tuple` |
| `(T)` | `T` |
| `(T,)` | `primitive:tuple<T>` |
| `!` | `primitive:never` |
| `(T, U -> V, W)` | `fn(T, U) -> (V, W)`, `Fn`, `FnMut`, and `FnOnce` |

When searching for `[]`, Rustdoc will return search results with either slices
or arrays. If you know which one you want, you can force it to return results
Expand All @@ -180,6 +192,10 @@ results for types that match tuples, even though it also matches the type on
its own. That is, `(u32)` matches `(u32,)` for the exact same reason that it
also matches `Result<u32, Error>`.

The `->` operator has lower precedence than comma. If it's not wrapped
in brackets, it delimits the return value for the function being searched for.
To search for functions that take functions as parameters, use parenthesis.

### Limitations and quirks of type-based search

Type-based search is still a buggy, experimental, work-in-progress feature.
Expand Down Expand Up @@ -218,9 +234,6 @@ Most of these limitations should be addressed in future version of Rustdoc.

* Searching for lifetimes is not supported.

* It's impossible to search for closures based on their parameters or
return values.

* It's impossible to search based on the length of an array.

## Item filtering
Expand All @@ -237,19 +250,21 @@ Item filters can be used in both name-based and type signature-based searches.

```text
ident = *(ALPHA / DIGIT / "_")
path = ident *(DOUBLE-COLON ident) [!]
path = ident *(DOUBLE-COLON ident) [BANG]
slice-like = OPEN-SQUARE-BRACKET [ nonempty-arg-list ] CLOSE-SQUARE-BRACKET
tuple-like = OPEN-PAREN [ nonempty-arg-list ] CLOSE-PAREN
arg = [type-filter *WS COLON *WS] (path [generics] / slice-like / tuple-like / [!])
arg = [type-filter *WS COLON *WS] (path [generics] / slice-like / tuple-like)
type-sep = COMMA/WS *(COMMA/WS)
nonempty-arg-list = *(type-sep) arg *(type-sep arg) *(type-sep)
nonempty-arg-list = *(type-sep) arg *(type-sep arg) *(type-sep) [ return-args ]
generic-arg-list = *(type-sep) arg [ EQUAL arg ] *(type-sep arg [ EQUAL arg ]) *(type-sep)
generics = OPEN-ANGLE-BRACKET [ generic-arg-list ] *(type-sep)
normal-generics = OPEN-ANGLE-BRACKET [ generic-arg-list ] *(type-sep)
CLOSE-ANGLE-BRACKET
fn-like-generics = OPEN-PAREN [ nonempty-arg-list ] CLOSE-PAREN [ RETURN-ARROW arg ]
generics = normal-generics / fn-like-generics
return-args = RETURN-ARROW *(type-sep) nonempty-arg-list
exact-search = [type-filter *WS COLON] [ RETURN-ARROW ] *WS QUOTE ident QUOTE [ generics ]
type-search = [ nonempty-arg-list ] [ return-args ]
type-search = [ nonempty-arg-list ]
query = *WS (exact-search / type-search) *WS
Expand Down Expand Up @@ -294,6 +309,7 @@ QUOTE = %x22
COMMA = ","
RETURN-ARROW = "->"
EQUAL = "="
BANG = "!"
ALPHA = %x41-5A / %x61-7A ; A-Z / a-z
DIGIT = %x30-39
Expand Down
40 changes: 39 additions & 1 deletion src/librustdoc/html/render/search_index.rs
Original file line number Diff line number Diff line change
Expand Up @@ -4,6 +4,7 @@ use std::collections::{BTreeMap, VecDeque};
use rustc_data_structures::fx::{FxHashMap, FxIndexMap};
use rustc_middle::ty::TyCtxt;
use rustc_span::def_id::DefId;
use rustc_span::sym;
use rustc_span::symbol::Symbol;
use serde::ser::{Serialize, SerializeSeq, SerializeStruct, Serializer};
use thin_vec::ThinVec;
Expand Down Expand Up @@ -566,6 +567,7 @@ fn get_index_type_id(
// The type parameters are converted to generics in `simplify_fn_type`
clean::Slice(_) => Some(RenderTypeId::Primitive(clean::PrimitiveType::Slice)),
clean::Array(_, _) => Some(RenderTypeId::Primitive(clean::PrimitiveType::Array)),
clean::BareFunction(_) => Some(RenderTypeId::Primitive(clean::PrimitiveType::Fn)),
clean::Tuple(ref n) if n.is_empty() => {
Some(RenderTypeId::Primitive(clean::PrimitiveType::Unit))
}
Expand All @@ -584,7 +586,7 @@ fn get_index_type_id(
}
}
// Not supported yet
clean::BareFunction(_) | clean::Generic(_) | clean::ImplTrait(_) | clean::Infer => None,
clean::Generic(_) | clean::ImplTrait(_) | clean::Infer => None,
}
}

Expand Down Expand Up @@ -785,6 +787,42 @@ fn simplify_fn_type<'tcx, 'a>(
);
}
res.push(get_index_type(arg, ty_generics, rgen));
} else if let Type::BareFunction(ref bf) = *arg {
let mut ty_generics = Vec::new();
for ty in bf.decl.inputs.values.iter().map(|arg| &arg.type_) {
simplify_fn_type(
self_,
generics,
ty,
tcx,
recurse + 1,
&mut ty_generics,
rgen,
is_return,
cache,
);
}
// The search index, for simplicity's sake, represents fn pointers and closures
// the same way: as a tuple for the parameters, and an associated type for the
// return type.
let mut ty_output = Vec::new();
simplify_fn_type(
self_,
generics,
&bf.decl.output,
tcx,
recurse + 1,
&mut ty_output,
rgen,
is_return,
cache,
);
let ty_bindings = vec![(RenderTypeId::AssociatedType(sym::Output), ty_output)];
res.push(RenderType {
id: get_index_type_id(&arg, rgen),
bindings: Some(ty_bindings),
generics: Some(ty_generics),
});
} else {
// This is not a type parameter. So for example if we have `T, U: Option<T>`, and we're
// looking at `Option`, we enter this "else" condition, otherwise if it's `T`, we don't.
Expand Down

0 comments on commit c87410e

Please sign in to comment.