Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC: Boxes vs Containers - Naming schemes for element access #7887

Closed
Kimundi opened this issue Jul 18, 2013 · 28 comments
Closed

RFC: Boxes vs Containers - Naming schemes for element access #7887

Kimundi opened this issue Jul 18, 2013 · 28 comments

Comments

@Kimundi
Copy link
Member

Kimundi commented Jul 18, 2013

Right now there is much confusion and inconsistency about the naming of functions that deal with elements in types that wrap one or more of them. We need to find a consistent way to deal with that.

First, some terminology:

  • Container - Can contain at most N elements (with N effectively infinite).
    Examples: Vector, Hash Map, Linked List.
  • Box - Can contain at most 1 element.
    Examples: Option, Cell, ~T, ARC.

I'm also defining a elementary set of operations you can perform on containers and boxes in regard to an arbitrary chosen element E (named in all-caps to differentiate from actual function names below):

  • REF: Return a reference to the element - (&'a self) -> &'a E
  • REF_MUT: Return a mutable reference to the element - (&'a mut self) -> &'a mut E
  • TAKE: Return a element, removing it from the type - (&mut self) -> E
  • UNWRAP: Return a element, destroying the type in the process - (self) -> E

In regard to naming schemes, I see a number of possibilities:

Unify boxes and containers

This would be the globally most consistent and reusable way, but also the most ugly one. We define a set of functions for working with an arbitrary 'next' element on an container. For example in stack-terminology:

  • fn top(&'a self) -> &'a E - REF, fail if empty
  • fn peek(&'a self) -> Option<&'a E> - REF or None
  • fn top_mut(&'a mut self) -> &'a mut E - REF_MUT, fail if empty
  • fn peek_mut(&'a mut self) -> Option<&'a mut E> - REF_MUT or None
  • fn pop(&mut self) -> E - TAKE, fail if empty
  • fn pop_opt(&mut self) -> Option<E> - TAKE or None
  • fn pop_consume(self) -> E - UNWRAP, fail if empty
  • fn pop_consume_opt(self) -> Option<E> - UNWRAP or None.

Separate boxes and containers, consistent box naming scheme

Containers and Boxes are separate concepts. Containers are build around a terminology and operations that make most sense to them. They probably wouldn't support UNWRAP and might not support the others either, depending on their specific semantics.

Boxes gain their own more concise and strictly consistent terminology, based around the box metaphor:

  • "Get a reference to the content, but keep it in the box"
    • fn get(&'a self) -> &'a E - REF, fail if empty
    • fn get_opt(&'a self) -> Option<&'a E> - REF or None
    • fn get_mut(&'a mut self) -> &'a mut E - REF_MUT, fail if empty
    • fn get_mut_opt(&'a mut self) -> Option<&'a mut E> - REF_MUT or None
  • "Take the content out of the box, keeping the box intact but empty"
    • fn take(&mut self) -> E - TAKE, fail if empty
    • fn take_opt(&mut self) -> Option<E> - TAKE or None
  • "Destructively unwrap the box to get to the content"
    • fn unwrap(self) -> E - UNWRAP, fail if empty
    • fn unwrap_opt(self) -> Option<E> - UNWRAP or None.

Separate boxes and containers, convenient box naming scheme

This one is quite similar to the prior one, but shuffles names and definitions around to provide a nicer set of function names for the common operations:

  • "Get a reference to the content, but keep it in the box"
    • fn get(&'a self) -> &'a E - REF, fail if empty
    • fn get_opt(&'a self) -> Option<&'a E> - REF or None
    • fn get_mut(&'a mut self) -> &'a mut E - REF_MUT, fail if empty
    • fn get_mut_opt(&'a mut self) -> Option<&'a mut E> - REF_MUT or None
  • "Take the content out of the box if any, keeping the box intact but empty"
    • fn take(&mut self) -> Option<E> - TAKE or None
  • "Destructively unwrap the box to get to the content"
    • fn unwrap(self) -> E - UNWRAP, fail if empty

The last one is probably the most convenient as it's optimized for common "I know it is not None" Option usecases:

  • opt.get() - " Do something with the content of a Option, but keep it in place."
  • opt.unwrap() - "Move the content out of a Option, throw the Option away"
  • opt.clone().unwrap() or opt.get().clone() - "Get a copy of the content"
  • opt.take().unwrap() - "Move the content out of a Option, replace the Option with None"

Hm, this writeup is only half as comprehensive and cohesive as I wanted it to be, I hope it's still understandable.

@thestinger
Copy link
Contributor

I think it's silly to have redundant functions/methods doing fail!() internally every time there's a function/method returning Option. In many cases it doubles the size of the API, making it harder to maintain and learn.

It would be much simpler to just use the unwrap method on Option as necessary.

@Kimundi
Copy link
Member Author

Kimundi commented Jul 21, 2013

@thestinger: I agree in principle, but the problem here is that methods on Option that return an Option for the element are kinda useless... If we only have unwrap(), and have to chain that to every other possible call its going to be cumbersome to use.

@thestinger
Copy link
Contributor

I'm talking about everything returning an Option that's not an Option itself. The methods should only have to be defined once on Option, we really don't need get, expect and so on to go along with methods like find on every Map, and have them in every variety find comes in.

@bblum
Copy link
Contributor

bblum commented Jul 23, 2013

I've been mulling this over. My preference at this point is to do away with the "ref" and "consume" suffixes, and make "get" mean reference, and "unwrap" mean by value. So, today's option get becomes unwrap, unwrap stays the same, get_ref turns into get, map stays as map, map_consume turns into map_unwrap, and consume_iter turns into unwrap_iter.

@bluss
Copy link
Member

bluss commented Jul 24, 2013

bblum, unwrap fails on None, map_unwrap can't do that, it's not really useful.

@bblum
Copy link
Contributor

bblum commented Jul 24, 2013

My thought about that was "unwrap means might fail", but "map means won't fail" with higher precedence. I dunno, maybe it is better to keep that as consume, but I'd like to see one or the other word go away. Would it be any better to replace all "unwrap" with "consume"?

@erickt
Copy link
Contributor

erickt commented Jul 24, 2013

@bblum: I think we should prefer functions like .map() to consume self if
we can, and let the caller use .clone() if they need a copy. That feels
more rust-y to me. I've got an old patch series waiting for the recently
landed fix for moving out of self that implements this for options,
results, and vecs. I'll look into reviving it.

On Tue, Jul 23, 2013 at 8:01 PM, Ben Blum notifications@github.com wrote:

My thought about that was "unwrap means might fail", but "map means won't
fail" with higher precedence. I dunno, maybe it is better to keep that as
consume, but I'd like to see one or the other word go away. Would it be any
better to replace all "unwrap" with "consume"?


Reply to this email directly or view it on GitHubhttps://github.com//issues/7887#issuecomment-21460848
.

@erickt
Copy link
Contributor

erickt commented Jul 24, 2013

@bblum: looking back at my patch, I also added .map_ref() in the case you didn't want to pay the cost to clone the container. Hows does that pattern sound?

@bblum
Copy link
Contributor

bblum commented Jul 24, 2013

That (foo and foo_ref) was my original leaning, but then I thought about .iter(). The vast majority of the time, (I expect) people will want ref_iter() instead of consume_iter(). But maybe the syntax sugar will save us in that case?

The other qualm I had with that idea is that ARC unwrap should have a longer, more ominous name than get. But maybe we can keep the name unwrap for Option and for ARC, as a special case of implicit-consume that additionally means "might fail, be careful".

@bluss
Copy link
Member

bluss commented Jul 24, 2013

Keep unwrap for the case where you get "the whole content" out of a thing. I agree .map, .map_mut sound good for mapping by reference. What about .map_value for the consume case? (EDIT: I talk only about Boxes, like Option).

Kimundi, I don't really agree with any of your alternatives. Separate Option from the containers. Option is a building block for the latter.

@Kimundi
Copy link
Member Author

Kimundi commented Jul 24, 2013

A note to map(), map_mut() etc. Could we not just use an external iterator for that? Once it works properly it would end up as box.map() and box.iter_mut().map()

@bluss
Copy link
Member

bluss commented Jul 24, 2013

Not for Option, since it's so central, it needs a lot of sugar

@erickt
Copy link
Contributor

erickt commented Jul 24, 2013

The challenge which approach do we pick for the shortest name? I converted .map() to consume because I found most use cases could be implemented with .map_consume() and I figured I'd change the shorter name to be the more efficient one. But since reference iterators are more general, it makes sense to keep .iter() returning references.

What if we made the container's .map() consume, and use .iter().map() if you want the references? With mostly-working default methods, we could lift these implementations into some traits like this:

trait ConsumableIterator<T> {
    fn map<U>(&self, f: &fn(T) -> U) -> ConsumableIterator<U>;
}

trait FromIterator<T> {
    static fn from_consume_iter(i: ConsumableIterator<T>) -> Self;
}

trait Mappable<T>: ConsumableIterator<T> {
    fn map<U, Container: FromIterator<U>>(self, f: &fn(T) -> U) -> Container<U> {
        FromIterator::from_consume_iter(self.consume_iter().map(f))
}

impl<T> Mappable<T> for ~[T] {}

@thestinger
Copy link
Contributor

I don't think consuming is usually what you want for containers, and lots of them won't be able to do it.

@erickt
Copy link
Contributor

erickt commented Jul 24, 2013

@thestinger: How so? I'm using the term consume in this case to mean transferring the ownership of the contained elements from one container to another. Any container that allows for removing of elements (e.g. not bloom filters) should be able to support this kind of operation.

@thestinger
Copy link
Contributor

Consume would often be O(nlogn) with a tree instead of O(n), and not possible with persistent containers.

I'm not sure if you would be able to write consume for a container like a list (chunked or not), without unsafe code that's likely to break if drop flags are removed.

@erickt
Copy link
Contributor

erickt commented Jul 24, 2013

@thestinger: By O(nlogn), are you counting both the deconstruction and reconstruction of a container? If just deconstruction, we can implement tree consumers in Rust in O(n) if take advantage of unique pointers. It allows us to violate the tree algorithm invariants to unfold the tree structure. Here's a simple O(2n) version to consume the tree:

enum Tree<T> {
    Node(T, ~Tree<T>, ~Tree<T>),
    Empty
}

impl<T> Tree<T> {
    fn consume_iter(self) -> ConsumeIterator<T> {
        fn aux(node: Tree<T>, mut values: ~[T]) {
            match node {
                Empty => values,
                Node(value, left, right) => {
                    values = aux(left);
                    values.push(value);
                    values = aux(right);
                    values
                }
            }
        }
        aux(self, ~[]).consume_iter()
    }
}

It wouldn't be that hard to optimize away values, but we'd still need a stack to track where we are in the tree.

Regarding persistent containers, I totally agree, and it makes total sense for their .map() to return references.

Regarding writing consume for a mutable singly linked list, you would need an intermediate structure that stores the consumed items in reverse order. All you need to do then is unreverse them. It'd also be O(2n). We could use dark magic to reverse the pointer directions to get it down to O(n) but I'm not sure if it'd be worth the unsafety.

@thestinger
Copy link
Contributor

@erickt: I was thinking about the complexity of consuming and building up a new tree compared to just doing the operation in-place. Default methods like fold would likely be based on iter and I think by-reference is usually going to be desired.

@graydon
Copy link
Contributor

graydon commented Jul 25, 2013

I added a consume iterator to treemap recently.

@graydon
Copy link
Contributor

graydon commented Jul 25, 2013

(no unsafe, o(n), very tidy!)

@graydon
Copy link
Contributor

graydon commented Jul 25, 2013

More to the point of the bug, I think i prefer no suffix means by-ref, we use the suffix _move rather than take, consume or unwrap, and partial functions are funneled into either conditions or options with an _opt suffix.

@erickt
Copy link
Contributor

erickt commented Jul 25, 2013

@graydon: So .map_move()? I can live with that.

Speaking towards the bug, do we need partial forms of functions like .take()? We're starting to hit a combinatorial explosion of different function variants: by-ref / by-move / mut / default / map / expect. Instead I'd prefer if we just have up to 3 forms when necessary: by-move, by-immutable-ref, and by-mutable-ref. If we want a partial result, we can call Some(~"foo").map(|x| x.to_str()).expect(~"I expected some foo"). It does result in an extra Option allocation, but that should be both relatively cheap, and easy to optimize away with enough #[inline] attributes.

@Kimundi
Copy link
Member Author

Kimundi commented Jul 25, 2013

Isn't take_unwrap() used quite a lot in the codebase? Having an explicit take() would allow to construct that call as take().unwrap().

@nikomatsakis
Copy link
Contributor

One observation: Persistent collections will not be able to support a "consume" operation.

@Kimundi
Copy link
Member Author

Kimundi commented Jul 25, 2013

@nikomatsakis: Yeah, I figured you'd need a separate trait for each of these functions.

bors added a commit that referenced this issue Aug 7, 2013
According to #7887, we've decided to use the syntax of `fn map<U>(f: &fn(&T) -> U) -> U`, which passes a reference to the closure, and to `fn map_move<U>(f: &fn(T) -> U) -> U` which moves the value into the closure. This PR adds these `.map_move()` functions to `Option` and `Result`.

In addition, it has these other minor features:
 
* Replaces a couple uses of `option.get()`, `result.get()`, and `result.get_err()` with `option.unwrap()`, `result.unwrap()`, and `result.unwrap_err()`. (See #8268 and #8288 for a more thorough adaptation of this functionality.
* Removes `option.take_map()` and `option.take_map_default()`. These two functions can be easily written as `.take().map_move(...)`.
* Adds a better error message to `result.unwrap()` and `result.unwrap_err()`.
erickt added a commit to erickt/rust that referenced this issue Aug 10, 2013
bors added a commit that referenced this issue Aug 10, 2013
This PR does a bunch of cleaning up of various APIs. The major one is that it merges `Iterator` and `IteratorUtil`, and renames functions like `transform` into `map`. I also merged `DoubleEndedIterator` and `DoubleEndedIteratorUtil`, as well as I renamed various .consume* functions to .move_iter(). This helps to implement part of #7887.
erickt added a commit to erickt/rust that referenced this issue Sep 11, 2013
@brson
Copy link
Contributor

brson commented Sep 17, 2013

Does anybody have a succinct explanation of the current consensus on this subject for https://github.com/mozilla/rust/wiki/Doc-detailed-release-notes?

@Kimundi
Copy link
Member Author

Kimundi commented Sep 20, 2013

If we do #9355 (which I'd be totally in favor for), it would pretty much reduce all discussion here to get vs unwrap.

I'm kinda embarrassed that I never got the idea that you might not need three different versions of accessors for Option... ;)

@thestinger
Copy link
Contributor

Many of the issues have been dealt with by a3b04c1 and the work towards consistency in the codebase. I've opened #9784 about get vs. unwrap and I think we can tackle any other issues in independent bugs.

osaut pushed a commit to osaut/rust that referenced this issue Oct 31, 2013
This is an alternative version to rust-lang#8268, where instead of transitioning to `get()` completely, I transitioned to `unwrap()` completely.

My reasoning for also opening this PR is that having two different functions with identical behavior on a common datatype is bad for consistency and confusing for users, and should be solved as soon as possible. The fact that apparently half the code uses `get()`, and the other half `unwrap()` only makes it worse.

If the final naming decision ends up different, there needs to be a big renaming anyway, but until then it should at least be consistent.

---

- Made naming schemes consistent between Option, Result and Either
- Lifted the quality of the either and result module to that of option
- Changed Options Add implementation to work like the maybe Monad (return None if any of the inputs is None)  
  See rust-lang#6002, especially my last comment.
- Removed duplicate Option::get and renamed all related functions to use the term `unwrap` instead  
  See also rust-lang#7887.

Todo: 

Adding testcases for all function in the three modules. Even without the few functions I added, the coverage wasn't complete to begin with. But I'd rather do that as a follow up PR, I've touched to much code here already, need to go through them again later.
flip1995 pushed a commit to flip1995/rust that referenced this issue Nov 4, 2021
…ne, r=flip1995

Remove hidden code lines in Clippy's lint list

This PR removes code lines from Clippy's lint list, which would also be hidden, when generating docs with rustdoc.

"A picture is worth a thousand words"... and here are even two pictures:

**Before:**

![image](https://user-images.githubusercontent.com/17087237/138952314-676dd9e7-ee80-459e-b521-bc42d8d03517.png)

**After:**

![image](https://user-images.githubusercontent.com/17087237/138952684-4fef969d-6880-4434-a338-b1a5a45539f0.png)

---

changelog: none

r? `@camsteffen` (Since you implemented the code block filtering 🙃 )
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

8 participants