Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Style guide for base library #8

Closed
chenyan-dfinity opened this issue Apr 22, 2020 · 15 comments
Closed

Style guide for base library #8

chenyan-dfinity opened this issue Apr 22, 2020 · 15 comments
Assignees

Comments

@chenyan-dfinity
Copy link
Contributor

chenyan-dfinity commented Apr 22, 2020

I think we need a style guide for base library. There are already some inconsistencies in the existing modules.

Several items I would like to discuss here:

  • Do we want to provide an OO style interface for all containers?
    List doesn't have a class interface. Trie provides a class interface in a separate file TrieMap.
  • For the modules that have class interface, which methods goes inside the class, and which stays outside? Is there a cost for having too many methods inside a module/class?
  • How do we handle containers that require "type class", e.g. Hash, Ord, Eq.
    We use public class Container<T>(ord : (T, T) -> Bool) for taking the ord relation. Does this mean that all the methods should stay inside the class. Because methods outside of the class will need to take both ord and Container as arguments, and this can lead to inconsistency.
  • Module naming convention. Each module usually has a type and a class constructor, e.g. type Container<T> = ...; public class MakeContainer<T>(eq). The usual convention is to use t for value type and mkContainer for constructor. But t<T> looks odd, and we usually name the constructor as a type name.
    The code then looks like let buf : Buf.Buf<Int> = Buf.Buf<Int>(10). Does let buf : Buf.t<Int> = Buf.mkBuf<Int>(10) feel better?
  • Method naming convention. We need a consistent naming convention and type signature for common ADT methods, e.g. add vs insert; remove vs delete; does add return a value; does remove trap for empty container; does remove return the removed elements; etc.
  • Multiple implementation for the same interface. Do we have a preference for imperative vs functional implementation, or we may want both in some cases? How about different data structures for the same interface, e.g. red-black tree vs splay tree. Do we want an interface file for each container, i.e. an object type that lists all method signatures.
  • Guidelines about testing and documentation. What needs to go in the comments/documentation, e.g. time/space complexity. Can we have a testing framework that works on all container modules.
@rossberg
Copy link
Contributor

Thanks @chenyan-dfinity for starting this discussion. I agree.

I think OO vs procedural is the most difficult question to answer. Previously I would have tended towards OO, not because I particularly like it (I don't), but because it's a much better fit for our target audience. And because it's the only way currently to provide encapsulation.

But stable variables may change this picture, since we cannot currently allow objects in there. OTOH, it is too early to change directions for the sake of stable (we need some practical proof of their viability first).

A compromise might be to have classes as wrappers around procedural interfaces for everything, but the duplication may also be confusing.

What does everybody think?

@rossberg
Copy link
Contributor

rossberg commented Apr 23, 2020

A few more random thoughts, independent of OO vs PP:

Naming:

  • All types should be capitalised, values (including variant tags) shouldn't.
  • Classes define types, so they should never be named MakeX but just X.
  • Most of the times, data structure types should be named after their functionality, not their implementation (e.g., OrderedMap vs RedBlackTreeMap) -- don't assume that our users know much about data structures.
  • Obviously, analogous functions should have the same names across types.

Types:

  • Mutators should generally return (). In cases where they could e.g. return the old value, that should be provided by a separate function.
  • Orderings should return an Order type, not Bool.
  • For now let's not bother with defining signature types for abstractions to allow multiple implementations of the same thing.

Data structures:

  • In some cases it's useful to have both functional and imperative versions (sets, ordered maps, queues), in others it isn't (hash maps).
  • For non-class types, we could handle the parameterisation issue (functors in ML, type classes in Haskell) by emulating functors, i.e., having a function returning a module. Have we tried? Is it worth it?

Should we take clues from Scala's collection library? Minus the overengineering, of course, more wrt to conventions.

@matthewhammer
Copy link
Contributor

matthewhammer commented Apr 23, 2020

What does everybody think?

I think that we do need a style guide, and that sometime soon, the base library should conform to it. However, as we are discussing and realizing, the best style is not always so clear. Thankfully, in many cases above, it is.

All of the stuff that Andreas lists above as "Naming" and "Types" seems reasonable to me, and to be consistent with what we've been doing (or aiming to do).

However, for "Data structures", things are less clear, at least to me, currently.

I agree with the first point (sometimes we want both FP and OO interfaces, and sometimes we only want an OO interface).

For things that seem more like functors but that are not classes (so, not HashMap), what concrete examples are you considering @rossberg? I was thinking about the purely functional Trie API, or any other set/map interface that's pure. I am also thinking about things like graphs (maps with even more structure).

In those cases, we often want a type with associated operations, like the param to a functor. I have not tried writing a function that accepts a module, but I have tried doing some things this week (with Claudio's help) involving this pattern, where one module is parameterized by the types and operations of another. (I'm trying to make the Adapton graph code separate from the "CleanSheets" DSL evaluator).

Even using a class is painful, since as it stands, the internal representation of this functor needs to be lifted out, for various reasons about the current limitations of type variable capture. I may be misconnecting my current experience to this point, but don't think so. (Perhaps @crusso can correct me if this is not connected.) I expect that using a function will have a similar issue, since classes are merely that. (Right?)

@matthewhammer
Copy link
Contributor

A compromise might be to have classes as wrappers around procedural interfaces for everything, but the duplication may also be confusing.

If I understand this suggestion, then the Red Black tree module follows it, within a single module.

Perhaps we should do a merging of Trie and TrieMap modules to mirror this pattern?

@rossberg
Copy link
Contributor

Ah, you are right about the type capture issue. Damn, I think we need to get back to and improve our nested types semantics...

I don't necessarily think that the OO and PP interface need to be in the same module. It's probably less confusing if they aren't.

@chenyan-dfinity
Copy link
Contributor Author

chenyan-dfinity commented Apr 24, 2020

Classes define types, so they should never be named MakeX but just X.

I never quite understand class in Motoko. It is used as both a type and a constructor, sometimes even as a functor. If we are using class as a functor, we should probably name it makeX?

Following the discussion, I have a PR to refactor Heap: #11. And I feel the Heap class I am writing there looks like a functor. I also left some questions there, feel free to leave comments.

@matthewhammer
Copy link
Contributor

matthewhammer commented Apr 24, 2020

@chenyan-dfinity I also want to discuss this question ("how do we do functors in Motoko?, always with a class of some form, and if so, what are the patterns?").

I'm creating a companion issue to this one for that tangent #13

@rossberg
Copy link
Contributor

@chenyan-dfinity, classes are intended to mirror typical OO-classes (in as simple a way as possible). Emulating functors doesn't really work with them until we relax our restriction on free type vars in type defs. But classes aren't meant to be functors anyway.

True functors would take and return a module. They would require dependent type paths and abstract type members. Not currently a priority. :)

@matthewhammer
Copy link
Contributor

Orderings should return an Order type, not Bool.

@rossberg One clarification: All such orderings are total (not partial?), right?

(So that Order type is always isomorphic to the 3-way sum type () + () + ())

@rossberg
Copy link
Contributor

rossberg commented May 5, 2020

@matthewhammer, yes, I would e.g. define it as type Order = {#less; #equal; #more}.

In cases where partial orders arise, maybe the natural type to use would be ?Order.

@chenyan-dfinity
Copy link
Contributor Author

hmm, ?Order is not a well-formed partial order. We need either ?{#less; #equal} or Bool to represent partial orders. Bool feels more natural to me, but it doesn't distinguish between strict/non-strict partial orders. Which one do you prefer?

Also for priority queue, I assume we want to take a partial order, so that heapsort becomes a topological sort.

@rossberg
Copy link
Contributor

rossberg commented May 6, 2020

@chenyan-dfinity, I don't understand, why is ?Order not the partial analogue to what Order is for total orders?

@chenyan-dfinity
Copy link
Contributor Author

ah, okay. Total order has the same problem: if Ord(a,b)=#less, I would expect Ord(b,a)=#more. But the type system doesn't prevent you from saying something else. So we are assuming the user providing a properly formed total order function then.

For partial order, if we return Bool or ?{#less; #equal}, it will always be well-formed, and we never need to think about the symmetric case of #more.

@rossberg
Copy link
Contributor

rossberg commented May 6, 2020

Well, the type system also cannot prevent an order that isn't transitive. Or not reflexive. Or one that returns true one time and false the next. So I don't know if symmetry is a property particularly worth worrying about.

The problem with a Bool is that you typically need twice as many comparisons to figure out the case you're in. If they are expensive that's undesirable.

@rossberg
Copy link
Contributor

rossberg commented Jun 5, 2020

Okay, I switched back from extract to remove. To make it a bit more symmetric, I also changed swap to replace.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants