Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Type class vs data type Graph #5

Closed
snowleopard opened this issue Mar 1, 2017 · 7 comments
Closed

Type class vs data type Graph #5

snowleopard opened this issue Mar 1, 2017 · 7 comments
Labels

Comments

@snowleopard
Copy link
Owner

snowleopard commented Mar 1, 2017

At the moment we have type class Graph and data type called Basic that is its direct deep embedding (I also sometimes refer to it as Expr). However, it would be really nice to call the data type Graph too, to add it to Tree, Forest and the rest of the family of nice algebraic data types.

So, I propose the following:

  • The data type Graph lives in Algebra.Graph.Data: Of course we need to convince GHC to add it to containers, but it's really very lightweight. See Type class vs data type Graph #5 (comment).

    data Graph a = Empty | Vertex a | Overlay (Graph a) | Connect (Graph a) deriving Show
  • The type class Graph lives in Algebra.Graph. It also defines the instance Graph Data.Graph.

Presumably, if I'm implementing cool algorithms on the data type Graph, I don't care about Algebra.Graph and don't need to include it.

Alternatively, if I'm coding polymorphic functions, I don't need Algebra.Graph.Data and don't include it.

So, even though having the same name for the type class and data type sounds potentially confusing, they would probably rarely get mixed in the same file, and so can happily live in separate modules.

Having them in different modules also allows you to always be specific through qualified imports, like Data.Graph and Class.Graph, which seems nicer than inventing names like IGraph or GraphLike or what not for the type class.

@snowleopard
Copy link
Owner Author

snowleopard commented Mar 1, 2017

Oh, I just realised that putting it into Data.Graph won't work, because of type Graph = Table [Vertex] :-(

So, maybe Algebra.Graph.Data then.

@ndmitchell
Copy link

I prototyped a potential design here: https://gist.github.com/ndmitchell/800fb7bf65f5875a22e72b1b868e1442

@snowleopard
Copy link
Owner Author

snowleopard commented Mar 4, 2017

@ndmitchell Thanks! I've just played with your code a little and realised that toGraph limits the space of inhabitants of the type class only to instances that are isomorphic to directed graphs (or whatever graphs we choose to have by default according the type class laws), which is unfortunate.

For undirected, transitive, etc. graphs, the result is not immediately unique and one would have to come up with some canonical way to implement toGraph so that, e.g. connect a b and connect b a map to the same result, which is the only sensible behaviour, I think.

And for instances like ToList this method makes little sense, even though it's easy to obtain a good (from the canonicity point of view) implementation, e.g. toGraph = vertices . toList.

What about adding a class ToGraph with a single method? I'm not a fan of having a dozen of single-method classes, but I think it's important to keep the core unlimited in terms of the number of potential inhabitants.

As a more vague comment: I find toGraph a bit fishy, because mathematically it's just an id function for instances that are isomorphic to directed graphs, and it seems to exist purely because of efficiency considerations. Efficiency is important, but I'd prefer if it didn't force us to extend the core.

@snowleopard
Copy link
Owner Author

snowleopard commented Mar 4, 2017

Another argument for turning ToGraph into a separate class: it's not a subclass of Graph! It's only purpose is to turn your graph representation (an edgelist [(a,a)], Data.Graph.Graph, Gr () () from fgl, or anything else) into the Graph datatype, so that it's possible to reuse algorithms supported by the latter.

Note: making Data.Graph.Graph an instance of type class Graph would be a disaster: since it doesn't share common subgraphs, it would be ridiculously inefficient (imagine constructing it using the edges function). But making it an instance of ToGraph is perfectly sensible!

As a side note, we still don't have any cool algorithms on the Graph data type, which makes toGraph a bit premature -- why would we want to convert to Graph now? -- but I do share your belief that at some point it will actually become useful.

@ndmitchell
Copy link

I didn't intend to imply that toGraph must be invertible. If you have a bidirectional graph, and call connection a b, I'd expect toGraph to return Overlay (Connect a b) (Connect b a). I see no reason connect a b and connect b a return the same result. You have laws about how to reason about graphs, and can use them to say two things are equivalent.

Can you give an example of a type that can implement Graph but not toGraph?

Most of the graph algorithms you have written produce a polymorphic g as a result. That is equivalent to producing a Basic graph combined with a toGraph implementation, apart from the algorithm must at least do a complete recopy. With toGraph you get to chose where the copy/convert goes, giving you more efficient code - and you can actually start to compose steps (e.g. removeVertex) into algorithms. From what I see, you've already demonstrated the need for toGraph :) It does enable efficiency, but it seems far more principled that just for efficiency, it's for conversion/compositionality.

@snowleopard
Copy link
Owner Author

snowleopard commented Mar 6, 2017

I see no reason connect a b and connect b a return the same result. You have laws about how to reason about graphs, and can use them to say two things are equivalent.

I think they should return the same result -- otherwise, they will leak implementation details. For example, at the moment we implement the Eq instance of SymmetricRelation by comparing the symmetric closures of the underlying non-symmetric relations. A trivial implementation of toGraph will therefore return different results for connect a b and connect b a. Now imagine that at some point in future we decide to change the implementation and store symmetric relations in a canonical representation where all edges (a, b) are ordered, i.e. a < b. Then suddenly toGraph will return the same result for connect a b and connect b a and some code elsewhere may break as a result.

I therefore think that the following should be a law of toGraph: if x == y then toGraph x == toGraph y. This is the only way to guarantee that implementation details don't leak.

Can you give an example of a type that can implement Graph but not toGraph?

Consider the following classic instance:

newtype Size a = Size { getSize :: Int }

instance Graph (Size a) where
    type Vertex (Size a) = a
    empty       = Size 0
    vertex _    = Size 1
    overlay x y = Size $ getSize x + getSize y
    connect x y = Size $ getSize x + getSize y

It's not inconceivable that we define something like this to calculate the size of graph expressions. You will find it tricky to implement toGraph :-) I can even write a law-abiding Eq instance for it: _ == _ = True, which is practically useless, but still proves the point.

@snowleopard
Copy link
Owner Author

The data type Graph is now defined in the top-level module Algabra.Graph:
http://hackage.haskell.org/package/algebraic-graphs/docs/Algebra-Graph.html

The type class Graph has been demoted to Algabra.Graph.Class:
http://hackage.haskell.org/package/algebraic-graphs/docs/Algebra-Graph-Class.html

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants