Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Discriminate Unions As An API #726

Open
sgtz opened this Issue Mar 22, 2019 · 2 comments

Comments

Projects
None yet
2 participants
@sgtz
Copy link

sgtz commented Mar 22, 2019

Extending Discriminate Unions As An API

Alternate Title: Extending Algebraic Data Types As An API

This proposal is about taking the Active/Partially Active Patterns as-a-view idea further, extending the DU to be an API (see the book Expert F#). In short, an Active / Partially Active pattern can be used to give match based semantics over any structure, but although flexible/very useful, the mechanism is limited in two ways:

  1. Active Patterns can handle a maximum of 7 cases. This makes exhaustive, complete case analysis much more difficult. While a pattern of nesting (even deep nesting), can still overcome this limitation, this becomes increasingly awkward and impractical.
  2. Partial Active Patterns force you to compromise on compiler assisted case analysis.

One really useful use of the DU as an API technique would be to access equivalent data structures without the need to reconfiguring and reformat their contents. What if you could receive a structure via TCP/IP and not immediately unwrap and place the message contents on the heap, but because the program understands that this different yet structurally equivalent memory layout, it could just access the contents in situ? I strongly suspect there are additional uses for this technique.

references:

  • GADT's in OCaml/Haskell seem to have a similar motivation.
  • I'd also like to reference Adam Sitnik's work on Span as part of the inspiration here (Span landed in the CLR's C# and F# semi-recently).

Without fully fleshing out what the implementation should look like, DUs in F# are implemented as a class with inner classes, with each class being an enum/case/tagged union (pick a word for depending on how you think about them). Instead of just inner classes, I propose that we also have the option of having inner interfaces, which then enables these rich "view" based potentials described above.

Possibly this is a feature that could co-exist with GADT's.

The existing way of approaching this problem in F# is ...

A discriminated union of function signatures

Pros and Cons

The advantages of making this adjustment to F# are ...

flexibility for library authors who love the flexibility of DUs, who might build up a framework + function libraries, and then see additional patterns that these functions could wrap themselves around without alteration.

All of the neat pattern matching would still be available. Simplicity would stay central to the program. Dare I say it in a functional context -- why not -- this would be a judicious use of the principle of information hiding (that thing many of us try to disavow ;) ).

The disadvantages of making this adjustment to F# are ...

extra complexity. A possible performance hit if used naively because interfaces are slower than concrete implementations. Extra syntactic sugar complexity? On the surface, it seems like a small shift. In practice, it could be more complex to implement.

The F# masses could get overexcited about this feature and it'd lead to the unravelling of the entire CLR ecosystem as we know it. Seriously though, it's something to use sparingly.

Extra information

Estimated cost (XS, S, M, L, XL, XXL): L

Related suggestions: (put links to related suggestions here)

Affidavit (please submit!)

Please tick this by placing a cross in the box:

  • [ X ] This is not a question (e.g. like one you might ask on stackoverflow) and I have searched stackoverflow for discussions of this issue
  • [ X ] I have searched both open and closed suggestions on this site and believe this is not a duplicate
  • [ X ] This is not something which has obviously "already been decided" in previous versions of F#. If you're questioning a fundamental design decision that has obviously already been taken (e.g. "Make F# untyped") then please don't submit it.

Please tick all that apply:

  • [ :) ] This is not a breaking change to the F# language design
  • [ X ] I or my company would be willing to help implement and/or test this

type CD = | C | D; let (|A|B|E|) x = match x with | 1 -> A | 2 -> B(C) | 3 -> B(D) | _ -> E

Yes you can, but don't, especially if you want to work with a central DU. This is a mashup of the idea of a view, and other potentials such as memory layouts. It is not about taking views to the nth degree. Syntactic sugar can hide anything, but we should err towards simplicity. This code snippet is not simple.

The below is an awkward way forward without the feature, but which at the same time is not too far away for how the basic mechanics might work behind the scenes (depending on what the interfaces referred to of course).

type FI32 = unit -> int
type FAI32 = unit -> int array
    
type AST = 
  | I32 of FI32
  | AI32 of FAI32
  | ASTList of AST list  

module AST =  
  let mkI32 (a : obj array) (i : int) = I32 <| (fun _ -> a.[i] :?> int)
  let mkAI32 (a : obj array) (i : int) = AI32 <| (fun _ -> a.[i] :?> int array)
  let flatConstruct a =
    let rec flatConstruct (a: obj array) i : AST list =
      match i,a.Length with 
      | i,c when i = c -> []
      | _ -> 
          match a.[i] with 
          | :? int -> mkI32 a i 
          | :? array<int> -> mkAI32 a i 
          | _ -> failwith "oops"
          :: flatConstruct a (i+1)
    ASTList <| flatConstruct a 0

  let tostr a = 
    let rec tostr a i =
      match a with 
      | I32 x -> sprintf "\n%s%i" (String.replicate (i*2) " ") (x()) 
      | AI32 x -> sprintf "\n%s%A" (String.replicate (i*2) " ") (x())
      | ASTList([]) -> ""
      | ASTList((ASTList _ as h)::t) -> tostr h (i+1)
      | ASTList(h::t) -> (tostr h i) + (tostr (ASTList(t)) i)  
    tostr a 0

let contrived_stream_of_bytes : obj array = [|3 :> obj;[|0;1;2;3|] :> obj|]
    
let a = AST.flatConstruct [|1 :> obj; [|10;11;12|] :> obj |] 

AST.tostr a

let b = AST.flatConstruct contrived_stream_of_bytes

AST.tostr b

The beauty (until I believe otherwise) of also using an interface as a getter framework instead of "unit -> some value" means that the value could be somewhere else, or it could be available concretely (like the way DUs work currently).

The Story So Far + Acknowledgments

I wanted to demonstrate the idea, and thought up a low language impact hacktivist approach by using John Azarius's C# DU via an F# DSL, with a twist that code gen was consumable by F#. I just need to implement structural equality, etc, and I'm done, right? How could I get F# to recognise John's generated C# DU? (refs will go here). Thanks, John for that chat about this!

After kicking the tires, Don Syme let me know that hacking in a change like this wasn't going to work. Don coined the term DU as an API. Thanks Don!

After describing what I wanted to do to Yaron Minsky, he replied, "oh... we use Generalised Algebraic Data Types for that." So, after a period of language envy, I've gone from hacktivism to wondering if the first step is to do something in a more language-centric way. Thanks for your well-timed comment Yaron! Invaluable.

Note: Richard Minich has already proposed GADTs for F#. I just feel that this approach may also be of benefit to F# within a CLR + C# friendly environment (which, through watching from the sidelines, I believe Don Syme has gone to great lengths to maintain).

@sgtz sgtz changed the title A Discriminate Union As An API Discriminate Unions As An API Mar 22, 2019

@7sharp9

This comment has been minimized.

Copy link
Member

7sharp9 commented Apr 5, 2019

On the first read this sounds like single case span active patterns, then on another read something else, I think some form of pseudo code or syntax is needed to describe this more.

In the case of my first read through a performance enhanced version of the single case pattern has merit. e.g having a non allocating view of a single case deconstruction of a series of bytes etc.

@sgtz

This comment has been minimized.

Copy link
Author

sgtz commented Apr 5, 2019

@7sharp9 Thanks for requesting some code to illustrate this better. On this cut, I provided some code based on an awkward way forward as a quick way of fleshing this out. All of the examples or talks on Span that I read or saw talked about single data cases. I kept thinking of heterogeneous use cases though (like the recursive DU above). If you squint hard enough, it's a little like another version of the "lazy" keyword. Is this the closest we can come to having a type-safe pointer? Maybe there's an even better scheme, but this way seems flexible. Perhaps it's an acceptable compromise.

"On the first read this sounds like single case span active patterns, then on another read something else"

Yes. Me too! Once the idea settled for me, I have interpreted this a few different ways. I'm trying not to explore the variations, and to stick with the more boring of these: memory layouts. I think there are additional patterns that will make sense though.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.