Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add dedicated method for ranges translation in CE #1116

Open
5 tasks done
gsomix opened this issue Feb 14, 2022 · 8 comments
Open
5 tasks done

Add dedicated method for ranges translation in CE #1116

gsomix opened this issue Feb 14, 2022 · 8 comments

Comments

@gsomix
Copy link

gsomix commented Feb 14, 2022

Addition of InlineIfLambda attribute in F# 6 opened the way for high-performance computation expressions (imagine one for zero allocation ZString builder). But we can do better. I propose we change translation rules to allow optimisations of for-loops in CE.

Consider following code using list.fs builder:

listc { for i = 0 to 10 do i*i }

According to F# spec compiler uses following rules to translate this expression:

T(for x = e1 to e2 do ce, V, C, q) = T(for x in e1 .. e2 do ce, V, C, q) (*)
T(for x in e do ce, V, C, q) = T(ce, V ⊕ {x}, λv.C(b.For(src(e), fun x -> v)), q) (**)

where e1 .. e2 (*) is range operator from standard library that creates sequence of values and For (**) is defined as:

member inline b.For(
    sequence: seq<'TElement>, 
    [<InlineIfLambda>] body: 'TElement -> ListBuilderCode<'T>) 
    : ListBuilderCode<'T> =
    b.Using (sequence.GetEnumerator(), 
        (fun e -> b.While((fun () -> e.MoveNext()), (fun sm -> (body e.Current).Invoke &sm))))

So simple for-loop allocates sequence and calls Using method that might be undesirable in high-performance code. I propose to add new translation rule:

T(for x in e1 .. e2 do ce, V, C, q) = T(for x in range(e1, e2) do ce, V, C, q)

where range denotes b.Range(e1, e2) if builder b contains Range method. Otherwise, range(e1, e2) denotes e1 .. e2.

It allows to implement same optimisation compiler does for loops where

for x in 1..10 do
    printfn $"{x}"

compiles into effective while loop.

Let's draft it! First add Range method:

member inline b.Range(e1: int, e1: int) = Range(Index(e1), Index(e2))

Then, add For overload:

member inline b.For(
    range: Range, 
    [<InlineIfLambda>] body: int -> ListBuilderCode<'T>)  =
    ListBuilderCode<_>(fun sm -> 
        for i = range.Start to range.End do
            (body i).Invoke &sm)

The existing way of approaching this problem in F# is override range operator (..). Surprisingly CE uses operator from the context!

Pros and Cons

The advantage of making this adjustment to F# is allowing more high-performance scenarios for CE.

The disadvantages of making this adjustment to F# are increasing complexity of translations rules and compiler.

Extra information

Estimated cost (XS, S, M, L, XL, XXL): L

Affidavit (please submit!)

Please tick this by placing a cross in the box:

  • This is not a question (e.g. like one you might ask on stackoverflow) and I have searched stackoverflow for discussions of this issue
  • I have searched both open and closed suggestions on this site and believe this is not a duplicate
  • This is not something which has obviously "already been decided" in previous versions of F#. If you're questioning a fundamental design decision that has obviously already been taken (e.g. "Make F# untyped") then please don't submit it.

Please tick all that apply:

  • This is not a breaking change to the F# language design
  • I or my company would be willing to help implement and/or test this

For Readers

If you would like to see this issue implemented, please click the 👍 emoji on this issue. These counts are used to generally order the suggestions by engagement.

@Happypig375
Copy link
Contributor

What about ranges with steps?

@baronfel
Copy link
Contributor

Here's an example of that form: Fantomas AST

And the specific AST nodes for the ForEach (in a modified JSON form for collapsibility):

Minimized AST for step-range for-each expressions
{
    "ForEach": {
        "forDebugPoint": "yes",
        "seqExprOnly": false,
        "isFromSource": true,
        "pat": {
            "Named": {
                "ident": "i",
                "isThisVal": false,
                "accessibility": null
            }
        },
        "enumExpr": {
            "IndexRange": {
                "expr1": {
                    "IndexRange": {
                        "expr1": {
                            "Const": {
                                "Int32": 0
                            }
                        },
                        "expr2": {
                            "Const": {
                                "Int32": 2
                            }
                        }
                    }
                },
                "expr2": {
                    "Const": {
                        "Int32": 100
                    }
                }
            }
        },
        "bodyExpr": {
            "YieldOrReturn": {
                "flags": [
                    true,
                    false
                ],
                "expr": {
                    "Ident": "i"
                }
            }
        }
    }
}

It seems like a ForEach member could be exposed that is provided the start/step/end as well for customized iterations?

@En3Tho
Copy link

En3Tho commented Jun 19, 2022

I think that effectively iterating a collection or a span or range would be a very neat feature for CEs.

I made a few array pool based builders and For is the only place they still allocate I believe. Such feature is definitely a big improvement for perf-oriented CEs

@baronfel
Copy link
Contributor

baronfel commented Jun 19, 2022

I did a very quick spike on this on this branch: baronfel/fsharp - range-ce-member. the Range member + For member overload approach seems to work well (though as noted above we'd need a solution for steps).

I did a bit of manual testing with the following builder and CE:

builder + CE with the new members
open System
open System.Collections.Generic
open Microsoft.FSharp.Quotations

[<Struct; NoEquality; NoComparison>]
type ArrayCollector<'T> =
    [<DefaultValue(false)>]
    val mutable ResizeArray: ResizeArray<'T>

    [<DefaultValue(false)>]
    val mutable First: 'T

    [<DefaultValue(false)>]
    val mutable Second: 'T

    [<DefaultValue(false)>]
    val mutable Count: int

    member this.Add (value: 'T) = 
        match this.Count with 
        | 0 -> 
            this.Count <- 1
            this.First <- value
        | 1 -> 
            this.Count <- 2
            this.Second <- value
        | 2 ->
            let ra = ResizeArray<'T>()
            ra.Add(this.First)
            ra.Add(this.Second)
            ra.Add(value)
            this.Count <- 3
            this.ResizeArray <- ra
        | _ -> 
            this.ResizeArray.Add(value)

    member this.AddMany (values: seq<'T>) =
        if this.Count > 2 then
            this.ResizeArray.AddRange(values)
        else
            // cook a faster iterator for lists and arrays
            match values with 
            | :? ('T[]) as valuesAsArray -> 
                for v in valuesAsArray do
                   this.Add v
            | :? ('T list) as valuesAsList -> 
                for v in valuesAsList do
                   this.Add v
            | _ ->
                for v in values do
                   this.Add v

    member this.AddManyAndClose (values: seq<'T>) =
        this.AddMany(values)
        this.Close()

    member this.Close() =
        match this.Count with 
        | 0 -> Array.Empty<'T>()
        | 1 -> 
            let res = [| this.First |]
            this.Count <- 0
            this.First <- Unchecked.defaultof<_>
            res
        | 2 -> 
            let res = [| this.First; this.Second |]
            this.Count <- 0
            this.First <- Unchecked.defaultof<_>
            this.Second <- Unchecked.defaultof<_>
            res           
        | _ ->
            let res = this.ResizeArray.ToArray()
            this <- ArrayCollector<'T>()
            res

[<Struct; NoEquality; NoComparison>]
type ListBuilderCollector<'T> =
    [<DefaultValue(false)>]
    val mutable Collector: ArrayCollector<'T>

    member sm.Yield(value: 'T) = sm.Collector.Add(value)

    member sm.ToList() = sm.Collector.Close()

type ListBuilderCode<'T> = delegate of byref<ListBuilderCollector<'T>> -> unit

type ListBuilderViaCollector() =

    member inline _.Delay([<InlineIfLambda>] f: unit -> ListBuilderCode<'T>) : ListBuilderCode<'T> =
        ListBuilderCode<_>(fun sm -> (f ()).Invoke &sm)

    member inline _.Zero() : ListBuilderCode<'T> = ListBuilderCode<_>(fun _sm -> ())

    member inline _.Combine
        (
            [<InlineIfLambda>] part1: ListBuilderCode<'T>,
            [<InlineIfLambda>] part2: ListBuilderCode<'T>
        ) : ListBuilderCode<'T> =
        ListBuilderCode<_> (fun sm ->
            part1.Invoke &sm
            part2.Invoke &sm)

    member inline _.While
        (
            [<InlineIfLambda>] condition: unit -> bool,
            [<InlineIfLambda>] body: ListBuilderCode<'T>
        ) : ListBuilderCode<'T> =
        ListBuilderCode<_> (fun sm ->
            while condition () do
                body.Invoke &sm)

    member inline _.TryWith
        (
            [<InlineIfLambda>] body: ListBuilderCode<'T>,
            [<InlineIfLambda>] handler: exn -> ListBuilderCode<'T>
        ) : ListBuilderCode<'T> =
        ListBuilderCode<_> (fun sm ->
            try
                body.Invoke &sm
            with
            | exn -> (handler exn).Invoke &sm)

    member inline _.TryFinally
        (
            [<InlineIfLambda>] body: ListBuilderCode<'T>,
            compensation: unit -> unit
        ) : ListBuilderCode<'T> =
        ListBuilderCode<_> (fun sm ->
            try
                body.Invoke &sm
            with
            | _ ->
                compensation ()
                reraise ()

            compensation ())

    member inline b.Using
        (
            disp: #IDisposable,
            [<InlineIfLambda>] body: #IDisposable -> ListBuilderCode<'T>
        ) : ListBuilderCode<'T> =
        // A using statement is just a try/finally with the finally block disposing if non-null.
        b.TryFinally(
            (fun sm -> (body disp).Invoke &sm),
            (fun () ->
                if not (isNull (box disp)) then
                    disp.Dispose())
        )

    member inline b.For
        (
            sequence: seq<'TElement>,
            [<InlineIfLambda>] body: 'TElement -> ListBuilderCode<'T>
        ) : ListBuilderCode<'T> =
        b.Using(
            sequence.GetEnumerator(),
            (fun e -> b.While((fun () -> e.MoveNext()), (fun sm -> (body e.Current).Invoke &sm)))
        )
    
    member inline b.For(
        range: Range, 
        [<InlineIfLambda>] body: int -> ListBuilderCode<'T>)  =
        ListBuilderCode<_>(fun sm -> 
            for i = range.Start.Value to range.End.Value do
                (body i).Invoke &sm)

    member inline _.Yield(v: 'T) : ListBuilderCode<'T> =
        ListBuilderCode<_>(fun sm -> sm.Yield v)

    member inline b.YieldFrom(source: IEnumerable<'T>) : ListBuilderCode<'T> =
        b.For(source, (fun value -> b.Yield(value)))

    member inline _.Range(start:int, finish: int) = System.Range(Index.op_Implicit start, Index.op_Implicit finish)

    // member _.Quote(y) = y

    // member inline _.Run(code: Expr<ListBuilderCode<'T>>) = 
    //     printfn "%A" code

    member inline _.Run([<InlineIfLambda>] code: ListBuilderCode<'T>) : 'T [] =
        let mutable sm = ListBuilderCollector<'T>()
        code.Invoke &sm
        sm.ToList()


let b = ListBuilderViaCollector()

// b {
//     for x in 0..10 do 
//         printfn $"{x}"
// }

//desugars to

let bs: unit[] = 
    b.Run(
        b.Delay(fun () -> 
            b.For(
                b.Range(0, 10), 
                fun i -> 
                    printfn $"%d{i}"
                    b.Zero()
            )
        )
    )

I had to manually desugar the CE (using the commented-out quote member) to get something I could put into sharplab to see the results of.

Here's what the desugaring ends up compiling to:

range@162 = new Range(0, 10);
range@162-1 = @_.range@162;
int num = range@162-1.Start.Value;
int value = range@162-1.End.Value;
if (value >= num)
{
    while (true)
    {
        object[] array = new object[1];
        array[0] = num;
        PrintfFormat<Unit, TextWriter, Unit, Unit> format = new PrintfFormat<Unit, TextWriter, Unit, Unit, int>("%d%P()", array, null);
        PrintfModule.PrintFormatLineToTextWriter(Console.Out, format);
        num++;
        if (num == value + 1)
        {
            break;
        }
    }
}
bs@196 = sm@182.Collector.Close();

which is a nice, compact while loop, which I think was the intent. If there's interest I'm happy to continue exploring/flesh out this design.

@dsyme
Copy link
Collaborator

dsyme commented Apr 13, 2023

I've marked this as approved-in-principle

@baronfel
Copy link
Contributor

baronfel commented Nov 9, 2023

Just to make sure I understand - the proposed Range member wouldn't need to return a System.Range, would it? It could return whatever type, and then as long as there was a matching For member overload that took that type then everything would line up and compilation would succeed?

If that's the case, then a range expression with a step could be handled in a similar way:

  • extract out the beginning, step, end from the syntax
  • call a new RangeWithStep(beginning, step, end) member that can return any T
  • use the value of that call as an input into a For member call

@TheAngryByrd
Copy link

TheAngryByrd commented Dec 7, 2023

So technically you can kind of do this today. However you can't use the range syntax and must use something like a tuple:

The super relevant bits:

    member inline b.For(
        (start, end_) : struct (int*int),
        [<InlineIfLambda>] body: int -> ListBuilderCode<'T>)  =
        ListBuilderCode<_>(fun sm ->
            for i = start to end_ do
                (body i).Invoke &sm)
...

      b {
         for x in struct (0,100) do
             printfn $"{x}"
        }

SharpLab

So you could make a struct that contains the Start/Step/End values and utilize that as an overload today.

If you still want to control inputs and how they end up to other members, you could reuse the Source member (although there's little documentation on that member currently) to do what the proposed Range member is doing. However it would have to account for the m..n range syntax. Also one of the downsides currently is there must be a corresponding Source member for each input type (so the input for either For/Bind) and it might be worth relaxing that requirement.

@brianrourkeboll
Copy link

brianrourkeboll commented Jul 1, 2024

Hmm, I have a few questions about this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

7 participants