Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Minor code clean-ups and performance improvements #906

Merged
merged 2 commits into from Mar 15, 2024

Conversation

Thorium
Copy link
Member

@Thorium Thorium commented Mar 15, 2024

Code clean-up with minor performance optimizations:

No functional changes.

Copy link
Collaborator

@nojaf nojaf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @Thorium, this is quite insightful.

It sparks some interest to detect these things via analyzers.
Before I dwindle in that territory, did you detect these manually? Or use some tooling?

Last question: is any of this noticeable? We don't have a benchmark so I'm not asking you to add that but did you run this before and after? Just curious.

@@ -15,8 +15,12 @@ open FSharp.Formatting.Markdown

module String =
/// Matches when a string is a whitespace or null
[<return: Struct>]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you have any idea under which conditions you would start using this?

  • What is the minimum language version?
  • Always for unit or can this be used for other return types as well?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Struct constructs came with FSharp.Core 6.0.1 so that's the minimum version.
Works with other return types too.


module List =
/// Matches a list if it starts with a sub-list that is delimited
/// using the specified delimiters. Returns a wrapped list and the rest.
let inline (|DelimitedWith|_|) startl endl input =
let inline internal (|DelimitedWith|_|) startl endl input =
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So internal because this is used in other files in the project?
Otherwise private would also work?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just want to limit the scope to this solution: As you know inline makes compiler slower. So if you have hundreds of public inline functions in your solution and someone refers your project via Nuget package, they are going to pay the price with their compilation too, because those inline-functions come available for them too, even when they probably are not aware of them and probably will never need them.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is my reasoning only, and I might be wrong:
If do code like this: myInt |> and now you ask your F# IDE to auto-complete, then when it knows the type of "myInt" then it can give proper function options: Those who take int in this case.

But in case of inline functions, they use statically resolved type parameters.

So if you have a referenced 10s other nuget packages and all have 10s of inline functions, then... all of those might be potentials for your auto-complete. You probably don't want call those, so it's just noise that could be avoided if they'd be internal functions.


open ArgParser

[<Struct>]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The heuristic would be here to add [<Struct>] when the DU only has cases without any fields?

Copy link
Member Author

@Thorium Thorium Mar 15, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, Enums and similar are struct per default, so if you do this:

type MyE =
| A = 1
| B = 2

it's fine as is. But if you do this:

type MyE =
| A 
| B 

...it's not a struct. So basically it'll generate a class of MyE containing a class of A and class of B and an instance of either.

@Thorium
Copy link
Member Author

Thorium commented Mar 15, 2024

FSharpLint can detect the null-check, but it's a problematic project because it seems to break on each new major .NET release (which are often) and it randomly crashes on many other cases. The others were manual.

@Thorium
Copy link
Member Author

Thorium commented Mar 15, 2024

General guidelines about structs (and I don't have good references to point here):

Keep your structs small.

By small I mean a bool or guid or int or int64 or another struct, but not bunch of other classes or interface-implementations of any kind or big lists or anything like that.

Struct is read to memory at once and kept there, so it's not a pointer to somewhere else, and it makes the processing fast and footprint small. The limitation is function parameters, structs are copied and not sent as pointers, which means if you have big structs, then function calls go slow. Which you can avoid by making your structs as ByRef and so on, but then they are mutable and it goes as hack-hack rather than proper enterprise code. But that's why e.g. System.Text.Json is so fast, they use big structs byref and trust to take 100% out of your single core.

If you have a simple object having a few non-class fields, then it's better to be struct. If you have a 1000s of these small objects it will really improve the performance and memory footprint when processing them through.

Option class is not a struct, but there is a struct-corresponding ValueOption. ValueOption has IsSome and IsNone and typically the usage with match-clause will just work. And there is ValueOption.bind and so-on like Option.bind you've used to. But if you use lot of existing F# functions like List.tryPick they work with Some x and not ValueSome x, and there I'd go for usability over performance.

Tuple class is not a struct but, but there is a ValueTuple which is used instead of (a,b) like this: struct(a,b). That's a bit inconvenient to use.

So this is still good:

[<Struct>]
type MyType =
| MyOp of int
| MyOp2 of bool ValueOption 
| MyOp3 of struct(bool*bool) 

They also say you should performance-test, to verify your changes, because many times it depends on use-case if a harmless looking [<Struct>] type MyType = MyOp of int array is better to be a struct or not.

So for conservative analysis, I'd go for only non-data-carrying and/or only plain boolean/int/guid/struct-field types to be recommended to be structs.

@nojaf
Copy link
Collaborator

nojaf commented Mar 15, 2024

Thanks for all the insight!

Last question: is any of this noticeable? We don't have a benchmark so I'm not asking you to add that but did you run this before and after? Just curious.

@@ -464,26 +463,29 @@ module internal ArgParser =
else
None

[<return: Struct>]
Copy link
Member Author

@Thorium Thorium Mar 15, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nojaf here is an example where there is a Struct boolean. As boolean is a tiny field, there is no point of wrapping it to inside a separate class, like the original code did. Wrapping just makes the processing of comparisons etc slower.

@Thorium
Copy link
Member Author

Thorium commented Mar 15, 2024

The comparisons are probably microseconds, so I expect it's not human noticeable in a single run of generating the docs, but if you have build-pipeline over every commit queuing generating docs, then even small changes can affect over time, to free up some CI workers faster.

@nojaf
Copy link
Collaborator

nojaf commented Mar 15, 2024

Sure, that is fair. I just wanted you to admit that you didn't check 😉.
A valuable PR nonetheless.

@Thorium
Copy link
Member Author

Thorium commented Apr 1, 2024

Here is quite detailed reference post of Structs: https://www.bartoszsypytkowski.com/writing-high-performance-f-code/

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants