Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve the generated code: eliminate redundant code #24

Closed
costa100 opened this issue Dec 16, 2021 · 17 comments
Closed

Improve the generated code: eliminate redundant code #24

costa100 opened this issue Dec 16, 2021 · 17 comments

Comments

@costa100
Copy link

Hello:

I used to generate a tableScript with type=getAll for a bunch of tables, and I noticed that the same methods are repeated for each table. I attached a screenshot where you can see the code differences for two [tables.](url
Screen Shot 2021-12-15 at 6 16 52 PM
)

The ***Execute*** methods have essentially the same bodies though different return values, ultimately depending on the structure of the record.

I think these methods can be moved up in a base class that can be parameterized by the record type.

Here is a possible solution:

  // R is the return type
  [<AbstractClass>]
  type BaseScript<'R> (connStr: string, conn: SqlConnection, sql: string) =
 
    let configureCmd userConfigureCmd (cmd: SqlCommand) =
      cmd.CommandText <- sql
      userConfigureCmd cmd

    abstract member initOrdinals: SqlDataReader -> unit
    abstract member getItem: SqlDataReader -> 'R  


    //abstract member assignOrdinals (reader: SqlDataReader): Unit

    //abstract member readRow (reader: SqlDataReader) : r

    [<EditorBrowsable(EditorBrowsableState.Never)>]
    member val configureConn : SqlConnection -> unit = ignore with get, set

    [<EditorBrowsable(EditorBrowsableState.Never)>]
    member val userConfigureCmd : SqlCommand -> unit = ignore with get, set

    member this.ConfigureCommand(configureCommand: SqlCommand -> unit) =
      this.userConfigureCmd <- configureCommand
      this

    member this.ConfigureConnection(?configureConnection: SqlConnection -> unit) =
      match configureConnection with
      | None -> ()
      | Some config -> this.configureConn <- config
      this

    member this.ExecuteAsync(?cancellationToken) =
      executeQueryEagerAsync connStr conn this.configureConn (configureCmd this.userConfigureCmd) this.initOrdinals this.getItem [] (defaultArg cancellationToken CancellationToken.None)

    member this.AsyncExecute() =
      async {
        let! ct = Async.CancellationToken
        return! this.ExecuteAsync(ct) |> Async.AwaitTask
      }

    member this.ExecuteAsyncWithSyncRead(?cancellationToken) =
      executeQueryEagerAsyncWithSyncRead connStr conn this.configureConn (configureCmd this.userConfigureCmd) this.initOrdinals this.getItem [] (defaultArg cancellationToken CancellationToken.None)

    member this.AsyncExecuteWithSyncRead() =
      async {
        let! ct = Async.CancellationToken
        return! this.ExecuteAsyncWithSyncRead(ct) |> Async.AwaitTask
      }

    member this.Execute() =
      executeQueryEager connStr conn this.configureConn (configureCmd this.userConfigureCmd) this.initOrdinals this.getItem []

    member this.LazyExecuteAsync(?cancellationToken) =
      executeQueryLazyAsync connStr conn this.configureConn (configureCmd this.userConfigureCmd) this.initOrdinals this.getItem [] (defaultArg cancellationToken CancellationToken.None)

    member this.LazyExecuteAsyncWithSyncRead(?cancellationToken) =
      executeQueryLazyAsyncWithSyncRead connStr conn this.configureConn (configureCmd this.userConfigureCmd) this.initOrdinals this.getItem [] (defaultArg cancellationToken CancellationToken.None)

    member this.LazyExecute() =
      executeQueryLazy connStr conn this.configureConn (configureCmd this.userConfigureCmd) this.initOrdinals this.getItem []

    member this.ExecuteSingleAsync(?cancellationToken) =
      executeQuerySingleAsync connStr conn this.configureConn (configureCmd this.userConfigureCmd) this.initOrdinals this.getItem [] (defaultArg cancellationToken CancellationToken.None)

    member this.AsyncExecuteSingle() =
      async {
        let! ct = Async.CancellationToken
        return! this.ExecuteSingleAsync(ct) |> Async.AwaitTask
      }

    member this.ExecuteSingle() =
      executeQuerySingle connStr conn this.configureConn (configureCmd this.userConfigureCmd) this.initOrdinals this.getItem []
   
  type ``barg_assoc_All`` private (connStr: string, conn: SqlConnection) =
    inherit BaseScript<TableDtos.dbo.barg_assoc>(connStr, conn,
        """-- barg_assoc_All
        SELECT
          [barg_assoc_id],
          [barg_assoc_name]
        FROM
          [dbo].[barg_assoc]"""      
    )

    let mutable ``ordinal_barg_assoc_id`` = 0
    let mutable ``ordinal_barg_assoc_name`` = 0

    override this.initOrdinals (reader: SqlDataReader) =
      ``ordinal_barg_assoc_id`` <- reader.GetOrdinal "barg_assoc_id"
      ``ordinal_barg_assoc_name`` <- reader.GetOrdinal "barg_assoc_name"

    override this.getItem  (reader: SqlDataReader)  =
      let ``barg_assoc_id`` = reader.GetInt32 ``ordinal_barg_assoc_id``
      let ``barg_assoc_name`` = reader.GetString ``ordinal_barg_assoc_name``
      {
        ``Barg_assoc_id`` = ``barg_assoc_id``
        ``Barg_assoc_name`` = ``barg_assoc_name``
      }

    [<EditorBrowsable(EditorBrowsableState.Never)>]
    new() =
      failwith "This constructor is for aiding reflection and type constraints only"
      ``barg_assoc_All``(null, null)



    static member WithConnection(connectionString, ?configureConnection: SqlConnection -> unit) =
      ``barg_assoc_All``(connectionString, null).ConfigureConnection(?configureConnection=configureConnection)

    static member WithConnection(connection) = ``barg_assoc_All``(null, connection)




  type ``barg_unit_All`` private (connStr: string, conn: SqlConnection) =

    inherit BaseScript<TableDtos.dbo.barg_unit>(connStr, conn,
           """-- barg_unit_All
           SELECT
             [barg_unit_id],
             [barg_unit_name]
           FROM
             [dbo].[barg_unit]"""
       )



    let mutable ``ordinal_barg_unit_id`` = 0
    let mutable ``ordinal_barg_unit_name`` = 0

    override this.initOrdinals (reader: SqlDataReader) =
      ``ordinal_barg_unit_id`` <- reader.GetOrdinal "barg_unit_id"
      ``ordinal_barg_unit_name`` <- reader.GetOrdinal "barg_unit_name"

    override this.getItem (reader: SqlDataReader) : TableDtos.dbo.barg_unit =
      let ``barg_unit_id`` = reader.GetInt32 ``ordinal_barg_unit_id``
      let ``barg_unit_name`` = reader.GetString ``ordinal_barg_unit_name``
      {
        ``Barg_unit_id`` = ``barg_unit_id``
        ``Barg_unit_name`` = ``barg_unit_name``
      }

    [<EditorBrowsable(EditorBrowsableState.Never)>]
    new() =
      failwith "This constructor is for aiding reflection and type constraints only"
      ``barg_unit_All``(null, null)

    static member WithConnection(connectionString, ?configureConnection: SqlConnection -> unit) =
      ``barg_unit_All``(connectionString, null).ConfigureConnection(?configureConnection=configureConnection)

    static member WithConnection(connection) = ``barg_unit_All``(null, connection)

There can be a lot of methods eliminated from the generated code. Ultimately, thinking about it, what's different about each query that is executed are: the sql query itself, the parameters, and the code that reads the data from the SqlDataReader into the record, and of course the type of the record returned (in case the query returns data). All the other execution methods should be the same, and can be common, and, imo, they should not be repeated for each class that is generated.

The reading of the data could be abstracted through an parameterized interface by a record or records types with two methods, initialize & readRow (I would call it readRow instead of getItem).

If you want to be able to read multiple resultsets (I don't know if the generator currently supports reading multiple resultsets, i.e. scripts or stored procedures can return multiple select statements), it might become a little bit trickier. One would need a list of interfaces, one for each resultset, each interface would have its own record type.

I hope this is useful, and I think the amount of code generated can be reduced.

Thanks

@cmeeren
Copy link
Owner

cmeeren commented Dec 16, 2021

Thanks for the tip. IMHO, cleanup of generated code isn't that important, since normally it's not something anyone looks at. That said, in this case, with many members removed, it could hopefully reduce compilation times a bit. Though I often have 10k-20k lines of generated code without compilation times being a problem.

I will look into it when I have the chance.

@cmeeren
Copy link
Owner

cmeeren commented Dec 16, 2021

This is not as straight-forward as it seems.

Firstly, there are not one "script type", but three: One if there are no params, and two if there are params (one for the initial creation and param applications, and another for the execution). Your example treats only the no-param case.

Secondly, the specific Execute methods present varies between scripts/sprocs, specifically by whether or not it returns any results.

Worst-case, it seems that we would need to juggle six different base types. Maybe some of them could be merged, or inherit one another.

@cmeeren
Copy link
Owner

cmeeren commented Dec 16, 2021

I think this is doable. Working on it now.

@cmeeren
Copy link
Owner

cmeeren commented Dec 16, 2021

I came across a challenge: F# seems to require that the type parameter in inherited classes must be specified (cannot be inferred). Return types in Facil are, for the most part, anonymous records. This will lead to a potentially very long type parameter (if there are many returned columns). There is also the challenge that this parameter depends on whether output columns and/or return types are used, so there's a bit of logic needed to render the correct type parameter.

If using base classes, we remove some code, such as the Execute methods, but we get other code, namely long type parameters. All in all, I'm not sure that it's worth it. Even without this explicit type parameter (which could significantly increase the line count again), the line count in the DB test project was only reduced from 27k to 17k. A reduction of 37%. I'm leaning towards this not being worth all the trouble.

Let me know if you have any magical solutions.

@costa100
Copy link
Author

costa100 commented Dec 16, 2021

Thank you for looking into this!

Thanks for the tip. IMHO, cleanup of generated code isn't that important, since normally it's not something anyone looks at.

Actually, I did read the code to understand what it does. I think when you use the tool the first time, you have to read the code that it generates. Once the trust in the tool is built, a developer probably spends less time reading the code because the patterns are mapped in his brain, and she/he knows what to expect.

My first suggestion would be to eliminate the use of anonymous records. I would always assign a name, so you always have a type name that you can reference. Also watch out for similar record types.

If I step back, a query has:

  • a parameters list that can be empty. Some parameters can be output parameters, and they can also contain resultsets.
  • query/command text
  • it can return zero, one or more resultsets
  • a stored procedure can return an integer code
  • sometimes t-sql can keep track of the number of records that were modified (personally I always ignored this info, I think it is useless, but that's me)
  • there might be other details specific to transact-sql (cursor parameters, sql_variant types etc.)

Ultimately, the code that handles the result from a query, whether that query comes from a script, or a tableScript or a procedure, is the same. Except for tableDtos, handling tableDtos is an operation of deriving the F# type that maps to the structure of a resultset. This deriving operation is also part of the generation process of the code that executes a query and handles the resultset.

the line count in the DB test project was only reduced from 27k to 17k. A reduction of 37%.

Keep in mind that a real life project might deal with hundreds of tables, queries, stored procedures. Personally I like lean code, but on the other hand, I can live with some redundant code.

I don't know if this helps or not. The devil is in the details :-)

@cmeeren
Copy link
Owner

cmeeren commented Dec 17, 2021

Actually, I did read the code to understand what it does. I think when you use the tool the first time, you have to read the code that it generates. Once the trust in the tool is built, a developer probably spends less time reading the code because the patterns are mapped in his brain, and she/he knows what to expect.

Anyone is of course free to peruse the generated code, but this is not a scenario I intend to optimize for unless there are no drawbacks.

My first suggestion would be to eliminate the use of anonymous records. I would always assign a name, so you always have a type name that you can reference. Also watch out for similar record types.

IMHO, a nominal record should represent a well-defined and singular concept. "The results of query X" is a well-defined and singular concept. So is "This (sub)set of columns of table X" (i.e., a table DTO). But "the incidentally similar result set returned by queries X, Y, and Z" is not. If one of the queries change, the record is then no longer returned by these three queries. One indication of the problem is the difficulty in naming such a record.

Using anonymous records sidesteps the whole issue: Each query can return a "separate" record as it were, but as long as the result sets are the same, they can be passed as input to the same functions.

Note that Facil lets use your own DTO record types (defined in a file "above" the generated code, of course) and then use them in whichever queries you want. Anonymous records is just the default when there is no matching table DTO.

If I step back, a query has:

...

Yes, I successfully implemented the base types and was fairly close to the goal. But having to specify anonymous records in the type parameter of the anonymous types was the spanner in the works that made me think that this is not worth it, given that the benefit is fewer lines of generated code which, as I mentioned above, I do not find it important to optimize for.

Keep in mind that a real life project might deal with hundreds of tables, queries, stored procedures.

If you have that many, here's a tip (otherwise unrelated to this discussion) - it would probably be a good idea to have the generated DB code in a separate project. That way, you don't have to re-compile all the generated code every time you change something in the rest of your code.

Personally I like lean code, but on the other hand, I can live with some redundant code.

I like lean code, too, but I only care about it when it is code that is intended for me to read/maintain. That doesn't include Facil's generated code.

I don't know if this helps or not. The devil is in the details :-)

Indeed. 🙂

@costa100
Copy link
Author

costa100 commented Dec 18, 2021

Hey, thanks for taking the time to review my message, and writing this post.

IMHO, a nominal record should represent a well-defined and singular concept.

using result: nominal for the stored procedure does what I wanted.

it would probably be a good idea to have the generated DB code in a separate project.

This is a very good idea. All the code generated by the tool can be placed in a library project. It is self contained.

Btw, in the last few days I evaluated a few data access methods from this list: https://fsharp.org/guides/data-access/, SqlClient, SqlProvider & SqlFun. The one I like best is offered by your library. I think you should contact the folks at fsharp.org and ask them to add your site to the list. I found out about the existence of this library on twitter.

To conclude, it is up to you if you want to do more. Personally I don't like anonymous records. They have their usefulness, but I don't think this place is one of them. I will continue my evaluation, but I am more and more inclined to use your generator.

@cmeeren
Copy link
Owner

cmeeren commented Dec 18, 2021

using result: nominal for the stored procedure does what I wanted.

Happy to hear it. Note that it generates separate records per query, regardless of whether or not records are equal across queries.

Btw, in the last few days I evaluated a few data access methods from this list: https://fsharp.org/guides/data-access/, SqlClient, SqlProvider & SqlFun. The one I like best is offered by your library. I think you should contact the folks at fsharp.org and ask them to add your site to the list. I found out about the existence of this library on twitter.

Done. I actually contacted them a month ago, but missed the part of the reply where I was told how to create a PR for that page.

To conclude, it is up to you if you want to do more. Personally I don't like anonymous records. They have their usefulness, but I don't think this place is one of them. I will continue my evaluation, but I am more and more inclined to use your generator.

Without a solution to the challenges regarding naming (and the underlying issue about singular concepts), I'm not planning on changing things. They can be "configured away" anyway. The only thing that's not supported is "merging" similar result records (which is where the challenges enter the picture).

@costa100
Copy link
Author

As middle ground solution, would you be open to add an option that controls the generation of the code as follows:

  • Option 1: as is right now with anonymous records
  • Option 2: use inheritance with the caveat that anonymous records are not supported. The developer would have to always specify the F# name of the record.

@cmeeren
Copy link
Owner

cmeeren commented Dec 18, 2021

Not sure what you mean by "inheritance". Records cannot be inherited. And Facil supports only records. Supporting non-record types would be a bigger change that I'm not keen on making.

@costa100
Copy link
Author

costa100 commented Dec 18, 2021

Not sure what you mean by "inheritance".

Not the records, the code that executes the sql. "inheritance" as per my other message above as a way to remove redundant Execute methods.

@cmeeren
Copy link
Owner

cmeeren commented Dec 18, 2021

Ah, I see. No, I see absolutely no compelling reason to support two fundamentally different ways of generating code or "generated code styles" as it were, with the only benefit being slightly less generated code and less features in one version. I maintain my position that the readability of the generated code is an unimportant concern. The only concern I have taken with regards to the generated code, is avoiding unnecessary merge conflicts, hence why DTOs/parameters are multiple lines instead of (potentially very long) single lines.

@costa100
Copy link
Author

The only concern I have taken with regards to the generated code, is avoiding unnecessary merge conflicts,

Conflicts from such generated files are simple to resolve - always accept the modified version.

No, I see absolutely no compelling reason to support two fundamentally different ways of generating code

OK, I guess.

@cmeeren cmeeren closed this as completed Dec 18, 2021
@costa100
Copy link
Author

I thought more about this. I think one should strive to the generate proper and lean code, even though it's generated code that others will only use and not modify. I keep reading about the benefits of functional programming, and re-usability, and then what's the point? And actually it hurts my eyes seeing those methods repeated over and over again unnecessarily (imo). As the library creator, it is your decision ultimately. For me, it is not a show stopper, i.e. it won't prevent me from using this code in the form that it is now. But I think you should strive to make it better. I only presented one option. There might other options. One of the central pieces of the generated code is the code that reads the data from the SqlDataReader object into the record (anonymous or not). You could isolate this piece and make it a re-usable component that can be plugged into the code that executes a stored procedure, and which would appear in one place not replicated over and over.

@costa100
Copy link
Author

@costa100
Copy link
Author

Return types in Facil are, for the most part, anonymous records. This will lead to a potentially very long type parameter (if there are many returned columns).

Hey, sorry for insisting.

Here is some code that works in F# 6:


open System.Data.SqlClient

// For more information see https://aka.ms/fsharp-console-apps

type IRowReader<'R> =
    abstract member Initialize: unit -> unit
    abstract member ReadRow: SqlDataReader -> 'R

type SomeOtherRecord = {Name: string}
type SomeResultAnonymous_Record = {|Name: string|}
type SomeResultAnonymous_AnotherRecord = {|Name: string|}

type SomeResultSet_RowReader =
    interface IRowReader<SomeResultAnonymous_Record> with
        member this.Initialize () =
          ()

        member this.ReadRow (sdr: SqlDataReader) = {|Name = "Blob"|}

type AnotherResultSet_RowReader =
    interface IRowReader<SomeResultAnonymous_AnotherRecord> with
        member this.Initialize () =
          ()

        member this.ReadRow (sdr: SqlDataReader) = {|Name = "Blob"|}


let x: SomeResultAnonymous_AnotherRecord = {|Name = "Porky Pig"|}
let y: SomeResultAnonymous_Record = {|Name = "Donald Duck"|}

let mutable z = {|Name = "Minnie"|}
z <- x

As you can see, you can actually assign a name to an anonymous record!

Also, another idea would be to use an interface for looping through records.

@cmeeren
Copy link
Owner

cmeeren commented Dec 19, 2021

I thought more about this. I think one should strive to the generate proper and lean code, even though it's generated code that others will only use and not modify. … And actually it hurts my eyes seeing those methods repeated over and over again unnecessarily (imo).

That's subjective. I do not feel that way (considering that this is generated code that is not intended to be looked at), and it is I who have to maintain the code. My capacity for that is very limited. The past two years, I have had to let go of the stewardship of several open-source projects because I simply do not have the capacity to maintain them, and I weren't actively using them anymore.

I keep reading about the benefits of functional programming, and re-usability, and then what's the point?

I don't see how this statement is in any way relevant to anything in this discussion, or really says anything at all. This is generated code, and the "public" part of that generated code provides a nice and friendly API.

But I think you should strive to make it better.

At what cost? That is the core question. Again, I maintain the position that improving the visual quality of the generated code is not a high enough priority to warrant spending a lot of my limited time and energy, or accept functional drawbacks.

If I had some tens of hours to spend on improving Facil in ways that don't impact its functionality, I would first and foremost refactor the internals so it is easier to maintain. Currently the internals are rather messy; a by-product of me never having been able to prioritize a thorough refactoring after Facil graduated from the proof-of-concept stage. (Facil works and is very well tested and continues to grow, but adding new functionality is harder than it needs to be due to the somewhat messy internals.)

This article is interesting: https://www.compositional-it.com/news-blog/5-things-to-be-aware-of-with-f-anonymous-records/

I've read it before. If there was anything in particular you wanted to draw my attention to that was relevant for this discussion, you'll have to spell it out.

As you can see, you can actually assign a name to an anonymous record!

I am aware of this feature. I do that myself in my DTO mapping code (outside Facil). In the context we are discussing here, this solution requires defining anonymous record aliases above each query. That could drastically increase the line count again - another drawback.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants