-
Notifications
You must be signed in to change notification settings - Fork 6
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve the generated code: eliminate redundant code #24
Comments
Thanks for the tip. IMHO, cleanup of generated code isn't that important, since normally it's not something anyone looks at. That said, in this case, with many members removed, it could hopefully reduce compilation times a bit. Though I often have 10k-20k lines of generated code without compilation times being a problem. I will look into it when I have the chance. |
This is not as straight-forward as it seems. Firstly, there are not one "script type", but three: One if there are no params, and two if there are params (one for the initial creation and param applications, and another for the execution). Your example treats only the no-param case. Secondly, the specific Worst-case, it seems that we would need to juggle six different base types. Maybe some of them could be merged, or inherit one another. |
I think this is doable. Working on it now. |
I came across a challenge: F# seems to require that the type parameter in inherited classes must be specified (cannot be inferred). Return types in Facil are, for the most part, anonymous records. This will lead to a potentially very long type parameter (if there are many returned columns). There is also the challenge that this parameter depends on whether output columns and/or return types are used, so there's a bit of logic needed to render the correct type parameter. If using base classes, we remove some code, such as the Execute methods, but we get other code, namely long type parameters. All in all, I'm not sure that it's worth it. Even without this explicit type parameter (which could significantly increase the line count again), the line count in the DB test project was only reduced from 27k to 17k. A reduction of 37%. I'm leaning towards this not being worth all the trouble. Let me know if you have any magical solutions. |
Thank you for looking into this!
Actually, I did read the code to understand what it does. I think when you use the tool the first time, you have to read the code that it generates. Once the trust in the tool is built, a developer probably spends less time reading the code because the patterns are mapped in his brain, and she/he knows what to expect. My first suggestion would be to eliminate the use of anonymous records. I would always assign a name, so you always have a type name that you can reference. Also watch out for similar record types. If I step back, a query has:
Ultimately, the code that handles the result from a query, whether that query comes from a script, or a tableScript or a procedure, is the same. Except for tableDtos, handling tableDtos is an operation of deriving the F# type that maps to the structure of a resultset. This deriving operation is also part of the generation process of the code that executes a query and handles the resultset.
Keep in mind that a real life project might deal with hundreds of tables, queries, stored procedures. Personally I like lean code, but on the other hand, I can live with some redundant code. I don't know if this helps or not. The devil is in the details :-) |
Anyone is of course free to peruse the generated code, but this is not a scenario I intend to optimize for unless there are no drawbacks.
IMHO, a nominal record should represent a well-defined and singular concept. "The results of query X" is a well-defined and singular concept. So is "This (sub)set of columns of table X" (i.e., a table DTO). But "the incidentally similar result set returned by queries X, Y, and Z" is not. If one of the queries change, the record is then no longer returned by these three queries. One indication of the problem is the difficulty in naming such a record. Using anonymous records sidesteps the whole issue: Each query can return a "separate" record as it were, but as long as the result sets are the same, they can be passed as input to the same functions. Note that Facil lets use your own DTO record types (defined in a file "above" the generated code, of course) and then use them in whichever queries you want. Anonymous records is just the default when there is no matching table DTO.
Yes, I successfully implemented the base types and was fairly close to the goal. But having to specify anonymous records in the type parameter of the anonymous types was the spanner in the works that made me think that this is not worth it, given that the benefit is fewer lines of generated code which, as I mentioned above, I do not find it important to optimize for.
If you have that many, here's a tip (otherwise unrelated to this discussion) - it would probably be a good idea to have the generated DB code in a separate project. That way, you don't have to re-compile all the generated code every time you change something in the rest of your code.
I like lean code, too, but I only care about it when it is code that is intended for me to read/maintain. That doesn't include Facil's generated code.
Indeed. 🙂 |
Hey, thanks for taking the time to review my message, and writing this post.
using
This is a very good idea. All the code generated by the tool can be placed in a library project. It is self contained. Btw, in the last few days I evaluated a few data access methods from this list: https://fsharp.org/guides/data-access/, SqlClient, SqlProvider & SqlFun. The one I like best is offered by your library. I think you should contact the folks at fsharp.org and ask them to add your site to the list. I found out about the existence of this library on twitter. To conclude, it is up to you if you want to do more. Personally I don't like anonymous records. They have their usefulness, but I don't think this place is one of them. I will continue my evaluation, but I am more and more inclined to use your generator. |
Happy to hear it. Note that it generates separate records per query, regardless of whether or not records are equal across queries.
Done. I actually contacted them a month ago, but missed the part of the reply where I was told how to create a PR for that page.
Without a solution to the challenges regarding naming (and the underlying issue about singular concepts), I'm not planning on changing things. They can be "configured away" anyway. The only thing that's not supported is "merging" similar result records (which is where the challenges enter the picture). |
As middle ground solution, would you be open to add an option that controls the generation of the code as follows:
|
Not sure what you mean by "inheritance". Records cannot be inherited. And Facil supports only records. Supporting non-record types would be a bigger change that I'm not keen on making. |
Not the records, the code that executes the sql. "inheritance" as per my other message above as a way to remove redundant Execute methods. |
Ah, I see. No, I see absolutely no compelling reason to support two fundamentally different ways of generating code or "generated code styles" as it were, with the only benefit being slightly less generated code and less features in one version. I maintain my position that the readability of the generated code is an unimportant concern. The only concern I have taken with regards to the generated code, is avoiding unnecessary merge conflicts, hence why DTOs/parameters are multiple lines instead of (potentially very long) single lines. |
Conflicts from such generated files are simple to resolve - always accept the modified version.
OK, I guess. |
I thought more about this. I think one should strive to the generate proper and lean code, even though it's generated code that others will only use and not modify. I keep reading about the benefits of functional programming, and re-usability, and then what's the point? And actually it hurts my eyes seeing those methods repeated over and over again unnecessarily (imo). As the library creator, it is your decision ultimately. For me, it is not a show stopper, i.e. it won't prevent me from using this code in the form that it is now. But I think you should strive to make it better. I only presented one option. There might other options. One of the central pieces of the generated code is the code that reads the data from the SqlDataReader object into the record (anonymous or not). You could isolate this piece and make it a re-usable component that can be plugged into the code that executes a stored procedure, and which would appear in one place not replicated over and over. |
This article is interesting: https://www.compositional-it.com/news-blog/5-things-to-be-aware-of-with-f-anonymous-records/ |
Hey, sorry for insisting. Here is some code that works in F# 6:
As you can see, you can actually assign a name to an anonymous record! Also, another idea would be to use an interface for looping through records. |
That's subjective. I do not feel that way (considering that this is generated code that is not intended to be looked at), and it is I who have to maintain the code. My capacity for that is very limited. The past two years, I have had to let go of the stewardship of several open-source projects because I simply do not have the capacity to maintain them, and I weren't actively using them anymore.
I don't see how this statement is in any way relevant to anything in this discussion, or really says anything at all. This is generated code, and the "public" part of that generated code provides a nice and friendly API.
At what cost? That is the core question. Again, I maintain the position that improving the visual quality of the generated code is not a high enough priority to warrant spending a lot of my limited time and energy, or accept functional drawbacks. If I had some tens of hours to spend on improving Facil in ways that don't impact its functionality, I would first and foremost refactor the internals so it is easier to maintain. Currently the internals are rather messy; a by-product of me never having been able to prioritize a thorough refactoring after Facil graduated from the proof-of-concept stage. (Facil works and is very well tested and continues to grow, but adding new functionality is harder than it needs to be due to the somewhat messy internals.)
I've read it before. If there was anything in particular you wanted to draw my attention to that was relevant for this discussion, you'll have to spell it out.
I am aware of this feature. I do that myself in my DTO mapping code (outside Facil). In the context we are discussing here, this solution requires defining anonymous record aliases above each query. That could drastically increase the line count again - another drawback. |
Hello:
I used to generate a tableScript with type=getAll for a bunch of tables, and I noticed that the same methods are repeated for each table. I attached a screenshot where you can see the code differences for two [tables.](url
)
The
***Execute***
methods have essentially the same bodies though different return values, ultimately depending on the structure of the record.I think these methods can be moved up in a base class that can be parameterized by the record type.
Here is a possible solution:
There can be a lot of methods eliminated from the generated code. Ultimately, thinking about it, what's different about each query that is executed are: the sql query itself, the parameters, and the code that reads the data from the SqlDataReader into the record, and of course the type of the record returned (in case the query returns data). All the other execution methods should be the same, and can be common, and, imo, they should not be repeated for each class that is generated.
The reading of the data could be abstracted through an parameterized interface by a record or records types with two methods, initialize & readRow (I would call it readRow instead of getItem).
If you want to be able to read multiple resultsets (I don't know if the generator currently supports reading multiple resultsets, i.e. scripts or stored procedures can return multiple select statements), it might become a little bit trickier. One would need a list of interfaces, one for each resultset, each interface would have its own record type.
I hope this is useful, and I think the amount of code generated can be reduced.
Thanks
The text was updated successfully, but these errors were encountered: