The argument against Entity Framework, and for micro-ORMs #12
Comments
Thank you for not advocating building queries from strings :). The change-tracking downside of Entity Framework has an easy enough workaround: use a new DbContext for applying changes, so the effects are more obvious. What doesn't have an easy workaround is the startup time of Entity Framework, which is quite noticeable in desktop apps. Right now is not really a great time to look at the EF bug list, or to jump into EF, for that matter. The query translator got rewritten, but hasn't really stabilized, as the issue list rightfully indicates. Test coverage for the new and rewritten parts is also not great. "Please try again with nightly builds" has been the standard response for over a month now. Let's hope that by the time 3.1 releases in November things have stabilized. |
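A minimal sketch of the workaround described in the comment above: read without tracking, then apply the update through a short-lived context so the write is explicit and local. The AppDbContext and Order types (and orderId) are hypothetical; AsNoTracking, Update, and SaveChanges are standard EF Core calls.
using Microsoft.EntityFrameworkCore;
Order order;
using (var readDb = new AppDbContext())
{
    // Read with no change tracking; the entity comes back as a detached POCO.
    order = readDb.Orders.AsNoTracking().Single(o => o.Id == orderId);
}
order.Status = "Shipped";
using (var writeDb = new AppDbContext())
{
    writeDb.Update(order);   // marks the entity as modified on a fresh context
    writeDb.SaveChanges();   // the only place a write can happen
}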
I stumbled onto this article on reddit, and really liked it. I agree with a lot of what you're saying. 😊 Anyway, the blog post was great, and I generally agree with you. 😊 Just don't write ORMs off right off the bat; consider them a tool, as you would any other dependency in your system. |
This blog post is good up to a certain level of application complexity. Most large enterprise projects are going to be dealing with data sets of millions of rows, with billions of relational outcomes. Advocating eager loading in such a scenario is nonsense. EF allows you to cherry-pick which relationships to eager-load while using lazy loading by default. If you are using a web server and constantly eager loading large graphs, then you are placing a huge burden on the server to load data that gets dumped in the garbage when the controller returns. If it's a desktop app, you may get away with it for a while, but eventually your app will be holding a couple of gigs of data in RAM and the user experience will suffer. On mobile, you should be just as strict with memory usage as on a REST server. EF was designed to work in all these scenarios and, when used correctly, does so admirably. Enterprise code bases need consistency and reliability as their top concerns for maintainability. Designing a system under the assumption that only you, or someone of your skill level, will be maintaining the code is both arrogant and dangerous. |
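A small sketch of the cherry-picking described above: eager-load only the relationships a given screen needs and leave everything else to lazy loading. The db context, Order/Customer/Lines model, and cutoff variable are hypothetical; Include/ThenInclude are standard EF Core.
using Microsoft.EntityFrameworkCore;
// Only the navigations this screen actually needs are eager-loaded;
// everything else stays lazy (or is simply never loaded).
var orders = db.Orders
    .Include(o => o.Customer)        // needed for the header
    .Include(o => o.Lines)           // needed for the grid
        .ThenInclude(l => l.Product)
    .Where(o => o.CreatedDate >= cutoff)
    .ToList();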
There are other options for ORMs. For example, Tortuga Chain (which I work on) uses database reflection. Rather than just assuming the class exactly matches the table, or doing everything with SQL string literals, it compares the table and class definitions at runtime. This dramatically reduces the boilerplate, especially when you don't want every column. |
Another is SQLAlchemy, which allows you to build complex SQL expressions using an object model. Unfortunately it is Python-only at this time. |
Regarding boilerplate, consider this line:
Why can't all ORMs do this? Why do they usually require manually dealing with connections/contexts and an extra round trip to the database just to perform a simple update? In my opinion, the only time I should see a |
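The specific line the comment refers to didn't survive extraction, but the idea being described is a single-call update with no prior SELECT. As a stand-in sketch (not the library the comment originally showed), Dapper.Contrib can express it roughly like this; the Customer class and connectionString are hypothetical.
using System.Data.SqlClient;
using Dapper.Contrib.Extensions;
using (var connection = new SqlConnection(connectionString))
{
    // One statement, one round trip: no context, no change tracker,
    // no SELECT-before-UPDATE.
    connection.Update(new Customer { Id = 42, Name = "Acme Ltd" });
}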
Having read this ... many of the points here are exactly where I'm at right now ... the hidden technical debt, abstraction issues I can't solve, complexity with no explanation, poor performance in scenarios that often seem trivial on face value. I'm currently looking into an alternative to EF Core to resolve a ton of issues I have that the EF Core team seems to be either confused about, not interested in, or simply willing to pick at my description of rather than focusing on the issue at hand. The fact is, since EF6, each and every subsequent release has removed or broken functionality that my stack depends on, and I'm sick of swallowing that with the reasoning being "this is the cost of progress". So here's where I'm at ... here are my core functionality requirements, drawn from the EF functionality that I currently use ...
That last point appears to be the sticking point for most "micro-ORMs" ... the "micro" prefix usually means, as with say Dapper, that it does the SQL-query-to-entity mapping but won't do the bit before that: getting from the LINQ expression tree to SQL. Assuming this ORM can handle that, I'm looking for examples of doing things like applying set filters so I can achieve something like ... var results = Db.GetAll<BaseT>()
.Include(t => t.Property)
.ThenInclude(i => i.SubProperty)
.ToArray(); ... key things to note here that EF solves, and that I can't seem to find a solution to in other ORMs: the include, and then the sub-include, are both filtered by the relationship but also by filter conditions applied to the table regardless of the context in which that table is queried. This seems to be a feature missing in all but EF. Does this ORM support this scenario? |
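For reference, the "filter applied to a table regardless of context" behaviour described above maps onto EF Core's global query filters, which apply to every query that touches the table, including navigations loaded via Include/ThenInclude. A minimal sketch with a hypothetical multi-tenant model (the TenantId/IsDeleted columns and the per-request tenant id are assumptions):
using Microsoft.EntityFrameworkCore;
public class AppDbContext : DbContext
{
    private int _currentTenantId;   // hypothetical: supplied per request

    public DbSet<Order> Orders => Set<Order>();

    protected override void OnModelCreating(ModelBuilder modelBuilder)
    {
        // Applied to every query that touches these tables, including
        // navigations pulled in via Include/ThenInclude.
        modelBuilder.Entity<Order>()
            .HasQueryFilter(o => o.TenantId == _currentTenantId && !o.IsDeleted);
        modelBuilder.Entity<OrderLine>()
            .HasQueryFilter(l => l.TenantId == _currentTenantId);
    }
}
// Opting out is explicit, per query:
// var everything = db.Orders.IgnoreQueryFilters().ToList();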
This requires you to know, up front, all the possible combinations of questions that you might want to ask the API, or to manually wire up a second model so that the traversal is possible with sub-queries. Consider putting an OrmLite-managed DB behind an OData API, where the questions are virtually limitless but every possible combination of scenarios has to be considered and handled. People often forget about the complexity that EF is solving; stating that something is a "micro-ORM" instead of a "full ORM" is seemingly just like declaring "this solves half your issue, now go find something else that solves the other half, but it's fast, ok?". I've not yet found anything that can match this type of EF-solved scenario that didn't require a crap ton of "workarounds" or "patching stuff together", and it's the one thing that keeps my solutions sat on it ... which is frustrating, because I both hate it and have to use it at the same time, as there is seemingly no alternative. Other things of note ...
... all features that I often use to solve dynamic scenarios that would otherwise be unsolvable, or for situations where I don't want to sit around writing boilerplate like the SQL statement for building a table when I have a class that exactly matches its structure. Also, the current version of EF Core will never return "some proxy sub class with injected decorated behavior", as that functionality was ripped out as part of the rebuild when EF6 became EF Core 1.0. My current entity model has a DbContext, as you might expect, with entity sets, none of which I pull from the DB in such a manner that I use things like lazy loading, or proxies, or in any way require the resulting entities to be "attached" to the context; this is basically the same setup as is explained here. My ideal ORM would allow me to do something like ...
... in this situation I would be constructing a simple ADO.NET connection to the DB and then telling the framework "build me a SQL query of type T"; large, complex "models" feel like overkill to me, since the type metadata for the query you're building should tell you all you need to know. I continue my composition on that to construct the full query, including notifying it (as per my previous comment) of what "sub sets" I want in the results; then performing a .ToArray(), .ToList(), or simply iterating over it would actually execute it. The results would be disconnected from the DB (simple POCOs) and be appropriately secured, unless I specifically asked the framework to track changes for me to make saving easier later. My issue tends to boil down to the fact that all these ORMs claiming to be better than EF appear to be so on face value, but as soon as you start drilling into the complex scenarios they fall short of features, resulting in me having to extend or build tons of framework around the ORM. So the article makes the comparison of OrmLite's 89k cloc to EF's 514k cloc in the chart at the beginning, but it's forgetting that there's a bunch of stuff in that extra 400k cloc that OrmLite can't do. Unless I'm missing something? I recently asked the Dapper guys how implementing a Dapper-based back end for OData might look and simply got a one-liner: "you'd have to write your own version of LINQ to SQL, as Dapper doesn't do that" ... So Dapper can execute a query and map the results to an object graph, but it can't build the query in the first place. That's half the work, is it not? In short, I'd love to see a "micro-ORM" implementation of an OData or GraphQL or similar API model, as those sorts of APIs really push the limits of what ORMs can do. |
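A purely hypothetical sketch of the "ideal ORM" composition described above; none of these types or methods exist in any shipping ORM, the names are illustrative only. The query is composed over a plain ADO.NET connection, sub-sets are declared explicitly, and nothing executes until the results are enumerated.
using System.Data.SqlClient;
using (var connection = new SqlConnection(connectionString))
{
    // Hypothetical: compose a SQL query for a given type over a raw connection.
    var query = connection.QueryFor<Invoice>()       // "build me a SQL query of type Invoice"
        .Where(i => i.Status == "Open")
        .Expand(i => i.Lines)                        // declare which sub-sets to include
        .OrderBy(i => i.CreatedDate);

    var invoices = query.ToArray();                  // execution happens here, not before
    // Results are detached POCOs; change tracking only if explicitly requested.
}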
See this: Why not OData? Also this: AutoQuery |
I don't agree with the bulk of points in that OData article; for example, the supposed "tight coupling of internals" problem highlighted is not present in my stack, but that's a far more complex and different discussion. I was using OData as an example mainly because of the complexity of the scenarios it exposes us to; even Microsoft, who push OData heavily, state that best practice is to have N-tier separation and promote the use of both an API model and a DB model; the mapping for that is an entirely different issue, though, and not worth discussing here. The article seems to suggest that AutoQuery is an alternative to OData, which it just isn't. Essentially all I'm asking is that an ORM should be able to handle exactly that, and when I ask it a question I should be able to tell it "I want my question asked in this business context", which for the bulk of queries in enterprise applications boils down to "based on what the user making the call has access to", which in my case is a small cut of every table in the DB. If the answer, it seems, is to just avoid asking the question, then it's not really an answer, is it? |
Interesting topics; I'd really like to chip in my ideas, but only on the ORM side. Hope to share more here soon. As of writing this, I do not have any idea of AutoQuery and I am not as deep into OData.
@TehWardy - I think a micro-ORM can solve all the things, whereas EF can't. EF has abstracted most of the things for you, and that limits your access to the benefits of the underlying storage. It means you have less control with EF than with what we call micro-ORMs.
True, and not true. It is based on preference. I can say EF did not solve any complex problem :) - that's why there are micro-ORMs, which allow you to have more control. Of course, you have to write more. Specifically on the migration tool: we have developed a DevOps tool and did not use EF code-first, as our preference is to not be bound to EF at all. In the end, it is much easier to maintain and it solves the complexities of our releases :)
Do not generalize this; EF does not even have batch and bulk operations, a second-layer cache, tracing, etc. And if you tend to do that, it requires a lot of work just to make it work with EF. Also, you are bound to the models, and you can't do anything with your model other than use it on a specific table.
I am interested to collaborate and share. I am also the author of a micro-ORM named RepoDb. Can you have a look? It is a hybrid ORM which will allow you to do the things of both micro-ORMs and macro-ORMs. You will have a lot of benefits while using it; I can explain and support on this. You may experience a different taste, as it has been baked differently. |
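For context, RepoDb exposes its operations as extension methods on an open ADO.NET connection. A rough sketch, assuming the 2019-era API surface (method names and provider bootstrapping may differ between versions), with a hypothetical Person class and connectionString:
using System.Data.SqlClient;
using RepoDb;
SqlServerBootstrap.Initialize();   // one-time setup for the SQL Server provider (version-dependent)
using (var connection = new SqlConnection(connectionString))
{
    // Micro-ORM style: expression-based query and simple CRUD, no DbContext.
    var active = connection.Query<Person>(p => p.IsActive == true);
    var id = connection.Insert(new Person { Name = "Anna", IsActive = true });
}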
It's refreshing to have people to talk to about this stuff ... most devs want to avoid this type of problem, as it's a minefield of pain whatever path you go down, it seems; the trick is picking which mines you step on with some level of good guesswork about the future. I can see you clearly have some big issues with the M$ stack; mostly they are valid, too, if the stack is used as documented. You seem to be under the impression that OData is ...
To that I would say ...
I've noticed that OData is referenced with regard to the v3 spec and WCF; that version is basically dead. The v4 spec and version runs entirely on .NET Core 3.1 and is actually more complete than its partner version of EF (the source of my frustration right now). Also, what's this ...
... OData isn't dead, it IS RESTful and it returns JSON by default, and my entire point to you centered around my testing efforts on ServiceStack being literally that CoC point (I have that already and don't want to give it up); ServiceStack doesn't seem to do anything by convention, it requires explicit definition literally everywhere. When I say "I should be able to tell it 'I want my question asked in this business context'" ... I'm not joking. I run a transactional platform and the context of a question is important ... I see a billion euros a week worth of invoice data through the system, and users getting back the wrong rowset is not an option; that is by design a complex question, not because M$ said so but simply because it has to be. As for the "mis-appropriation of features" ... I don't use any of the extended features that didn't get ported to .NET Core, for that exact reason: I saw the headache coming and avoided it. I've had DBAs hand-crank queries to answer some of the simpler questions, and the way EF handles some of those scenarios actually beats that (it's rare, but it happens). That's a requirement imposed on me because of the nature of our DB, not because the framework imposes it; replacing EF with OrmLite or Dapper will not change that, it's been tested extensively. |
Thanks for the feedback. I can totally see why people say what they say about EF; hell, the grief I give M$ on occasion, and the EF team, is somewhat ridiculous at times, because I'm trying to solve problems that simply shouldn't exist. I'll happily take a look at your ORM :)
Good place to start. I know that here on the ServiceStack side OData is seen as some sort of anti-pattern, due to the way that M$ documents and recommends using it; I definitely don't use OData as documented, so I usually don't hit the downsides (like tight binding to the DB structure). When discussing it here, though, it pays to appreciate the complexity of the questions that you can answer with the OData + EF stack WITHOUT having to specifically write any code at all beyond building the class that matches the table by default. With other stacks, like ServiceStack shown here, I've had some interesting conversations with @mythz (sorry mate, I do like to ask the complex questions) in this area, and the choice to jump boils down to a few key points for me ...
@mythz I was going to ask you about point 5 actually ... I've converted a few thousand lines of code over to ServiceStack, but obviously, because my model is more than 10 tables (or whatever the limit is), I can't test it. I'm actually seriously interested in at least spinning it up to see if that performance gain is really there (although I do have a query generation problem to solve), as OrmLite only solves some of the scenarios I have. Some points of discussion had on Stack Overflow ...
The key thing here is that, as the technical lead on my own stack, I should be able to pick the pieces that work for me (EF admittedly doesn't give me that, it's all of that half million lines or nothing - ish) ... but then, having picked my pieces, I should be able to build solutions around them as needed. When I talk about my ideal ORM, I currently don't think it exists, but then I'm very picky. Key features I would like to see in an ORM which would make building my own API easy are ...
That last point, alongside the lack of SQL generation from LINQ, is where I feel both Dapper and OrmLite fall short, but this is why I think they are sold as "micro-ORMs", at least in part: they deal with ONLY the problem of talking SQL to SQL servers. Consider this sort of query example ... // build the query
var query = new Query<T>()
.Where(...)
.Where(...)
.Select(...)
.Expand(...)
.ThenExpand(...)
.GroupBy(...)
.OrderBy(...)
.ToSql();
// then with Dapper I could do ...
var results = connection.Query<ResultType>(query).ToList(); ... the key thing to note about this example is that I'm mapping questions presented as OData parameters into this framework, basically allowing the user to build the query they want the API to run; but not only that, the base set is filtered by a preconfigured filter for T based on the user's access to rows in the DB, and then when they expand into the subsets those also have a filter applied to them; all of this is automatically injected into the query. Now I can see the response here ... "yeah, you can do all that with x-ORM" ... you're right, I can ... but I don't want to hand-crank all that functionality; EF already handles it for me, the only issue is that I have to take all of EF to get it. If I could take the query building, as a feature, and plug it into any ORM, then I'm free to choose to use OData on top of that if I so please. For CRUD on a single OData endpoint I only require a single generic controller; with a filter on the DB table (a one-off LINQ expression, a one-liner) I can filter the table for any user that logs in by applying my own "app role logic or whatever" to the table, and I'm done. Assuming I follow a convention, I would then have one controller / service, one context class representing my DB, and the simple POCO that represents the table (all things I have with ServiceStack) ... the key difference with the OData + EF stack is that if I want a new endpoint I simply add a new POCO and I'm done, full CRUD implemented "by convention". Is this slower than a hand-cranked query for each CRUD operation on every possible endpoint query? Yup. Do I care that it costs a few extra CPU cycles? Nope; servers are cheap to rent, cloud solutions architects and the dev teams to maintain complex codebases aren't. |
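A rough sketch of the "one controller, full CRUD by convention" pattern described above, as it might look with ASP.NET Core OData on top of EF; shown for a single entity for brevity rather than as a true generic controller. The AppDbContext, Order entity, and IUserRowFilter are hypothetical; ODataController and [EnableQuery] are the real OData building blocks (exact namespaces vary by package version).
// e.g. Microsoft.AspNet.OData in the 7.x packages used with ASP.NET Core.
public class OrdersController : ODataController
{
    private readonly AppDbContext _db;
    private readonly IUserRowFilter _filter;   // hypothetical: per-user row-level access rules

    public OrdersController(AppDbContext db, IUserRowFilter filter)
    {
        _db = db;
        _filter = filter;
    }

    // [EnableQuery] lets the caller compose $filter/$expand/$select/$orderby in the URL,
    // while the pre-applied Where keeps them inside the rows they are allowed to see.
    [EnableQuery]
    public IQueryable<Order> Get() => _db.Orders.Where(_filter.For<Order>(User));
}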
@mythz I'm not ranting, sorry if you feel that way ... I thought this was a friendly discussion. I'd also like to point out that I'm about 80% of the way through making the code work with ServiceStack, but I'm currently stuck on a few things (understanding this AutoQuery behaviour is actually one of them, so thanks for that). In addition to that, I'm constantly sharing the information you give me internally to get further feedback from my team. I'm not averse to dropping the M$ stack entirely (again, why I'm here); I just need to prove to the business that I can deliver the problem domain under the new stack without too much fallout, and in a timely AND cost-effective fashion (OData + EF is free, after all, which means something to a small business like us). The fact that I'm using OData here is not out of some misguided loyalty to it or to M$; it's more that it presents certain complex challenges to ORM solutions, so it acts as a good example of the worst cases that an ORM may need to face. Sure, the M$ implementation of OData is horrendously bad in places, but to conflate the implementation with the standard (which is what OData is) is outright wrong.
... it's wrong because they actively push the use of metadata-based descriptions for the entire model that you're exposing, and even go as far as defining that metadata's schema ...
... that's just an assertion, the very thing you're accusing me of.
OData is repeatedly compared to WCF SOAP services, for some reason, by both you and @pauldotknopf in his article (which I find odd, as they have literally nothing in common) EXCEPT ... SOAP had a WSDL description; you could use the tools to generate your client code. I go a step further and expose metadata relevant to each endpoint on the endpoint itself, to avoid the caller having to rely on a large blob of meta of which they only want a small portion (IMO this should be the standard). I could go on, but my "opinions" (however documented and fact-based they may be) regarding OData aren't wanted here. Again, the reason I pointed at OData was that it generates complex "real world questions" I have to build an API to answer. Your examples are interesting and do solve the problem in the event that I handle the question, or use AutoQuery to do this for me ... can AutoQuery do this with my own business logic in the middle, something like this (taking the OrmLite example) ... var q = db.From<Customer>()
.Where(...)
.Join<Customer, CustomerAddress>()
.Where(...)
.Join<Customer, Order>()
.Where(...)
.Where(x => x.CreatedDate >= new DateTime(2016,01,01))
.And<CustomerAddress>(x => x.Country == "Australia");
var results = db.SelectMulti<Customer, CustomerAddress, Order>(q);
foreach (var tuple in results)
{
Customer customer = tuple.Item1;
CustomerAddress custAddress = tuple.Item2;
Order custOrder = tuple.Item3;
} ... if I understand this correctly, this is the equivalent of a query that returns an expanded subset of properties too.
Essentially the reasoning here is that, from the user's information in the request (an auth token, for example), I have to filter the DB down to the stuff they can see in every table, then execute my question on what's left (a standard multi-tenancy issue, basically). From there the logic is only as complex as the user's question, which it sounds like AutoQuery might be able to limit to a problem domain that's already coded for, which is perfect! |
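A small sketch of the kind of injection being described, applied to the OrmLite query above: a helper (hypothetical, as are the TenantId columns and currentTenantId) stamps the caller's row-level filter onto each table before the user's actual question is composed on top.
// Hypothetical helper: apply the caller's row-level access rules to every table in the join.
SqlExpression<Customer> ApplyUserFilters(SqlExpression<Customer> q, int tenantId)
{
    return q.Where(c => c.TenantId == tenantId)
            .And<CustomerAddress>(a => a.TenantId == tenantId)
            .And<Order>(o => o.TenantId == tenantId);
}

var q = ApplyUserFilters(db.From<Customer>()
            .Join<Customer, CustomerAddress>()
            .Join<Customer, Order>(), currentTenantId)
        .Where(x => x.CreatedDate >= new DateTime(2016, 01, 01));   // the user's question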
@TehWardy quick question, since you are considering AutoQuery, I take it that the OData solution isn't deployed yet? You are still in the research phase? Is this a new solution? |
@pauldotknopf I have an existing solution implemented with an OData-based API layer. My issue generally isn't with OData; I think @mythz here has issues with "ugly URLs" in OData (not unreasonable, to be honest; if you look at them encoded they can appear pretty ugly). The background for my "problem domain": the reason we use OData is that it allows the client to specify the question they want to ask, instead of me stating to the client "these are the questions you can ask", which is key here. My issue is that I don't know that a particular join is something the client wants at the time I'm writing the code, and I don't want a support call to implement a new API method every time they have a new question to ask the API. I know you guys are highly against OData, but the key thing it offers is that, within the confines of the type safety defined by the contract metadata (which defines the typed sets that can be questioned), the user can build a question in a URL to conform to even the most complex of business scenarios; and yes, the nature of the question they ask CAN get complex, but it's on them to decide that, not me, and forcing them to only ask "pre-built questions" won't cut it. The issue is that our clients are Fortune 500 companies with big, complex, "poorly designed" systems like SAP implementations, and they are often constrained by having to work to a standard that that system implements, and OData is one of those standards. With the Netflix example, Netflix can decide how people communicate with them; with our platform we offer business services that connect between such systems, and we are forced to interact with those systems in the way that they support, so I'm not dealing with an "in an ideal world" scenario. With that in mind |
Or, in my case ... Believe it or not, I carry much the same ethos as you, but I'm often not in a position to make the "ideal choice" due to external concerns (as described above). I offset a ton of the concerns you have about "over-complexity" and "over-engineering" by putting all my business logic behind interfaces and using IoC, so whether I have OData controllers, WebAPI controllers, or ServiceStack services, I'm always insulated from that complexity; but it does make it tricky to answer some of the problems that such implementations introduce. I've also deliberately put EF behind an interface, and having migrated the stack onto ServiceStack I'm seeing that I did lapse in a couple of places where I exposed IQueryables when I should have exposed IEnumerables (that's on me to fix, and trivial to do so). That said, having taken your advice on board, my plan is to update the code until those "leaks" are plugged and then re-migrate the code; as it only took me a couple of days this time round, it shouldn't be too bad next time. I have another major demand on my time this week, but hopefully I can get to looking at that stuff next week. The upside is that my business logic architecture is entirely interface / IoC driven, so it's mostly a lift-and-shift operation (I have said I don't use that stuff like most people do). I do appreciate your advice @mythz, and you do make some great points, points that I intend to raise with Microsoft too in places, because ultimately you're right and it's on them to provide good advice for the technology stacks that so many use. I would still be interested in looking at ServiceStack, but the current feedback from our board of directors is the following ...
That last point is the one I've been trying to address here, of course, for the most part. |
I do happen to have that freedom, and as the technical lead here for everything we do, the board leans on my guidance to make its technology spending calls; generally speaking, the calls made are "because it's right", not out of some misguided impression that M$ puts out about how things should be done (hopefully I've shown you that much at least). Again, many thanks for the feedback; when I get back to it I'll definitely ping them an email. I'm actually curious to see how the two solutions work side by side, because as they say, "the proof is in the pudding" ... right! |
Just a comment here about the general attitude and tone I've seen in this discussion. Point to consider: if you happen to think a specific technology is more suitable, or disagree with someone on key points, try not to be arrogant and condescending about it. I'd say that's especially true if you're trying to promote a particular service or product, yet you resort to engaging other developers by basically insulting them:
Or this...
|
I completely agree @kakins. I'm trying to be polite and promote a strong technical discussion about the differences and why they exist; instead I'm just being told "accept it, drop some opinions you have and move away from what years of experience has shown you, because this is better", and that's not how a responsible architect delivers good architecture to a business at scale. Mythz and Paul are clearly very attached to the ServiceStack design and feel very strongly about it, which is commendable (standing by your creation), but there really is no need for the aggressive stance here; I'm not here to insult or knock anything. As it happens, I've got a wide variety of experience using different approaches to the "API layer problem" and the "N-tier stack problem"; having worked in the industry for 20 years I've seen a lot of stuff claimed as the "best answer, and anything else is simply broken due to ". Frameworks always die eventually and something better always comes along; no doubt at some point that will be the case with ServiceStack too. In order to shield myself from a particular stack's flaws I've essentially built a business layer that sits behind, and depends on, interfaces entirely, so the specifics of a particular stack / framework design don't really matter much to me; I just need an IoC/DI implementation to wrap it all up. That said, what I've learnt from this discussion is that both Mythz and Paul make some valid points and hold some misconceptions that they will defend to the very end, not accepting any information to the contrary, which makes it hard to get advice on how to fit complex scenarios into ServiceStack. Given the price tag on ServiceStack, it really doesn't matter how good or bad OData / EF are, only how well ServiceStack solves the problems I have already solved with those competing technologies. If you're gonna charge for something, then as a potential buyer I'm gonna be damn sure I'm putting my money into value-add for the business before I pull out the company credit card. For example ... Misconception: I have overcome this pitfall, oddly enough, by working not that differently to how AutoQuery works. That doesn't mean I disagree with the approach taken by ServiceStack; it just means I don't agree with the particular use case / implementation detail they have chosen to pick fault with, or the fault reasoning; I'm just looking for the parallels in order to make informed choices. Why I use OData + EF + .NET Core: I am in a unique position in that, unlike API providers like Netflix (as this seems to be the example used above), I have to provide an API layer that allows users to construct a UI, supporting a business process that they design, on top of a connected web of systems, with mine as the middleware platform that ties them all together. This means I have to consider problems like ...
What this means: if a user wishes to design a query that pulls data from 10 tables, and filters on 3 of them, as a flat set "source" for a datagrid in their UI, I can support that with 0 coding, 0 deploys, 0 changes to the system at all. That's the key point here: I can't roll out a deploy every time any client finds some new question they wish to ask the API, as I would be forever deploying. The bottom line: there's a key point I'm trying to get understood here, which boils down to ... regardless of people's opinions, there's a technical fact that comes with all this discussion which determines viability, and I would be negligent not to ask these awkward questions. I apologise if anything I have said has come across as offensive here. |
@TehWardy I don't think you've been offensive at all. You've laid out your case, and although I don't know the details of the problem you're trying to solve, you've described it well enough that I can understand the essence. In the flurry of words contained in this thread, I've seen two viewpoints presented:
That could be a simplistic over-generalization of the two views. However, I can at least envision how EF can help solve item 1, while micro-ORMs may have difficulty supporting it. On the other hand, for simpler scenarios like item 2, you could argue that EF is overkill. When I say "simpler" here, it is relative; I'm by no means suggesting that micro-ORMs are only for simple solutions. However, I'll admit I haven't spent time in ServiceStack, or done much work with micro-ORMs. But I have asked myself the same question as @TehWardy as I've looked at Dapper. I started writing a dynamic query builder using EF that, at least from what I could tell, would be considerably more difficult using Dapper. |
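A minimal sketch of why that kind of dynamic query builder leans on an IQueryable provider: predicates can be assembled as expression trees at runtime and the ORM (EF here) translates them into SQL. The extension method and the example entity are illustrative; only System.Linq.Expressions is assumed.
using System;
using System.Linq;
using System.Linq.Expressions;

public static class DynamicFilter
{
    // Builds x => x.<propertyName> == value at runtime and applies it to the query.
    public static IQueryable<T> WhereEquals<T>(this IQueryable<T> source, string propertyName, object value)
    {
        var parameter = Expression.Parameter(typeof(T), "x");
        var property = Expression.Property(parameter, propertyName);
        var constant = Expression.Constant(value);
        var body = Expression.Equal(property, Expression.Convert(constant, property.Type));
        var predicate = Expression.Lambda<Func<T, bool>>(body, parameter);
        return source.Where(predicate);   // an IQueryable provider such as EF translates this to SQL
    }
}
// e.g. db.Invoices.WhereEquals("Status", "Open").ToList();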
Exactly @kakins, it sounds like you understand the nature of my problem ... Putting all these pieces together is essentially a standard web stack, but some use more strict / restricted API layer capabilities under the guise of "complexity is bad", and I don't have that option due to the operational requirements of the problems I'm solving. We could make a case for not using expression trees and instead just manipulating strings sourced from the API layer's "query" and translating those directly to SQL, but having the expression tree in the middle gives us type safety in the business logic and an interception point that doesn't require things like reflection, which can be slow. I'd be really interested in a demo ServiceStack project that replicated some of the more complex capabilities of OData "by convention", avoiding the need to write out a lot of DTOs specific to each use case, but my gut feeling is that the point Mythz keeps coming back to about the key design elements in ServiceStack forces this, as the DTOs explicitly define the contract information. Sure, OData may be bad for some valid reasons, but it's a great way to point at complex API layer functionality and ask "can your API layer do this?" as a point of discussion; it's certainly not the holy grail, though. Things like aggregation or sub-selection don't appear to be possible as a user-defined scenario without me having to pre-define them in ServiceStack, which contradicts the "over-posting" and "over-responding" best practices I've come to like, for both security and perf reasons. Background for this: if I have a business object on my back end (a simple POCO) with 20 properties, and I need a result set with 10 of them, filtered on some child tables' values, with a system-derived business rule injected into the query for each table hit, I'm not losing any type safety by asking for only those 10. This is a common problem in API layers; Netflix, for example, assumes I want all of the fields for a movie, and I have no choice but to request them all; I can't just ask for a subset, like say a key and a name, if I'm building a drop-down list of them. In my API layer, 1MB responses could turn into 10MB responses. OData, whilst ugly, does at least give me the flexibility to define exactly what I want it to do on the back end and exactly what I want it to pull from the DB for me. |
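A small sketch of the over-responding point above: with an IQueryable-based ORM the projection is pushed into the generated SQL, so only the requested columns leave the database. The db context, Movie entity, injected tenant rule, and searchText are hypothetical; the Where/Select composition is standard LINQ-to-entities.
// Only the requested columns are selected, and the per-table business rule is
// injected ahead of the caller's own filter; the projection keeps type safety.
var lookupItems = db.Movies
    .Where(m => m.TenantId == currentTenantId)     // system-derived rule for this table
    .Where(m => m.Title.StartsWith(searchText))    // the caller's question
    .Select(m => new { m.Id, m.Title })            // 2 of 20 properties, enough for a drop-down
    .ToList();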
https://pknopf.com/post/2019-09-22-the-argument-against-entity-framework-and-for-micro-orms/