Use factories and not fixtures #11

Open
andreareginato opened this Issue Oct 3, 2012 · 31 comments

Projects

None yet
@andreareginato
Collaborator

Write your thoughts about the "use factories and not fixtures" best practice.

@adimichele

I think you should check out fixture_builder and reconsider this one. The basic idea is to auto-generate fixtures (using factories or whatever) at the beginning of the test suite and only when necessary. I've gotten huge test performance improvements using this method with a transaction-based db clean strategy.

@myronmarston

I'm not a fan of the binary good/bad statement here. There are good reasons not to use factories.

Really, the best practice is to use neither as much as possible: instead put as much of your domain logic in PORO that can be tested without needing complex, time consuming setup with either factories or fixtures. You'll need some tests that use DB records of course, and either factories or fixtures are valid tools to use there.

@Spaceghost

I don't use either in unit tests. I tend to just build objects using the actual classes. I've used fixtures, and factories. My preferential factory gem is Fabrication. I prefer the speed, clarity, and simple communication of just using the objects I need directly in unit tests.

If I'm doing functional testing, I'll definitely mix in some fabrication. The ease of using a factory in that case is apparent.

@robsaunders

One caveat is to watch out you don't build up complex associations in your factory files - as this gives them the same downsides as fixtures. Structure should always be built up within the test itself.

Having complex structure in factory objects to DRY your tests obscures the object structure, adds unnecessary associations for tests that don't use them, and creates problems when you need a test which changes one small association or value, and all your other tests fail. It's an example of bad DRY.

@UncleGene

While talking about fixtures, bad example shows plain object creation, not fixture. As comparison of fixtures and factories may be controversial, plain creation vs. factories has no contenders. Perhaps it makes sense to rename?

@dgm
dgm commented Oct 7, 2012

Factories fall apart when your application logic requires complex and numerous relationships set up in order to test - it makes the tests SLOW. I think there is much to be gained by setting up fixtures for a base system that has most of the relationships in place, and then modifying that as needed for specific tests, whether with factories or basic object creation.

I don't know why everyone says fixtures are hard to maintain. When your data gets complex, factories are equally hard to maintain. When dealing with over 50 models (government reporting here, it cannot be less complex), you have to put them in different files, and using factories means yet another DSL syntax to learn. YAML isn't that hard.

What is really missing is better tools to help manage those fixture files. There should be an easy way to load up the test db fixtures, edit them, save, and re-run tests to make sure you didn't break anything. (You say that's brittle? The same thing is possible by editing factory definitions. You deal with it by running tests.)

I like FactoryGirl, but only as an addition to the standard fixtures, to create border cases, test invalid data, etc.

@deanius
deanius commented Oct 18, 2012

Where are the upvote options ?? dgm - amen. YAML files are just data - a business person can write them, a QA - anyone. Factories are executable code. Executable code is a poor serialization of an object graph. For my money, fixtures solve that problem far more elegantly than whatever Factory-Gem-Of-The-Day.

@Spaceghost

I, for one, love the concept of VCR. Perhaps a nice factory2fixture gem could be made? Something like that?

@pepe
pepe commented Oct 18, 2012

+1 @robsaunders @myronmarston

The trick is not to abuse anything, and for me abusing the fixtures is just little bit easier.

And when cucumbering, good factories are pure gold.

@dgm
dgm commented Oct 18, 2012

I will concede that factories are nice for cucumber, but that demands to ask if cucumber is worth anything. In all my experience, it is just extra steps, because only programmers write it, and it just adds extra steps and extra knowledge to the process. Cucumber only makes sense if you have QA, managers, or clients writing the stories... and even then I suspect programmers have to tweak it. I can take the same conversation with a client, write the story in pseudocode, and then program rspec to do the same thing with capybara and save the extra steps of regex parsing.

As I said before, fixtures are great for the base system, when there are a lot of dependencies due to the complexity of the domain model. Let the fixtures define the basic setup - almost a seed like structure, but a little more. Then use factories/object creation to define the corner cases.

Properly named fixtures help define the components that you want to test for, and proper tests know how to exclude the extra data (which is something that you should be testing for!). If the presence of additional data is throwing off your test, or if changing a fixture or factory breaks tests, you need to evaluate those tests and set them up to not be so brittle.

For example, if a test currently checks that there are 4 comments to a post after completing an addition, and you add a comment in the fixtures or factories, of course that breaks the test. The test should instead query the number before the action, and then verify that the action added 1 comment. This insulates your tests from changes in the fixtures OR factories.

@robsaunders says: "Structure should always be built up within the test itself." Well, that is nice until you have 500 tests (only 500?) that all need to build up 30 relationships before each test can run. Personally, I hate waiting four an hour for tests to finish. Mocking and stubbing don't help at all, because it is the very nature of these relationships that need the most testing. Fixtures and transactional tests take a lot of pain out the equation. I can depend on the base system and add 1-3 objects per test and achieve a huge speed boost.

And again, what is truly missing is proper tools to manage those fixtures - not just a dump of the test database, but to facilitate proper naming of the various models and organize it. But in the end, it is just editing text files, and yaml is no more difficult to understand than factory girl's DSL; in fact, I think yaml is far easier to understand!

Coming back around to cucumber: If you have decently named factories, I think my project managers would better understand the fixtures, because they could say: (I'm going to alter the syntax a little, this isn't necessarily cucumber)

With child_x_with_abc_registration
Sign up for y_class at x_site
x_site should show registration for child_x in y_class

My project managers have a great memory for the existence of predefined test data like child_x and abc_registration. This would make a lot of sense to them and doesn't require factories. It also doesn't require cucumber. That little pseudocode can be converted to rspec that is readable by managers, even if they cannot create it directly.

For me, tests that run in a decent amount of time, and not wasting time on unecessary programming steps, is pure gold.

@mpalmer
mpalmer commented Mar 3, 2013

I'd like to see a definition of "difficult to control". I have no idea what that means.

@andreareginato
Collaborator

I've added a section where I describe the fact that if possible we should avoid both fixtures and factories.
Any correction and comment to the updated guideline is appreciated.

@mpalmer
mpalmer commented Mar 10, 2013

I've read that "why I don't like factory girl" article before, and it has never felt "whole" to me, because it doesn't give any concrete way to do it better. So in some ways, this point has gotten worse for me now. I still don't understand why fixtures are "difficult to control" -- they've never seemed particularly painful to me (although I will concede that may be Stockholm Syndrome talking). In addition to that, I'm now being pointed to an article that says that factories are a bad idea too, but doesn't give me any way for me to tell if I'm suffering from the same problems the author of that article is worried about, and even if I am, they don't give any suggestions for exactly how to solve those problems.

A solid, real-world-applicable explanation of what we should be doing instead of both factories and fixtures is sorely needed, I think. I don't know of any relevant articles on the subject, though. Perhaps it still needs to be written...

@robsaunders

It is possible to mix traits together to create a similar behaviour to fixture-factories, but having the dependency lie on the test-side rather than the fixture side. It is more limited and less flexible, but the overall force here is to do the right thing rather than the easy thing. When it comes down to it, putting structure into your factory creation is the equivalent of writing fixtures and you WILL end up with a messy pile of unmaintainable tests.

As a slight panacea you could, for example, use traits and do something like this:

create(:user, :with_posts)

It is difficult to then go ahead and create sub-associations (:with_comments_on_posts for example), but this limitation is a good thing - it prevents you from having crazy levels of sub-associations within your tests. And if you need sub associations, you write them in your test setup. This is in no way slower, in fact it's slower creating a pile of unused associative objects EVERY time you call your 'monster factory create' function.

@dgm if you have 500+ tests that use crazy levels of associations, you're doing it wrong. The bulk of your tests should be testing models (unit) or controllers (functional) where you really don't want to be testing a complete set of associations at once. The tests that you do have this kind of setup (integration tests), you should be confident that the associations work and really just be testing behaviour. It takes time to discover this, but the more complex your test setup, the more you end up testing nothing, because you can't account for the trillions of different edge cases and permutations that occur. Setting up tests this way also reduces the time it takes your suite to run (as you have fewer slow and bulky integration tests).

The complexity of the above paragraph probably requires giving a whole talk on the subject and there are a few good ones out there on this topic.

@mpalmer hopefully this answers your question - it's not a matter of finding a 'better way to do the wrong thing', it's a matter of understanding why building associations in a global way is bad, and how you can write more efficient, more specific tests.

@warmwaffles

@robsaunders I totally agree. I see it as a sign of test and code smell when I start seeing crazy levels of associations needing to be done in order to test one thing. Which a lot of times the solution is to stub the crap out of that model and make sure the boundaries of that method are right.

@zamith
zamith commented Apr 18, 2013

I prefer to use build instead of create as much as possible, as not hitting the db speeds up tests greatly. If you test behaviour instead of state, this should not be a problem as you don't want to test ActiveRecord or DataMapper or whatever.

@andreareginato
Collaborator

@mpalmer, @robsaunders would you like to send a pull request better describing this point? It feels like this is the weakest point in all betterspecs and I want to fix it.

@Spaceghost

I disagree that one should only use fixtures.

There are many reasons why you'd use both, and it's a bit presumptuous to
lay a blanket statement like that down.
I can't even decide if 'prefer factories' is a good way forward either.
Things like VCR which record data you're using to test against are almost
like a hybrid. Dynamically generating data and saving it as a fixture.

I don't believe this is an either/or situation, it seems to me that it's a
question of when and how much.

~Johnneylee

On Thu, May 16, 2013 at 6:47 AM, Andrea Reginato
notifications@github.comwrote:

@mpalmer https://github.com/mpalmer, @robsaundershttps://github.com/robsaunderswould you like to send a pull request better describing this point? It
feels like this is the weakest point in all betterspecs and I want to fix
it.


Reply to this email directly or view it on GitHubhttps://github.com/andreareginato/betterspecs/issues/11#issuecomment-17993971
.

@mpalmer
mpalmer commented May 16, 2013

@andreareginato I don't really have enough knowledge to be able to write a pull request. In my opinion, the point should probably just be removed entirely.

@robsaunders I'm afraid that doesn't answer my question at all -- I still lack "understanding why building associations in a global way is bad". I see some furious hand-waving, but not a lot of concrete example. A simple "here is what happens when you use fixtures, and here is how factories make it all better" would be nice.

@dgm
dgm commented May 16, 2013

I don't think fixtures are necessarily bad, but their current implementation could be better.

Right now I'm working on a report that requires a lot of data in the database in order to fully test - it takes 5 minutes for the factories to whip up the test data. As a result, all the assertions are done in one massive test, as no one wants to wait for it to regenerate.

It would be much better if there was a way to use the factories to generate the data, and then snapshot the whole database into a cached sql file; then the test case could use that to load the database directly in a before section, and use transactions to roll back between tests that are augmented by further factory calls.

This would be a hybrid between fixtures and factories - but I would have to agree that using yaml fixtures would be painful... What I want is a VCR like gem for databases.

The reason there is no pull request for this, as far as I am concerned, is because there is still no truly good solution yet.

@warmwaffles

@dgm if you are using SQLite3 for your database, there is an in memory option. I have never gotten it to work but I've seen it done and it can speed tests up. I know it really doesn't fix the problem but it is definitely something to look in to in order to speed up your tests.

@mpalmer
mpalmer commented May 16, 2013

In-memory sqlite3 is awesome for testing. I use it in all my projects. I'm not sure it's an argument either for or against fixtures or factories, though -- it'll just make everything database-related significantly faster.

@warmwaffles

well he was arguing that the build up time for his tests with factories was
causing him to consider using fixtures and possible database switching.

just curious, do you have an example sqlite3 memory example I can look at?

@maurogeorge

You guys think is a good pattern use the FactoryGirl::Syntax::Methods to wirite only create(:user) over FactoryGirl.create(:user)? I use as default on all projects the short syntax.

@marnen
marnen commented Jan 2, 2014

@robsaunders:

One caveat is to watch out you don't build up complex associations in your factory files - as this gives them the same downsides as fixtures. Structure should always be built up within the test itself.

I completely disagree. A major purpose of factories is to be able to build complex associations easily—otherwise you'd just use User.new instead of FactoryGirl.create :user. This does not have the same downside as fixtures, because the object that you care about is specified and created right in your test (sure, associated objects are created too, but you can ignore them). Fixtures get less useful as your associations get more complex, while factories actually get more useful. Not using associations in your factories removes most of the benefit of using factories at all.

How do you test with complex associations, then? It seems to me you'd either have to build up the associations in your tests (making your tests ugly) or mock a lot (making your tests implementation-coupled). Is there a third way that I'm missing?

@mpalmer:

In-memory sqlite3 is awesome for testing. I use it in all my projects.

Careful! SQLite is single-user and therefore not suitable for production. If you're using it for testing, then that means you're testing on a different DB from production, and your test environment may behave significantly differently from your dev environment: although ActiveRecord does a good job of abstracting differences between DBs, it cannot do so completely, and SQLite is just plain missing a lot of useful features (such as foreign key constraints and true booleans), so that testing any part of your app involving those features will be automatically incorrect.

IMHO, for any but the most trivial uses of the database, this difference is too dangerous to bear, so I'm willing to live with the lower speed of my tests actually talking to Postgres—because then I know they're correct for my production environment.

@robsaunders

@marnen:

If that's how you feel about factories, you may be abusing them. What you are doing is pushing complex and brittle structure that actually BELONGS to the test, into another, communal area. I may have misunderstood you though, so if I did this is not relevant, but I am assuming you mean to put 'helpers' to build up associated data (complex structures of models with specific sets of children) inside your factory files, which are used by multiple tests at once, rather than in the model test file itself.

It's fine in a small project with a few people who all respect the code. But more than a handful of devs and tests and you run into chaos quickly. This isn't about 'pretty' code or convenient short term code. This is about robust code that works in the long run, and that is about modularising areas so you have fewer moving parts (areas of extreme churn) and less functions that start to look like 'The Blob', which are changed every time you need a little bit more functionality from new consumers of said function and potentially breaking other consumers in the process.

I think it does take being in projects where this is a problem to truly respect how bad it can get, I am always on the look out for this anti-pattern and the single biggest one I find is mis-using factories.

In my opinion (I do understand that other people differ) factories provide the following benefits:

  • They provide a nice syntactic sugar layer for quickly creating models in your tests
  • They allow for 'default' values for your objects so you don't have to worry so much about validations which are a cause of brittleness themselves.
  • They allow you to focus purely on what the test itself is testing (for example, you can create a 'default' object and each test unit can focus on values for its area knowing the other values are correct). This abstraction is probably the main value of factories.
  • They reduce bugs by separating off what is considered a valid model (eg. when you change something about the model so it is no longer valid for your tests, the factory is often the first thing that breaks, so you know your model is failing correctly, it's a good 'first place' to look when you're unsure if your model is valid).

This is why I don't just use Model.new instead of factories, and I think these are sensible benefits rather than shortcuts that satiate short-term developer laziness but create long term problems.

@marnen
marnen commented Jan 3, 2014

@robsaunders Thanks very much for a thoughtful reply! I'm always trying to make my testing practices better.

What you are doing is pushing complex and brittle structure that actually BELONGS to the test, into another, communal area.

I'm not sure that the complex structure belongs explicitly in a unit test. Perhaps it does, but...

# user.rb
class User
  validates_presence_of :account_id
  validates_presence_of :preferred_language_id
  validates_presence_of 10 other things
  ...
end

#user_spec.rb
describe User do
  describe "#full_name" do
    it "should return the first and last names" do
      User.new(first_name: 'John', last_name: 'Smith').full_name.should == 'John Smith'
    end
  end
end

The unit spec will fail, because all the required attributes and associations have to be set up even though they're not relevant to this spec. So you either have to build them all explicitly (ugly), mock them out (ugly and risky), or put them in a factory. I'll take the factory every time.

I may have misunderstood you though, so if I did this is not relevant, but I am assuming you mean to put 'helpers' to build up associated data (complex structures of models with specific sets of children) inside your factory files, which are used by multiple tests at once, rather than in the model test file itself.

Certainly not. I've never heard of this helper technique. My factory file would be

Factory.define :user do
  association :account
  association :preferred_language
  # 10 more associations and default fields
end

So when I do FactoryGirl.create :user, I get all the associations needed to make a valid User, but I don't have to clutter up specs with them where they're not relevant.

@robsaunders

@marnen I may have misunderstood you. The kinds of functions I am talking about are things like functions called create_user with crazy levels of arguments to define how the user factory instance is made. Defining single-level associations within a factory is ok but not great (you're making much more database-heavy work than you need, and kind of starts to break the unit tests reason for being a unit test, but there are some situations where, without it you don't have a valid factory). I think as long as you're only doing the minimal work needed to create a valid factory, you won't run into what equates to a 'god function' (the equivalent of the god model antipattern).

@marnen
marnen commented Jan 3, 2014

@robsaunders:

The kinds of functions I am talking about are things like functions called create_user with crazy levels of arguments to define how the user factory instance is made.

People do that?!? That really defeats the point of factories. I could see doing that if you weren't using factories, but.

Defining single-level associations within a factory is ok but not great (you're making much more database-heavy work than you need, and kind of starts to break the unit tests reason for being a unit test, but there are some situations where, without it you don't have a valid factory).

Rails unit tests hit the DB. I understand the theoretical arguments for not having them do so, but so far, I haven't found a good way of making that convenient in practice. Even if you use something like NullDB, the presence or absence of the DB starts to leak into the tests, even though it should be a private implementation detail.

That being the case, then, I'm perfectly happy to have as many associations as necessary in my factories as necessary for a valid record of the class I care about. If that means that FactoryGirl.create :user makes 15 associated records, I really don't care very much. I'll use as many levels of association in my factories as I have to in order to make my test code easy to read.

Do you do otherwise? If so, how do you make it convenient?

I think as long as you're only doing the minimal work needed to create a valid factory,

That's my practice. In a complex application, that can still be a lot of work, but it's necessary.

you won't run into what equates to a 'god function' (the equivalent of the god model antipattern).

What do you consider to be a "god function" here?

@nruth
nruth commented Jun 7, 2016

The kinds of functions I am talking about are things like functions called create_user with crazy levels of arguments to define how the user factory instance is made

Yes this does happen, e.g. wrapping the must-be-in-one-transaction-so-cant-be-belongs-to shortcoming of old tools. Anyway, it seems the document should be updated to say "Don't use Rails helpers spec/support helpers", instead of don't use fixtures.

In fact, for complex setup cases fixtures sound pretty good and I'm going to give them a spin for climbing out of a 4-7 year old hole of overlapping helper mess & slow factories.

Beyond that it'd mean showing sustainable ways to use both approaches, which is probably no good for a pithy guide like this.

@marnen
marnen commented Jun 7, 2016

@nruth That article you linked to drives me crazy, because it seems to go out of its way to show some of the ways that fixtures suck, but then unaccountably says "oh, they're not so bad"!

They are so bad; they encourage inflexible test data that's overly separated from your test functions. What you probably want instead is more factory templates to easily cover different cases.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment