Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[8.x] Adds new RefreshDatabaseLazily testing trait #38861

Merged
merged 13 commits into from
Sep 21, 2021
Merged

[8.x] Adds new RefreshDatabaseLazily testing trait #38861

merged 13 commits into from
Sep 21, 2021

Conversation

lukeraymonddowning
Copy link
Contributor

@lukeraymonddowning lukeraymonddowning commented Sep 18, 2021

Hi guys,

Hope you're well! Lemme break down this PR.

Background

When testing, we often add RefreshDatabase to our tests to migrate the database between tests. This is super useful but comes with a speed penalty. Because it is frustrating to have to add the RefreshDatabase trait to every TestCase that uses it, developers often just drop it on the base TestCase in PhpUnit or use something like uses(RefreshDatabase::class)->in('Feature'); in Pest PHP. This means that tests that have no need to interact with the database will still run migrations.

To quote Caleb Porzio: 'No Bueno.'

Solution

This PR adds a new drop-in replacement for RefreshDatabase: RefreshDatabaseLazily. By utilising a new QueryExecuting event bundled in with this PR, the test is able to wait until the database is actually required before running the migrations. This means that if a test has no need for the database, migrations will never be run.

The impact of this PR is that you can drop this trait on your base TestCase with 0 consequences. The new QueryExecuting event also allows for users with more complex database requirements to implement similar functionality in their test setup manually.

Why not just alter RefreshDatabase directly?

It is entirely possible that developers make use of Laravel's migrations, but then interact with the database through raw PDO statements. This would lead to the QueryExecuting event never being fired, and the migrations never being run.

As such, allowing developers to opt in to this feature if their application solely makes use of Laravel's DB tools allows for a non-breaking change. I feel the majority of devs will be able to start using this trait immediately, but better safe than sorry.

As always, I want to say a huge thank you for all of the hard work you continue to put into maintaining and building upon the Laravel framework. I am eternally grateful.

Kind Regards,
Luke

Edit: in order to alleviate concerns about performance caused by the QueryExecuting event, I have removed it in favour of a more simplistic and streamlined callback array. This allows for the same functionality without the overhead.

@taylorotwell
Copy link
Member

Thanks. What are the before / after benchmarks on some real world scenarios?

@lukeraymonddowning
Copy link
Contributor Author

Obviously greatly depends on the application; how many non-database tests vs database tests.

In an application with ~400 tests, where around 90% of those tests use the database, I went from 24s to 21s.

@timrspratt
Copy link
Contributor

timrspratt commented Sep 18, 2021

Isn't the concern around the emitting of the event on every query called during the application's lifecycle and not directly with the test performance? If an application performed 100 database queries on a request (arguably poor design but you never know!), would the QueryExecuting overhead be noticed in the response time? Even if negligible, those on AWS Lambda (or other serverless environments) would be impacted by a cost increase?

Would it be safer to only emit the event when in the testing environment to begin with?

@fgaroby
Copy link
Contributor

fgaroby commented Sep 18, 2021

What happens if a test only runs SELECT queries (no INSERT/UPDATE/DELETE queries in this test) ? Will the database still be refreshed ?

@lukeraymonddowning
Copy link
Contributor Author

@taylorotwell here you can see another application, this one using MySQL rather than SQLite in memory and the saving is pretty drastic...

Screenshot 2021-09-18 at 22 48 23

Screenshot 2021-09-18 at 22 49 11

@lukeraymonddowning
Copy link
Contributor Author

@windu2b SELECT statements still run through the run method, so the migrations will be run 👍

@lukeraymonddowning
Copy link
Contributor Author

@timrspratt thanks for the feedback. I've done some testing on this, and with no changes, the event call takes between 0 and 0.1 milliseconds. In honesty, I don't know enough about lambda costs to know how much that would impact costs. I would certainly assume that there would be much larger impacts on costs, such as multiple or inefficient database queries as you pointed out.

Would be good to get some opinions on this for sure.

@garygreen
Copy link
Contributor

garygreen commented Sep 18, 2021

The impact of this PR is that you can drop this trait on your base TestCase with 0 consequences

Creating a QueryExecuting object every single time a query is executed just to satisfy this performance increase in tests will have an impact slowing down your main app and increase memory usage.

Could there be a ConnectionStarted event instead? That would fire only once when connection is started and when an anticipated query is excepted to be run?

@lukeraymonddowning
Copy link
Contributor Author

lukeraymonddowning commented Sep 18, 2021

@garygreen ideally we'd link into the connection booting, but that happens regardless of whether a query will be run, rendering it a little useless.

Any ideas? 🤔

I'll take another look tomorrow to see if we can reduce the event count whilst keeping it clean.

@garygreen
Copy link
Contributor

garygreen commented Sep 19, 2021

ideally we'd link into the connection booting, but that happens regardless of whether a query will be run

I haven't checked, but that sounds a bit strange if it pre-emptively connects. Briefly looking at the code it creates the PDO object on demand when needed, and so should only create the connection when it's needed...

@lukeraymonddowning
Copy link
Contributor Author

@garygreen found the cause.

  1. Create the most basic test possible that doesn't hit the database.
  2. Add a dd call at the top of the make method in ConnectionFactory.
  3. Run the test. You'll hit the dd.

The cause is the ParallelTestingServiceProvider, which boots and registers regardless of whether or not you're running parallel or series. This provider makes a call to DB::connection, which invokes the make method on the ConnectionFactory.

So whilst Laravel wouldn't attempt a DB connection early, tests always will. The ConnectionFactory and DatabaseManager are both singletons.

I've switched out event calling for another solution, which I think provides the best of both worlds.

@fgaroby
Copy link
Contributor

fgaroby commented Sep 19, 2021

@windu2b SELECT statements still run through the run method, so the migrations will be run +1

But I think it's useless to refresh the database, if there are only SELECT queries, isn't it ?

@lukeraymonddowning
Copy link
Contributor Author

@windu2b in an ideal world, yes. However, if a developer has used a select statement and no database exists, it will throw an exception, which would break a previously passing test.

@GrahamCampbell
Copy link
Member

Isn't the root problem that you are adding that trait to tests that don't use the db? Can't you just not do that?

@lukeraymonddowning
Copy link
Contributor Author

lukeraymonddowning commented Sep 19, 2021

@GrahamCampbell up to this point, that's exactly what I have done in my applications. However, it's annoying to have to keep track of this and muddies up the test cases.

In my opinion (and I dare say in the opinion of a large percentage of the community), this removes complexity and makes it easier to maintain a fast test suite.

If we start saying "can't you just not do that" about things, much of Laravel wouldn't exist.

I really appreciate your insights and contributions by the way; just stating my case.

@lukeraymonddowning lukeraymonddowning changed the title [8.x] Adds new RefreshDatabaseLazily testing trait and QueryExecuting event [8.x] Adds new RefreshDatabaseLazily testing trait Sep 19, 2021
@alexmartinfr
Copy link

@GrahamCampbell I think this will help reduce the cognitive load for newcomers to testing.
I don't know if you recall when you wrote your first tests, but there is already A LOT of things to take into account.

In my opinion, this is the kind of change that will reduce friction and help more and more people have a good experience with testing 👍

@GrahamCampbell
Copy link
Member

GrahamCampbell commented Sep 19, 2021

I am just trying to decide if encouraging people to jam all the traits into their tests should be considered a bad practice. We don't encourage people to install every package on packagist in case they need to use them. What's so wrong with having to say "run this test with db migrations"? It'll make people think, "do I really need to hit the db for this test?". Lots of people have really slow test suites because they are writing "unit" tests that unnecessarily boot the framework and do all kinds of stuff.


And, yes, I'm playing Devil's advocate here. :trollface:

@garygreen
Copy link
Contributor

garygreen commented Sep 19, 2021

I've personally only added the RefreshDatabase trait it in feature places where I know there will be db tests being done. I didn't realise it was common practise for developers to be adding it in the base TestCase class!

So benefit of this PR will come at a per-test level where it only refreshes when it knows a database query is being run. Still seems like a win if it can be done in a way that doesn't impact main app.

The new callbacks are ok, but I would prefer to see a general one time ConnectionStarted event rather than being able to register callbacks when booting a query (as I'm not sure what other use cases there are for that?). I'm concerned that might encourage naughty packages to add callbacks and slow down queries (and you wouldn't really know why...)

It would be ideal to listen for when database was connected and then assume it's needed. If something in Laravel is connecting too pre-emptively then that should be addressed?

@lukeraymonddowning
Copy link
Contributor Author

lukeraymonddowning commented Sep 19, 2021

@garygreen how would you handle parallel testing? Laravel has to make an early connection in order to build the required databases in preparation for running across processes.

Addressing your package concern, any package can already spam DB::listen and slow down every query on the other side, no? 🧐

@garygreen
Copy link
Contributor

Laravel has to make an early connection in order to build the required databases in preparation for running across processes

That's the main purpose of this PR, to hook into some mechanism that defers refreshing the DB? Curious to know why does Laravel have to make an "early connection" ? That may well be the case at the moment so wonder if that process can be deferred - run makes a call to reconnectIfMissingConnection which makes it sound like the connection happens only when something is going to happen on the DB, i.e. it's intended to be deferred?

@nuernbergerA
Copy link
Contributor

@lukeraymonddowning
Maybe split the PR into the callback on run and the trait testing stuff
because we could easily implement the trait in userland, but not the hook

@taylorotwell
Copy link
Member

@lukeraymonddowning were the benchmarks you shared on this feature on a real-world application or specifically crafted for this PR?

In my own applications I don't think I would see any benefit from this trait because every feature I test that hits controllers in my application is going to authenticate a user which is going to run a database query. It's actually probably pretty rare for me to write a feature test that does not interact with the database at all.

@nuernbergerA
Copy link
Contributor

@lukeraymonddowning were the benchmarks you shared on this feature on a real-world application or specifically crafted for this PR?

In my own applications I don't think I would see any benefit from this trait because every feature I test that hits controllers in my application is going to authenticate a user which is going to run a database query. It's actually probably pretty rare for me to write a feature test that does not interact with the database at all.

That's true for my apps too. Even a unit test that involves a model uses the DB in most cases.

Anyways it would be cool to have the event in the framework but the trait in a sperate package aka user-land.

@lukeraymonddowning
Copy link
Contributor Author

lukeraymonddowning commented Sep 20, 2021

@taylorotwell this was a real world application that we are currently in the process of building. No tweaks were made to the code other than switching out RefreshDatabase for RefreshDatabaseLazily.

A more detailed breakdown of the tests:

  • 438 tests
  • 362 with DB
  • 76 without DB

Examples of tests in the project that don't touch the DB include:

  • Checking guest/policy access
  • Tests where we can use make instead of create on the factory when using $this->be
  • Livewire tests

Can't be super specific on the nature of the tests due to NDAs on the project, but they're pretty typical tests.

@taylorotwell
Copy link
Member

@lukeraymonddowning and this application is the one where you saw improvement from 24 seconds to 21 seconds?

@lukeraymonddowning
Copy link
Contributor Author

That's the one 👍 the second example is quite light on the database as it stores the majority of its state in an external API.

@taylorotwell taylorotwell merged commit 3d1ead4 into laravel:8.x Sep 21, 2021
@taylorotwell
Copy link
Member

Renamed to LazilyRefreshDatabase. Thanks.

@lloricode
Copy link
Contributor

excited to release this

victorvilella pushed a commit to cdsistemas/framework that referenced this pull request Oct 12, 2021
* Adds a new `LazyRefreshDatabase` testing trait and a `QueryExecuting` database event.

* Refactor

* Refactor

* Event test

* Naming change

* Switches out event for an array of callbacks

* Refactor

* Refactor

* Refactor

* Refactor

* Refactor

* formatting

* fix doc block

Co-authored-by: Taylor Otwell <taylorotwell@gmail.com>
@lukeraymonddowning lukeraymonddowning deleted the lazy-database-testing branch April 17, 2023 11:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants