Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Slow performance of findAll (creating an unneccesary hash of whole recordset?) #806

Closed
andybellenie opened this issue Jun 22, 2017 · 20 comments
Assignees
Milestone

Comments

@andybellenie
Copy link
Contributor

Strange one...

In /model/read.cfm there is this on line 232:

request.wheels[$hashedKey(local.findAll.query)] = variables.wheels.class.modelName; // place an identifer in request scope so we can reference this query when passed in to view functions

I don't understand what this code is for. The end result is Wheels creates a hash of the entire recordset which is really slow when dealing with thousands of rows (e.g. an export).

I'm baffled by this code. How would you even reference this variable later?

I've commented this out on one system with no adverse effects. Any ideas?

@chapmandu
Copy link
Contributor

I believe it's purpose is to prevent identical queries running more than one per request (if reload=true argument is not used)

My understanding is that the sql is hashed, not the resulting recordset.

@andybellenie
Copy link
Contributor Author

Cached queries are stored a couple of lines above:

request.wheels[variables.wheels.class.modelName][local.queryKey] = local.findAll; // <- store in request cache so we never run the exact same query twice in the same request

For that you are correct, it's a hash of the arguments used to generate the query. The line I'm referring to is separate and definitely hashes the whole query. It's bizarre.

@andybellenie
Copy link
Contributor Author

@perdjurner any ideas?

@perdjurner
Copy link
Contributor

This is where it's used: https://github.com/cfwheels/cfwheels/blob/master/wheels/controller/rendering.cfm#L268

I think that when you render or include a partial, and pass in a query to it, it's necessary to know the model name. If the query is created in the User model for example, Wheels will look for a file named _user.cfm. More info here: https://guides.cfwheels.org/v1.4/docs/partials#section-using-partials-with-a-query

@andybellenie
Copy link
Contributor Author

Any reason why we couldn't use the column list instead of the whole query? This is a surprisingly big performance hit. With a 100 row findAll() I'm seeing a fivefold increase. Bigger queries are even worse.

@perdjurner
Copy link
Contributor

perdjurner commented Jun 25, 2017

I think we need to uniquely identify the query results so using just the column list wouldn't work.

What about improving performance of the function itself (the one that hashes the query)?

@andybellenie
Copy link
Contributor Author

The core problem is that the performance decreases with each additional row of data to serialise and I don't see how we could optimise that away. I think we need a different approach. This is happening even if reload=true is specified.

@perdjurner
Copy link
Contributor

I'll see if I can write some tests that cover all the different scenarios that this needs to work in. Then we can try other approaches as long as the tests keep passing.

@perdjurner perdjurner self-assigned this Jun 25, 2017
@perdjurner perdjurner added this to the 2.0.0 Beta 2 milestone Jun 25, 2017
@andybellenie
Copy link
Contributor Author

How about making the partial argument required when using a query? I realise it's a breaking change.

@perdjurner
Copy link
Contributor

Yes, I suppose we may have to if the performance is bad and we can't find another solution.

@perdjurner
Copy link
Contributor

perdjurner commented Jul 17, 2017

I won't have time for this for 2.0 so I'm removing the milestone (unless anyone else wants to have a look).

@perdjurner perdjurner removed this from the 2.0.0 Beta 2 milestone Jul 17, 2017
@andybellenie
Copy link
Contributor Author

I can have a go

@perdjurner perdjurner added this to the 2.0.0 Beta 2 milestone Jul 17, 2017
@perdjurner
Copy link
Contributor

@andybellenie Looking at releasing soon, any progress on this or should we bump it until later?

@perdjurner perdjurner modified the milestones: 2.0.0 Beta 2, 2.0.0 RC 1 Aug 5, 2017
@andybellenie
Copy link
Contributor Author

andybellenie commented Aug 9, 2017

I have no other suggestions except make the partial argument required. Looking at the code when using a query based partial Wheels has to hash the recordset twice, once to create the key, and again for lookup. As the key must be unique to each query and there's no obvious way to maintain a reference between the findAll() method and the view I think the best option is to make the argument required and document it as a breaking change.

@perdjurner
Copy link
Contributor

Can we make it a global setting perhaps?

So, if you find the hashing slow you can turn off that setting and then you're expected to always pass in the partial argument?

@andybellenie
Copy link
Contributor Author

We could but I don't think we should. This is a wasteful bit of processing and Wheels ought to be as lean as possible by default. It's not worth it for a marginal improvement on code readability. On my high volume sites, this has a real impact on overall load.

@perdjurner
Copy link
Contributor

Ok, let's get rid of it :)

Can you do that?

@perdjurner
Copy link
Contributor

If someone really liked the old functionality I suppose it can always be brought back in with a plugin.

@andybellenie
Copy link
Contributor Author

I'm on it, I'll get a pull request to you tomorrow

@andybellenie
Copy link
Contributor Author

Sorry had an urgent server issue to take care of, will have to be tomorrow.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants