New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Performance regression: All dimensions are created on each request #8399

Closed
tsteur opened this Issue Jul 22, 2015 · 11 comments

Comments

Projects
None yet
4 participants
@tsteur
Member

tsteur commented Jul 22, 2015

On each request we need to check whether there are any updates to perform. This also means we need to check whether there are any dimension changes. To not having to find and create all dimension instances and to not having to compare the current version with the installed version (option table) there is a file cache: https://github.com/piwik/piwik/blob/2.14.1-rc1/core/Columns/Updater.php#L149-L157

See the comment in the method:

to avoid having to load all dimensions on each request we check if there were any changes on the file system can easily save > 100ms for each request

With newer PHP version etc it adds on my server about 40ms currently. That's a lot when the request in total takes only 200ms or 300ms.

Problem is they are now created in the constructor meaning whenever this instances is created we search for all dimensions and create instances of them: https://github.com/piwik/piwik/blob/2.14.1-rc1/core/Columns/Updater.php#L50-L52

@diosmosis

This comment has been minimized.

Show comment
Hide comment
@diosmosis

diosmosis Jul 22, 2015

Member

Would be fixed indirectly by #8329, but it would be better to not have to do the update check on every request. Not sure what initiates the update check, perhaps when we know an update needs to be done, we can hijack the normal request process. Would require making the event system distributed (ie, scheduling events or adding event handlers for other requests)...

Member

diosmosis commented Jul 22, 2015

Would be fixed indirectly by #8329, but it would be better to not have to do the update check on every request. Not sure what initiates the update check, perhaps when we know an update needs to be done, we can hijack the normal request process. Would require making the event system distributed (ie, scheduling events or adding event handlers for other requests)...

@tsteur

This comment has been minimized.

Show comment
Hide comment
@tsteur

tsteur Jul 23, 2015

Member

I had a look at that PR but I don't understand what is actually cached and why one should configure them via ini? Or do you mean a single instance can be changed?

We kinda have to check for updates on every request since there could be a database update or a new dimension or ... I was hoping it would be maybe just possible to move the Dimension::getAllDimension() calls back to where they were before and then it should be fast again.

Member

tsteur commented Jul 23, 2015

I had a look at that PR but I don't understand what is actually cached and why one should configure them via ini? Or do you mean a single instance can be changed?

We kinda have to check for updates on every request since there could be a database update or a new dimension or ... I was hoping it would be maybe just possible to move the Dimension::getAllDimension() calls back to where they were before and then it should be fast again.

@diosmosis

This comment has been minimized.

Show comment
Hide comment
@diosmosis

diosmosis Jul 23, 2015

Member

I had a look at that PR but I don't understand what is actually cached and why one should configure them via ini? Or do you mean a single instance can be changed?

The Dimensions are stored in DI and a cache is added to DI (which can be APC and not just an array for instance), which means we won't end up looping through plugins, requiring files, etc. for each request.

EDIT: The DI definitions are cached, which means the class names + DI config.

Not sure what you mean by single instance can be changed, can you explain?

We kinda have to check for updates on every request since there could be a database update or a new dimension or ...

I was thinking maybe there'd be some way to mark Piwik as needing an upgrade when we know this is the case (like activating a new plugin, or something), and making future requests initiate an update, rather than check on every request.

I was hoping it would be maybe just possible to move the Dimension::getAllDimension() calls back to where they were before and then it should be fast again.

Probably could do that, haven't really checked.

Member

diosmosis commented Jul 23, 2015

I had a look at that PR but I don't understand what is actually cached and why one should configure them via ini? Or do you mean a single instance can be changed?

The Dimensions are stored in DI and a cache is added to DI (which can be APC and not just an array for instance), which means we won't end up looping through plugins, requiring files, etc. for each request.

EDIT: The DI definitions are cached, which means the class names + DI config.

Not sure what you mean by single instance can be changed, can you explain?

We kinda have to check for updates on every request since there could be a database update or a new dimension or ...

I was thinking maybe there'd be some way to mark Piwik as needing an upgrade when we know this is the case (like activating a new plugin, or something), and making future requests initiate an update, rather than check on every request.

I was hoping it would be maybe just possible to move the Dimension::getAllDimension() calls back to where they were before and then it should be fast again.

Probably could do that, haven't really checked.

@tsteur

This comment has been minimized.

Show comment
Hide comment
@tsteur

tsteur Jul 28, 2015

Member

The Dimensions are stored in DI and a cache is added to DI (which can be APC and not just an array for instance), which means we won't end up looping through plugins, requiring files, etc. for each request.

How is it different to currently? I think the components are cached already (path to file and/or classname), Not sure what we would actually store in DI and cache?

Member

tsteur commented Jul 28, 2015

The Dimensions are stored in DI and a cache is added to DI (which can be APC and not just an array for instance), which means we won't end up looping through plugins, requiring files, etc. for each request.

How is it different to currently? I think the components are cached already (path to file and/or classname), Not sure what we would actually store in DI and cache?

@diosmosis

This comment has been minimized.

Show comment
Hide comment
@diosmosis

diosmosis Jul 28, 2015

Member

The logic in the VisitDimension::getAllDimensions() method currrently uses a transient cache, so whatever logic is done is done once per request. The DI cache will allow caching the sorted list so the logic in those methods will not be necessary on subsequent requests.

The DI definition objects are stored in the cache (you can see them if you look through php-di's source), so the list of class names for Dimensions will be stored. As well as definitions of Dimension objects so they can be stored in DI w/ (I think) no performance hit.

Member

diosmosis commented Jul 28, 2015

The logic in the VisitDimension::getAllDimensions() method currrently uses a transient cache, so whatever logic is done is done once per request. The DI cache will allow caching the sorted list so the logic in those methods will not be necessary on subsequent requests.

The DI definition objects are stored in the cache (you can see them if you look through php-di's source), so the list of class names for Dimensions will be stored. As well as definitions of Dimension objects so they can be stored in DI w/ (I think) no performance hit.

@tsteur

This comment has been minimized.

Show comment
Hide comment
@tsteur

tsteur Jul 28, 2015

Member

We store that list I meant somewhere in findComponents/findMultipleCompents() I think.

The logic in the VisitDimension::getAllDimensions() method currrently uses a transient cache, so whatever logic is done is done once per request. The DI cache will allow caching the sorted list so the logic in those methods will not be necessary on subsequent requests.

Does it mean it actually stores the serialized object?

Member

tsteur commented Jul 28, 2015

We store that list I meant somewhere in findComponents/findMultipleCompents() I think.

The logic in the VisitDimension::getAllDimensions() method currrently uses a transient cache, so whatever logic is done is done once per request. The DI cache will allow caching the sorted list so the logic in those methods will not be necessary on subsequent requests.

Does it mean it actually stores the serialized object?

@diosmosis

This comment has been minimized.

Show comment
Hide comment
@diosmosis

diosmosis Jul 28, 2015

Member

Not the Dimension object, a Definition object from php-di. I don't know if it's serialized, @mnapoli would know best about this stuff.

Member

diosmosis commented Jul 28, 2015

Not the Dimension object, a Definition object from php-di. I don't know if it's serialized, @mnapoli would know best about this stuff.

@mnapoli

This comment has been minimized.

Show comment
Hide comment
@mnapoli

mnapoli Jul 28, 2015

Contributor

Does it mean it actually stores the serialized object?

Depends of the Doctrine cache you use. If it's ArrayCache, it's not serialized. If you use ApcCache, it uses apc_store() and the documentation isn't explicit on this but I guess it's serialized. Etc.

And yes we are talking about DI definition instances.

Contributor

mnapoli commented Jul 28, 2015

Does it mean it actually stores the serialized object?

Depends of the Doctrine cache you use. If it's ArrayCache, it's not serialized. If you use ApcCache, it uses apc_store() and the documentation isn't explicit on this but I guess it's serialized. Etc.

And yes we are talking about DI definition instances.

@mattab

This comment has been minimized.

Show comment
Hide comment
@mattab

mattab Sep 18, 2015

Member

@tsteur do you think something could be done in time for 2.15.0, or if it's a lot of work, we should do it in 3.0.0 maybe?

Member

mattab commented Sep 18, 2015

@tsteur do you think something could be done in time for 2.15.0, or if it's a lot of work, we should do it in 3.0.0 maybe?

@diosmosis

This comment has been minimized.

Show comment
Hide comment
@diosmosis

diosmosis Sep 18, 2015

Member

@mattab I'm adding this to 2.15, a quick hack is possible I think (can't remember too clearly, but I'll find out).

Member

diosmosis commented Sep 18, 2015

@mattab I'm adding this to 2.15, a quick hack is possible I think (can't remember too clearly, but I'll find out).

@diosmosis diosmosis added this to the 2.15.0 milestone Sep 18, 2015

tsteur added a commit that referenced this issue Oct 8, 2015

@diosmosis diosmosis closed this in #8947 Oct 8, 2015

diosmosis added a commit that referenced this issue Oct 8, 2015

Merge pull request #8947 from piwik/8399
Fixes #8399, create dimension instance only when needed to improve performance
@mattab

This comment has been minimized.

Show comment
Hide comment
@mattab

mattab Oct 8, 2015

Member

http://demo.piwik.org was upgraded to latest beta and it feels faster already 👍

Member

mattab commented Oct 8, 2015

http://demo.piwik.org was upgraded to latest beta and it feels faster already 👍

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment