Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[PropertyInfo] Use a single cache item per method, not per method + arguments #29977

Closed
deviantintegral opened this issue Jan 24, 2019 · 1 comment

Comments

@deviantintegral
Copy link
Contributor

Description

I'm using the Serializer component to deserialize objects with 100+ properties. In some cases, Symfony\Component\PropertyInfo\PropertyInfoCacheExtractor::extract ends up being called thousands of times. I'm using memcache as the backend and profiling locally, so my results are a "best case" without any network latency.

xhgui - symbol - symfony component propertyinfo propertyinfocacheextractor extract 2019-01-24 13-39-22

9310 calls to extract, total 128ms, and 1340 gets from memcache, for a total of 69ms

In most cases, I'm deserializing complete objects, where every property is populated with something. By saving all properties into a single cache item, once the cache is populated I'm seeing a significant improvement.

xhgui - symbol - drupal media_mpx propertyinfocacheextractor extract 2019-01-24 13-42-57

9310 calls to extract, total 54ms, and 7 gets from memcache, for a total of 3ms

That's a 42% improvement locally, and assuming 1ms in network latency for a cache reads the real-world performance could be much worse.

Here's the proof-of-concept code I've used for this:

    private function extract($method, array $arguments)
    {
        try {
            $serializedArguments = serialize($arguments);
        } catch (\Exception $exception) {
            // If arguments are not serializable, skip the cache
            return \call_user_func_array(array($this->propertyInfoExtractor, $method), $arguments);
        }

        $key = rawurlencode($method);
        if (array_key_exists($key, $this->arrayCache) && array_key_exists($serializedArguments, $this->arrayCache[$key])) {
          return $this->arrayCache[$key][$serializedArguments];
        }

        $item = $this->cacheItemPool->getItem($key);

        $data = $item->get();
          if ($item->isHit()) {
            $this->arrayCache[$key] = $data[$key];
              if (array_key_exists($serializedArguments, $data[$key])) {
                return $this->arrayCache[$key][$serializedArguments];
              }
          }

          if (!$data) {
            $data = [];
          }
          $value = \call_user_func_array(array($this->propertyInfoExtractor, $method), $arguments);
          $data[$key][$serializedArguments] = $value;
          $this->arrayCache[$key][$serializedArguments] = $value;
          $item->set($data);
          $this->cacheItemPool->save($item);

          return $this->arrayCache[$key][$serializedArguments];
    }

Questions I have:

  1. Is optimizing for the "most properties will need to be extracted case" best?
  2. Are there commonly used cached backends where rewriting the same cache key like this will cause problems?

Depending on the above, I'd like to file a pull request either modifying the existing extract method or adding a new class with the different behaviour.

aside The fact that there's a static array cache needs to be documented. By default I added in a Drupal memory cache in a chain, and the class docs should call out that's not needed. Or, we should benchmark Doctrine / Drupal to see if the performance improvement documented at #16838 (comment) still holds.

@nicolas-grekas
Copy link
Member

Would you like to send a PR doing so?

@fabpot fabpot closed this as completed Apr 3, 2019
fabpot added a commit that referenced this issue Apr 3, 2019
…ntintegral)

This PR was merged into the 4.3-dev branch.

Discussion
----------

[PropertyInfo] Use a single cache item per method

| Q             | A
| ------------- | ---
| Branch?       | master
| Bug fix?      | no
| New feature?  | yes
| BC breaks?    | no
| Deprecations? | no
| Tests pass?   | yes
| Fixed tickets | #29977
| License       | MIT
| Doc PR        | none

Replaces #30523 with a rebase to master.

This PR changes how property metadata is cached, significantly reducing the number of calls made between PHP and the backend cache. Instead of storing one cache item per method and set of arguments, a single cache item is stored per method. This matches well with real-world use, where most properties in an object will need to be inspected.

Note that the absolute numbers in the above PR are best case. In production environments where memcache is on a remote server, we were seeing multiple seconds consumed by memcache calls.

Commits
-------

2a4f8a1 [PropertyInfo] Use a single cache item per method
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants