Added a PSR-0 Annotation Reader #152

chx · 2012-06-15T04:36:52Z

You need to pass a PSR-0 include path and a class name to this class and then you can get the annotations out of it. The goal of this exercise is to converse memory as much as possible. Reflection requires the class to be loaded and can not be unloaded. Tokenizing allows for unloading. There was already groundwork for tokenizing for the use statements so I have patched the PhpParser class to allow reuse.

stof · 2012-06-15T07:44:07Z

lib/Doctrine/Common/Annotations/Psr0AnnotationReader.php

+            }
+        }
+        // Drop what we can to save memory.
+        unset($this->tokens, $this->parser);


unset will unset the properties of the class, not the values of the properties

beberlei · 2012-06-15T10:09:00Z

I don't get what this has to do with PSR-0. What it does is using tokenizer instead of reflection, alkso its a parser and not a reader. This PR is really weird to be honest in breaking the naming and hierachy :-) I dont see the value here.

schmittjoh · 2012-06-15T11:41:20Z

PSR-0 probably relates to the fact that the class is expected in a specific filename, and he does not rely on an autoloader to find it. I can see why this might be useful sometimes. I haven't tested performance, but I guess it will be slower the larger the class becomes (token_get_all hasn't a really good record there).

If we add this, I think we need to change the interface to remove the reflection methods (probably adding a second interface). Also extending the PHPParser and duplicating a lot of code does not really look good. We should try to avoid that. Maybe introducing a doc comment provider which can be injected into the reader and is either reflection-based or uses token_get_all.

akkie · 2012-06-15T12:05:11Z

The first version of the PhpParser was horrible slow when tokenizing the whole file. As I see in your pull request you make the same mistake and tokenize the whole file.

Please look at this benchmark:
https://github.com/akkie/parser-benchmark

schmittjoh · 2012-06-15T12:31:35Z

Whatever we do, the performance of the Reflection-based solution should not deteriorate.

beberlei · 2012-06-15T12:54:59Z

I don't see why we need different implementations here. The way it is at the moment works, why do we need to make this difficult for the users by adding a decision?

schmittjoh · 2012-06-15T13:14:14Z

One thing where I could need it, is static analysis. I do not want to execute user-provided code. @chx apparently wants to create less classes (i.e. saving memory) which seems another use-case.

We can still make the default decision for the user but allow to inject a different doc comment provider.

neclimdul · 2012-06-15T13:33:40Z

@akkie that's an interested benchmark but it doesn't test the performance of the Reader methods here, just the class parsing. Any ideas how we could compare that?

@beberlei I think the idea is that the tokenizer doesn't take a reflection class so the code is not permanently loaded into memory. It could be a fairly novel and useful choice.

sdboyer · 2012-06-15T13:33:47Z

it's not so much about the reflection classes themselves consuming memory as it is about the parsed-in code consuming memory. without an opcode cache (which we can't assume/require with Drupal), there's a memory hit for every file that's parsed in. userspace code can't unload code once it's been parsed in, so there's no way to recoup the memory later. the goal of taking a pure-tokenized approach was saving memory by avoiding parsing in in the first place. and since the large token array is flushed by the end of the constructor, the objects are as thin on memory use as possible.

hopefully that clarifies some of the points raised - for example, @stof 's point that this violates the interface. yep, it does, because the whole interface is predicated on Reflection* objects.

@akkie - wow, those numbers are pretty stark. haven't had a chance to review the differences in the PhpParser in detail there, but that's ...heh, noteworthy.

this is starting to look like a case that can be optimized either for speed or memory. both of these seem like reasonable cases to optimize for, especially if the interface could be refactored so that client code can choose (with a simple flag) which optimization they'd prefer to use.

stof · 2012-06-15T13:35:30Z

@sdboyer as it violates the interface, it cannot implement it and so cannot be used in places expecting the reader (and cannot be wrapped in a cached reader and so on)

schmittjoh · 2012-06-15T13:41:45Z

We can change the interface (add a new one). That should not be the problem here.

In general, the code is not ready for a merge though.

beberlei · 2012-06-15T14:43:48Z

well, why not add a static reflection API that looks that the normal API and remove the type-hints?

chx · 2012-06-15T17:52:28Z

Some notes in random order: the PSR-0 restriction is because I do not want to rewrite PHP in PHP :) I can know from a class name where the file is. I know that there is only one class. Especially without the latter, using the tokenizer would be tricky and less useful. Also, everyone uses PSR-0 by now anyways, so why not rely on it?

There is some code duplication here for sure but not a real lot. If I manage to unify the interfaces then I can move the globalIgnoredNames handling into a base class (I think late static binding will be useful) and also the getAnnotation function which even the existing AnnotationReader would benefit from.

The unificiation is a bit challenging because for my reader specifying the class makes no sense and getClassAnnotation($class, $annotationName) requires one. Unfortunate :/

chx · 2012-06-16T01:28:28Z

We will not remove any type hints. Rather I will create a class which extends ReflectionClass and ReflectionMethod and add methods returning such objects to the new parser. So approximately:

$parser = new Psr0Parser('include/path', 'my\name\space\class');
$reader = new AnnotationReader();
$reader->getClassAnnotation($parser->getClassReflection(), $annotationName);

chx · 2012-06-18T05:18:36Z

Better approach at #154

chx added 5 commits June 14, 2012 21:33

Added Psr0AnnotationReader

94275af

It is too complicated to reuse the AnnotationReader::globalIgnoredNames.

6010d89

Code style fixes.

3d3e188

A wrong variable name caused performance problems.

f338a97

Do not instantiate an empty parent reader.

61be574

stof reviewed Jun 15, 2012
View reviewed changes

chx closed this Jun 18, 2012

Majkl578 mentioned this pull request Dec 7, 2017

The future of Common 3.0 #826

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Added a PSR-0 Annotation Reader #152

Added a PSR-0 Annotation Reader #152

chx commented Jun 15, 2012

stof Jun 15, 2012

beberlei commented Jun 15, 2012

schmittjoh commented Jun 15, 2012

akkie commented Jun 15, 2012

schmittjoh commented Jun 15, 2012

beberlei commented Jun 15, 2012

schmittjoh commented Jun 15, 2012

neclimdul commented Jun 15, 2012

sdboyer commented Jun 15, 2012

stof commented Jun 15, 2012

schmittjoh commented Jun 15, 2012

beberlei commented Jun 15, 2012

chx commented Jun 15, 2012

chx commented Jun 16, 2012

chx commented Jun 18, 2012

Added a PSR-0 Annotation Reader #152

Added a PSR-0 Annotation Reader #152

Conversation

chx commented Jun 15, 2012

stof Jun 15, 2012

Choose a reason for hiding this comment

beberlei commented Jun 15, 2012

schmittjoh commented Jun 15, 2012

akkie commented Jun 15, 2012

schmittjoh commented Jun 15, 2012

beberlei commented Jun 15, 2012

schmittjoh commented Jun 15, 2012

neclimdul commented Jun 15, 2012

sdboyer commented Jun 15, 2012

stof commented Jun 15, 2012

schmittjoh commented Jun 15, 2012

beberlei commented Jun 15, 2012

chx commented Jun 15, 2012

chx commented Jun 16, 2012

chx commented Jun 18, 2012