ClassManifest / TokenisedRegularExpression appears to have memory leaks #5549

sminnee · 2016-05-17T18:24:23Z

A frequent source of frustration when setting up new dev environments is that visiting the sites will run out of memory while compiling the initial manifest. This is exacerbated by the fact that the manifest is rebuild for www-data and CLI users, but that's a separate problem.

I believe that the issue is called when the file-by-file parsing needs to happen for all files in the project. This suggests that a memory leak in this subsystem is responsible for the problem.

dhensby · 2016-05-18T15:23:33Z

pull TokenisedRegularExpression out for https://github.com/nikic/PHP-Parser :)

dhensby · 2016-05-18T20:24:20Z

@kinglozzer took a quick look at doing it here: master...kinglozzer:classmanifest

but it takes too long and uses too much memory, I think but the feature of breaking out of the node traverser hasn't been implemented, which I think we can use aggressively to save time (and maybe memory?)

sminnee · 2016-05-19T13:07:47Z

It will be interesting to see if an approach using nikic's parser can be made to work, but until the speed and memory usage issues are solved, it would make this problem worse, not better. It wouldn't surprise me if there were some quick wins with the current codebase and it's probably worth addressing those.

Bringing in nikic/php-parser feels a bit like using a sledgehammer to crack a walnut, but it would be a more robust project to rely on.

dhensby · 2016-05-19T18:23:38Z

It reduces the coffee we have to maintain, but yes it's got issues we'd like to address.

Copying composer's approach to class finding may be the way to speed this up. We don't need to tokenise more than a few lines of a file to get the info we need so we can look at trashing a ton of the coffee before tokenisation

sminnee · 2016-05-19T22:45:48Z

Composer's approach is inferior and not suited to our use case. It interprets neither the implements not extends keywords. It's a non-starter.

Trying to reliably tokenise a partial file is hard to do reliably, and the code basically starts by scanning for a T_CLASS or T_INTERFACE item.

We could consider moving to a better parser but we're not moving to a worse one.

dhensby · 2016-05-19T23:18:08Z

Composer's approach is inferior and not suited to our use case. It interprets neither the implements not extends keywords. It's a non-starter.

Sorry, by copying composers approach, I mean aggressively discarding code that we don't need to parse. @kinglozzer has had a go at it and had some success.

We could consider moving to a better parser but we're not moving to a worse one.

Agreed, but can we get worse than what we've got? ;)

sminnee · 2016-05-19T23:56:35Z

Agreed, but can we get worse than what we've got? ;)

Composer's. Last I checked it used regexes.

dhensby · 2016-05-20T00:25:20Z

Touché

kinglozzer · 2016-05-20T08:32:15Z

Composer's approach is inferior and not suited to our use case. It interprets neither the implements not extends keywords. It's a non-starter.

Sorry, by copying composers approach, I mean aggressively discarding code that we don't need to parse. @kinglozzer has had a go at it and had some success.

Yup, Composer strips out comments, docblocks etc before parsing and it does save a small amount of CPU time (I don’t have the profiler runs to hand, but it didn’t significantly impact memory usage either way in the runs I did). Speed-ups are nice, but unfortunately miss the point of this ticket :(

The other thing I tried was simply removing everything inside the “top-level” braces that we don’t need, that is:

class Foo extends Bar {
    // everything here will be removed
    protected $iWillBeRemoved;
}

trait Baz {
    // also here
    public $baz;
}

// would become
class Foo extends Bar { }
trait Baz { }

It was extremely crude and relied on copying Composer’s approach to stripping comments out first (so comments including things like {@link Foo} didn’t break it entirely). Not sure if we could make something like that work, and it needs profiling properly to see if it would be of any benefit anyway.

dhensby · 2016-05-20T08:44:44Z

We need to test on "real world" sites to see the real impact

sminnee · 2016-05-24T10:25:12Z

Running it on the CWP recipe, which has a bunch of modules, would be a reasonable real-world test.

sminnee · 2016-05-24T10:36:57Z

OK, I raised this issue based the reports of out-of-memory errors here (https://groups.google.com/forum/#!topic/silverstripe-dev/YQ9ru7UuUV0) but I'm having trouble replicating this...

dhensby · 2016-05-24T16:41:59Z

I'm able to exhaust 256mb of memory on a CWP install when including every test suite from the modules that have them.

I think each test does add more and more memory consumption but in not sure it's a manifest issue

sminnee · 2016-09-21T21:49:36Z

Closing this issue as a false-report.

dhensby mentioned this issue May 19, 2016

Replace TokenizedRegularExpression with PHPParser #5567

Merged

sminnee closed this as completed Sep 21, 2016

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

ClassManifest / TokenisedRegularExpression appears to have memory leaks #5549

ClassManifest / TokenisedRegularExpression appears to have memory leaks #5549

sminnee commented May 17, 2016

dhensby commented May 18, 2016

dhensby commented May 18, 2016

sminnee commented May 19, 2016

dhensby commented May 19, 2016

sminnee commented May 19, 2016

dhensby commented May 19, 2016

sminnee commented May 19, 2016

dhensby commented May 20, 2016

kinglozzer commented May 20, 2016

dhensby commented May 20, 2016

sminnee commented May 24, 2016

sminnee commented May 24, 2016

dhensby commented May 24, 2016

sminnee commented Sep 21, 2016

ClassManifest / TokenisedRegularExpression appears to have memory leaks #5549

ClassManifest / TokenisedRegularExpression appears to have memory leaks #5549

Comments

sminnee commented May 17, 2016

dhensby commented May 18, 2016

dhensby commented May 18, 2016

sminnee commented May 19, 2016

dhensby commented May 19, 2016

sminnee commented May 19, 2016

dhensby commented May 19, 2016

sminnee commented May 19, 2016

dhensby commented May 20, 2016

kinglozzer commented May 20, 2016

dhensby commented May 20, 2016

sminnee commented May 24, 2016

sminnee commented May 24, 2016

dhensby commented May 24, 2016

sminnee commented Sep 21, 2016