Very inefficient gettext parser #596

Closed
digitalnature opened this Issue · 2 comments

@digitalnature

The parser itself is OK, I guess, though the PO catalog parser is about 3 times faster than the MO (compiled catalog) parser. That is kind of awkward, considering that the whole reason the binary format exists is to make it faster for machines to read :)

But the real problem is the merging of messages, which slows things down by a factor of about 12.

For example, a catalog with 3000 lines gets parsed in 0.09s on my machine (the MO version of the catalog in 0.26s).
The merging functions add another 1.2s to this process. Most of that time probably comes from function call overhead, multiplied 3000 times.

I think no methods should be called inside the for / fgets loop; the code should be moved into the loop body, even if that means some duplication. Also, the resulting array contains some unnecessary fields like ids, id, and probably comments and occurrences. These just take up memory. The id is already present in the key, so the value should just be an array containing the translated messages...
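
To make the suggestion concrete, here is a minimal sketch of a PO reader that keeps everything inside one `fgets` loop and returns only `msgid => [translated messages]`. `parsePoSlim` is a made-up name, not part of the library, and the sketch makes simplifying assumptions: it ignores `msgctxt`, comments, and occurrences, skips the header entry, and only calls PHP built-ins inside the loop.

```php
<?php
// Hypothetical helper (not the library's API): parse a PO stream in a single
// loop, with no userland function calls per line, into a slim array of
// msgid => list of translated strings.
function parsePoSlim($stream) {
    $data = [];
    $id = null;      // msgid of the entry currently being read
    $strs = [];      // translations collected for that entry
    $target = null;  // what a continuation line extends: 'id', an int index, or null

    while (($line = fgets($stream)) !== false) {
        $line = trim($line);
        if ($line === '' || $line[0] === '#') {
            continue; // blank lines and comments are dropped entirely
        }
        if (strncmp($line, 'msgid "', 7) === 0) {
            // A new entry starts: store the previous one (header has an empty id).
            if ($id !== null && $id !== '') {
                $data[$id] = $strs;
            }
            $id = stripcslashes(substr($line, 7, -1));
            $strs = [];
            $target = 'id';
        } elseif (strncmp($line, 'msgstr "', 8) === 0) {
            $strs[0] = stripcslashes(substr($line, 8, -1));
            $target = 0;
        } elseif (strncmp($line, 'msgstr[', 7) === 0) {
            // Plural form, e.g. msgstr[1] "..."
            $end = strpos($line, ']');
            $target = (int) substr($line, 7, $end - 7);
            $strs[$target] = stripcslashes(substr($line, $end + 3, -1));
        } elseif ($line[0] === '"') {
            // Continuation of a multi-line string.
            $part = stripcslashes(substr($line, 1, -1));
            if ($target === 'id') {
                $id .= $part;
            } elseif ($target !== null) {
                $strs[$target] .= $part;
            }
        } else {
            // msgid_plural, msgctxt, etc. are ignored in this sketch.
            $target = null;
        }
    }
    if ($id !== null && $id !== '') {
        $data[$id] = $strs;
    }
    return $data;
}
```

Whether inlining like this actually recovers most of the 1.2s would need to be measured; the point is only to show the shape of the loop and of the resulting array.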

davidpersson was assigned
@davidpersson

My results are (set: ~1600 entries):

PO

Total (1000 iterations)
Took: 201.7883810997

Per iteration
Took: 0.2017883810997

Single sample
Took: 0.20334100723267

MO

Total (1000 iterations)
Took: 201.46384692192

Per iteration
Took: 0.20146384692192

Single sample
Took: 0.24216485023499
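
For anyone who wants to reproduce numbers in this shape (total, per iteration, single sample), a timing harness could look roughly like the sketch below. `bench` is a hypothetical helper and the thread does not say how the figures above were actually produced, so treat it as one plausible setup rather than the script that was used.

```php
<?php
// Hypothetical timing helper: runs $fn repeatedly and reports wall-clock
// seconds for the whole run, the average per iteration, and one extra
// single run measured on its own.
function bench(callable $fn, int $iterations = 1000): array {
    $start = microtime(true);
    for ($i = 0; $i < $iterations; $i++) {
        $fn();
    }
    $total = microtime(true) - $start;

    $start = microtime(true);
    $fn();
    $single = microtime(true) - $start;

    return [
        'total' => $total,
        'per_iteration' => $total / $iterations,
        'single' => $single,
    ];
}
```

Passing a closure that parses the PO file and another that parses the MO file gives directly comparable figures for both formats.
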
@davidpersson

I'm closing this because (a) the real solution would be caching (see #1054); (b) "very inefficient" is, in my eyes, exaggerated, as this is one of the fastest PHP parsers I know; (c) yes, MO parsing should be faster, but this is PHP, which doesn't benefit much from parsing binary rather than text formats; and (d) memory efficiency is not a priority for the parser.
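
As a rough illustration of the caching idea (not necessarily what #1054 implements), the parsed catalog can be written out once as a plain PHP array and reloaded with `include` until the source file changes. `loadCatalogCached` and its parameters are hypothetical names; `$parse` stands in for whatever parser produces the array.

```php
<?php
// Hypothetical sketch: cache the parsed catalog as a PHP file next to other
// cache files and reuse it while it is at least as new as the source.
function loadCatalogCached(string $source, string $cacheDir, callable $parse): array {
    $cacheFile = $cacheDir . '/' . md5($source) . '.php';

    // Cache hit: the cache file exists and is not older than the catalog.
    // (A change to the catalog within the same second can be missed.)
    if (is_file($cacheFile) && filemtime($cacheFile) >= filemtime($source)) {
        return include $cacheFile;
    }

    // Cache miss: parse once and store the result as executable PHP.
    $data = $parse($source);
    $code = '<?php return ' . var_export($data, true) . ';';
    file_put_contents($cacheFile, $code, LOCK_EX);

    return $data;
}
```

With something like this in place, the cost of parsing and merging is paid only when a catalog changes, which makes the per-parse timings above matter much less.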
