Use in-memory store for station lookups. #111

Closed
Bertware opened this Issue Jan 24, 2017 · 5 comments

@Bertware
Contributor

Bertware commented Jan 24, 2017

The problem: station lookups take too long

At this moment, both getStations and getStationFromID loop over the entire stations list, which wastes valuable execution time. getStations also does quite some text replacing to match different spellings of the same station.

Given that a single liveboard can easily contain 25 entries, every 10 ms reduction in execution time could add up to a reduction of 250 ms. This means even small gains are worth it. As a possible side effect, the PHP code might run more efficiently, reducing server load a little.

Option 1: SQLite

The entire matching could be moved to an SQLite database. Here, we would have a table containing the standard name and the alternative name. A station could have multiple entries here, one for every special spelling of that station. If a station doesn't have special or different spellings, it's not in the table.

  1. getStations($name) would perform a query on this table to see if we're handling a special name. If so, the standard name is returned; if not, we continue working with the original name parameter.

  2. We now need to look up the station with its data. For this purpose, we could load the JSON-LD into SQLite too, with id as primary key and the station name indexed. Searching would be done on the name parameter, using LIKE '$name%', and we load the entire rows, parse them and return. This is expected to be a performance boost compared to looping over the entire file.

  3. Searching by id would use the same table mentioned in step 2, only now querying on id. Again, we load the entire row, parse and return.

We could load the stations into an SQLite database (containing both tables) using a PHP script, which could be run manually after the build script (or called from the build script).

To load the special spellings into SQLite, we could use the same PHP script and a station-alternatives.csv file, which would contain the same data as the table (original name | special spelling). This way, we can easily add names and run the script to regenerate the database table. An alternative would be to allow an unlimited number of special names on one line, so you'd have the original name followed by all its special spellings.

This would also clean up the code a little, since all matching code would be replaced by a single query.
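
Sketched concretely, the two tables and the lookup flow above could look as follows. The project itself is PHP, but this Python/sqlite3 sketch only illustrates the schema and query shape; all table names, column names, and sample rows are illustrative assumptions, not actual project data:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Table 1: maps special spellings to the standard station name.
    CREATE TABLE alternatives (
        alternative TEXT PRIMARY KEY,
        standard    TEXT NOT NULL
    );
    -- Table 2: one row per station, loaded from the JSON-LD.
    CREATE TABLE stations (
        id   TEXT PRIMARY KEY,
        name TEXT NOT NULL
    );
    CREATE INDEX idx_stations_name ON stations (name);
""")
conn.execute("INSERT INTO alternatives VALUES ('Bruxelles-Midi', 'Brussels-South')")
conn.execute("INSERT INTO stations VALUES ('008814001', 'Brussels-South')")

def get_stations(name):
    # Step 1: if this is a special spelling, swap in the standard name.
    row = conn.execute(
        "SELECT standard FROM alternatives WHERE alternative = ?", (name,)
    ).fetchone()
    if row is not None:
        name = row[0]
    # Step 2: prefix search on the indexed name column (LIKE 'name%').
    return conn.execute(
        "SELECT id, name FROM stations WHERE name LIKE ?", (name + "%",)
    ).fetchall()

def get_station_from_id(station_id):
    # Step 3: direct primary-key lookup by id.
    return conn.execute(
        "SELECT id, name FROM stations WHERE id = ?", (station_id,)
    ).fetchone()
```

Both lookups then become single indexed queries instead of a scan over the whole station list.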

Option 2: In-memory store

We could consider keeping the results in a memory cache. This cache could be filled by a cron job, so we would always have a cache hit. This seems like an easier but less flexible solution. It also seems more server- and config-dependent, compared to the guaranteed, but likely lower, performance of SQLite.
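
As a rough sketch of what such a store would do (Python purely for illustration; the real implementation would sit behind memcached or APC in PHP, and the TTL value is an arbitrary assumption):

```python
import time

class InMemoryStore:
    """Tiny stand-in for a memcached/APC-style cache with expiry."""
    def __init__(self, ttl=3600):
        self.ttl = ttl
        self._data = {}  # key -> (value, expiry timestamp)

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires = entry
        if time.time() > expires:  # expired entries behave like a miss
            del self._data[key]
            return None
        return value

    def set(self, key, value):
        self._data[key] = (value, time.time() + self.ttl)

cache = InMemoryStore()

def get_station_cached(query, lookup):
    """Return the cached result for a query, falling back to the expensive lookup."""
    hit = cache.get(query)
    if hit is not None:
        return hit
    result = lookup(query)  # e.g. the full-list scan or an SQLite query
    cache.set(query, result)
    return result
```

A cron job pre-filling the store would simply call get_station_cached for every known station name.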

Option 3: Combine

The best of both worlds: we could use SQLite backed by an in-memory store. Every cache miss would still get good performance, while the memory store would give a little extra boost.


These performance improvements might help #88 a little. (Not a duplicate, as this is one specific approach for a specific speedup, whereas #88 is a lot broader).

Also related to iRail/iRail#265

I could implement the SQLite version in a fork to compare performance, after which we could evaluate whether to switch. Input and discussion wanted. @pietercolpaert

@Bertware
Contributor

Bertware commented Jan 24, 2017

Another advantage of SQLite is that we can run queries on other fields too, like stations near another station, or stations within x km of a given set of coordinates. This might make the code a little more future-proof.
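
For instance, a "stations within x km" query could be approximated with a bounding box over latitude/longitude columns. A hypothetical sketch (Python/sqlite3; table, sample rows and coordinates are invented for illustration, and a real implementation would refine the box with an exact distance check):

```python
import math
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stations (id TEXT, name TEXT, lat REAL, lon REAL)")
conn.executemany("INSERT INTO stations VALUES (?, ?, ?, ?)", [
    ("1", "A", 50.83, 4.33),  # roughly Brussels
    ("2", "B", 51.03, 3.71),  # roughly Ghent
])

def stations_near(lat, lon, km):
    # One degree of latitude is ~111 km; longitude shrinks with cos(latitude).
    dlat = km / 111.0
    dlon = km / (111.0 * math.cos(math.radians(lat)))
    return conn.execute(
        "SELECT name FROM stations "
        "WHERE lat BETWEEN ? AND ? AND lon BETWEEN ? AND ?",
        (lat - dlat, lat + dlat, lon - dlon, lon + dlon),
    ).fetchall()
```

With indexes on lat and lon, such queries stay cheap even as the station list grows.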

First tests with SQLite and memcached.

Tests are run as 50 station queries from a Brussels-South liveboard. They are run a single time, to simulate a station request (test 1), and 100 times for a server average across multiple clients (test 2). Tests are run inside a VirtualBox instance. For the memcached queries, a third test was executed to see how it scaled to 500 × 50 queries. The database is loaded and stored in memory (static) on every test run.

| method | single (ms) | average over 100 (ms) | average over 500 (ms) |
| --- | --- | --- | --- |
| normal | 102.40 | 96.00 | n/a |
| normal w/ memcached | 94.75 | 43.86 | 42.86 |
| sqlite | 113.56 | 117.72 | n/a |
| sqlite w/ memcached | 122.66 | 63.10 | 67.68 |

The VirtualBox instance, and differences in processing power, might skew these results, but the relative differences should be about the same.

| method | single | average over 100 | average over 500 |
| --- | --- | --- | --- |
| normal | 100% | 100% | n/a |
| normal w/ memcached | 92% | 45% | 44% |
| sqlite | 110% | 135% | n/a |
| sqlite w/ memcached | 119% | 65% | 70% |

Conclusion

While this was tested with a basic prototype implementation on SQLite3, it seems the data set is still too small to see a real performance difference here, due to the loading overhead. However, memcached proved we can get a 55% speed improvement by using an in-memory cache.

Edit: MySQL and APC

Since the overhead of SQLite is likely caused by loading the file, and not so much by the query itself, I decided to give it a run on SQL (MariaDB). Results are below, using the same testing method. To rule out outside influences, the normal tests were run again and used as reference (this prevents processor load from other applications distorting the results).

| method | single (ms) | average over 100 (ms) | average over 500 (ms) |
| --- | --- | --- | --- |
| normal | 92.57 | 90.80 | n/a |
| normal w/ memcached | 83.30 | 36.70 | 45.40 |
| normal w/ APC | 79.00 | 34.70 | 33.65 |
| SQL | 76.70 | 54.23 | n/a |
| SQL w/ memcached | 65.87 | 28.63 | 27.12 |
| SQL w/ APC | 56.80 | 21.33 | 23.73 |

| method | single | average over 100 | average over 500 |
| --- | --- | --- | --- |
| normal | 100% | 100% | n/a |
| normal w/ memcached | 90% | 40% | 50% |
| normal w/ APC | 85% | 38% | 37% |
| SQL | 82% | 59% | n/a |
| SQL w/ memcached | 71% | 32% | 30% |
| SQL w/ APC | 61% | 23% | 26% |

As we can see, SQL removes the speed limit imposed by the SQLite loading. It also comes relatively close to the normal approach combined with memcached. A possible disadvantage is that this requires an SQL server.

When combined with APC, we see 5 to 10% speed improvements over memcached. The final results are significant: when using SQL and APC combined, a benchmark starting from an empty cache is handled 40% faster than now, and when running the benchmark multiple times, the average run is handled up to 77% faster than a request at this moment.
Without moving to SQL, we get a 5% improvement on an empty cache and a 60% improvement on a cache hit. (The 5% figure for the single-run APC test should be treated as noise.)

How can a single benchmark starting from an empty cache still improve performance?

The caching mechanism (APC or memcached) stores the stations one by one, keyed by their query.
However, when querying a list of liveboard stations with an empty cache, the cache gets filled as we go. For testing, the 8 am liveboard of Brussels-South was used, which contains multiple entries for Ghent, Antwerp and Schaarbeek. Since we cache at the station-query level, only the first "Schaarbeek" query will be run; all the others will result in a cache hit and quickly return the earlier calculated value.
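
This mechanism can be shown with a small sketch (Python for brevity; the counter stands in for the expensive full-list scan, and the station names are taken from the example above):

```python
lookups_performed = 0

def expensive_lookup(name):
    """Stands in for the full station-list scan and name matching."""
    global lookups_performed
    lookups_performed += 1
    return name.lower()

cache = {}

def get_station(name):
    # Per-query caching: only the first occurrence of a name is expensive.
    if name not in cache:
        cache[name] = expensive_lookup(name)
    return cache[name]

# A liveboard with repeated destinations, as in the Brussels-South example:
board = ["Schaarbeek", "Ghent", "Schaarbeek", "Antwerp", "Schaarbeek"]
results = [get_station(s) for s in board]
# Only three expensive lookups are needed for the five entries.
```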

Conclusion

APC should be easy to install on the server, and is extremely easy to implement as well, so we should add it without hesitation.
MySQL might add another 10-30% performance gain, at the cost of requiring an SQL server; this can be discussed.


@Bertware Bertware changed the title from Use in-memory store or SQLite for station lookups. to Use in-memory store for station lookups. Jan 25, 2017

@Bertware
Contributor

Bertware commented Jan 25, 2017

The final implementation reduces a liveboard benchmark to between 0.75 and 5 ms, a reduction of 95 to 99.17%. Therefore, there's no need for SQL here. However, the benchmarks above give some idea of the response times of the different technologies, and might be useful for other (iRail) projects too.

Final results:

```
php5 ~/.composer/vendor/bin/phpunit
Testing 100 liveboards took an average of 1747.603430748 ms for 1 liveboard, with APC disabled.

php5 -d apc.enable_cli=1 ~/.composer/vendor/bin/phpunit
Testing 100 liveboards took an average of 10.28352022171 ms for 1 liveboard, with APC enabled.

php7 ~/.composer/vendor/bin/phpunit
Testing 100 liveboards took an average of 92.941551208496 ms for 1 liveboard, with APC disabled.

php7 -d apc.enable_cli=1 ~/.composer/vendor/bin/phpunit
Testing 100 liveboards took an average of 0.79075813293457 ms for 1 liveboard, with APC enabled.
```
@pietercolpaert
Member

pietercolpaert commented Jan 29, 2017

Great work @Bertware! For some reason I supposed that PHP would optimize getting files from disk, yet your tests clearly indicate this is not the case.

I have left some comments in pull request #112:

  • Let’s abstract our caching system using PSR-6: http://www.php-fig.org/psr/psr-6/
  • Let’s make sure people can also use this package without a cache installed, for people who quickly want to open a pull request and test their changes locally. Therefore, the tests also need to work without a cache.
  • The iRail servers work with memcached only at this moment, so make sure it works with memcached primarily.
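
Combining the first two points amounts to hiding the cache backend behind a small common interface with a no-op fallback, which is what PSR-6 provides on the PHP side. A language-agnostic sketch (Python; the class names are invented for illustration and are far simpler than the real PSR-6 interfaces):

```python
from abc import ABC, abstractmethod

class CachePool(ABC):
    """Minimal stand-in for a PSR-6-style cache pool interface."""
    @abstractmethod
    def get(self, key): ...
    @abstractmethod
    def set(self, key, value): ...

class ArrayCache(CachePool):
    """In-memory backend, playing the APC/memcached role."""
    def __init__(self):
        self._items = {}
    def get(self, key):
        return self._items.get(key)
    def set(self, key, value):
        self._items[key] = value

class NullCache(CachePool):
    """No-op backend: every get is a miss, so the code (and its tests)
    keeps working on machines with no cache installed."""
    def get(self, key):
        return None
    def set(self, key, value):
        pass
```

Lookup code only talks to CachePool, so swapping memcached in or out becomes a configuration choice rather than a code change.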
@Bertware
Contributor

Bertware commented Jan 29, 2017

I agree on abstracting using PSR-6 and on running while no cache is available. Currently it already works without APC available; I only overlooked the clear-cache statement in the tests (all code is already wrapped in if statements that check whether APC is available).

For the iRail servers, I would not use memcached. The additional improvements from APC are far too big to ignore. Note that the current code achieves far better performance than mentioned in the tests above (I'd say APC easily halves the time required by memcached). Using memcached might lower the priority of adopting APC ("we're already using caching, so APC is not a top priority"). Therefore, I would do it right the first time and just install APC.

As I discussed in #88, we need to move to PHP 7 too. The install of APC could be combined with upgrading to PHP 7. PHP 7 is the way to go anyhow, and shouldn't break too much code (if it breaks anything at all). That 2000x speedup should also allow serving more requests from the same hardware, which in turn should lower costs. (I know that stations is only part of iRail, but any improvement is good, and I might look into improving the iRail API as well.)


pietercolpaert added a commit that referenced this issue Sep 20, 2017

Merge pull request #112 from Bertware/111-caching
Use APC as in-memory store for extreme performance gains. #111
@pietercolpaert
Member

pietercolpaert commented Sep 20, 2017

I believe this issue can be closed!


Bertware added a commit that referenced this issue Sep 20, 2017
