Perf: Full day simulation (2017 perf test) #3

Open
mmd-osm opened this Issue Sep 2, 2017 · 2 comments

mmd-osm commented Sep 2, 2017

Intro

Full simulation on this branch, Commit: 4957911

HW & SW

Preparation

Compiling sources

  • Installing relevant libraries
    o sudo apt-get update
    o sudo apt-get install -y --force-yes --no-install-recommends g++ make expat autoconf automake autotools-dev libtool curl ca-certificates unzip
    o sudo apt-get install bzip2 libexpat1-dev zlib1g-dev liblz4-dev libfcgi-dev libevent-dev libbz2-dev libicu-dev libosmium2-dev supervisor
    o curl -o osm-3s_v0.7.58_mmd.zip https://codeload.github.com/mmd-osm/Overpass-API/zip/test758_lz4hash
    o unzip -q osm-3s_v0.7.58_mmd.zip

  • Running autotools
    o cd Overpass-API-test758_lz4hash/src
    o autoscan
    o aclocal
    o autoheader
    o libtoolize
    o automake --add-missing
    o autoconf
    o cd ..

  • Configure options
    o mkdir -p build
    o cd build
    o ../src/configure CXXFLAGS="-O2 -mtune=native -ggdb -std=c++11" LDFLAGS="-lpthread -lbz2 -levent -licuuc -licui18n" --enable-fastcgi --enable-lz4 --prefix=/srv/osm3s
    o make V=0 -j7
    o make install

Preparing database

Converting the zlib database to an lz4-compressed database and adding the tagged-nodes file

  • Download clone from http://dev.overpass-api.de/clone to {{database directory zlib}}
  • Run osm3s_query --db-dir={{database directory zlib}} --clone={{new lz4 database directory}} --clone-compression=lz4 --clone-map-compression=lz4
  • Run create_tagged_nodes {{new lz4 database directory}} - this step creates two new files nodes_tagged.bin and nodes_tagged.bin.idx
  • Continue as usual with {{new lz4 database directory}} being your database directory

Configuring supervisor
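A minimal supervisord stanza for the dispatcher processes could look like the sketch below; program names, paths, and flags are illustrative assumptions, not taken from the actual setup:

```ini
; sketch only – adjust paths/flags to the local install under /srv/osm3s
[program:dispatcher]
command=/srv/osm3s/bin/dispatcher --osm-base --db-dir=/srv/osm3s/db
autorestart=true

[program:dispatcher_areas]
command=/srv/osm3s/bin/dispatcher --areas --db-dir=/srv/osm3s/db
autorestart=true
```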

Stats

Area Details
Query data source: August 30, 2017 (main instance)
Start reprocessing: 02/Sep/2017:10:15:17 +0200
End reprocessing: 02/Sep/2017:20:58:47 +0200
Total processing time: 10:43 (hh:mm)
Total number of queries executed: 534497

Response times (quantiles)

| Quantile | 50% | 90% | 95% | 99% | 99.5% | 99.9% | 99.99% | 99.999% |
|---|---|---|---|---|---|---|---|---|
| Response time | 0.062s | 0.685s | 1.58s | 5.5s | 11.06s | 33.92s | 168.37s | 853.6s |
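The quantiles above can be computed from the per-request wall-clock times with a nearest-rank estimator. A minimal sketch (`quantile` is an illustrative helper, not part of the Overpass code base):

```python
import math

def quantile(samples, p):
    """Nearest-rank percentile: smallest sample at or above the
    p-th percentile rank of the sorted data."""
    s = sorted(samples)
    rank = max(1, math.ceil(p / 100.0 * len(s)))
    return s[rank - 1]

# e.g. quantile(response_times, 99.9) gives the 99.9% response time
```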

Reprocessed with 7 parallel tasks, 0s wait time.
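The replay setup (a fixed number of parallel tasks, no wait between queries) can be sketched as follows; `replay` and `run_query` are hypothetical names, and the actual test presumably replayed logged queries against the local interpreter:

```python
import time
from concurrent.futures import ThreadPoolExecutor

def replay(queries, run_query, workers=7):
    """Replay logged queries with a fixed worker pool and collect
    per-query wall-clock times. The pool keeps exactly `workers`
    queries in flight with 0s wait time between submissions."""
    def timed(q):
        t0 = time.perf_counter()
        run_query(q)
        return time.perf_counter() - t0
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(timed, queries))
```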

Most expensive queries

  1. Similar to Geocoding example for a large area

Executed multiple times because of timeout issues

  2. Highway node intersection

Executed multiple times because of timeout issues

<?xml version="1.0" encoding="UTF-8"?><osm-script timeout="1800" element-limit="1073741824">
  <query type="way" into="hw">
    <has-kv k="highway"/>
    <has-kv k="highway" modv="not" regv="footway|cycleway|path|service|track"/>
    <bbox-query s="14.498508149446216" w="120.94779968261719" n="14.67061869442178" e="121.0638427734375"/>
  </query>
  
  <foreach from="hw" into="w">
    <recurse from="w" type="way-node" into="ns"/>
    <recurse from="ns" type="node-way" into="w2"/>
    <query type="way" into="w2">
      <item set="w2"/>
      <has-kv k="highway"/>
      <has-kv k="highway" modv="not" regv="footway|cycleway|path|service|track"/>
    </query>
    <difference into="wd">
      <item set="w2"/>
      <item set="w"/>
    </difference>
    <recurse from="wd" type="way-node" into="n2"/>
    <recurse from="w"  type="way-node" into="n3"/>
    <query type="node">
      <item set="n2"/>
      <item set="n3"/>
    </query>
    <print/>
  </foreach>
</osm-script>
  3. Opening hours analysis
[date:"2017-08-31T00:00:00"][out:json][timeout:4000];
area["type"="boundary"]["ISO3166-2"="DE-NW"];
foreach(
    node(area)["opening_hours"]->.t; .t out tags;
    node(area)["opening_hours:kitchen"]->.t; .t out tags;
    node(area)["opening_hours:warm_kitchen"]->.t; .t out tags;
    node(area)["happy_hours"]->.t; .t out tags;
    node(area)["delivery_hours"]->.t; .t out tags;
    node(area)["opening_hours:delivery"]->.t; .t out tags;
    node(area)["lit"]->.t; .t out tags;
    node(area)["smoking_hours"]->.t; .t out tags;
    node(area)["collection_times"]->.t; .t out tags;
    node(area)["service_times"]->.t; .t out tags;
    node(area)["fee"]->.t; .t out tags;
    way(area)["opening_hours"]->.t; .t out tags;
    way(area)["opening_hours:kitchen"]->.t; .t out tags;
    way(area)["opening_hours:warm_kitchen"]->.t; .t out tags;
    way(area)["happy_hours"]->.t; .t out tags;
    way(area)["delivery_hours"]->.t; .t out tags;
    way(area)["opening_hours:delivery"]->.t; .t out tags;
    way(area)["lit"]->.t; .t out tags;
    way(area)["smoking_hours"]->.t; .t out tags;
    way(area)["collection_times"]->.t; .t out tags;
    way(area)["service_times"]->.t; .t out tags;
    way(area)["fee"]->.t; .t out tags;
);

Also slow for ["ISO3166-2"="DE-BY"]

Code: https://github.com/opening-hours/opening_hours.js/blob/master/Makefile#L338-L356

A regular expression could be used instead (already implemented but inactive); also filter out [fee!=no][fee!=yes][lit!=no][lit!=yes]

Query aborts anyway as it uses too much memory: runtime error: Query run out of memory using about 2048 MB of RAM.

If query runs a few minutes after midnight, leave out [date:...] altogether, it just doesn't matter.

Follow up actions:

  1. Overpass Turbo - Map example on large bounding box

  2. Query with very large bbox

The bbox is counterproductive; removing it gives faster results

[out:json][timeout:180][maxsize:1048576];
(
  node["amenity"="compressed_air"](-80.87282721505684,-180,88.09879913729107,180);
  way["amenity"="compressed_air"](-80.87282721505684,-180,88.09879913729107,180);
  relation["amenity"="compressed_air"](-80.87282721505684,-180,88.09879913729107,180);
);
out center meta;
>;
out skel qt;
  3. Expensive Achavi style queries on large bbox
[adiff:"2017-08-06T22:39:23Z","2017-08-06T22:43:13Z"];
(
  node(36.3181693,5.5767073,47.8357181,18.9969694)(changed);
  way(36.3181693,5.5767073,47.8357181,18.9969694)(changed);
);
out meta geom(36.3181693,5.5767073,47.8357181,18.9969694);

Munin charts (perf test system on custom branch)

(15 chart images)

Munin charts (main instance)

(8 chart images)

Owner

mmd-osm commented Sep 3, 2017

Recommendations:

  • Reduce default query runtime from 180s to 60s (that's long enough for this branch)
  • Hard limit timeout to 300s (even if user sets ridiculously high timeouts like 86400s).
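The proposed clamping amounts to the following sketch; `effective_timeout` and the constants are illustrative names, not actual Overpass code:

```python
DEFAULT_TIMEOUT = 60   # proposed new default (down from 180s)
MAX_TIMEOUT = 300      # proposed hard cap

def effective_timeout(requested=None):
    """Fall back to the default when no timeout is given,
    otherwise clamp the user-supplied value to the hard limit."""
    if requested is None or requested <= 0:
        return DEFAULT_TIMEOUT
    return min(requested, MAX_TIMEOUT)
```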
Owner

mmd-osm commented Oct 4, 2017

Simulating a 24h day using 2 consuming threads

Charts include:

  • Minutely updates
  • Hourly Area updates
  • Nginx gzip compression

(10 chart images)

@mmd-osm mmd-osm changed the title from Perf: Full day simulation to Perf: Full day simulation (2017 perf test) Oct 13, 2017
