Nominatim slow query #1139

pedropinheiro75 · 2018-08-14T21:17:44Z

I recently imported the nominatim with full planet and queries are undesirable performance. The first query, before the cache is created, takes longer than 20 seconds and sometimes I see this error:

Internal Server Error
Nominatim has encountered an internal error while accessing the database. This may happen because the database is broken or because of a bug in the software. If you think it is a bug, feel free to report it over on Github. Please include the URL that caused the problem and the complete error details below.
Message: Could not lookup place
SQL Error: DB Error: unknown error
Details:
SELECT     osm_type,    osm_id,    class,    type,    admin_level,    rank_search,    rank_address,    min(place_id) AS place_id,    min(parent_place_id) AS parent_place_id,    -1 as housenumber,    country_code,get_address_by_language(place_id,-1,ARRAY['short_name','name','brand','official_name','ref','type']) AS langaddress,    get_name_by_language(name,ARRAY['short_name','name','brand','official_name','ref','type']) AS placename,    get_name_by_language(name, ARRAY['ref']) AS ref,    avg(ST_X(centroid)) AS lon,     avg(ST_Y(centroid)) AS lat,     COALESCE(importance,0.75-(rank_search::float/40)) AS importance, (SELECT max(ai_p.importance * (ai_p.rank_address + 2))   FROM place_addressline ai_s, placex ai_p   WHERE ai_s.place_id = min(CASE WHEN placex.rank_search < 28 THEN placex.place_id ELSE placex.parent_place_id END)     AND ai_p.place_id = ai_s.address_place_id      AND ai_s.isaddress      AND ai_p.importance is not null) AS addressimportance,    (extratags->'place') AS extra_place  FROM placex WHERE place_id in (92672355,196842012,4488269,196732908,48310082,160137,91687842,70711723,83690356,77366185,74053027,92460750,82583385,113120861,177719652,86034442,82755806,76866792,83301006,92789547)    AND (        placex.rank_address between 0 and 30     OR (extratags->'place') = 'city'       )     AND linked_place_id is null  GROUP BY      osm_type,      osm_id,      class,      type,      admin_level,      rank_search,      rank_address,      housenumber,     country_code,      importance,      langaddress,      placename,      ref,      extratags->'place'  [nativecode=ERROR:  canceling statement due to statement timeout
CONTEXT:  PL/pgSQL function get_addressdata(bigint,integer) line 103 at FOR over SELECT rows
PL/pgSQL function get_address_by_language(bigint,integer,text[]) line 12 at FOR over SELECT rows]

Due to the delay of the queries I thought the problem was the lack of some index and then I executed:

nohup ./utils/setup.php --create-search-indices --ignore-errors &

Output:

2018-08-13 20:59:56 == Create Search indices
perl: warning: Setting locale failed.
perl: warning: Please check that your locale settings:
    LANGUAGE = (unset),
    LC_ALL = (unset),
    LC_CTYPE = "UTF-8",
    LANG = "en_US.UTF-8"
    are supported and installed on your system.
perl: warning: Falling back to a fallback locale ("en_US.UTF-8").
ERROR:  relation "idx_word_word_id" already exists
ERROR:  relation "idx_search_name_nameaddress_vector" already exists
ERROR:  relation "idx_search_name_name_vector" already exists
ERROR:  relation "idx_search_name_centroid" already exists
ERROR:  relation "idx_place_addressline_address_place_id" already exists
ERROR:  relation "idx_placex_rank_address" already exists
ERROR:  relation "idx_placex_pendingsector" already exists
ERROR:  relation "idx_placex_parent_place_id" already exists
ERROR:  relation "idx_placex_reverse_geometry" already exists
ERROR:  relation "idx_location_area_country_place_id" already exists
ERROR:  relation "idx_osmline_parent_place_id" already exists
ERROR:  relation "idx_search_name_country_centroid" already exists
NOTICE:  index "place_id_idx" does not exist, skipping
ERROR:  relation "idx_place_osm_unique" already exists
ERROR:  relation "idx_postcode_id" already exists
ERROR:  relation "idx_postcode_postcode" already exists
Summary of warnings:


2018-08-14 02:23:14 == Setup finished.

What does this line mean? NOTICE: index "place_id_idx" does not exist, skipping

After that, I turn on the postgresql log and called the search api with these parameters:

http://nominatim.dev/nominatim/search.php?format=jsonv2&addressdetails=1&q=cancun

Logs:

2018-08-14 14:30:49 UTC [2489-1] www-data@nominatim LOG:  duration: 16092.775 ms  statement: SELECT     osm_type,    osm_id,    class,    type,    admin_level,    rank_search,    rank_address,    min(place_id) AS place_id,    min(parent_place_id) AS parent_place_id,    -1 as housenumber,    country_code,get_address_by_language(place_id,-1,ARRAY['short_name:en-en','name:en-en','short_name:en','name:en','short_name','name','brand','official_name:en-en','official_name:en','official_name','ref','type']) AS langaddress,    get_name_by_language(name,ARRAY['short_name:en-en','name:en-en','short_name:en','name:en','short_name','name','brand','official_name:en-en','official_name:en','official_name','ref','type']) AS placename,    get_name_by_language(name, ARRAY['ref']) AS ref,    avg(ST_X(centroid)) AS lon,     avg(ST_Y(centroid)) AS lat,     COALESCE(importance,0.75-(rank_search::float/40)) AS importance, (SELECT max(ai_p.importance * (ai_p.rank_address + 2))   FROM place_addressline ai_s, placex ai_p   WHERE ai_s.place_id = min(CASE WHEN placex.rank_search < 28 THEN placex.place_id ELSE placex.parent_place_id END)     AND ai_p.place_id = ai_s.address_place_id      AND ai_s.isaddress      AND ai_p.importance is not null) AS addressimportance,    (extratags->'place') AS extra_place  FROM placex WHERE place_id in (197441826,111603036,182595459,139694930,144830394,118512715,74049435,99932703,159450111,141912308,112119380,80303695,80305679,112647794,80522782,115827503,128389572,149585949,74675442,150225528)    AND (        placex.rank_address between 0 and 30     OR (extratags->'place') = 'city'       )     AND linked_place_id is null  GROUP BY      osm_type,      osm_id,      class,      type,      admin_level,      rank_search,      rank_address,      housenumber,     country_code,      importance,      langaddress,      placename,      ref,      extratags->'place'

2018-08-14 14:31:35 UTC [2490-1] ERROR:  function transliteration(text) does not exist at character 23
2018-08-14 14:31:35 UTC [2490-2] HINT:  No function matches the given name and argument types. You might need to add explicit type casts.
2018-08-14 14:31:35 UTC [2490-3] QUERY:  SELECT gettokenstring(transliteration(name))
2018-08-14 14:31:35 UTC [2490-4] CONTEXT:  PL/pgSQL function public.make_standard_name(text) line 5 at assignment
    automatic analyze of table "nominatim.public.placex"

Looking at the logs I thought some functions might be missing, and execute this:

nohup ./utils/setup.php --create-functions --ignore-errors &

2018-08-14 14:48:04 == Create Functions
perl: warning: Setting locale failed.
perl: warning: Please check that your locale settings:
    LANGUAGE = (unset),
    LC_ALL = (unset),
    LC_CTYPE = "UTF-8",
    LANG = "en_US.UTF-8"
    are supported and installed on your system.
perl: warning: Falling back to a fallback locale ("en_US.UTF-8").
NOTICE:  drop cascades to function get_addressdata(bigint,integer)
NOTICE:  drop cascades to function get_wikipedia_match(hstore,character varying)
Summary of warnings:


2018-08-14 14:48:04 == Setup finished.

Some info about my installation:

Nominatim version 3.1.0

Ec2 t2.2xlarge with T2 Unlimited | vCPU 8 | 32GB RAM | SSD 870 GB gp2 2610/3000 IOPS

 Postgresql configurations:
- shared_buffers = 2GB
- maintenance_work_mem = 10GB
- work_mem = 50MB
- effective_cache_size = 24GB
- synchronous_commit = off
- checkpoint_timeout = 10min
- checkpoint_completion_target = 0.9
- fsync = on
- full_page_writes = on
- random_page_cost = 1.1
- wal_buffers = 16MB
- max_worker_processes = 8

df -h

Filesystem      Size  Used Avail Use% Mounted on
udev             16G     0   16G   0% /dev
tmpfs           3.2G  9.5M  3.2G   1% /run
/dev/xvda1      844G  674G  170G  80% /
tmpfs            16G  4.0K   16G   1% /dev/shm
tmpfs           5.0M     0  5.0M   0% /run/lock
tmpfs            16G     0   16G   0% /sys/fs/cgroup
/dev/loop0       13M   13M     0 100% /snap/amazon-ssm-agent/295
/dev/loop1       87M   87M     0 100% /snap/core/4917
/dev/loop2       87M   87M     0 100% /snap/core/4830
/dev/loop3       87M   87M     0 100% /snap/core/5145
tmpfs           3.2G     0  3.2G   0% /run/user/1000

free -h 

              total        used        free      shared  buff/cache   available
Mem:            31G        171M         29G        115M        1.5G         30G
Swap:            0B          0B          0B

nominatim=# \d search_name
                Table "public.search_name"
       Column       |          Type           | Modifiers
--------------------+-------------------------+-----------
 place_id           | bigint                  |
 importance         | double precision        |
 search_rank        | smallint                |
 address_rank       | smallint                |
 name_vector        | integer[]               |
 nameaddress_vector | integer[]               |
 country_code       | character varying(2)    |
 centroid           | geometry(Geometry,4326) |
Indexes:
    "idx_search_name_centroid" gist (centroid)
    "idx_search_name_name_vector" gin (name_vector) WITH (fastupdate=off)
    "idx_search_name_nameaddress_vector" gin (nameaddress_vector) WITH (fastupdate=off)
    "idx_search_name_place_id" btree (place_id)

nominatim=# \d placex
                   Table "public.placex"
     Column      |            Type             | Modifiers
-----------------+-----------------------------+-----------
 place_id        | bigint                      | not null
 parent_place_id | bigint                      |
 linked_place_id | bigint                      |
 importance      | double precision            |
 indexed_date    | timestamp without time zone |
 geometry_sector | integer                     |
 rank_address    | smallint                    |
 rank_search     | smallint                    |
 partition       | smallint                    |
 indexed_status  | smallint                    |
 osm_id          | bigint                      | not null
 osm_type        | character(1)                | not null
 class           | text                        | not null
 type            | text                        | not null
 name            | hstore                      |
 admin_level     | smallint                    |
 address         | hstore                      |
 extratags       | hstore                      |
 geometry        | geometry(Geometry,4326)     | not null
 wikipedia       | text                        |
 country_code    | character varying(2)        |
 housenumber     | text                        |
 postcode        | text                        |
 centroid        | geometry(Geometry,4326)     |
Indexes:
    "idx_place_id" UNIQUE, btree (place_id)
    "idx_placex_adminname" btree (make_standard_name(name -> 'name'::text), rank_search) WHERE osm_type = 'N'::bpchar AND rank_search < 26
    "idx_placex_geometry" gist (geometry)
    "idx_placex_linked_place_id" btree (linked_place_id) WHERE linked_place_id IS NOT NULL
    "idx_placex_osmid" btree (osm_type, osm_id)
    "idx_placex_parent_place_id" btree (parent_place_id) WHERE parent_place_id IS NOT NULL
    "idx_placex_pendingsector" btree (rank_search, geometry_sector) WHERE indexed_status > 0
    "idx_placex_rank_address" btree (rank_address)
    "idx_placex_rank_search" btree (rank_search)
    "idx_placex_reverse_geometry" gist (geometry) WHERE rank_search <> 28 AND (name IS NOT NULL OR housenumber IS NOT NULL) AND (class <> ALL (ARRAY['waterway'::text, 'railway'::text, 'tunnel'::text, 'bridge'::text, 'man_made'::text]))
Triggers:
    placex_before_delete AFTER DELETE ON placex FOR EACH ROW EXECUTE PROCEDURE placex_delete()
    placex_before_insert BEFORE INSERT ON placex FOR EACH ROW EXECUTE PROCEDURE placex_insert()
    placex_before_update BEFORE UPDATE ON placex FOR EACH ROW EXECUTE PROCEDURE placex_update()

I did not come up with a solution with no command executed

What else can I do to try to solve this?

Firstly I posted this issue in the OpenStreetMap Help and by @mtmail recommendation I am moving it to github.

The text was updated successfully, but these errors were encountered:

pedropinheiro75 · 2018-08-14T21:20:44Z

@mtmail about Try running test queries with 'EXPLAIN', see a similar discussion at #1023 (comment)_

Here it is:

nominatim=# EXPLAIN SELECT place_id,0::int as exactmatch FROM search_name
nominatim-# WHERE name_vector @> ARRAY[5639492] ORDER BY
nominatim-# (CASE WHEN importance = 0 OR importance IS NULL THEN 0.75001-(search_rank::float/40) ELSE importance END) DESC LIMIT 20;
                                                                                                      QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=9809.24..9809.29 rows=20 width=18)
   ->  Sort  (cost=9809.24..9830.51 rows=8507 width=18)
         Sort Key: (CASE WHEN ((importance = '0'::double precision) OR (importance IS NULL)) THEN ('0.75001'::double precision - ((search_rank)::double precision / '40'::double precision)) ELSE importance END) DESC
         ->  Bitmap Heap Scan on search_name  (cost=78.03..9582.87 rows=8507 width=18)
               Recheck Cond: (name_vector @> '{5639492}'::integer[])
               ->  Bitmap Index Scan on idx_search_name_name_vector  (cost=0.00..75.90 rows=8507 width=0)
                     Index Cond: (name_vector @> '{5639492}'::integer[])
(7 rows)

nominatim=# EXPLAIN ANALYSE SELECT place_id,0::int as exactmatch FROM search_name
nominatim-# WHERE name_vector @> ARRAY[5639492] ORDER BY
nominatim-# (CASE WHEN importance = 0 OR importance IS NULL THEN 0.75001-(search_rank::float/40) ELSE importance END) DESC LIMIT 20;
                                                                                                      QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=9809.24..9809.29 rows=20 width=18) (actual time=2.235..2.236 rows=1 loops=1)
   ->  Sort  (cost=9809.24..9830.51 rows=8507 width=18) (actual time=2.233..2.234 rows=1 loops=1)
         Sort Key: (CASE WHEN ((importance = '0'::double precision) OR (importance IS NULL)) THEN ('0.75001'::double precision - ((search_rank)::double precision / '40'::double precision)) ELSE importance END) DESC
         Sort Method: quicksort  Memory: 25kB
         ->  Bitmap Heap Scan on search_name  (cost=78.03..9582.87 rows=8507 width=18) (actual time=2.192..2.193 rows=1 loops=1)
               Recheck Cond: (name_vector @> '{5639492}'::integer[])
               Heap Blocks: exact=1
               ->  Bitmap Index Scan on idx_search_name_name_vector  (cost=0.00..75.90 rows=8507 width=0) (actual time=1.846..1.846 rows=1 loops=1)
                     Index Cond: (name_vector @> '{5639492}'::integer[])
 Planning time: 0.745 ms
 Execution time: 2.282 ms
(11 rows)

mtmail · 2018-08-15T14:53:22Z

OK, those queries are fast. It might be worth trying an EXPLAIN on the query from the first error message SELECT osm_type, osm_id, class, type, admin_level...

Do you have monitoring on disk IO, e.g. via AWS Cloudwatch(?).

Do you remember how long the initial import ran?

I see the GP storage type is network mounted (https://aws.amazon.com/ebs/details/) and the T2 instances are not EBS optimized (https://aws.amazon.com/ec2/instance-types/). I can't tell if that means the disk IO is slow or unsuitable for small burst, I only have experience with the super small T2.micro instances.

pedropinheiro75 · 2018-08-15T19:50:55Z

Here is the EXPLAIN from the first error:

EXPLAIN SELECT     osm_type,    osm_id,    class,    type,    admin_level,    rank_search,    rank_address,    min(place_id) AS place_id,    min(parent_place_id) AS parent_place_id,    -1 as housenumber,    country_code,get_address_by_language(place_id,-1,ARRAY['short_name','name','brand','official_name','ref','type']) AS langaddress,    get_name_by_language(name,ARRAY['short_name','name','brand','official_name','ref','type']) AS placename,    get_name_by_language(name, ARRAY['ref']) AS ref,    avg(ST_X(centroid)) AS lon,     avg(ST_Y(centroid)) AS lat,     COALESCE(importance,0.75-(rank_search::float/40)) AS importance, (SELECT max(ai_p.importance * (ai_p.rank_address + 2))   FROM place_addressline ai_s, placex ai_p   WHERE ai_s.place_id = min(CASE WHEN placex.rank_search < 28 THEN placex.place_id ELSE placex.parent_place_id END)     AND ai_p.place_id = ai_s.address_place_id      AND ai_s.isaddress      AND ai_p.importance is not null) AS addressimportance,    (extratags->'place') AS extra_place  FROM placex WHERE place_id in (92672355,196842012,4488269,196732908,48310082,160137,91687842,70711723,83690356,77366185,74053027,92460750,82583385,113120861,177719652,86034442,82755806,76866792,83301006,92789547)    AND (        placex.rank_address between 0 and 30     OR (extratags->'place') = 'city'       )     AND linked_place_id is null  GROUP BY      osm_type,      osm_id,      class,      type,      admin_level,      rank_search,      rank_address,      housenumber,     country_code,      importance,      langaddress,      placename,      ref,      extratags->'place';

 HashAggregate  (cost=68.63..105383.77 rows=20 width=180)
   Group Key: placex.osm_type, placex.osm_id, placex.class, placex.type, placex.admin_level, placex.rank_search, placex.rank_address, placex.housenumber, placex.country_code, placex.importance, get_address_by_language(placex.place_id, '-1'::integer, '{short_name,name,brand,official_name,ref,type}'::text[]), get_name_by_language(placex.name, '{short_name,name,brand,official_name,ref,type}'::text[]), get_name_by_language(placex.name, '{ref}'::text[]), (placex.extratags -> 'place'::text)
   ->  Index Scan using idx_place_id on placex  (cost=0.57..67.53 rows=20 width=180)
         Index Cond: (place_id = ANY ('{92672355,196842012,4488269,196732908,48310082,160137,91687842,70711723,83690356,77366185,74053027,92460750,82583385,113120861,177719652,86034442,82755806,76866792,83301006,92789547}'::bigint[]))
         Filter: ((linked_place_id IS NULL) AND (((rank_address >= 0) AND (rank_address <= 30)) OR ((extratags -> 'place'::text) = 'city'::text)))
   SubPlan 1
     ->  Aggregate  (cost=5264.97..5264.98 rows=1 width=10)
           ->  Nested Loop  (cost=1.15..5264.92 rows=5 width=10)
                 ->  Index Scan using idx_place_addressline_place_id on place_addressline ai_s  (cost=0.57..1779.24 rows=1246 width=8)
                       Index Cond: (place_id = min(CASE WHEN (placex.rank_search < 28) THEN placex.place_id ELSE placex.parent_place_id END))
                       Filter: isaddress
                 ->  Index Scan using idx_place_id on placex ai_p  (cost=0.57..2.79 rows=1 width=18)
                       Index Cond: (place_id = ai_s.address_place_id)
                       Filter: (importance IS NOT NULL)
(14 rows)

Yes, I have detailed AWS CloudWatch monitoring.

What I noticed was that even running slow queries like this:

2018-08-15 19:40:39 UTC [1821-1] www-data@nominatim LOG:  duration: 11714.008 ms  statement: SELECT     osm_type,    osm_id,    class,    type,    admin_level,    rank_search,    rank_address,    min(place_id) AS place_id,    min(parent_place_id) AS parent_place_id,    -1 as housenumber,    country_code,get_address_by_language(place_id,-1,ARRAY['short_name:pt-br','name:pt-br','short_name:pt','name:pt','short_name','name','brand','official_name:pt-br','official_name:pt','official_name','ref','type']) AS langaddress,    get_name_by_language(name,ARRAY['short_name:pt-br','name:pt-br','short_name:pt','name:pt','short_name','name','brand','official_name:pt-br','official_name:pt','official_name','ref','type']) AS placename,    get_name_by_language(name, ARRAY['ref']) AS ref,    avg(ST_X(centroid)) AS lon,     avg(ST_Y(centroid)) AS lat,     COALESCE(importance,0.75-(rank_search::float/40)) AS importance, (SELECT max(ai_p.importance * (ai_p.rank_address + 2))   FROM place_addressline ai_s, placex ai_p   WHERE ai_s.place_id = min(CASE WHEN placex.rank_search < 28 THEN placex.place_id ELSE placex.parent_place_id END)     AND ai_p.place_id = ai_s.address_place_id      AND ai_s.isaddress      AND ai_p.importance is not null) AS addressimportance,    (extratags->'place') AS extra_place  FROM placex WHERE place_id in (658333,64429858,6441754,6487545,151535658,63824784,47915977,21898420,112495770,103601664,118742537,104093519,103465203,117810658,114175437,117250188,82721506,95352494,99283832,143236712)    AND (        placex.rank_address between 0 and 30     OR (extratags->'place') = 'city'       )     AND linked_place_id is null  GROUP BY      osm_type,      osm_id,      class,      type,      admin_level,      rank_search,      rank_address,      housenumber,     country_code,      importance,      langaddress,      placename,      ref,      extratags->'place'

the use of CPU and Disk remains almost null. (I waited a few minutes to get this print)

My initial setup it took about three days. For that I set up the swap , could this mess up something?

In relation about EBS on T2 instance I have another Nominatim instance with full planet with T2 Instances and everything works fine.

lonvia · 2018-08-15T20:00:05Z

Please run ANALYSE on your database and see if that helps. If it doesn't, enable long query logging in your postgres config (log_min_duration_statement = 1000), reload the postgres service and rerun the slow query. Then report the exact query you tried and the postgres log results.

mtmail · 2018-08-15T20:17:26Z

My initial setup it took about three days

Three days is pretty good. Doubling hardware and faster discs might get you to two days. I was just wondering if took two weeks (we have user reporting that on underpowered hardware).

pedropinheiro75 · 2018-08-16T13:50:41Z

@lonvia after run psql -d nominatim -c 'ANALYZE VERBOSE'

http://nominatim.dev/nominatim/search.php?format=jsonv2&accept-language=pt-br&addressdetails=1&q=verdan

http://nominatim.dev/nominatim/search.php?format=jsonv2&accept-language=pt-br&addressdetails=1&q=netuno

http://nominatim.dev/nominatim/search.php?format=jsonv2&accept-language=pt-br&addressdetails=1&q=plutão

2018-08-16 13:29:29 UTC [2037-1] LOG:  database system was shut down at 2018-08-16 13:29:20 UTC
2018-08-16 13:29:29 UTC [2037-2] LOG:  MultiXact member wraparound protections are now enabled
2018-08-16 13:29:29 UTC [2041-1] LOG:  autovacuum launcher started
2018-08-16 13:29:29 UTC [2035-3] LOG:  database system is ready to accept connections
2018-08-16 13:29:29 UTC [2043-1] [unknown]@[unknown] LOG:  incomplete startup packet
2018-08-16 13:41:17 UTC [2168-1] www-data@nominatim LOG:  duration: 2185.926 ms  statement: SELECT     osm_type,    osm_id,    class,    type,    admin_level,    rank_search,    rank_address,    min(place_id) AS place_id,    min(parent_place_id) AS parent_place_id,    -1 as housenumber,    country_code,get_address_by_language(place_id,-1,ARRAY['short_name:pt-br','name:pt-br','short_name:pt','name:pt','short_name','name','brand','official_name:pt-br','official_name:pt','official_name','ref','type']) AS langaddress,    get_name_by_language(name,ARRAY['short_name:pt-br','name:pt-br','short_name:pt','name:pt','short_name','name','brand','official_name:pt-br','official_name:pt','official_name','ref','type']) AS placename,    get_name_by_language(name, ARRAY['ref']) AS ref,    avg(ST_X(centroid)) AS lon,     avg(ST_Y(centroid)) AS lat,     COALESCE(importance,0.75-(rank_search::float/40)) AS importance, (SELECT max(ai_p.importance * (ai_p.rank_address + 2))   FROM place_addressline ai_s, placex ai_p   WHERE ai_s.place_id = min(CASE WHEN placex.rank_search < 28 THEN placex.place_id ELSE placex.parent_place_id END)     AND ai_p.place_id = ai_s.address_place_id      AND ai_s.isaddress      AND ai_p.importance is not null) AS addressimportance,    (extratags->'place') AS extra_place  FROM placex WHERE place_id in (45050252,22415400,22203205,19410621,4163997)    AND (        placex.rank_address between 0 and 30     OR (extratags->'place') = 'city'       )     AND linked_place_id is null  GROUP BY      osm_type,      osm_id,      class,      type,      admin_level,      rank_search,      rank_address,      housenumber,     country_code,      importance,      langaddress,      placename,      ref,      extratags->'place'
2018-08-16 13:41:54 UTC [2171-1] www-data@nominatim LOG:  duration: 2715.655 ms  statement: SELECT     osm_type,    osm_id,    class,    type,    admin_level,    rank_search,    rank_address,    min(place_id) AS place_id,    min(parent_place_id) AS parent_place_id,    -1 as housenumber,    country_code,get_address_by_language(place_id,-1,ARRAY['short_name:pt-br','name:pt-br','short_name:pt','name:pt','short_name','name','brand','official_name:pt-br','official_name:pt','official_name','ref','type']) AS langaddress,    get_name_by_language(name,ARRAY['short_name:pt-br','name:pt-br','short_name:pt','name:pt','short_name','name','brand','official_name:pt-br','official_name:pt','official_name','ref','type']) AS placename,    get_name_by_language(name, ARRAY['ref']) AS ref,    avg(ST_X(centroid)) AS lon,     avg(ST_Y(centroid)) AS lat,     COALESCE(importance,0.75-(rank_search::float/40)) AS importance, (SELECT max(ai_p.importance * (ai_p.rank_address + 2))   FROM place_addressline ai_s, placex ai_p   WHERE ai_s.place_id = min(CASE WHEN placex.rank_search < 28 THEN placex.place_id ELSE placex.parent_place_id END)     AND ai_p.place_id = ai_s.address_place_id      AND ai_s.isaddress      AND ai_p.importance is not null) AS addressimportance,    (extratags->'place') AS extra_place  FROM placex WHERE place_id in (134143511,53735345,197242096,43840262,14573170)    AND (        placex.rank_address between 0 and 30     OR (extratags->'place') = 'city'       )     AND linked_place_id is null  GROUP BY      osm_type,      osm_id,      class,      type,      admin_level,      rank_search,      rank_address,      housenumber,     country_code,      importance,      langaddress,      placename,      ref,      extratags->'place'
2018-08-16 13:42:12 UTC [2172-1] www-data@nominatim LOG:  duration: 11558.166 ms  statement: SELECT     osm_type,    osm_id,    class,    type,    admin_level,    rank_search,    rank_address,    min(place_id) AS place_id,    min(parent_place_id) AS parent_place_id,    -1 as housenumber,    country_code,get_address_by_language(place_id,-1,ARRAY['short_name:pt-br','name:pt-br','short_name:pt','name:pt','short_name','name','brand','official_name:pt-br','official_name:pt','official_name','ref','type']) AS langaddress,    get_name_by_language(name,ARRAY['short_name:pt-br','name:pt-br','short_name:pt','name:pt','short_name','name','brand','official_name:pt-br','official_name:pt','official_name','ref','type']) AS placename,    get_name_by_language(name, ARRAY['ref']) AS ref,    avg(ST_X(centroid)) AS lon,     avg(ST_Y(centroid)) AS lat,     COALESCE(importance,0.75-(rank_search::float/40)) AS importance, (SELECT max(ai_p.importance * (ai_p.rank_address + 2))   FROM place_addressline ai_s, placex ai_p   WHERE ai_s.place_id = min(CASE WHEN placex.rank_search < 28 THEN placex.place_id ELSE placex.parent_place_id END)     AND ai_p.place_id = ai_s.address_place_id      AND ai_s.isaddress      AND ai_p.importance is not null) AS addressimportance,    (extratags->'place') AS extra_place  FROM placex WHERE place_id in (89121656,100474154,129955308,110139220,178136713,127603481,127603054,128135891,128135499,118073268,103931881,166244432,164342431,87947148,97083575,114067315,172273665,177880016,72092325,93795339)    AND (        placex.rank_address between 0 and 30     OR (extratags->'place') = 'city'       )     AND linked_place_id is null  GROUP BY      osm_type,      osm_id,      class,      type,      admin_level,      rank_search,      rank_address,      housenumber,     country_code,      importance,      langaddress,      placename,      ref,      extratags->'place'

lonvia · 2018-08-16T14:03:48Z

Wait, are you saying that the slow queries only happen after the database was restarted and things settle down after a while? That would be perfectly normal. Postgresql has to fetch many gigabytes of index data before it can serve the queries efficiently.

pedropinheiro75 · 2018-08-16T14:50:21Z

I understand. Extremely grateful for the help.

There's something else, can you tell me if it's normal?
If I run the same query several times the caches is created and everything works incredibly fast. If I create an AMI and lauch a new machine based on it, the same query in the first call I always see:

Internal Server Error

Nominatim has encountered an internal error while accessing the database. This may happen because the database is broken or because of a bug in the software. If you think it is a bug, feel free to report it over on Github. Please include the URL that caused the problem and the complete error details below.

Message: Could not lookup place

SQL Error: DB Error: unknown error

Details:

SELECT     osm_type,    osm_id,    class,    type,    admin_level,    rank_search,    rank_address,    min(place_id) AS place_id,    min(parent_place_id) AS parent_place_id,    -1 as housenumber,    country_code,get_address_by_language(place_id,-1,ARRAY['short_name:pt-br','name:pt-br','short_name:pt','name:pt','short_name','name','brand','official_name:pt-br','official_name:pt','official_name','ref','type']) AS langaddress,    get_name_by_language(name,ARRAY['short_name:pt-br','name:pt-br','short_name:pt','name:pt','short_name','name','brand','official_name:pt-br','official_name:pt','official_name','ref','type']) AS placename,    get_name_by_language(name, ARRAY['ref']) AS ref,    avg(ST_X(centroid)) AS lon,     avg(ST_Y(centroid)) AS lat,     COALESCE(importance,0.75-(rank_search::float/40)) AS importance, (SELECT max(ai_p.importance * (ai_p.rank_address + 2))   FROM place_addressline ai_s, placex ai_p   WHERE ai_s.place_id = min(CASE WHEN placex.rank_search < 28 THEN placex.place_id ELSE placex.parent_place_id END)     AND ai_p.place_id = ai_s.address_place_id      AND ai_s.isaddress      AND ai_p.importance is not null) AS addressimportance,    (extratags->'place') AS extra_place  FROM placex WHERE place_id in (89121656,100474154,129955308,110139220,178136713,127603481,127603054,128135891,128135499,118073268,103931881,166244432,164342431,87947148,97083575,114067315,172273665,177880016,72092325,93795339)    AND (        placex.rank_address between 0 and 30     OR (extratags->'place') = 'city'       )     AND linked_place_id is null  GROUP BY      osm_type,      osm_id,      class,      type,      admin_level,      rank_search,      rank_address,      housenumber,     country_code,      importance,      langaddress,      placename,      ref,      extratags->'place'  [nativecode=ERROR:  canceling statement due to statement timeout
CONTEXT:  PL/pgSQL function get_addressdata(bigint,integer) line 103 at FOR over SELECT rows
PL/pgSQL function get_address_by_language(bigint,integer,text[]) line 12 at FOR over SELECT rows]

and after a few calls I get status code 200 normally.

Postgresql Log:

2018-08-16 14:32:53 UTC [1541-1] LOG:  database system was shut down at 2018-08-16 14:32:52 UTC
2018-08-16 14:32:53 UTC [1541-2] LOG:  MultiXact member wraparound protections are now enabled
2018-08-16 14:32:53 UTC [1539-3] LOG:  database system is ready to accept connections
2018-08-16 14:32:53 UTC [1545-1] LOG:  autovacuum launcher started
2018-08-16 14:32:54 UTC [1547-1] [unknown]@[unknown] LOG:  incomplete startup packet
Interrupt requested
Interrupt requested
2018-08-16 14:33:56 UTC [1561-1] www-data@nominatim ERROR:  canceling statement due to statement timeout
2018-08-16 14:33:56 UTC [1561-2] www-data@nominatim CONTEXT:  PL/pgSQL function get_addressdata(bigint,integer) line 103 at FOR over SELECT rows
        PL/pgSQL function get_address_by_language(bigint,integer,text[]) line 12 at FOR over SELECT rows
2018-08-16 14:33:56 UTC [1561-3] www-data@nominatim STATEMENT:  SELECT     osm_type,    osm_id,    class,    type,    admin_level,    rank_search,    rank_address,    min(place_id) AS place_id,    min(parent_place_id) AS parent_place_id,    -1 as housenumber,    country_code,get_address_by_language(place_id,-1,ARRAY['short_name:pt-br','name:pt-br','short_name:pt','name:pt','short_name','name','brand','official_name:pt-br','official_name:pt','official_name','ref','type']) AS langaddress,    get_name_by_language(name,ARRAY['short_name:pt-br','name:pt-br','short_name:pt','name:pt','short_name','name','brand','official_name:pt-br','official_name:pt','official_name','ref','type']) AS placename,    get_name_by_language(name, ARRAY['ref']) AS ref,    avg(ST_X(centroid)) AS lon,     avg(ST_Y(centroid)) AS lat,     COALESCE(importance,0.75-(rank_search::float/40)) AS importance, (SELECT max(ai_p.importance * (ai_p.rank_address + 2))   FROM place_addressline ai_s, placex ai_p   WHERE ai_s.place_id = min(CASE WHEN placex.rank_search < 28 THEN placex.place_id ELSE placex.parent_place_id END)     AND ai_p.place_id = ai_s.address_place_id      AND ai_s.isaddress      AND ai_p.importance is not null) AS addressimportance,    (extratags->'place') AS extra_place  FROM placex WHERE place_id in (89121656,100474154,129955308,110139220,178136713,127603481,127603054,128135891,128135499,118073268,103931881,166244432,164342431,87947148,97083575,114067315,172273665,177880016,72092325,93795339)    AND (        placex.rank_address between 0 and 30     OR (extratags->'place') = 'city'       )     AND linked_place_id is null  GROUP BY      osm_type,      osm_id,      class,      type,      admin_level,      rank_search,      rank_address,      housenumber,     country_code,      importance,      langaddress,      placename,      ref,      extratags->'place'
Interrupt requested
Interrupt requested
2018-08-16 14:36:45 UTC [1574-1] www-data@nominatim ERROR:  canceling statement due to statement timeout
2018-08-16 14:36:45 UTC [1574-2] www-data@nominatim CONTEXT:  PL/pgSQL function get_addressdata(bigint,integer) line 103 at FOR over SELECT rows
        PL/pgSQL function get_address_by_language(bigint,integer,text[]) line 12 at FOR over SELECT rows
2018-08-16 14:36:45 UTC [1574-3] www-data@nominatim STATEMENT:  SELECT     osm_type,    osm_id,    class,    type,    admin_level,    rank_search,    rank_address,    min(place_id) AS place_id,    min(parent_place_id) AS parent_place_id,    -1 as housenumber,    country_code,get_address_by_language(place_id,-1,ARRAY['short_name:pt-br','name:pt-br','short_name:pt','name:pt','short_name','name','brand','official_name:pt-br','official_name:pt','official_name','ref','type']) AS langaddress,    get_name_by_language(name,ARRAY['short_name:pt-br','name:pt-br','short_name:pt','name:pt','short_name','name','brand','official_name:pt-br','official_name:pt','official_name','ref','type']) AS placename,    get_name_by_language(name, ARRAY['ref']) AS ref,    avg(ST_X(centroid)) AS lon,     avg(ST_Y(centroid)) AS lat,     COALESCE(importance,0.75-(rank_search::float/40)) AS importance, (SELECT max(ai_p.importance * (ai_p.rank_address + 2))   FROM place_addressline ai_s, placex ai_p   WHERE ai_s.place_id = min(CASE WHEN placex.rank_search < 28 THEN placex.place_id ELSE placex.parent_place_id END)     AND ai_p.place_id = ai_s.address_place_id      AND ai_s.isaddress      AND ai_p.importance is not null) AS addressimportance,    (extratags->'place') AS extra_place  FROM placex WHERE place_id in (89121656,100474154,129955308,110139220,178136713,127603481,127603054,128135891,128135499,118073268,103931881,166244432,164342431,87947148,97083575,114067315,172273665,177880016,72092325,93795339)    AND (        placex.rank_address between 0 and 30     OR (extratags->'place') = 'city'       )     AND linked_place_id is null  GROUP BY      osm_type,      osm_id,      class,      type,      admin_level,      rank_search,      rank_address,      housenumber,     country_code,      importance,      langaddress,      placename,      ref,      extratags->'place'
Interrupt requested
Interrupt requested
2018-08-16 14:37:19 UTC [1576-1] www-data@nominatim ERROR:  canceling statement due to statement timeout
2018-08-16 14:37:19 UTC [1576-2] www-data@nominatim CONTEXT:  PL/pgSQL function get_addressdata(bigint,integer) line 103 at FOR over SELECT rows
        PL/pgSQL function get_address_by_language(bigint,integer,text[]) line 12 at FOR over SELECT rows
2018-08-16 14:37:19 UTC [1576-3] www-data@nominatim STATEMENT:  SELECT     osm_type,    osm_id,    class,    type,    admin_level,    rank_search,    rank_address,    min(place_id) AS place_id,    min(parent_place_id) AS parent_place_id,    -1 as housenumber,    country_code,get_address_by_language(place_id,-1,ARRAY['short_name:pt-br','name:pt-br','short_name:pt','name:pt','short_name','name','brand','official_name:pt-br','official_name:pt','official_name','ref','type']) AS langaddress,    get_name_by_language(name,ARRAY['short_name:pt-br','name:pt-br','short_name:pt','name:pt','short_name','name','brand','official_name:pt-br','official_name:pt','official_name','ref','type']) AS placename,    get_name_by_language(name, ARRAY['ref']) AS ref,    avg(ST_X(centroid)) AS lon,     avg(ST_Y(centroid)) AS lat,     COALESCE(importance,0.75-(rank_search::float/40)) AS importance, (SELECT max(ai_p.importance * (ai_p.rank_address + 2))   FROM place_addressline ai_s, placex ai_p   WHERE ai_s.place_id = min(CASE WHEN placex.rank_search < 28 THEN placex.place_id ELSE placex.parent_place_id END)     AND ai_p.place_id = ai_s.address_place_id      AND ai_s.isaddress      AND ai_p.importance is not null) AS addressimportance,    (extratags->'place') AS extra_place  FROM placex WHERE place_id in (89121656,100474154,129955308,110139220,178136713,127603481,127603054,128135891,128135499,118073268,103931881,166244432,164342431,87947148,97083575,114067315,172273665,177880016,72092325,93795339)    AND (        placex.rank_address between 0 and 30     OR (extratags->'place') = 'city'       )     AND linked_place_id is null  GROUP BY      osm_type,      osm_id,      class,      type,      admin_level,      rank_search,      rank_address,      housenumber,     country_code,      importance,      langaddress,      placename,      ref,      extratags->'place'

Can AMI be messing something up?

lonvia · 2018-08-18T09:26:24Z

I think you have a wrong idea what 'caching' means in this context. It means that Postgresql fetches data from disk into RAM. Once the most frequently used data is there, it can answer many queries from the data that is cached in RAM and thus is much faster than before. When Postgresql is restarted, it forgets about all the data it has in RAM and needs to start fresh. If you create an AMI and spin it up on another machine, that is the same as restarting Postgresql. So yes, it is normal, that here too the first queries are slow.

pedropinheiro75 · 2018-08-19T00:50:34Z

I always worked with Nominatim with an incredible performance, I did not know that the creation of the cache could take so long.
I wanted to use the AMI for fallback in case something happens in my instance within an autoscaling.
I will adapt a process to load the cache.
Thank you.

louispotok · 2019-01-07T06:32:36Z

@gnosis75 I'm trying to do almost exactly the same thing as you - were you able to figure out a good way to warm up the cache after creating an instance from the AMI?

Emil-Zhou · 2019-05-05T09:08:04Z

@gnosis75 +1

mtmail · 2019-05-06T19:12:02Z

@Emil-Zhou utils/warm.php was written for this purpose.

(Comments on year-long closed issues are easy to miss. There's a geocoding mailing list https://wiki.openstreetmap.org/wiki/Mailing_lists for more general question.)

pedropinheiro75 closed this as completed Aug 19, 2018

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Nominatim slow query #1139

Nominatim slow query #1139

pedropinheiro75 commented Aug 14, 2018 •

edited

pedropinheiro75 commented Aug 14, 2018 •

edited

mtmail commented Aug 15, 2018

pedropinheiro75 commented Aug 15, 2018

lonvia commented Aug 15, 2018

mtmail commented Aug 15, 2018

pedropinheiro75 commented Aug 16, 2018

lonvia commented Aug 16, 2018

pedropinheiro75 commented Aug 16, 2018 •

edited

lonvia commented Aug 18, 2018

pedropinheiro75 commented Aug 19, 2018 •

edited

louispotok commented Jan 7, 2019

Emil-Zhou commented May 5, 2019

mtmail commented May 6, 2019

Nominatim slow query #1139

Nominatim slow query #1139

Comments

pedropinheiro75 commented Aug 14, 2018 • edited

pedropinheiro75 commented Aug 14, 2018 • edited

mtmail commented Aug 15, 2018

pedropinheiro75 commented Aug 15, 2018

lonvia commented Aug 15, 2018

mtmail commented Aug 15, 2018

pedropinheiro75 commented Aug 16, 2018

lonvia commented Aug 16, 2018

pedropinheiro75 commented Aug 16, 2018 • edited

lonvia commented Aug 18, 2018

pedropinheiro75 commented Aug 19, 2018 • edited

louispotok commented Jan 7, 2019

Emil-Zhou commented May 5, 2019

mtmail commented May 6, 2019

pedropinheiro75 commented Aug 14, 2018 •

edited

pedropinheiro75 commented Aug 14, 2018 •

edited

pedropinheiro75 commented Aug 16, 2018 •

edited

pedropinheiro75 commented Aug 19, 2018 •

edited