New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Query timing out. #1023

Closed
midhunkrishna opened this Issue Apr 24, 2018 · 7 comments

Comments

Projects
None yet
3 participants
@midhunkrishna

midhunkrishna commented Apr 24, 2018

I had done an import of North America and Srilanka using the following Docker file (a section)

USER nominatim

# Download and merge North America (includes Canada and Mexico) and Srilanka
RUN curl -L 'http://download.geofabrik.de/north-america-latest.osm.pbf' \
    --create-dirs -o /srv/nominatim/src/north-america-latest.osm.pbf
RUN curl -L 'http://download.geofabrik.de/asia/sri-lanka-latest.osm.pbf' \
    --create-dirs -o /srv/nominatim/src/sri-lanka-latest.osm.pbf

RUN osmconvert /srv/nominatim/src/north-america-latest.osm.pbf \
    -o=/srv/nominatim/src/north-america-latest.o5m
RUN osmconvert /srv/nominatim/src/sri-lanka-latest.osm.pbf \
    -o=/srv/nominatim/src/sri-lanka-latest.o5m

RUN osmconvert /srv/nominatim/src/north-america-latest.o5m \
    /srv/nominatim/src/sri-lanka-latest.o5m \
    -o=/srv/nominatim/src/data.o5m

RUN osmconvert /srv/nominatim/src/data.o5m \
    -o=/srv/nominatim/src/data.osm.pbf;


# Filter administrative boundaries
USER nominatim
ARG BUILD_THREADS=16
ARG IMPORT_ADMINISTRATIVE=false
COPY scripts/filter_administrative.sh \
      /srv/nominatim/scripts/filter_administrative.sh
RUN /srv/nominatim/scripts/filter_administrative.sh

# Add postgresql users
USER root
RUN service postgresql start && \
    sudo -u postgres createuser -s nominatim && \
    sudo -u postgres createuser www-data && \
    service postgresql stop

# Tune postgresql configuration for import
USER root
ARG BUILD_MEMORY=32GB
ENV PGCONFIG_URL https://api.pgconfig.org/v1/tuning/get-config
RUN IMPORT_CONFIG_URL="${PGCONFIG_URL}? \
      format=alter_system& \
      pg_version=${PGSQL_VERSION}& \
      total_ram=${BUILD_MEMORY}& \
      max_connections=$((8 * ${BUILD_THREADS} + 32))& \
      environment_name=DW& \
      include_pgbadger=false" && \
    IMPORT_CONFIG_URL=${IMPORT_CONFIG_URL// /} && \
    service postgresql start && \
    ( curl -sSL "${IMPORT_CONFIG_URL}"; \
      echo $'ALTER SYSTEM SET fsync TO \'off\';\n'; \
      echo $'ALTER SYSTEM SET full_page_writes TO \'off\';\n'; \
      echo $'ALTER SYSTEM SET logging_collector TO \'off\';\n'; \
    ) | sudo -u postgres psql -e && \
    service postgresql stop

The import is going in fine, but when i search for New York from the debugging interface I get an error

url => http://address:8080/search.php?q=New+York&polygon_geojson=1&viewbox=

Internal Server Error
Nominatim has encountered an internal error while accessing the database. This may happen because the database is broken or because of a bug in the software. If you think it is a bug, feel free to report it over on Github. Please include the URL that caused the problem and the complete error details below.

Message: Could not get places for search terms.

SQL Error: DB Error: unknown error

Details:

SELECT place_id,0::int as exactmatch FROM search_name WHERE name_vector @> ARRAY[5638674] ORDER BY (CASE WHEN importance = 0 OR importance IS NULL THEN 0.75001-(search_rank::float/40) ELSE importance END) DESC LIMIT 20 [nativecode=ERROR:  canceling statement due to statement timeout]

Is it sql timing out due to index issues? Whats the next best approach to debug this?

I have a total available RAM of 8 GB for running and 8 cores, but while the query is running, the ram usage is never above 600MB

@mtmail

This comment has been minimized.

Show comment
Hide comment
@mtmail

mtmail Apr 24, 2018

Collaborator

Can you connect to the database manually using psql?

Does http://address:8080/search.php report a different error? (It's somewhat readable what queries search.php is running https://github.com/openstreetmap/Nominatim/blob/master/website/status.php)

Do you see any errors in the postgresql log file?

Collaborator

mtmail commented Apr 24, 2018

Can you connect to the database manually using psql?

Does http://address:8080/search.php report a different error? (It's somewhat readable what queries search.php is running https://github.com/openstreetmap/Nominatim/blob/master/website/status.php)

Do you see any errors in the postgresql log file?

@midhunkrishna

This comment has been minimized.

Show comment
Hide comment
@midhunkrishna

midhunkrishna Apr 24, 2018

@mtmail I can connect to psql.
The search.php did initially give me a Could not get word tokens which went away with the fix mentioned here

After that, the above pasted is what I am getting. I checked the apache logs; there were no errors there.
I am doing the whole thing in a docker machine, and I thought this might be due to the less Postgres runtime memory I have set in the tuning step. So I increased the whole memory and had scheduled for a rebuilt.

Once it is up, will add results returned in http://address:8080/search.php?q=New+York&polygon_geojson=1&viewbox=&debug=1

midhunkrishna commented Apr 24, 2018

@mtmail I can connect to psql.
The search.php did initially give me a Could not get word tokens which went away with the fix mentioned here

After that, the above pasted is what I am getting. I checked the apache logs; there were no errors there.
I am doing the whole thing in a docker machine, and I thought this might be due to the less Postgres runtime memory I have set in the tuning step. So I increased the whole memory and had scheduled for a rebuilt.

Once it is up, will add results returned in http://address:8080/search.php?q=New+York&polygon_geojson=1&viewbox=&debug=1

@lonvia

This comment has been minimized.

Show comment
Hide comment
@lonvia

lonvia Apr 24, 2018

Collaborator

Please make sure that there was/is no error during import and check that the indices on the search_name table have been created. There should be 4 indexes there.

Collaborator

lonvia commented Apr 24, 2018

Please make sure that there was/is no error during import and check that the indices on the search_name table have been created. There should be 4 indexes there.

@midhunkrishna

This comment has been minimized.

Show comment
Hide comment
@midhunkrishna

midhunkrishna Apr 25, 2018

There are no errors during the initial import. The entire list of indexes. -> https://gist.github.com/midhunkrishna/6d32a34a04ac62593a131dc356cd8635

@lonvia There seem to be only three indexes on search_name

 public | idx_search_name_0_centroid                            | index | nomina
tim | search_name_0
 public | idx_search_name_0_name_vector                         | index | nomina
tim | search_name_0
 public | idx_search_name_0_place_id                            | index | nomina
tim | search_name_0

result of what i get when i run http://address:8080/search.php?q=New+York&polygon_geojson=1&viewbox=&debug=true
-> https://gist.github.com/midhunkrishna/4c399e9e6a11c65974b8c81d8abb5a0a

cc: @mtmail

midhunkrishna commented Apr 25, 2018

There are no errors during the initial import. The entire list of indexes. -> https://gist.github.com/midhunkrishna/6d32a34a04ac62593a131dc356cd8635

@lonvia There seem to be only three indexes on search_name

 public | idx_search_name_0_centroid                            | index | nomina
tim | search_name_0
 public | idx_search_name_0_name_vector                         | index | nomina
tim | search_name_0
 public | idx_search_name_0_place_id                            | index | nomina
tim | search_name_0

result of what i get when i run http://address:8080/search.php?q=New+York&polygon_geojson=1&viewbox=&debug=true
-> https://gist.github.com/midhunkrishna/4c399e9e6a11c65974b8c81d8abb5a0a

cc: @mtmail

@lonvia

This comment has been minimized.

Show comment
Hide comment
@lonvia

lonvia Apr 25, 2018

Collaborator

These are the indexes for search_name_0 which is a different table but all the indexes for search_name seem to be there. What does the following say

EXPLAIN SELECT place_id,0::int as exactmatch FROM search_name 
WHERE name_vector @> ARRAY[5639492] ORDER BY
(CASE WHEN importance = 0 OR importance IS NULL THEN 0.75001-(search_rank::float/40) ELSE importance END) DESC LIMIT 20

It may run for a long time but also try

EXPLAIN ANALYSE SELECT place_id,0::int as exactmatch FROM search_name 
WHERE name_vector @> ARRAY[5639492] ORDER BY
(CASE WHEN importance = 0 OR importance IS NULL THEN 0.75001-(search_rank::float/40) ELSE importance END) DESC LIMIT 20
Collaborator

lonvia commented Apr 25, 2018

These are the indexes for search_name_0 which is a different table but all the indexes for search_name seem to be there. What does the following say

EXPLAIN SELECT place_id,0::int as exactmatch FROM search_name 
WHERE name_vector @> ARRAY[5639492] ORDER BY
(CASE WHEN importance = 0 OR importance IS NULL THEN 0.75001-(search_rank::float/40) ELSE importance END) DESC LIMIT 20

It may run for a long time but also try

EXPLAIN ANALYSE SELECT place_id,0::int as exactmatch FROM search_name 
WHERE name_vector @> ARRAY[5639492] ORDER BY
(CASE WHEN importance = 0 OR importance IS NULL THEN 0.75001-(search_rank::float/40) ELSE importance END) DESC LIMIT 20
@midhunkrishna

This comment has been minimized.

Show comment
Hide comment
@midhunkrishna

midhunkrishna Apr 26, 2018

@lonvia Thank you so much for assisting me with this (so far)
This is the explain

                                                                                
                      QUERY PLAN                                                
                                                       
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
-------------------------------------------------------
 Limit  (cost=8558.43..8558.48 rows=20 width=20)
   ->  Sort  (cost=8558.43..8563.87 rows=2175 width=20)
         Sort Key: (CASE WHEN ((importance = '0'::double precision) OR (importan
ce IS NULL)) THEN ('0.75001'::double precision - ((search_rank)::double precisio
n / '40'::double precision)) ELSE importance END) DESC
         ->  Bitmap Heap Scan on search_name  (cost=44.86..8500.55 rows=2175 wid
th=20)
               Recheck Cond: (name_vector @> '{5639492}'::integer[])
               ->  Bitmap Index Scan on idx_search_name_name_vector  (cost=0.00.
.44.32 rows=2175 width=0)
                     Index Cond: (name_vector @> '{5639492}'::integer[])
(7 rows)


and when I tried Explain Analyze the query completed very quickly. I am thinking this must be due to the previous explain and the server must have cached it.

                      QUERY PLAN                                                
                                                       
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
-------------------------------------------------------
 Limit  (cost=8558.43..8558.48 rows=20 width=20) (actual time=0.280..0.397 rows=
20 loops=1)
   ->  Sort  (cost=8558.43..8563.87 rows=2175 width=20) (actual time=0.275..0.31
4 rows=20 loops=1)
         Sort Key: (CASE WHEN ((importance = '0'::double precision) OR (importan
ce IS NULL)) THEN ('0.75001'::double precision - ((search_rank)::double precisio
n / '40'::double precision)) ELSE importance END) DESC
         Sort Method: quicksort  Memory: 26kB
         ->  Bitmap Heap Scan on search_name  (cost=44.86..8500.55 rows=2175 wid
th=20) (actual time=0.058..0.167 rows=24 loops=1)
               Recheck Cond: (name_vector @> '{5639492}'::integer[])
               Heap Blocks: exact=24
               ->  Bitmap Index Scan on idx_search_name_name_vector  (cost=0.00.
.44.32 rows=2175 width=0) (actual time=0.039..0.039 rows=24 loops=1)
                     Index Cond: (name_vector @> '{5639492}'::integer[])
 Planning time: 0.220 ms
 Execution time: 0.495 ms
(11 rows)

Restarted the server and when on explain analyze, I get

                                                                                
                      QUERY PLAN                                                
                                                       
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
-------------------------------------------------------
 Limit  (cost=8558.43..8558.48 rows=20 width=20) (actual time=0.510..0.625 rows=
20 loops=1)
   ->  Sort  (cost=8558.43..8563.87 rows=2175 width=20) (actual time=0.505..0.54
3 rows=20 loops=1)
         Sort Key: (CASE WHEN ((importance = '0'::double precision) OR (importan
ce IS NULL)) THEN ('0.75001'::double precision - ((search_rank)::double precisio
n / '40'::double precision)) ELSE importance END) DESC
         Sort Method: quicksort  Memory: 26kB
         ->  Bitmap Heap Scan on search_name  (cost=44.86..8500.55 rows=2175 wid
th=20) (actual time=0.085..0.426 rows=24 loops=1)
               Recheck Cond: (name_vector @> '{5639492}'::integer[])
               Heap Blocks: exact=24
               ->  Bitmap Index Scan on idx_search_name_name_vector  (cost=0.00.
.44.32 rows=2175 width=0) (actual time=0.061..0.061 rows=24 loops=1)
                     Index Cond: (name_vector @> '{5639492}'::integer[])
 Planning time: 230658.037 ms
 Execution time: 0.716 ms
(11 rows)

midhunkrishna commented Apr 26, 2018

@lonvia Thank you so much for assisting me with this (so far)
This is the explain

                                                                                
                      QUERY PLAN                                                
                                                       
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
-------------------------------------------------------
 Limit  (cost=8558.43..8558.48 rows=20 width=20)
   ->  Sort  (cost=8558.43..8563.87 rows=2175 width=20)
         Sort Key: (CASE WHEN ((importance = '0'::double precision) OR (importan
ce IS NULL)) THEN ('0.75001'::double precision - ((search_rank)::double precisio
n / '40'::double precision)) ELSE importance END) DESC
         ->  Bitmap Heap Scan on search_name  (cost=44.86..8500.55 rows=2175 wid
th=20)
               Recheck Cond: (name_vector @> '{5639492}'::integer[])
               ->  Bitmap Index Scan on idx_search_name_name_vector  (cost=0.00.
.44.32 rows=2175 width=0)
                     Index Cond: (name_vector @> '{5639492}'::integer[])
(7 rows)


and when I tried Explain Analyze the query completed very quickly. I am thinking this must be due to the previous explain and the server must have cached it.

                      QUERY PLAN                                                
                                                       
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
-------------------------------------------------------
 Limit  (cost=8558.43..8558.48 rows=20 width=20) (actual time=0.280..0.397 rows=
20 loops=1)
   ->  Sort  (cost=8558.43..8563.87 rows=2175 width=20) (actual time=0.275..0.31
4 rows=20 loops=1)
         Sort Key: (CASE WHEN ((importance = '0'::double precision) OR (importan
ce IS NULL)) THEN ('0.75001'::double precision - ((search_rank)::double precisio
n / '40'::double precision)) ELSE importance END) DESC
         Sort Method: quicksort  Memory: 26kB
         ->  Bitmap Heap Scan on search_name  (cost=44.86..8500.55 rows=2175 wid
th=20) (actual time=0.058..0.167 rows=24 loops=1)
               Recheck Cond: (name_vector @> '{5639492}'::integer[])
               Heap Blocks: exact=24
               ->  Bitmap Index Scan on idx_search_name_name_vector  (cost=0.00.
.44.32 rows=2175 width=0) (actual time=0.039..0.039 rows=24 loops=1)
                     Index Cond: (name_vector @> '{5639492}'::integer[])
 Planning time: 0.220 ms
 Execution time: 0.495 ms
(11 rows)

Restarted the server and when on explain analyze, I get

                                                                                
                      QUERY PLAN                                                
                                                       
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
-------------------------------------------------------
 Limit  (cost=8558.43..8558.48 rows=20 width=20) (actual time=0.510..0.625 rows=
20 loops=1)
   ->  Sort  (cost=8558.43..8563.87 rows=2175 width=20) (actual time=0.505..0.54
3 rows=20 loops=1)
         Sort Key: (CASE WHEN ((importance = '0'::double precision) OR (importan
ce IS NULL)) THEN ('0.75001'::double precision - ((search_rank)::double precisio
n / '40'::double precision)) ELSE importance END) DESC
         Sort Method: quicksort  Memory: 26kB
         ->  Bitmap Heap Scan on search_name  (cost=44.86..8500.55 rows=2175 wid
th=20) (actual time=0.085..0.426 rows=24 loops=1)
               Recheck Cond: (name_vector @> '{5639492}'::integer[])
               Heap Blocks: exact=24
               ->  Bitmap Index Scan on idx_search_name_name_vector  (cost=0.00.
.44.32 rows=2175 width=0) (actual time=0.061..0.061 rows=24 loops=1)
                     Index Cond: (name_vector @> '{5639492}'::integer[])
 Planning time: 230658.037 ms
 Execution time: 0.716 ms
(11 rows)
@midhunkrishna

This comment has been minimized.

Show comment
Hide comment
@midhunkrishna

midhunkrishna Apr 27, 2018

@lonvia The queries are working correctly now. When I thought about the last explain analyze posted, it seemed the planning was talking all the time. I ran

psql -d nominatim -c 'ANALYZE VERBOSE'

and everything is super fast now.

Thank you for all your help. Appreciate it. 👍

midhunkrishna commented Apr 27, 2018

@lonvia The queries are working correctly now. When I thought about the last explain analyze posted, it seemed the planning was talking all the time. I ran

psql -d nominatim -c 'ANALYZE VERBOSE'

and everything is super fast now.

Thank you for all your help. Appreciate it. 👍

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment