
Who's on First importer reports "index does not exist" due to improper timeout configuration in pelias.json #535

Open
creativesapiens opened this issue Oct 23, 2022 · 4 comments

@creativesapiens

Describe the bug

The Pelias Who's on First importer reports the following error when run with npm run start and an improper timeout setting in pelias.json:

ERROR: Elasticsearch index pelias does not exist
You must use the pelias-schema tool (https://github.com/pelias/schema/) to create the index first
For full instructions on setting up Pelias, see http://pelias.io/install.html
/home/user/pelias/whosonfirst/node_modules/pelias-dbclient/src/configValidation.js:39
        throw new Error(`elasticsearch index ${config.schema.indexName} does not exist`);

Error: elasticsearch index pelias does not exist
    at existsCallback (/home/user/Softwares/whosonfirst/node_modules/pelias-dbclient/src/configValidation.js:39:15)
    at respond (/home/user/Softwares/whosonfirst/node_modules/elasticsearch/src/lib/transport.js:368:9)
    at /home/user/Softwares/whosonfirst/node_modules/elasticsearch/src/lib/transport.js:396:7
    at Timeout.<anonymous> (/home/user/Softwares/whosonfirst/node_modules/elasticsearch/src/lib/transport.js:429:7)
    at listOnTimeout (node:internal/timers:559:17)
    at processTimers (node:internal/timers:502:7)

However, the index clearly does exist:

$ curl http://localhost:9200/_cat/indices/*?v=true
health status index            uuid                   pri rep docs.count docs.deleted store.size pri.store.size
green  open   .geoip_databases Cfim9lIdRZO1D6X2UqcHqQ   1   0         41            0       39mb           39mb
green  open   pelias           iQYgJrn9QWySZgZ-tC80NA   1   0          0            0       226b           226b

The esclient configuration was:

"esclient": {
    "apiVersion": "7.x",
    "keepAlive": true,
    "requestTimeout": 12000,
    "hosts": [{
      "env": "development",
      "protocol": "http",
      "host": "localhost",
      "port": 9200
    }],
}
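Incidentally, the snippet above (as pasted) contains a trailing comma after the `hosts` array, which strict JSON parsers reject. A minimal Node sketch of that failure mode (the fragment string here is a hypothetical excerpt, not the full config; whether pelias itself rejects or silently ignores such a file is not confirmed here):

```javascript
// Strict JSON parsers reject trailing commas, so a pelias.json
// containing one may never be applied as intended.
const fragment = '{ "esclient": { "requestTimeout": 12000, } }';
try {
  JSON.parse(fragment);
  console.log("valid JSON");
} catch (e) {
  console.log("invalid JSON: " + e.message);
}
```

Running pelias.json through `JSON.parse` (or any strict validator) is a quick way to rule out syntax problems before debugging importer behaviour.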

Steps to Reproduce

  1. Install Elasticsearch with its required dependencies
  2. Load the Elasticsearch schema
  3. Use the configuration shown above in pelias.json
  4. Download WOF data into the data directory
  5. Start the WOF import with npm run start

Expected behavior

An error message should indicate that this was a timeout issue, or perhaps a problem with the JSON file.

Environment (please complete the following information):

  • OS: Linux / Ubuntu 22.04
  • Installation Mode: Install pelias from scratch

Pastebin/Screenshots

Additional context

The complete command output with stack trace:


> pelias-whosonfirst@0.0.0-development start
> ./bin/start

2022-10-23T20:44:18.540Z - debug: [whosonfirst] Loading 'ocean' of whosonfirst-data-admin-latest.db database from /home/user/Downloads/whosonfirst/sqlite
2022-10-23T20:44:21.299Z - debug: [whosonfirst] Loading 'marinearea' of whosonfirst-data-admin-latest.db database from /home/user/Downloads/whosonfirst/sqlite
2022-10-23T20:44:24.400Z - debug: [whosonfirst] Loading 'continent' of whosonfirst-data-admin-latest.db database from /home/user/Downloads/whosonfirst/sqlite
2022-10-23T20:44:27.180Z - debug: [whosonfirst] Loading 'empire' of whosonfirst-data-admin-latest.db database from /home/user/Downloads/whosonfirst/sqlite
2022-10-23T20:44:29.851Z - debug: [whosonfirst] Loading 'country' of whosonfirst-data-admin-latest.db database from /home/user/Downloads/whosonfirst/sqlite
2022-10-23T20:44:34.716Z - debug: [whosonfirst] Loading 'dependency' of whosonfirst-data-admin-latest.db database from /home/user/Downloads/whosonfirst/sqlite
2022-10-23T20:44:37.350Z - debug: [whosonfirst] Loading 'disputed' of whosonfirst-data-admin-latest.db database from /home/user/Downloads/whosonfirst/sqlite
2022-10-23T20:44:40.029Z - debug: [whosonfirst] Loading 'macroregion' of whosonfirst-data-admin-latest.db database from /home/user/Downloads/whosonfirst/sqlite
2022-10-23T20:44:43.366Z - debug: [whosonfirst] Loading 'region' of whosonfirst-data-admin-latest.db database from /home/user/Downloads/whosonfirst/sqlite
ERROR: Elasticsearch index pelias does not exist
You must use the pelias-schema tool (https://github.com/pelias/schema/) to create the index first
For full instructions on setting up Pelias, see http://pelias.io/install.html
/home/user/Softwares/whosonfirst/node_modules/pelias-dbclient/src/configValidation.js:39
        throw new Error(`elasticsearch index ${config.schema.indexName} does not exist`);
        ^

Error: elasticsearch index pelias does not exist
    at existsCallback (/home/user/Softwares/whosonfirst/node_modules/pelias-dbclient/src/configValidation.js:39:15)
    at respond (/home/user/Softwares/whosonfirst/node_modules/elasticsearch/src/lib/transport.js:368:9)
    at /home/user/Softwares/whosonfirst/node_modules/elasticsearch/src/lib/transport.js:396:7
    at Timeout.<anonymous> (/home/user/Softwares/whosonfirst/node_modules/elasticsearch/src/lib/transport.js:429:7)
    at listOnTimeout (node:internal/timers:559:17)
    at processTimers (node:internal/timers:502:7)

References

What fixed it?

Correcting the timeout value in pelias.json fixed it:

Note: requestTimeout was changed to a string with a value of "120000".

"esclient": {
    "apiVersion": "7.x",
    "keepAlive": true,
    "requestTimeout": "120000",
    "hosts": [{
      "env": "development",
      "protocol": "http",
      "host": "localhost",
      "port": 9200
    }],
}
@orangejulius (Member)

Hi @creativesapiens,
thanks for the comprehensive bug report. We've been tracking this issue for a while, with reports in pelias/docker#217 among other places. It appears there's something a bit different about the Who's on First importer that causes it to hit this issue when other importers don't. However, none of the Pelias team has ever been able to reproduce it, so maybe you can help us track it down.

We've also seen invalid requestTimeout values interpreted as 0ms, leading to timeout errors, though that was a long time ago and with clearly invalid values like 120_000. However, I tested both "120000" and 120000 as timeout values and they both worked fine for me.
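For illustration (this is a sketch of the failure mode, not the actual client code): loose numeric parsing can silently turn an invalid timeout value into something very different from what was intended:

```javascript
// How a value like "120_000" (invalid as a JSON number) could be
// mangled depending on how it is parsed. Sketch only, not pelias code.
console.log(Number("120000"));        // 120000, fine
console.log(Number("120_000"));       // NaN, which might later be coerced to 0ms
console.log(parseInt("120_000", 10)); // 120, silently truncated at the underscore
```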

Can you answer a couple questions for me?

  • What motivated you to add the requestTimeout value to your pelias.json config in the first place? Was there an example config you found somewhere? If so, we'd really like to update it so we can correct it.
  • Can you confirm that changing the requestTimeout value from an integer to a string fixed it? What happens if you change it back or remove that line altogether?
  • What happens if you have no pelias.json? The default value is indeed the string "120000", so I would expect your config to have no effect.
  • Can you share more details of your exact setup? The output of node --version and npm ls would be helpful.

Thanks!

gmarti commented Oct 4, 2023

I think the issue is here: https://github.com/pelias/dbclient/blob/master/src/configValidation.js#L34
If there is an error, it logs that the index doesn't exist, and the underlying error is silenced and never printed.
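A hypothetical sketch of that failure mode (function shape assumed for illustration, not taken verbatim from the dbclient source): if the exists-check callback treats any transport error the same as "index not found", a request timeout gets misreported as a missing index:

```javascript
// Hypothetical exists-check: any transport error (including a
// timeout) takes the same branch as "index not found", so the
// real cause is never surfaced.
function existsCallback(error, exists) {
  if (error || !exists) {
    // `error` is discarded here; a timeout and a genuinely
    // missing index produce the same message.
    throw new Error("elasticsearch index pelias does not exist");
  }
}

// A timeout error triggers the "does not exist" message even
// though the index actually exists:
try {
  existsCallback(new Error("Request Timeout after 0ms"), true);
} catch (e) {
  console.log(e.message);
}
```

Logging the original `error` before (or instead of) the generic message would make the timeout visible.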

Kilowhisky commented Aug 14, 2024

I'm also encountering this problem, and I'm using the default config timeout.

	"esclient": {
		"apiVersion": "7.x",
		"keepAlive": true,
		"requestTimeout": "120000",

Here's my full config:

{
	"esclient": {
		"apiVersion": "7.x",
		"keepAlive": true,
		"requestTimeout": "120000",
		"hosts": [
			{
				"env": "development",
				"protocol": "https",
				"host": "AWS.us-west-2.es.amazonaws.com",
				"port": 443,
				"auth": "negatron"
			}
		],
		"log": [
			{
				"type": "stdio",
				"json": false,
				"level": [
					"error",
					"warning"
				]
			}
		]
	},
	"elasticsearch": {
		"settings": {
			"index": {
				"number_of_replicas": "0",
				"number_of_shards": "5",
				"refresh_interval": "1m"
			}
		}
	},
	"interpolation": {
		"client": {
			"adapter": "null"
		}
	},
	"dbclient": {
		"statFrequency": 10000,
		"batchSize": 500
	},
	"api": {
		"accessLog": "common",
		"host": "http://pelias",
		"indexName": "pelias",
		"version": "1.0",
		"targets": {
			"auto_discover": true,
			"canonical_sources": [
				"whosonfirst",
				"openstreetmap",
				"openaddresses",
				"geonames"
			],
			"layers_by_source": {
				"openstreetmap": [
					"address",
					"venue",
					"street"
				],
				"openaddresses": [
					"address"
				],
				"geonames": [
					"country",
					"macroregion",
					"region",
					"county",
					"localadmin",
					"locality",
					"borough",
					"neighbourhood",
					"venue"
				],
				"whosonfirst": [
					"continent",
					"empire",
					"country",
					"dependency",
					"macroregion",
					"region",
					"locality",
					"localadmin",
					"macrocounty",
					"county",
					"macrohood",
					"borough",
					"neighbourhood",
					"microhood",
					"disputed",
					"venue",
					"postalcode",
					"ocean",
					"marinearea"
				]
			},
			"source_aliases": {
				"osm": [
					"openstreetmap"
				],
				"oa": [
					"openaddresses"
				],
				"gn": [
					"geonames"
				],
				"wof": [
					"whosonfirst"
				]
			},
			"layer_aliases": {
				"coarse": [
					"continent",
					"empire",
					"country",
					"dependency",
					"macroregion",
					"region",
					"locality",
					"localadmin",
					"macrocounty",
					"county",
					"macrohood",
					"borough",
					"neighbourhood",
					"microhood",
					"disputed",
					"postalcode",
					"ocean",
					"marinearea"
				]
			}
		},
		"port": 3100,
		"attributionURL": "nope",
		"services": {
			"pip": {
				"url": "http://localhost:3102"
			},
			"libpostal": {
				"url": "http://localhost:4400"
			},
			"placeholder": {
				"url": "http://localhost:3000"
			}
		}
	},
	"schema": {
		"indexName": "pelias"
	},
	"logger": {
		"level": "debug",
		"timestamp": true,
		"colorize": true
	},
	"acceptance-tests": {
		"endpoints": {
			"local": "http://localhost:3100/v1/"
		}
	},
	"imports": {
		"adminLookup": {
			"enabled": true,
			"maxConcurrentRequests": 100,
			"usePostalCities": true
		},
		"blacklist": {
			"files": []
		},
		"csv": {},
		"geonames": {
			"datapath": "/data/pelias/geonames",
			"countryCode": "US"
		},
		"openstreetmap": {
			"datapath": "/data/pelias/openstreetmap",
			"leveldbpath": "/tmp",
			"import": [
				{
					"filename": "extract.osm.pbf"
				}
			]
		},
		"openaddresses": {
			"datapath": "/mnt/pelias/openaddresses",
			"token": "oa.bbbcf5787bb4251445883cc417f811ba02b9fd64809fd56c5a972171fbcfb2f6",
			"files": []
		},
		"polyline": {
			"datapath": "/data/pelias/polyline",
			"files": [
				"north-america-valhalla.polylines.0sv"
			]
		},
		"whosonfirst": {
			"datapath": "/data/pelias/whosonfirst",
			"importPostalcodes": true,
			"countryCode": "US"
		}
	}
}

@michaelkirk (Contributor)
If the root cause is a timeout (hard to know with the current logging, until pelias/dbclient#129 is rolled out to the various client libraries), you can increase the timeout.

From pelias/docker#217 (comment)

After having the planet sized import fail a couple dozen times with the default 2 minute timeout, I specified a timeout of 10 minutes and was able to complete the import on the first try.

pelias config:

{
  "esclient": {
    "requestTimeout": "600000",
    ...
  },
  ...
}
