Not able to run db migration to bootstrap the database #187

bqbn · 2018-08-03T21:29:27Z

Background

Resources for the -stage environment (database, Elasticsearch, application stack, etc.) are set up, and the application is able to start.

We are trying to run the migrate command for the first time to bootstrap the database in this environment.

Issue

Here is the command we ran:

$ docker run --rm --net host --env-file /etc/dockerflow/buildhub2.txt --log-driver=syslog hub.prod.mozaws.net/pipelines/buildhub2/buildhub2:latest migrate

And here is the error:

Not waiting for any services
{"Timestamp": 1533331050280736768, "Type": "markus.backends.datadog", "Logger": "buildhub", "Hostname": "ip-172-31-41-251.ec2.internal", "EnvVersion": "2.0", "Severity": 6, "Pid": 6, "Fields": {"msg": "DatadogMetrics configured: localhost:8125 "}}
Traceback (most recent call last):
  File "manage.py", line 9, in <module>
    execute_from_command_line(sys.argv)
  File "/usr/local/lib/python3.6/site-packages/django/core/management/__init__.py", line 371, in execute_from_command_line
    utility.execute()
  File "/usr/local/lib/python3.6/site-packages/django/core/management/__init__.py", line 365, in execute
    self.fetch_command(subcommand).run_from_argv(self.argv)
  File "/usr/local/lib/python3.6/site-packages/django/core/management/base.py", line 288, in run_from_argv
    self.execute(*args, **cmd_options)
  File "/usr/local/lib/python3.6/site-packages/django/core/management/base.py", line 332, in execute
    self.check()
  File "/usr/local/lib/python3.6/site-packages/django/core/management/base.py", line 364, in check
    include_deployment_checks=include_deployment_checks,
  File "/usr/local/lib/python3.6/site-packages/django/core/management/commands/migrate.py", line 58, in _run_checks
    issues.extend(super()._run_checks(**kwargs))
  File "/usr/local/lib/python3.6/site-packages/django/core/management/base.py", line 351, in _run_checks
    return checks.run_checks(**kwargs)
  File "/usr/local/lib/python3.6/site-packages/django/core/checks/registry.py", line 73, in run_checks
    new_errors = check(app_configs=app_configs)
  File "/app/buildhub/dockerflow_extra.py", line 41, in check_elasticsearch
    health = fetch(url)["status"]
  File "/usr/local/lib/python3.6/site-packages/backoff/_sync.py", line 99, in retry
    ret = target(*args, **kwargs)
  File "/app/buildhub/dockerflow_extra.py", line 33, in fetch
    response.raise_for_status()
  File "/usr/local/lib/python3.6/site-packages/requests/models.py", line 939, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 408 Client Error: Request Timeout for url: http://internal-buildhub2-es-s-elb-1qf0a53abi35c-1034878185.us-east-1.elb.amazonaws.com:9200/_cluster/health/buildhub2

So it looks like there is a chicken-and-egg problem: the command checks the health of a non-existent index before running the database migration.

Is it possible to document the steps for bootstrapping a brand new environment? Ideally, we could follow those steps exactly to set this up.

A less ideal solution would be to let us know how to work around it manually. But that means we'll have to play the same "manual tricks" when we deploy the -prod environment.

bqbn · 2018-08-03T21:36:27Z

Also, just a note that we have verified that the connection between the app instances and the Elasticsearch cluster is good. For example, on one of the app instances, we can run:

$ curl http://internal-buildhub2-es-s-elb-1qf0a53abi35c-1034878185.us-east-1.elb.amazonaws.com:9200/_cluster/health?pretty
{
  "cluster_name" : "buildhub2.elasticsearch.stage.batch1.3",
  "status" : "green",
  "timed_out" : false,
  "number_of_nodes" : 3,
  "number_of_data_nodes" : 3,
  "active_primary_shards" : 0,
  "active_shards" : 0,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 0,
  "delayed_unassigned_shards" : 0,
  "number_of_pending_tasks" : 0,
  "number_of_in_flight_fetch" : 0,
  "task_max_waiting_in_queue_millis" : 0,
  "active_shards_percent_as_number" : 100.0
}

peterbe · 2018-08-08T16:19:32Z

The simplest solution is to change the dockerflow check to not expect the index to be there but to check that Elasticsearch is up and healthy. That was the original design. It was only later that I added the check to see if the index is there. When I did that, on my local dev environment, I already had the index created.

By the way, the migrate command is about the PostgreSQL tables. I'll have to refresh my memory, but I think the Elasticsearch index gets created on the fly.
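
A minimal sketch of what that relaxed check could look like, assuming it only needs the cluster-level `/_cluster/health` endpoint (the one the curl call above succeeded against) rather than the index-level one. The names `fetch_json` and `check_elasticsearch`, the injectable `fetch` parameter, and the choice to treat "yellow" as acceptable are all assumptions for illustration; this is not the code from buildhub2's `dockerflow_extra.py`.

```python
import json
from urllib.request import urlopen


def fetch_json(url, timeout=5):
    """Hypothetical helper: GET a URL and decode its JSON body."""
    with urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def check_elasticsearch(base_url, fetch=fetch_json):
    """Return a list of error strings; an empty list means healthy.

    Only cluster-level health is queried, so an index that has not
    been created yet cannot make the check fail.
    """
    url = base_url.rstrip("/") + "/_cluster/health"
    try:
        status = fetch(url)["status"]
    except Exception as exc:  # network errors, HTTP errors, malformed JSON
        return ["Unable to check Elasticsearch health at {}: {}".format(url, exc)]
    # Assumption: "yellow" (unassigned replicas) is still good enough to start.
    if status not in ("green", "yellow"):
        return ["Elasticsearch cluster status is {!r}".format(status)]
    return []
```

In a Django project, the returned strings would presumably be wrapped in `django.core.checks.Error` objects and the function registered as a system check; that wiring is left out here so the sketch stays runnable on its own.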

peterbe added a commit to peterbe/buildhub2 that referenced this issue Aug 8, 2018

chicken-and-egg es index health

597889a

Fixes mozilla-releng#187

peterbe mentioned this issue Aug 8, 2018

chicken-and-egg es index health #194

Merged

bors bot added a commit that referenced this issue Aug 8, 2018

Merge #194

a4404d1

194: chicken-and-egg es index health r=peterbe a=peterbe Fixes #187 Co-authored-by: Peter Bengtsson <mail@peterbe.com>

bors bot closed this as completed in #194 Aug 8, 2018

bqbn mentioned this issue Aug 9, 2018

Not able to run db migration to bootstrap the database #203

Closed
