Context timing out on migrations (omnibus) #318

Closed
invisiblethreat opened this issue Jun 29, 2022 · 3 comments

Labels
bug Something isn't working

Comments

@invisiblethreat

Describe the bug
When attempting to migrate to an InfluxDB-backed version of Scrutiny, the migration fails with Client.Timeout exceeded while awaiting headers, which causes a panic. It's not entirely clear to me whether this is a processing timeout, or a failure to connect to InfluxDB that looks like a timeout rather than an auth failure. The issue resolved once I added the token from InfluxDB into the config file.
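For anyone hitting the same thing, this is roughly the config change that resolved it for me. A minimal sketch of the relevant scrutiny.yaml section, assuming the web.influxdb keys from the project's example config (key names and values here are illustrative, not authoritative, so verify against example.scrutiny.yaml for your version):

web:
  influxdb:
    host: localhost
    port: 8086
    org: scrutiny
    bucket: metrics
    # token generated during the InfluxDB first-time setup; hypothetical
    # placeholder value, replace with the token from your own instance
    token: 'my-influxdb-token'

Adjust host, port, org, and bucket to match your deployment.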

Expected behavior
Migration completion.

Log Files

2022/06/29 11:02:54 Loading configuration file: /opt/scrutiny/config/scrutiny.yaml

 ___   ___  ____  __  __  ____  ____  _  _  _  _
/ __) / __)(  _ \(  )(  )(_  _)(_  _)( \( )( \/ )
\__ \( (__  )   / )(__)(   )(   _)(_  )  (  \  /
(___/ \___)(_)\_)(______) (__) (____)(_)\_) (__)
github.com/AnalogJ/scrutiny                            dev-0.4.14

Start the scrutiny server
time="2022-06-29T11:02:54-03:00" level=info msg="Trying to connect to scrutiny sqlite db: /opt/scrutiny/config/scrutiny.db\n"
[GIN-debug] [WARNING] Running in "debug" mode. Switch to "release" mode in production.
 - using env:   export GIN_MODE=release
 - using code:  gin.SetMode(gin.ReleaseMode)

time="2022-06-29T11:02:54-03:00" level=info msg="Successfully connected to scrutiny sqlite db: /opt/scrutiny/config/scrutiny.db\n"
time="2022-06-29T11:02:54-03:00" level=debug msg="InfluxDB url: http://localhost:8086"
time="2022-06-29T11:02:54-03:00" level=debug msg="Determine Influxdb setup status..."
time="2022-06-29T11:02:54-03:00" level=debug msg="Influxdb un-initialized, running first-time setup..."
time="2022-06-29T11:02:57-03:00" level=info msg="Database migration starting. Please wait, this process may take a long time...."

2022/06/29 11:02:57 /go/src/github.com/analogj/scrutiny/webapp/backend/pkg/database/scrutiny_repository_migrations.go:84
[185.913ms] [rows:377] SELECT * FROM `smarts` WHERE `smarts`.`device_wwn` IN ("0x5001517bb2705e7a","0x5000c50035ee2e48","0x50014ee20cadedee","0x5000c500e389d55a") AND `smarts`.`deleted_at` IS NULL ORDER BY smarts.created_at ASC

2022/06/29 11:02:57 /go/src/github.com/analogj/scrutiny/webapp/backend/pkg/database/scrutiny_repository_migrations.go:84
[187.211ms] [rows:4] SELECT * FROM `devices`
time="2022-06-29T11:02:57-03:00" level=debug msg="===================================="
time="2022-06-29T11:02:57-03:00" level=info msg="begin processing device: 0x5001517bb2705e7a"

2022/06/29 11:02:58 /go/src/github.com/analogj/scrutiny/webapp/backend/pkg/database/scrutiny_repository_migrations.go:322
[315.922ms] [rows:21] SELECT * FROM `smart_ata_attributes` WHERE `smart_ata_attributes`.`smart_id` = 1 AND `smart_ata_attributes`.`deleted_at` IS NULL

2022/06/29 11:02:58 /go/src/github.com/analogj/scrutiny/webapp/backend/pkg/database/scrutiny_repository_migrations.go:322
[316.837ms] [rows:1] SELECT * FROM `smarts` WHERE `smarts`.`deleted_at` IS NULL AND `smarts`.`id` = 1
time="2022-06-29T11:02:58-03:00" level=debug msg="device (0x5001517bb2705e7a) smart data added to bucket: monthly"
ts=2022-06-29T14:02:59.106500Z lvl=info msg="index opened with 8 partitions" log_id=0bNx55U0000 service=storage-engine index=tsi
ts=2022-06-29T14:02:59.107245Z lvl=info msg="Reindexing TSM data" log_id=0bNx55U0000 service=storage-engine engine=tsm1 db_shard_id=1
ts=2022-06-29T14:02:59.107267Z lvl=info msg="Reindexing WAL data" log_id=0bNx55U0000 service=storage-engine engine=tsm1 db_shard_id=1
scrutiny api not ready
time="2022-06-29T11:03:01-03:00" level=debug msg="device (0x5001517bb2705e7a) smart data added to bucket: monthly"
ts=2022-06-29T14:03:01.293516Z lvl=info msg="index opened with 8 partitions" log_id=0bNx55U0000 service=storage-engine index=tsi
ts=2022-06-29T14:03:01.294000Z lvl=info msg="Reindexing TSM data" log_id=0bNx55U0000 service=storage-engine engine=tsm1 db_shard_id=2
ts=2022-06-29T14:03:01.294025Z lvl=info msg="Reindexing WAL data" log_id=0bNx55U0000 service=storage-engine engine=tsm1 db_shard_id=2
time="2022-06-29T11:03:02-03:00" level=debug msg="device (0x5001517bb2705e7a) smart data added to bucket: weekly"
ts=2022-06-29T14:03:04.102057Z lvl=info msg="index opened with 8 partitions" log_id=0bNx55U0000 service=storage-engine index=tsi
ts=2022-06-29T14:03:04.103457Z lvl=info msg="Reindexing TSM data" log_id=0bNx55U0000 service=storage-engine engine=tsm1 db_shard_id=3
ts=2022-06-29T14:03:04.103480Z lvl=info msg="Reindexing WAL data" log_id=0bNx55U0000 service=storage-engine engine=tsm1 db_shard_id=3
scrutiny api not ready
scrutiny api not ready
scrutiny api not ready
scrutiny api not ready
scrutiny api not ready
scrutiny api not ready
time="2022-06-29T11:03:30-03:00" level=error msg="Database migration failed with error. \n Please open a github issue at https://github.com/AnalogJ/scrutiny and attach a copy of your scrutiny.db file. \n Post \"http://localhost:8086/api/v2/write?bucket=metrics_weekly&org=scrutiny&precision=ns\": context deadline exceeded (Client.Timeout ex
ceeded while awaiting headers)"
panic: Post "http://localhost:8086/api/v2/write?bucket=metrics_weekly&org=scrutiny&precision=ns": context deadline exceeded (Client.Timeout exceeded while awaiting headers)


goroutine 1 [running]:
github.com/analogj/scrutiny/webapp/backend/pkg/web/middleware.RepositoryMiddleware({0x1018300, 0xc000316078}, {0x10219d0, 0xc0004055e0})
        /go/src/github.com/analogj/scrutiny/webapp/backend/pkg/web/middleware/repository.go:14 +0xa5
github.com/analogj/scrutiny/webapp/backend/pkg/web.(*AppEngine).Setup(0xc000315890, {0x10219d0, 0xc0004055e0})
        /go/src/github.com/analogj/scrutiny/webapp/backend/pkg/web/server.go:27 +0xb4
github.com/analogj/scrutiny/webapp/backend/pkg/web.(*AppEngine).Start(0xc000315890)
        /go/src/github.com/analogj/scrutiny/webapp/backend/pkg/web/server.go:105 +0x46b
main.main.func2(0xc0003a1340)
        /go/src/github.com/analogj/scrutiny/webapp/backend/cmd/scrutiny/scrutiny.go:112 +0x1f7
github.com/urfave/cli/v2.(*Command).Run(0xc00040b560, 0xc0003a11c0)
        /go/src/github.com/analogj/scrutiny/vendor/github.com/urfave/cli/v2/command.go:164 +0x64a
github.com/urfave/cli/v2.(*App).RunContext(0xc000424000, {0x10026d0, 0xc000036050}, {0xc000032060, 0x2, 0x2})
        /go/src/github.com/analogj/scrutiny/vendor/github.com/urfave/cli/v2/app.go:306 +0x926
github.com/urfave/cli/v2.(*App).Run(...)
        /go/src/github.com/analogj/scrutiny/vendor/github.com/urfave/cli/v2/app.go:215
main.main()
        /go/src/github.com/analogj/scrutiny/webapp/backend/cmd/scrutiny/scrutiny.go:137 +0x679

Please also provide the output of docker info

# docker info
Client:
 Context:    default
 Debug Mode: false
 Plugins:
  app: Docker App (Docker Inc., v0.9.1-beta3)
  buildx: Docker Buildx (Docker Inc., v0.7.1-docker)
  scan: Docker Scan (Docker Inc., v0.12.0)

Server:
 Containers: 25
  Running: 25
  Paused: 0
  Stopped: 0
 Images: 26
 Server Version: 20.10.12
 Storage Driver: btrfs
  Build Version: Btrfs v4.20.1
  Library Version: 102
 Logging Driver: json-file
 Cgroup Driver: systemd
 Cgroup Version: 2
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: inactive
 Runtimes: io.containerd.runc.v2 io.containerd.runtime.v1.linux runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: 7b11cfaabd73bb80907dd23182b9347b4245eb5d
 runc version: v1.0.2-0-g52b36a2
 init version: de40ad0
 Security Options:
  apparmor
  seccomp
   Profile: default
  cgroupns
 Kernel Version: 5.16.0-3-amd64
 Operating System: Debian GNU/Linux bookworm/sid
 OSType: linux
 Architecture: x86_64
 CPUs: 4
 Total Memory: 31.06GiB
 Name: gulf
 ID: 775X:SYTR:LRZ4:WLPC:DK7U:73S2:E6PB:WUFU:U2HY:5AQE:4535:DFIK
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 Registry: https://index.docker.io/v1/
 Labels:
 Experimental: false
 Insecure Registries:
  registry.lan:5000
  127.0.0.0/8
 Live Restore Enabled: false
invisiblethreat added the bug label on Jun 29, 2022
AnalogJ (Owner) commented Jul 10, 2022

Definitely not a processing timeout, but it could be a networking failure.
Are you running the omnibus container, or are you deploying in hub & spoke mode?

AnalogJ (Owner) commented Jul 10, 2022

Ah, just noticed that you mentioned omnibus in the issue title. Can you retry the container and see if it happens again? If it does, can you send me a copy of your scrutiny.db file? You can email it to jason@thesparktree.com.

AnalogJ (Owner) commented Aug 4, 2022

Hey @invisiblethreat, is this still an issue for you?
I'm going to close this issue since it's been almost a month without a reply, but feel free to reopen it if this is still affecting you.
