
Agents not triggered at scheduled time? #2667

Open
Videl opened this issue Jan 10, 2020 · 12 comments

@Videl
Contributor

Videl commented Jan 10, 2020

Hello,
I'm fairly new to Huginn :)
I have noticed that many jobs do not run on schedule. Is there anything I can do to debug what's happening?
I have made sure the timezone is set up correctly, because the 'now' date in events is correct when I trigger the agents myself.

I have seen comments about a "Job Management" page (#1668), but I can't find it. I have also seen issues mentioning admin users, but I wasn't offered any such choice; I had to use the invitation code to create an account.

I'm using the Docker Hub huginn/huginn image, with these variables set up:

  HUGINN_DATABASE_NAME
  HUGINN_DATABASE_USERNAME
  HUGINN_DATABASE_PASSWORD
  HUGINN_DATABASE_ADAPTER
  DATABASE_HOST
  POSTGRES_PORT_5432_TCP_ADDR
  DO_NOT_CREATE_DATABASE: true
  APP_SECRET_TOKEN
  DOMAIN
  INVITATION_CODE
  EMAIL_FROM_ADDRESS 
  RAILS_ENV: production
  SMTP_DOMAIN
  SMTP_PASSWORD
  SMTP_PORT
  SMTP_SERVER
  SMTP_USER_NAME
  TIMEZONE: Paris
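
For reference, here is roughly how I sanity-check that the TIMEZONE / TZ value actually reaches the Rails process (assuming bundle exec rails runner works inside the container, e.g. after docker exec -it huginn bash):

```ruby
# Quick sanity check: print what the running Rails process actually sees
# for its timezone configuration.
puts "ENV['TZ']       : #{ENV['TZ'].inspect}"
puts "Rails time zone : #{Time.zone.name}"
puts "Time.now        : #{Time.now}"
puts "Time.zone.now   : #{Time.zone.now}"
```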

Thanks for any help :)

@Videl
Contributor Author

Videl commented Jan 10, 2020

Alright, I got ahold of the admin account and erased the failed jobs.
That did not help: my Email Digest Agent still completely ignored its scheduled time 4 minutes ago, even though its memory property shows:

{
  "events": [
    242,
    249
  ]
}

Does that mean that if the agent is launched, it will send an email with the contents of its memory?

@Videl
Contributor Author

Videl commented Jan 10, 2020

I see there are a bunch of errors about PostgreSQL:

foreman stdout | 18:14:19 jobs.1 | { 70170609036080 rufus-scheduler intercepted an error:
18:14:19 jobs.1 |   70170609036080   job:
18:14:19 jobs.1 |   70170609036080     Rufus::Scheduler::EveryJob "1m" {:tag=>"HuginnScheduler"}
18:14:19 jobs.1 |   70170609036080   error:
18:14:19 jobs.1 |   70170609036080     70170609036080
18:14:19 jobs.1 |   70170609036080     ActiveRecord::StatementInvalid
18:14:19 jobs.1 |   70170609036080     PG::UnableToSend: no connection to the server
18:14:19 jobs.1 | : SELECT "agents".* FROM "agents" WHERE "agents"."type" = $1
18:14:19 jobs.1 |   70170609036080       /app/vendor/bundle/ruby/2.5.0/gems/activerecord-5.2.2.1/lib/active_record/connection_adapters/postgresql_adapter.rb:622:in `async_exec_prepared'
18:14:19 jobs.1 |   70170609036080       /app/vendor/bundle/ruby/2.5.0/gems/activerecord-5.2.2.1/lib/active_record/connection_adapters/postgresql_adapter.rb:622:in `block (2 levels) in exec_cache'
18:14:19 jobs.1 |   70170609036080       /app/vendor/bundle/ruby/2.5.0/gems/activesupport-5.2.2.1/lib/active_support/dependencies/interlock.rb:48:in `block in permit_concurrent_loads'
18:14:19 jobs.1 |   70170609036080       /app/vendor/bundle/ruby/2.5.0/gems/activesupport-5.2.2.1/lib/active_support/concurrency/share_lock.rb:187:in `yield_shares'
18:14:19 jobs.1 |   70170609036080       /app/vendor/bundle/ruby/2.5.0/gems/activesupport-5.2.2.1/lib/active_support/dependencies/interlock.rb:47:in `permit_concurrent_loads'
18:14:19 jobs.1 |   70170609036080       /app/vendor/bundle/ruby/2.5.0/gems/activerecord-5.2.2.1/lib/active_record/connection_adapters/postgresql_adapter.rb:621:in `block in exec_cache'
18:14:19 jobs.1 |   70170609036080       /app/vendor/bundle/ruby/2.5.0/gems/activerecord-5.2.2.1/lib/active_record/connection_adapters/abstract_adapter.rb:579:in `block (2 levels) in log'
18:14:19 jobs.1 |   70170609036080       /usr/lib/ruby/2.5.0/monitor.rb:226:in `mon_synchronize'
18:14:19 jobs.1 |   70170609036080       /app/vendor/bundle/ruby/2.5.0/gems/activerecord-5.2.2.1/lib/active_record/connection_adapters/abstract_adapter.rb:578:in `block in log'
18:14:19 jobs.1 |   70170609036080       /app/vendor/bundle/ruby/2.5.0/gems/activesupport-5.2.2.1/lib/active_support/notifications/instrumenter.rb:23:in `instrument'
foreman stdout | 18:14:19 jobs.1 |   70170609036080       /app/vendor/bundle/ruby/2.5.0/gems/activerecord-5.2.2.1/lib/active_record/connection_adapters/abstract_adapter.rb:569:in `log'
18:14:19 jobs.1 |   70170609036080       /app/vendor/bundle/ruby/2.5.0/gems/activerecord-5.2.2.1/lib/active_record/connection_adapters/postgresql_adapter.rb:620:in `exec_cache'
18:14:19 jobs.1 |   70170609036080       /app/vendor/bundle/ruby/2.5.0/gems/activerecord-5.2.2.1/lib/active_record/connection_adapters/postgresql_adapter.rb:600:in `execute_and_clear'
18:14:19 jobs.1 |   70170609036080       /app/vendor/bundle/ruby/2.5.0/gems/activerecord-5.2.2.1/lib/active_record/connection_adapters/postgresql/database_statements.rb:81:in `exec_query'
18:14:19 jobs.1 |   70170609036080       /app/vendor/bundle/ruby/2.5.0/gems/activerecord-5.2.2.1/lib/active_record/connection_adapters/abstract/database_statements.rb:474:in `select_prepared'
foreman stdout | 18:14:19 jobs.1 |   70170609036080       /app/vendor/bundle/ruby/2.5.0/gems/activerecord-5.2.2.1/lib/active_record/connection_adapters/abstract/database_statements.rb:60:in `select_all'
18:14:19 jobs.1 |   70170609036080       /app/vendor/bundle/ruby/2.5.0/gems/activerecord-5.2.2.1/lib/active_record/connection_adapters/abstract/query_cache.rb:101:in `select_all'
18:14:19 jobs.1 |   70170609036080       /app/vendor/bundle/ruby/2.5.0/gems/activerecord-5.2.2.1/lib/active_record/querying.rb:41:in `find_by_sql'
18:14:19 jobs.1 |   70170609036080       /app/vendor/bundle/ruby/2.5.0/gems/activerecord-5.2.2.1/lib/active_record/relation.rb:560:in `block in exec_queries'
18:14:19 jobs.1 |   70170609036080       /app/vendor/bundle/ruby/2.5.0/gems/activerecord-5.2.2.1/lib/active_record/relation.rb:584:in `skip_query_cache_if_necessary'
foreman stdout | 18:14:19 jobs.1 |   70170609036080       /app/vendor/bundle/ruby/2.5.0/gems/activerecord-5.2.2.1/lib/active_record/relation.rb:547:in `exec_queries'
18:14:19 jobs.1 |   70170609036080       /app/vendor/bundle/ruby/2.5.0/gems/activerecord-5.2.2.1/lib/active_record/relation.rb:422:in `load'
18:14:19 jobs.1 |   70170609036080       /app/vendor/bundle/ruby/2.5.0/gems/activerecord-5.2.2.1/lib/active_record/relation.rb:200:in `records'
18:14:19 jobs.1 |   70170609036080       /app/vendor/bundle/ruby/2.5.0/gems/activerecord-5.2.2.1/lib/active_record/relation/delegation.rb:71:in `each'
18:14:19 jobs.1 |   70170609036080       /app/lib/huginn_scheduler.rb:84:in `map'
18:14:19 jobs.1 |   70170609036080       /app/lib/huginn_scheduler.rb:84:in `schedule_scheduler_agents'
foreman stdout | 18:14:19 jobs.1 |   70170609036080       /app/lib/huginn_scheduler.rb:149:in `block in setup'
18:14:19 jobs.1 |   70170609036080       /app/vendor/bundle/ruby/2.5.0/gems/rufus-scheduler-3.4.2/lib/rufus/scheduler/jobs.rb:210:in `do_call'
18:14:19 jobs.1 |   70170609036080       /app/vendor/bundle/ruby/2.5.0/gems/rufus-scheduler-3.4.2/lib/rufus/scheduler/jobs.rb:254:in `trigger_now'
18:14:19 jobs.1 |   70170609036080       /app/vendor/bundle/ruby/2.5.0/gems/rufus-scheduler-3.4.2/lib/rufus/scheduler/jobs.rb:296:in `block (3 levels) in start_work_thread'
18:14:19 jobs.1 |   70170609036080       /app/vendor/bundle/ruby/2.5.0/gems/rufus-scheduler-3.4.2/lib/rufus/scheduler/jobs.rb:299:in `block (2 levels) in start_work_thread'
18:14:19 jobs.1 |   70170609036080       /app/vendor/bundle/ruby/2.5.0/gems/rufus-scheduler-3.4.2/lib/rufus/scheduler/jobs.rb:285:in `loop'
18:14:19 jobs.1 |   70170609036080       /app/vendor/bundle/ruby/2.5.0/gems/rufus-scheduler-3.4.2/lib/rufus/scheduler/jobs.rb:285:in `block in start_work_thread'
18:14:19 jobs.1 |   70170609036080   tz:
18:14:19 jobs.1 |   70170609036080     ENV['TZ']: Europe/Paris
18:14:19 jobs.1 |   70170609036080     Time.now: 2020-01-10 19:14:19 +0100
18:14:19 jobs.1 |   70170609036080     local_tzone: #<TZInfo::DataTimezone: Europe/Paris>
18:14:19 jobs.1 |   70170609036080   et-orbi:
18:14:19 jobs.1 |   70170609036080     (etz:"Europe/Paris",tnz:"CET",tzid:nil,rv:"2.5.5",rp:"x86_64-linux-gnu",eov:"1.0.9",rorv:"5.2.2.1",astz:[ActiveSupport::TimeZone, "Europe/Paris"],debian:"Etc/UTC",centos:nil,osx:nil)

@Videl
Contributor Author

Videl commented Jan 10, 2020

Alright, I restarted the Docker image from scratch; everything seems fine. I'll see if the error comes back.

@dsander
Collaborator

dsander commented Jan 10, 2020

> PG::UnableToSend: no connection to the server

Huh, interesting. I have never seen that error before.

Do you think the Postgres error prevented the Agents from being scheduled?

@Videl
Contributor Author

Videl commented Jan 11, 2020

Indeed, I just did a test, and that's exactly what happened!

My DB (and Huginn too) is installed with Ansible (a provisioning tool), and sometimes both get stopped and restarted, which apparently impacts Huginn.

I'm not a Ruby developer, so I don't know whether this can be fixed in Huginn, but I can make sure Huginn is restarted every time the DB is stopped.

Do you know if there is a way to automatically reconnect when the connection is lost?

@dsander
Collaborator

dsander commented Jan 15, 2020

I am pretty sure the Rails application itself handles reconnections automatically. For the background processes, we would need to look into that.
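
Something along these lines might be a starting point (a rough sketch only, not the current Huginn code; scheduler stands for the rufus-scheduler instance the jobs process sets up):

```ruby
# Sketch: make sure each scheduler tick runs on a connection that is known
# to be alive by checking one out of the pool and verifying it first.
scheduler.every '1m', tag: 'HuginnScheduler' do
  ActiveRecord::Base.connection_pool.with_connection do |connection|
    connection.verify!  # re-establishes the connection if the server closed it
    # ... schedule the agents as before ...
  end
end
```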

@BlueHatbRit

I've been digging into a similar issue to the one in the top post. I realised this week, when I started setting up some email digests, that the times must be way off. I went in and set the timezone correctly to "London" and restarted both the web and worker containers (using the single-threaded Postgres docker-compose), but to no avail.

I'm thinking I might need to tear down the compose stack and restart it for that setting to actually come through. I haven't checked any of the logs yet, though, as I've only spent one evening on this so far and did the above.

I can't find the "Job Management" page either, so I can only assume it doesn't exist anymore. But I'll inspect the logs to see if I can find the problem. By the sounds of it, a full restart might be the best option to get the setting respected.

@dsander
Collaborator

dsander commented Jan 18, 2020

> I'm thinking I might need to tear down the compose stack and restart it for that setting to actually come through. I haven't checked any of the logs yet, though, as I've only spent one evening on this so far and did the above.

docker-compose up -d should pick up the changes you made to the environment settings. Restarting a container does not work, because it will still have the previous environment variables applied.

> I can't find the "Job Management" page either, so I can only assume it doesn't exist anymore. But I'll inspect the logs to see if I can find the problem. By the sounds of it, a full restart might be the best option to get the setting respected.

If you don't see it, your user probably isn't an admin.

@Videl
Contributor Author

Videl commented Jan 19, 2020

To inspect the logs, since I'm using the Docker image, I use: docker logs -f huginn.
You may see things like this:

12:43:05 jobs.1 |   49028500     ActiveRecord::StatementInvalid
12:43:05 jobs.1 |   49028500     PG::UnableToSend: no connection to the server
12:43:05 jobs.1 | : SELECT "agents".* FROM "agents" WHERE "agents"."type" IN ('Agents::TwitterStreamAgent') AND "agents"."disabled" = $1 AND "agents"."deactivated" = $2 ORDER BY "agents"."id" ASC
foreman stdout | 12:43:05 jobs.1 |   49028500       /app/vendor/bundle/ruby/2.5.0/gems/activerecord-5.2.2.1/lib/active_record/connection_adapters/postgresql_adapter.rb:611:in `async_exec_params'
12:43:05 jobs.1 |   49028500       /app/vendor/bundle/ruby/2.5.0/gems/activerecord-5.2.2.1/lib/active_record/connection_adapters/postgresql_adapter.rb:611:in `block (2 levels) in exec_no_cache'
12:43:05 jobs.1 |   49028500       /app/vendor/bundle/ruby/2.5.0/gems/activesupport-5.2.2.1/lib/active_support/dependencies/interlock.rb:48:in `block in permit_concurrent_loads'
12:43:05 jobs.1 |   49028500       /app/vendor/bundle/ruby/2.5.0/gems/activesupport-5.2.2.1/lib/active_support/concurrency/share_lock.rb:187:in `yield_shares'
12:43:05 jobs.1 |   49028500       /app/vendor/bundle/ruby/2.5.0/gems/activesupport-5.2.2.1/lib/active_support/dependencies/interlock.rb:47:in `permit_concurrent_loads'
foreman stdout | 12:43:05 jobs.1 | { 48990440 rufus-scheduler intercepted an error:
12:43:05 jobs.1 |   49028500       /app/vendor/bundle/ruby/2.5.0/gems/activerecord-5.2.2.1/lib/active_record/connection_adapters/postgresql_adapter.rb:610:in `block in exec_no_cache'
12:43:05 jobs.1 |   49028500       /app/vendor/bundle/ruby/2.5.0/gems/activerecord-5.2.2.1/lib/active_record/connection_adapters/abstract_adapter.rb:579:in `block (2 levels) in log'
12:43:05 jobs.1 |   49028500       /usr/lib/ruby/2.5.0/monitor.rb:226:in `mon_synchronize'
12:43:05 jobs.1 |   49028500       /app/vendor/bundle/ruby/2.5.0/gems/activerecord-5.2.2.1/lib/active_record/connection_adapters/abstract_adapter.rb:578:in `block in log'
12:43:05 jobs.1 |   49028500       /app/vendor/bundle/ruby/2.5.0/gems/activesupport-5.2.2.1/lib/active_support/notifications/instrumenter.rb:23:in `instrument'
12:43:05 jobs.1 |   48990440   job:
12:43:05 jobs.1 |   49028500       /app/vendor/bundle/ruby/2.5.0/gems/activerecord-5.2.2.1/lib/active_record/connection_adapters/abstract_adapter.rb:569:in `log'
12:43:05 jobs.1 |   49028500       /app/vendor/bundle/ruby/2.5.0/gems/activerecord-5.2.2.1/lib/active_record/connection_adapters/postgresql_adapter.rb:609:in `exec_no_cache'
12:43:05 jobs.1 |   49028500       /app/vendor/bundle/ruby/2.5.0/gems/activerecord-5.2.2.1/lib/active_record/connection_adapters/postgresql_adapter.rb:598:in `execute_and_clear'
12:43:05 jobs.1 |   49028500       /app/vendor/bundle/ruby/2.5.0/gems/activerecord-5.2.2.1/lib/active_record/connection_adapters/postgresql/database_statements.rb:81:in `exec_query'
12:43:05 jobs.1 |   49028500       /app/vendor/bundle/ruby/2.5.0/gems/activerecord-5.2.2.1/lib/active_record/connection_adapters/abstract/database_statements.rb:470:in `select'
12:43:05 jobs.1 |   49028500       /app/vendor/bundle/ruby/2.5.0/gems/activerecord-5.2.2.1/lib/active_record/connection_adapters/abstract/database_statements.rb:62:in `select_all'
12:43:05 jobs.1 |   49028500       /app/vendor/bundle/ruby/2.5.0/gems/activerecord-5.2.2.1/lib/active_record/connection_adapters/abstract/query_cache.rb:101:in `select_all'
12:43:05 jobs.1 |   49028500       /app/vendor/bundle/ruby/2.5.0/gems/activerecord-5.2.2.1/lib/active_record/querying.rb:41:in `find_by_sql'
12:43:05 jobs.1 |   49028500       /app/vendor/bundle/ruby/2.5.0/gems/activerecord-5.2.2.1/lib/active_record/relation.rb:560:in `block in exec_queries'
12:43:05 jobs.1 |   49028500       /app/vendor/bundle/ruby/2.5.0/gems/activerecord-5.2.2.1/lib/active_record/relation.rb:584:in `skip_query_cache_if_necessary'
12:43:05 jobs.1 |   49028500       /app/vendor/bundle/ruby/2.5.0/gems/activerecord-5.2.2.1/lib/active_record/relation.rb:547:in `exec_queries'
12:43:05 jobs.1 |   49028500       /app/vendor/bundle/ruby/2.5.0/gems/activerecord-5.2.2.1/lib/active_record/relation.rb:422:in `load'
12:43:05 jobs.1 |   49028500       /app/vendor/bundle/ruby/2.5.0/gems/activerecord-5.2.2.1/lib/active_record/relation.rb:200:in `records'
12:43:05 jobs.1 |   49028500       /app/vendor/bundle/ruby/2.5.0/gems/activerecord-5.2.2.1/lib/active_record/relation/delegation.rb:71:in `each'
12:43:05 jobs.1 |   49028500       /app/app/models/agents/twitter_stream_agent.rb:128:in `group_by'
12:43:05 jobs.1 |   49028500       /app/app/models/agents/twitter_stream_agent.rb:128:in `setup_worker'
12:43:05 jobs.1 |   49028500       /app/lib/agent_runner.rb:94:in `block (2 levels) in load_workers'
12:43:05 jobs.1 |   49028500       /app/lib/agent_runner.rb:62:in `block in with_connection'
12:43:05 jobs.1 |   49028500       /app/vendor/bundle/ruby/2.5.0/gems/activerecord-5.2.2.1/lib/active_record/connection_adapters/abstract/connection_pool.rb:416:in `with_connection'
12:43:05 jobs.1 |   49028500       /app/lib/agent_runner.rb:61:in `with_connection'
12:43:05 jobs.1 |   49028500       /app/lib/agent_runner.rb:93:in `block in load_workers'
12:43:05 jobs.1 |   49028500       /app/lib/agent_runner.rb:89:in `each'
12:43:05 jobs.1 |   49028500       /app/lib/agent_runner.rb:89:in `load_workers'
12:43:05 jobs.1 |   49028500       /app/lib/agent_runner.rb:69:in `run_workers'
12:43:05 jobs.1 |   49028500       /app/lib/agent_runner.rb:24:in `block in initialize'
12:43:05 jobs.1 |   49028500       /app/vendor/bundle/ruby/2.5.0/gems/rufus-scheduler-3.4.2/lib/rufus/scheduler/jobs.rb:210:in `do_call'
12:43:05 jobs.1 |   49028500       /app/vendor/bundle/ruby/2.5.0/gems/rufus-scheduler-3.4.2/lib/rufus/scheduler/jobs.rb:254:in `trigger_now'
12:43:05 jobs.1 |   49028500       /app/vendor/bundle/ruby/2.5.0/gems/rufus-scheduler-3.4.2/lib/rufus/scheduler/jobs.rb:296:in `block (3 levels) in start_work_thread'
12:43:05 jobs.1 |   49028500       /app/vendor/bundle/ruby/2.5.0/gems/rufus-scheduler-3.4.2/lib/rufus/scheduler/jobs.rb:299:in `block (2 levels) in start_work_thread'
12:43:05 jobs.1 |   49028500       /app/vendor/bundle/ruby/2.5.0/gems/rufus-scheduler-3.4.2/lib/rufus/scheduler/jobs.rb:285:in `loop'
12:43:05 jobs.1 |   49028500       /app/vendor/bundle/ruby/2.5.0/gems/rufus-scheduler-3.4.2/lib/rufus/scheduler/jobs.rb:285:in `block in start_work_thread'

(PG here refers to PostgreSQL, for example.)
In this stack trace, I'm not sure whether the reconnection should happen there, in /app/app/models/agents/twitter_stream_agent.rb:128:in `group_by', or higher up, like in /app/lib/agent_runner.rb:24:in `block in initialize'.
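
For instance, a rough retry wrapper around with_connection in agent_runner.rb could look something like this (just a sketch to illustrate the idea, not tested against the actual Huginn code):

```ruby
# Sketch only: retry the block once after reconnecting when Postgres
# reports that the connection went away mid-query.
def with_connection
  ActiveRecord::Base.connection_pool.with_connection do
    begin
      yield
    rescue ActiveRecord::StatementInvalid => e
      raise unless e.cause.is_a?(PG::UnableToSend)
      ActiveRecord::Base.connection.reconnect!  # re-establish the connection
      yield                                     # and give the work one more try
    end
  end
end
```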

I'll see if I can do some testing.

@BlueHatbRit, if you haven't configured anything about the admin user, the default password should be "password" (as written here: https://github.com/Videl/huginn/blob/fb13686598596d648f3dd422288cfe87b69d8244/docker/multi-process/README.md)

@BlueHatbRit

Hey both, thanks for the info about the admin user, I totally missed that in the docs. I've sorted that out and the guides all make a lot more sense now; I've also closed that potential security hole 😄

I don't have any logs like the ones you're seeing @Videl so I'm afraid I can't be of any help on that one.

@dsander good shout. I've not used Compose for a few months and had gone for a docker-compose restart rather than a rebuild. Running docker-compose up -d seems to have rebuilt the containers, and a quick test shows it's respecting the time setting now. Thanks so much for the help and info, both!

dsander self-assigned this Jan 19, 2020
@skrobul

skrobul commented Dec 7, 2020

Just experienced the same problem: a job that is scheduled to run every 12h hasn't run for over 30 hours now. Before killing and restarting the container, I verified that the connection is actually still established:

root@cloud:~# nsenter -t 19850 -n ss -an | grep 5432
tcp    ESTAB       0       0           172.19.0.3:60982        172.17.0.1:5432  
tcp    ESTAB       0       0           172.19.0.3:50932        172.17.0.1:5432  
root@cloud:~# 

Both of them sit idle:

root@cloud:~# ps uax | grep huginn | grep 172.19.0.3
postgres 16139  0.0  0.4 215504 18588 ?        Ss   08:07   0:00 postgres: 11/main: huginn huginn 172.19.0.3(60982) idle
postgres 27242  0.0  0.4 215624 16900 ?        Ss   Dec06   0:11 postgres: 11/main: huginn huginn 172.19.0.3(50932) idle
root@cloud:~# 

The container logs are available here.

Could this be related to one of these?

https://stackoverflow.com/questions/45218189/pgunabletosend-no-connection-to-the-server-in-rails
https://stackoverflow.com/questions/11248808/connection-pool-issue-with-activerecord-objects-in-rufus-scheduler
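
(Both threads boil down to roughly the same pattern: have each scheduled run borrow a connection from the pool and release it afterwards, instead of the scheduler thread holding one long-lived connection. A sketch of that pattern, for illustration only:)

```ruby
# Sketch of the pattern discussed in the linked Stack Overflow threads:
# borrow a fresh connection from the pool for each scheduled run and
# return it afterwards, rather than reusing one long-lived connection.
scheduler.every '12h' do
  begin
    ActiveRecord::Base.connection_pool.with_connection do
      # ... job body ...
    end
  ensure
    ActiveRecord::Base.clear_active_connections!  # drop any stray checked-out connections
  end
end
```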

@skrobul

skrobul commented Dec 9, 2020

A solution can be found in #1154 or here
