request: disable logging #349

Open
ppcololo opened this issue Mar 16, 2024 · 11 comments

Comments

@ppcololo
Contributor

As far as I understand, the current flow is:

  1. workers publish logs to the RabbitMQ queue logs
  2. the coordinator pulls logs from the logs queue
  3. the coordinator saves the logs to the Postgres DB
  4. when I press the Logs button, I see the logs from the DB

This causes problems: with a lot of workers and logs, I can see more than x million messages in the logs queue. As far as I can tell, the coordinator doesn't pull and save the logs to the DB fast enough, so when I press the Logs button it shows nothing. From time to time I have to purge the queue to see logs.

Possible options:

  1. disable logging (use alternatives like ELK stack)
  2. improve coordinator/logging performance
@runabol
Owner

runabol commented Mar 16, 2024

Did you try setting TORK_COORDINATOR_QUEUES_LOGS to a value greater than 1 (default) to have multiple subscribers processing the logs queue?

@ppcololo
Contributor Author

Thanks for pointing this out.
I can see some values here: https://github.com/runabol/tork/blob/main/configs/sample.config.toml#L39-L45
Could you share a link to documentation that describes in more detail what these values mean?
If I set logs=x, what does it mean?

@runabol
Owner

runabol commented Mar 16, 2024

It means the number of subscribers/goroutines that will process the logs queue in parallel.
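
To illustrate what "multiple subscribers processing the queue in parallel" means, here is a minimal Go sketch. It is not Tork's actual implementation: the channel stands in for the RabbitMQ logs queue, and `processLogs` and `handle` are hypothetical names used only for illustration.

```go
package main

import (
	"fmt"
	"sync"
)

// processLogs starts n subscriber goroutines that drain the queue in
// parallel and returns how many messages were handled in total.
// The channel stands in for the RabbitMQ logs queue and handle stands in
// for the coordinator's "save to Postgres" step (both hypothetical).
func processLogs(queue <-chan string, n int, handle func(string)) int {
	var (
		wg    sync.WaitGroup
		mu    sync.Mutex
		count int
	)
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			// Each subscriber competes for the next available message.
			for msg := range queue {
				handle(msg)
				mu.Lock()
				count++
				mu.Unlock()
			}
		}()
	}
	wg.Wait()
	return count
}

func main() {
	queue := make(chan string, 100)
	for i := 0; i < 100; i++ {
		queue <- fmt.Sprintf("log line %d", i)
	}
	close(queue)

	// n plays the role of the logs=x setting discussed above.
	handled := processLogs(queue, 4, func(string) {})
	fmt.Println("handled:", handled)
}
```

With a slow `handle` step, raising `n` lets several messages be processed at once, which is the effect the setting is meant to have on a backed-up queue.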

@ppcololo
Contributor Author

Thanks @runabol
It helped a lot.
Please add more info to the documentation; it will help you avoid a lot of questions in the future.

@runabol
Owner

runabol commented Mar 18, 2024

That's fair

@ppcololo ppcololo reopened this May 3, 2024
@ppcololo
Contributor Author

ppcololo commented May 3, 2024

I have to reopen this.
We really need an option to disable logging. Take a look: we set this option in the config:

[datastore.postgres]
dsn = ""
task.logs.interval = "168h"

That means a log retention of 7 days.
Today is 03.05.2024, but in the DB I can see:

tork=# select min(created_at) from tasks_log_parts limit 1;
            min
----------------------------
 2024-04-22 12:38:07.650203
(1 row)

And our DB grows indefinitely:

tork=# select
  table_name,
  pg_size_pretty(pg_total_relation_size(quote_ident(table_name))),
  pg_total_relation_size(quote_ident(table_name))
from information_schema.tables
where table_schema = 'public'
order by 3 desc;
   table_name    | pg_size_pretty | pg_total_relation_size
-----------------+----------------+------------------------
 tasks_log_parts | 137 GB         |           146990120960
 tasks           | 221 MB         |              231546880
 jobs            | 5168 kB        |                5292032
 nodes           | 1744 kB        |                1785856
(4 rows)

So either the config option doesn't work, or it works so slowly that it can't keep up with deleting the new logs.
We want to disable logging completely and use other software for this, such as the ELK stack.

@runabol
Owner

runabol commented May 4, 2024

Try this config option:

postgres.WithTaskLogRetentionPeriod(conf.DurationDefault("datastore.postgres.task.logs.interval", postgres.DefaultTaskLogsRetentionPeriod)),

@ppcololo
Contributor Author

ppcololo commented May 4, 2024

As my message above shows, this option doesn't work.
Today I checked the logs in the DB again and I see:

tork=# select min(created_at) from tasks_log_parts;
            min
----------------------------
 2024-04-22 12:42:13.105112
(1 row)

and

tork=# select
  table_name,
  pg_size_pretty(pg_total_relation_size(quote_ident(table_name))),
  pg_total_relation_size(quote_ident(table_name))
from information_schema.tables
where table_schema = 'public'
order by 3 desc;
   table_name    | pg_size_pretty | pg_total_relation_size
-----------------+----------------+------------------------
 tasks_log_parts | 162 GB         |           173817077760
 tasks           | 221 MB         |              231948288
 jobs            | 5312 kB        |                5439488
 nodes           | 1832 kB        |                1875968
(4 rows)

That is +25 GB of logs since yesterday.
As you can see, Tork deleted only about 4 minutes of logs:
was: 2024-04-22 12:38:07.650203
now: 2024-04-22 12:42:13.105112

@runabol
Owner

runabol commented May 4, 2024

Sounds like the pruning process is not catching up quickly enough with the amount of logs you're generating per day. I can make the number of records it deletes per cleaning period configurable. Right now it's hard-coded to 1000 I believe.

@ppcololo
Contributor Author

ppcololo commented May 4, 2024

If I'm not mistaken, we have about 20 million rows per day in the DB.

@runabol
Owner

runabol commented May 6, 2024

Can you try release 0.1.73? It adds improvements to log shipping -- buffering log messages (up to one second) rather than sending each log line separately.
