Log management for long running indexing service tasks #401

Closed
gianm opened this issue Feb 19, 2014 · 6 comments

Comments

@gianm
Contributor

gianm commented Feb 19, 2014

If someone runs long-running indexing tasks (plausible in the case of realtime index tasks), the log will grow without bound. We need to do log rotation, truncation, or something similar.

@deepujain
Contributor

I would like to take up this issue and have spoken to neo for a little background about it.

Assuming Druid uses log4j for logging, a simple fix would be to use DailyRollingFileAppender instead of FileAppender. I found log4j.xml in the druid/install module. However, after compilation I did not find it bundled in the tar, so I manually copied the file and added it to the classpath of a historical node (using historical as an example). It had no effect.

In addition to DailyRollingFileAppender, I found a very useful appender called CustodianDailyRollingFileAppender, by Ryan, which does the following on top of what DailyRollingFileAppender does:

  1. compress log files older than today
  2. remove log files older than a specified number of days

through two additional settings

Will compress log files daily and delete them after 14 days.

I have made the above changes but am unable to plug druid/install/log4j.xml into the executable.

@gianm
Contributor Author

gianm commented Jul 13, 2015

Some discussion happened on this issue last week. I think what needs to be done to make log rotation work along with archiving is:

  • Modification of TaskLogPusher that allows pushing logs in chunks. This is technically an external interface, so we may have to consider whether it's likely that there are any implementations outside of druid-io.
  • Modification of ForkingTaskRunner to use that new stuff and actually push logs in chunks.
  • Some kind of TaskLogStreamer that can unify the log chunks and any fragment still present on the actual task runner.
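To make the first option more concrete, here is a minimal sketch of chunked pushing plus a streamer that stitches the pushed chunks together with whatever is still on the task runner. `ChunkedLogPusher`, `InMemoryChunkStore`, and `streamLog` are hypothetical names invented for illustration; they are not the existing TaskLogPusher or TaskLogStreamer interfaces, and deep storage is stubbed out with an in-memory map.

```java
import java.io.ByteArrayOutputStream;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical interface: push one numbered chunk of a task's log.
interface ChunkedLogPusher {
  void pushChunk(String taskId, int chunkNumber, byte[] data);
}

// Stand-in for deep storage (an assumption for this sketch): chunks are kept in memory.
final class InMemoryChunkStore implements ChunkedLogPusher {
  private final Map<String, List<byte[]>> chunks = new HashMap<>();

  @Override
  public synchronized void pushChunk(String taskId, int chunkNumber, byte[] data) {
    List<byte[]> list = chunks.computeIfAbsent(taskId, k -> new ArrayList<>());
    // Require chunks to arrive in order so the stored log has no gaps.
    if (chunkNumber != list.size()) {
      throw new IllegalArgumentException(
          "expected chunk " + list.size() + " but got " + chunkNumber);
    }
    list.add(data.clone());
  }

  // Streamer side: the pushed chunks in order, followed by the fragment
  // that is still local to the task runner and has not been pushed yet.
  public synchronized byte[] streamLog(String taskId, byte[] localFragment) {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    for (byte[] chunk : chunks.getOrDefault(taskId, new ArrayList<>())) {
      out.write(chunk, 0, chunk.length);
    }
    out.write(localFragment, 0, localFragment.length);
    return out.toByteArray();
  }
}
```

The chunks are concatenated as raw bytes rather than decoded one at a time, so a multi-byte character that straddles a chunk boundary is not corrupted.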

Or if we just want to throw away old logs, it's a little simpler:

  • Modification of ForkingTaskRunner to split logs into chunks and periodically delete old chunks.
  • When it's time to push logs (at the end of a task, if that ever happens) then merge & push whatever chunks haven't been deleted yet.
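A rough sketch of this simpler option, assuming the chunks are files in a per-task directory whose names sort chronologically (for example, zero-padded sequence numbers). `LogChunkJanitor` and its methods are hypothetical names for illustration, not anything that exists in ForkingTaskRunner.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.stream.Stream;

// Hypothetical helper: prune old chunk files and merge the survivors.
final class LogChunkJanitor {
  // Lists chunk files in path order; assumes file names sort oldest first.
  static List<Path> listChunks(Path dir) throws IOException {
    List<Path> result = new ArrayList<>();
    try (Stream<Path> files = Files.list(dir)) {
      files.filter(Files::isRegularFile).forEach(result::add);
    }
    Collections.sort(result);
    return result;
  }

  // Deletes the oldest chunks so that at most maxChunks remain.
  static void pruneOldChunks(Path dir, int maxChunks) throws IOException {
    List<Path> chunks = listChunks(dir);
    for (int i = 0; i < chunks.size() - maxChunks; i++) {
      Files.delete(chunks.get(i));
    }
  }

  // Concatenates the surviving chunks into a single file, ready to push.
  // The merged file must live outside dir, or it would be listed as a chunk itself.
  static void mergeChunks(Path dir, Path merged) throws IOException {
    try (OutputStream out = Files.newOutputStream(merged)) {
      for (Path chunk : listChunks(dir)) {
        Files.copy(chunk, out);
      }
    }
  }
}
```

Here `pruneOldChunks` would be called periodically while the task runs, and `mergeChunks` once at the end, just before the push.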

I think it would be nice to do the first thing, since I am a pack rat when it comes to logs, but I'm also wondering how other people feel.

@guobingkun
Contributor

@gianm I've thought about this a little and discussed with @himanshug.

I am thinking about creating a ScheduledExecutorService in ForkingTaskRunner that periodically checks whether it should split the current toLogFile (it could check either whether the current log file is too big or whether it is too old; I haven't decided on this yet).

After splitting the current log file, we can use TaskLogPusher to push the larger/older chunk to deep storage, and potentially merge it with the existing one in deep storage. Meanwhile, if TaskLogStreamer needs to show the task log to a user (e.g., the user wants to see the log from the overlord console), it can temporarily merge all the existing chunks plus the current log file on the overlord/middleManager and return the result.
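As a minimal sketch of the periodic check, assuming the size-based variant: `SizeBasedLogSplitter` is a hypothetical class, not part of ForkingTaskRunner, and it assumes whoever writes the log reopens the file after a split, which a real implementation would have to coordinate.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical helper: splits a log file into numbered chunks once it grows too big.
final class SizeBasedLogSplitter {
  private final Path currentLog;
  private final long maxBytes;
  private int nextChunk = 0;

  SizeBasedLogSplitter(Path currentLog, long maxBytes) {
    this.currentLog = currentLog;
    this.maxBytes = maxBytes;
  }

  // Moves the current log aside as a numbered chunk if it has reached maxBytes,
  // then recreates an empty current log. Returns the chunk path, or null if no
  // split was needed. Assumption: the writer reopens currentLog after a split.
  synchronized Path maybeSplit() throws IOException {
    if (!Files.exists(currentLog) || Files.size(currentLog) < maxBytes) {
      return null;
    }
    String chunkName = String.format("%s.%06d", currentLog.getFileName(), nextChunk++);
    Path chunk = currentLog.resolveSibling(chunkName);
    Files.move(currentLog, chunk);
    Files.createFile(currentLog);
    return chunk;
  }

  // Runs the check periodically on a single background thread.
  ScheduledExecutorService start(long periodSeconds) {
    ScheduledExecutorService exec = Executors.newSingleThreadScheduledExecutor();
    exec.scheduleAtFixedRate(() -> {
      try {
        Path chunk = maybeSplit();
        if (chunk != null) {
          // A real version would hand the chunk to the log pusher here.
          System.out.println("split off " + chunk);
        }
      } catch (IOException e) {
        // Catch here so one failure does not cancel the schedule.
        e.printStackTrace();
      }
    }, periodSeconds, periodSeconds, TimeUnit.SECONDS);
    return exec;
  }
}
```

An age-based check would have the same shape, comparing the file's creation or last-split time against a threshold instead of calling `Files.size`.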

@drcrallen
Contributor

@guobingkun How would you feel if the log was a directory instead of a file, and all files in it were assumed to have lexicographical ordering?

Then you can define a log4j2.xml with a sensible log rotation rule set and the log reader has a simple set of rules on how to concatenate files in the directory.

@drcrallen
Contributor

In other words, leave it up to the logging system to manage splitting and maintaining the files, but allow the logging facilitator in Druid to handle a simple set of rules.
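A sketch of what the reader side of that rule could look like, assuming the task log is a directory and the rotation pattern produces file names that sort oldest first. `LogDirectoryReader` is a hypothetical name for illustration, not an existing Druid class.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.SequenceInputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical reader: treats a directory of rotated files as one logical log.
final class LogDirectoryReader {
  // Opens every regular file in logDir, sorted by file name, as one continuous stream.
  static InputStream open(Path logDir) throws IOException {
    List<Path> files;
    try (Stream<Path> listing = Files.list(logDir)) {
      files = listing
          .filter(Files::isRegularFile)
          .sorted(Comparator.comparing((Path p) -> p.getFileName().toString()))
          .collect(Collectors.toList());
    }
    // Files are opened eagerly to keep the sketch short; a production reader
    // would open them lazily and clean up if one of them fails to open.
    List<InputStream> streams = new ArrayList<>();
    for (Path file : files) {
      streams.add(Files.newInputStream(file));
    }
    return new SequenceInputStream(Collections.enumeration(streams));
  }
}
```

The reader knows nothing about how the files were produced, so the rotation policy can change in the logging configuration without touching this code, as long as the naming pattern keeps the lexicographic ordering.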

@xianyinxin

Any update on this?


7 participants