Set jaeger-spark-dependency date in operator #1068

Open
linjmeyer opened this issue May 20, 2020 · 4 comments

Comments

@linjmeyer

By default, the jaeger-spark-dependency CronJob runs just before midnight. I have several large clusters sharing a single Jaeger instance, so the CronJob uses a lot of resources. The problem I am facing is that ~12 AM UTC (cluster time) is not an ideal time to run a memory-intensive job, as the clusters are still heavily used at this time. I would like to move it to more like 2-3 AM the day after, but the job defaults to the current day and never finishes when I try this setup. I assume because it is reading the index of the current day which is growing faster than it can read the traces/spans.

The Jaeger Spark Dependency can be set to use a particular date using the DATE environment variable (docs) and they include some bash examples to dynamically get the previous day.

It would be great if the Jaeger Operator exposed this setting or added an env field for arbitrary environment variables to be set.

Thanks!

@ghost added the needs-triage label (New issues, in need of classification) May 20, 2020
@pavolloffay removed the needs-triage label (New issues, in need of classification) May 20, 2020
@pavolloffay
Member

It sounds reasonable, would you like to submit a PR?

I would like to move it to more like 2-3 AM the day after, but the job defaults to the current day and never finishes when I try this setup. I assume because it is reading the index of the current day which is growing faster than it can read the traces/spans.

It seems controversial as at 2-3 AM there should be less load.

@linjmeyer
Author

It seems controversial as at 2-3 AM there should be less load.

I'm not sure what you mean by this?

Happy to submit a PR. I don't have a ton of Go experience, but this change seems fairly simple, so I'll give it a shot!

@linjmeyer
Author

I could not find a clean way to achieve this in the operator. Instead, I opened a PR to add this as a feature in the Spark Dependencies repo here. Feedback is welcome!

@jpkrohling
Contributor

We should really use @daily in the job and let the scheduler figure out the best time to run it...
