report katib metrics by Prometheus #691

Closed
3 tasks done
hougangliu opened this issue Jul 19, 2019 · 4 comments

@hougangliu
Member

hougangliu commented Jul 19, 2019

/kind feature

Describe the solution you'd like
Metrics to be exposed through an HTTP endpoint in the Prometheus metrics format:

Report katib metrics:
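
To illustrate what "an HTTP endpoint in Prometheus metrics format" could look like, here is a minimal sketch using only the Python standard library. It is not katib's implementation: the metric name `katib_experiment_created_total`, the `Counter` class, the `render_all` helper, and port 8080 are all hypothetical and chosen only for illustration. A real controller would more likely use a Prometheus client library.

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class Counter:
    """A tiny thread-safe counter that renders itself in the Prometheus text format."""

    def __init__(self, name, help_text):
        self.name = name
        self.help_text = help_text
        self._value = 0
        self._lock = threading.Lock()

    def inc(self, amount=1):
        # Counters only ever go up.
        with self._lock:
            self._value += amount

    def render(self):
        with self._lock:
            value = self._value
        return (
            f"# HELP {self.name} {self.help_text}\n"
            f"# TYPE {self.name} counter\n"
            f"{self.name} {value}\n"
        )


# Hypothetical metric; the issue does not name the actual metrics.
REGISTRY = [
    Counter("katib_experiment_created_total", "Experiments created (illustrative)."),
]


def render_all(counters):
    """Concatenate every counter into one exposition-format payload."""
    return "".join(c.render() for c in counters)


class MetricsHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = render_all(REGISTRY).encode("utf-8")
        self.send_response(200)
        # Content type of the Prometheus text exposition format.
        self.send_header("Content-Type", "text/plain; version=0.0.4; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


if __name__ == "__main__":
    # Port 8080 is an arbitrary choice for this sketch.
    HTTPServer(("", 8080), MetricsHandler).serve_forever()
```

With the server running, a Prometheus scrape job (or `curl localhost:8080/metrics`) would read the `# HELP`, `# TYPE`, and sample lines for each counter.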

@hougangliu
Member Author

/assign

@hougangliu
Member Author

controller-runtime v0.1.18 introduced Prometheus metrics for its internal controller, and it depends on the kubernetes-1.12.3 API packages, as here.
For now, pytorch-operator and tf-operator depend on kubernetes-1.11.2.

If we upgrade controller-runtime to v0.1.18 or above in katib, this feature will take much less effort. However, controller-runtime v0.1.18 has a dependency conflict with pytorch-operator and tf-operator, both of which katib depends on.

@richardsliu @gaocegege @johnugeorge
Do the pytorch-operator and tf-operator repos have any plan to upgrade their Kubernetes library dependency?

@gaocegege
Member

I strongly suggest updating the dependency version of the operators, but I am not sure if we can.

/cc @richardsliu

@johnugeorge
Member

I can try
