Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CR2-22 CR2-49 CR2-48 add metrics list to sdk #328

Merged
merged 13 commits into from Nov 19, 2020

Conversation

robghchen
Copy link
Contributor

@robghchen robghchen commented Oct 29, 2020

Adding list custom metrics functionality to sdk and cli for jobs, experiments, deployments, and notebooks

QA Test Plan:

  1. in cli, check that this new list command shows up in the help text: gradient deployments metrics --help
  2. create a deployment (gui or cli doesn't matter) and wait for it to finish provisioning
  3. try the new list command: gradient deployments metrics list --id XYZ, should return custom metrics (a list of words),
  4. repeat 1 and 2 for jobs, experiments, and notebooks. these might return null but that's okay for now, just checking that this new command is available. note the stuff that returns null below if any and I'll solve that in a separate ticket.

@linear
Copy link

linear bot commented Oct 29, 2020

CR-22 View custom metrics from CLI

  • Allow querying custom metrics using experiment metrics or experiment stream. Right now we use a hardcoded list of options and we would like to relax the validation requirement to allow any input. We should update the help for these commands to list all of the known available options (found here).
  • We need a new metrics sub-command for each workload, list, which will query available custom metrics from the metrics server.
# all three of these go to the metrics service on the cluster
# talks to api to get metrics url, then talks to metrics url
gradient experiments metrics list      # optional timeframe, done over https
# --> names of metrics
gradient experiments metrics stream    # query, done over ws, long-lived
gradient experiments metrics get       # query + timeframe, done over https
# --> actual metric values

@robghchen robghchen changed the title CR-22 add metrics list to sdk CR2-22 CR2-49 CR2-48 add metrics list to sdk Nov 5, 2020
@robghchen robghchen marked this pull request as ready for review November 6, 2020 19:07
@robghchen
Copy link
Contributor Author

this can be ready for review, i made a separate PR for messing around with tests #331

"interval": interval,
"objecttype": self.OBJECT_TYPE,
"handle": instance_id,
"metrics_api_url": "% s:8080"% metrics_api_url if 'local' in metrics_api_url else metrics_api_url,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This might be too broad of a check. How about localhost?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

at first i assumed it would contain localhost but when i logged out metrics_api_url it said something along the lines of http...local.paperspace...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh i could check local.paperspace instead duh lol

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this is specific to that domain then maybe we should be including the port in the metrics URL returned for workloads in that domain, ie. in the service layer.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yea that would be better

Copy link

@tallyw00d tallyw00d left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

list exists. Deployments, experiments, jobs, notebooks all currently return null

LGTM :shipit:

@robghchen robghchen merged commit 16b9e08 into master Nov 19, 2020
@robghchen robghchen deleted the CR-22-add-metrics-list-to-sdk branch November 19, 2020 21:34
@PSBOT
Copy link

PSBOT commented Nov 19, 2020

🎉 This PR is included in version 1.2.3 🎉

The release is available on GitHub release

Your semantic-release bot 📦🚀

@PSBOT PSBOT added the released label Nov 19, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants