Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: add non-scalar metric expectation to protobufs [DET-4893] [DET-4911] #1876

Merged
merged 21 commits into from
Feb 3, 2021

Conversation

hamidzr
Copy link
Member

@hamidzr hamidzr commented Jan 27, 2021

Description

Test Plan

  • run the provided noop experiment and check taht the api does not fail
  • add master integ test. watch it fail, add a fix, watch it pass
    2021-01-29_15-12-27

Commentary (optional)

  • for a more end-to-end test we need to seed the database for testing the api, mock the api response for testing the UI, or submit the test experiment and wait for it to finish before proceeding to have both. what do you suggest?

Checklist

  • remove the test commit
  • User-facing API changes need the "User-facing API Change" label.
  • remove test scope limit
  • Release notes should be added as a separate file under docs/release-notes/.
    See Release Note for details.

@cla-bot cla-bot bot added the cla-signed label Jan 27, 2021
@hamidzr hamidzr force-pushed the 4893-non-scalar-metric branch 2 times, most recently from 764a487 to 8966e6e Compare January 28, 2021 00:43
@hamidzr hamidzr requested a review from hkang1 January 28, 2021 07:25
@hamidzr hamidzr force-pushed the 4893-non-scalar-metric branch 2 times, most recently from dc44162 to 9dbbdd3 Compare January 28, 2021 23:26
@hamidzr hamidzr marked this pull request as ready for review January 28, 2021 23:27
@hamidzr hamidzr changed the title fix: add non-scalar metric expectation to protobufs [DET-4893] fix: add non-scalar metric expectation to protobufs [DET-4893] [DET-4911] Jan 28, 2021
@@ -11,7 +11,7 @@ checkpoint_storage:

db:
user: postgres
host: localhost
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is more curiosity than anything, but why does this have to change?

Copy link
Member Author

@hamidzr hamidzr Jan 29, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think a short version would be less chance for DNS (or hosts entry?) to mess up and delay responding with the address.

I was in this situation today where sqlx.connect took aroun 3 min each time and haven't fully figure out why but this helped...
https://determined-ai.slack.com/archives/CSKCTUKEE/p1611877804019800?thread_ts=1611864602.016200&cid=CSKCTUKEE

@mackrorysd
Copy link
Member

The protobuf changes themselves look good to me. I don't fully understand why the incompatibility errors manifest the way they do. I'm guessing they operate on generated code, so there are 3 distinct symptoms of changing the type?

Still wrapping my head around the test changes.

@hamidzr
Copy link
Member Author

hamidzr commented Jan 29, 2021

The protobuf changes themselves look good to me. I don't fully understand why the incompatibility errors manifest the way they do. I'm guessing they operate on generated code, so there are 3 distinct symptoms of changing the type?

Still wrapping my head around the test changes.

don't worry about the test changes they're mostly just setup for testing the change through WebUI e2e tests but I'm now limiting the test scope and just writing integration tests in Go.

3 symptoms

hmm

  • generated code, and the change is different in each language
  • bigger set of acceptable metrics values. will need to be ignored or handled otherwise by clients
  • API docs
  • what am i missing

@hamidzr
Copy link
Member Author

hamidzr commented Jan 29, 2021

I took out irrelevant changes.

src/determined/checkpoint/v1/checkpoint.proto:11:1:Previously present message "Metrics.ValidationMetricsEntry" was deleted from file.
src/determined/checkpoint/v1/checkpoint.proto:15:3:Field "2" on message "Metrics" changed label from "repeated" to "optional".
src/determined/checkpoint/v1/checkpoint.proto:15:3:Field "2" on message "Metrics" changed type from "determined.checkpoint.v1.Metrics.ValidationMetricsEntry" to "google.protobuf.Struct".
src/determined/trial/v1/trial.proto:42:1:Previously present message "MetricsWorkload.MetricsEntry" was deleted from file.
src/determined/trial/v1/trial.proto:61:3:Field "4" on message "MetricsWorkload" changed label from "repeated" to "optional".
src/determined/trial/v1/trial.proto:61:3:Field "4" on message "MetricsWorkload" changed type from "determined.trial.v1.MetricsWorkload.MetricsEntry" to "google.protobuf.Struct".
Copy link
Contributor

@hkang1 hkang1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The decoder changes look good with one minor improvement.

@hkang1 hkang1 removed their assignment Feb 2, 2021
@hamidzr hamidzr merged commit 48418e6 into determined-ai:master Feb 3, 2021
@hamidzr hamidzr deleted the 4893-non-scalar-metric branch February 3, 2021 01:43
determined-dsw pushed a commit that referenced this pull request Feb 3, 2021
@dannysauer dannysauer added this to the 0.14.2 milestone Feb 6, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants