Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Nmt download #300

Merged
merged 4 commits into from
Feb 9, 2024
Merged

Nmt download #300

merged 4 commits into from
Feb 9, 2024

Conversation

johnml1135
Copy link
Collaborator

@johnml1135 johnml1135 commented Feb 5, 2024

Fixes #268


This change is Reviewable

Copy link
Contributor

@ddaspit ddaspit left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 14 of 14 files at r1, all commit messages.
Reviewable status: all files reviewed, 11 unresolved discussions (waiting on @johnml1135)


src/Serval.Grpc/Protos/serval/translation/v1/engine.proto line 15 at r1 (raw file):

    rpc StartBuild(StartBuildRequest) returns (google.protobuf.Empty);
    rpc CancelBuild(CancelBuildRequest) returns (google.protobuf.Empty);
    rpc GetModelPresignedUrl(GetModelPresignedUrlRequest) returns (GetModelPresignedUrlResponse);

I wouldn't use Presigned in the name. That term is specific to S3. I would call this something like GetModelDownloadUrl or just GetModelUrl.


src/Serval.Grpc/Protos/serval/translation/v1/engine.proto line 84 at r1 (raw file):

message GetModelPresignedUrlResponse {
    string PresignedUrl = 1;

These fields should be snake case.


src/Serval.Grpc/Protos/serval/translation/v1/engine.proto line 85 at r1 (raw file):

message GetModelPresignedUrlResponse {
    string PresignedUrl = 1;
    int32 BuildRevision = 2;

This should be model_revision.


src/Serval.Grpc/Protos/serval/translation/v1/engine.proto line 86 at r1 (raw file):

    string PresignedUrl = 1;
    int32 BuildRevision = 2;
    string UrlExpirationTime = 3;

I believe that Protobuf has Timestamp and DateTime types.


src/Serval.Translation/Contracts/ModelPresignedUrlDto.cs line 7 at r1 (raw file):

    public string PresignedUrl { get; set; } = default!;
    public string BuildRevision { get; set; } = default!;
    public string UrlExpirationTime { get; set; } = default!;

Just ExpirationTime should be sufficient. Also, what is this? It is a specific date/time. If so, you should use the appropriate type, such as DateTime.


src/Serval.Translation/Contracts/TranslationEngineConfigDto.cs line 34 at r1 (raw file):

    /// </summary>
    [JsonProperty(Required = Required.Always)]
    public bool IsModelRetrievable { get; set; } = false;

We want to use the same property to enable live inferencing, so the name should be more generalized. Maybe IsModelPersisted or SaveModel.


src/Serval.Translation/Controllers/TranslationEnginesController.cs line 906 at r1 (raw file):

    /// <param name="id">The translation engine id</param>
    /// <param name="cancellationToken"></param>
    /// <response code="200">The build job was cancelled successfully.</response>

This is incorrect.


src/Serval.Translation/Controllers/TranslationEnginesController.cs line 920 at r1 (raw file):

    [ProducesResponseType(typeof(void), StatusCodes.Status405MethodNotAllowed)]
    [ProducesResponseType(typeof(void), StatusCodes.Status503ServiceUnavailable)]
    public async Task<ActionResult<ModelPresignedUrlDto>> DownloadModelAsync(

This should be GetModelDownloadUrlAsync or GetModelUrlAsync, since it isn't actually downloading the model.


src/Serval.Translation/Controllers/TranslationEnginesController.cs line 1032 at r1 (raw file):

        try
        {
            build.Options = Newtonsoft.Json.JsonConvert.DeserializeObject<IDictionary<string, object>>(

Is there any way to get this to work using System.Text.Json instead?


src/Serval.Translation/Services/EngineService.cs line 281 at r1 (raw file):

            cancellationToken: cancellationToken
        );
        return new ModelPresignedUrlDto

DTOs shouldn't be used in the service layer. The controllers are responsible for mapping between the service layer models and DTOs.


tests/Serval.E2ETests/ServalClientHelper.cs line 304 at r1 (raw file):

    public async ValueTask DisposeAsync()
    {
        await ClearEnginesAsync();

Why did you remove this? It is a good idea to clean up after every test.

@johnml1135
Copy link
Collaborator Author

src/Serval.Grpc/Protos/serval/translation/v1/engine.proto line 15 at r1 (raw file):

Previously, ddaspit (Damien Daspit) wrote…

I wouldn't use Presigned in the name. That term is specific to S3. I would call this something like GetModelDownloadUrl or just GetModelUrl.

GetModelDownloadUrl - explicit that it can be used for external clients.

@johnml1135
Copy link
Collaborator Author

src/Serval.Grpc/Protos/serval/translation/v1/engine.proto line 86 at r1 (raw file):

Previously, ddaspit (Damien Daspit) wrote…

I believe that Protobuf has Timestamp and DateTime types.

Done.

@johnml1135
Copy link
Collaborator Author

src/Serval.Translation/Contracts/ModelPresignedUrlDto.cs line 7 at r1 (raw file):

Previously, ddaspit (Damien Daspit) wrote…

Just ExpirationTime should be sufficient. Also, what is this? It is a specific date/time. If so, you should use the appropriate type, such as DateTime.

Changed to ExpiresAt and a DateTime.

@johnml1135
Copy link
Collaborator Author

src/Serval.Translation/Contracts/TranslationEngineConfigDto.cs line 34 at r1 (raw file):

Previously, ddaspit (Damien Daspit) wrote…

We want to use the same property to enable live inferencing, so the name should be more generalized. Maybe IsModelPersisted or SaveModel.

I like ``IsModelPersisted`.

@johnml1135
Copy link
Collaborator Author

src/Serval.Translation/Controllers/TranslationEnginesController.cs line 906 at r1 (raw file):

Previously, ddaspit (Damien Daspit) wrote…

This is incorrect.

Done.

@johnml1135
Copy link
Collaborator Author

src/Serval.Translation/Controllers/TranslationEnginesController.cs line 920 at r1 (raw file):

Previously, ddaspit (Damien Daspit) wrote…

This should be GetModelDownloadUrlAsync or GetModelUrlAsync, since it isn't actually downloading the model.

and what about changing the endpoint name to model-url?

@johnml1135
Copy link
Collaborator Author

src/Serval.Translation/Controllers/TranslationEnginesController.cs line 1032 at r1 (raw file):

Previously, ddaspit (Damien Daspit) wrote…

Is there any way to get this to work using System.Text.Json instead?

I couldn't find a way - I had to rip a custom DictionaryJsonConverter.cs as well. It's because nested options don't currently work. It's not needed for this anymore (because we aren't using buildOptions, but it is needed. Is there a reason you don't want to use Newtonsoft? It's used in quite a few places already.

Copy link
Contributor

@ddaspit ddaspit left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: all files reviewed, 11 unresolved discussions (waiting on @johnml1135)


src/Serval.Translation/Controllers/TranslationEnginesController.cs line 920 at r1 (raw file):

Previously, johnml1135 (John Lambert) wrote…

and what about changing the endpoint name to model-url?

We should just be consistent everywhere. If we use GetModelDownloadUrlAsync, then we should call it model-download-url. If we use GetModelUrlAsync, then we should call it model-url. I would lean towards model-download-url, since it is more explicit. Also, this should be a GET endopoint.


src/Serval.Translation/Controllers/TranslationEnginesController.cs line 1032 at r1 (raw file):

Previously, johnml1135 (John Lambert) wrote…

I couldn't find a way - I had to rip a custom DictionaryJsonConverter.cs as well. It's because nested options don't currently work. It's not needed for this anymore (because we aren't using buildOptions, but it is needed. Is there a reason you don't want to use Newtonsoft? It's used in quite a few places already.

Instead of using two JSON libraries, I would like to move us to only using the one built-in to .NET. Using two JSON libraries causes confusion.

@johnml1135
Copy link
Collaborator Author

src/Serval.Translation/Controllers/TranslationEnginesController.cs line 920 at r1 (raw file):

Previously, ddaspit (Damien Daspit) wrote…

We should just be consistent everywhere. If we use GetModelDownloadUrlAsync, then we should call it model-download-url. If we use GetModelUrlAsync, then we should call it model-url. I would lean towards model-download-url, since it is more explicit. Also, this should be a GET endopoint.

model-download-url

@johnml1135
Copy link
Collaborator Author

src/Serval.Translation/Controllers/TranslationEnginesController.cs line 1032 at r1 (raw file):

Previously, ddaspit (Damien Daspit) wrote…

Instead of using two JSON libraries, I would like to move us to only using the one built-in to .NET. Using two JSON libraries causes confusion.

What do you want to do now? make an issue and try to clean it up later?

@johnml1135
Copy link
Collaborator Author

src/Serval.Translation/Services/EngineService.cs line 281 at r1 (raw file):

Previously, ddaspit (Damien Daspit) wrote…

DTOs shouldn't be used in the service layer. The controllers are responsible for mapping between the service layer models and DTOs.

Done.

@johnml1135
Copy link
Collaborator Author

tests/Serval.E2ETests/ServalClientHelper.cs line 304 at r1 (raw file):

Previously, ddaspit (Damien Daspit) wrote…

Why did you remove this? It is a good idea to clean up after every test.

I sometimes want to interact with the model and data after the test has completed via swagger. Having the data persist after the test is done accomplishes this.

@johnml1135
Copy link
Collaborator Author

src/Serval.Grpc/Protos/serval/translation/v1/engine.proto line 84 at r1 (raw file):

Previously, ddaspit (Damien Daspit) wrote…

These fields should be snake case.

Done.

@johnml1135
Copy link
Collaborator Author

src/Serval.Grpc/Protos/serval/translation/v1/engine.proto line 85 at r1 (raw file):

Previously, ddaspit (Damien Daspit) wrote…

This should be model_revision.

Done.

Copy link
Contributor

@ddaspit ddaspit left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 14 of 14 files at r2, all commit messages.
Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @johnml1135)


src/Serval.Translation/Contracts/ModelDownloadUrlDto.cs line 6 at r2 (raw file):

{
    public string Url { get; set; } = default!;
    public string ModelRevision { get; set; } = default!;

This should be an int, so that it is consistent with the TranslationEngineDto.


src/Serval.Translation/Controllers/TranslationEnginesController.cs line 1032 at r1 (raw file):

Previously, johnml1135 (John Lambert) wrote…

What do you want to do now? make an issue and try to clean it up later?

We should be able to update the ObjectToInferredTypesConverter to support objects and arrays. You can either revert this and I can create a separate PR to specifically fix this issue or I can fix it in this PR.


tests/Serval.E2ETests/ServalClientHelper.cs line 304 at r1 (raw file):

Previously, johnml1135 (John Lambert) wrote…

I sometimes want to interact with the model and data after the test has completed via swagger. Having the data persist after the test is done accomplishes this.

Is that for debugging purposes?

@codecov-commenter
Copy link

codecov-commenter commented Feb 6, 2024

Codecov Report

Attention: 106 lines in your changes are missing coverage. Please review.

Comparison is base (28001cf) 69.95% compared to head (119e88d) 68.91%.

Files Patch % Lines
src/Serval.Client/Client.g.cs 2.73% 71 Missing ⚠️
src/Serval.Translation/Services/EngineService.cs 11.11% 16 Missing ⚠️
...lation/Controllers/TranslationEnginesController.cs 23.52% 13 Missing ⚠️
...erval.Translation/Contracts/ModelDownloadUrlDto.cs 0.00% 3 Missing ⚠️
src/Serval.Translation/Models/ModelDownloadUrl.cs 0.00% 3 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #300      +/-   ##
==========================================
- Coverage   69.95%   68.91%   -1.04%     
==========================================
  Files         148      150       +2     
  Lines        6810     6920     +110     
  Branches     1071     1093      +22     
==========================================
+ Hits         4764     4769       +5     
- Misses       1612     1717     +105     
  Partials      434      434              

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@johnml1135
Copy link
Collaborator Author

src/Serval.Translation/Contracts/ModelDownloadUrlDto.cs line 6 at r2 (raw file):

Previously, ddaspit (Damien Daspit) wrote…

This should be an int, so that it is consistent with the TranslationEngineDto.

ok

@johnml1135
Copy link
Collaborator Author

src/Serval.Translation/Controllers/TranslationEnginesController.cs line 1032 at r1 (raw file):

Previously, ddaspit (Damien Daspit) wrote…

We should be able to update the ObjectToInferredTypesConverter to support objects and arrays. You can either revert this and I can create a separate PR to specifically fix this issue or I can fix it in this PR.

I'll take a shot at it - and try to test it.

@johnml1135
Copy link
Collaborator Author

tests/Serval.E2ETests/ServalClientHelper.cs line 304 at r1 (raw file):

Previously, ddaspit (Damien Daspit) wrote…

Is that for debugging purposes?

Yes, for debugging. Should we gate it with isDevelopment?

Copy link
Contributor

@ddaspit ddaspit left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 4 of 4 files at r3, all commit messages.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @johnml1135)


tests/Serval.E2ETests/ServalClientHelper.cs line 304 at r1 (raw file):

Previously, johnml1135 (John Lambert) wrote…

Yes, for debugging. Should we gate it with isDevelopment?

Yeah, that's a good idea.

@johnml1135
Copy link
Collaborator Author

tests/Serval.E2ETests/ServalClientHelper.cs line 304 at r1 (raw file):

Previously, ddaspit (Damien Daspit) wrote…

Yeah, that's a good idea.

What do you think? These are E2E tests - the environment is really about the target. Moreover, I use "Staging" for all of my development work with docker compose.

@johnml1135
Copy link
Collaborator Author

tests/Serval.E2ETests/ServalClientHelper.cs line 308 at r4 (raw file):

    public async ValueTask DisposeAsync()
    {
        DeploymentInfo? result = await StatusClient.GetDeploymentInfoAsync();

THis is not good - I will redo with isdevelopment - that is what we want anyway.

@johnml1135
Copy link
Collaborator Author

tests/Serval.E2ETests/ServalClientHelper.cs line 308 at r4 (raw file):

Previously, johnml1135 (John Lambert) wrote…

THis is not good - I will redo with isdevelopment - that is what we want anyway.

This is the best I can do. Dependency injection in testing doesn't work well.

Copy link
Contributor

@ddaspit ddaspit left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 2 of 2 files at r5, all commit messages.
Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @johnml1135)


tests/Serval.E2ETests/ServalClientHelper.cs line 308 at r4 (raw file):

Previously, johnml1135 (John Lambert) wrote…

This is the best I can do. Dependency injection in testing doesn't work well.

Yeah, I think that's the best we can do.

@Enkidu93
Copy link
Collaborator

Enkidu93 commented Feb 7, 2024

I haven't been following this PR very closely along the way. I would have expected changes to EngineServiceTests.cs as well as (judging from the codecov report) a parsing failure case test in the TranslationEngineTests.cs. Is there a reason that's not the case? Are we just pushing that down the road? Or was it deemed minor enough?

@johnml1135
Copy link
Collaborator Author

src/Serval.Translation/Controllers/TranslationEnginesController.cs line 1032 at r1 (raw file):

Previously, johnml1135 (John Lambert) wrote…

I'll take a shot at it - and try to test it.

Done

@johnml1135
Copy link
Collaborator Author

GetModelDownloadUrl is tested in E2E tests. I'll add a bad parsing case.

Copy link
Contributor

@ddaspit ddaspit left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 2 of 7 files at r6, 17 of 17 files at r8, 1 of 1 files at r9, all commit messages.
Reviewable status: :shipit: complete! all files reviewed, all discussions resolved (waiting on @johnml1135)

Copy link
Contributor

@ddaspit ddaspit left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @johnml1135)


src/Serval.Translation/Contracts/TranslationEngineConfigDto.cs line 34 at r9 (raw file):

    /// </summary>
    [JsonProperty(Required = Required.Always)]
    public bool IsModelPersisted { get; set; } = false;

This should be optional, so that each engine can determine the default.

@johnml1135
Copy link
Collaborator Author

src/Serval.Translation/Contracts/TranslationEngineConfigDto.cs line 34 at r9 (raw file):

Previously, ddaspit (Damien Daspit) wrote…

This should be optional, so that each engine can determine the default.

Everywhere? Optional means bool?? Is there another parameter that follows a similar pattern?

Copy link
Collaborator Author

@johnml1135 johnml1135 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, I don't know if there is an easy way to add a bad Engine service test...

Reviewable status: 17 of 22 files reviewed, 1 unresolved discussion (waiting on @ddaspit)

Copy link
Contributor

@ddaspit ddaspit left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 5 of 5 files at r10, all commit messages.
Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @johnml1135)


src/Serval.Translation/Contracts/ModelDownloadUrlDto.cs line 7 at r10 (raw file):

    public string Url { get; set; } = default!;
    public int ModelRevision { get; set; } = default!;
    public string ExpiresAt { get; set; } = default!;

We are already using DateTime in the TranslationBuildDto. I don't want to return two different DateTime formats.


src/Serval.Translation/Contracts/TranslationEngineConfigDto.cs line 34 at r9 (raw file):

Previously, johnml1135 (John Lambert) wrote…

Everywhere? Optional means bool?? Is there another parameter that follows a similar pattern?

Yes, it should probably be optional everywhere in Serval. You make it optional by making it nullable. There are many other optional parameters.

@johnml1135
Copy link
Collaborator Author

Added some more testing.

@johnml1135
Copy link
Collaborator Author

src/Serval.Translation/Contracts/ModelDownloadUrlDto.cs line 7 at r10 (raw file):

Previously, ddaspit (Damien Daspit) wrote…

We are already using DateTime in the TranslationBuildDto. I don't want to return two different DateTime formats.

The DateTimeOffset thing was confusing me: https://stackoverflow.com/questions/72261151/does-nswag-convert-datetime-to-datetimeoffset-in-the-timezone-the-server-is-in. I'll convert back to DateTime.

add E2E test
Add expiration time
remove files at end of test based off of Development env.  But just look at the env variable - dependency injection in testing is weird.
Switch to normal JSON converter
Add tesing
Add echo download response
Clarifiy documentation
@johnml1135
Copy link
Collaborator Author

src/Serval.Translation/Contracts/TranslationEngineConfigDto.cs line 34 at r9 (raw file):

Previously, ddaspit (Damien Daspit) wrote…

Yes, it should probably be optional everywhere in Serval. You make it optional by making it nullable. There are many other optional parameters.

Done

Copy link
Contributor

@ddaspit ddaspit left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 9 of 9 files at r11, all commit messages.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @johnml1135)


src/Serval.Translation/Services/EngineService.cs line 140 at r11 (raw file):

            if (engine.Name is not null)
                request.EngineName = engine.Name;
            await client.CreateAsync(request, cancellationToken: cancellationToken);

Do you think it makes sense to return the default value for IsModelPersisted from the engine so we can set it on the model? Or should we leave it as null if it isn't set.

@johnml1135
Copy link
Collaborator Author

src/Serval.Translation/Services/EngineService.cs line 140 at r11 (raw file):

Previously, ddaspit (Damien Daspit) wrote…

Do you think it makes sense to return the default value for IsModelPersisted from the engine so we can set it on the model? Or should we leave it as null if it isn't set.

keep it nullable for backwards compatibility?

Copy link
Contributor

@ddaspit ddaspit left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 3 files at r12, 7 of 7 files at r13, all commit messages.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @johnml1135)


src/Serval.Translation/Services/EngineService.cs line 140 at r13 (raw file):

        // IsModelPersisted may be updated by the engine with the respective default.
        engine.IsModelPersisted = createResponse.IsModelPersisted;
        await Entities.InsertAsync(engine, cancellationToken);

I was purposefully inserting the engine model in Mongo before creating the engine. There are a couple of reasons:

  1. I don't want to get into a state where Machine has created an engine that Serval doesn't know about. It will be difficult to clean up properly. This can occur if you create the Machine engine before you insert the engine model in Mongo.
  2. It is possible (although I don't think it currently happens) for the Machine engine to callback to Serval referencing an engine that hasn't been created yet.

After the try...catch, you should update the IsModelPersisted property.

if (engine.IsModelPersisted is null)
    await Entities.UpdateAsync(engine, u => u.Set(e => e.IsModelPersisted, createResponse.IsModelPersisted, cancellationToken: cancellationToken));

You should also return the engine model, so that the updated model can be returned as the response.

Copy link
Collaborator Author

@johnml1135 johnml1135 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @ddaspit)


src/Serval.Translation/Services/EngineService.cs line 140 at r13 (raw file):

Previously, ddaspit (Damien Daspit) wrote…

I was purposefully inserting the engine model in Mongo before creating the engine. There are a couple of reasons:

  1. I don't want to get into a state where Machine has created an engine that Serval doesn't know about. It will be difficult to clean up properly. This can occur if you create the Machine engine before you insert the engine model in Mongo.
  2. It is possible (although I don't think it currently happens) for the Machine engine to callback to Serval referencing an engine that hasn't been created yet.

After the try...catch, you should update the IsModelPersisted property.

if (engine.IsModelPersisted is null)
    await Entities.UpdateAsync(engine, u => u.Set(e => e.IsModelPersisted, createResponse.IsModelPersisted, cancellationToken: cancellationToken));

You should also return the engine model, so that the updated model can be returned as the response.

Done.

Copy link
Contributor

@ddaspit ddaspit left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 7 of 7 files at r14, all commit messages.
Reviewable status: :shipit: complete! all files reviewed, all discussions resolved (waiting on @johnml1135)

@johnml1135 johnml1135 merged commit 8db70f4 into main Feb 9, 2024
2 checks passed
@ddaspit ddaspit deleted the nmt_download branch February 9, 2024 21:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Download NMT Models
4 participants