Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Initializing GeneratedServiceMetadataProvider takes a while #3664

Open
1 of 2 tasks
awmcc90 opened this issue Jan 3, 2023 · 5 comments · May be fixed by #3686
Open
1 of 2 tasks

Initializing GeneratedServiceMetadataProvider takes a while #3664

awmcc90 opened this issue Jan 3, 2023 · 5 comments · May be fixed by #3686
Labels
feature-request A feature should be added or improved.

Comments

@awmcc90
Copy link

awmcc90 commented Jan 3, 2023

Describe the feature

When creating a service client (e.g. DynamoDbAsyncClient) there is an expensive one time call to create GeneratedServiceMetadataProvider which incurs the penalty of an expensive static initializer. The static initializer creates a map object with 299 entries (currently), each of which also initializes an instance that extends ServiceMetadata which also all have large static initialization overhead.

From the looks of it, every single instance of ServiceMetadata is initialized but in my use case, only DynamodbServiceMetadata was needed. And this call was only needed to acquire the client endpoint.

I profiled just the cold start initialization of GeneratedServiceMetadataProvider to confirm my suspicions and get a general idea of the overhead incurred. On my machine the operation took between 350 and 500 ms. I think this could be improved significantly by implementing lazy loading of ServiceMetadata objects.

Benchmark

@State(Scope.Benchmark)
@Warmup(iterations = 0)
@Measurement(iterations = 1)
@Fork(20)
public class GeneratedProviderBenchmark {

    @Benchmark
    @BenchmarkMode(Mode.SingleShotTime)
    @OutputTimeUnit(TimeUnit.MILLISECONDS)
    public void createObject() {
        GeneratedServiceMetadataProvider generated = new GeneratedServiceMetadataProvider();
        ServiceMetadata metadata = generated.serviceMetadata("xray");
    }

    public static void main(String... args) throws RunnerException, CommandLineOptionException {
        Options opt = new OptionsBuilder()
            .parent(new CommandLineOptions())
            .include(GeneratedProviderBenchmark.class.getSimpleName())
            .addProfiler(StackProfiler.class)
            .build();
        Collection<RunResult> run = new Runner(opt).run();
    }
}

Results:

Benchmark Mode Cnt Score Error Units
GeneratedProviderBenchmark.createObject ss 20 364.140 ± 16.565 ms/op

Use Case

The goal is to reduce the overhead as much as possible and make cold start times as performant as they can be. This is an ongoing process that a number of other tickets are related to. In fact the ticket that lead to the creation of the GeneratedServiceMetadataProvider class was done so in an effort to reduce cold start times of the DynamoDbClient.

Proposed Solution

Lazily initialize the GeneratedServiceMetadataProvider using a static factory method. I created a proof of concept to test a possible improvement to the object initialization.

public final class GeneratedServiceMetadataProvider implements ServiceMetadataProvider {
    private static final Map<String, ServiceMetadata> SERVICE_METADATA = new HashMap<>();

    private static ServiceMetadata createServiceMetadata(String endpointPrefix) {
        switch (endpointPrefix) {
            case "a4b":
                return new A4bServiceMetadata();
            ... // other cases      
            default:
                throw new IllegalStateException("Unexpected value: " + endpointPrefix);
        }
    }

    private static ServiceMetadata getServiceMetadata(String endpointPrefix) {
        if (!SERVICE_METADATA.containsKey(endpointPrefix)) {
            SERVICE_METADATA.put(endpointPrefix, createServiceMetadata(endpointPrefix));
        }
        return SERVICE_METADATA.get(endpointPrefix);
    }

    @Override
    public ServiceMetadata serviceMetadata(String endpointPrefix) {
        return getServiceMetadata(endpointPrefix);
    }
}

This produced the following results using the same benchmark as above:

Benchmark Mode Cnt Score Error Units
GeneratedProviderBenchmark.createObject ss 20 14.140 ± 0.465 ms/op

About a 25x improvement in object creation speed, not to mention the memory overhead thats saved by deferring initialization of all the other ServiceMetadata instances that aren't needed.

Other Information

As an aside, I was surprised that providing an EndpointProvider to the client builder - one which doesn't go through GeneratedServiceMetadataProvider - wasn't used at all when setting the endpoint of the client on creation. Why wouldn't the client builder use the endpoint provider, if present, if it's going to use the provider for every request anyway?

Related issues:
#748
#6

Acknowledgements

  • I may be able to implement this feature request
  • This feature might incur a breaking change

AWS Java SDK version used

2

JDK version used

1.8

Operating System and version

macOS Monterey 12.6.1

@awmcc90 awmcc90 added feature-request A feature should be added or improved. needs-triage This issue or PR still needs to be triaged. labels Jan 3, 2023
@zoewangg
Copy link
Contributor

zoewangg commented Jan 13, 2023

Hi @awmcc90 thank you for the thorough report! We recently changed to a use EndpointProvider for endpoint resolution in place of generated service metadata classes, and I think it makes sense to lazily initialize it. We will take a look at your PR shortly. Thanks again for the detailed information.

As to the endpoint provider issue, it seems like a bug, could you create a new issue for it?

@zoewangg zoewangg removed the needs-triage This issue or PR still needs to be triaged. label Jan 13, 2023
@awmcc90
Copy link
Author

awmcc90 commented Jan 13, 2023

Thanks for your reply. I'm not sure I'm the right person to make the ticket about the EndpointProvider because I'm not sure how it's supposed to work. The only use of EndpointProvider that I can find are in various ExecutionInterceptor classes. Are you saying that GeneratedServiceMetadataProvider should never be called given there is always a default EndpointProvider?

Also, while looking into this I noticed that some of the codegen results in dead code. One example of that is this line in AsyncClientClass. That line needs to be inside the if statement on line 293 or else a variable is declared that is never used. In DefaultDynamoDbAsyncClient there are 38 instances of this unused variable declared. Should I create a ticket for that too?

@zoewangg
Copy link
Contributor

Are you saying that GeneratedServiceMetadataProvider should never be called given there is always a default EndpointProvider?

Yeah, we are planning to make the change to remove the usage ofGeneratedServiceMetadataProvider from the request path.

That line needs to be inside the if statement on line 293 or else a variable is declared that is never used. In DefaultDynamoDbAsyncClient there are 38 instances of this unused variable declared. Should I create a ticket for that too?

Good catch, could you create a ticket for it?

@MikeDombo
Copy link

Beyond being slow, this also allocates ~10MB based on a test with async-profiler. If none of this data is ever retrieved, then this is very wasteful, creating a spike of memory allocation.

@awmcc90
Copy link
Author

awmcc90 commented Aug 10, 2023

Any update on this? I made a PR but that's way out of sync at this point.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature-request A feature should be added or improved.
Projects
None yet
3 participants