Provide support for caching GRPC method response #7945

Closed
makdharma opened this issue Sep 1, 2016 · 37 comments
@makdharma
Contributor

No description provided.

@makdharma makdharma self-assigned this Sep 1, 2016
@makdharma
Contributor Author

makdharma commented Sep 1, 2016

gRPC Caching Design

Author: @makdharma

State: draft


Background

Caching of resources is used in HTTP to improve the performance of websites and web apps. gRPC is an alternative to HTTP/REST with many clear advantages. However, the lack of caching support in gRPC hinders adoption where caching matters for performance. This document evaluates options and suggests an approach to mitigate this disadvantage.

Design and Implementation Goals

Implementation Goals

  1. Provide a caching solution for the last mile of compute (mobile, desktop, web)
  2. Reduce load on the server by using caching where appropriate
  3. Improve user experience by reducing the latency of fetch requests

Design goals

We would like the design choice to accommodate the following future goals. The actual timeline to implement these will be driven by customer requirements.

  1. Offline operation
  2. Implementation of on-device (private) cache
  3. Caching in the data center between VMs
  4. Streaming RPC support

Potential solutions

Terminology

A gRPC request originates from the client application. The client application may maintain an on-device private cache. The request traverses various proxies and reaches the reverse proxy (GFE). The reverse proxy sends the request to the application server, which sends a response. The response may be cached by any and all proxies on the response path.

Option 1 - HTTP Cache

gRPC is based on HTTP/2, which already supports caching. Use HTTP’s caching mechanism by issuing cacheable gRPC requests with the GET verb instead of POST. A typical interaction follows these steps (see the sketch after this list):

  1. The client application starts a gRPC call
  2. The gRPC library uses GET instead of POST
    1. The request payload is base64-encoded and sent as a query string
  3. A proxy looks for the response in the cache.
    1. If found in the cache, the response is sent by the proxy directly
    2. Otherwise, the request is forwarded to the server and the response is cached
  4. The gRPC library returns the response to the application.
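
A minimal sketch of steps 2-2.1 in Go (the thread's later code is Go), assuming the google.golang.org/protobuf/proto package. The query parameter name borrows grpc-encoded-request from the wire-protocol grammar at the end of this design; the grammar leaves the exact transport of the encoded payload open, so treat the parameter name as an assumption:

import (
	"encoding/base64"
	"fmt"

	"google.golang.org/protobuf/proto"
)

// cacheableGetURL shows how a client library might turn a unary request into
// a cacheable GET: serialize the request, base64-encode it, and carry it in
// the query string instead of the POST body.
func cacheableGetURL(baseURL, fullMethod string, req proto.Message) (string, error) {
	payload, err := proto.Marshal(req)
	if err != nil {
		return "", err
	}
	// Use the URL-safe alphabet so the payload can live in a query string.
	encoded := base64.RawURLEncoding.EncodeToString(payload)
	return fmt.Sprintf("%s%s?grpc-encoded-request=%s", baseURL, fullMethod, encoded), nil
}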

Option 2 - Native Cache

Implement a native cache in the proxy that understands the gRPC protocol. Add a component to the proxy that can parse gRPC requests and generate native gRPC responses. A typical interaction follows these steps (see the sketch after this list):

  1. The client application starts a gRPC call
  2. The gRPC client library computes a cache lookup key and sends it as part of an HTTP header
  3. A proxy parses the request. If a cache lookup key exists, the proxy uses it to search the cache.
    1. If found in the cache, the response is sent by the proxy directly
    2. Otherwise, the request is forwarded to the server and the response is cached
  4. The gRPC library returns the response to the application.
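
A minimal sketch of step 2 under the same Go assumptions. Hashing the method name together with the serialized request is an illustrative choice, since the design fixes neither the key derivation nor the header name:

import (
	"crypto/sha256"
	"encoding/hex"

	"google.golang.org/protobuf/proto"
)

// cacheLookupKey derives a deterministic key so that identical requests map
// to the same cache entry. The proxy would read this value from an HTTP
// header whose name the design leaves open.
func cacheLookupKey(fullMethod string, req proto.Message) (string, error) {
	payload, err := proto.Marshal(req)
	if err != nil {
		return "", err
	}
	h := sha256.New()
	h.Write([]byte(fullMethod)) // include the method so keys don't collide across RPCs
	h.Write(payload)
	return hex.EncodeToString(h.Sum(nil)), nil
}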

Other options investigated

  1. GET with payload - Possible but non-standard. Not supported by GFE.
  2. SEARCH/FETCH/REPORT verbs - These verbs are not widely adopted. They might become a feasible option in the future; when that happens, we can deprecate GET-based caching.
  3. Cacheable POST - It is possible to cache the result of a POST request so that it can subsequently be accessed by a GET request with the same parameters as the POST. This looks appealing but involves a complex design for choosing between POST and GET on the client.

Pros and Cons

HTTP Cache Pros:

  1. HTTP standard compliant
  2. Works everywhere (No Google specific changes)
  3. Make use of mature and well understood technology

HTTP Cache Cons:

  1. Base64 encoding increases request size, leading to inefficient use of upstream bandwidth.
  2. Proxies impose a limit on URI size. GET requests exceeding the URI size limit will fail. The limit is non-standard and depends on proxy configuration; for example, nginx defaults to 8K.

Native gRPC Cache Pros:

  1. No limitation on the URI size
  2. Efficient use of bandwidth, because no base64 encoding is needed.

Native gRPC Cache Cons:

  1. A solution similar to, but separate from, HTTP caching.
  2. Needs to be implemented for all major proxies.

Given open-source adoption as a stated goal, the HTTP caching based solution seems like the right choice. The two cons - size bloat and the URI size limit - are problems in theory but not in practice. The typical use case of caching in gRPC is fetching static resources. Such requests don’t have many parameters and are therefore unlikely to hit the URI size limit.

The rest of this document explores the HTTP caching solution.

Design Goals

The choice of HTTP Caching works well for the design goals.

  1. Offline operation: An on-device cache is the enabler for offline operation.
  2. Implementation of an on-device (private) cache: Cache control headers have built-in support for both on-device (private) and proxy (public) caches. Implementation of an on-device cache is out of scope for this document.
  3. Caching in the data center between VMs: There is nothing special about this use case compared with a standard client-server caching use case, except perhaps the support for streaming RPC caching.
  4. Streaming RPC caching support: Works out of the box for half-duplex RPCs. The typical use case is server streaming for a large file fetch. A proxy can cache all responses in the stream and send them as one response when it sees a GET request.

Caching Considerations

Cacheable methods

Only the service author knows

  1. Whether to cache the response at all,
  2. For how long the response should be cached

If a method is deemed safe and idempotent, the response can be cached. The proposal is as follows:

  1. The service author marks methods as cacheable using a proto annotation
  2. The gRPC client will use GET instead of POST for cacheable gRPC requests
  3. Expose an API in the gRPC server to set the cache headers at runtime
  4. Expose an API in the client to turn cacheable requests on/off and to set the acceptable cache age limit

RPC failure and retry mechanism

An RPC using GET can fail in ways that differ from POST, listed below. The reasons for failure may be transient or non-transient (such as configuration problems):

  1. Transient - Cache warnings that are treated as errors
  2. Non-transient - Request exceeds maximum size
  3. Non-transient - gRPC server does not support caching

How to react to failures? Options:

  1. Automatic retry with POST (disable caching)
  2. Let the application decide. The application might retry the RPC for transient failures, or disable caching for non-transient failures. A binary-wide API is provided to disable caching.

Proposal: Go with Option 2 in the interest of giving maximum visibility and control to the application.

API Additions

Implementation details TBD.

Server side API addition

  1. Function to set the cache control header. If the server doesn’t call this API explicitly, caching is disabled for the method by setting appropriate cache control headers, for example Cache-Control: no-cache, no-store, must-revalidate. (A sketch follows below.)
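
A hypothetical grpc-go sketch of this server-side API. grpc.SetHeader and metadata.Pairs are existing grpc-go calls; the handler, the pb message types, the lookupLogo helper, and the decision to carry Cache-Control this way are all assumptions of the sketch:

import (
	"context"

	"google.golang.org/grpc"
	"google.golang.org/grpc/metadata"
)

func (s *logoServer) GetLogo(ctx context.Context, req *pb.LogoRequest) (*pb.LogoResponse, error) {
	// Opt in to caching: any cache may store the response for up to an hour,
	// must revalidate it once stale, and must not transform the body.
	if err := grpc.SetHeader(ctx, metadata.Pairs(
		"cache-control", "public, max-age=3600, must-revalidate, no-transform",
	)); err != nil {
		return nil, err
	}
	return s.lookupLogo(ctx, req) // hypothetical backing lookup
}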

Client side API addition

  1. Function to disable caching completely for all methods. The actual API signature will depend on the wrapped-language implementation.

Cache validators

Cache validators such as ETag (entity tag) or the "if-modified-since" and "last-modified" header fields are useful for validating a locally stored response. Since there is no on-device cache, cache validators are not used by the gRPC client. It is, however, possible that a caching proxy along the request path may use validators.

Cache Control Headers

Cache-Control headers set by client

  1. max-age: Indicates that the client is willing to accept a response whose age is no greater than the specified time in seconds. The value for max-age comes from the application and is set per request, as sketched below.
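
A hypothetical client-side sketch in grpc-go; metadata.AppendToOutgoingContext is a real grpc-go call, but carrying the directive as outgoing metadata (and the stub call shown) are assumptions:

import (
	"context"

	"google.golang.org/grpc/metadata"
)

// Accept a cached response at most 60 seconds old for this one call.
ctx = metadata.AppendToOutgoingContext(ctx, "cache-control", "max-age=60")
reply, err := client.SayHello(ctx, &pb.HelloRequest{Name: "world"})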

Cache-Control headers set by server

  1. public: Allows the response to be stored by any cache, including shared (proxy) caches.
  2. must-revalidate: When the must-revalidate directive is present in a response received by a cache, that cache MUST NOT use the entry after it becomes stale to respond to a subsequent request without first revalidating it with the origin server. (I.e., the cache MUST do an end-to-end revalidation every time, if, based solely on the origin server's Expires or max-age value, the cached response is stale.) We do not want third-party caches along the way to make their own judgment about the freshness of a cache entry. The server gets the final word in determining when a cache entry needs to be thrown out.
  3. no-transform: If a message includes the no-transform directive, an intermediate cache or proxy MUST NOT change those headers that are listed in RFC 2616 section 13.5.2 as being subject to the no-transform directive. This implies that the cache or proxy MUST NOT change any aspect of the entity-body that is specified by these headers, including the value of the entity-body itself. This is just to be on the safe side.
  4. no-cache: This will disable caching of a response.

Cache Warnings and Errors

Given the strict requirements set by the client and server cache control headers, we do not expect to receive warnings in the normal course of operation. Any warning is treated as an error and the RPC fails.

Security and Privacy

Open question: Should caching be allowed for authenticated request responses? Proposed answer: No, to avoid sensitive information being cached by mistake in public caches. It could be a great follow-on feature as the solution matures.

Compatibility

Newer server (with caching support) and older client (no caching support)

If the client is old and does not support caching, nothing changes from the client’s perspective. The client never originates a GET request, and the server never responds with cache control headers.

Older server (no caching support) and newer client (with caching support)

The client will generate GET requests for cacheable methods. The server will respond with an error, and the application will disable caching.

Proposed Method Annotations

ENUM_IDEMPOTENT and ENUM_IDEMPOTENT_AND_SAFE are enumerated values.

  1. Idempotency
    1. Syntax: option method_flags = ENUM_IDEMPOTENT;
    2. Description: Any method that can be called more than once with the same result.
  2. Safe
    1. Syntax: option method_flags = ENUM_IDEMPOTENT_AND_SAFE;
    2. Description: Any idempotent method without side effects.

Expected behavior

| Safe | Idempotent | gRPC behavior | Use case |
| --- | --- | --- | --- |
| Yes | - | Client: may use the GET verb. Server: enable caching of the response by adding cache control headers to the response. | Use this annotation to designate a method cacheable. |
| No | Yes | Client: may use the PUT verb. Server: no change in behavior; a PUT request is processed just like a POST request. | Use this annotation to enable 0-RTT (QUIC) on clients using the Cronet stack. |
| No | No | Client: no change. Server: no change. | This is the default gRPC implementation. |

Example

service Greeter {
  // Sends a greeting
  rpc SayHello (HelloRequest) returns (HelloReply) {
    option method_flags = ENUM_IDEMPOTENT_AND_SAFE;
  }
}

GRPC Wire Protocol Change

  • Request → (Request-Headers *Delimited-Message / Cacheable-Request-Headers) EOS
  • Cacheable-Request-Headers → Request-Headers *Encoded-Message
  • Call-Definition → Method Scheme Path TE [Authority] [Timeout] [Content-Type] [Message-Type] [Message-Encoding] [Message-Accept-Encoding] [User-Agent] [Cache-Control]
  • Method → ":method (POST/GET)"
  • Cache-Control → "max-age" / ["public" "must-revalidate" "no-transform"]
  • Encoded-Request → "grpc-encoded-request" {base64 encoded value}
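
For illustration, one plausible rendering of a cacheable request under this grammar; the exact header layout is an assumption, since the sketch above does not fix it:

HEADERS (flags = END_STREAM, END_HEADERS)
:method = GET
:scheme = http
:path = /helloworld.Greeter/SayHello
te = trailers
content-type = application/grpc
cache-control = max-age=60
grpc-encoded-request = {base64 encoded HelloRequest}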

@rvolosatovs

Any progress on this?

@makdharma
Contributor Author

Hi - The code is mostly checked in and working. What is your use case? What languages will you be using for server and client?

@rvolosatovs

We would use it in https://github.com/TheThingsNetwork/ttn, an IoT network.
Both client and server use Go.
cc @htdvisser

@makdharma
Contributor Author

Caching works for the Java and C stacks. Go support is not there yet, unfortunately.

@rmichela

When will cache support become available from protoc-generated gRPC stubs?

@ejona86
Member

ejona86 commented Aug 17, 2017

It looks like protoc got support with a slightly different syntax than mentioned above:
https://github.com/google/protobuf/blob/9e745f7/src/google/protobuf/descriptor.proto#L630

service Greeter {
  // Sends a greeting
  rpc SayHello (HelloRequest) returns (HelloReply) {
    option idempotency_level = NO_SIDE_EFFECTS;
  }
}

@dfawley
Member

dfawley commented Oct 17, 2017

Should this be added as a gRFC in the proposal repo?

@m-sasha

m-sasha commented Jan 22, 2018

Is there an update on this? Can we start using it if we have Go servers and Android/iOS clients?

@ericgribkoff
Contributor

@m-sasha This feature is experimental but operational for Android and iOS clients (see our interop test clients for examples of configuring RPCs to use GET). As far as I know, responding to GET requests is currently only supported by C++ on the server side. We will be converting this into a gRFC in the next couple weeks, at which point we will be better able to estimate when support will be available for servers in the other languages.

@ejona86
Member

ejona86 commented Jan 22, 2018

@m-sasha, to be clear, "experimental" means we are still free to do whatever with it, including changing the protocol. You can play with it and provide feedback, but it shouldn't get anywhere near production.

@m-sasha

m-sasha commented Jan 23, 2018

I understand. Thanks!

@yazsh

yazsh commented Aug 18, 2018

Has there been any update on this?

In particular for the golang client implementation? My team would be very interested in using it.

@phemmer

phemmer commented Aug 24, 2018

Just to throw in another case for consideration: we're considering whether it would be possible to use HTTP/2 server push for gRPC services. The idea is the typical HTTP/2 server push, where we want to preempt the client asking for related resources.
This is therefore very tightly related to the caching issue, as the pushed message sits in a "push cache" on the client and needs to be uniquely identified by its URL. Also, in the case of cross-service server pushes, we might identify the related resource in a Link header and want the intermediary proxy to transform this Link header into a server push (if the other service identified in the link sits behind the proxy). The URL thus needs to uniquely identify the resource and instruct how to build the gRPC request (assuming the proxy has a copy of the proto spec).

The fortunate thing is that the "Option 1 - HTTP Cache" solution would address this just fine, and this appears to be the solution being pursued.

So I don't think any action is needed here, just wanted to mention the use case in case it does have any impact to the implementation.

@phemmer

phemmer commented Aug 31, 2018

Oh, and for this:

Open question: Should caching be allowed for authenticated request responses? Proposed answer: No. This is to avoid sensitive information cached by mistake in public caches. It can be a great follow on feature though as the solution matures.

This is the purpose of the private and proxy-revalidate Cache-Control directives.
These would allow the response to be cached locally on the client, but not by an intermediary (shared) proxy.

@ejona86
Member

ejona86 commented Aug 31, 2018

private is more appropriate than proxy-revalidate, because we know we don't really optimize the revalidate case. So it would be better for the proxy to avoid storing the value in the cache at all. With proxy-revalidate the proxy still caches the response.
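
A small sketch of the contrast, reusing the grpc.SetHeader pattern from the design above; whether gRPC servers would set these directives this way remains an open implementation question:

// private: only the end client may store the response; shared proxies must not.
grpc.SetHeader(ctx, metadata.Pairs("cache-control", "private, max-age=300"))

// proxy-revalidate: shared proxies may still store the response but must
// revalidate it with the origin server once it becomes stale.
grpc.SetHeader(ctx, metadata.Pairs("cache-control", "proxy-revalidate, max-age=300"))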

@SteveDunn

Are we any closer to getting this for C#?

@mikaelstaldal

Is there any Java implementation of this?

@juansalvatella

Is there any update on the golang implementation?

@dfawley
Member

dfawley commented Nov 26, 2018

@juansalvatella the Go team has no current plans to implement this proposal. We are not opposed to it (although I would like to see it as a gRFC or added to the spec first), but we have many higher priority things to work on at this time.

@stale

stale bot commented Sep 4, 2019

This issue/PR has been automatically marked as stale because it has not had any update (including commits, comments, labels, milestones, etc) for 180 days. It will be closed automatically if no further update occurs in 1 day. Thank you for your contributions!

@stale stale bot closed this as completed Sep 5, 2019
@csdexter

Please do not let this die off into oblivion. Has a formal gRFC been written for this? Has the implementation progressed any further past the experimental stage? It's a good feature, please don't throw it away.

@ejona86 ejona86 reopened this Sep 16, 2019
@stale stale bot removed the disposition/stale label Sep 16, 2019
@codemedian

Would this approach also work for streaming responses?

I'm thinking along the lines of streams emitting deltas to the previous responses and hence requiring some form of conflation mechanism.

@tonybase

Is there an update on this? Can we start using it if we have Go servers and Android/iOS clients?

@ejona86
Member

ejona86 commented Dec 27, 2019

@tonybase, this stalled. It is not available for use. The protoc option is available, but it doesn't do anything.

@wilburx9

Please, is there any update on this?

@zsmatrix62

any updates?

@dfawley
Member

dfawley commented Mar 24, 2020

No updates. This is very low on any priority list (note the "P3" label) as it's seen as a nice-to-have at this time.

@stale

stale bot commented May 6, 2020

This issue/PR has been automatically marked as stale because it has not had any update (including commits, comments, labels, milestones, etc) for 30 days. It will be closed automatically if no further update occurs in 7 days. Thank you for your contributions!

@mmontes11

Any updates on this?

@dperetz1

dperetz1 commented Nov 22, 2020

Does someone know if it's possible to use idempotency_level in C#/TypeScript? It seems like the generated code has no difference after applying it... Any other way to make it a GET request? (for performing HTTP response caching)

@ejona86
Member

ejona86 commented Nov 23, 2020

@dperetz1, GET is not supported in any language. There's some code in at least C core and Java, but IIRC they aren't entirely compatible and not everything was working.

@amenzhinsky

We've worked around this by writing a custom codec that caches marshaling results, because our servers spent most of their time encoding huge responses:

// Imports assumed for this snippet:
import (
	"context"
	"strings"
	"sync"

	"google.golang.org/grpc"
	"google.golang.org/protobuf/proto"
)

// New creates a Store whose vm map holds decoded responses by cache key and
// whose bm map holds the pre-marshaled bytes for each cached response.
func New() *Store {
	return &Store{
		vm: map[string]proto.Message{},
		bm: map[proto.Message][]byte{},
	}
}

type Store struct {
	mu sync.RWMutex
	vm map[string]proto.Message // cache key -> decoded response
	bm map[proto.Message][]byte // decoded response -> marshaled bytes
}

// Delete evicts a cached entry; a trailing "*" evicts every key with the
// given prefix.
func (s *Store) Delete(key string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	if i := strings.Index(key, "*"); i != -1 {
		for k, v := range s.vm {
			if strings.HasPrefix(k, key[:i]) {
				delete(s.vm, k)
				delete(s.bm, v)
			}
		}
		return
	}

	v, ok := s.vm[key]
	if !ok {
		return
	}
	delete(s.vm, key)
	delete(s.bm, v)
}

// LoadOrStore returns the cached message for key, or runs fn once to build,
// marshal, and cache it.
func (s *Store) LoadOrStore(key string, fn func() (proto.Message, error)) (proto.Message, error) {
	s.mu.RLock()
	v, ok := s.vm[key]
	if ok {
		s.mu.RUnlock()
		return v, nil
	}
	s.mu.RUnlock() // upgrade lock
	s.mu.Lock()
	defer s.mu.Unlock()
	if v, ok := s.vm[key]; ok {
		return v, nil // someone did the job before
	}
	v, err := fn()
	if err != nil {
		return nil, err
	}
	b, err := proto.Marshal(v)
	if err != nil {
		return nil, err
	}
	s.vm[key] = v
	s.bm[v] = b
	return v, nil
}

// Name implements grpc-go's encoding.Codec interface.
func (s *Store) Name() string {
	return "proto"
}

// Marshal returns the pre-marshaled bytes when the value was produced by
// LoadOrStore, avoiding a re-encode; otherwise it falls back to proto.Marshal.
func (s *Store) Marshal(v interface{}) ([]byte, error) {
	s.mu.RLock()
	defer s.mu.RUnlock()
	if b, ok := s.bm[v.(proto.Message)]; ok {
		return b, nil
	}
	return proto.Marshal(v.(proto.Message))
}

func (s *Store) Unmarshal(data []byte, v interface{}) error {
	return proto.Unmarshal(data, v.(proto.Message))
}

// Install the codec when constructing the server; the same Store serves as
// both the server codec and the handlers' cache:
store := New()
srv := grpc.NewServer(grpc.ForceServerCodec(store))

func (s *Server) Hello(ctx context.Context, req *hellopb.HelloRequest) (*hellopb.HelloResponse, error) {
	v, err := s.cache.LoadOrStore("hello", func() (proto.Message, error) {
		return &hellopb.HelloResponse{...}, nil
	})
	if err != nil {
		return nil, err
	}
	return v.(*hellopb.HelloResponse), nil
}

@dperetz1

Please reopen. I think it's a major and relevant enhancement to the protocol.

@llarsson

During my PhD studies, my co-authors and I addressed this using open source components we built. We extended the protobuf compiler to emit a reverse proxy in Go, and then used gRPC interceptors to both cache responses and estimate TTLs for them. The latter is of course optional, if you want to, for instance, set a TTL yourself. (A minimal sketch of the interceptor idea follows.)
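
Not the authors' code - just a minimal grpc-go sketch of the interceptor approach described above, assuming a fixed TTL and a caller-supplied key function:

import (
	"context"
	"sync"
	"time"

	"google.golang.org/grpc"
)

type cacheEntry struct {
	resp    interface{}
	expires time.Time
}

// CachingInterceptor memoizes unary responses per key for a fixed TTL.
// keyFn (caller-supplied) derives a cache key from the method and request.
func CachingInterceptor(ttl time.Duration, keyFn func(method string, req interface{}) string) grpc.UnaryServerInterceptor {
	var mu sync.Mutex
	cache := map[string]cacheEntry{}
	return func(ctx context.Context, req interface{}, info *grpc.UnaryServerInfo, handler grpc.UnaryHandler) (interface{}, error) {
		key := keyFn(info.FullMethod, req)
		mu.Lock()
		if e, ok := cache[key]; ok && time.Now().Before(e.expires) {
			mu.Unlock()
			return e.resp, nil // fresh cache hit: skip the handler
		}
		mu.Unlock()
		resp, err := handler(ctx, req)
		if err == nil {
			mu.Lock()
			cache[key] = cacheEntry{resp: resp, expires: time.Now().Add(ttl)}
			mu.Unlock()
		}
		return resp, err
	}
}

It would be installed with grpc.NewServer(grpc.UnaryInterceptor(CachingInterceptor(ttl, keyFn))).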

The publication is available for free here: https://arxiv.org/pdf/2104.02463.pdf

If you want to take a look at the code itself, please have a look at:

@tdv

tdv commented Feb 13, 2023

Having faced the same issue, I've implemented a library for the server and client side.
https://github.com/tdv/go-care
Might be useful...

@thanhdatvo

Are there any updates on this request?
