Google Cloud Run - FAQ
⚠️This repository is a community-maintained knowledge base. It does not reflect Google’s product roadmap. Refer to the Cloud Run documentation for the most up-to-date information, as this page may go out of date.
- Is this repo useful? Please ⭑Star this repository and share the love.
- Curious about something? Open an issue, someone may be able to add it to the FAQ.
- Contribute if you learned something interesting about Cloud Run.
- Trouble using Cloud Run? Ask a question on Stack Overflow.
- Check out awesome-cloudrun for a curated list of Cloud Run articles, tools and examples.
- Developing Applications
- Which applications are suitable for Cloud Run?
- What if my application is doing background work outside of request processing?
- Which languages can I run on Cloud Run?
- Can I run my own system libraries and tools?
- Where do I get started to deploy a HTTP web server container?
- How do I make my web application compatible with Cloud Run?
- Can Cloud Run receive events?
- How to configure secrets for Cloud Run applications?
- How can I have cronjobs on Cloud Run?
- Cold Starts
- Serving Traffic
- What's the maximum request execution time limit?
- Does my service get a domain name on Cloud Run?
- Are all Cloud Run services publicly accessible?
- How much additional latency does running on Cloud Run add?
- Does my application get multiple requests concurrently?
- What if my application can’t handle concurrent requests?
- How do I find the right concurrency level for my application?
- Can I serve Cloud Run services with Cloud HTTP(S) Load Balancer?
- How can I configure CDN for Cloud Run services?
- Does Cloud Run offer SSL/TLS certificates (HTTPS)?
- How can I redirect all HTTP traffic to HTTPS?
- Is traffic between my app and Google’s load balancer encrypted?
- Is HTTP/2 supported on Cloud Run?
- Is gRPC supported on Cloud Run?
- Are WebSockets supported on Cloud Run?
- Which operating system do Cloud Run applications run on?
- Can I use the local filesystem?
- Which system calls are supported?
- Which executable ABIs are supported?
- What happens if my container exits/crashes?
- What is the termination signal for Cloud Run services?
- Where can I find the "instance ID" of my container?
- How can I find the number of instances running?
- How can my service tell it is running on Cloud Run?
- Monitoring and Logging
What is Cloud Run?
Cloud Run is a service by Google Cloud Platform to run your stateless HTTP containers without worrying about provisioning machines, clusters or autoscaling.
With Cloud Run, a single command takes you from a "container image" to a fully managed web application that runs on a domain name with a TLS certificate and auto-scales with requests. You only pay while a request is handled.
How is it different than App Engine Flexible?
- GAE Flexible is built on VMs, therefore is slower to deploy and scale.
- GAE Flexible does not scale to zero, at least 1 instance must be running.
- GAE Flexible bills with 1-minute granularity; Cloud Run bills in 0.1-second increments.
- GAE Flexible supports Websockets in beta, unlike Cloud Run.
Read more about choosing a container option on GCP.
How is it different than Google Cloud Functions?
GCF lets you deploy snippets of code (functions) written in a limited set of programming languages, to natively handle HTTP requests or events from many GCP sources.
Cloud Run lets you deploy using any programming language, since it accepts container images (more flexible, but also potentially more tedious to develop). It also allows using any tool or system library from your application (see here) and GCF doesn’t let you use such custom system executables.
Both services auto-scale your code, manage the infrastructure your code runs on and they both run on GCP’s serverless infrastructure.
Read more about choosing between GCP's serverless options
How does it compare to AWS Fargate?
AWS Fargate and Cloud Run both let you run containers without managing the underlying VM instances.
Fargate focuses on being a managed replacement for VM instances in Amazon Elastic Container Service (ECS). It shares the same API constructs as ECS. In Fargate, you are in charge of managing your cluster.
Cloud Run is a standalone compute platform, abstracting cluster management and focusing on fast automatic scaling. Cloud Run supports running only HTTP servers, and therefore can do request-aware autoscaling, as well as scale-to-zero. Fargate autoscaling is CPU/memory based and is more suitable for containers that can be long-running (batch, background etc).
Therefore, the pricing model is different. On Cloud Run, you only pay while a request is being handled.
How does it compare to Azure Container Instances?
Azure Container Instances and Cloud Run both let you run containers without managing the underlying infrastructure (VMs, clusters). Both ACI and Cloud Run give you a publicly accessible endpoint after deploying the application.
Cloud Run supports running only HTTP servers and offers auto-scaling, and scale to zero. ACI is for long-running containers. Therefore, the pricing model is different. On Cloud Run, you only pay while a request is being handled.
What is “Cloud Run on GKE”?
Both Cloud Run and "Cloud Run on GKE" have:
- the same application format (container images)
- the same deployment/management experience (gcloud or Cloud Console)
- the same API (Knative serving API).
Cloud Run on GKE basically installs and manages a Knative installation (with some additional GCP-specific components for monitoring etc) on your Kubernetes cluster so that you don’t have to worry about installing and managing Knative yourself.
Is Cloud Run hosted Knative?
With Cloud Run on GKE, you actually get a Knative installation.
Which applications are suitable for Cloud Run?
Cloud Run is designed to run stateless request-driven containers.
This means you can deploy:
- publicly accessible applications: web applications, APIs or webhooks
- private microservices: internal microservices, data transformation, background jobs, potentially triggered asynchronously by Pub/Sub events or Cloud Tasks.
Other kinds of applications may not be a good fit for Cloud Run.
If your application is doing processing while it’s not handling requests or storing in-memory state, it may not be suitable.
What if my application is doing background work outside of request processing?
Your application’s CPU is throttled to nearly zero while it's not handling a request.
Therefore, your application should limit CPU usage outside request processing to a minimum. It might not be entirely possible since the programming language you use might do garbage collection or similar runtime tasks in the background.
Which languages can I run on Cloud Run?
If an application can be packaged into a container image that can run on Linux (x86-64), it can be executed on Cloud Run.
Web applications written in languages like Node.js, Python, Go, Java, Ruby, PHP, Rust, Kotlin, Swift, C/C++, C# can work on Cloud Run.
Can I run my own system libraries and tools?
Yes, see the section above. Since Cloud Run accepts container images as the deployment unit, you can add arbitrary executables (like imagemagick) or system libraries (.dll) to your container image and use them in your application.
See this tutorial, which uses dot to generate PNG diagrams.
Where do I get started to deploy a HTTP web server container?
See Cloud Run Quickstart which has sample applications written in many languages.
How do I make my web application compatible with Cloud Run?
Your existing applications must listen on the port number given in the PORT environment variable to work on Cloud Run (see container contract). (This value is currently 8080, but it may change in the future.)
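A minimal sketch of reading the port from the PORT environment variable, using only the Python standard library. The names get_port, HelloHandler and serve are hypothetical, and the 8080 fallback is an assumption meant for local runs only.

```python
import os
from http.server import BaseHTTPRequestHandler, HTTPServer


def get_port(environ=os.environ, default=8080):
    """Return the port to listen on, taken from the PORT environment variable."""
    # The fallback is only a convenience for running outside Cloud Run.
    return int(environ.get("PORT", default))


class HelloHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        body = b"Hello from Cloud Run\n"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve():
    # Bind to all interfaces rather than only localhost.
    server = HTTPServer(("0.0.0.0", get_port()), HelloHandler)
    server.serve_forever()
```

Calling serve() as the last step of your entrypoint starts the server on whatever port the platform provides.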
If your existing application doesn't allow you to configure the port number it listens on, note that Cloud Run currently doesn't allow customizing the PORT value.
Can Cloud Run receive events?
Cloud Run integrates securely with Pub/Sub push subscriptions:
- Events are delivered via HTTP to the endpoint of your Cloud Run service.
- Pub/Sub automatically validates the ownership of the *.run.app Cloud Run URLs
- You can leverage Pub/Sub push authentication to securely and privately push events to Cloud Run services, without exposing them publicly to the internet.
Many GCP services like Google Cloud Storage are able to send events to a Pub/Sub topic. You can publish your own events to a Pub/Sub topic and push them to a Cloud Run service.
Follow this tutorial for instructions about how to push Pub/Sub events to Cloud Run services.
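As a rough illustration of what arrives at your service, here is a sketch that decodes the JSON envelope Pub/Sub sends in a push request body. The function name decode_push_message and the shape of the returned dictionary are hypothetical; the envelope fields (message, data, messageId, attributes) follow the Pub/Sub push format as commonly documented, so verify them against the Pub/Sub documentation.

```python
import base64
import json


def decode_push_message(body: bytes) -> dict:
    """Extract the message from the JSON envelope of a Pub/Sub push request."""
    envelope = json.loads(body)
    message = envelope["message"]
    # The payload arrives base64-encoded in the "data" field.
    data = base64.b64decode(message.get("data", "")).decode("utf-8")
    return {
        "id": message.get("messageId"),
        "attributes": message.get("attributes", {}),
        "data": data,
    }
```

Your HTTP handler would call this on the request body and respond with a success status code once the event has been processed.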
How to configure secrets for Cloud Run applications?
Currently, Cloud Run does not have integration with Cloud KMS or any secret stores. Some workarounds you can apply:
- Pass secrets as plain-text environment variables (
- Upload secrets to Google Cloud Storage (GCS) and download them at runtime.
- Pass secrets as encrypted environment variables and decode them using Cloud KMS.
These methods are explained in the Secrets in Serverless article.
Alternatively, you can try the experimental berglas, which provides a command-line tool to create and store secrets, and a set of libraries to obtain the secrets at runtime.
How can I have cronjobs on Cloud Run?
If you need to invoke your Cloud Run applications periodically, use Google Cloud Scheduler. It can make a request to your application’s specific URL at an interval you specify.
How do I continuously deploy to Cloud Run?
(If you know of articles about other CI/CD system integrations, add them here.)
For other CI/CD systems, the steps you should follow look roughly like this:
1. Create a new service account with a JSON key.
2. Give the service account IAM permissions to deploy to Cloud Run.
3. Upload the JSON key to the CI/CD environment, and authenticate to gcloud:
   gcloud auth activate-service-account --key-file=[KEY_JSON_FILE]
4. Deploy the app by calling:
   gcloud beta run deploy [MY_SERVICE] --image=[...] [...]
Which container registries can I deploy from?
Cloud Run currently only allows deploying images hosted on Google Container Registry (GCR).
How can I deploy from other GCR registries?
If you're deploying from GCR registries on another GCP project:
- public registries: should deploy without additional configuration
- private registries: you need to give the service account used by Cloud Run access to GCR.
How can I serve traffic from multiple revisions?
If you updated your Cloud Run service, you probably realized it creates a new revision for every new configuration of your service.
However, Cloud Run (currently) only supports serving traffic from the last healthy revision of your service. Therefore, it currently does not support revision based traffic splitting and canary deployments.
Can I use kubectl to deploy to Cloud Run?
Many fields of the container specification exposed in the Knative API, some annotations, and operations like patching (kubectl apply) are not supported.
Does Cloud Run have cold starts?
Yes. If a Cloud Run service does not receive requests for a long time, it takes some time to start again. This adds delay to the first request.
Cold start latency depends on many factors, however many users observe additional ~2 seconds latency on cold starts. [more user data needed!]
When will my service scale to zero?
Cloud Run does not provide any guarantees on how long it will keep a service "warm". It depends on factors like capacity and Google’s implementation details.
Some users see their services staying warm up to an hour, or longer. [more user data needed!]
How do I minimize the cold start latencies?
See performance optimization tips, basically:
- minimize the number and size of the dependencies that your app loads
- keep your app’s "time to listen for requests" startup time short
- prevent your application process from crashing
(The size of your container image has almost no impact on cold starts).
Do I get "warmup requests" like in App Engine?
Cloud Run does not have the notion of App Engine warmup requests. You can initialize your application (for example, by loading data) before you start listening on the port number.
Note that delaying the listening on the port number causes longer cold starts, so consider lazily computing/fetching the data you need to reduce cold start latencies.
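One way to fetch data lazily is to cache the result of the first call, so the server can start listening right away. This is a sketch only: get_reference_data and handle_request are hypothetical names, and the sleep stands in for whatever slow load your app performs.

```python
import functools
import time


@functools.lru_cache(maxsize=None)
def get_reference_data():
    """Load expensive data on first use instead of at startup."""
    time.sleep(0.1)  # stand-in for a slow fetch or computation
    return {"loaded_at": time.time()}


def handle_request():
    # The first request on an instance pays the loading cost;
    # later requests reuse the cached value.
    data = get_reference_data()
    return data["loaded_at"]
```

The trade-off is that the first request served by each instance becomes slower. With concurrent requests, the loader may also run more than once before the cache is filled.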
How to keep a Cloud Run service “warm”?
You can work around "cold starts" by periodically making requests to your Cloud Run service which can help prevent the container instances from scaling to zero.
Use Google Cloud Scheduler to make requests every few minutes.
How can I tell if a request was a “cold start”?
Each request to Cloud Run services is logged to Stackdriver logging, with an indicator whether instance was "warm" or "cold" during that request (see Viewing Logs).
If you view logs from Cloud Run console, these requests are marked (and if you view them in Stackdriver Logging, you can see the structured log label indicating "cold" request):
What's the maximum request execution time limit?
Currently, a request times out after 15 minutes. See limits.
Does my service get a domain name on Cloud Run?
Yes, every Cloud Run service gets a *.run.app domain name for free. You can also use your own domain names.
Are all Cloud Run services publicly accessible?
No. Cloud Run allows services to be either publicly accessible to anyone on the Internet, or private services that require authentication.
How much additional latency does running on Cloud Run add?
TODO(ahmetb): Write this section. Ideally we should link to some blog posts doing an analysis of this.
Does my application get multiple requests concurrently?
Contrary to most serverless products, Cloud Run is able to send multiple requests to be handled simultaneously to your container instances.
Each container instance on Cloud Run is (currently) allowed to handle up to 80 concurrent requests. This is also the default value.
What if my application can’t handle concurrent requests?
If your application cannot handle this many concurrent requests, you can lower the concurrency setting while deploying your service with gcloud or Cloud Console.
Most of the popular programming languages can process multiple requests at the same time thanks to multi-threading. But some languages may need additional components to do concurrent requests (e.g. PHP with Apache, or Python with gunicorn).
How do I find the right concurrency level for my application?
Each application and language can process a different number of requests simultaneously without having them time out. That's why Cloud Run allows you to configure concurrency per service.
You should do "load testing" to find the point where your application should stop accepting additional requests and additional instances should be created. Read Tuning concurrency for more.
Can I serve Cloud Run services with Cloud HTTP(S) Load Balancer?
Currently, you can’t route traffic to Cloud Run services via the Cloud HTTP(S) Load Balancer.
How can I configure CDN for Cloud Run services?
However, you can have CDN from Firebase Hosting by:
- responding to requests with a
- configuring a rewrite configuration in firebase.json of your Firebase app.
Does Cloud Run offer SSL/TLS certificates (HTTPS)?
Yes. If you’re using the domain name provided by Cloud Run (*.run.app), your application is immediately ready to serve on https://.
If you’re using your own custom domain name, Cloud Run provisions a TLS certificate for your domain name. This may take ~15 minutes to provision and serve traffic on https://. Cloud Run uses Let’s Encrypt to get a certificate for your domains.
How can I redirect all HTTP traffic to HTTPS?
Unfortunately, Cloud Run does not offer a built-in feature to redirect all http:// traffic to https://. However, your application can read the X-Forwarded-Proto header and, when it is http, respond with an HTTP 301 redirect to the https:// URL.
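A minimal sketch of the X-Forwarded-Proto check, written as WSGI middleware so it can wrap most Python web apps. The name https_redirect is hypothetical, and the sketch assumes the Host header carries the public domain name of your service.

```python
from urllib.parse import quote


def https_redirect(app):
    """WSGI middleware that redirects plain-HTTP requests to HTTPS."""

    def middleware(environ, start_response):
        # WSGI exposes the X-Forwarded-Proto header as HTTP_X_FORWARDED_PROTO.
        if environ.get("HTTP_X_FORWARDED_PROTO", "").lower() == "http":
            host = environ.get("HTTP_HOST", "")
            # PATH_INFO is URL-decoded by the server, so re-encode it.
            path = quote(environ.get("SCRIPT_NAME", "") + environ.get("PATH_INFO", ""))
            query = environ.get("QUERY_STRING", "")
            location = "https://" + host + path + ("?" + query if query else "")
            start_response(
                "301 Moved Permanently",
                [("Location", location), ("Content-Length", "0")],
            )
            return [b""]
        # Anything else is passed through to the wrapped application.
        return app(environ, start_response)

    return middleware
```

With Flask, for example, this could be applied as app.wsgi_app = https_redirect(app.wsgi_app).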
Is traffic between my app and Google’s load balancer encrypted?
Since your app serves traffic on PORT (i.e. 8080) unencrypted, you might think the connection between Google’s load balancer and your application is unencrypted.
However, the transit between Google’s frontend/load balancer and your Cloud Run container instance is encrypted. Google terminates TLS/HTTPS connections before they reach your application, so that you don’t have to handle TLS yourself.
Is HTTP/2 supported on Cloud Run?
Yes. If you query your application with https://, you should see HTTP/2 in the response:
$ curl -v https://<url>
...
< HTTP/2 200
...
Is gRPC supported on Cloud Run?
Are WebSockets supported on Cloud Run?
Does my Cloud Run service scale to zero?
Yes, although you can’t really see how many container instances are running your service. When your service is not receiving requests, you are not paying for anything.
Therefore, after not receiving any requests for a while, the first request may observe cold start latency.
How can I limit the total number of instances for my application?
You currently can’t.
Cloud Run currently does not provide an option to limit the count of container instances your application runs on.
What’s the upper scaling limit for Cloud Run?
Which operating system do Cloud Run applications run on?
However, since you bring your own container image, you get to decide your system libraries like libc (e.g. musl libc in Alpine, or glibc in Debian-based images).
Your applications run on gVisor which only supports Linux (currently).
Can I use the local filesystem?
Yes. However, files written to the local filesystem count towards available memory and may cause the container instance to go out-of-memory and crash.
Therefore, writing files to the local filesystem is discouraged, with the exception of the /var/log/* path for logging.
Which system calls are supported?
Cloud Run applications run on gVisor container sandbox, which executes Linux kernel system calls made by your application in userspace.
gVisor does not implement all system calls (see
here). If your app
has such a system call (quite rare), it will not work on Cloud Run. Such an
event is logged and
you can use
to determine when the system call was made in your app.
Which executable ABIs are supported?
What happens if my container exits/crashes?
If the entrypoint process of a container exits, the container is stopped. A crashed container triggers cold start while the container is restarted. Avoid exiting/crashing your server process by handling exceptions. See development tips.
What is the termination signal for Cloud Run services?
Currently, Cloud Run terminates containers while scaling to zero with Unix signal 9 (SIGKILL). SIGKILL cannot be trapped (captured) by applications. Therefore, your applications should tolerate being killed abruptly.
Where can I find the "instance ID" of my container?
The logs collected from a container instance specify the unique instance ID of the container when the logs are viewed on Stackdriver Logging. This instance ID is not made available to the application.
To identify your container instance while it’s running, generate a random UUID during the startup of your process and store it in a variable.
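For example, a sketch of the random UUID approach in Python. INSTANCE_ID and tag_log are hypothetical names; the value is unrelated to the instance ID shown in Stackdriver Logging.

```python
import uuid

# Generated once when the module is imported, i.e. once per process start.
INSTANCE_ID = str(uuid.uuid4())


def tag_log(message):
    """Prefix a log message with this process's self-assigned instance ID."""
    return "[instance " + INSTANCE_ID + "] " + message
```

Every request handled by the same process then reports the same ID, which helps group log lines by container instance.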
How can I find the number of instances running?
Cloud Run currently does not offer you a way to learn the number of container instances running at a time.
Ideally, you should not need to care about this in a serverless world, where your applications autoscale based on traffic patterns and you only pay while a request is being handled (not for idle instance time).
How can my service tell it is running on Cloud Run?
You can also access the instance metadata service to determine if you are on Cloud Run. However, this will not distinguish "Cloud Run" from "Cloud Run on GKE", as the metadata service is available on GKE nodes as well.
Monitoring and Logging
Where do I write my application logs?
Anything your application writes to standard output (stdout) or standard error (stderr) is collected as logs by Cloud Run.
Some existing apps might not comply with that (e.g. nginx writes logs to /var/log/nginx/error.log). Therefore, any files written under /var/log are also aggregated. Learn more here.
How can I have structured logs?
All your log lines must be JSON objects with fields recognized by Stackdriver Logging.
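A small sketch of emitting one JSON object per line on stdout. The helper names format_log and log are hypothetical, and the field names severity and message are assumed here to be among the fields the logging backend recognizes; check the Stackdriver Logging documentation for the full list.

```python
import json
import sys


def format_log(severity, message, **fields):
    """Serialize one log entry as a single-line JSON object."""
    entry = {"severity": severity, "message": message}
    entry.update(fields)  # extra keys become additional structured fields
    return json.dumps(entry)


def log(severity, message, **fields):
    # Write exactly one JSON object per line and flush immediately.
    print(format_log(severity, message, **fields), file=sys.stdout, flush=True)
```

For example, log("ERROR", "payment failed", order_id=42) writes a single line that can be parsed as structured data rather than plain text.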
Is Cloud Run integrated with Stackdriver APM?
Yes. See this document on how to view various metrics about your Cloud Run container instances.
How can I do Tracing on Cloud Run?
TODO(ahmetb): Write this section.
Cloud Run Pricing documentation has the most up-to-date information.
Is there a “Free Tier”?
Yes! See Pricing documentation.
When am I charged?
You only pay while a request is being handled on your container instance.
This means an application that is not getting traffic is free of charge.
How is billed time calculated?
Based on "time serving requests" on each instance. If your service handles multiple requests simultaneously, you do not pay for them separately. (This is a cost saver!)
Each billable timeslice is rounded up to the nearest 100 milliseconds.
Read how the billable time is calculated; it basically works like this:
           request1          response1
               |    request2     ʌ   response2
               |        |        |       ʌ
               v........|......./        |
                        |                |
                        v.............../
|-----FREE-----|----------BILLED----------|----FREE...
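To make the arithmetic concrete, here is a sketch that estimates billed time for a single instance from request start and end times. It illustrates the description above and is not the official billing algorithm; billable_ms is a hypothetical name, and the rule that overlapping requests merge into one timeslice is an assumption based on this section.

```python
import math


def _round_up(duration_ms, granularity_ms):
    """Round a duration up to the next multiple of the billing granularity."""
    return math.ceil(duration_ms / granularity_ms) * granularity_ms


def billable_ms(requests, granularity_ms=100):
    """Estimate billed milliseconds for one instance.

    `requests` is a list of (start_ms, end_ms) pairs. Overlapping requests
    are merged so that concurrent time is counted once, then each merged
    timeslice is rounded up to the billing granularity.
    """
    total = 0
    current_start = current_end = None
    for start, end in sorted(requests):
        if current_end is None or start > current_end:
            # A gap between requests: close the previous timeslice.
            if current_end is not None:
                total += _round_up(current_end - current_start, granularity_ms)
            current_start, current_end = start, end
        else:
            # The request overlaps the open timeslice: extend it.
            current_end = max(current_end, end)
    if current_end is not None:
        total += _round_up(current_end - current_start, granularity_ms)
    return total
```

Under these assumptions, two overlapping requests spanning 0 to 250 ms and 100 to 420 ms form one 420 ms timeslice that is billed as 500 ms, instead of being billed separately.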
What do I pay for on Cloud Run?
You are paying for CPU, memory and the traffic sent to the client from your application (egress traffic).
This is not an official Google project or roadmap. Refer to the Cloud Run documentation for the authoritative information. This project is licensed under the Creative Commons Attribution 4.0 International (CC BY 4.0) license.
Your question not answered here? Open an issue and see if we can answer.