This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

Add Cloud Run Jobs Infrastructure Block #48

Merged
merged 82 commits into from
Sep 19, 2022

Conversation

peytonrunyan
Contributor

@peytonrunyan peytonrunyan commented Sep 7, 2022

Summary

Adds a CloudRunJob infrastructure block to allow for running deployments using Google Cloud Run Jobs.

Relevant Issue(s)

Closes PrefectHQ/prefect#5884

Checklist

Details and discussion

I want to go ahead and get this under review while I'm finishing up the last of the tests for what I have. I plan to schedule a walk-through for either this afternoon or tomorrow afternoon if possible.

Credentials
I made some changes to the Credentials block that I would like to discuss with @ahuang11 before anything is finalized.

  • I moved some of the validation out of the function body to separate validators
  • I made get_service_account_value public so that I could access it from within CloudRunJob
  • I removed project as a field and made it a property, because the project is already contained in the service account info and a separately supplied project is an extra point of failure if it doesn't match the service account
  • I also added a method for getting an access token. My code no longer uses it, but it will probably be useful for future work.
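
To make the project-as-property point concrete, here is a minimal sketch. It assumes the service account info is a plain dict shaped like a service account key file, which stores the project under "project_id". CredentialsSketch is a hypothetical stand-in, not the actual GcpCredentials block.

```python
from dataclasses import dataclass
from typing import Any, Dict


@dataclass
class CredentialsSketch:
    """Hypothetical stand-in for the credentials block; not the real GcpCredentials."""

    service_account_info: Dict[str, Any]

    @property
    def project(self) -> str:
        # Derive the project from the key material instead of storing it
        # separately, so the two can never disagree.
        try:
            return self.service_account_info["project_id"]
        except KeyError:
            raise ValueError("service account info has no 'project_id'") from None


creds = CredentialsSketch({"type": "service_account", "project_id": "my-project"})
print(creds.project)
```

With this shape there is only one source of truth for the project, at the cost of not being able to override it per block.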

Design Decisions

  • A given Cloud Run job can only have one instance running at a time. After some discussion with @madkinsz and @anna-geller we decided to just create a new job each time, instead of checking whether the job is free and creating a new job only if it is not.
  • Cloud Run Jobs can only pull from Google Cloud registries, so no default image option is available
  • Resource requests (cpu and memory) cannot be higher than limits, so when specifying a resource amount, both the limit and the request are set to the same number
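
A rough sketch of two of these decisions, under stated assumptions: unique_job_name and resources_block are hypothetical helper names, and the random suffix is just a uuid4 hex string, which only resembles the job names in the logs further down. The exact naming scheme in the PR may differ.

```python
import uuid
from typing import Dict, Optional


def unique_job_name(prefix: str) -> str:
    """Return `prefix` plus a random hex suffix so every run gets its own job."""
    return f"{prefix}-{uuid.uuid4().hex}"


def resources_block(
    cpu: Optional[str] = None, memory: Optional[str] = None
) -> Dict[str, Dict[str, str]]:
    """Build a resources mapping where requests always equal limits."""
    amounts: Dict[str, str] = {}
    if cpu is not None:
        amounts["cpu"] = cpu
    if memory is not None:
        amounts["memory"] = memory
    # Copy so the two sub-dicts cannot be mutated through each other.
    return {"limits": dict(amounts), "requests": dict(amounts)}


print(unique_job_name("container"))
print(resources_block(cpu="1", memory="512Mi"))
```

Setting both values from one input means a request can never exceed its limit, so that class of validation error cannot occur.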

Open questions and things I'm kicking around

  • I have a couple of error response formatting functions that aren't especially well fleshed out.
  • The client calls could possibly be moved to Job and Execution and out of CloudRunJob. This would probably make testing easier. This feels like it could probably be done later.

@ahuang11 ahuang11 marked this pull request as ready for review September 14, 2022 00:41
@ahuang11
Contributor

Not entirely sure about:
"""
Incorporated Michael's work on PrefectHQ/prefect#6622
Figured out how to make this prettier on the UI
"""

Also, have to run an actual test tomorrow, but should be good for a round of reviews first

@ahuang11
Contributor

ahuang11 commented Sep 14, 2022

cloud_run_job.preview() shows the prefect api key; not sure if that should be included.

Latest commit removes the key from preview

{
  "apiVersion": "run.googleapis.com/v1",
  "kind": "Job",
  "metadata": {
    "name": "dummy-66711637f9c14fb9b10fffa0d7ef2c57",
    "annotations": {
      "run.googleapis.com/launch-stage": "BETA"
    }
  },
  "spec": {
    "template": {
      "spec": {
        "template": {
          "spec": {
            "containers": [
              {
                "image": "gcr.io/project/dummy",
                "env": [
                  {
                    "name": "PREFECT_API_URL",
                    "value": "api-url"
                  },
                  {
                    "name": "PREFECT_API_KEY",
                    "value": "pnu_mysecretkey"
                  }
                ],
                "command": [
                  "echo",
                  "hello world"
                ]
              }
            ]
          }
        }
      }
    }
  }
}

@ahuang11
Contributor

ahuang11 commented Sep 14, 2022

Tested run() and it successfully ran!

from prefect import flow
from prefect_gcp import GcpCredentials
from prefect_gcp.cloud_run_job import CloudRunJob

@flow
def test_flow():
    cloud_run_job = CloudRunJob(
        image="us-docker.pkg.dev/cloudrun/container/job:latest",
        credentials=GcpCredentials.load("infra-block"),
        region="us-central1",
        command=["echo", "hello world"],
    )
    return cloud_run_job.run()

test_flow()

Tried both keep_job=True and keep_job=False; the logs with keep_job=True:

10:37:22.964 | INFO    | prefect.engine - Created flow run 'crouching-kakapo' for flow 'test-flow'
10:37:31.232 | INFO    | prefect.infrastructure.cloud-run-job - Creating Cloud Run Job container-3736994d0fd84c7a9657234ac610a65a
10:37:32.130 | INFO    | prefect.infrastructure.cloud-run-job - Submitting Cloud Run Job container-3736994d0fd84c7a9657234ac610a65a for execution.
10:37:32.739 | INFO    | prefect.infrastructure.cloud-run-job - Cloud Run Job 'container-3736994d0fd84c7a9657234ac610a65a': Running command 'echo hello world'
10:37:58.528 | INFO    | prefect.infrastructure.cloud-run-job - Job Run container-3736994d0fd84c7a9657234ac610a65a completed successfully
10:37:58.529 | INFO    | prefect.infrastructure.cloud-run-job - Job Run logs can be found on GCP at: https://console.cloud.google.com/logs/viewer?project............................................
10:38:00.761 | INFO    | Flow run 'crouching-kakapo' - Finished in state Completed()
CloudRunJobResult(identifier='container-3736994d0fd84c7a9657234ac610a65a', status_code=0)
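
For readers following the log above, the create / submit / finish sequence can be sketched roughly as below. This is a sketch under assumptions, not the block's implementation: FakeJobsClient, ResultSketch, and run_job are hypothetical, the client is an in-memory fake rather than the Cloud Run API, and it assumes keep_job controls whether the job definition is deleted once the execution finishes (the thread does not spell that out).

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class FakeJobsClient:
    """In-memory stand-in for a Cloud Run Jobs API client (hypothetical)."""

    jobs: List[str] = field(default_factory=list)
    calls: List[str] = field(default_factory=list)

    def create(self, name: str) -> None:
        self.jobs.append(name)
        self.calls.append("create")

    def run(self, name: str) -> bool:
        self.calls.append("run")
        return True  # pretend the execution succeeded

    def delete(self, name: str) -> None:
        self.jobs.remove(name)
        self.calls.append("delete")


@dataclass
class ResultSketch:
    identifier: str
    status_code: int


def run_job(client: FakeJobsClient, name: str, keep_job: bool = False) -> ResultSketch:
    client.create(name)
    try:
        succeeded = client.run(name)
    finally:
        # Clean up even if the execution raised, unless asked to keep the job.
        if not keep_job:
            client.delete(name)
    return ResultSketch(identifier=name, status_code=0 if succeeded else 1)


client = FakeJobsClient()
print(run_job(client, "container-example", keep_job=True))
print(client.jobs)
```

Putting the delete in a finally block is one way to avoid leaving a job behind when every run creates a new one.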

@ahuang11 ahuang11 self-requested a review September 14, 2022 17:39
@ahuang11
Contributor

Would be nice if UI had it sorted by "required" then "optional"

Ran successfully:

from prefect import flow
from prefect_gcp import GcpCredentials
from prefect_gcp.cloud_run_job import CloudRunJob

@flow
def test_flow():
    cloud_run_job_block = CloudRunJob.load("infra-block-test")
    return cloud_run_job_block.run()

test_flow()

The logs:

10:50:14.985 | INFO    | prefect.engine - Created flow run 'logical-jackrabbit' for flow 'test-flow'
/Users/andrew/Applications/python/prefect/src/prefect/blocks/core.py:649: UserWarning: Block document has schema checksum sha256:8fec0d10b2a2092fc52574443cea045d9f5230f4d9cbbb0df7e7a45be9bb38ff which does not match the schema checksum for class 'CloudRunJob'. This indicates the schema has changed and this block may not load.
  return cls._from_block_document(block_document)
10:50:18.894 | INFO    | prefect.infrastructure.cloud-run-job - Creating Cloud Run Job container-a791824ab5de4788b6a8a72ed0e09b69
10:50:19.905 | INFO    | prefect.infrastructure.cloud-run-job - Submitting Cloud Run Job container-a791824ab5de4788b6a8a72ed0e09b69 for execution.
10:50:20.751 | INFO    | prefect.infrastructure.cloud-run-job - Cloud Run Job 'container-a791824ab5de4788b6a8a72ed0e09b69': Running command 'default container command'
10:50:49.240 | INFO    | prefect.infrastructure.cloud-run-job - Job Run container-a791824ab5de4788b6a8a72ed0e09b69 completed successfully
10:50:49.242 | INFO    | prefect.infrastructure.cloud-run-job - Job Run logs can be found on GCP at: 
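
The UserWarning in the log above comes from comparing a stored schema checksum against the one for the current class. As a rough illustration of the idea only, here is a sketch that hashes a canonical JSON dump of a schema. schema_checksum is a hypothetical helper and the serialisation details are assumptions, so it will not reproduce Prefect's actual checksums.

```python
import hashlib
import json
from typing import Any, Dict


def schema_checksum(schema: Dict[str, Any]) -> str:
    # Sort keys so that logically equal schemas serialise identically.
    canonical = json.dumps(schema, sort_keys=True, separators=(",", ":"))
    return "sha256:" + hashlib.sha256(canonical.encode("utf-8")).hexdigest()


old = {"properties": {"image": {"type": "string"}}}
new = {"properties": {"image": {"type": "string"}, "keep_job": {"type": "boolean"}}}

# Adding a field changes the checksum, which is what triggers a mismatch warning.
print(schema_checksum(old) == schema_checksum(new))  # False
```

A block saved before a field was added or renamed would carry the old checksum, which fits the block here having been saved while the schema was still changing during review.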

Changes mkdocstrings Python handler to support pydantic fields
@zanieb
Contributor

zanieb commented Sep 14, 2022

> cloud_run_job.preview() shows the prefect api key; not sure if that should be included.

Should be handled by "Secret" settings rather than manual exclusion

@ahuang11
Contributor

> > cloud_run_job.preview() shows the prefect api key; not sure if that should be included.
>
> Should be handled by "Secret" settings rather than manual exclusion

I think we need a SecretJson field type to support service_account_info
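
To illustrate the difference between type-based masking and manual exclusion, here is a minimal sketch. MaskedSecret and preview are hypothetical; this is neither Prefect's nor pydantic's secret type, although get_secret_value borrows the method name that pydantic's SecretStr uses. Because the wrapper accepts any value, the same idea would cover a dict such as service_account_info.

```python
import json
from typing import Any, Dict


class MaskedSecret:
    """Hypothetical wrapper that hides its value whenever it is displayed."""

    def __init__(self, value: Any) -> None:
        self._value = value

    def get_secret_value(self) -> Any:
        # Callers must ask for the real value explicitly.
        return self._value

    def __str__(self) -> str:
        return "**********"

    def __repr__(self) -> str:
        return "MaskedSecret('**********')"


def preview(env: Dict[str, Any]) -> str:
    # json only calls `default` for objects it cannot serialise, i.e. the secrets.
    return json.dumps(env, indent=2, default=str)


env = {
    "PREFECT_API_URL": "api-url",
    "PREFECT_API_KEY": MaskedSecret("pnu_mysecretkey"),
}
print(preview(env))
```

The preview code never needs to know which keys are sensitive; anything wrapped as a secret is masked wherever it is rendered.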

@@ -0,0 +1 @@
::: prefect_gcp.cloud_run_job
Member

I think the new module should be named cloud_run since that's the name of the GCP service.

Suggested change
- ::: prefect_gcp.cloud_run_job
+ ::: prefect_gcp.cloud_run

Contributor Author

@peytonrunyan peytonrunyan Sep 19, 2022

FWIW, Cloud Run has two distinct services - Cloud Run Services and Cloud Run Jobs. This infrastructure block only supports Cloud Run Jobs. I think it's probably worth making the distinction, because the two have different use-cases, slightly different interfaces (I think), and different resource limits.
https://cloud.google.com/run/docs/overview/what-is-cloud-run#services-and-jobs

Member

Yep, the block will still be called CloudRunJob. I'm only suggesting that we change the name of the module to match the granularity of other modules in this collection. If we develop other integrations that work with Cloud Run services or Cloud Run jobs, they can go into this module.

mkdocs.yml Outdated
@@ -61,3 +61,4 @@ nav:
- Cloud Storage: cloud_storage.md
- BigQuery: bigquery.md
- Secret Manager: secret_manager.md
- Cloud Run Job: cloud_run_job.md
Member

Suggested change
- - Cloud Run Job: cloud_run_job.md
+ - Cloud Run: cloud_run.md

@ahuang11
Contributor

ahuang11 commented Sep 19, 2022

Ran successfully with new changes

from prefect import flow
from prefect_gcp import GcpCredentials
from prefect_gcp.cloud_run import CloudRunJob

@flow
def test_flow():
    cloud_run_job = CloudRunJob(
        image="us-docker.pkg.dev/cloudrun/container/job:latest",
        credentials=GcpCredentials.load("infra-block"),
        region="us-central1",
        command=["echo", "hello world"],
        project="abc",
        keep_job=True,
    )
    return cloud_run_job.run()

test_flow()

@ahuang11 ahuang11 merged commit 7735af4 into main Sep 19, 2022
@ahuang11
Contributor

Thanks @peytonrunyan for contributing the bulk of this PR!

@peytonrunyan
Contributor Author

Whoop - thanks for taking this to the finish line man!

Development

Successfully merging this pull request may close these issues.

Add GCP Cloud Run infrastructure block to the GCP Prefect Collection
4 participants