In this project, an Extract, Transform, and Load (ETL) pipeline application, along with all its dependencies, is packaged and containerized into a Docker container.
The project focuses on deploying the scraper application built in Project One on Google Cloud Run, a managed compute platform that lets you run stateless containers.
This project is the fifth in my "Building your first Google Cloud Analytics Project" series, and a direct sequel to the third project.
- Introduction
- Setting up the Environment
- Setting up pgAdmin
- Building the Docker Image
- Deploying the Container on Cloud Run
- Conclusion
Clone the GitHub Repo
git clone https://github.com/paulonye/DockerXPostgres
Install the Required Libraries
pip install -r requirements.txt
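If you prefer to keep the project's dependencies isolated from your system Python, you can install them inside a virtual environment instead (optional; .venv is just a placeholder name):
# create and activate a virtual environment, then install into it
python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt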
Set up the .env file
PGUSER=user
PGPASS=******
HOST=**********
DB=database_name
key_file=key.json
Run the pipeline locally to confirm that the script and the .env file are set up correctly:
python app/batch.py
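If batch.py expects these settings in the process environment rather than reading the .env file itself (an assumption about how the script loads its configuration), you can export them into your shell first:
# export every variable defined in .env, then run the pipeline
set -a
source .env
set +a
python app/batch.py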
cd into the directory of the cloned repo, open the Dockerfile, and make the changes you need. It is well documented, so just follow through.
Some changes to watch out for:
- The directory and name of the service account key file.
- The name of the environment variable that points to the service account key.
Once this is done, you can build the Docker image:
docker build -t image_name .
The above command builds the Docker image. To test that it works, run:
docker run image_name
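The container also needs the database credentials at runtime. If your Dockerfile does not copy the .env values into the image (an assumption about your setup), you can pass them to the container when you start it:
# pass the environment variables from .env into the container
docker run --env-file .env image_name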
Once you are sure it works, go ahead and set up Artifact Registry as described in the Medium article above.
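If you would rather create the repository from the command line instead of the console, a Docker-format repository in us-central1 can be created along these lines (my-repo is just a placeholder name):
gcloud artifacts repositories create my-repo --repository-format=docker --location=us-central1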
Authenticate Docker to the region where your Artifact Registry repository is located
gcloud auth configure-docker us-central1-docker.pkg.dev
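This command adds a credential helper entry for us-central1-docker.pkg.dev to your Docker configuration. You can confirm it took effect, and that your repository is visible, with:
cat ~/.docker/config.json
gcloud artifacts repositories list --location=us-central1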
Build the Docker Image for Artifact Registry
docker build -t us-central1-docker.pkg.dev/my-project/my-repo/my-image:tag1 .
Where my-project is your GCP project ID and my-repo is the name of the repository you created on Artifact Registry.
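Since you already built the image locally, you can also simply retag that image for Artifact Registry instead of rebuilding it from scratch:
docker tag image_name us-central1-docker.pkg.dev/my-project/my-repo/my-image:tag1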
Push the Docker Image to Artifact Registry
docker push us-central1-docker.pkg.dev/my-project/my-repo/my-image:tag1
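To confirm the push succeeded, you can list the images in the repository:
gcloud artifacts docker images list us-central1-docker.pkg.dev/my-project/my-repo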
Deploy the Container on Cloud Run
Cloud Run pulls the container from a registry, so point --image at the image you pushed to Artifact Registry:
gcloud beta run jobs create job-quickstart --image us-central1-docker.pkg.dev/my-project/my-repo/my-image:tag1 --region us-central1
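Creating the job only registers it; it does not run the pipeline. You can trigger an execution with:
gcloud beta run jobs execute job-quickstart --region us-central1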