[QUESTION] How to deploy a Pathway Airbyte streaming ETL microservice to Google Cloud Run? #53

Open
vikas-velora opened this issue May 17, 2024 · 9 comments

@vikas-velora

Hi,

Is there a way to deploy a Pathway Airbyte streaming ETL microservice to Google Cloud Run? If so, how should we go about it?

thanks.

@vikas-velora added the "question" label (Further information is requested) on May 17, 2024
@dxtrous
Member

dxtrous commented May 21, 2024

Hi @vikas-velora, apologies for the slow turnaround on your question. Our team is verifying whether this is possible.

As a general rule, we advocate deployment from source (in this spirit: https://cloud.google.com/run/docs/deploying-source-code), and will provide the easiest recipe that works in this direction. The intended experience is something like this one with Render: https://pathway.com/developers/user-guide/deployment/render-deploy/.
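
For reference, the deploy-from-source flow in the Cloud Run documentation linked above comes down to a single `gcloud run deploy --source` call. Below is a minimal sketch of that call, not a verified recipe for Pathway: the service name `pathway-etl` and the region `europe-west1` are placeholders, and it assumes the current directory holds the application source with a Dockerfile. By default the helper only prints the command so it can be reviewed first.

```shell
#!/bin/sh
# Sketch only: the service name and region are placeholders, not values from this thread.
SERVICE_NAME="${SERVICE_NAME:-pathway-etl}"
REGION="${REGION:-europe-west1}"

# Print the command by default; set EXECUTE=1 to actually run it
# (running it requires an installed and authenticated gcloud CLI).
run_or_print() {
    if [ "${EXECUTE:-0}" = "1" ]; then
        "$@"
    else
        echo "$*"
    fi
}

# Deploy from source: Cloud Run builds a container image from "." and deploys it.
run_or_print gcloud run deploy "$SERVICE_NAME" --source . --region "$REGION"
```

Running the script with `EXECUTE=1` set performs the actual deployment. Whether the Airbyte connector works inside the deployed service is a separate question, which the rest of this thread covers.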

@vikas-velora
Author

vikas-velora commented May 21, 2024 via email

@vikas-velora
Author

vikas-velora commented May 21, 2024 via email

@zxqfd555-pw
Contributor

Hi Vikas,

As a quick technical note, there is a way to dockerize the Airbyte connector code for local tests. Specifically, you need to install Docker inside the image by adding the following to your Dockerfile:

RUN apt-get update && apt-get install -y docker.io

Then run the container with two volumes mounted:

docker run -v /var/run/docker.sock:/var/run/docker.sock -v /tmp:/tmp <your_image_name>

The first volume exposes the host's Docker socket to the container, which is what enables the DinD setup; the second is needed because /tmp is currently used to store the temporary artifacts of the Airbyte connector. Note that this would not be easy to deploy (and I suppose it is impossible on Google Cloud), because it requires giving the container access to the Docker socket.
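
The two steps above can be combined into a small local-test script. This is only a sketch under assumptions: the image name `pathway-airbyte-local` is a placeholder, and the current directory is assumed to contain a Dockerfile that includes the Docker install line shown above. By default it prints the commands rather than running them.

```shell
#!/bin/sh
# Sketch only: IMAGE_NAME is a placeholder, not a name from this thread.
IMAGE_NAME="${IMAGE_NAME:-pathway-airbyte-local}"
DOCKER_SOCK="/var/run/docker.sock"

# Print the command by default; set EXECUTE=1 to actually run it.
run_or_print() {
    if [ "${EXECUTE:-0}" = "1" ]; then
        "$@"
    else
        echo "$*"
    fi
}

# Build the image from the Dockerfile in the current directory
# (assumed to contain the Docker install line shown above).
run_or_print docker build -t "$IMAGE_NAME" .

# When really executing, fail early if the host Docker socket is missing,
# since the connector inside the container needs it to start containers.
if [ "${EXECUTE:-0}" = "1" ] && [ ! -S "$DOCKER_SOCK" ]; then
    echo "Docker socket not found at $DOCKER_SOCK" >&2
    exit 1
fi

# Mount the host Docker socket and /tmp (used for the connector's temporary artifacts).
run_or_print docker run \
    -v "$DOCKER_SOCK:$DOCKER_SOCK" \
    -v /tmp:/tmp \
    "$IMAGE_NAME"
```

Because the container talks to the host's Docker daemon through the mounted socket, this setup only makes sense on a machine you control, which is the limitation described above.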

I also tried using the docker:dind image as a base, but it is unusable in our case: docker:dind is built on Alpine Linux, which Pathway does not support yet. So I think we need a different approach, namely running the Airbyte connector in GCP without depending on Docker. That would have to be implemented in the Pathway framework itself.

To wrap up, the way forward is to run the Airbyte connector in GCP, which is a feature that must be added to Pathway. I am currently looking into this possibility and will get back to you today or within a few days.

@vikas-velora
Author

Thanks so much, @zxqfd555-pw. We tried several approaches and were unable to deploy, so at least this confirms that the problem was not a gap in our knowledge 😊. We will wait for your update.

@zxqfd555-pw
Contributor

Hi Vikas!

A quick heads-up: we can eliminate the need for the DinD technique for airbyte connectors by introducing a mode where they run as GCP jobs. I am in the process of implementing it, and we can release the corresponding update next week.

@voodoo11 assigned zxqfd555-pw and unassigned voodoo11 on May 24, 2024
@vikas-velora
Author

vikas-velora commented May 24, 2024 via email

@zxqfd555-pw
Contributor

Hi Vikas!

Please note that you can now run Airbyte data extraction jobs on Google Cloud Run, which eliminates the need for DinD. Please refer to the Airbyte connector docs for details.

@vikas-velora
Author

vikas-velora commented Jun 10, 2024 via email
