Move some query runners to separate packages #2921

Open
jezdez opened this issue Oct 10, 2018 · 4 comments
@jezdez
Member

jezdez commented Oct 10, 2018

Issue Summary

Some query runners require packages that are not available as pre-built wheel files and have to be compiled during the Docker image build phase. Since this happens on every test run on CircleCI, it increases the test runtime without much added benefit.

Since we now have a Python extension API, we can use it to ship query runners in separate Python packages and register them via setuptools entry points.
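
As a rough illustration of the entry point mechanism, here is a minimal sketch of how installed query runner packages could be discovered at startup. The group name `redash.query_runners`, the function `discover_query_runners`, and the `redash_cassandra:Cassandra` target in the comment are assumptions made for this example; the issue does not say which group name or interface the extension API uses.

```python
from importlib.metadata import EntryPoint, entry_points

# Hypothetical group name; the issue does not say which group Redash reads.
QUERY_RUNNER_GROUP = "redash.query_runners"


def discover_query_runners(group=QUERY_RUNNER_GROUP, candidates=None):
    """Load entry points and return them as a {name: loaded object} mapping.

    When `candidates` is None, the installed packages are scanned for `group`.
    When `candidates` is given, those entry points are loaded as they are.
    """
    if candidates is None:
        found = entry_points()
        if hasattr(found, "select"):
            # Python 3.10+ returns an object with select().
            candidates = found.select(group=group)
        else:
            # Python 3.8 and 3.9 return a plain dict keyed by group name.
            candidates = found.get(group, [])
    return {ep.name: ep.load() for ep in candidates}


# Stand-in for what an installed package would declare in its setup.py, e.g.
#   entry_points={"redash.query_runners": ["cassandra = redash_cassandra:Cassandra"]}
# A standard library class is used here so the sketch runs without extra packages.
fake = EntryPoint(name="demo", value="json:JSONDecoder", group=QUERY_RUNNER_GROUP)
runners = discover_query_runners(candidates=[fake])
```

With this shape, adding a query runner would only require installing its package; the core app would not need to import it by name.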

In addition, that could help make the Redash core leaner in terms of maintenance burden, since more community members could contribute to the query runner packages directly.

Some open questions:

  • What would the release cadence and maintenance responsibility of the Redash team be?

  • Would outside contributors be able to own a query runner package completely?

  • Should there be a cookiecutter package template for easier creation of query runner packages?

Technical details:

  • Redash Version: 5.0.1
  • Browser/OS: Firefox 63, macOS 10.14
  • How did you install Redash: Docker
@arikfr
Member

arikfr commented Oct 10, 2018

So the idea is that during tests we will build a slimmer version of the image and only when building the final version we will include the other packages?

Overall it sounds right, and I was thinking of splitting the query runners into their own packages regardless, but I'm worried about the friction it adds to the release process. 🤔

We should also consider dropping CircleCI. On SemaphoreCI, running the tests and building the Docker image takes 5 to 10 minutes (depending on the caching state). The only concern with Semaphore is that they don't seem to have a way to avoid loading secrets when building external pull requests. We can overcome this by using a separate pipeline for pushing the Docker image to Docker Hub, but it's nice to have it all in one public place. I will reach out to them about this.

@jezdez
Member Author

jezdez commented Oct 10, 2018

> So the idea is that during tests we will build a slimmer version of the image and only when building the final version we will include the other packages?

Yeah, plus the idea would be that the individual query runners should be tested in their own repos since it's unrealistic that we'd have all the various data sources in the main repo test setup anyway.

I'm wondering how this could be solved along the lines of #2810, which would allow us to stop shipping some of the query runners in the main Redash Docker image and instead use the native extensibility of Dockerfiles as a way to set up a "full" Redash instance. I think there is some precedent out there in the common Docker images that allow extending them easily, e.g.

FROM redash/redash:latest

RUN pip install redash-cassandra

Or even have our own install script that acts as a switch between pip/npm:

FROM redash/redash:latest

RUN redash install --query-runner cassandra
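
To make the install script idea a bit more concrete, here is a minimal sketch of how such a command could translate its arguments into a pip invocation. No `redash install` command exists in the thread beyond the proposal above; the function name `build_install_command` and the `redash-` package prefix (taken from the `redash-cassandra` example) are assumptions, and the npm side of the switch is left out.

```python
import argparse

# Assumed naming scheme, based on the redash-cassandra example above.
PACKAGE_PREFIX = "redash-"


def build_install_command(argv):
    """Translate `--query-runner NAME` options into a pip command (argv list)."""
    parser = argparse.ArgumentParser(prog="redash install")
    parser.add_argument(
        "--query-runner",
        action="append",
        metavar="NAME",
        help="query runner to install; may be given more than once",
    )
    args = parser.parse_args(argv)
    names = args.query_runner or []
    if not names:
        # parser.error() prints a usage message and exits with status 2.
        parser.error("nothing to install; pass --query-runner NAME")
    return ["pip", "install"] + [PACKAGE_PREFIX + name for name in names]


# The command is only built here, not executed.
command = build_install_command(["--query-runner", "cassandra"])
```

Returning the argument list instead of running it keeps the sketch easy to test; a real script would hand the list to something like `subprocess.check_call`.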

> Overall it sounds right, and I was thinking of splitting the query runners into their own packages regardless, but I'm worried about the friction it adds to the release process. 🤔

I'm not sure if it'd mean release friction per se, but it would definitely be a different area to worry about during releases while reducing the friction in other places. But that could be handled by further formalizing of releases (e.g. a release playbook, scheduled releases like every 3 months, a dedicated release team).

> We should also consider dropping CircleCI. On SemaphoreCI, running the tests and building the Docker image takes 5 to 10 minutes (depending on the caching state). The only concern with Semaphore is that they don't seem to have a way to avoid loading secrets when building external pull requests. We can overcome this by using a separate pipeline for pushing the Docker image to Docker Hub, but it's nice to have it all in one public place. I will reach out to them about this.

Sure, that works for me. I have no experience with Semaphore, but I've heard good things. My only worry is that it'd be a bandaid for the slowness and wouldn't separate concerns between the Redash core app and the query runners.

@arikfr
Member

arikfr commented Oct 10, 2018

> I'm not sure if it'd mean release friction per se, but it would definitely be a different area to worry about during releases while reducing the friction in other places. But that could be handled by further formalizing of releases (e.g. a release playbook, scheduled releases like every 3 months, a dedicated release team).

I have a feeling that we should start from these steps (formalizing & automating the release process) before making the process more involved.

Until then, to make things faster, we can add a toggle to the Dockerfile that controls whether to run pip install -r requirements_all_ds.txt. It's not really needed for the tests, and skipping it will address the issue described here. (#2928)
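
The toggle itself would live in the Dockerfile; the decision it makes can be sketched in Python as below. This is only an illustration under stated assumptions: the function name `pip_install_plan`, the flag name `skip_ds_deps`, and the base `requirements.txt` file are made up for the example, and only `requirements_all_ds.txt` comes from the comment above.

```python
def pip_install_plan(skip_ds_deps=False):
    """Return the pip commands (as argv lists) that an image build would run."""
    # Assumed base requirements file that is always installed.
    plan = [["pip", "install", "-r", "requirements.txt"]]
    if not skip_ds_deps:
        # Data source dependencies; these are the slow, compiled packages.
        plan.append(["pip", "install", "-r", "requirements_all_ds.txt"])
    return plan


test_image_plan = pip_install_plan(skip_ds_deps=True)   # slim image for CI tests
release_image_plan = pip_install_plan()                 # full image for releases
```

The default keeps the full install, so release builds stay unchanged and only the test build has to opt in to the slimmer image.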

WDYT?

@jezdez
Member Author

jezdez commented Oct 11, 2018

Yeah, that works for me, and #2928 is a good bandaid till then. 👍
