Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Init docker for local development environment. #4193

Merged

Conversation

xiaohanyu
Copy link
Contributor

This commit will try to dockerize superset in local development
environment.

The basic design is:

  • Enable superset, redis and postgres service instead of using sqlite,
    just want to simulate production environment settings
  • Use environment variables to config various app settings. It's easy to
    run and config superset to any environment if we use environment than
    traditional config files
  • For local development environment, we just expose postgres and redis
    to local host machine thus you can connect local port via psql or
    redis-cli
  • Wrap start up command in a standard docker-entrypoint.sh, and use
    tail -f /dev/null combined with manually superset runserver -d to
    make sure that code error didn't cause the container to fail.
  • Use volumes to share code between host and container, thus you can use
    your favourite tools to modify code and your code will run in
    containerized environment
  • Use volumes to persistent postgres and redis data, and also
    node_modules data.
    • If we don't cache node_modules in docker volume, then every time
      run docker build, the node_modules directory, will is about 500 MB
      large, will be sent to docker daemon, and make the build quite slow.
  • Wrap initialization commands to a single script docker-init.sh

After this dockerize setup, any developers who want to contribute to
superset, just follow three easy steps:

git clone https://github.com/apache/incubator-superset/
cd incubator-superset
docker-compose up -d
docker-compose exec superset bash
bash docker-init.sh

@xiaohanyu xiaohanyu force-pushed the feature-enable-docker-for-development branch 3 times, most recently from 3909812 to 5eb7c36 Compare January 11, 2018 03:15
@mistercrunch
Copy link
Member

This is great and useful, though I'm not 100% sure it belongs in the root of this repo, especially if it cannot be maintained by the committers.

Curious what other committers think? Does this belong in this repo? Another repo? In a folder?

Wherever this lands, we should document this in the installation docs as an option to set things up.

One thing we need to change for sure here is the location of superset/superset_config.py, this doesn't belong in the main package.

@xiaohanyu
Copy link
Contributor Author

xiaohanyu commented Jan 19, 2018

Hi, @mistercrunch ,

Actually, we have used docker based workflow for developing and deploying customized superset for more than half a year and it works great in both developers local machine and for simple deployment just use a modified docker-compose.yml in single machine for normal use. So I propose this PR just want to see if anybody have some interest and get some suggestions.

For the documentation, I can provide more if you think this is OK, and I'll explain some details about the basic workflow for development, some tips to make docker behaves like your local machine without too much code copy and folder share, and real experience about how to use docker-compose to manage different superset environment, like from local development environment, to testing, staging, and finally production environment.

The reason about why I contribute this PR to the main repo is I think there should be a faster way to newbies to set up an dev environment for superset. With this PR, new comers can just get a workable/reproducable superset dev workflow in minutes instead of hours. So I enable port mapping, like exposing redis and postgres port directly from docker to host, so you can use most of your familiar redis/postgres to facilitate the dev process.

And about superset_config.py, when I propose this PR, I don't want to modify the default config.py, so I put some example code to superset_config.py and do a git add -f, yeah, this is intentional. And we think that use environment variables to configure and inject config to runtime system is a very good idea from https://12factor.net/config. Or I think I can put the code to config.py directly, first check whether the necessary environment variable exists, if not, then return back to the original default way in config.py.

About whether we should put docker-compose.yml to another repo, I think it's possible, just like many other project does. And I can provide docker-compose settings for production environment if you like. But right now, for simplicity, I think a simple docker-compose.yml in main repo for developers to get on board is enough.

@mistercrunch
Copy link
Member

I'm 100% sure that superset_config.py shouldn't be in the main package though, it seems conflictual and potentially could take over another existing superset_config.py depending on PYTHONPATH or the cwd. I'm just thinking it would need to live somewhere else in the repo, and in the context of running in docker it would set the PYTHONPATH to add that folder. Maybe we can we have a docker subfolder in the repository?

I want to make it clear to users that we don't prescribe this setup specifically (using docker, using py3.4, postgres over mysql, redis over rabbit, ...) or that this is the official/proper way of running Superset or whatever. I'd feel more comfortable having all this in a subfolder and adding a section to the installation docs "Quick Start with Docker"

@xiaohanyu
Copy link
Contributor Author

@mistercrunch Agree with you about the superset_config.py things.

About the second part, I'll do a later refactoring based on your suggestions.

@brylie
Copy link
Contributor

brylie commented Feb 8, 2018

A docker-compose.yml file would be a great step to make Superset deployment easier! :-)

@mistercrunch
Copy link
Member

I agree @brylie, I'm just wondering where it fits in the project. Anything that the maintainer can't commit to maintain should probably go under a contrib/ folder.

@brylie
Copy link
Contributor

brylie commented Mar 12, 2018

@xiaohanyu, any status update here?

Perhaps we can compromise and put these files in a docker folder, so it is explicit what they are for?

Having a Docker container would really make this project more viable for several reasons, notably development and deployment. By way of example, we have deployed Metabase and Redash recently on AWS using their Docker images.

@xiaohanyu
Copy link
Contributor Author

@brylie I'll update the status this week. Just come back to work from this week. Sorry for the delay.

Think can put it to a contrib/docker folder.

@xiaohanyu xiaohanyu force-pushed the feature-enable-docker-for-development branch from 5eb7c36 to a9a345e Compare June 3, 2018 13:27
@xiaohanyu
Copy link
Contributor Author

Just rebased and put all files into a contrib/docker directory.

It seems current master branch has an issue with db migration, #5088 , so my docker setup fails. Wait for this to be fixed.

@xiaohanyu xiaohanyu force-pushed the feature-enable-docker-for-development branch from a9a345e to ee7697d Compare June 9, 2018 06:52
@xiaohanyu
Copy link
Contributor Author

Hi, All,

After #5088 has been fixed, I've revised my patch and now it works.

image

@mistercrunch I've refactored and put all docker related things to a contrib/docker directory, users can choose to use this if they like.

@brylie I just finish revision of this PR, too busy in last months.

@brylie
Copy link
Contributor

brylie commented Jun 9, 2018

OK, cool. I will try to test this on my work computer next Monday. Thanks for creating this development image, as not being able to install Superset has been a real blocking issue for me.

@xiaohanyu xiaohanyu force-pushed the feature-enable-docker-for-development branch from ee7697d to 9a46597 Compare June 10, 2018 02:33
This commit will try to dockerize superset in local development
environment.

The basic design is:
- Enable superset, redis and postgres service instead of using sqlite,
  just want to simulate production environment settings
- Use environment variables to config various app settings. It's easy to
  run and config superset to any environment if we use environment than
  traditional config files
- For local development environment, we just expose postgres and redis
  to local host machine thus you can connect local port via `psql` or
  `redis-cli`
- Wrap start up command in a standard `docker-entrypoint.sh`, and use
  `tail -f /dev/null` combined with manually `superset runserver -d` to
  make sure that code error didn't cause the container to fail.
- Use volumes to share code between host and container, thus you can use
  your favourite tools to modify code and your code will run in
  containerized environment
- Use volumes to persistent postgres and redis data, and also
  `node_modules` data.
  - If we don't cache `node_modules` in docker volume, then every time
    run docker build, the `node_modules` directory, will is about 500 MB
    large, will be sent to docker daemon, and make the build quite slow.
- Wrap initialization commands to a single script `docker-init.sh`

After this dockerize setup, any developers who want to contribute to
superset, just follow three easy steps:

```
git clone https://github.com/apache/incubator-superset/
cd incubator-superset
cp contrib/docker/{docker-build.sh,docker-compose.yml,docker-entrypoint.sh,docker-init.sh,Dockerfile} .
cp contrib/docker/superset_config.py superset/
bash -x docker-build.sh
docker-compose up -d
docker-compose exec superset bash
bash docker-init.sh
```
@xiaohanyu xiaohanyu force-pushed the feature-enable-docker-for-development branch from 9a46597 to de876b5 Compare June 10, 2018 03:16
@codecov-io
Copy link

codecov-io commented Jun 10, 2018

Codecov Report

Merging #4193 into master will not change coverage.
The diff coverage is n/a.

Impacted file tree graph

@@           Coverage Diff           @@
##           master    #4193   +/-   ##
=======================================
  Coverage   77.46%   77.46%           
=======================================
  Files          44       44           
  Lines        8729     8729           
=======================================
  Hits         6762     6762           
  Misses       1967     1967

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 7d1c035...de876b5. Read the comment docs.

@xiaohanyu
Copy link
Contributor Author

CI passed after two more commits!

@mistercrunch
Copy link
Member

docker

@mistercrunch mistercrunch merged commit 0a276ff into apache:master Jun 10, 2018
@brylie
Copy link
Contributor

brylie commented Jun 11, 2018

When I try to run this via docker-compose up, I get the following error:

Pulling superset (apache/incubator-superset:)...
ERROR: pull access denied for apache/incubator-superset, repository does not exist or may require 'docker login'

@xiaohanyu
Copy link
Contributor Author

@bkyryliuk Hi, you need to build the image locally and manually.

image

Full steps:

git clone https://github.com/apache/incubator-superset/
cd incubator-superset
cp contrib/docker/{docker-build.sh,docker-compose.yml,docker-entrypoint.sh,docker-init.sh,Dockerfile} .
cp contrib/docker/superset_config.py superset/
bash -x docker-build.sh
docker-compose up -d
docker-compose exec superset bash
bash docker-init.sh

@brylie
Copy link
Contributor

brylie commented Jun 11, 2018

Cool, I am really eager to try Superset. I have been waiting for several months for some relatively easy packaging, since the manual installation is somewhat difficult.

@ahsanshah
Copy link

I had to modify the Entrypoint for this to work as advertized

Original:
#!/bin/bash
set -ex

if [ "$#" -ne 0 ]; then
exec "$@"
elif [ "$SUPERSET_ENV" = "local" ]; then
superset runserver -d
elif [ "$SUPERSET_ENV" = "production" ]; then
superset runserver -a 0.0.0.0 -w $((2 * $(getconf _NPROCESSORS_ONLN) + 1))
else
superset --help
fi

Change to:

#!/bin/bash
set -ex

if [ "$SUPERSET_ENV" = "local" ]; then
superset runserver -d
elif [ "$SUPERSET_ENV" = "production" ]; then
superset runserver -a 0.0.0.0 -w $((2 * $(getconf _NPROCESSORS_ONLN) + 1))
else
superset --help
fi

timifasubaa pushed a commit to airbnb/superset-fork that referenced this pull request Jul 25, 2018
This commit will try to dockerize superset in local development
environment.

The basic design is:
- Enable superset, redis and postgres service instead of using sqlite,
  just want to simulate production environment settings
- Use environment variables to config various app settings. It's easy to
  run and config superset to any environment if we use environment than
  traditional config files
- For local development environment, we just expose postgres and redis
  to local host machine thus you can connect local port via `psql` or
  `redis-cli`
- Wrap start up command in a standard `docker-entrypoint.sh`, and use
  `tail -f /dev/null` combined with manually `superset runserver -d` to
  make sure that code error didn't cause the container to fail.
- Use volumes to share code between host and container, thus you can use
  your favourite tools to modify code and your code will run in
  containerized environment
- Use volumes to persistent postgres and redis data, and also
  `node_modules` data.
  - If we don't cache `node_modules` in docker volume, then every time
    run docker build, the `node_modules` directory, will is about 500 MB
    large, will be sent to docker daemon, and make the build quite slow.
- Wrap initialization commands to a single script `docker-init.sh`

After this dockerize setup, any developers who want to contribute to
superset, just follow three easy steps:

```
git clone https://github.com/apache/incubator-superset/
cd incubator-superset
cp contrib/docker/{docker-build.sh,docker-compose.yml,docker-entrypoint.sh,docker-init.sh,Dockerfile} .
cp contrib/docker/superset_config.py superset/
bash -x docker-build.sh
docker-compose up -d
docker-compose exec superset bash
bash docker-init.sh
```
@kakoni
Copy link
Contributor

kakoni commented Sep 4, 2018

Few comments;

  1. Instead of doing this in compose config
superset:
    image: apache/incubator-superset

why not change this to

  superset:
    build: .

and get rid of docker-build.sh ?

  1. entrypoint. Should this have dev option or even run local with something this;
elif [ "$SUPERSET_ENV" = "local" ]; then
    superset runserver -d &
    cd /home/work/incubator-superset/superset/assets
    npm run dev

@brylie
Copy link
Contributor

brylie commented Sep 4, 2018

@kakoni, would you mind opening a PR with your suggested changes?

@kakoni
Copy link
Contributor

kakoni commented Sep 4, 2018

Of course; #5802

wenchma pushed a commit to wenchma/incubator-superset that referenced this pull request Nov 16, 2018
This commit will try to dockerize superset in local development
environment.

The basic design is:
- Enable superset, redis and postgres service instead of using sqlite,
  just want to simulate production environment settings
- Use environment variables to config various app settings. It's easy to
  run and config superset to any environment if we use environment than
  traditional config files
- For local development environment, we just expose postgres and redis
  to local host machine thus you can connect local port via `psql` or
  `redis-cli`
- Wrap start up command in a standard `docker-entrypoint.sh`, and use
  `tail -f /dev/null` combined with manually `superset runserver -d` to
  make sure that code error didn't cause the container to fail.
- Use volumes to share code between host and container, thus you can use
  your favourite tools to modify code and your code will run in
  containerized environment
- Use volumes to persistent postgres and redis data, and also
  `node_modules` data.
  - If we don't cache `node_modules` in docker volume, then every time
    run docker build, the `node_modules` directory, will is about 500 MB
    large, will be sent to docker daemon, and make the build quite slow.
- Wrap initialization commands to a single script `docker-init.sh`

After this dockerize setup, any developers who want to contribute to
superset, just follow three easy steps:

```
git clone https://github.com/apache/incubator-superset/
cd incubator-superset
cp contrib/docker/{docker-build.sh,docker-compose.yml,docker-entrypoint.sh,docker-init.sh,Dockerfile} .
cp contrib/docker/superset_config.py superset/
bash -x docker-build.sh
docker-compose up -d
docker-compose exec superset bash
bash docker-init.sh
```
@mistercrunch mistercrunch added 🏷️ bot A label used by `supersetbot` to keep track of which PR where auto-tagged with release labels 🚢 0.26.0 labels Feb 27, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
🏷️ bot A label used by `supersetbot` to keep track of which PR where auto-tagged with release labels 🚢 0.26.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants