@tomasatdatabricks
Contributor

Run tests in a Docker environment with Ubuntu 16.04.

@tomasatdatabricks force-pushed the tomas/dockerize_tests branch 4 times, most recently from 1603dc2 to b1d8a57 on April 2, 2018 17:24
@thunterdb
Contributor

Travis is under maintenance; I will check again later.


jdk: oraclejdk8

sudo: required
Contributor

do we still need sudo at this point?

Contributor Author

Hmm, I believe we do. It makes Travis run inside a VM instead of a container. I thought you needed that in order to run Docker.

I can double-check what happens if I run with `sudo: false`.

Contributor

@thunterdb left a comment

Just a few changes to better understand what is happening.

At a higher level, I am concerned by the complexity of the whole thing, because we now have all these layers:

  • our code + spark
  • conda to control the python env
  • docker to control the conda env
  • travis to control the tests

The README should be updated to reflect that the recommended (supported) way to build and test is using the provided conda environment, and we should all default to that.

Also, do we need Docker? Is this just for Travis? I hope we can get rid of it as soon as possible; it adds an unwelcome layer of indirection.


cache:
  directories:
  - $HOME/.ivy2/
Contributor

Looking at the logs, the Maven deps don't seem to be cached anymore.

export CONDA_URL="repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh";
export PYSPARK_PYTHON=python3;
fi
- docker run -e "JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64"
Contributor

what is this for?

Contributor Author

I am not sure what you mean. The command starts a Docker container. It might not be the best way; I am not that familiar with Docker. I am open to suggestions.

- docker exec -t ubuntu-test bash -c "apt-get update && apt-get upgrade -y"
- docker exec -t ubuntu-test bash -c "apt-get install -y curl bzip2 openjdk-8-jdk"
# download and set up miniconda
- docker exec -t ubuntu-test bash -c "
Contributor

can we move these scripts into ./bin?

Contributor Author

Sure, but they were in the same place before. I only moved the execution inside docker.

# Run the python unit tests.
- sbt -Dspark.version=$SPARK_VERSION -Dscala.version=$SCALA_BINARY_VERSION tfs_testing/assembly
- SPARK_HOME=$HOME/.cache/spark-versions/$SPARK_BUILD ./python/run-tests.sh
- docker exec -t ubuntu-test bash -c "
Contributor

same thing here

@tomasatdatabricks
Contributor Author

Thanks for the review @tjhunter!

I am not sure I understand your concerns. To clarify the situation:

  1. What this PR does:
    • moves the Travis project build and test execution into Docker with an Ubuntu 16.04 environment
      • I believe all of the code remained the same, e.g. the conda environment is used in
        the same way as before this change.
  2. Why it does it:
    • tf >= 1.5 requires Ubuntu >= 16.04, but Travis only supports 12.04 and 14.04
  3. What is affected:
    • only Travis; everything is now run inside a Docker container.
  4. Are there other options:
    • the only other option I am aware of is to move the tests to a different CI environment such as Jenkins.
  5. Can we revert this change in the future:
    • yes, when Travis makes Ubuntu 16.04 available, but it's not clear when that is going to happen.

I chose the Docker path because it seemed to be the smallest change required to get the tests running again. The Docker code can definitely be simplified. I did it this way partly because I am not a Docker expert and partly because this closely mimics the previous state, i.e. all the commands are there in the same order; the only difference is that we now send them to the Docker container.
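
For readers skimming the thread, the pattern described here (start one container, then forward each existing build step to it) can be sketched roughly as below. This is only an illustration, not the actual contents of the `.travis.yml` in this PR: the container name `ubuntu-test` and the `JAVA_HOME` value come from the diff excerpts, while the `ubuntu:16.04` image tag, the `sleep infinity` keep-alive, and the helper names `start_container`, `in_container`, and `DOCKER_CMD` are assumptions made for the example.

```shell
#!/usr/bin/env bash
# Sketch only: the image tag, the "sleep infinity" keep-alive, and the helper
# names are assumptions, not the actual configuration from this PR.
CONTAINER="${CONTAINER:-ubuntu-test}"   # container name seen in the diff excerpts
DOCKER_CMD="${DOCKER_CMD:-docker}"      # set to "echo docker" for a dry run

# Start one long-lived container that later steps can exec into.
start_container() {
  $DOCKER_CMD run -d --name "$CONTAINER" \
    -e "JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64" \
    ubuntu:16.04 sleep infinity
}

# Run a single command string inside the container via bash -c.
in_container() {
  $DOCKER_CMD exec -t "$CONTAINER" bash -c "$1"
}

# Example usage, mirroring two steps from the diff excerpts:
#   start_container
#   in_container "apt-get update && apt-get upgrade -y"
#   in_container "apt-get install -y curl bzip2 openjdk-8-jdk"
```

Wrapping `docker exec` in one helper keeps each build step readable and leaves the original command order untouched, which matches the "same commands, different target" intent. Setting `DOCKER_CMD="echo docker"` prints the commands instead of running them, which may help when debugging the CI script locally.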

Personally, I don't see a big problem with running tests inside Docker. It gives us greater control of the environment, and I believe it is quite common for people to do that, including for tests run on Travis. We can spend some time figuring out how to do it properly (e.g. get the Maven cache working again), but in the short term, at least this way we can run the tests successfully.

But I am really open to suggestions. Feel free to scrap this PR or make changes to it if you have a better solution.

@thunterdb
Contributor

@tomasatdatabricks to clarify, it is great that our tests are running again, and the cost of some extra complexity with Docker is certainly worth it.

This looks good to me, but can you add instructions to the README, for example in this section:
https://github.com/databricks/tensorframes#how-to-compile-and-install-for-developers
to explain that conda is the recommended setup for developing and testing tensorframes? We should standardize on that environment so that we can easily debug build issues when they occur.

@thunterdb
Contributor

Feel free to merge after that; it looks good as far as I am concerned.

@tomasatdatabricks
Contributor Author

Ok, will do. Thanks Tim!

@tomasatdatabricks force-pushed the tomas/dockerize_tests branch 2 times, most recently from bad5f7b to 01f2d8d on April 8, 2018 23:25
@thunterdb
Contributor

@tomasatdatabricks I am merging. Thanks for the work!

@thunterdb merged commit eca77ca into databricks:master on Apr 23, 2018