Add python tests and travis integration
mforsyth authored and icexelloss committed Jul 14, 2017
1 parent 6d49fb3 commit 3d932b4
Showing 19 changed files with 1,710 additions and 1,098 deletions.
6 changes: 5 additions & 1 deletion .gitignore
@@ -2,4 +2,8 @@ project/project
project/target
target
.idea

.vscode
metastore_db
derby.log
python/spark
**/.cache
39 changes: 39 additions & 0 deletions .travis.yml
@@ -3,3 +3,42 @@ scala:
- 2.11.8
jdk:
- oraclejdk8
install:
# Install and configure Conda on Travis; based on https://conda.io/docs/travis.html
- sudo apt-get update
# We do this conditionally because it saves us some downloading if the
# version is the same.
- if [[ "$TRAVIS_PYTHON_VERSION" == "2.7" ]]; then
wget https://repo.continuum.io/miniconda/Miniconda2-latest-Linux-x86_64.sh -O miniconda.sh;
else
wget https://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh -O miniconda.sh;
fi
- bash miniconda.sh -b -p $HOME/miniconda
- export PATH="$HOME/miniconda/bin:$PATH"
- hash -r
- conda config --set always_yes yes --set changeps1 no
- conda update -q conda
# Useful for debugging any issues with conda
- conda info -a
# The parameter is the number of batches to split the Scala tests into.
# The env matrix below must have the same number of entries,
# e.g. 8 batches produce the suffixes aa, ab, ac, ..., ah.
- bash ./scripts/divide_scala_tests.sh 8
env:
matrix:
- TEST_DIR=. DEPS_CMD=':' TEST_CMD='./scripts/run_scala_test.sh aa'
- TEST_DIR=. DEPS_CMD=':' TEST_CMD='./scripts/run_scala_test.sh ab'
- TEST_DIR=. DEPS_CMD=':' TEST_CMD='./scripts/run_scala_test.sh ac'
- TEST_DIR=. DEPS_CMD=':' TEST_CMD='./scripts/run_scala_test.sh ad'
- TEST_DIR=. DEPS_CMD=':' TEST_CMD='./scripts/run_scala_test.sh ae'
- TEST_DIR=. DEPS_CMD=':' TEST_CMD='./scripts/run_scala_test.sh af'
- TEST_DIR=. DEPS_CMD=':' TEST_CMD='./scripts/run_scala_test.sh ag'
- TEST_DIR=. DEPS_CMD=':' TEST_CMD='./scripts/run_scala_test.sh ah'
- TEST_DIR=python DEPS_CMD='./travis/prepare_python_tests.sh' TEST_CMD='./travis/run_python_tests.sh'
before_script:
# This chmod command is a workaround for a current bug in the Travis image;
# see https://github.com/travis-ci/travis-ci/issues/7703
- sudo chmod +x /usr/local/bin/sbt
- (cd $TEST_DIR && $DEPS_CMD)
script: cd $TEST_DIR && $TEST_CMD
22 changes: 22 additions & 0 deletions python/README.md
@@ -1,3 +1,20 @@
<!--
#
# Copyright 2017 TWO SIGMA OPEN SOURCE, LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
-->
ts-flint - Time Series Library for PySpark
==========================================

@@ -50,6 +67,11 @@ Documentation

The Flint Python bindings are documented at https://ts-flint.readthedocs.io/en/latest

Run tests
---------

To run tests for the Python code, see the separate [README](tests/README.md) in the tests directory.

Examples
--------

56 changes: 56 additions & 0 deletions python/tests/README.md
@@ -0,0 +1,56 @@
<!--
#
# Copyright 2017 TWO SIGMA OPEN SOURCE, LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
-->
# Python tests

## Overview
This directory contains the tests for the Python code, written with the `unittest` module.

## Prerequisites
The tests need a local Spark distribution in order to run. An easy way to get one is to visit the
[Apache Spark download page](https://spark.apache.org/downloads.html) and select version 2.1.1 (May 02 2017), pre-built for Apache Hadoop 2.7 and later.

Extract the tarball in a local directory and set the following environment variable:
```
export SPARK_HOME=<local-spark-directory>
```
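
As a quick sanity check (a sketch for illustration, not a script in this repository), the following confirms that `SPARK_HOME` points at a usable distribution and that the bundled PySpark can be imported; the `py4j-*.zip` glob assumes the standard Spark directory layout:

```python
import glob
import os
import sys

# Assumes SPARK_HOME points at the extracted Spark 2.1.1 distribution.
spark_home = os.environ.get('SPARK_HOME')
if not spark_home:
    raise RuntimeError('SPARK_HOME is not set; see the prerequisites above')

# Make the bundled PySpark and its py4j dependency importable.
sys.path.insert(0, os.path.join(spark_home, 'python'))
sys.path.extend(glob.glob(os.path.join(spark_home, 'python', 'lib', 'py4j-*.zip')))

import pyspark
print('Found PySpark', pyspark.__version__, 'under', spark_home)
```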
One-time preparation for the Python tests can be done by running the following
from the root Flint directory:
```
scripts/prepare_python_tests.sh
```

## Running tests
To run the tests, issue the following command from the root Flint directory:
```
scripts/run_python_tests.sh
```
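
Assuming the environment prepared above, a single test module can also be run directly with the standard `unittest` runner from the `python` directory (shown for orientation; the script above remains the supported entry point):
```
python -m unittest tests.ts.test_dataframe
```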

## Code
The code for the tests is found in this `tests` directory. The contents of the files are as follows (a minimal usage sketch follows the list):

* `base_test_case.py` Contains the `BaseTestCase` abstract class, the ancestor of all the test cases.
* `spark_test_case.py` Contains a concrete class, `SparkTestCase`, that inherits from `BaseTestCase` and sets up a local `SparkContext`. This is the default class to inherit test cases from.
* `test_dataframe.py` Contains about 50 test cases for the `TimeSeriesDataFrame`.
* `test_data.py` Contains constant data for the tests.
* `utils.py` Contains specialized assert functions and Pandas DataFrame creation.
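
For orientation, a minimal test case built on these pieces might look like the sketch below; `VolumeSmokeTest` is hypothetical, and it assumes the fixtures count rows like ordinary Spark DataFrames and that `VOL_DATA` is a plain list of rows:

```python
from tests.ts.spark_test_case import SparkTestCase
from tests.ts.test_data import VOL_DATA


class VolumeSmokeTest(SparkTestCase):
    '''A hypothetical smoke test; the real suites live in test_dataframe.py.'''

    def test_volume_row_count(self):
        # self.vol() is one of the cached fixtures defined in BaseTestCase.
        vol = self.vol()
        self.assertEqual(vol.count(), len(VOL_DATA))
```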

## Extending
If the test setup done in the default class, `SparkTestCase`, does not fit the needs of a particular environment, a new class can be written. The name of the new class, say `MyTestCase`, is then exported in the `FLINT_BASE_TESTCASE` environment variable before the tests are run:
```
export FLINT_BASE_TESTCASE=<Name of new class>
```
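
For illustration only, such a replacement class might look like the sketch below; the class name, the master URL, and the `FlintContext` wiring are assumptions here, not code from this commit:

```python
# my_test_case.py -- a hypothetical replacement for SparkTestCase.
from pyspark import SparkConf, SparkContext
from pyspark.sql import SQLContext

from ts.flint import FlintContext
from tests.ts.base_test_case import BaseTestCase


class MyTestCase(BaseTestCase):
    @classmethod
    def setUpClass(cls):
        # Hypothetical setup: a two-core local master instead of the default.
        conf = SparkConf().setMaster('local[2]').setAppName('flint-tests')
        cls.sc = SparkContext(conf=conf)
        cls.sqlContext = SQLContext(cls.sc)
        # BaseTestCase's fixtures read through self.flintContext.
        cls.flintContext = FlintContext(cls.sqlContext)

    @classmethod
    def tearDownClass(cls):
        cls.sc.stop()
```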
Empty file added python/tests/ts/__init__.py
72 changes: 72 additions & 0 deletions python/tests/ts/base_test_case.py
@@ -0,0 +1,72 @@
#
# Copyright 2017 TWO SIGMA OPEN SOURCE, LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
'''
The base class code for all Flint unit tests
'''
import unittest
from abc import ABCMeta, abstractclassmethod
import tests.utils as test_utils
from tests.ts.test_data import (FORECAST_DATA, PRICE_DATA, VOL_DATA, VOL2_DATA,
VOL3_DATA, INTERVALS_DATA)
from functools import lru_cache


class BaseTestCase(unittest.TestCase, metaclass=ABCMeta):
''' Abstract base class for all Flint tests
'''
@abstractclassmethod
def setUpClass(cls):
''' The automatic setup method for subclasses '''
return

@abstractclassmethod
def tearDownClass(cls):
''' The automatic tear down method for subclasses '''
return
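
    # Each fixture below converts constant data from test_data.py into a
    # TimeSeriesDataFrame via the Flint pandas reader; lru_cache caches the
    # result so the Spark conversion runs at most once per test instance.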

@lru_cache(maxsize=None)
def forecast(self):
return self.flintContext.read.pandas(
test_utils.make_pdf(FORECAST_DATA, ["time", "id", "forecast"]))

@lru_cache(maxsize=None)
def vol(self):
return self.flintContext.read.pandas(
test_utils.make_pdf(VOL_DATA, ["time", "id", "volume"]))

@lru_cache(maxsize=None)
def vol2(self):
return self.flintContext.read.pandas(
test_utils.make_pdf(VOL2_DATA, ["time", "id", "volume"]))

@lru_cache(maxsize=None)
def vol3(self):
return self.flintContext.read.pandas(
test_utils.make_pdf(VOL3_DATA, ["time", "id", "volume"]))

@lru_cache(maxsize=None)
def price(self):
return self.flintContext.read.pandas(
test_utils.make_pdf(PRICE_DATA, ["time", "id", "price"]))

@lru_cache(maxsize=None)
def intervals(self):
return self.flintContext.read.pandas(
test_utils.make_pdf(INTERVALS_DATA, ['time']))

def clocks(self):
from ts.flint import clocks
return clocks