This repository has been archived by the owner on Oct 8, 2019. It is now read-only.

Implement system test #336

Closed
wants to merge 29 commits into from
Changes from 20 commits
Commits
2237215
add feature for testing UDF
amaya382 Aug 23, 2016
4d2bdfc
add usage
amaya382 Sep 5, 2016
6e31bbd
move systemtest into specific package
amaya382 Sep 5, 2016
d0108a9
mod project settings of systemtest
amaya382 Sep 5, 2016
1a0da21
add ordered/unordered result matching
amaya382 Sep 5, 2016
81efb8c
add license header
amaya382 Sep 5, 2016
6bc276c
enable to use external properties
amaya382 Sep 7, 2016
915cf78
update README about external properties
amaya382 Sep 7, 2016
bd0143c
suppress warnings
amaya382 Sep 7, 2016
57566e4
potential bug to exception
amaya382 Sep 7, 2016
02a9385
add ordered/unordered result matching for multiple queries
amaya382 Sep 7, 2016
bca5404
refine README
amaya382 Sep 7, 2016
242dfaf
Merge branch 'master' into 'feature/systemtest'
amaya382 Sep 7, 2016
48a4fc9
refine README
amaya382 Sep 8, 2016
e7b2c6b
support array and map in insert statement
amaya382 Sep 9, 2016
58b89ec
fix error messages
amaya382 Sep 9, 2016
71d89d2
make code tougher
amaya382 Sep 12, 2016
0b01896
avoid running with no runner
amaya382 Sep 12, 2016
b5f35c3
refresh null check
amaya382 Sep 13, 2016
d2f0370
mod name and inheritance structure
amaya382 Sep 13, 2016
62e3588
Update license headers
amaya382 Nov 16, 2016
1e8093b
Mod README
amaya382 Nov 17, 2016
f40ade0
Add exception
amaya382 Nov 17, 2016
635a0a1
Make dir name static
amaya382 Nov 17, 2016
06c88de
Mod assert methods
amaya382 Nov 17, 2016
4aa1cfc
Fix process of tdprop
amaya382 Nov 17, 2016
bdf38c2
Mod README
amaya382 Nov 17, 2016
25db0c0
Refine access modifiers/calls
amaya382 Nov 17, 2016
f2ae1f7
Update license header
amaya382 Nov 17, 2016
1 change: 1 addition & 0 deletions pom.xml
@@ -45,6 +45,7 @@
<module>nlp</module>
<module>xgboost</module>
<module>mixserv</module>
<module>systemtest</module>
</modules>

<properties>
192 changes: 192 additions & 0 deletions systemtest/README.md
@@ -0,0 +1,192 @@
## Usage

### Initialization

Define `CommonInfo`, `Runner`, and `Team` in each of your test classes.

#### `CommonInfo`

* `SystemTestCommonInfo`

`CommonInfo` holds information shared across a test class; for example, it gives access to
the auto-defined paths to the class's resources. It should be defined as `private static`.
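
As a rough illustration of what an "auto-defined path" could look like, the sketch below derives per-class resource directories from a class name, following the `${path/to/package}/${className}/{init,case,answer}/` layout listed later in this README. `ResourcePathSketch` and its methods are hypothetical names used only for this illustration; they are not part of `SystemTestCommonInfo`.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper; it only mirrors the directory layout described in this README.
final class ResourcePathSketch {
    // Builds "<package as path>/<ClassName>/" from a fully qualified class name.
    static String baseDir(String qualifiedClassName) {
        return qualifiedClassName.replace('.', '/') + "/";
    }

    // Maps each sub-directory name to its path relative to the test resources root.
    static Map<String, String> dirs(String qualifiedClassName) {
        String base = baseDir(qualifiedClassName);
        Map<String, String> dirs = new LinkedHashMap<>();
        for (String name : new String[] {"init", "case", "answer"}) {
            dirs.put(name, base + name + "/");
        }
        return dirs;
    }

    public static void main(String[] args) {
        // e.g. the "init" entry is hivemall/QuickExample/init/
        System.out.println(dirs("hivemall.QuickExample"));
    }
}
```

Taking the class name as a string keeps the sketch independent of any real test class; an actual helper would more likely accept a `Class<?>`.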


#### `Runner`

* `HiveSystemTestRunner`
* `TDSystemTestRunner`

`Runner` represents a test environment and its configuration. It must be declared `public static`
and annotated with `@ClassRule`, as JUnit requires. To add class-level initialization, call
`#initBy(...)` in the instance initializer of each `Runner`, passing queries built with the class
methods of `HQ`, an abstract, domain-specific representation of Hive queries.


#### `Team`

* `SystemTestTeam`

`Team` manages the `Runner`s for each test method. It must be declared `public` and annotated with
`@Rule`, as JUnit requires. Pass `Runner`s to the constructor to share them across the class, or
call `#add(...)` to register them for a single test method. Use `#initBy(...)` to add per-method
initialization and `#set(...)` to add test cases, both with the class methods of `HQ`. Finally,
don't forget to call `#run()`; the `HQ`s you have set take effect only then.
Alternatively, `#set(HQ.autoMatchingByFileName(${filename}))` runs an auto-matching test with
queries predefined in `auto-defined/path/init/${filename}`, `auto-defined/path/case/${filename}`, and
`auto-defined/path/answer/${filename}`.
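
The quick example later in this README passes `true` as a third argument to `#set(...)` for an ordered test and omits it for an unordered one. The sketch below shows one way such a comparison could work, assuming the expected answer is a tab-separated string of values, as `"azure\tblue\tmagenta"` in the quick example suggests. `ResultMatchSketch` is a hypothetical name, and this is not the systemtest implementation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch of ordered vs. unordered result matching.
final class ResultMatchSketch {
    // Splits a tab-separated answer string into its values.
    static List<String> parse(String tsv) {
        return Arrays.asList(tsv.split("\t"));
    }

    // Ordered: same values in the same sequence. Unordered: same values as a multiset.
    static boolean matches(List<String> actual, String expectedTsv, boolean ordered) {
        List<String> expected = parse(expectedTsv);
        if (ordered) {
            return actual.equals(expected);
        }
        List<String> sortedActual = new ArrayList<>(actual);
        List<String> sortedExpected = new ArrayList<>(expected);
        Collections.sort(sortedActual);
        Collections.sort(sortedExpected);
        return sortedActual.equals(sortedExpected);
    }

    public static void main(String[] args) {
        List<String> rows = Arrays.asList("blue", "azure", "magenta");
        System.out.println(matches(rows, "azure\tblue\tmagenta", true));  // false: order differs
        System.out.println(matches(rows, "azure\tblue\tmagenta", false)); // true: same values
    }
}
```

Sorting copies of both lists keeps duplicates significant, which a set-based comparison would lose.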


### External properties

You can put external properties files under `systemtest/src/test/resources/hivemall/*`. The defaults are
`hiverunner.properties` for `HiveSystemTestRunner` and `td.properties` for `TDSystemTestRunner`. A user-defined
properties file can also be loaded by passing its file name to the constructor of a `Runner`.
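
For readers unfamiliar with layered settings, here is a minimal sketch using `java.util.Properties`, in which values from a user-supplied file override defaults. It parses in-memory text so that it runs without any files; a real test would read the file from the resources directory instead. `PropertiesSketch` and the `example.*` keys are made up for illustration and are not keys the runners actually read.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical sketch: layered java.util.Properties, with user values overriding defaults.
final class PropertiesSketch {
    // Parses properties text; lookups that miss fall back to `defaults` (which may be null).
    static Properties parse(String text, Properties defaults) {
        Properties props = new Properties(defaults);
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    public static void main(String[] args) {
        // Key names below are invented for this example.
        Properties defaults = parse("example.database=hivemall\nexample.timeout=60\n", null);
        Properties user = parse("example.timeout=120\n", defaults);
        System.out.println(user.getProperty("example.database")); // hivemall (from defaults)
        System.out.println(user.getProperty("example.timeout"));  // 120 (overridden)
    }
}
```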


## Notice

* Issue DDL and insert statements through the class methods of `HQ` rather than as raw string statements,
because `HQ` wraps Hive queries and several runner-specific APIs
* You can also use the low-level API through an instance of `Runner`, independently of `Team`
* You can use `IO.getFromResourcePath(...)` to read an answer in TSV format
* Treat tables created during runner initialization as immutable; neither insert into nor update them
* TD client settings in the properties file take precedence over `$HOME/.td/td.conf`
* Don't load large data sets with insert statements; use file upload instead
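
The quick example below passes column definitions to `HQ` class methods as a `LinkedHashMap`. One plausible reason for an insertion-ordered map is that column order matters once the definition becomes DDL; the sketch below illustrates that idea only. `DdlSketch.createTable` is a hypothetical stand-in, not `HQ.createTable`, and the statement it produces is illustrative rather than what any runner actually sends.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical sketch: turning an ordered column map into a CREATE TABLE statement.
final class DdlSketch {
    // Columns are emitted in the map's iteration order, so pass an insertion-ordered map.
    static String createTable(String table, Map<String, String> columns) {
        StringJoiner cols = new StringJoiner(", ", "(", ")");
        for (Map.Entry<String, String> column : columns.entrySet()) {
            cols.add(column.getKey() + " " + column.getValue());
        }
        return "CREATE TABLE " + table + " " + cols;
    }

    public static void main(String[] args) {
        Map<String, String> columns = new LinkedHashMap<>();
        columns.put("name", "string");
        columns.put("age", "int");
        columns.put("favorite_color", "string");
        // Prints: CREATE TABLE users (name string, age int, favorite_color string)
        System.out.println(createTable("users", columns));
    }
}
```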

## Quick example

```java
package hivemall;

// JUnit and JDK imports; imports of the systemtest classes are omitted here
import java.util.Arrays;
import java.util.LinkedHashMap;

import org.junit.ClassRule;
import org.junit.Rule;
import org.junit.Test;
import org.junit.rules.ExpectedException;

public class QuickExample {
    private static SystemTestCommonInfo ci = new SystemTestCommonInfo(QuickExample.class);

    @ClassRule
    public static HiveSystemTestRunner hRunner = new HiveSystemTestRunner(ci) {
        {
            initBy(HQ.uploadByResourcePathAsNewTable("color", ci.initDir + "color.tsv",
                new LinkedHashMap<String, String>() {
                    {
                        put("name", "string");
                        put("red", "int");
                        put("green", "int");
                        put("blue", "int");
                    }
                })); // create table `color`, which is marked as immutable, for this test class

            // add function from hivemall class
            initBy(HQ.fromStatement("CREATE TEMPORARY FUNCTION hivemall_version as 'hivemall.HivemallVersionUDF'"));
        }
    };

    @ClassRule
    public static TDSystemTestRunner tRunner = new TDSystemTestRunner(ci) {
        {
            initBy(HQ.uploadByResourcePathAsNewTable("color", ci.initDir + "color.tsv",
                new LinkedHashMap<String, String>() {
                    {
                        put("name", "string");
                        put("red", "int");
                        put("green", "int");
                        put("blue", "int");
                    }
                })); // create table `color`, which is marked as immutable, for this test class
        }
    };

    @Rule
    public SystemTestTeam team = new SystemTestTeam(hRunner); // set hRunner as default runner

    @Rule
    public ExpectedException predictor = ExpectedException.none();


    @Test
    public void test0() throws Exception {
        team.add(tRunner, hRunner); // test on HiveRunner -> TD -> HiveRunner (NOTE: state of DB is retained in each runner)
        team.set(HQ.fromStatement("SELECT name FROM color WHERE blue = 255 ORDER BY name"), "azure\tblue\tmagenta", true); // ordered test
        team.run(); // this call is required
    }

    @Test
    public void test1() throws Exception {
        // test on HiveRunner only, once
        String tableName = "users";
        team.initBy(HQ.createTable(tableName, new LinkedHashMap<String, String>() {
            {
                put("name", "string");
                put("age", "int");
                put("favorite_color", "string");
            }
        })); // create table `users`, local to this test method, on each registered runner (only hRunner here)
        team.initBy(HQ.insert(tableName, Arrays.asList("name", "age", "favorite_color"), Arrays.asList(
            new Object[]{"Karen", 16, "orange"}, new Object[]{"Alice", 17, "pink"}))); // insert into `users`
        team.set(HQ.fromStatement("SELECT CONCAT('rgb(', red, ',', green, ',', blue, ')') FROM "
                + tableName + " u LEFT JOIN color c on u.favorite_color = c.name"), "rgb(255,165,0)\trgb(255,192,203)"); // unordered test
        team.run(); // this call is required
    }

    @Test
    public void test2() throws Exception {
        // You can also use the runner's raw API directly
        for (RawHQ q : HQ.fromStatements("SELECT hivemall_version();SELECT hivemall_version();")) {
            System.out.println(hRunner.exec(q).get(0));
        }
        // raw API doesn't require `SystemTestTeam#run()`
    }

    @Test
    public void test3() throws Exception {
        // test on HiveRunner only, once
        // auto matching by the files named `test3` in `case/` and `answer/`
        team.set(HQ.autoMatchingByFileName("test3", ci)); // unordered test
        // Review comment (Contributor): It should be
        //   team.set(HQ.autoMatchingByFileName("test3"), ci);
        team.run(); // this call is required
    }

    @Test
    public void test4() throws Exception {
        // test on HiveRunner only, once
        predictor.expect(Throwable.class); // you can use systemtest w/ other rules
        team.set(HQ.fromStatement("invalid queryyy")); // this query throws an exception
        // Review comment (Contributor): It shows "Cannot resolve method".
        // We need to pass a dummy expected argument here.
        team.run(); // this call is required
        // thrown exception will be caught by `ExpectedException` rule
    }
}
```

The example above requires the following files:

* `systemtest/src/test/resources/hivemall/HogeTest/init/color.tsv` (`systemtest/src/test/resources/${path/to/package}/${className}/init/${fileName}`)

Review comment (Contributor): It might be better to rename HogeTest to QuickExample here.

```tsv
blue 0 0 255
lavender 230 230 250
magenta 255 0 255
violet 238 130 238
purple 128 0 128
azure 240 255 255
lightseagreen 32 178 170
orange 255 165 0
orangered 255 69 0
red 255 0 0
pink 255 192 203
```

* `systemtest/src/test/resources/hivemall/HogeTest/case/test3` (`systemtest/src/test/resources/${path/to/package}/${className}/case/${fileName}`)

Review comment (Contributor): Ditto for HogeTest.

```sql
-- write your hive queries
-- comments like this and multiple queries in one row are allowed
SELECT blue FROM color WHERE name = 'lavender';SELECT green FROM color WHERE name LIKE 'orange%'
-- Review comment (Contributor): We have to add a semicolon (;) at the end of the line above; otherwise it is interpreted as 2 queries.
SELECT name FROM color WHERE blue = 255
```

* `systemtest/src/test/resources/hivemall/HogeTest/answer/test3` (`systemtest/src/test/resources/${path/to/package}/${className}/answer/${fileName}`)

TSV format is required.

```tsv
230
165 69
azure blue magenta
```

Review comment (Contributor, on the first line, `230`): 250 instead of 230?
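
The review remark about the missing semicolon in `case/test3` is easier to see with a small sketch of how a case file might be split into queries: drop `--` comment lines, join the remaining lines, and split on `;`. This is an assumption about the behaviour, not the systemtest parser; `CaseFileSketch` is a hypothetical name, and the split is naive in that it ignores semicolons inside string literals.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of splitting a case file into individual queries.
final class CaseFileSketch {
    // Drops `--` comment lines, joins the rest with spaces, and splits on ';'.
    static List<String> splitQueries(String caseFile) {
        StringBuilder sql = new StringBuilder();
        for (String line : caseFile.split("\n")) {
            if (!line.trim().startsWith("--")) {
                sql.append(line).append(' ');
            }
        }
        List<String> queries = new ArrayList<>();
        for (String query : sql.toString().split(";")) {
            if (!query.trim().isEmpty()) {
                queries.add(query.trim());
            }
        }
        return queries;
    }

    public static void main(String[] args) {
        String withoutSemicolon = "-- comment\nSELECT 1;SELECT 2\nSELECT 3\n";
        String withSemicolon = "-- comment\nSELECT 1;SELECT 2;\nSELECT 3\n";
        System.out.println(splitQueries(withoutSemicolon).size()); // 2: the last two lines merge
        System.out.println(splitQueries(withSemicolon).size());    // 3
    }
}
```

Under this assumption, a line break alone does not end a query, so each line that should end a query needs a trailing `;`.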
88 changes: 88 additions & 0 deletions systemtest/pom.xml
@@ -0,0 +1,88 @@
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>

    <parent>
        <groupId>io.github.myui</groupId>
        <artifactId>hivemall</artifactId>
        <version>0.4.2-rc.2</version>
        <relativePath>../pom.xml</relativePath>
    </parent>

    <artifactId>hivemall-systemtest</artifactId>
    <name>System test for Hivemall</name>
    <packaging>jar</packaging>

    <dependencies>
        <dependency>
            <groupId>io.github.myui</groupId>
            <artifactId>hivemall-core</artifactId>
            <version>0.4.2-rc.2</version>
        </dependency>
        <dependency>
            <groupId>junit</groupId>
            <artifactId>junit</artifactId>
            <version>4.12</version>
        </dependency>
        <dependency>
            <groupId>com.klarna</groupId>
            <artifactId>hiverunner</artifactId>
            <version>3.0.0</version>
        </dependency>
        <dependency>
            <groupId>com.treasuredata.client</groupId>
            <artifactId>td-client</artifactId>
            <version>0.7.25</version>
            <classifier>jar-with-dependencies</classifier>
        </dependency>
        <dependency>
            <groupId>org.apache.commons</groupId>
            <artifactId>commons-csv</artifactId>
            <version>1.1</version>
        </dependency>
        <dependency>
            <groupId>org.msgpack</groupId>
            <artifactId>msgpack-core</artifactId>
            <version>0.8.9</version>
        </dependency>
        <dependency>
            <groupId>org.hamcrest</groupId>
            <artifactId>hamcrest-library</artifactId>
            <version>1.3</version>
        </dependency>
    </dependencies>

    <build>
        <directory>target</directory>
        <outputDirectory>target/classes</outputDirectory>
        <finalName>${project.artifactId}-${project.version}</finalName>
        <testOutputDirectory>target/test-classes</testOutputDirectory>
        <pluginManagement>
            <plugins>
                <plugin>
                    <groupId>com.mycila</groupId>
                    <artifactId>license-maven-plugin</artifactId>
                    <version>2.8</version>
                    <configuration>
                        <header>${project.parent.basedir}/resources/license-header.txt</header>
                        <properties>
                            <currentYear>${build.year}</currentYear>
                            <copyrightOwner>${project.organization.name}</copyrightOwner>
                        </properties>
                        <includes>
                            <include>src/main/**/*.java</include>
                            <include>src/test/**/*.java</include>
                        </includes>
                        <encoding>UTF-8</encoding>
                        <headerDefinitions>
                            <headerDefinition>${project.parent.basedir}/resources/header-definition.xml
                            </headerDefinition>
                        </headerDefinitions>
                    </configuration>
                </plugin>
            </plugins>
        </pluginManagement>
    </build>
</project>
33 changes: 33 additions & 0 deletions systemtest/src/main/java/com/klarna/hiverunner/Extractor.java
@@ -0,0 +1,33 @@
/*
 * Hivemall: Hive scalable Machine Learning Library
 *
 * Copyright (C) 2016 Makoto YUI
 * Copyright (C) 2013-2015 National Institute of Advanced Industrial Science and Technology (AIST)
 *
 * Licensed under the Apache License, Version 2.0 (the "License");
 * you may not use this file except in compliance with the License.
 * You may obtain a copy of the License at
 *
 *         http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */
package com.klarna.hiverunner;

import com.klarna.hiverunner.config.HiveRunnerConfig;
import org.junit.rules.TemporaryFolder;

/**
 * Exposes HiveRunner's server context and container to the system test code.
 * The class is declared in the {@code com.klarna.hiverunner} package, presumably so that it
 * can call constructors that are not accessible from other packages.
 */
public class Extractor {
    /** Creates a standalone Hive server context rooted at {@code basedir}. */
    public static StandaloneHiveServerContext getStandaloneHiveServerContext(
            TemporaryFolder basedir, HiveRunnerConfig hiveRunnerConfig) {
        return new StandaloneHiveServerContext(basedir, hiveRunnerConfig);
    }

    /** Creates a Hive server container for the given context. */
    public static HiveServerContainer getHiveServerContainer(HiveServerContext context) {
        return new HiveServerContainer(context);
    }
}