Drop BigQuery and Teradata JDBC dependencies from the distribution; add tests, various improvements (#408)

* Use and configure license-maven-plugin (org.honton.chas)

* First setup of distribution verification integration test

* Use Java 17 for compilation, updates of test dependencies, update license validation config

* Update comment on CacioTest annotation

* Cleanup

* Add generating fat jars for WhiteRabbit and RabbitInAHat; lock hsqldb version for Java 1.8

* Enforce Java 1.8 for distributed dependencies

* Update main.yml

The project now requires Java 17 to build. It should still produce Java 8 (1.8) compatible artifacts, though.

* Bump org.apache.avro:avro from 1.11.2 to 1.11.3 in /rabbit-core

Bumps org.apache.avro:avro from 1.11.2 to 1.11.3.

---
updated-dependencies:
- dependency-name: org.apache.avro:avro
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>

* Use jdk8 classifier for hsqldb 2.7.x

* Exclude older version of hsqldb

* Fix image crop when using stem table

* Update stem table image

* Decrease size of table panel when using stem table.

Without this change, the table panel height is always higher than
needed (when using stem table), because the stem table is counted
as one of the items in the components list. It is however shown
separately at the top, which is already accounted for by the
stem table margin.

* Add snowflake support (#37)

* Refactor RichConnection into separate classes, and add an abstraction for the JDBC connection. Implement a Snowflake connection with this abstraction

* Add unit tests for SnowflakeConnector

* Added Snowflake support for SourceDataScan; added minimal test for it; some refactorings to move database responsibility to rabbit-core/databases

* Move more database details to rabbit-core/databases

* Clearer name for method

* Ignore snowflake.env

* Create PostgreSQL container in the TestContainers way

* Refactored Snowflake tests + a bit of documentation

* Fix Snowflake test for Java 17, and make it into an automated integration test instead of a unit test

* Remove duplicate postgresql test

* Make TestContainers based database tests into automated integration tests

* Suppress some warnings when generating fat jars

* Let automatic integration tests fail when docker is not available

* Allow explicit skipping of Snowflake integration tests

* Added tests for Snowflake, delimited text files

* Switch to fully verifying the scan results against a reference version (v0.10.7)

* Working integration test for Snowflake, and some refactorings

* Some proper logging, small code improvements and cleanup

* Remove unused interface

* Added tests, some changes to support testing

* Make automated test work reliably (way too many changes, sorry)

* Rudimentary support for Snowflake authenticator parameter (untested)

* review xmlbeans dependencies, remove conflict

* extend integration test for distribution

* Restructuring database configuration. Work in progress, but unit and integration tests all OK

* Restructuring database configuration 2/x. Still work in progress, but unit and integration tests all OK

* Restructuring database configuration 3/x. Still work in progress, but unit and integration tests all OK

* Restructuring database configuration 4/x. Still work in progress, but unit and integration tests all OK

* Restructuring database configuration 5/x. Still work in progress, but unit and integration tests all OK

* Restructuring database configuration 6/x. Still work in progress, but unit and integration tests all OK

* Restructuring database configuration 7/x. Still work in progress, but unit and integration tests all OK

* Intermezzo: get rid of the package naming error (upper case R in whiteRabbit)

* Intermezzo: code cleanup

* Snowflake is now working from the GUI. And many small refactorings, like logging instead of printing to stdout/stderr

* Refactor DbType into an enum, get rid of DBChoice

* Move DbType and DbSettings classes into configuration subpackage

* Avoid using a manually destructured DbSettings object when creating a RichConnection object

* Code cleanup, remove unneeded Snowflake references

* Refactoring, code cleanup

* More refactoring, code cleanup

* More refactoring, code cleanup and documentation

* Make sure that order of databases in pick list in GUI is the same as before, and enforce completeness of that list in a test

* Add/update copyright headers

* Add line to verify that a tooltip is shown for a DBConnectionInterface implementing class

* Test distribution for Snowflake JDBC issue with Java 17

* cleanup of build files

* Add verification that all JDBC drivers are in the distributed package

* Add/improve error reporting for Snowflake

* Disable screenshottaker in GuiTestExtension, hoping that that is what blocks the build on github. Fingers crossed

* Better(?) naming for database interface and implementing class

* Use our own GUITestExtension class

---------

Co-authored-by: Jan Blom <janblom@thehyve.nl>

* Add mysql test (#38)

* Fixed a bug in the comparison for sort; let the comparison report all differences before failing

* Allow the user to specify the port for a MySQL server

* Add tests for a MySQL source database

* Add sas test (#39)

* Add automated regression tests for SAS files

* Fix problems with comparisons of test results to references

* create bypass for value mismatch that only shows up in github actions so far

* create bypass for value mismatch that only shows up in github actions so far, 2nd

* Pom updates to enable building on MacOS

* Prepare release (#40)

* Add warehouse/database handling to StorageHandler class

* Show stdout/stderr from distribution verification when there are errors

* Pom updates to enable building on MacOS

* Update dependencies as far as possible without code changes

* Update README.md

---------

Co-authored-by: Jan Blom <janblom@thehyve.nl>

* Update whiterabbit/src/main/java/org/ohdsi/whiterabbit/WhiteRabbitMain.java

The sample size should start disabled, as the calculateNumericStats checkbox is unchecked by default.

Co-authored-by: Maxim Moinat <maximmoinat@gmail.com>

* Fixes from windows (#41)

* Fix problems blocking verification on Windows

* Avoid using bind mounts for TestContainers, copy files instead

* Remove file copy (was for debugging purposes)

* Oracle Tests: use the actual TestContainer hostname/ip address instead of localhost

* Remove debug print statement and stale imports

* Remove commented code

---------

Co-authored-by: Jan Blom <janblom@thehyve.nl>

* Use The Hyve fork of the caciocavello project (#42)

* Use The Hyve fork of the caciocavello project (Swing virtual graphics environment for testing) until the parent project has been fixed for JDK 18+

* Use updated cacio-tta version, should run fine when headless

* For development, JDK versions 17-21 are supported

* Update docs (#44)

* Update documentation for Snowflake

* Add Snowflake.ini example file

* Add password field in Snowflake example

* Bump org.apache.commons:commons-compress in /rabbit-core (#47)

Bumps org.apache.commons:commons-compress from 1.25.0 to 1.26.0.

---
updated-dependencies:
- dependency-name: org.apache.commons:commons-compress
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Bump org.postgresql:postgresql from 42.7.1 to 42.7.2 in /rabbit-core (#46)

Bumps [org.postgresql:postgresql](https://github.com/pgjdbc/pgjdbc) from 42.7.1 to 42.7.2.
- [Release notes](https://github.com/pgjdbc/pgjdbc/releases)
- [Changelog](https://github.com/pgjdbc/pgjdbc/blob/master/CHANGELOG.md)
- [Commits](https://github.com/pgjdbc/pgjdbc/commits)

---
updated-dependencies:
- dependency-name: org.postgresql:postgresql
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Turn on mvn verify in github (#45)

* Turn on mvn verify in github

* Disable Oracle integration tests in Github workflow as they tend to generate a timeout

* Fix typo

* Avoid starting containers for Snowflake when configuration is not present

* Switch cacio back to parent project (#48)

* Use cacio-tta 1.18 instead of patched fork

* Remove license exception for The Hyve fork of cacio-tta

* Remove bigquery jars (#49)

* Initial setup for BigQuery integration test (WIP)

* Initial setup for BigQuery integration test (WIP)

* Removed BigQuery JDBC jars, added test to confirm it missing, and the way to make it work

* Make Teradata JDBC dependency a normal maven repo dependency. Include a basic tester, and remove lib directory

* Rename Teradata test, it only tests getting tablenames, does not perform a scan

* Add basic connection test for MS Sql Server

* Round up removing BigQuery and Teradata (licences do not allow redistribution). Better feedback to user, added to documentation

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: Jan Blom <janblom@thehyve.nl>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Spayralbe <stefan@thehyve.nl>
Co-authored-by: Maxim Moinat <maximmoinat@gmail.com>
5 people committed Feb 28, 2024
1 parent 9790cef commit 56eca88
Showing 173 changed files with 638 additions and 737 deletions.
5 changes: 4 additions & 1 deletion .github/workflows/main.yml
@@ -1,6 +1,9 @@
# Continuous integration, including test and integration test
name: CI

env:
  SKIP_ORACLE_TESTS: true

# Run in master and dev branches and in all pull requests to those branches
on:
push:
@@ -30,4 +33,4 @@ jobs:

# Gradle check
- name: Check
run: mvn test
run: mvn verify
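
The diff does not show how the tests read `SKIP_ORACLE_TESTS`. As a minimal sketch, assuming the flag is interpreted as a plain boolean string, a test helper might look like the following. The class and method names are hypothetical and not taken from the WhiteRabbit code base.

```java
import java.util.Map;

/** Hypothetical helper (not from WhiteRabbit): reads a boolean skip flag from an environment map. */
final class TestSkipFlags {
    private TestSkipFlags() {}

    /** Returns true only when the variable is present and equals "true", ignoring case. */
    static boolean isSkipRequested(Map<String, String> env, String variableName) {
        // Boolean.parseBoolean(null) is false, so an unset variable means "do not skip".
        return Boolean.parseBoolean(env.get(variableName));
    }

    public static void main(String[] args) {
        boolean skipOracle = isSkipRequested(System.getenv(), "SKIP_ORACLE_TESTS");
        System.out.println("Skip Oracle tests: " + skipOracle);
    }
}
```

Passing the environment in as a map keeps the helper easy to exercise without changing the real process environment.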
2 changes: 2 additions & 0 deletions .gitignore
@@ -24,3 +24,5 @@ data/

# contains authentication data for a Snowflake instance
snowflake.env
# contains authentication data for a BigQuery instance
bigquery.env
16 changes: 16 additions & 0 deletions README.md
@@ -45,6 +45,22 @@ Requires Java 1.8 or higher for running, and read access to the database to be s
Dependencies
============
For the distributable packages, the only requirement is Java 8. For building the package, Java 17+ and Maven are needed.
There are exceptions for databases that use a JDBC driver with a license that does not allow distribution of the driver.
(BigQuery, Teradata)

**BigQuery**

If you want to use a BigQuery instance as the source database, after installing WhiteRabbit, you will need to download
a zip file with the BigQuery JDBC driver, and unzip it in the `repo` directory of the WhiteRabbit installation.
The latest version tested with WhiteRabbit is 1.5.2.1005.
The zip file can be downloaded [here](https://storage.googleapis.com/simba-bq-release/jdbc/SimbaJDBCDriverforGoogleBigQuery42_1.5.2.1005.zip).

**Teradata**

If you want to use a Teradata instance as the source database, after installing WhiteRabbit, you will need to download
a zip file with the Teradata JDBC driver, and unzip it in the `repo` directory of the WhiteRabbit installation.
The latest version tested with WhiteRabbit is 20.00.00.16.
The zip file can be downloaded [here](https://downloads.teradata.com/download/connectivity/jdbc-driver).
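
Because these drivers are no longer bundled, it can help to confirm that a driver is actually loadable before attempting a connection. The sketch below is one possible check, not WhiteRabbit's own implementation. The helper class is hypothetical, and the two driver class names are assumptions that should be verified against the documentation of the driver version you downloaded.

```java
/** Hypothetical helper (not from WhiteRabbit): checks whether a JDBC driver class can be loaded. */
final class JdbcDriverCheck {
    private JdbcDriverCheck() {}

    /** Returns true when the named class is found on the classpath, without initializing it. */
    static boolean isDriverAvailable(String driverClassName) {
        try {
            Class.forName(driverClassName, false, JdbcDriverCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Assumed driver class names; check them against the vendor documentation.
        String[] candidates = {
            "com.teradata.jdbc.TeraDriver",
            "com.simba.googlebigquery.jdbc42.Driver"
        };
        for (String name : candidates) {
            System.out.println(name + (isDriverAvailable(name) ? " is available" : " is missing"));
        }
    }
}
```

To get a meaningful answer, run it with the jars from the `repo` directory on the classpath.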

Getting Started
===============
7 changes: 7 additions & 0 deletions docs/WhiteRabbit.html
@@ -496,6 +496,8 @@ <h4>PostgreSQL</h4>
</div>
<div id="google-bigquery" class="section level4">
<h4>Google BigQuery</h4>
<p>If you want to use a BigQuery instance as the source database, after installing WhiteRabbit, you will need to download a zip file with the BigQuery JDBC driver, and unzip it in the <code>repo</code> directory of the WhiteRabbit installation. The latest version tested with WhiteRabbit is 1.5.2.1005.</p>
<p>The zip file can be downloaded <a href="https://storage.googleapis.com/simba-bq-release/jdbc/SimbaJDBCDriverforGoogleBigQuery42_1.5.2.1005.zip">here</a></p>
<p>Google BigQuery (GBQ) supports two different connection/authentication methods: application default credentials and service account authentication. The former method is considered more secure because it writes auditing events to stackdriver. The specific method used is determined by the arguments provided to the configuration panel as described below.</p>
<p>Authentication via application default credentials:</p>
<p>When using application default credentials authentication, you must run the following gcloud command in the user account only once: <code>gcloud auth application-default login</code> (do not include the single quote characters). An application key is written to <code>~/.config/gcloud/application_default_credentials.json</code>.</p>
@@ -534,6 +536,11 @@ <h4>Snowflake</h4>
</ul>
<p>Please note that the fields <em><strong>Password</strong></em> and <em><strong>Authentication method</strong></em> are mutually exclusive: for only one of these fields a value should be supplied. A warning will be given when a value is supplied for both fields.</p>
</div>
<div id="teradata" class="section level4">
<h4>Teradata</h4>
<p>If you want to use a Teradata instance as the source database, after installing WhiteRabbit, you will need to download a zip file with the Teradata JDBC driver, and unzip it in the <code>repo</code> directory of the WhiteRabbit installation. The latest version tested with WhiteRabbit is 20.00.00.16.</p>
<p>The zip file can be downloaded <a href="https://downloads.teradata.com/download/connectivity/jdbc-driver">here</a></p>
</div>
</div>
</div>
<div id="scanning-a-database" class="section level2">
14 changes: 14 additions & 0 deletions docs/WhiteRabbit.md
@@ -133,6 +133,12 @@ When the SQL Server JDBC drivers are installed, you can also use Windows authent

#### Google BigQuery

If you want to use a BigQuery instance as the source database, after installing WhiteRabbit, you will need to download
a zip file with the BigQuery JDBC driver, and unzip it in the `repo` directory of the WhiteRabbit installation.
The latest version tested with WhiteRabbit is 1.5.2.1005.

The zip file can be downloaded [here](https://storage.googleapis.com/simba-bq-release/jdbc/SimbaJDBCDriverforGoogleBigQuery42_1.5.2.1005.zip).

Google BigQuery (GBQ) supports two different connection/authentication methods: application default credentials and service account authentication.
The former method is considered more secure because it writes auditing events to stackdriver.
The specific method used is determined by the arguments provided to the configuration panel as described below.
@@ -173,6 +179,14 @@ Authentication via service account credentials:
Please note that the fields _**Password**_ and _**Authentication method**_ are mutually exclusive: for only one of these fields
a value should be supplied. A warning will be given when a value is supplied for both fields.
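
The mutual-exclusion rule above can be expressed as a small validation step. This is a sketch under the assumption that both fields arrive as plain strings and that blank input counts as "not supplied". The class name, method name, and warning text are hypothetical and do not reflect how WhiteRabbit implements the check.

```java
import java.util.Optional;

/** Hypothetical validator (not from WhiteRabbit) for the Password / Authentication method rule. */
final class SnowflakeFieldCheck {
    private SnowflakeFieldCheck() {}

    /** Treats null and whitespace-only input as "no value supplied". */
    private static boolean hasValue(String field) {
        return field != null && !field.trim().isEmpty();
    }

    /** Returns a warning when both fields carry a value, otherwise an empty Optional. */
    static Optional<String> warningFor(String password, String authenticationMethod) {
        if (hasValue(password) && hasValue(authenticationMethod)) {
            return Optional.of("Supply either a password or an authentication method, not both.");
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(warningFor("secret", "externalbrowser").orElse("no warning"));
        System.out.println(warningFor("secret", "").orElse("no warning"));
    }
}
```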

#### Teradata

If you want to use a Teradata instance as the source database, after installing WhiteRabbit, you will need to download
a zip file with the Teradata JDBC driver, and unzip it in the `repo` directory of the WhiteRabbit installation.
The latest version tested with WhiteRabbit is 20.00.00.16.

The zip file can be downloaded [here](https://downloads.teradata.com/download/connectivity/jdbc-driver).
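
After unzipping a driver, a quick way to confirm that its jar files landed in the `repo` directory is to list them. The sketch below assumes the jars sit directly in that directory rather than in a subfolder, which depends on the layout of the zip file. The class name and the `tera` name fragment are illustrative assumptions.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical helper (not from WhiteRabbit): lists jar files in a directory by name fragment. */
final class RepoJarFinder {
    private RepoJarFinder() {}

    /** Returns the top-level .jar files in repoDir whose file name contains nameFragment. */
    static List<Path> findJars(Path repoDir, String nameFragment) throws IOException {
        String needle = nameFragment.toLowerCase(Locale.ROOT);
        try (Stream<Path> entries = Files.list(repoDir)) {
            return entries
                .filter(Files::isRegularFile)
                .filter(p -> {
                    String name = p.getFileName().toString().toLowerCase(Locale.ROOT);
                    return name.endsWith(".jar") && name.contains(needle);
                })
                .sorted()
                .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        // Default to a "repo" directory relative to the working directory.
        Path repoDir = Path.of(args.length > 0 ? args[0] : "repo");
        if (!Files.isDirectory(repoDir)) {
            System.out.println("Not a directory: " + repoDir);
            return;
        }
        // "tera" is an assumed fragment of the Teradata driver jar name.
        System.out.println(findJars(repoDir, "tera"));
    }
}
```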

## Scanning a Database

### Performing the Scan
10 changes: 10 additions & 0 deletions docs/runmarkdown.R
@@ -0,0 +1,10 @@
#!/usr/bin/env Rscript

#
# use this script (on linux/mac) to reformat the html pages from the markdown files
# only needed when you change .md files
#

#devtools::install_github("ropenscilabs/icon")
library(rmarkdown)
rmarkdown::render_site()
Binary file not shown.

This file was deleted.

8 changes: 0 additions & 8 deletions lib/com/simba/googlebigquery/jdbc/avro/1.8.2/avro-1.8.2.pom

This file was deleted.

12 changes: 0 additions & 12 deletions lib/com/simba/googlebigquery/jdbc/avro/maven-metadata.xml

This file was deleted.

8 changes: 0 additions & 8 deletions lib/com/simba/googlebigquery/jdbc/gax/1.42.0/gax-1.42.0.pom

This file was deleted.

12 changes: 0 additions & 12 deletions lib/com/simba/googlebigquery/jdbc/gax/maven-metadata.xml

This file was deleted.
