[KYUUBI #6281][PY] Initialize github action for python unit testing
# 🔍 Description
## Issue References 🔗

This pull request fixes #6281

## Describe Your Solution 🔧

This change initializes a CI job that runs the unit tests for the Python client. It includes:
- Set up a GitHub Actions workflow based on docker-compose
- Update the test cases so that the tests pass for the `presto` and `trino` dialects
- Temporarily disable the Hive-related tests, because the test cases themselves are invalid (the failures are not connection-related)
- Update the dev dependencies to support Python 3.10
- Speed up testing with the `pytest-xdist` plugin (removed again as out of scope in commit 682e575)

## Types of changes 🔖

- [ ] Bugfix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)

## Test Plan 🧪

#### Behavior Without This Pull Request ⚰️
Unit tests cannot be run locally or on CI

#### Behavior With This Pull Request 🎉
Unit tests can be run, and they cover part of the test cases

#### Related Unit Tests
No

## Additional notes
The next step is to fix the failing test cases, or to skip some of them if necessary

---

# Checklist 📝

- [ ] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)

**Be nice. Be informative.**

Closes #6343 from sudohainguyen/ci/init.

Closes #6281

682e575 [Harry] Remove xdist out of scope
dc42ca1 [Harry] Pin pytest packages version
469f1d9 [Harry] Pin ubuntu version
00cef47 [Harry] Use v4 checkout action
96ef831 [Harry] Remove unnecessary steps
732344a [Harry] Add step to tear down containers
1e2c248 [Harry] Resolved trino and presto test
5b33e39 [Harry] Make tests runnable
1be033b [Harry] Remove randome flag which causes failed test run
2bc6dc0 [Harry] Switch action setup provider to docker
ea2a763 [Harry] Initialize github action for python unit testing

Authored-by: Harry <quanghai.ng1512@gmail.com>
Signed-off-by: Cheng Pan <chengpan@apache.org>
sudohainguyen authored and pan3793 committed May 7, 2024
1 parent 28d8c8e commit 9075fbb
Showing 26 changed files with 210 additions and 126 deletions.
65 changes: 65 additions & 0 deletions .github/workflows/python.yml
@@ -0,0 +1,65 @@
#
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#

name: Python Client

on:
  push:
    branches:
      - master
      - branch-*
  pull_request:
    branches:
      - master
      - branch-*

concurrency:
  group: python-${{ github.head_ref || github.run_id }}
  cancel-in-progress: true

jobs:
  unit-test:
    runs-on: ubuntu-22.04
    strategy:
      fail-fast: false
      matrix:
        python-version: ["3.8", "3.9", "3.10"]
    env:
      PYTHONHASHSEED: random
    steps:
      - uses: actions/checkout@v4
      - name: Set up Python ${{ matrix.python-version }}
        uses: actions/setup-python@v2
        with:
          python-version: ${{ matrix.python-version }}
      - name: Start Testing Containers
        run: |
          cd python/docker/
          docker compose up -d --wait
          docker compose exec hive-server /opt/hive/scripts/make_test_tables.sh
      - name: Install dependencies
        run: |
          cd python
          ./scripts/install-deps.sh
      - name: Run tests
        run: |
          cd python
          pytest -v
      - name: Tear down Containers
        run: |
          cd python/docker/
          docker compose down --volumes
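The workflow relies on `docker compose up -d --wait` to bring the services up before the tests start. As a rough illustration only (this helper is not part of the commit), a local run could double-check that the published ports are reachable before invoking `pytest`. The function name `wait_for_port` and the `SERVICE_PORTS` mapping are hypothetical; the port numbers are the ones published in `python/docker/docker-compose.yml`.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 60.0, interval: float = 1.0) -> bool:
    """Return True once a TCP connection to (host, port) succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect only proves the port is open, not that the
            # service behind it is ready to answer queries.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)


# Hypothetical mapping; the ports are those published by docker-compose.yml.
SERVICE_PORTS = {"hive-server": 10000, "presto-coordinator": 8080, "trino": 18080}
```

A caller could loop over `SERVICE_PORTS` and abort early when `wait_for_port("localhost", port)` returns `False`, which tends to give a clearer error than a failing test deep inside a DB-API call.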
19 changes: 19 additions & 0 deletions python/.gitignore
@@ -0,0 +1,19 @@
cover/
.coverage
/dist/
/build/
.DS_Store
*.egg
/env/
/htmlcov/
.idea/
.project
*.pyc
.pydevproject
/*.egg-info/
.settings
.cache/
*.iml
/scripts/.thrift_gen
venv/
.envrc
8 changes: 3 additions & 5 deletions python/dev_requirements.txt
Expand Up @@ -2,13 +2,11 @@
flake8==3.4.1
mock==2.0.0
pycodestyle==2.3.1
pytest==3.2.1
pytest-cov==2.5.1
pytest-flake8==0.8.1
pytest-random==0.2
pytest-timeout==1.2.0
pytest==7.4.4
pytest-cov==5.0.0

# actual dependencies: let things break if a package changes
sqlalchemy>=1.3.0
requests>=1.0.0
requests_kerberos>=0.12.0
sasl>=0.2.1
Expand Down
2 changes: 2 additions & 0 deletions python/docker/conf/presto/catalog/hive.properties
@@ -0,0 +1,2 @@
connector.name=hive-hadoop2
hive.metastore.uri=thrift://hive-metastore:9083
File renamed without changes.
File renamed without changes.
File renamed without changes.
2 changes: 2 additions & 0 deletions python/docker/conf/trino/catalog/hive.properties
@@ -0,0 +1,2 @@
connector.name=hive-hadoop2
hive.metastore.uri=thrift://hive-metastore:9083
File renamed without changes.
File renamed without changes.
File renamed without changes.
61 changes: 61 additions & 0 deletions python/docker/docker-compose.yml
@@ -0,0 +1,61 @@
version: "3"

services:
  namenode:
    image: bde2020/hadoop-namenode:2.0.0-hadoop2.7.4-java8
    volumes:
      - namenode:/hadoop/dfs/name
    environment:
      - CLUSTER_NAME=test
    env_file:
      - hadoop-hive.env
    ports:
      - "50070:50070"
  datanode:
    image: bde2020/hadoop-datanode:2.0.0-hadoop2.7.4-java8
    volumes:
      - datanode:/hadoop/dfs/data
    env_file:
      - hadoop-hive.env
    environment:
      SERVICE_PRECONDITION: "namenode:50070"
    ports:
      - "50075:50075"
  hive-server:
    image: bde2020/hive:2.3.2-postgresql-metastore
    env_file:
      - hadoop-hive.env
    volumes:
      - ../scripts:/opt/hive/scripts
    environment:
      HIVE_CORE_CONF_javax_jdo_option_ConnectionURL: "jdbc:postgresql://hive-metastore/metastore"
      SERVICE_PRECONDITION: "hive-metastore:9083"
    ports:
      - "10000:10000"
  hive-metastore:
    image: bde2020/hive:2.3.2-postgresql-metastore
    env_file:
      - hadoop-hive.env
    command: /opt/hive/bin/hive --service metastore
    environment:
      SERVICE_PRECONDITION: "namenode:50070 datanode:50075 hive-metastore-postgresql:5432"
    ports:
      - "9083:9083"
  hive-metastore-postgresql:
    image: bde2020/hive-metastore-postgresql:2.3.0
  presto-coordinator:
    image: shawnzhu/prestodb:0.181
    ports:
      - "8080:8080"
    volumes:
      - ./conf/presto/:/etc/presto
  trino:
    image: trinodb/trino:351
    ports:
      - "18080:18080"
    volumes:
      - ./conf/trino:/etc/trino

volumes:
  namenode:
  datanode:
30 changes: 30 additions & 0 deletions python/docker/hadoop-hive.env
@@ -0,0 +1,30 @@
HIVE_SITE_CONF_javax_jdo_option_ConnectionURL=jdbc:postgresql://hive-metastore-postgresql/metastore
HIVE_SITE_CONF_javax_jdo_option_ConnectionDriverName=org.postgresql.Driver
HIVE_SITE_CONF_javax_jdo_option_ConnectionUserName=hive
HIVE_SITE_CONF_javax_jdo_option_ConnectionPassword=hive
HIVE_SITE_CONF_datanucleus_autoCreateSchema=false
HIVE_SITE_CONF_hive_metastore_uris=thrift://hive-metastore:9083
HDFS_CONF_dfs_namenode_datanode_registration_ip___hostname___check=false

CORE_CONF_fs_defaultFS=hdfs://namenode:8020
CORE_CONF_hadoop_http_staticuser_user=root
CORE_CONF_hadoop_proxyuser_hue_hosts=*
CORE_CONF_hadoop_proxyuser_hue_groups=*

HDFS_CONF_dfs_webhdfs_enabled=true
HDFS_CONF_dfs_permissions_enabled=false

YARN_CONF_yarn_log___aggregation___enable=true
YARN_CONF_yarn_resourcemanager_recovery_enabled=true
YARN_CONF_yarn_resourcemanager_store_class=org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore
YARN_CONF_yarn_resourcemanager_fs_state___store_uri=/rmstate
YARN_CONF_yarn_nodemanager_remote___app___log___dir=/app-logs
YARN_CONF_yarn_log_server_url=http://historyserver:8188/applicationhistory/logs/
YARN_CONF_yarn_timeline___service_enabled=true
YARN_CONF_yarn_timeline___service_generic___application___history_enabled=true
YARN_CONF_yarn_resourcemanager_system___metrics___publisher_enabled=true
YARN_CONF_yarn_resourcemanager_hostname=resourcemanager
YARN_CONF_yarn_timeline___service_hostname=historyserver
YARN_CONF_yarn_resourcemanager_address=resourcemanager:8032
YARN_CONF_yarn_resourcemanager_scheduler_address=resourcemanager:8030
YARN_CONF_yarn_resourcemanager_resource__tracker_address=resourcemanager:8031
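The variable names in `hadoop-hive.env` encode Hadoop and Hive property names. The sketch below shows the naming convention as I understand it from the bde2020 image entrypoints: a triple underscore stands for a dash, a double underscore for a literal underscore, and a single underscore for a dot. The commit itself does not document this, so treat the rule, the helper name `env_to_property`, and the prefix-to-file mapping as assumptions.

```python
from typing import Tuple

# Assumed mapping from variable prefix to the config file it is written into.
PREFIX_TO_FILE = {
    "CORE_CONF": "core-site.xml",
    "HDFS_CONF": "hdfs-site.xml",
    "YARN_CONF": "yarn-site.xml",
    "HIVE_SITE_CONF": "hive-site.xml",
}


def env_to_property(env_name: str) -> Tuple[str, str]:
    """Translate an env variable name into (config file, property name)."""
    for prefix, target in PREFIX_TO_FILE.items():
        if env_name.startswith(prefix + "_"):
            raw = env_name[len(prefix) + 1:]
            break
    else:
        raise ValueError(f"unrecognized prefix in {env_name!r}")
    # Order matters: handle '___' first, park '__' on a placeholder,
    # turn the remaining '_' into '.', then restore the placeholder as '_'.
    name = (
        raw.replace("___", "-")
        .replace("__", "\0")
        .replace("_", ".")
        .replace("\0", "_")
    )
    return target, name


print(env_to_property("YARN_CONF_yarn_log___aggregation___enable"))
```

Under this rule `HDFS_CONF_dfs_namenode_datanode_registration_ip___hostname___check` would become `dfs.namenode.datanode.registration.ip-hostname-check` in `hdfs-site.xml`.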
16 changes: 10 additions & 6 deletions python/pyhive/tests/test_hive.py
Expand Up @@ -17,6 +17,7 @@
 from decimal import Decimal

 import mock
+import pytest
 import thrift.transport.TSocket
 import thrift.transport.TTransport
 import thrift_sasl
Expand All @@ -30,11 +31,12 @@
 _HOST = 'localhost'


+@pytest.mark.skip(reason="Temporary disabled")
 class TestHive(unittest.TestCase, DBAPITestCase):
     __test__ = True

     def connect(self):
-        return hive.connect(host=_HOST, configuration={'mapred.job.tracker': 'local'})
+        return hive.connect(host=_HOST, port=10000, configuration={'mapred.job.tracker': 'local'})

     @with_cursor
     def test_description(self, cursor):
Expand Down Expand Up @@ -151,10 +153,11 @@ def test_no_result_set(self, cursor):
         self.assertIsNone(cursor.description)
         self.assertRaises(hive.ProgrammingError, cursor.fetchone)

+    @pytest.mark.skip
     def test_ldap_connection(self):
         rootdir = os.path.dirname(os.path.dirname(os.path.dirname(__file__)))
-        orig_ldap = os.path.join(rootdir, 'scripts', 'travis-conf', 'hive', 'hive-site-ldap.xml')
-        orig_none = os.path.join(rootdir, 'scripts', 'travis-conf', 'hive', 'hive-site.xml')
+        orig_ldap = os.path.join(rootdir, 'scripts', 'conf', 'hive', 'hive-site-ldap.xml')
+        orig_none = os.path.join(rootdir, 'scripts', 'conf', 'hive', 'hive-site.xml')
         des = os.path.join('/', 'etc', 'hive', 'conf', 'hive-site.xml')
         try:
             subprocess.check_call(['sudo', 'cp', orig_ldap, des])
Expand Down Expand Up @@ -209,11 +212,12 @@ def test_custom_transport(self):
         with contextlib.closing(conn.cursor()) as cursor:
             cursor.execute('SELECT * FROM one_row')
             self.assertEqual(cursor.fetchall(), [(1,)])
-
+
+    @pytest.mark.skip
     def test_custom_connection(self):
         rootdir = os.path.dirname(os.path.dirname(os.path.dirname(__file__)))
-        orig_ldap = os.path.join(rootdir, 'scripts', 'travis-conf', 'hive', 'hive-site-custom.xml')
-        orig_none = os.path.join(rootdir, 'scripts', 'travis-conf', 'hive', 'hive-site.xml')
+        orig_ldap = os.path.join(rootdir, 'scripts', 'conf', 'hive', 'hive-site-custom.xml')
+        orig_none = os.path.join(rootdir, 'scripts', 'conf', 'hive', 'hive-site.xml')
         des = os.path.join('/', 'etc', 'hive', 'conf', 'hive-site.xml')
         try:
             subprocess.check_call(['sudo', 'cp', orig_ldap, des])
Expand Down
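The hunks above disable tests both at class level (`TestHive`) and at method level (`test_ldap_connection`, `test_custom_connection`) with `pytest.mark.skip`. To illustrate the difference between the two scopes without requiring pytest, the sketch below swaps in the standard library's `unittest.skip`, which behaves analogously for this purpose. The class and helper names are hypothetical and nothing here is taken from the commit.

```python
import io
import unittest


@unittest.skip("Temporarily disabled")  # class level: every test method is skipped
class TestDisabledSuite(unittest.TestCase):
    def test_one(self):
        self.fail("never runs")

    def test_two(self):
        self.fail("never runs")


class TestMixed(unittest.TestCase):
    def test_runs(self):
        self.assertEqual(1 + 1, 2)

    @unittest.skip("needs an external service")  # method level: only this test is skipped
    def test_skipped(self):
        self.fail("never runs")


def run_case(case):
    """Run one TestCase class quietly and return its TestResult."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(case)
    return unittest.TextTestRunner(stream=io.StringIO()).run(suite)
```

Calling `run_case(TestDisabledSuite)` reports every method as skipped, while `run_case(TestMixed)` still executes `test_runs`. Giving each skip a reason string, as the class-level decorator in the commit does, keeps the skip visible in the test report.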
2 changes: 2 additions & 0 deletions python/pyhive/tests/test_presto.py
Expand Up @@ -13,6 +13,7 @@

 import requests

+import pytest
 from pyhive import exc
 from pyhive import presto
 from pyhive.tests.dbapi_test_case import DBAPITestCase
Expand Down Expand Up @@ -231,6 +232,7 @@ def test_invalid_kwargs(self):
             ).cursor()
         )

+    @pytest.mark.skip(reason='This test requires a proxy server running on localhost:9999')
     def test_requests_kwargs(self):
         connection = presto.connect(
             host=_HOST, port=_PORT, source=self.id(),
Expand Down
2 changes: 2 additions & 0 deletions python/pyhive/tests/test_sqlalchemy_hive.py
Expand Up @@ -5,6 +5,7 @@
 from pyhive.sqlalchemy_hive import HiveDecimal
 from pyhive.sqlalchemy_hive import HiveTimestamp
 from sqlalchemy.exc import NoSuchTableError, OperationalError
+import pytest
 from pyhive.tests.sqlalchemy_test_case import SqlAlchemyTestCase
 from pyhive.tests.sqlalchemy_test_case import with_engine_connection
 from sqlalchemy import types
Expand Down Expand Up @@ -60,6 +61,7 @@
 # ]


+@pytest.mark.skip(reason="Temporarily disabled")
 class TestSqlAlchemyHive(unittest.TestCase, SqlAlchemyTestCase):
     def create_engine(self):
         return create_engine('hive://localhost:10000/default')
Expand Down
6 changes: 3 additions & 3 deletions python/pyhive/tests/test_trino.py
Expand Up @@ -70,10 +70,10 @@ def test_complex(self, cursor):
             ('timestamp', 'timestamp', None, None, None, None, True),
             ('binary', 'varbinary', None, None, None, None, True),
             ('array', 'array(integer)', None, None, None, None, True),
-            ('map', 'map(integer,integer)', None, None, None, None, True),
-            ('struct', 'row(a integer,b integer)', None, None, None, None, True),
+            ('map', 'map(integer, integer)', None, None, None, None, True),
+            ('struct', 'row(a integer, b integer)', None, None, None, None, True),
             # ('union', 'varchar', None, None, None, None, True),
-            ('decimal', 'decimal(10,1)', None, None, None, None, True),
+            ('decimal', 'decimal(10, 1)', None, None, None, None, True),
         ])
         rows = cursor.fetchall()
         expected = [(
Expand Down
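The hunk above only changes whitespace inside the expected type signatures, presumably to match what the Trino container used in this setup returns. The commit pins the exact strings. As an alternative sketch (not what the commit does), a test could compare signatures after normalizing whitespace around commas and parentheses, so that either rendering is accepted. The helper name `normalize_type` is hypothetical.

```python
import re


def normalize_type(signature: str) -> str:
    """Strip whitespace around commas and parentheses in a SQL type signature."""
    s = re.sub(r"\s*,\s*", ",", signature.strip())
    s = re.sub(r"\s*\(\s*", "(", s)
    s = re.sub(r"\s*\)", ")", s)
    return s


# Both renderings from the hunk above normalize to the same string.
pairs = [
    ("map(integer,integer)", "map(integer, integer)"),
    ("row(a integer,b integer)", "row(a integer, b integer)"),
    ("decimal(10,1)", "decimal(10, 1)"),
]
for old, new in pairs:
    assert normalize_type(old) == normalize_type(new)
```

The trade-off is that a normalized comparison no longer detects a change in how the server formats its type names, which an exact-match test such as the one in this commit does catch.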
File renamed without changes.
File renamed without changes.
10 changes: 10 additions & 0 deletions python/scripts/install-deps.sh
@@ -0,0 +1,10 @@
#!/bin/bash -eux

source /etc/lsb-release

sudo apt-get -q update
sudo apt-get -q install -y g++ libsasl2-dev libkrb5-dev

pip install --upgrade pip
pip install -r dev_requirements.txt
pip install -e .
2 changes: 0 additions & 2 deletions python/scripts/travis-conf/presto/catalog/hive.properties

This file was deleted.

2 changes: 0 additions & 2 deletions python/scripts/travis-conf/trino/catalog/hive.properties

This file was deleted.
