Add initial HDFS tests (#1225)
* Add initial HDFS tests

Signed-off-by: Yong Tang <yong.tang.github@outlook.com>

* Install Java and libhdfs.so for tests

* Fix kokorun

Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
yongtang committed Dec 12, 2020
1 parent 71d9603 commit 9b5e233
Showing 5 changed files with 86 additions and 0 deletions.
15 changes: 15 additions & 0 deletions .github/workflows/build.wheel.sh
@@ -29,5 +29,20 @@ if [[ $(uname) == "Linux" ]]; then
apt-get -y -qq install $PYTHON_VERSION ffmpeg dnsutils libmp3lame0
curl -sSOL https://bootstrap.pypa.io/get-pip.py
$PYTHON_VERSION get-pip.py -q

# Install Java
apt-get -y -qq install openjdk-8-jdk
update-alternatives --config java
export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64

# Install Hadoop
curl -OL https://archive.apache.org/dist/hadoop/common/hadoop-2.7.0/hadoop-2.7.0.tar.gz
tar -xzf hadoop-2.7.0.tar.gz -C /usr/local
export HADOOP_HOME=/usr/local/hadoop-2.7.0

# Update environment variables
export LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:${JAVA_HOME}/jre/lib/amd64/server:${HADOOP_HOME}/lib/native
export CLASSPATH=$(${HADOOP_HOME}/bin/hadoop classpath --glob)
export
fi
run_test $PYTHON_VERSION
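
A quick way to sanity-check the exports above before run_test fires is a small environment probe. The sketch below is illustrative only and not part of this commit; the variable names mirror the script, and loading libhdfs.so via ctypes assumes the Hadoop native libraries and the JVM are both reachable through LD_LIBRARY_PATH.

# env_probe.py -- hypothetical helper, not part of this commit.
import ctypes
import os

def check_hdfs_env():
    # These are the variables exported in build.wheel.sh.
    for var in ("JAVA_HOME", "HADOOP_HOME", "CLASSPATH", "LD_LIBRARY_PATH"):
        value = os.environ.get(var)
        print("{}={}".format(var, value))
        assert value, "{} is not set".format(var)
    # libhdfs.so ships under ${HADOOP_HOME}/lib/native; dlopen succeeds only if
    # LD_LIBRARY_PATH also resolves libjvm.so from the JDK installed above.
    ctypes.CDLL("libhdfs.so")

if __name__ == "__main__":
    check_hdfs_env()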
1 change: 1 addition & 0 deletions .github/workflows/build.yml
@@ -298,6 +298,7 @@ jobs:
bash -x -e tests/test_sql/sql_test.sh
bash -x -e tests/test_gcloud/test_gcs.sh gcs-emulator
bash -x -e tests/test_pulsar/pulsar_test.sh
bash -x -e tests/test_hdfs/hdfs_test.sh
- name: Test Linux
run: |
set -x -e
1 change: 1 addition & 0 deletions .kokorun/io_cpu.sh
@@ -81,6 +81,7 @@ bash -x -e tests/test_azure/start_azure.sh
bash -x -e tests/test_sql/sql_test.sh sql
bash -x -e tests/test_elasticsearch/elasticsearch_test.sh start
bash -x -e tests/test_mongodb/mongodb_test.sh start
bash -x -e tests/test_hdfs/hdfs_test.sh

docker run -i --rm -v $PWD:/v -w /v --net=host \
buildpack-deps:20.04 bash -x -e .github/workflows/build.wheel.sh python${PYTHON_VERSION}
27 changes: 27 additions & 0 deletions tests/test_hdfs/hdfs_test.sh
@@ -0,0 +1,27 @@
#!/usr/bin/env bash
# Copyright 2020 The TensorFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================

set -e
set -o pipefail

HADOOP_VERSION=2.7.0
docker pull sequenceiq/hadoop-docker:$HADOOP_VERSION
docker run -d --rm --net=host --name=tensorflow-io-hdfs sequenceiq/hadoop-docker:$HADOOP_VERSION
echo "Waiting for 30 secs until hadoop is up and running"
sleep 30
docker logs tensorflow-io-hdfs
echo "Hadoop up"
exit 0
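
The script waits a fixed 30 seconds for the container to come up. A hypothetical alternative, not part of this commit, is to poll the namenode port until it accepts connections; port 9000 matches the address used in tests/test_hdfs_eager.py, while the host and timeout below are assumptions.

# wait_for_hdfs.py -- illustrative readiness check, not part of this commit.
import socket
import time

def wait_for_hdfs(host="localhost", port=9000, timeout=60):
    deadline = time.time() + timeout
    while time.time() < deadline:
        try:
            # The namenode RPC endpoint accepting TCP connections is a rough
            # proxy for "HDFS is up"; adjust as needed.
            with socket.create_connection((host, port), timeout=2):
                print("HDFS namenode reachable on {}:{}".format(host, port))
                return
        except OSError:
            time.sleep(1)
    raise RuntimeError("HDFS not reachable on {}:{} within {}s".format(host, port, timeout))

if __name__ == "__main__":
    wait_for_hdfs()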
42 changes: 42 additions & 0 deletions tests/test_hdfs_eager.py
@@ -0,0 +1,42 @@
# Copyright 2020 The TensorFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License"); you may not
# use this file except in compliance with the License. You may obtain a copy of
# the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
# WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
# License for the specific language governing permissions and limitations under
# the License.
# ==============================================================================
"""Tests for HDFS file system"""

import os
import sys
import socket
import time
import tempfile
import tensorflow as tf
import tensorflow_io as tfio
import pytest


@pytest.mark.skipif(
sys.platform in ("win32", "darwin"),
reason="TODO HDFS not setup properly on macOS/Windows yet",
)
def test_read_file():
"""Test case for reading HDFS"""

address = socket.gethostbyname(socket.gethostname())
print("ADDRESS: {}".format(address))

body = b"1234567"
tf.io.write_file("hdfse://{}:9000/file.txt".format(address), body)

content = tf.io.read_file("hdfse://{}:9000/file.txt".format(address))
print("CONTENT: {}".format(content))
assert content == body

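The test leaves file.txt behind on the cluster. If cleanup were wanted, the same hdfse:// path could in principle be exercised through tf.io.gfile, as sketched below; this is not part of the commit, and it assumes the filesystem plugin registered by tensorflow_io supports the exists/remove operations for the hdfse:// scheme.

# Hypothetical cleanup helper, not part of this commit.
import socket

import tensorflow as tf
import tensorflow_io as tfio  # importing registers the hdfse:// scheme

def cleanup_test_file():
    address = socket.gethostbyname(socket.gethostname())
    path = "hdfse://{}:9000/file.txt".format(address)
    # Assumes the hdfse:// plugin implements these gfile operations.
    if tf.io.gfile.exists(path):
        tf.io.gfile.remove(path)
        print("Removed {}".format(path))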