Skip to content

[BEAM-3214] Add integration test for HBaseIO.#5499

Merged
iemejia merged 1 commit intoapache:masterfrom
szewi:hbase-io
Jun 6, 2018
Merged

[BEAM-3214] Add integration test for HBaseIO.#5499
iemejia merged 1 commit intoapache:masterfrom
szewi:hbase-io

Conversation

@szewi
Copy link
Contributor

@szewi szewi commented May 28, 2018

This PR contains:

  • kubernetes infrastructure for running integration tests of HBaseIO
  • support for running integration test of HBaseIO with Gradle
  • java code of integration test
  • by default test is being run on 100k records, but hashes are also calculated for 1k, 600k and 5M of records and can be set with pipeline option --numberOfRecords

Follow this checklist to help us incorporate your contribution quickly and easily:

  • Format the pull request title like [BEAM-XXX] Fixes bug in ApproximateQuantiles, where you replace BEAM-XXX with the appropriate JIRA issue, if applicable. This will automatically link the pull request to the issue.
  • If this contribution is large, please file an Apache Individual Contributor License Agreement.

It will help us expedite review of your Pull Request if you tag someone (e.g. @username) to look at it.

@lgajowy
Copy link
Contributor

lgajowy commented May 29, 2018

R: @iemejia

@iemejia iemejia self-requested a review May 29, 2018 08:54
@iemejia
Copy link
Member

iemejia commented May 29, 2018

CC: @aromanenko-dev

Copy link
Member

@iemejia iemejia left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ultra minor nitpicks, execellent work @szewi. I will wait for your fixes and in the meantime try to reproduce the k8s scripts. Did you have any chance creating a multi node cluster or this is for another PR ?

spec:
containers:
- name: hbase
image: harisekhon/hbase:latest
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should probably pin this to version 1.2 since this is the official stable one and the one the connector uses.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The latest points to 1.3 version currently, but to be strict with the version specified in build.gradle I will pin this to 1.2 as there is docker tag with this version available.

/* HBase */
@Description("HBase host")
@Default.String("HBase-host")
String getHbaseServerName();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We decided to not expand this Options object anymore. Please create a PipelineOptions specific for HBase see HCatalogIO for ref.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, I wasn't aware of that.

/**
* Produces test rows.
*/
public static class ConstructMutations extends DoFn<TestRow, Mutation> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

private

/**
* Read rows from Table.
*/
public static class SelectNameFn extends DoFn<Result, String> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

private

*/
@RunWith(JUnit4.class)
public class HBaseIOIT {
private static final Logger LOG = LoggerFactory.getLogger(HBaseIOIT.class);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is never used so maybe remove.

set -e

kubectl create -f hbase-single-node-cluster.yml

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would remove this space and the one in line 28 (ultra minor nitpick)

# See the License for the specific language governing permissions and
# limitations under the License.
#
# Start HBase cluster.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove this one and the next line to be consistent with the other setup.sh files

external_ip="$(kubectl get svc hbase-external -o jsonpath='{.status.loadBalancer.ingress[0].ip}')"

hbase_master_pod_name="$(kubectl get pods --selector=name=hbase -o jsonpath='{.items[*].metadata.name}')"

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same comment for spaces as above

# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please remove the '# char so this does not get mingled with the ASF license.

# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above

# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same as above, remove # char

# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove # char also

# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove # char

* See the License for the specific language governing permissions and
* limitations under the License.
*/

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove space

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;


Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove space

*
*/
@RunWith(JUnit4.class)
public class HBaseIOIT {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you use IntelliJ if so can you please format this file with the google-java-plugin (this is not mandatory but it is what most people here use and probably will be automatized in the future).

# should be added to /etc/hosts file in order to work with HBase cluster.
#

#!/bin/sh
Copy link
Member

@iemejia iemejia May 29, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shebangs must be the first line in the file so probably just move it to the top in this in all the shell files.

#!/bin/sh
#
...

Copy link
Member

@iemejia iemejia May 29, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Extra points if you open a subsequent PR to fix this in all the other shell scripts (the non-hbase ones) (+ eventual fixes for shellcheck issues).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok this makes sense. In this PR I just wanted to be consistent with current script pattern, but this could be easily fixed.

hbase_master_namespace="$(kubectl get pods --selector=name=hbase -o jsonpath='{.items[*].metadata.namespace}')"

echo "For local tests please add the following entry to /etc/hosts file"
echo ${external_ip}$'\t'${hbase_master_pod_name}".hbase.${hbase_master_namespace}.svc.cluster.local"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SC2086: Double quote to prevent globbing and word splitting.
^-- SC2039: In POSIX sh, $'..' is undefined.
^-- SC2086: Double quote to prevent globbing and word splitting.

Pro tip: always pass a run of shellcheck in the shell scripts for specific issues like this.
https://www.shellcheck.net/

#!/bin/sh
set -e

kubectl create -f hbase-single-node.yml No newline at end of file
Copy link
Member

@iemejia iemejia May 29, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

when running the script it complaines on missing file, maybe it is hbase-single-node-cluster.yml ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

exactly I was used different naming convention at the beginning of working at this PR

#!/bin/sh
set -e

kubectl delete -f hbase-single-node.yml No newline at end of file
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

when running the script it complaines on missing file, maybe it is hbase-single-node-cluster.yml ?

@szewi
Copy link
Contributor Author

szewi commented Jun 4, 2018

@iemejia this is ready for another round.
BTW to reproduce kubernetes infra you need to upgrade your cluster and kubectl client tool to at least 1.9.3, but latest 1.9.X is preferred, otherwise you may expect some errors with creation/deletion of the pods using StatefulSets.

Copy link
Member

@iemejia iemejia left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, Thanks @szewi. Waiting for jenkins to merge.

@szewi
Copy link
Contributor Author

szewi commented Jun 4, 2018

Run Java PostCommit

@szewi
Copy link
Contributor Author

szewi commented Jun 4, 2018

Run Java PreCommit

@szewi
Copy link
Contributor Author

szewi commented Jun 4, 2018

@iemejia sorry I run post commit tests by mistake instead of precommit. Now waiting for 2 checks to be completed.

Copy link
Member

@iemejia iemejia left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.
Thanks @szewi nice to have this one!

@iemejia iemejia merged commit f89e19b into apache:master Jun 6, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants