Commit ea73be9

docs: Update README to use --enableSnappy flag to import snappy compressed snapshots. (#3623)

Thank you for opening a Pull Request! Before submitting your PR, there are a few things you can do to make sure it goes smoothly:
- [ ] Make sure to open an issue as a [bug/issue](https://github.com/googleapis/java-bigtable-hbase/issues/new/choose) before writing your code!  That way we can discuss the change, evaluate designs, and agree on the general idea
- [ ] Ensure the tests and linter pass
- [ ] Code coverage does not decrease (if any source code was changed)
- [ ] Appropriate docs were updated (if necessary)

Fixes #<issue_number_goes_here> ☕️

If you write sample code, please follow the [samples format](https://github.com/GoogleCloudPlatform/java-docs-samples/blob/main/SAMPLE_FORMAT.md).
TracyCuiCan committed May 18, 2022
1 parent 8c1854d commit ea73be9
Showing 1 changed file with 8 additions and 9 deletions.
17 changes: 8 additions & 9 deletions bigtable-dataflow-parent/bigtable-beam-import/README.md
@@ -103,7 +103,7 @@ Exporting HBase snapshots from Bigtable is not supported.
```
1. Run the export.
```
- java -jar bigtable-beam-import-2.0.0.jar export \
+ java -jar bigtable-beam-import-2.3.0.jar export \
--runner=dataflow \
--project=$PROJECT_ID \
--bigtableInstanceId=$INSTANCE_ID \
```
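The commands in this diff reference several shell variables. A minimal setup sketch, assuming hypothetical placeholder values (the project, instance, bucket, and snapshot names below are illustrative, not from the commit):
```
# Hypothetical values for the variables used by the export/import commands.
PROJECT_ID=my-gcp-project          # GCP project that owns the Bigtable instance
INSTANCE_ID=my-bigtable-instance   # target Bigtable instance
TABLE_NAME=my-table                # destination Bigtable table
SNAPSHOT_GCS_PATH=gs://my-bucket/hbase-migration  # GCS prefix for snapshot data
SNAPSHOT_NAME=my-snapshot          # name of the exported HBase snapshot
CLUSTER_NUM_NODES=3                # node count of the Bigtable cluster
REGION=us-central1                 # region to run the Dataflow job in
```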
@@ -143,15 +143,15 @@ Please pay attention to the Cluster CPU usage and adjust the number of Dataflow

1. Run the import.
```
- java -jar bigtable-beam-import-2.0.0.jar importsnapshot \
+ java -jar bigtable-beam-import-2.3.0.jar importsnapshot \
--runner=DataflowRunner \
--project=$PROJECT_ID \
--bigtableInstanceId=$INSTANCE_ID \
--bigtableTableId=$TABLE_NAME \
--hbaseSnapshotSourceDir=$SNAPSHOT_GCS_PATH/data \
--snapshotName=$SNAPSHOT_NAME \
--stagingLocation=$SNAPSHOT_GCS_PATH/staging \
- --tempLocation=$SNAPSHOT_GCS_PATH/temp \
+ --gcpTempLocation=$SNAPSHOT_GCS_PATH/temp \
--maxWorkerNodes=$(expr 3 \* $CLUSTER_NUM_NODES) \
--region=$REGION
```
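Two details in this hunk are worth a note. The switch from `--tempLocation` to `--gcpTempLocation` points the job at Beam's GCP-specific temp-path option, which the Dataflow runner reads for staging temporary files in GCS. Separately, `--maxWorkerNodes` is computed with `expr`, where `*` must be escaped from the shell; an equivalent sketch using bash's built-in arithmetic:
```
# Both lines compute 3x the Bigtable cluster's node count for --maxWorkerNodes.
CLUSTER_NUM_NODES=3
expr 3 \* $CLUSTER_NUM_NODES     # form used in the README; prints 9
echo $((3 * CLUSTER_NUM_NODES))  # bash arithmetic expansion; also prints 9
```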
@@ -171,19 +171,18 @@ Please pay attention to the Cluster CPU usage and adjust the number of Dataflow

1. Run the import.
```
- java -jar bigtable-beam-import-2.0.0.jar importsnapshot \
+ java -jar bigtable-beam-import-2.3.0.jar importsnapshot \
--runner=DataflowRunner \
--project=$PROJECT_ID \
--bigtableInstanceId=$INSTANCE_ID \
--bigtableTableId=$TABLE_NAME \
--hbaseSnapshotSourceDir=$SNAPSHOT_GCS_PATH/data \
--snapshotName=$SNAPSHOT_NAME \
--stagingLocation=$SNAPSHOT_GCS_PATH/staging \
- --tempLocation=$SNAPSHOT_GCS_PATH/temp \
+ --gcpTempLocation=$SNAPSHOT_GCS_PATH/temp \
--maxWorkerNodes=$(expr 3 \* $CLUSTER_NUM_NODES) \
--region=$REGION \
- --experiments=use_runner_v2 \
- --sdkContainerImage=gcr.io/cloud-bigtable-ecosystem/unified-harness:latest
+ --enableSnappy=true
```

### Sequence Files
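This is the commit's main change: `--enableSnappy=true` replaces the earlier Runner v2 workaround with its custom SDK container image. For context, a sketch of how a Snappy-compressed snapshot might be produced and exported on the HBase side; the table, column-family, and snapshot names are hypothetical, and writing directly to GCS assumes the HBase cluster has the GCS connector configured:
```
# In the HBase shell: switch a column family to Snappy, rewrite existing
# data via major compaction, then snapshot the table.
# (major_compact runs asynchronously; wait for it to finish before snapshotting.)
hbase shell <<'EOF'
alter 'my-table', {NAME => 'cf1', COMPRESSION => 'SNAPPY'}
major_compact 'my-table'
snapshot 'my-table', 'my-snapshot'
EOF

# Copy the snapshot and its HFiles to GCS for the Dataflow import job to read.
hbase org.apache.hadoop.hbase.snapshot.ExportSnapshot \
  -snapshot my-snapshot \
  -copy-to gs://my-bucket/hbase-migration/data \
  -mappers 4
```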
@@ -200,7 +199,7 @@ Please pay attention to the Cluster CPU usage and adjust the number of Dataflow
```
1. Run the import.
```
- java -jar bigtable-beam-import-2.0.0.jar import \
+ java -jar bigtable-beam-import-2.3.0.jar import \
--runner=dataflow \
--project=$PROJECT_ID \
--bigtableInstanceId=$INSTANCE_ID \
```
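Every variant above launches a Dataflow job, so progress is easiest to follow in the Dataflow console or the gcloud CLI. A small sketch (JOB_ID is a placeholder copied from the list output):
```
# List recent Dataflow jobs in the pipeline's region.
gcloud dataflow jobs list --project=$PROJECT_ID --region=$REGION

# Inspect one job; JOB_ID is a placeholder, not a value from the commit.
gcloud dataflow jobs describe JOB_ID --project=$PROJECT_ID --region=$REGION
```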
@@ -228,7 +227,7 @@ check if there are any rows with mismatched data.
```
1. Run the sync job. It will put the results into `$SNAPSHOT_GCS_PATH/data-verification/output-TIMESTAMP`.
```
- java -jar bigtable-beam-import-2.0.0.jar sync-table \
+ java -jar bigtable-beam-import-2.3.0.jar sync-table \
--runner=dataflow \
--project=$PROJECT_ID \
--bigtableInstanceId=$INSTANCE_ID \
```
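When the sync job completes, its results sit under the `data-verification` prefix noted above. A sketch of inspecting them with gsutil; the timestamped directory and shard names are assumptions about the output layout:
```
# List everything the sync job wrote.
gsutil ls -r $SNAPSHOT_GCS_PATH/data-verification/

# Print one output shard to scan for mismatched rows (file name is illustrative).
gsutil cat "$SNAPSHOT_GCS_PATH/data-verification/output-20220518/part-00000"
```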
