[VL] Add support for GCS retry properties (#5858)
Two properties were recently added to the Velox GCS connector:

HiveConfig::kGCSMaxRetryCount: Retry until a given number of transient errors has been detected.
HiveConfig::kGCSMaxRetryTime: Retry until a given amount of time has elapsed.
These properties let a client either fail immediately or keep retrying for several minutes.

If neither property is set, the connector uses the defaults set by GCS.
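
The "copy only when set" behavior can be sketched in isolation as follows. This is a minimal illustration, not the Gluten code: `ConfMap`, `lookup`, `extractGcsRetryConfig`, and the two key strings are hypothetical stand-ins, and the real key values are whatever `HiveConfig::kGCSMaxRetryCount` and `HiveConfig::kGCSMaxRetryTime` define in Velox.

```cpp
#include <optional>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for both the Spark session configuration and the
// Hive connector configuration map; the real types live in Gluten and Velox.
using ConfMap = std::unordered_map<std::string, std::string>;

// Assumed key strings, for illustration only. The real values are defined by
// HiveConfig::kGCSMaxRetryCount and HiveConfig::kGCSMaxRetryTime in Velox.
constexpr const char* kMaxRetryCountKey = "hive.gcs.max-retry-count";
constexpr const char* kMaxRetryTimeKey = "hive.gcs.max-retry-time";

// Returns the value stored under `key`, or std::nullopt when it is absent.
std::optional<std::string> lookup(const ConfMap& conf, const std::string& key) {
  auto it = conf.find(key);
  if (it == conf.end()) {
    return std::nullopt;
  }
  return it->second;
}

// Copies each retry property only when it is set, so an unset property
// leaves the connector free to fall back to the GCS defaults.
ConfMap extractGcsRetryConfig(const ConfMap& sparkConf) {
  ConfMap hiveConf;
  if (auto count = lookup(sparkConf, "spark.hadoop.fs.gs.http.max.retry")) {
    hiveConf[kMaxRetryCountKey] = *count;
  }
  if (auto time = lookup(sparkConf, "spark.hadoop.fs.gs.http.max.retry-time")) {
    hiveConf[kMaxRetryTimeKey] = *time;
  }
  return hiveConf;
}
```

With only `spark.hadoop.fs.gs.http.max.retry` present in the input, the result holds a single entry; with neither key present, the result is empty and nothing overrides the defaults.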

This change was tested manually by pointing at an inaccessible GCS bucket and verifying that the properties are honored.
tigrux committed May 24, 2024
1 parent 891ab83 commit 9f424a1
Showing 2 changed files with 27 additions and 0 deletions.
13 changes: 13 additions & 0 deletions cpp/velox/utils/ConfigExtractor.cc
@@ -121,6 +121,19 @@ std::shared_ptr<facebook::velox::core::MemConfig> getHiveConfig(std::shared_ptr<
    }
  }

  // https://github.com/GoogleCloudDataproc/hadoop-connectors/blob/master/gcs/CONFIGURATION.md#http-transport-configuration
  // https://cloud.google.com/cpp/docs/reference/storage/latest/classgoogle_1_1cloud_1_1storage_1_1LimitedErrorCountRetryPolicy
  auto gsMaxRetryCount = conf->get("spark.hadoop.fs.gs.http.max.retry");
  if (gsMaxRetryCount.hasValue()) {
    hiveConfMap[facebook::velox::connector::hive::HiveConfig::kGCSMaxRetryCount] = gsMaxRetryCount.value();
  }

  // https://cloud.google.com/cpp/docs/reference/storage/latest/classgoogle_1_1cloud_1_1storage_1_1LimitedTimeRetryPolicy
  auto gsMaxRetryTime = conf->get("spark.hadoop.fs.gs.http.max.retry-time");
  if (gsMaxRetryTime.hasValue()) {
    hiveConfMap[facebook::velox::connector::hive::HiveConfig::kGCSMaxRetryTime] = gsMaxRetryTime.value();
  }

  // https://github.com/GoogleCloudDataproc/hadoop-connectors/blob/master/gcs/CONFIGURATION.md#authentication
  auto gsAuthType = conf->get("spark.hadoop.fs.gs.auth.type");
  if (gsAuthType.hasValue()) {
14 changes: 14 additions & 0 deletions docs/get-started/VeloxGCS.md
@@ -36,4 +36,18 @@ spark.hadoop.fs.gs.auth.service.account.json.keyfile // path to the json file wi
For cases when a GCS mock is used, an optional endpoint can be provided:
```sh
spark.hadoop.fs.gs.storage.root.url // URL of the mock GCS service, starting with http or https
```

## Configuring GCS max retry count

When a transient server error is detected, GCS can be configured to keep retrying until a given number of transient errors has been detected.
```sh
spark.hadoop.fs.gs.http.max.retry // number of times to keep retrying unless a non-transient error is detected
```
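
The count-based policy described above can be sketched as a small retry loop. This is an illustration only, not the GCS client implementation: `Status` and `retryWithErrorLimit` are hypothetical names, and the sketch assumes that a limit of N tolerates N transient errors and gives up on the next one, so a limit of 0 fails on the first transient error.

```cpp
#include <functional>

// Outcome of a single attempt, as seen by the hypothetical loop below.
enum class Status { kOk, kTransientError, kPermanentError };

// Calls `attempt` until it succeeds, returns a permanent error, or more than
// `maxTransientErrors` transient errors have been observed. `attemptsMade`
// reports how many calls were issued.
Status retryWithErrorLimit(
    const std::function<Status()>& attempt,
    int maxTransientErrors,
    int& attemptsMade) {
  int transientErrors = 0;
  attemptsMade = 0;
  while (true) {
    ++attemptsMade;
    const Status status = attempt();
    if (status != Status::kTransientError) {
      return status;  // success or a non-transient error ends the loop
    }
    if (++transientErrors > maxTransientErrors) {
      return Status::kTransientError;  // retry budget exhausted
    }
  }
}
```

A production retry loop would normally also wait between attempts (for example with exponential backoff); that part is left out to keep the sketch focused on the error count.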

## Configuring GCS max retry time

When a transient server error is detected, GCS can be configured to keep retrying until the retry loop exceeds a prescribed duration.
```sh
spark.hadoop.fs.gs.http.max.retry-time // a string representing how long to keep retrying (10s, 1m, etc.)
```
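
To make the duration strings concrete, here is a minimal sketch of how a value such as `10s` or `1m` could be turned into a time budget and checked inside a retry loop. The helper names `parseDuration` and `withinRetryBudget` are hypothetical, and the accepted units (`ms`, `s`, `m`, `h`) are an assumption; the parsing that Velox actually performs may accept a different set of formats.

```cpp
#include <cctype>
#include <chrono>
#include <optional>
#include <string>

// Parses strings such as "500ms", "10s", "1m" or "2h" into milliseconds.
// Returns std::nullopt when the number or the unit is missing or unknown.
// Overflow on very large numbers is not handled in this sketch.
std::optional<std::chrono::milliseconds> parseDuration(const std::string& text) {
  std::size_t pos = 0;
  long long value = 0;
  while (pos < text.size() &&
         std::isdigit(static_cast<unsigned char>(text[pos]))) {
    value = value * 10 + (text[pos] - '0');
    ++pos;
  }
  if (pos == 0) {
    return std::nullopt;  // no leading number
  }
  const std::string unit = text.substr(pos);
  using namespace std::chrono;
  if (unit == "ms") return milliseconds(value);
  if (unit == "s") return duration_cast<milliseconds>(seconds(value));
  if (unit == "m") return duration_cast<milliseconds>(minutes(value));
  if (unit == "h") return duration_cast<milliseconds>(hours(value));
  return std::nullopt;  // unknown or missing unit
}

// Returns true while a retry loop that began at `start` is still within
// its time budget at time `now`.
bool withinRetryBudget(
    std::chrono::steady_clock::time_point start,
    std::chrono::steady_clock::time_point now,
    std::chrono::milliseconds budget) {
  return now - start < budget;
}
```

A retry loop would record `std::chrono::steady_clock::now()` before the first attempt and stop retrying once `withinRetryBudget` returns false or a non-transient error is seen.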
