UrlStorage blacklist of forwarded headers #246

Merged
merged 4 commits into from
May 19, 2024
28 changes: 21 additions & 7 deletions htsget-config/README.md
@@ -171,19 +171,21 @@ To use `S3Storage`, build htsget-rs with the `s3-storage` feature enabled, and s
`UrlStorage` is another storage backend which can be used to serve data from a remote HTTP URL. When using this storage backend, htsget-rs will fetch data from a `url` which is set in the config. It will also forward any headers received with the initial query, which is useful for authentication.
To use `UrlStorage`, build htsget-rs with the `url-storage` feature enabled, and set the following options under `[resolvers.storage]`:

| Option | Description | Type | Default |
|--------------------------------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------|--------------------------|-----------------------------------------------------------------------------------------------------------------|
| <span id="url">`url`</span> | The URL to fetch data from. | HTTP URL | `"https://127.0.0.1:8081/"` |
| <span id="url">`response_url`</span> | The URL to return to the client for fetching tickets. | HTTP URL | `"https://127.0.0.1:8081/"` |
| `forward_headers` | When constructing the URL tickets, copy HTTP headers received in the initial query. Note, the headers received with the query are always forwarded to the `url`. | Boolean | `true` |
| `tls` | Additionally enables client authentication, or sets non-native root certificates for TLS. See [TLS](#tls) for more details. | TOML table | TLS is always allowed, however the default performs no client authentication and uses native root certificates. |
| Option | Description | Type | Default |
|--------------------------------------|------------------------------------------------------------------------------------------------------------------------------|--------------------------|-----------------------------------------------------------------------------------------------------------------|
| <span id="url">`url`</span> | The URL to fetch data from. | HTTP URL | `"https://127.0.0.1:8081/"` |
| <span id="url">`response_url`</span> | The URL to return to the client for fetching tickets. | HTTP URL | `"https://127.0.0.1:8081/"` |
| `forward_headers` | When constructing the URL tickets, copy HTTP headers received in the initial query. | Boolean | `true` |
| `header_blacklist` | List of header names that are not forwarded to the `url` or copied into the URL tickets. | Array of header names | `[]` |
| `tls` | Additionally enables client authentication, or sets non-native root certificates for TLS. See [TLS](#tls) for more details. | TOML table | TLS is always allowed; however, the default performs no client authentication and uses native root certificates. |

When using `UrlStorage`, the following requests will be made to the `url`.
* `GET` request to fetch only the headers of the data file (e.g. `GET /data.bam`, with `Range: bytes=0-<end_of_bam_header>`).
* `GET` request to fetch the entire index file (e.g. `GET /data.bam.bai`).
* `HEAD` request on the data file to get its length (e.g. `HEAD /data.bam`).
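
The three requests listed above can be written down as plain data. The following is a minimal sketch, not code from htsget-rs: `PlannedRequest`, `planned_requests`, and the `header_end` parameter are hypothetical names introduced for illustration, and the order of the entries follows the list above rather than the order in which the requests are necessarily issued.

```rust
/// Hypothetical description of one outgoing request; not a type from htsget-rs.
#[derive(Debug, PartialEq)]
struct PlannedRequest {
    method: &'static str,
    path: String,
    /// Value of the `Range` header, if the request is a partial fetch.
    range: Option<String>,
}

/// List the three requests for a data file such as `data.bam` with an index
/// extension such as `bai`. `header_end` is the last byte of the file header
/// and is assumed to be known by the caller.
fn planned_requests(data_file: &str, index_ext: &str, header_end: u64) -> Vec<PlannedRequest> {
    vec![
        // Ranged GET for the header of the data file.
        PlannedRequest {
            method: "GET",
            path: format!("/{data_file}"),
            range: Some(format!("bytes=0-{header_end}")),
        },
        // GET for the entire index file.
        PlannedRequest {
            method: "GET",
            path: format!("/{data_file}.{index_ext}"),
            range: None,
        },
        // HEAD on the data file to learn its length.
        PlannedRequest {
            method: "HEAD",
            path: format!("/{data_file}"),
            range: None,
        },
    ]
}
```

Calling `planned_requests("data.bam", "bai", 4095)` in this sketch yields a ranged `GET /data.bam`, a `GET /data.bam.bai`, and a `HEAD /data.bam`.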

All headers received in the initial query will be included when making these requests.
By default, all headers received in the initial query are included in these requests. To exclude specific headers from being forwarded, set the `header_blacklist` option. Blacklisted headers are removed both from the requests made to `url` and from the URL tickets.
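
The effect of `header_blacklist` can be illustrated with a small filter over a header map. This is a sketch under stated assumptions, not the implementation in this pull request: `remove_blacklisted_headers` is a hypothetical helper, headers are modelled as a plain `HashMap`, and names are compared case-insensitively because HTTP header names are case-insensitive. Whether htsget-rs matches names this way is an assumption.

```rust
use std::collections::HashMap;

/// Hypothetical helper: return a copy of `headers` without any header whose
/// name matches an entry in `blacklist`. Names are compared ignoring ASCII
/// case (an assumption, see the note above).
fn remove_blacklisted_headers(
    headers: &HashMap<String, String>,
    blacklist: &[String],
) -> HashMap<String, String> {
    headers
        .iter()
        .filter(|(name, _)| {
            !blacklist
                .iter()
                .any(|blocked| blocked.eq_ignore_ascii_case(name.as_str()))
        })
        .map(|(name, value)| (name.clone(), value.clone()))
        .collect()
}
```

With `header_blacklist = ["Host"]`, a query carrying `Host` and `Authorization` headers would pass only `Authorization` through this filter.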


For example, a `resolvers` value of:
```toml
@@ -222,6 +224,18 @@ bucket = 'bucket'
```

`UrlStorage` can only be specified manually.
Example of a resolver with `UrlStorage`:
```toml
[[resolvers]]
regex = ".*"
substitution_string = "$0"

[resolvers.storage]
url = "http://localhost:8080"
response_url = "https://example.com"
forward_headers = true
header_blacklist = ["Host"]
```

There are additional examples of config files located under [`examples/config-files`][examples-config-files].

1 change: 1 addition & 0 deletions htsget-config/src/resolver.rs
@@ -537,6 +537,7 @@ mod tests {
inner: InnerUrl::from_str("https://example.com/").unwrap(),
}),
true,
vec![],
client,
);

14 changes: 14 additions & 0 deletions htsget-config/src/storage/url.rs
@@ -25,6 +25,7 @@ pub struct UrlStorage {
url: ValidatedUrl,
response_url: ValidatedUrl,
forward_headers: bool,
header_blacklist: Vec<String>,
#[serde(skip_serializing)]
tls: TlsClientConfig,
}
@@ -35,6 +36,7 @@ pub struct UrlStorageClient {
url: ValidatedUrl,
response_url: ValidatedUrl,
forward_headers: bool,
header_blacklist: Vec<String>,
client: Client,
}

@@ -63,6 +65,7 @@ impl TryFrom<UrlStorage> for UrlStorageClient {
storage.url,
storage.response_url,
storage.forward_headers,
storage.header_blacklist,
client,
))
}
@@ -74,12 +77,14 @@ impl UrlStorageClient {
url: ValidatedUrl,
response_url: ValidatedUrl,
forward_headers: bool,
header_blacklist: Vec<String>,
client: Client,
) -> Self {
Self {
url,
response_url,
forward_headers,
header_blacklist,
client,
}
}
@@ -99,6 +104,11 @@ impl UrlStorageClient {
self.forward_headers
}

/// Get the headers that should not be forwarded.
pub fn header_blacklist(&self) -> &[String] {
&self.header_blacklist
}

/// Get an owned client by cloning.
pub fn client_cloned(&self) -> Client {
self.client.clone()
@@ -142,6 +152,7 @@ impl UrlStorage {
url: InnerUrl,
response_url: InnerUrl,
forward_headers: bool,
header_blacklist: Vec<String>,
tls: TlsClientConfig,
) -> Self {
Self {
@@ -150,6 +161,7 @@ impl UrlStorage {
inner: response_url,
}),
forward_headers,
header_blacklist,
tls,
}
}
@@ -182,6 +194,7 @@ impl Default for UrlStorage {
url: default_url(),
response_url: default_url(),
forward_headers: true,
header_blacklist: vec![],
tls: TlsClientConfig::default(),
}
}
@@ -206,6 +219,7 @@ mod tests {
"https://example.com".parse::<InnerUrl>().unwrap(),
"https://example.com".parse::<InnerUrl>().unwrap(),
true,
vec![],
client_config,
));

1 change: 1 addition & 0 deletions htsget-search/src/htsget/from_storage.rs
@@ -104,6 +104,7 @@ impl<S> ResolveResponse for HtsGetFromStorage<S> {
url_storage_config.url().clone(),
url_storage_config.response_url().clone(),
url_storage_config.forward_headers(),
url_storage_config.header_blacklist().to_vec(),
));
searcher.search(query.clone()).await
}
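
The accessor added in `url.rs` returns a borrowed slice, and the call site above turns it into an owned `Vec` with `.to_vec()`. A minimal sketch of that pattern follows; `ClientConfig`, `Storage`, and `build_storage` are hypothetical stand-ins introduced for illustration, not the types from htsget-rs.

```rust
/// Stand-in for the config type in the diff; only the relevant field is kept.
struct ClientConfig {
    header_blacklist: Vec<String>,
}

impl ClientConfig {
    /// Borrow the blacklist as a slice, so callers decide whether to clone.
    fn header_blacklist(&self) -> &[String] {
        &self.header_blacklist
    }
}

/// Hypothetical consumer that needs its own copy of the list.
struct Storage {
    header_blacklist: Vec<String>,
}

/// Build a `Storage` from a borrowed config; `.to_vec()` clones each entry,
/// leaving the config untouched.
fn build_storage(config: &ClientConfig) -> Storage {
    Storage {
        header_blacklist: config.header_blacklist().to_vec(),
    }
}
```

Returning `&[String]` keeps the accessor cheap for callers that only read the list, at the cost of an explicit clone where ownership is required.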