support shard-based routing (#89)
* setup env & testing for shard routing

* setup two shard peer nodes

* set new .env for sharded nodes

* add intervalmap data structure

* add parsing of shard env variables

* rename HeightShardingProxies -> PruningOrDefaultProxies

* implement ShardProxies for shard routing

* scaffold new Proxies to handle shards

* handle shard routing

* update PruningOrDefaultProxies logs to Trace

* update failing test for shard routing

* add tests for shard backend responses

* add unit tests for shard routing proxies

* route "earliest" to first shard

also, prevent uint64 conversion underflows by routing any other block tag
that is encoded to a negative number to the default proxy

* configure shard routing in CI tests

* document shard-based routing

* make shards inclusive of end block

* add more validation to shard backend url map

parsing will error in cases of:
* unordered shards
* multiple shards for same endblock
pirtleshell committed Mar 6, 2024
1 parent 09b2a83 commit e30b255
Showing 17 changed files with 748 additions and 30 deletions.
6 changes: 5 additions & 1 deletion .env
@@ -43,6 +43,7 @@ TEST_DATABASE_ENDPOINT_URL=localhost:5432
TEST_PROXY_BACKEND_HOST_URL_MAP=localhost:7777>http://kava-validator:8545,localhost:7778>http://kava-pruning:8545
TEST_PROXY_HEIGHT_BASED_ROUTING_ENABLED=true
TEST_PROXY_PRUNING_BACKEND_HOST_URL_MAP=localhost:7777>http://kava-pruning:8545,localhost:7778>http://kava-pruning:8545
TEST_PROXY_SHARD_BACKEND_HOST_URL_MAP=localhost:7777>10|http://kava-shard-10:8545|20|http://kava-shard-20:8545
# What level of logging to use for service objects constructed during
# unit tests
TEST_SERVICE_LOG_LEVEL=ERROR
@@ -71,9 +72,12 @@ PROXY_BACKEND_HOST_URL_MAP=localhost:7777>http://kava-validator:8545,localhost:7
# otherwise, it falls back to the value in PROXY_BACKEND_HOST_URL_MAP
PROXY_HEIGHT_BASED_ROUTING_ENABLED=true
PROXY_PRUNING_BACKEND_HOST_URL_MAP=localhost:7777>http://kava-pruning:8545,localhost:7778>http://kava-pruning:8545
# enable shard routing for hosts defined in PROXY_SHARD_BACKEND_HOST_URL_MAP
PROXY_SHARDED_ROUTING_ENABLED=true
PROXY_SHARD_BACKEND_HOST_URL_MAP=localhost:7777>10|http://kava-shard-10:8545|20|http://kava-shard-20:8545
# PROXY_MAXIMUM_REQ_BATCH_SIZE is a proxy-enforced limit on the number of subrequest in a batch
PROXY_MAXIMUM_REQ_BATCH_SIZE=100
# Configuration for the servcie to connect to it's database
# Configuration for the service to connect to its database
DATABASE_NAME=postgres
DATABASE_ENDPOINT_URL=postgres:5432
DATABASE_USERNAME=postgres
111 changes: 110 additions & 1 deletion architecture/PROXY_ROUTING.md
@@ -77,7 +77,7 @@ Now suppose you want multiple backends for the same host.

The proxy service supports height-based routing to direct requests that only require the most recent
block to a different cluster.
This support is handled via the [`HeightShardingProxies` implementation](../service/shard.go#L16).
This support is handled via the [`PruningOrDefaultProxies` implementation](../service/shard.go#L17).

This is configured via the `PROXY_HEIGHT_BASED_ROUTING_ENABLED` and `PROXY_PRUNING_BACKEND_HOST_URL_MAP`
environment variables.
@@ -136,6 +136,114 @@ in `PROXY_BACKEND_HOST_URL_MAP`.

Any request made to a host not in the `PROXY_BACKEND_HOST_URL_MAP` map responds 502 Bad Gateway.

## Sharding

Taking the example one step further, suppose the backend consists of data shards, each containing a fixed range of blocks. Although sharded routing can be configured without the pruning-vs-default cluster routing described above, this example assumes it is enabled.

The above example supports fielding requests to a particular endpoint with pruning & archive clusters:
* request for tip-of-chain -> pruning cluster
* everything else -> archive cluster ("default")

The proxy service supports breaking down "everything else" further by defining "shards": clusters that contain a fixed set of block heights.

This is configured via the `PROXY_SHARDED_ROUTING_ENABLED` and `PROXY_SHARD_BACKEND_HOST_URL_MAP` environment variables:
* `PROXY_SHARDED_ROUTING_ENABLED` - flag to toggle this functionality
* `PROXY_SHARD_BACKEND_HOST_URL_MAP` - encodes the shard cluster URLs and block ranges for a given endpoint.
This support is handled via the [`ShardProxies` implementation](../service/shard.go#L103).


The map is encoded as follows:
```
PROXY_SHARDED_ROUTING_ENABLED=true
PROXY_SHARD_BACKEND_HOST_URL_MAP=HOST_A>ENDBLOCK_A1|ROUTE_A1|ENDBLOCK_A2|ROUTE_A2,HOST_B>ENDBLOCK_B1|ROUTE_B1
```

This defines two shards for `HOST_A` and one shard for `HOST_B`:
* `HOST_A`'s shards:
* blocks 1 to `ENDBLOCK_A1` hosted at `ROUTE_A1`
* blocks `ENDBLOCK_A1`+1 to `ENDBLOCK_A2` hosted at `ROUTE_A2`
* `HOST_B`'s shard:
* blocks 1 to `ENDBLOCK_B1` hosted at `ROUTE_B1`

Shards are inclusive of their end blocks, and they must collectively contain all data from block 1 to the end block of the last shard.

Shards field requests that would route to the "Default" cluster in any of the above configurations:
* requests for `"earliest"` block are routed to the first defined shard
* any request for a specific height that is contained in a shard is routed to that shard.

All other requests continue to route to the default cluster. In this context, the default cluster is referred to as the "active" cluster (see below).

Requests for tx hashes or block hashes are routed to the "active" cluster.

### Shard Routing

When `PROXY_SHARDED_ROUTING_ENABLED` is `true`, "everything else" can be broken down further into clusters that contain fixed ranges of blocks.

As an example, consider a setup that has the following clusters:
* Pruning cluster (`http://kava-pruning:8545`)
* "Active" cluster - blocks 4,000,001 to chain tip (`http://kava-archive:8545`)
* Shard 2 - blocks 2,000,001 to 4,000,000 (`http://kava-shard-4M:8545`)
* Shard 1 - blocks 1 to 2,000,000 (`http://kava-shard-2M:8545`)

The proxy service can be configured as follows:
```
PROXY_HEIGHT_BASED_ROUTING_ENABLED=true
PROXY_SHARDED_ROUTING_ENABLED=true
PROXY_BACKEND_HOST_URL_MAP=evm.data.kava.io>http://kava-archive:8545
PROXY_PRUNING_BACKEND_HOST_URL_MAP=evm.data.kava.io>http://kava-pruning:8545
PROXY_SHARD_BACKEND_HOST_URL_MAP=evm.data.kava.io>2000000|http://kava-shard-2M:8545|4000000|http://kava-shard-4M:8545
```

This value is parsed into a map that looks like the following:
```
{
"default": {
"evm.data.kava.io" => "http://kava-archive:8545",
},
"pruning": {
"evm.data.kava.io" => "http://kava-pruning:8545",
},
"shards": {
2000000 => "http://kava-shard-2M:8545",
4000000 => "http://kava-shard-4M:8545"
}
}
```

All requests that would route to the "default" cluster in the "Default vs Pruning Backend Routing" example route as follows:
* requests for specific height between 1 and 2M -> `http://kava-shard-2M:8545`
* this includes requests for `"earliest"`
* requests for specific height between 2M+1 and 4M -> `http://kava-shard-4M:8545`
* requests for a block hash or tx hash -> the active cluster: `http://kava-archive:8545`.

Otherwise, requests are routed as they are in the "Default vs Pruning Backend Routing" example.

![Proxy service configured with shard-based routing](images/proxy_service_sharding.jpg)
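
For illustration, here is a minimal Go sketch of the lookup this configuration drives. It is hedged: it only uses the `config` helpers added in this change (`ParseRawShardRoutingBackendHostURLMap` and `IntervalURLMap.Lookup`), the host name and heights simply mirror the example above, and it is not the service's actual request-routing code path.

```
package main

import (
	"fmt"

	"github.com/kava-labs/kava-proxy-service/config"
)

func main() {
	// same value as PROXY_SHARD_BACKEND_HOST_URL_MAP in the example configuration above
	raw := "evm.data.kava.io>2000000|http://kava-shard-2M:8545|4000000|http://kava-shard-4M:8545"

	shardsByHost, err := config.ParseRawShardRoutingBackendHostURLMap(raw)
	if err != nil {
		panic(err)
	}

	shards := shardsByHost["evm.data.kava.io"]
	for _, height := range []uint64{1, 2_000_000, 2_000_001, 4_000_000, 4_000_001} {
		if backend, endBlock, found := shards.Lookup(height); found {
			// heights 1 to 2M resolve to kava-shard-2M, 2M+1 to 4M to kava-shard-4M (end blocks are inclusive)
			fmt.Printf("height %d -> shard ending at %d (%s)\n", height, endBlock, backend)
		} else {
			// heights above the last shard's end block fall through to the "active" (default) cluster
			fmt.Printf("height %d -> active cluster\n", height)
		}
	}
}
```

Each height resolves to the same backend as in the routing table above; anything above the last end block falls back to the active cluster.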

### "Active" Cluster

In practice, a full-archive node can be used as the active cluster. However, the data can be slimmed down by accounting for the fact that it doesn't need the application data for blocks contained in the shards.

The optimally-sized active cluster runs on a unique data set that includes:
* At least one recent block - this will be the starting point for the node to begin syncing once spun up. Ideally, this is the last shard's end block + 1.
* A complete blockstore, cometbft state, and tx_index

The blockstore, cometbft state, and tx_index are required for fielding requests for data on unknown heights. These are requests for block hashes and transaction hashes. Because the proxy service can't know which height a particular hash is for (and therefore, to which shard the request should be routed), these complete databases are required to handle requests for the hashes.

The optimally-sized node data can be created from a full-archive node by pruning only the application state for the node. On Kava, this can be accomplished with the `--only-app-state` flag of the shard command:
```
kava shard --start <last-shard-end-block-plus-1> --end -1 --only-app-state
```

The bulk of the data on cosmos-sdk chains like Kava is in `application.db`, so pruning the application state for the blocks covered by the shards allows for a much smaller cluster footprint than a full-archive node.

### Shard Clusters

On Kava, data for shards can be created with the `shard` command of the Kava CLI from any node that contains the desired shard block range:
```
kava shard --home ~/.kava --start <shard-start-block> --end <shard-end-block>
```

## Metrics

When metrics are enabled, the `proxied_request_metrics` table tracks the backend to which requests
@@ -147,6 +255,7 @@ always `DEFAULT`.
When enabled, the column will have one of the following values:
* `DEFAULT` - the request was routed to the backend defined in `PROXY_BACKEND_HOST_URL_MAP`
* `PRUNING` - the request was routed to the backend defined in `PROXY_PRUNING_BACKEND_HOST_URL_MAP`
* `SHARD` - the request was routed to a shard defined in the `PROXY_SHARD_BACKEND_HOST_URL_MAP`

Additionally, the actual URL to which the request is routed to is tracked in the
`response_backend_route` column.
Binary file added architecture/images/proxy_service_sharding.jpg
3 changes: 3 additions & 0 deletions ci.docker-compose.yml
@@ -23,11 +23,14 @@ services:
env_file: .env
environment:
PROXY_HEIGHT_BASED_ROUTING_ENABLED: "true"
PROXY_SHARDED_ROUTING_ENABLED: "true"
# use public testnet as backend origin server to avoid having
# to self-host a beefy Github Action runner
# to build and run a kava node each execution
PROXY_BACKEND_HOST_URL_MAP: localhost:7777>https://evmrpcdata.internal.testnet.proxy.kava.io,localhost:7778>https://evmrpc.internal.testnet.proxy.kava.io
PROXY_PRUNING_BACKEND_HOST_URL_MAP: localhost:7777>https://evmrpc.internal.testnet.proxy.kava.io
# fake the shards by defining shards with existing backends
PROXY_SHARD_BACKEND_HOST_URL_MAP: localhost:7777>10|https://evmrpc.internal.testnet.proxy.kava.io|20|https://evmrpc.internal.testnet.proxy.kava.io
EVM_QUERY_SERVICE_URL: https://evmrpc.internal.testnet.proxy.kava.io
ports:
- "${PROXY_HOST_PORT}:${PROXY_CONTAINER_PORT}"
69 changes: 69 additions & 0 deletions config/config.go
@@ -20,6 +20,9 @@ type Config struct {
EnableHeightBasedRouting bool
ProxyPruningBackendHostURLMapRaw string
ProxyPruningBackendHostURLMap map[string]url.URL
EnableShardedRouting bool
ProxyShardBackendHostURLMapRaw string
ProxyShardBackendHostURLMap map[string]IntervalURLMap
ProxyMaximumBatchSize int
EvmQueryServiceURL string
DatabaseName string
@@ -65,6 +68,8 @@ const (
PROXY_BACKEND_HOST_URL_MAP_ENVIRONMENT_KEY = "PROXY_BACKEND_HOST_URL_MAP"
PROXY_HEIGHT_BASED_ROUTING_ENABLED_KEY = "PROXY_HEIGHT_BASED_ROUTING_ENABLED"
PROXY_PRUNING_BACKEND_HOST_URL_MAP_ENVIRONMENT_KEY = "PROXY_PRUNING_BACKEND_HOST_URL_MAP"
PROXY_SHARDED_ROUTING_ENABLED_ENVIRONMENT_KEY = "PROXY_SHARDED_ROUTING_ENABLED"
PROXY_SHARD_BACKEND_HOST_URL_MAP_ENVIRONMENT_KEY = "PROXY_SHARD_BACKEND_HOST_URL_MAP"
PROXY_MAXIMUM_BATCH_SIZE_ENVIRONMENT_KEY = "PROXY_MAXIMUM_REQ_BATCH_SIZE"
DEFAULT_PROXY_MAXIMUM_BATCH_SIZE = 500
PROXY_SERVICE_PORT_ENVIRONMENT_KEY = "PROXY_SERVICE_PORT"
@@ -220,6 +225,65 @@ func ParseRawProxyBackendHostURLMap(raw string) (map[string]url.URL, error) {
return hostURLMap, combinedErr
}

// ParseRawShardRoutingBackendHostURLMap attempts to parse backend host URL mapping for shards.
// The shard map is a map of host name => (map of end block => backend route)
// returning the mapping and error (if any)
func ParseRawShardRoutingBackendHostURLMap(raw string) (map[string]IntervalURLMap, error) {
parsed := make(map[string]IntervalURLMap)
hostConfigs := strings.Split(raw, ",")
for _, hc := range hostConfigs {
pieces := strings.Split(hc, ">")
if len(pieces) != 2 {
return parsed, fmt.Errorf("expected shard definition like <host>:<end-height>|<backend-route>, found '%s'", hc)
}

host := pieces[0]
endpointBackendValues := strings.Split(pieces[1], "|")
if len(endpointBackendValues)%2 != 0 {
return parsed, fmt.Errorf("unexpected <end-height>|<backend-route> sequence for %s: %s",
host, pieces[1],
)
}

prevMaxHeight := uint64(0)
backendByEndHeight := make(map[uint64]*url.URL, len(endpointBackendValues)/2)
for i := 0; i < len(endpointBackendValues); i += 2 {
endHeight, err := strconv.ParseUint(endpointBackendValues[i], 10, 64)
if err != nil || endHeight == 0 {
return parsed, fmt.Errorf("invalid shard end height (%s) for host %s: %s",
endpointBackendValues[i], host, err,
)
}
// ensure this is the only shard defined with this endBlock for this host
if _, exists := backendByEndHeight[endHeight]; exists {
return parsed, fmt.Errorf("multiple shards defined for %s with end block %d", host, endHeight)
}
// require height definitions to be ordered
// this is enforced because the shards are expected to cover the entire range
// from the previous shard's endBlock to the current shard's endBlock
if endHeight < prevMaxHeight {
return parsed, fmt.Errorf(
"shard map expects end blocks to be ordered. for host %s, shard for height %d found after shard for height %d",
host, endHeight, prevMaxHeight,
)
}

backendRoute, err := url.Parse(endpointBackendValues[i+1])
if err != nil || backendRoute.String() == "" {
return parsed, fmt.Errorf("invalid shard backend route (%s) for height %d of host %s: %s",
endpointBackendValues[i+1], endHeight, host, err,
)
}
backendByEndHeight[endHeight] = backendRoute
prevMaxHeight = endHeight
}

parsed[host] = NewIntervalURLMap(backendByEndHeight)
}

return parsed, nil
}

// ParseRawHostnameToHeaderValueMap attempts to parse mappings of hostname to corresponding header value.
// For example hostname to access-control-allow-origin header value.
func ParseRawHostnameToHeaderValueMap(raw string) (map[string]string, error) {
@@ -257,10 +321,12 @@ func ParseRawHostnameToHeaderValueMap(raw string) (map[string]string, error) {
func ReadConfig() Config {
rawProxyBackendHostURLMap := os.Getenv(PROXY_BACKEND_HOST_URL_MAP_ENVIRONMENT_KEY)
rawProxyPruningBackendHostURLMap := os.Getenv(PROXY_PRUNING_BACKEND_HOST_URL_MAP_ENVIRONMENT_KEY)
rawProxyShardedBackendHostURLMap := os.Getenv(PROXY_SHARD_BACKEND_HOST_URL_MAP_ENVIRONMENT_KEY)
// best effort to parse, callers are responsible for validating
// before using any values read
parsedProxyBackendHostURLMap, _ := ParseRawProxyBackendHostURLMap(rawProxyBackendHostURLMap)
parsedProxyPruningBackendHostURLMap, _ := ParseRawProxyBackendHostURLMap(rawProxyPruningBackendHostURLMap)
parsedProxyShardedBackendHostURLMap, _ := ParseRawShardRoutingBackendHostURLMap(rawProxyShardedBackendHostURLMap)

whitelistedHeaders := os.Getenv(WHITELISTED_HEADERS_ENVIRONMENT_KEY)
parsedWhitelistedHeaders := strings.Split(whitelistedHeaders, ",")
@@ -282,6 +348,9 @@
EnableHeightBasedRouting: EnvOrDefaultBool(PROXY_HEIGHT_BASED_ROUTING_ENABLED_KEY, false),
ProxyPruningBackendHostURLMapRaw: rawProxyPruningBackendHostURLMap,
ProxyPruningBackendHostURLMap: parsedProxyPruningBackendHostURLMap,
EnableShardedRouting: EnvOrDefaultBool(PROXY_SHARDED_ROUTING_ENABLED_ENVIRONMENT_KEY, false),
ProxyShardBackendHostURLMapRaw: rawProxyShardedBackendHostURLMap,
ProxyShardBackendHostURLMap: parsedProxyShardedBackendHostURLMap,
ProxyMaximumBatchSize: EnvOrDefaultInt(PROXY_MAXIMUM_BATCH_SIZE_ENVIRONMENT_KEY, DEFAULT_PROXY_MAXIMUM_BATCH_SIZE),
DatabaseName: os.Getenv(DATABASE_NAME_ENVIRONMENT_KEY),
DatabaseEndpointURL: os.Getenv(DATABASE_ENDPOINT_URL_ENVIRONMENT_KEY),
36 changes: 36 additions & 0 deletions config/config_test.go
@@ -1,11 +1,13 @@
package config_test

import (
"net/url"
"os"
"testing"

"github.com/kava-labs/kava-proxy-service/config"
"github.com/stretchr/testify/assert"
"github.com/stretchr/testify/require"
)

var (
@@ -14,6 +16,8 @@ var (
proxyServiceBackendHostURLMap = os.Getenv("TEST_PROXY_BACKEND_HOST_URL_MAP")
proxyServiceHeightBasedRouting = os.Getenv("TEST_PROXY_HEIGHT_BASED_ROUTING_ENABLED")
proxyServicePruningBackendHostURLMap = os.Getenv("TEST_PROXY_PRUNING_BACKEND_HOST_URL_MAP")
proxyServiceShardedRoutingEnabled = os.Getenv("TEST_PROXY_SHARDED_ROUTING_ENABLED")
proxyServiceShardBackendHostURLMap = os.Getenv("TEST_PROXY_SHARD_BACKEND_HOST_URL_MAP")
)

func TestUnitTestEnvODefaultReturnsDefaultIfEnvironmentVariableNotSet(t *testing.T) {
@@ -53,10 +57,42 @@ func TestUnitTestParseHostMapReturnsErrEmptyHostMapWhenEmpty(t *testing.T) {
assert.ErrorIs(t, err, config.ErrEmptyHostMap)
}

func TestUnitTestParseRawShardRoutingBackendHostURLMap(t *testing.T) {
parsed, err := config.ParseRawShardRoutingBackendHostURLMap("localhost:7777>10|http://kava-shard-10:8545|20|http://kava-shard-20:8545")
require.NoError(t, err)
expected := map[string]config.IntervalURLMap{
"localhost:7777": config.NewIntervalURLMap(map[uint64]*url.URL{
10: mustUrl("http://kava-shard-10:8545"),
20: mustUrl("http://kava-shard-20:8545"),
}),
}
require.Equal(t, expected, parsed)

_, err = config.ParseRawShardRoutingBackendHostURLMap("no-shard-def")
require.ErrorContains(t, err, "expected shard definition like <host>:<end-height>|<backend-route>")

_, err = config.ParseRawShardRoutingBackendHostURLMap("invalid-shard-def>odd|number|bad")
require.ErrorContains(t, err, "unexpected <end-height>|<backend-route> sequence for invalid-shard-def")

_, err = config.ParseRawShardRoutingBackendHostURLMap("invalid-height>NaN|backend-host")
require.ErrorContains(t, err, "invalid shard end height (NaN) for host invalid-height")

_, err = config.ParseRawShardRoutingBackendHostURLMap("invalid-backend-host>100|")
require.ErrorContains(t, err, "invalid shard backend route () for height 100 of host invalid-backend-host")

_, err = config.ParseRawShardRoutingBackendHostURLMap("unsorted-shards>100|backend-100|50|backend-50")
require.ErrorContains(t, err, "shard map expects end blocks to be ordered")

_, err = config.ParseRawShardRoutingBackendHostURLMap("multiple-shards-for-same-height>10|magic|20|dino|20|dinosaur")
require.ErrorContains(t, err, "multiple shards defined for multiple-shards-for-same-height with end block 20")
}

func setDefaultEnv() {
os.Setenv(config.PROXY_BACKEND_HOST_URL_MAP_ENVIRONMENT_KEY, proxyServiceBackendHostURLMap)
os.Setenv(config.PROXY_HEIGHT_BASED_ROUTING_ENABLED_KEY, proxyServiceHeightBasedRouting)
os.Setenv(config.PROXY_PRUNING_BACKEND_HOST_URL_MAP_ENVIRONMENT_KEY, proxyServicePruningBackendHostURLMap)
os.Setenv(config.PROXY_SHARDED_ROUTING_ENABLED_ENVIRONMENT_KEY, proxyServiceShardedRoutingEnabled)
os.Setenv(config.PROXY_SHARD_BACKEND_HOST_URL_MAP_ENVIRONMENT_KEY, proxyServiceShardBackendHostURLMap)
os.Setenv(config.PROXY_SERVICE_PORT_ENVIRONMENT_KEY, proxyServicePort)
os.Setenv(config.LOG_LEVEL_ENVIRONMENT_KEY, config.DEFAULT_LOG_LEVEL)
}
41 changes: 41 additions & 0 deletions config/intervalmap.go
@@ -0,0 +1,41 @@
package config

import (
"net/url"
"sort"
)

// IntervalURLMap stores URLs associated with a range of numbers.
// The intervals are defined by their endpoints and must not overlap.
// The intervals are inclusive of the endpoints.
type IntervalURLMap struct {
UrlByEndHeight map[uint64]*url.URL
endpoints []uint64
}

// NewIntervalURLMap creates a new IntervalURLMap from a map of interval endpoint => url.
// The intervals are inclusive of their endpoint.
// i.e. if the lowest value endpoint in the map is 10, the interval is for all numbers 1 through 10.
func NewIntervalURLMap(urlByEndHeight map[uint64]*url.URL) IntervalURLMap {
endpoints := make([]uint64, 0, len(urlByEndHeight))
for e := range urlByEndHeight {
endpoints = append(endpoints, e)
}
sort.Slice(endpoints, func(i, j int) bool { return endpoints[i] < endpoints[j] })

return IntervalURLMap{
UrlByEndHeight: urlByEndHeight,
endpoints: endpoints,
}
}

// Lookup finds the value associated with the interval containing the number, if it exists.
func (im *IntervalURLMap) Lookup(num uint64) (*url.URL, uint64, bool) {
i := sort.Search(len(im.endpoints), func(i int) bool { return im.endpoints[i] >= num })

if i < len(im.endpoints) && num <= im.endpoints[i] {
return im.UrlByEndHeight[im.endpoints[i]], im.endpoints[i], true
}

return nil, 0, false
}
