A service proxy with fault injection capabilities for systematic resilience testing


A service proxy with failure injection API

This is a reference implementation of a client-side service proxy. It is meant to be used with Gremlin, a systematic resiliency testing framework. Every microservice instance making outbound API calls needs to have an associated gremlin proxy. Typically, it runs in the same VM or container alongside the calling process, and communicates over the loopback interface with the caller.

Remote services and their instances must be statically configured in the configuration file. The service proxy acts as an HTTP/HTTPS request router, forwarding requests that arrive at localhost:port to remotehost:port. It has built-in support for load balancing requests across remote service instances in a round-robin manner. Neither sticky sessions nor client-side TLS are supported. Note that while the proxy can connect to HTTPS endpoints, the caller must connect to the proxy on localhost via HTTP only. See example-config.json for an example of how to support HTTPS upstream endpoints while connecting to the proxy via http://localhost:port.
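Round-robin selection over a statically configured host list can be sketched as follows. This is an illustrative Python sketch, not the proxy's actual Go code; the `RoundRobin` class and the 127.0.0.1:8080 host entry are hypothetical.

```python
import itertools
import threading


class RoundRobin:
    """Cycle through a fixed list of upstream hosts (hypothetical helper)."""

    def __init__(self, hosts):
        if not hosts:
            raise ValueError("at least one host is required")
        self._cycle = itertools.cycle(list(hosts))
        # A proxy handles concurrent requests, so guard the shared iterator.
        self._lock = threading.Lock()

    def next_host(self):
        # Return the next host in order, wrapping around at the end of the list.
        with self._lock:
            return next(self._cycle)


# Host values are made up for illustration.
lb = RoundRobin(["127.0.0.1:8080", "https://httpbin.org"])
picks = [lb.next_host() for _ in range(3)]
```

With two hosts, the third pick wraps back to the first host.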

Failure injection

Requests that carry a pre-defined HTTP header are subjected to various forms of fault injection. Requests can be aborted (the caller gets back an HTTP 404, HTTP 503, etc.), delayed, or rewritten. The proxy can be controlled remotely using a REST API, through which rules for the various fault injection actions are installed. The [Gremlin resiliency testing framework](https://github.com/ResilienceTesting/gremlinsdk) provides a Python-based control plane library for writing high-level recipes, which are automatically broken down into low-level fault injection commands to be executed by the gremlin proxy.

Configuration options

The services section of the config file describes a list of remote services that need to be proxied. Each element in the list is a JSON dictionary object describing a single service.

The proxy block under each service specifies the local port at which requests for the remote service will be received, the IP address to bind to (defaults to localhost), and the proxy protocol. Valid protocol values are "http" and "tcp". While the proxy can work with HTTP/HTTPS and generic TCP endpoints, fault injection support for TCP endpoints is limited to aborting or delaying connections at the beginning of a TCP session.

The loadbalancer section configures the set of hosts that provide the remote service as well as the load balancing method (currently the roundrobin and random modes are supported). When the proxy protocol is set to "http", you can specify hosts with or without a scheme prefix (i.e., http/https). When the scheme prefix is absent, "http" is added to the host entry. For example, a host entry given as a bare host:port is requested as http://host:port. If you would like to proxy requests to HTTPS endpoints, host entries in the loadbalancer section must be prefixed with "https://" (e.g., https://myacc.cloudant.com).
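The scheme-prefix rule described above can be sketched as a small normalization step. This Python sketch only illustrates the documented behavior and is not the proxy's own code; `normalize_host` and the `myhost:8080` entry are hypothetical.

```python
def normalize_host(entry, default_scheme="http"):
    """Prefix a host entry with a scheme when it has none (hypothetical helper)."""
    entry = entry.strip()
    # Entries that already carry a scheme are used unchanged.
    if entry.startswith(("http://", "https://")):
        return entry
    # Bare host[:port] entries default to plain HTTP.
    return f"{default_scheme}://{entry}"


plain = normalize_host("myhost:8080")
secure = normalize_host("https://myacc.cloudant.com")
```

Here `plain` becomes `http://myhost:8080`, while `secure` is returned unchanged.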

The router block configures the REST interface of the gremlin proxy. Port 9876 is the default port on which the service proxy exposes the REST API. The gremlinheader parameter specifies the HTTP header that triggers the fault injection actions. Requests that do not contain this header are left untouched. The name parameter indicates the name of the microservice for which this service proxy is being used.

Fields loglevel, logjson, and logstash configure the logging aspects of the service proxy. All logs from the service proxy can be directly sent to a logstash server, and then subsequently piped to Elasticsearch. The Gremlin framework's assertion engine can directly interface with Elasticsearch to execute assertions over the logs generated by the gremlinproxy.
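To make the sections above concrete, here is a hedged sketch that assembles a configuration as a Python dictionary and serializes it to JSON. The section and field names services, proxy, loadbalancer, router, name, gremlinheader, loglevel, logjson, and logstash come from the text above; the remaining keys ("port", "bindhost", "protocol", "hosts", "mode") and every value other than 9876, "roundrobin", and X-Gremlin-ID are assumptions. Treat example-config.json as the authoritative reference.

```python
import json

# Key names marked "assumed" are guesses for illustration only.
config = {
    "services": [
        {
            "name": "Server",
            "proxy": {
                "port": 7777,             # assumed key and value: local listening port
                "bindhost": "127.0.0.1",  # assumed key: IP address to bind to
                "protocol": "http",       # "http" or "tcp"
            },
            "loadbalancer": {
                "hosts": ["127.0.0.1:8080", "https://httpbin.org"],  # assumed key
                "mode": "roundrobin",     # assumed key; "roundrobin" or "random"
            },
        }
    ],
    "router": {
        "name": "Client",
        "port": 9876,
        "gremlinheader": "X-Gremlin-ID",
    },
    "loglevel": "debug",          # assumed value
    "logjson": True,              # assumed value
    "logstash": "127.0.0.1:8092", # assumed address format
}

config_text = json.dumps(config, indent=2)
```

Writing `config_text` to a file yields something that could be passed to the proxy with the -c flag, provided the assumed key names match the real schema.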

An example configuration file is provided in example-config.json. It configures a proxy for a Client microservice (as indicated by the name parameter in the router block). The proxy listens locally for requests to the Server microservice and forwards them to one of the configured hosts, which include https://httpbin.org. All requests from the Client microservice that contain the HTTP header X-Gremlin-ID are subjected to fault injection.
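From the caller's side, a request that should be eligible for fault injection simply goes to the proxy on localhost and carries the tracking header. The sketch below builds such a request with Python's standard library; the port 7777, the path /get, and the header value test-1 are hypothetical, and `proxied_request` is an illustrative helper rather than part of any Gremlin library.

```python
import urllib.request


def proxied_request(path, test_id, proxy_port=7777, header="X-Gremlin-ID"):
    """Build a request to the local proxy that carries the tracking header."""
    # The caller always talks plain HTTP to the proxy on localhost.
    url = f"http://localhost:{proxy_port}{path}"
    return urllib.request.Request(url, headers={header: test_id})


req = proxied_request("/get", "test-1")
# With a proxy running, urllib.request.urlopen(req) would send the request.
```

Requests built without the header would pass through the proxy untouched.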

Building and running the proxy

  • Before you run the proxy, start the Logstash server and Elasticsearch: docker-compose -f compose-logstash-elasticsearch.yml up -d
  • Set up your Go environment and the GOPATH variable.
  • Clone the repository to the $GOPATH/go/src/github.com/gremlin folder.
  • Build: go get && go build
  • Run: ./gremlinproxy -c yourconfig.json

Proxy REST API

GET /gremlin/v1: simple hello world test

POST /gremlin/v1/rules/add: add a rule. The rule must be posted as JSON, in the following format:

  source: <source service name>,
  dest: <destination service name>,
  messagetype: <request|response|publish|subscribe|stream>
  headerpattern: <regex to match against the value of the X-Gremlin-ID trackingheader present in HTTP headers>
  bodypattern: <regex to match against HTTP message body>
  delayprobability: <float, 0.0 to 1.0>
  delaydistribution: <uniform|exponential|normal> probability distribution function

  mangleprobability: <float, 0.0 to 1.0>
  mangledistribution: <uniform|exponential|normal> probability distribution function

  abortprobability: <float, 0.0 to 1.0>
  abortdistribution: <uniform|exponential|normal> probability distribution function

  delaytime: <string> latency to inject into requests, e.g., "10ms", "1s", "5m", "3h", "1s500ms"
  errorcode: <Number> HTTP error code or -1 to reset TCP connection
  searchstring: <string> string to replace when Mangle is enabled
  replacestring: <string> string to replace with for Mangle fault
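
As an illustration of the format above, the following Python sketch builds a POST request that would install a delay rule. The field names come from the rule format; the field values, the Content-Type header, and the `add_rule_request` helper are assumptions. Whether the proxy accepts a rule with some fields omitted is also an assumption, so include every field if it does not.

```python
import json
import urllib.request


def add_rule_request(rule, base="http://localhost:9876"):
    """Build (but do not send) a request that installs a rule on the proxy."""
    body = json.dumps(rule).encode("utf-8")
    return urllib.request.Request(
        f"{base}/gremlin/v1/rules/add",
        data=body,
        headers={"Content-Type": "application/json"},  # assumed header
        method="POST",
    )


# Delay every matching request from Client to Server by one second.
rule = {
    "source": "Client",
    "dest": "Server",
    "messagetype": "request",
    "headerpattern": "test-.*",  # hypothetical regex over the header value
    "delayprobability": 1.0,
    "delaydistribution": "uniform",
    "delaytime": "1s",
}
rule_req = add_rule_request(rule)
# With a proxy running, urllib.request.urlopen(rule_req) would install the rule.
```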

POST /gremlin/v1/rules/remove: remove the rule specified in the message body (see rule format above)

GET /gremlin/v1/rules/list: list all installed rules

DELETE /gremlin/v1/rules: clear all rules

GET /gremlin/v1/proxy/:service/instances: get the list of instances for :service

PUT /gremlin/v1/proxy/:service/:instances: set list of instances for :service. :instances is a comma separated list.

DELETE /gremlin/v1/proxy/:service/instances: clear list of instances under :service

PUT /gremlin/v1/test/:id: set new test :id, that will be logged along with request/response logs

DELETE /gremlin/v1/test/:id: remove the currently set test :id
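
The remaining endpoints take their arguments in the URL path, so a thin client only needs to assemble the method and path. The helpers below are a hypothetical Python sketch built on the standard library; the base URL assumes the default port 9876 on localhost, and the instance values are made up.

```python
import urllib.request

BASE = "http://localhost:9876/gremlin/v1"  # assumes the default REST port


def api_request(method, path, base=BASE):
    """Build (but do not send) a request against the proxy's REST API."""
    return urllib.request.Request(f"{base}{path}", method=method)


def set_instances(service, instances):
    # :instances is a comma-separated list placed directly in the path,
    # so this sketch uses plain host:port entries without slashes.
    return api_request("PUT", f"/proxy/{service}/{','.join(instances)}")


def set_test_id(test_id):
    return api_request("PUT", f"/test/{test_id}")


def clear_rules():
    return api_request("DELETE", "/rules")


inst_req = set_instances("Server", ["10.0.0.1:8080", "10.0.0.2:8080"])
test_req = set_test_id("run-42")
clear_req = clear_rules()
```

Each returned object can be passed to urllib.request.urlopen once a proxy is running.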