Advanced Use

There is a flow to Counterfit, both in creating targets and in executing attacks. This section covers some advanced use cases and shows how flexible Counterfit can be. Counterfit does not care how you get traffic back and forth; it only cares that outputs are returned to the backend framework in the correct way.

Hiding Malicious Queries

Malicious queries look malicious. On some platforms, the predicted image is presented to the ML engineer in a dashboard. If the platform suddenly starts to receive queries that look like television static, it will set off alarms. To hide malicious queries, create a function inside MyTarget that sends normal traffic to the endpoint and call it from the __call__ function. In the example below, every query the attack algorithm makes is preceded by a random number of normal queries. This increases your traffic, but depending on the situation it could be worth it.

import random
import requests
    ...

    def normal_traffic(self, num_queries):
        # Send benign samples drawn from the target's own data (self.X),
        # in the same payload format as the attack query
        for _ in range(num_queries):
            random_sample = random.choice(self.X)
            requests.post(self.model_endpoint, data={"input": random_sample})

    def __call__(self, x):
        sample = x[0].tolist()

        # Send between 1 and 24 benign queries before each attack query
        num_benign_queries = random.randrange(1, 25)
        self.normal_traffic(num_benign_queries)

        response = requests.post(self.model_endpoint, data={"input": sample})

        results = response.json()
        cat_proba = results["confidence"]
        not_a_cat_proba = 1 - cat_proba

        return [cat_proba, not_a_cat_proba]
    ...
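
To put a number on the extra traffic, the sketch below computes the expected overhead of this scheme, assuming the benign-query count is drawn with random.randrange(1, 25) as in the example. The function name expected_overhead and the budget of 1,000 attack queries are illustrative assumptions, not part of Counterfit.

```python
def expected_overhead(low, high):
    """Expected number of benign queries sent per attack query.

    random.randrange(low, high) draws uniformly from low..high-1,
    so the expectation is the plain average of those integers.
    """
    values = range(low, high)
    return sum(values) / len(values)


benign_per_attack = expected_overhead(1, 25)
# Each attack query costs the benign queries plus the attack query itself
total_per_attack = benign_per_attack + 1

attack_budget = 1000  # hypothetical number of attack queries
print(f"benign queries per attack query: {benign_per_attack}")
print(f"expected total requests: {int(total_per_attack * attack_budget)}")
```

With these settings the endpoint sees roughly 13.5 requests for every attack query, so narrowing the range is the simplest way to trade cover for speed.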

Using a Proxy

Most penetration testing tools can create proxies that allow arbitrary traffic to be passed into an internal network. Counterfit does not require any special configuration for this use case. Simply configure the proxy and point model_endpoint at the target or proxy, just as you would for RDP or SSH. The example below uses the requests library with a SOCKS proxy.

Python Requests

Set up any proxy you like, then use requests to send traffic to the target. SOCKS support in requests depends on the optional PySocks package (pip install requests[socks]).

import requests
from counterfit.core.interfaces import AbstractTarget

class MyTarget(AbstractTarget):
    ...
    model_endpoint = "https://10.10.2.11/predict"
    ...

    def request_proxy_session(self):
        # Route both HTTP and HTTPS traffic through the SOCKS proxy
        session = requests.Session()
        session.proxies = {
            'http':  'socks5://10.10.1.3:9050',
            'https': 'socks5://10.10.1.3:9051'
        }
        return session

    def __call__(self, x):
        sample = x[0].tolist()
        session = self.request_proxy_session()
        response = session.post(self.model_endpoint, data={"input": sample})
        ...

Sending Inputs and Collecting Outputs from a Different Location

Write a function to send a query, then write a function to collect the output. Sometimes an API will provide a redirect or a separate URI from which to collect the results. The example below is fairly simple, but we have used this technique to collect results from a number of obscure places.

import requests
    ...

    def send_query(self, query_data):
        response = requests.post(self.model_endpoint, data=query_data)
        return response

    def collect_result(self, collection_endpoint):
        response = requests.get(collection_endpoint)
        return response

    def __call__(self, x):
        sample = x[0].tolist()

        # The first response tells us where the result can be collected
        response = self.send_query({"input": sample})
        collection_endpoint = response.json()['location']

        result = self.collect_result(collection_endpoint)

        final_result = result.json()
        cat_proba = final_result["confidence"]
        not_a_cat_proba = 1 - cat_proba

        return [cat_proba, not_a_cat_proba]

    ...

Adding Startup Commands

At some point you may want to load all frameworks or perform some checks on start. Counterfit uses cmd2 to load a startup script for this exact reason. Create a .counterfit file at the root of the project, and Counterfit will execute the commands in it on start.

load art
load textattack

Overriding Functions in the Parent Target Class

You can technically override any of the functions in the parent target class, though you should be careful not to override functions unnecessarily. That said, outputs_to_labels is one function that can be comfortably overridden in certain scenarios.

A primary reason to override outputs_to_labels is to incorporate any knowledge you have about the decision threshold for a target model. By default, outputs_to_labels in the parent class reports the class with the highest confidence as the model output. For two classes, that corresponds to an implicit threshold of 0.5; for three classes, the winning class needs a confidence of at least 0.3333, and so on.

As an example, suppose you learn through your investigations that a fraud classifier reports fraudulent only when the confidence score exceeds 0.9. In this case, you could override outputs_to_labels as follows:

import numpy as np

def outputs_to_labels(self, output, threshold=0.9):
  # score[0] is treated as the confidence for the 'fraudulent' class
  output = np.atleast_2d(output)
  return ['fraudulent' if score[0] > threshold else 'benign' for score in output]
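
To see how much the threshold matters, the sketch below labels the same outputs both ways. These are stand-alone functions written for illustration, not Counterfit's implementation: argmax_labels only approximates the default behavior described above, and the sample scores are made up.

```python
def argmax_labels(outputs, class_names):
    """Default-style labeling: pick the class with the highest confidence."""
    labels = []
    for scores in outputs:
        best = max(range(len(scores)), key=lambda i: scores[i])
        labels.append(class_names[best])
    return labels


def threshold_labels(outputs, threshold=0.9):
    """Threshold labeling: 'fraudulent' only when scores[0] exceeds threshold."""
    return ['fraudulent' if scores[0] > threshold else 'benign' for scores in outputs]


# Hypothetical model outputs: [fraudulent_confidence, benign_confidence]
outputs = [[0.95, 0.05], [0.70, 0.30], [0.20, 0.80]]

print(argmax_labels(outputs, ['fraudulent', 'benign']))
print(threshold_labels(outputs))
```

The second sample is the interesting one: at 0.70 it is labeled fraudulent under the highest-confidence rule but benign under the 0.9 threshold, so an attack judged against the wrong rule could report success or failure incorrectly.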

Training A Local Model to Attack

Counterfit does not include any of the training functionality from the frameworks that are normally used for whitebox attacks. However, it is still possible to train a model inline and then attack the newly trained model. Put the training code inside the __init__ function. Beware that the target will fail to load if the __init__ function fails. Errors should be handled gracefully so that the reload command will continue to work.

def __init__(self):
  ...
  self.model = self.train_model(self.X, self.y)

def train_model(self, X, y):
  ...
  model.fit(X, y)
  ...
  return model

def __call__(self, x):
  results = self.model.predict(x)
  return results
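
One way to handle training errors gracefully is to catch them in __init__ and defer the failure until the target is actually queried. The sketch below is a minimal illustration under several assumptions: LocalTarget is a plain class standing in for an AbstractTarget subclass, MeanThresholdModel is a toy model invented for this example, the load_error attribute is hypothetical, and X and y are passed to __init__ only to keep the example self-contained.

```python
class MeanThresholdModel:
    """Toy stand-in for a real model (hypothetical, for illustration only)."""

    def fit(self, X, y):
        if not X or len(X) != len(y):
            raise ValueError("X and y must be non-empty and the same length")
        positives = [x for x, label in zip(X, y) if label == 1]
        negatives = [x for x, label in zip(X, y) if label == 0]
        # Decision boundary halfway between the two class means
        pos_mean = sum(positives) / len(positives)
        neg_mean = sum(negatives) / len(negatives)
        self.boundary = (pos_mean + neg_mean) / 2
        return self

    def predict(self, x):
        return [1 if value > self.boundary else 0 for value in x]


class LocalTarget:
    """Plain class standing in for an AbstractTarget subclass."""

    def __init__(self, X, y):
        self.X, self.y = X, y
        self.model = None
        self.load_error = None
        try:
            self.model = self.train_model(self.X, self.y)
        except Exception as error:
            # Keep the target loadable even when training fails
            self.load_error = error
            print(f"[!] training failed: {error}")

    def train_model(self, X, y):
        return MeanThresholdModel().fit(X, y)

    def __call__(self, x):
        if self.model is None:
            raise RuntimeError(f"model unavailable: {self.load_error}")
        return self.model.predict(x)


good = LocalTarget([0.1, 0.2, 0.8, 0.9], [0, 0, 1, 1])
print(good([0.15, 0.85]))

broken = LocalTarget([0.1, 0.2], [0])  # mismatched lengths: training fails
print(broken.model is None)
```

Construction of the broken target still succeeds, so the object can be inspected or reloaded after the data is fixed, and the stored error explains what went wrong when the target is queried.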
