Warning while loading framework and some other questions about output #30
Hi @jiansuozhe, thanks for your questions. In v1.0, art is loaded dynamically, and this warning comes from art itself.
Defining what counts as a success or failure depends on the attack and the framework you use. For example, in …
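One common success criterion for evasion attacks is that the predicted label flips between the original and the perturbed sample. The sketch below illustrates just that idea in plain numpy; the model, prototypes, and function names are all hypothetical, not Counterfit or ART code:

```python
import numpy as np

def attack_succeeded(model_predict, x_orig, x_adv):
    """One common definition of evasion success: the model's predicted
    label changes between the original and the adversarial sample.
    (Illustrative only; each framework may use its own criterion.)"""
    orig_label = int(np.argmax(model_predict(x_orig)))
    adv_label = int(np.argmax(model_predict(x_adv)))
    return adv_label != orig_label

# Toy "model": scores each class by (negative) distance to a prototype.
prototypes = np.array([[0.0, 0.0], [1.0, 1.0]])

def toy_predict(x):
    d = np.linalg.norm(prototypes - x, axis=1)
    return -d  # higher score means closer prototype

print(attack_succeeded(toy_predict, np.array([0.1, 0.1]), np.array([0.9, 0.9])))
# True: the perturbation moved the sample across the decision boundary
```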
Hopefully this is helpful!
Thank you very much for your help @moohax
Please don't hesitate to ask more questions!
Hello @moohax, could you please explain why I only got the adversarial input in my running output? Is it because of the estimator?
[-] Running attack HopSkipJump with id 12a70390 on creditfraud)
[-] Preparing attack...
Additionally, could you please tell me how we can extract the essential information from the adversarial input? For instance, how do we find the useful numbers among all of them to evaluate the model? @moohax
@jiansuozhe That's what … In terms of "evaluating a model": Counterfit is largely designed as a red team tool, and the traditional sort of "robustness" testing is not necessarily a feature that we put front and center. For this type of reporting, I would dig into what …
Hello @moohax, thank you for your reply. You mean that Counterfit is developed to test the protection level of a system and find its weaknesses, right? Could you please tell me how to use it for that? For example, if I run an attack on a target and get the running output, I should be able to extract some useful information from the output, is that correct? Can I learn, for instance, where the weaknesses in my AI system are, or how to improve my AI algorithm to make it safer? Or can I only get feedback like "my system is not safe when facing an evasion attack"? Thank you.
The useful information is the output. If there is some metric or some output you would like to see, please let us know. You could collect all of the outputs (adversarial inputs) and use them in an adversarial retraining scheme, but Counterfit has no official retraining mechanism built in. It could be done by adding training code to your target and calling … To explore targets and the completed attacks, from the …
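As a hedged sketch of the adversarial-retraining idea mentioned above (again, Counterfit itself has no such mechanism), here is what collecting adversarial inputs and refitting a model might look like with scikit-learn on synthetic data; every name and number here is illustrative:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Synthetic stand-in for the original training data.
X = rng.normal(size=(200, 5))
y = (X[:, 0] + X[:, 1] > 0).astype(int)
clf = LogisticRegression().fit(X, y)

# Pretend these came out of attack runs: perturbed inputs paired
# with their TRUE labels.
X_adv = X[:20] + rng.normal(scale=0.3, size=(20, 5))
y_adv = y[:20]

# Retrain on the union of clean and adversarial examples.
X_aug = np.vstack([X, X_adv])
y_aug = np.concatenate([y, y_adv])
clf_robust = LogisticRegression().fit(X_aug, y_aug)
```

The retrained model then replaces the original in the target; how much robustness this buys depends entirely on the attack and data.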
Hello @moohax, I found that when running HopSkipJump I can get the output, but when running the other attacks I only get errors. For instance, when running BoundaryAttack I got "Result too large", and when running BasicIterativeMethod I got "no attribute 'predict_wrapper'". Additionally, the most frequent problem is "object of type 'NoneType' has no len()". Is it because the input data (.npz) and the model file (.pkl) you provided can only be used with HopSkipJump? Or do I need to switch my target? Thank you.
@moohax, additionally, I have never created a new input data file before. Could you please give me some tips on how to create an input file (.npz)? Thank you. |
Each attack has varying requirements. The BoundaryAttack likely ran successfully, but the output may have been too big for some auxiliary process. You can trace through …
A helpful debugging step is to put …
You may need to switch the target: an attack can be either open-box (you have the model file) or closed-box (you have access to inference only). HopSkipJump is a closed-box attack and BasicIterativeMethod is an open-box attack. The implication is that the backend framework, the Adversarial Robustness Toolbox, requires an estimator/classifier that inherits from … Counterfit passes everything back to the framework to be built and run. The targets provided are for demo purposes, and we artificially force a particular ART loading process. For example, digits_keras vs digits_blackbox. If you provide no …
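The open-box vs. closed-box distinction can be illustrated with a toy linear model. This is not Counterfit or ART code, just a sketch (with hypothetical class names) of why gradient-based attacks need more access than query-based ones:

```python
import numpy as np

class OpenBoxModel:
    """Open-box: the attacker has the model file, so parameters and
    gradients are available; that is what gradient-based attacks such
    as BasicIterativeMethod rely on."""
    def __init__(self, w):
        self.w = w                    # weights are visible
    def predict(self, x):
        return x @ self.w
    def gradient(self, x):
        return self.w                 # d(score)/dx for a linear model

class ClosedBoxModel:
    """Closed-box: only inference is exposed, like a remote endpoint.
    Query-based attacks such as HopSkipJump need nothing more."""
    def __init__(self, w):
        self._w = w                   # hidden; the attacker never sees it
    def predict(self, x):
        return x @ self._w            # the only operation exposed
```

An attack that calls `gradient()` simply cannot run against the closed-box wrapper; that mismatch is one source of "no attribute" errors.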
I ran into this; it's a bug that occurs when the attack fails to run. Because the attack fails to run, …
This is a numpy zip file and is not explicitly a requirement for targets.
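For the earlier question about creating an input file: an .npz is produced with numpy's `savez`. A minimal sketch, where the array names, shapes, and filename are all illustrative (use whatever your target's loading code expects):

```python
import numpy as np

X = np.random.rand(10, 28, 28).astype(np.float32)  # 10 toy "images"
y = np.random.randint(0, 2, size=10)               # optional labels

# Write both arrays into one .npz archive.
np.savez("my_samples.npz", X=X, y=y)

# Round-trip check: arrays are recovered by name.
data = np.load("my_samples.npz")
print(data["X"].shape, data["y"].shape)  # (10, 28, 28) (10,)
```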
Similarly, in … if you are sending one sample at a time, predict would look like:

```python
def predict(self, x):
    for sample in x:
        # send sample to endpoint
        ...
```

If you are working with a local model, or an API that can accept a batch, predict would look like:

```python
def predict(self, x):
    # send x to endpoint
    ...
```

You will also return a list of lists from predict.
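As a runnable sketch of that pattern, returning a list of lists of probabilities from predict, here is a version with a local stub standing in for the real endpoint call (the class and helper names are hypothetical):

```python
import random

def _endpoint_stub(sample):
    # Stand-in for a real inference call (e.g. an HTTP request to a
    # model endpoint). Returns per-class probabilities.
    p = random.random()
    return [p, 1.0 - p]

class MyTarget:
    """Hypothetical target whose predict() queries one sample at a
    time and returns one inner list of probabilities per sample."""
    def predict(self, x):
        outputs = []
        for sample in x:                  # one request per sample
            outputs.append(_endpoint_stub(sample))
        return outputs                    # list of lists

probs = MyTarget().predict([[0.1], [0.9]])
```

A batch-capable endpoint would instead send `x` in a single request, but the return shape (a list of lists) stays the same.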
Hello @moohax, I loaded the framework art and interacted with the target satellite, but I found that almost all of the attacks did not work. The two most frequent problems are the following: … The second problem is the following: … Could you please tell me how to fix them? Thank you.
Hello @moohax, I found that if I do not use the latest version of Counterfit, my attacks actually work. When I interacted with the target tutorial, all the attacks worked. When I interacted with the target satelliteimages, only the pixel attack and the threshold attack did not run properly. When running the pixel attack, the problem was similar to before (no attribute XXX); when running the threshold attack, the problem is as follows: … I do not know if I can fix it by changing the settings of my system. Since I would like to use Counterfit to build a system for testing the security of AI models that classify images, could you please tell me whether I just need to consult the targets tutorial and satelliteimages to design my own target? Thank you.
Nice work! (I know it doesn't seem like it).
The memory allocation error looks like a bug; it seems as though the target is trying to process ALL images in the dataset as a single sample. Double check to see if …
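The slicing difference behind that failure mode (the whole dataset passed where a one-sample batch is expected) is easy to see with numpy shapes; the shapes below are illustrative:

```python
import numpy as np

# Toy dataset: 100 "images" of shape (28, 28).
X = np.zeros((100, 28, 28), dtype=np.float32)

whole_dataset = X      # shape (100, 28, 28): every image at once
one_sample = X[:1]     # shape (1, 28, 28): a batch containing one sample

print(whole_dataset.shape, one_sample.shape)  # (100, 28, 28) (1, 28, 28)
```

Note that `X[:1]` keeps the leading batch dimension, whereas `X[0]` would drop it; many frameworks expect the former.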
Hello @moohax, I downloaded the latest version of Counterfit and found that I could not load from config.json. Could you please tell me how to deal with this problem? Thank you. |
This is just a warning. You can provide a config that limits the available attacks, or provides defaults. Otherwise Counterfit will just dynamically load all attacks. Each respective framework implementation can be found under the folder named after the framework, …
Hi there,
I downloaded the latest version of Counterfit last week and installed all the required modules, but I still have some problems. When I executed the command 'load art' I got a warning:
load art
The type of the provided estimator is not yet support for automated setting of logits difference loss. Therefore, this attack is defaulting to attacking the loss provided by the model in the provided estimator.
[+] art successfully loaded with defaults (no config file provided)
which did not exist before. When I executed the 'run' command I only got the adversarial input:
run
[-] Running attack HopSkipJump with id 12a70390 on creditfraud)
[-] Preparing attack...
[-] Running attack...
┌─────────┬──────────────┬──────────────────────────┐
│ Success │ Elapsed time │ Total Queries │
├─────────┼──────────────┼──────────────────────────┤
│ 1/1 │ 4.3 │ 24550 (5740.6 query/sec) │
└─────────┴──────────────┴──────────────────────────┘
┌──────────────────────────────────────────────────────────────────────────┐
│ Adversarial Input                                                        │
├──────────────────────────────────────────────────────────────────────────┤
│ [4462.00 -2.30 1.76 -0.36 2.33 -0.82 -0.07 0.56 -0.40 -0.24 -1.53 2.03  │
│ -6.56 0.17 -1.47 -0.70 -2.28 -4.78 -2.62 -1.34 -0.43 -0.30 -0.93 0.17   │
│ -0.09 -0.15 -0.54 0.04 -0.15 239.93]                                     │
└──────────────────────────────────────────────────────────────────────────┘
[+] Attack completed 12a70390 (HopSkipJump)
There are some differences in the scan summary as well (no 'queries'): …
Additionally, could you please explain the meaning of some of the values in the scan summary and the running output? What does 'successes' mean, i.e. when can we say an attack is a 'success' or a 'failure'? I guess that 'best score' means the percentage of samples for which the attack succeeded; is that correct?
In the running output, I guess the sample index refers to the samples used in the attack, right? What do 'label' and 'attack label' mean? What do '% Eucl. dist.' and 'Elapsed Time [sec]' mean? I think 'queries' means the number of times the attack queried the target; is that correct? I also see a list of decimal numbers as the adversarial input value; are they just random numbers, or do they follow some pattern? Where do they come from? Do they depend on my attack type?
Thank you very much for your help and patience.