How to run your code with SLEIPNIR dataset #4

vietvo89 · 2021-03-23T07:56:54Z

Hi Zay

I got the SLEIPNIR dataset from its author. However, your sample code uses a data format different from that of the SLEIPNIR dataset, which consists of several individual files. How can I run your MalGAN with the SLEIPNIR dataset?

Thanks

ZaydH · 2021-03-23T08:20:24Z

It has been a few years since I worked on this code, and I am going off of memory.

The basic idea is you need to convert the SLEIPNIR files into a NumPy ndarray tensor. I found the old code I believe I used and uploaded it to a gist for you. Please try that. You may need to modify it to make it work.
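For readers without access to the gist, the conversion idea can be sketched roughly as follows. This is not the gist's code. It assumes each sample lives in its own file and that a per-file loader returns one feature vector; the function name `stack_feature_files`, the `*.npy` pattern, and the use of `np.load` as the default loader are all illustrative assumptions, so swap in whatever reader matches the actual SLEIPNIR file format.

```python
from pathlib import Path
from typing import Callable

import numpy as np


def stack_feature_files(
    directory,
    load_one: Callable[[Path], np.ndarray] = np.load,  # assumed loader; replace as needed
    pattern: str = "*.npy",  # assumed file pattern, not the real SLEIPNIR layout
) -> np.ndarray:
    """Stack one-vector-per-file samples into a (n_samples, n_features) array."""
    paths = sorted(Path(directory).glob(pattern))
    if not paths:
        raise FileNotFoundError(f"no files matching {pattern!r} in {directory}")
    # Flatten each sample so every row is a 1-D feature vector.
    rows = [np.asarray(load_one(p)).ravel() for p in paths]
    # np.stack raises ValueError if the vectors differ in length.
    return np.stack(rows).astype(np.float32)
```

The resulting matrix can then be written with `np.save` and passed to the training script in whatever way it expects.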

vietvo89 · 2021-03-23T11:24:24Z

Thank you so much. Let me try your code. One more thing: once I have trained MalGAN and have a model, how can I use your code to generate malware and evaluate the success rate of your method against the black-box detector? Is it right to use only the trained generator to produce benign-looking samples from malware?

ZaydH · 2021-03-24T10:00:05Z

I am not sure exactly what you mean, so I will answer my best guess at your question. If this is off base, let me know.

The MalwareGAN code serially trains a blackbox detector (you can specify the type) as well as the GAN. I am not sure what you mean by "have a model". You could in theory replace my blackbox detector with your own if you wanted, but you would need to handle that integration.

To determine the success rate as I did, I recommend splitting your data into three parts: training, validation, and test. You use the training set to train the model (with validation for hyperparameter selection). Only then do you use the held-out test set to see how well your model performs on totally unseen data. This is the standard flow.
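A minimal sketch of such a three-way split is shown here, assuming the samples are already in a NumPy array with one row per sample. The function name `three_way_split` and the 60/20/20 default fractions are illustrative choices, not taken from the repository.

```python
import numpy as np


def three_way_split(x: np.ndarray, val_frac: float = 0.2,
                    test_frac: float = 0.2, seed: int = 0):
    """Shuffle rows once, then carve out disjoint train/validation/test sets."""
    if not 0.0 < val_frac + test_frac < 1.0:
        raise ValueError("val_frac + test_frac must lie strictly between 0 and 1")
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(x))
    n_test = int(round(test_frac * len(x)))
    n_val = int(round(val_frac * len(x)))
    test_idx = idx[:n_test]
    val_idx = idx[n_test:n_test + n_val]
    train_idx = idx[n_test + n_val:]  # everything left over is training data
    return x[train_idx], x[val_idx], x[test_idx]
```

Fixing the seed keeps the test set identical across runs, so hyperparameter changes are never evaluated against samples the model has already seen.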

vietvo89 · 2021-03-28T23:00:56Z

Thanks, Zay.

I read other papers that demonstrate how to attack with a GAN. I want to double-check with you: once I have a trained GAN model, do I use the generator to attack, i.e., to make malware evade detectors? The flow would be to feed malware to the generator and then evaluate how well its output evades the detector.

Thanks

ZaydH · 2021-03-29T11:38:45Z

Yes.

After you train the model, you take a new malware vector and run it through the generator. This will yield a new vector that should evade the detector. To verify your workflow, you can then run that modified vector through the detector to see if it is marked as clean. This secondary sanity check is clearly not possible in practice but works for scientific evaluation/debugging.
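The check described above can be sketched as a small helper. This is not the repository's API: `generator` and `detector` are hypothetical callables standing in for the trained models, and the sketch assumes the detector returns a per-sample score where values at or above `threshold` mean "flagged as malware".

```python
import numpy as np


def evasion_rate(malware: np.ndarray, generator, detector,
                 threshold: float = 0.5) -> float:
    """Fraction of generator-modified malware vectors the detector marks clean.

    `generator` and `detector` are hypothetical callables: the first maps a
    batch of malware vectors to modified vectors, the second maps a batch of
    vectors to scores (assumed: score >= threshold means "malware").
    """
    adversarial = np.asarray(generator(malware))
    scores = np.asarray(detector(adversarial)).ravel()
    return float(np.mean(scores < threshold))
```

Running this on held-out test malware, rather than on the vectors used during training, gives the success rate on unseen samples.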

ZaydH mentioned this issue Mar 24, 2021

installation+implementation #5

Closed

ZaydH closed this as completed Mar 27, 2021
