Trained model usage #1

Closed
goooroooX opened this issue Jan 12, 2018 · 3 comments

@goooroooX

Hi,
Could you please post a few lines of sample code that check a domain name against the trained model and return the result (generated/non-generated)?
Thanks!

@drhyrum

drhyrum commented Jan 12, 2018

Thanks for your interest. This code is meant to reproduce the figures in the paper
https://arxiv.org/abs/1611.00791

but you can also query the trained model directly, as follows.

After you've trained the model using data X,y and have valid_chars
https://github.com/endgameinc/dga_predict/blob/master/dga_classifier/lstm.py#L28-L46

you may query the model for a domain (e.g. "domain.xyz") using the following steps:
(1) remove the TLD from the domain
(2) encode domain characters as integer tokens and pad
(3) query the model

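# assumes you've already trained the model and have access to "valid_chars" and "maxlen"
import tldextract
from keras.preprocessing import sequence

query_domain = 'domain.xyz'
# (1) remove the TLD (and any subdomain) from the domain
query_domain_stripped = tldextract.extract(query_domain).domain
# (2) encode domain characters as integer tokens and pad to maxlen
#     (a character that was not seen during training raises KeyError)
query = sequence.pad_sequences([[valid_chars[ch] for ch in query_domain_stripped]], maxlen=maxlen)
# (3) query the model
print(model.predict(query))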

>> [[0.00203814]]
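
To turn that score into the generated/non-generated answer asked for above, a minimal sketch follows. It assumes the model output is the probability that the domain is generated (benign labelled 0, DGA labelled 1) and that 0.5 is an acceptable cutoff. The helper name `label_from_score` and the default threshold are illustrative and not part of this repo.

```python
def label_from_score(score, threshold=0.5):
    """Map a model score to a label.

    Assumption: a higher score means the domain is more likely generated.
    The 0.5 threshold is only a starting point and should be tuned.
    """
    return "generated" if score > threshold else "non-generated"


# With the Keras model above, something like:
#   label_from_score(float(model.predict(query)[0][0]))
print(label_from_score(0.00203814))
```

A lower threshold flags more domains as generated at the cost of more false positives, so the cutoff is worth choosing against your own validation data.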

You can find more information in a related blog post:
https://www.endgame.com/blog/technical-blog/using-deep-learning-detect-dgas

@goooroooX
Author

Thank you for a sample.
Is it possible to avoid using external libraries (keras)? I'm trying to implement a lightweight monitoring solution and am limited to native Python libraries in a sandbox.
Thanks!

@drhyrum

drhyrum commented Jan 16, 2018

This isn't straightforward, and beyond the scope of this repo.

One option: export the keras model as a tensorflow model, then investigate using something like https://github.com/riga/tfdeploy to make numpy the only dependency. I'm not aware of a fail-safe method for the first step (exporting keras to tensorflow), but you might find some resources here:

Another route would be to create your own model from scratch using another framework that you find suitable. For example, I believe that numpy is the CPU backend for https://github.com/chainer/chainer. In that case, this repo would only serve as a guide (and a source of data) for rewriting and training your own model.
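
Whichever route is taken for the model itself, the preprocessing step can at least be done with native Python. The sketch below mimics the default behaviour of keras `pad_sequences` (zeros added at the front, over-long inputs truncated from the front). It assumes that tokens in `valid_chars` start at 1 so that 0 is free for padding, and that the TLD has already been removed. The function name `encode_domain` and the toy `valid_chars` are illustrative only; a real deployment would need the mapping and `maxlen` saved from training.

```python
def encode_domain(domain, valid_chars, maxlen):
    """Encode a TLD-stripped domain as integer tokens, left-padded with zeros.

    Assumes maxlen is a positive integer and that every character of
    `domain` is a key of `valid_chars` (otherwise KeyError is raised).
    """
    tokens = [valid_chars[ch] for ch in domain]
    tokens = tokens[-maxlen:]  # keep the last maxlen tokens, like truncating='pre'
    return [0] * (maxlen - len(tokens)) + tokens  # pad at the front, like padding='pre'


# Toy character map for illustration only (tokens start at 1, 0 means padding).
toy_valid_chars = {ch: i + 1 for i, ch in enumerate("abcdefghijklmnopqrstuvwxyz0123456789-")}

print(encode_domain("abc", toy_valid_chars, 5))
print(encode_domain("abcdef", toy_valid_chars, 4))
```

This only covers encoding and padding; the forward pass of the trained network would still need a separate keras-free implementation.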

@drhyrum drhyrum closed this as completed Feb 13, 2018