[feature] can it be install on docker? #25

Closed
dianwoshishi opened this issue Feb 24, 2022 · 10 comments
Comments

@dianwoshishi
Contributor

This project is really cool!
Is there a way to install it on Docker?

@vaclavbartos
Collaborator

No, there's currently no support for Docker.

If a VM is an option for you, there is a Vagrantfile which sets up a basic installation of NERD (intended for a quick test, not production deployment). See https://github.com/CESNET/NERD/wiki/Installation-and-running

There's also a script to perform full installation on a clean CentOS system (although some things still have to be configured manually), see https://raw.githubusercontent.com/CESNET/NERD/master/install/install_centos7.sh

Notice: Although NERD is open-source, you're probably the first one trying to install it outside the development team. I'm willing to help you with everything, but expect that the installation and maintenance might not be as easy as with some well-known mature projects.
Also, the primary input data NERD expects are alerts from CESNET's Warden system, which is not public/open. However, we recently added support for various blacklists and AlienVault OTX as primary inputs (and MISP has been supported for some time as well), so it should be useful even without Warden data. Only the web interface still needs to be updated; it's currently too focused on Warden (we plan to change it soon).

@dianwoshishi
Contributor Author

Thanks for your kind reply. I am interested in this work.

I am working on making it run in Docker. It can now run in a CentOS 7 container: https://github.com/dianwoshishi/NERD/tree/development

But there are some problems that confuse me.

  • Are there restrictions on IP access to the web URL /nerd? Inside the container, I can access the URL using the links command:
    [screenshot]

But when I access it from outside the container, errors occur in the httpd logs:
[screenshots of the httpd log]

The errors shown above are:

  • RecursionError: maximum recursion depth exceeded while calling a Python object
  • NameError: name 'len' is not defined

However, I am sure that mongod.service and rabbitmq-server are running. The Docker port mapping works too.
[screenshots]

NERDd is also running.
[screenshot]
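
A minimal sketch of how the "services are running and the port mapping works" claims above could be checked from a script rather than from screenshots. The function name `port_is_open` is hypothetical, and the port numbers are assumptions: 80 for httpd inside the container, and the upstream defaults 27017 for MongoDB and 5672 for RabbitMQ. Adjust them to the actual setup.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


if __name__ == "__main__":
    # Assumed ports; none of these values come from the NERD configuration.
    checks = {"httpd": 80, "MongoDB": 27017, "RabbitMQ": 5672}
    for name, port in checks.items():
        state = "reachable" if port_is_open("127.0.0.1", port) else "NOT reachable"
        print(f"{name} on port {port}: {state}")
```

Running it once inside the container and once on the host (against the mapped port) shows whether a failure is in the service itself or in the port mapping. A successful TCP connection only proves that something is listening, not that the application behind it is healthy.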

The errors may occur because:

  • I have no data in the database
  • there are other errors in the container environment

Maybe you can help me find the root cause, given your knowledge of the project.

Thanks again for your attention.

@dianwoshishi
Contributor Author

I appreciate your offer to help with everything.

I have some questions about the FMP scores:

The features used by the ML algorithm to calculate the FMP score have 21 dimensions, fewer than mentioned in the paper in Future Generation Computer Systems '19. Is this the latest version?

The second question: without real-world data, it is difficult for others to train a model and try out the FMP feature.

Thanks for your work, I really like it!

@vaclavbartos
Collaborator

There are no access restrictions, it should work the same from within the container and from the outside.

The first error seems to be related to the pymongo package. I can see they released version 4.0.x quite recently. All my installations use 3.x, so maybe it's because of some incompatibility between versions (I really have to specify versions in the requirements files!). Try to downgrade to pymongo==3.12.3.
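
A minimal sketch of a guard for the version mismatch suspected above. It assumes the only requirement is to stay on the pymongo 3.x series; the function name `is_supported_pymongo` is hypothetical and not part of NERD.

```python
def is_supported_pymongo(version: str) -> bool:
    """Return True if the version string belongs to the pymongo 3.x series."""
    major = int(version.split(".")[0])
    return major == 3


if __name__ == "__main__":
    from importlib.metadata import PackageNotFoundError, version

    try:
        installed = version("pymongo")
    except PackageNotFoundError:
        print("pymongo is not installed")
    else:
        if is_supported_pymongo(installed):
            print(f"pymongo {installed}: OK")
        else:
            print(f"pymongo {installed}: try 'pip install pymongo==3.12.3'")
```

Pinning the version in the requirements file, as mentioned above, would make such a runtime check unnecessary.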

You can ignore the second error (undefined len). It's just a problem during cleanup when the process is shutting down, with no unwanted consequences. I know about it, it just wasn't a priority to fix it, since it's not really an issue.

Regarding the "Exited too quickly" errors - the reason is missing configuration. Some modules need API keys to access the corresponding data sources. You can confirm this in the log files (/var/log/nerd/*).
For Shodan and OTX, simply register a free account on their websites and get the API key. Then enter it under the corresponding key in /etc/nerd/nerd.yml.
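
A minimal sketch of how missing API keys could be detected before the modules start. It works on an already-parsed configuration (a plain dict, for example the result of loading /etc/nerd/nerd.yml with a YAML parser). The key names `shodan.api_key` and `otx.api_key` and the helper `missing_keys` are hypothetical; check the real key names in nerd.yml.

```python
from typing import Any, Dict, List


def missing_keys(config: Dict[str, Any], required: List[str]) -> List[str]:
    """Return the dotted paths from `required` that are absent or empty in `config`."""
    missing = []
    for path in required:
        node: Any = config
        for part in path.split("."):
            if isinstance(node, dict) and part in node:
                node = node[part]
            else:
                node = None
                break
        if node is None or node == "":
            missing.append(path)
    return missing


# Hypothetical key names, used for illustration only.
REQUIRED = ["shodan.api_key", "otx.api_key"]

example = {"shodan": {"api_key": "abc123"}, "otx": {"api_key": ""}}
print(missing_keys(example, REQUIRED))  # -> ['otx.api_key']
```

A check like this turns a module that silently exits at startup into an explicit message naming the key that still has to be filled in.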

I can't help you with Warden, since the data are only available to members of the sharing community. It's not impossible to become part of it, but it generally means having some kind of detection tool (e.g. a honeypot, IDS, ...) and sharing its results, plus some basic level of trust, i.e. we would need to know who you are. I'll have to discuss it with my colleagues who run Warden.

@vaclavbartos
Collaborator

vaclavbartos commented Feb 27, 2022

Regarding FMP:
TL;DR: If this is the only/main reason you're trying to run NERD, then stop, it's (currently) useless.

Long explanation: The FMP scoring method for the FGCS'19 paper was implemented and evaluated "offline" using a set of scripts and a static dataset. It was later implemented into NERD, but only in a (too) simplified form. Many of the features which are used in the paper are too difficult or impossible to compute in real time in NERD with its current architecture and data model. So, for the first version, I used just a small subset of features which were easy to get. It turned out that the results of this simplified model are not very good, so, although the implementation is still there, the results are quite hidden from normal users (it appears only as "fmp" attribute in IP detail; the "reputation score" (the number on a coloured background) is based on an unrelated simple formula, no ML).
Since I had to work on other projects in the last two years, it stayed like that until now.

However, very recently, I started to work on this again with one of my colleagues. We're reviving the old scripts, thinking about how to compute all the needed features, and plan to do a proper implementation of the FMP score in NERD. Maybe not exactly the same as in the paper, but as close as possible. However, it can take a few months.

Anyway, as you point out, the current implementation of FMP (as well as the planned one) is designed to work with data from Warden only, and without access to such data, it's indeed impossible for others to use or test it. At least in "online"/"real-time" mode in NERD.

For research purposes, a static dataset can be used. There is a public dataset of anonymized data from Warden here: https://data.mendeley.com/datasets/p6tym3fghz/1
This is one week of data only, which is too short for training the FMP model, but in case you're a researcher and want to do some experiments with it, I think I can provide you with a bigger dataset.

Regarding FMP in NERD - as a long-term plan I want to make the whole of NERD more general, i.e. Warden data should be just one of the possible sources, not the main one. The same holds for the IP scoring mechanism - it should utilize all the available data. However, this needs a lot of changes (even the IP scoring will probably work very differently from how FMP is designed now), so it won't be finished soon.

If you want to further discuss this (or anything else) outside this public issue, write me at bartos@cesnet.cz (but let's leave the issues regarding NERD installation here).

@dianwoshishi
Contributor Author

You are so kind!
I'll try the methods you mentioned to fix the runtime errors.

@dianwoshishi
Contributor Author

The idea of FMP is cool. Thanks for your persistent work on it. I'll discuss related work by email in the future.

@dianwoshishi
Contributor Author

I've solved the problem of not being able to access /nerd from outside the container.

This error has nothing to do with your project; it is the browser's security policy that prevented me from accessing it. Sorry about that.

I mapped port 80 of the container to port 10080 on my host.

However, port 10080 is on the Edge browser's list of restricted ports. This is the root cause of the failed access to /nerd.
https://docs.microsoft.com/en-us/deployedge/microsoft-edge-policies#explicitlyallowednetworkports
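
A minimal sketch of a check that would catch this kind of problem when choosing a host port for the mapping. The set below is only a partial, illustrative subset of the ports that Chromium-based browsers such as Edge refuse to connect to; the full list is longer and changes between browser releases. The names `RESTRICTED_PORTS_SUBSET`, `is_known_restricted` and `first_allowed_port` are hypothetical.

```python
from typing import Iterable, Optional

# Partial, illustrative subset of browser-restricted ports (assumption:
# not the complete list, which differs between browsers and versions).
RESTRICTED_PORTS_SUBSET = frozenset({21, 22, 25, 110, 143, 6000, 6667, 10080})


def is_known_restricted(port: int) -> bool:
    """Return True if the port is in the illustrative restricted subset."""
    return port in RESTRICTED_PORTS_SUBSET


def first_allowed_port(candidates: Iterable[int]) -> Optional[int]:
    """Return the first candidate port not in the subset, or None."""
    for port in candidates:
        if not is_known_restricted(port):
            return port
    return None


print(is_known_restricted(10080))         # -> True
print(first_allowed_port([10080, 8080]))  # -> 8080
```

Mapping container port 80 to a host port outside the restricted list (8080 in this example) avoids the need to change any browser policy.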

@dianwoshishi
Contributor Author

I've now finished getting your project to run in Docker.

The changes can be found at: https://github.com/dianwoshishi/NERD/tree/development

@dianwoshishi
Contributor Author

Safari also blocks access to port 10080!
