Join GitHub today
GitHub is home to over 28 million developers working together to host and review code, manage projects, and build software together.Sign up
Java Python Shell
Fetching latest commit…
Cannot retrieve the latest commit at this time.
|Failed to load latest commit information.|
[Introduction] Fluxbuster is designed to cluster and classify candidate fast flux networks from SIE DNS traffic data. Data is read from timestamped compressed data files and stored in a relational database. The clustering process uses an agglomerative hierarchical clustering algorithm. Classification involves a pre-constructed decision tree. [Compilation] Build Requirements: 1. Java JDK 1.6 or greater 2. ant To compile Fluxbuster from the directory containing the build.xml file execute the following command: ant build This will build all of the class files and place them into the bin directory. To clean up the build from the same directory execute the following command: ant clean NOTE: all of the libraries necessary to build and execute Fluxbuster are included in the lib directory. [Documentation] The javadocs for Fluxbuster are located in the doc directory. They can be generated from the source code by executing: ant doc To clean up the javadocs execute: ant clean-doc [Installation] As Fluxbuster only currently supports PostgreSQL, PostgreSQL version 8.4 or greater should be installed to your database host. After this is accomplished complete the following steps to install the Fluxbuster database. 1. Create a database user to own the Fluxbuster database. If Fluxbuster will be installed on the same host as the database the Fluxbuster user would only need local access to the database, otherwise the user must be given permission to access the database from the host running Fluxbuster. 2. Create a database to house the Fluxbuster db schema. 3. Grant your database user ownership of the newly created database. 4. As the Fluxbuster user execute the fluxbuster_schema.sql file found in the resources directory on the Fluxbuster database. This will create the tables, indexes, etc. required by Fluxbuster. [Configuration] Fluxbuster is configured through several .properties files in the bin directory once the program has been built. What follows is a description of each of the attributes in the .properties files. fluxbuster.properties SELECTED_CFD_FILE : The path to a list of domains that should be clustered regardless of their candidate score. If this is properties is not specified it is ignored. GOOD_CANDIDATE_THRESHOLD : The candidate score threshold. Those candidate domains with a candidate score greater than the threshold are considered for clustering. This value should be between 0.0 and 1.0 ex. 0.5 MAX_CANDIDATE_DOMAINS : The total number of candidate domains to cluster. ex. 1000 DIST_MATRIX_MULTITHREADED : Should the calculation of the distance matrix be multithreaded. Valid values are 'true' or 'false' DIST_MATRIX_NUMTHREADS : The number of threads to use in multithreaded distance matrix calculation. This value must be a positive integer. If DIST_MATRIX_MULTITHREADED is set to 'false' this value is ignored. LINKAGE_TYPE : The linkage type to used during hierarchical clustering. Valid values are 'Single' or 'Complete' MAX_CUT_HEIGHT : The maximum cut height used during hierarchical clustering. The value should be between 0.0 and 1.0. ex. 0.75 CANDIDATE_FLUX_DIR : The directory containing the SIE source files. DBINTERFACE_CONNECTINFO : The JDBC connection string to the fluxbuster database. ex. jdbc:postgresql://host.example.com:54321/fluxbuster?user=sample&password=secret DBINTERFACE_DRIVER : The full class name of the JDBC driver class to use. ex. org.postgresql.Driver DBINTERFACE_CLASS : The full class name of the DBInterface implementation to use. ex. edu.uga.cs.fluxbuster.db.PostgresDBInterface The following are options related BoneCP. In most cases the default options will suffice. For further information about each option see: http://jolbox.com/bonecp/downloads/site/apidocs/com/jolbox/bonecp/BoneCPConfig.html DBINTERFACE_PARTITIONS : BoneCPConfig.setPartitionCount DBINTERFACE_MIN_CON_PER_PART : BoneCPConfig.setMinConnectionsPerPartition DBINTERFACE_MAX_CON_PER_PART : BoneCPConfig.setMaxConnectionsPerPartition DBINTERFACE_RETRY_ATTEMPTS : BoneCPConfig.setAcquireRetryAttempts DBINTERFACE_RETRY_DELAY : BoneCPConfig.setAcquireRetryDelayInMs commons-logging.properties org.apache.commons.logging.Log : Sets the logging implementation class. By default this is set to 'org.apache.commons.logging.impl.SimpleLog' which can be configured in the simplelog.properties file simplelog.properties org.apache.commons.logging.simplelog.defaultlog : the default minimum logging level org.apache.commons.logging.simplelog.showdatetime : display a timestamp for each log entry org.apache.commons.logging.simplelog.showlogname : Set to true if you want the Log instance name to be included in output messages. [Usage] You can execute Fluxbuster via the fluxbuster.sh bash script. usage: fluxbuster.sh If none of the options g, f, s, c are specified then the program will execute as if all of them have been specified. Otherwise, the program will only execute the options specified. -?,--help Print help message. -c,--classify-clusters Classify clusters. (Optional) -d,--start-date <arg> The start date of the input data. Should be in yyyyMMdd format. -e,--end-date <arg> The end date of the input data. Should be in yyyyMMdd format. -f,--calc-features Calculate cluster features. (Optional) -g,--generate-clusters Generate clusters. (Optional) -s,--calc-similarity Calculate cluster similarities. (Optional) To access the results of Fluxbuster you can use the following SQL examples to query the Fluxbuster database. 1. Get all the domains and to which cluster they belong for a the run on 2010-11-13. select clusters.cluster_id, domains.domain_name, domains.second_level_domain_name from clusters join domains on clusters.log_date = domains.log_date and clusters.domain_id = domains.domain_id where clusters.log_date = '2010-11-13' 2. Get all of the cluster features for each cluster for the run on 2010-11-13. select cluster_feature_vectors.* from clusters join cluster_feature_vectors on clusters.log_date = cluster_feature_vectors.log_date and clusters.cluster_id = cluster_feature_vectors.cluster_id where clusters.log_date = '2010-11-13' 3. Get all of the classified clusters for the run on 2010-11-15. select distinct(cluster_classes.*) from clusters join cluster_classes on clusters.log_date = cluster_classes.log_date and clusters.cluster_id = cluster_classes.cluster_id where clusters.log_date = '2010-11-15'