SecGen is a Ruby application that uses virtualization software to create vulnerable virtual machines so students can learn security penetration testing techniques.
Boxes like Metasploitable2 are always the same. This project uses Vagrant, Puppet, and Ruby to quickly create randomly vulnerable virtual machines that can be used for learning or CTF events.
Computer security students benefit from engaging in hacking challenges. Practical lab work and pre-configured hacking challenges are common practice both in security education and as a pastime for security-minded individuals. Competitive hacking challenges, such as capture the flag (CTF) competitions, have become a mainstay at industry conferences and are the focus of large online communities. Virtual machines (VMs) provide an effective way of sharing targets for hacking, and can be designed to test the skills of the attacker. Websites such as Vulnhub host pre-configured hacking challenge VMs and are a valuable resource for those learning and advancing their skills in computer security. However, developing these hacking challenges is time-consuming, and once created they are essentially static. That is, once a challenge has been "solved" there is no remaining challenge for the student, and if the challenge was created for a competition or assessment, it cannot be reused without risking plagiarism and collusion.
Security Scenario Generator (SecGen) generates randomised vulnerable systems. VMs are created based on a scenario specification, which describes the constraints and properties of the VMs to be created. For example, a scenario could specify the creation of a system with a remotely exploitable vulnerability that would result in user-level compromise, and a locally exploitable flaw that would result in root-level compromise. This would require the attacker to discover and exploit both randomly selected vulnerabilities in order to obtain root access to the system. Alternatively, the scenario that is defined can be more specific, specifying certain kinds of services (such as FTP or SMB) or even exact vulnerabilities (by CVE).
SecGen is a Ruby application, with an XML configuration language. SecGen reads its configuration, including the available vulnerabilities, services, networks, users, and content, reads the definition of the requested scenario, applies logic for randomising the scenario, and leverages Puppet and Vagrant to provision the required VMs.
SecGen is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
SecGen contains modules, which install various software packages. Each SecGen module may contain or remotely source software, and each module defines its own license in the accompanying secgen_metadata.xml file.
You will need to install the following:
Ruby: https://www.ruby-lang.org/en/
Vagrant: http://www.vagrantup.com/
VirtualBox: https://www.virtualbox.org/
Puppet: http://puppet.com/
And the required Ruby Gems (including Nokogiri and Librarian-puppet)
curl -o vagrant.deb https://releases.hashicorp.com/vagrant/1.8.4/vagrant_1.8.4_x86_64.deb
sudo dpkg -i vagrant.deb
sudo apt-get install ruby-dev zlib1g-dev liblzma-dev build-essential patch virtualbox ruby-bundler
Copy SecGen to a directory of your choosing, such as /home/user/bin/SecGen, then:
cd /home/user/bin/SecGen
bundle install
Basic usage:
ruby secgen.rb run
This will use the default scenario to randomly generate VM(s).
SecGen accepts arguments that change the way it behaves. The currently implemented arguments are:
ruby secgen.rb [--options] <command>
OPTIONS:
--scenario [xml file], -s [xml file]: set the scenario to use
(defaults to scenarios/default_scenario.xml)
--project [output dir], -p [output dir]: directory for the generated project
(output will default to projects/SecGen_DATEandTIME)
--help, -h: shows this usage information
COMMANDS:
run, r: builds project and then builds the VMs
build-project, p: builds project (vagrant and puppet config), but does not build VMs
build-vms, v: builds VMs from a previously generated project
(use in combination with --project [dir])
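The options and commands listed above can be sketched with Ruby's standard OptionParser. This is only an illustration of the documented interface, not SecGen's own argument handling: the helper name parse_secgen_args and the timestamp format used for the default project directory are assumptions.

```ruby
require 'optparse'

# Hypothetical helper (not part of SecGen): maps the documented flags and
# commands onto a plain Hash.
def parse_secgen_args(argv, now: Time.now)
  options = {
    scenario: 'scenarios/default_scenario.xml',
    # Assumed timestamp format; the usage text only says "DATEandTIME".
    project: "projects/SecGen_#{now.strftime('%Y%m%d_%H%M%S')}"
  }
  parser = OptionParser.new do |opts|
    opts.banner = 'Usage: ruby secgen.rb [--options] <command>'
    opts.on('-s', '--scenario FILE', 'set the scenario to use') { |v| options[:scenario] = v }
    opts.on('-p', '--project DIR', 'directory for the generated project') { |v| options[:project] = v }
    opts.on('-h', '--help', 'shows this usage information') { options[:help] = true }
  end
  remaining = parser.parse(argv) # returns the non-option arguments

  # Short command aliases as listed under COMMANDS.
  aliases = { 'r' => 'run', 'p' => 'build-project', 'v' => 'build-vms' }
  command = remaining.first
  options[:command] = aliases.fetch(command, command)
  options
end

opts = parse_secgen_args(['--project', 'projects/demo', 'v'])
# opts[:command] is "build-vms" and opts[:project] is "projects/demo"
```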
SecGen generates VMs based on a scenario specification, which describes the constraints and properties of the VMs to be created.
Existing scenarios make SecGen's barrier for entry low: when invoking SecGen, a scenario can be specified as a command argument, and SecGen will then read the appropriate scenario definition and go about randomisation and VM generation. This removes the requirement for end users of the framework to understand SecGen's configuration specification.
Scenarios can be found in the scenarios/ directory. For example, to spin up a VM that has any random vulnerability:
ruby secgen.rb --scenario scenarios/simple_examples/simple_any_random_vulnerability.xml run
Writing your own scenarios enables you to define a VM or set of VMs with a configuration as specific or general as desired.
SecGen's scenario specification is a powerful interface for specifying the constraints of the vulnerable systems to generate. Scenarios are defined in XML configuration files that specify systems in terms of a base, services/utilities, vulnerabilities, and networks.
- system: a VM
- base: a SecGen module that defines the OS platform (VM template) used to build the VM
- vulnerability: a SecGen module that adds an insecure, hackable state (including realistic software vulnerabilities known to be in the wild, or fabricated hacking challenges)
- service: a SecGen module that adds a (relatively secure) network service
- utility: a SecGen module that adds (relatively secure) software or configuration changes
- network: a virtual network card
- generator: generates output, such as random text
- encoder: receives input, such as random text, and performs operations on it to produce output (such as encoding, encryption, or selection)
The selection logic for choosing the modules to fulfill the specified constraints can filter on any of the attributes in each module's secgen_metadata.xml file (for example, difficulty level and/or CVE), and any ambiguity results in a random selection from the remaining matching options (for example, any vulnerability matching a specified difficulty level).
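The selection logic described above can be sketched as a filter followed by a random pick. This is a minimal illustration, not SecGen's implementation: the Mod struct, the helper names, and the module paths are hypothetical, and the patterns are left unanchored because the text does not say whether SecGen anchors them.

```ruby
# Hypothetical in-memory module records; in SecGen these attributes would
# come from each module's secgen_metadata.xml.
Mod = Struct.new(:module_path, :attributes)

# Keep the modules whose attributes match every filter. Each filter value is
# treated as a regular expression (unanchored here, which is an assumption).
def matching_modules(modules, filters)
  modules.select do |mod|
    filters.all? do |attr, pattern|
      values = attr == 'module_path' ? [mod.module_path] : Array(mod.attributes[attr])
      values.any? { |value| value.match?(Regexp.new(pattern)) }
    end
  end
end

# Any remaining ambiguity is resolved by a random pick from the matches.
def select_module(modules, filters, random: Random.new)
  matching_modules(modules, filters).sample(random: random)
end

# Illustrative data only.
modules = [
  Mod.new('vulnerabilities/unix/misc/distcc_exec', { 'privilege' => ['user_rwx'], 'access' => ['remote'] }),
  Mod.new('vulnerabilities/unix/nfs/nfs_overshare', { 'privilege' => ['info_leak'], 'access' => ['remote'] }),
  Mod.new('vulnerabilities/unix/local/setuid_nmap', { 'privilege' => ['root_rwx'], 'access' => ['local'] })
]
chosen = select_module(modules, { 'access' => 'remote' })
# chosen is either the distcc_exec or the nfs_overshare record
```

With an empty filter set every module matches, which corresponds to a bare <vulnerability /> element in a scenario.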
For example, scenarios/simple_examples/simple_any_random_vulnerability.xml specifies one system with a Debian Linux base and a vulnerability. In this case the base module is specified by module name, so this selection is predefined (there is only one possible module that matches), and the vulnerability is randomly selected from the entire set of vulnerabilities because no attribute filters are specified to narrow down the potential matches.
<?xml version="1.0"?>
<scenario xmlns="http://www.github/cliffe/SecGen/scenario"
          xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
          xsi:schemaLocation="http://www.github/cliffe/SecGen/scenario">
  <system>
    <system_name>random_server</system_name>
    <base module_path="modules/bases/debian_puppet_32"/>
    <vulnerability />
  </system>
</scenario>
Note that the filters specified are regular expression (regexp) matches. For example, here any vulnerability module whose module_path matches ".*distcc" (anything followed by "distcc") can be selected:
<?xml version="1.0"?>
<scenario xmlns="http://www.github/cliffe/SecGen/scenario"
          xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
          xsi:schemaLocation="http://www.github/cliffe/SecGen/scenario">
  <system>
    <system_name>distcc_server</system_name>
    <base platform="linux"/>
    <vulnerability module_path=".*distcc" />
    <network type="private_network" range="dhcp" />
  </system>
</scenario>
Here scenarios/default_scenario.xml defines a scenario with a remotely exploitable vulnerability that grants access to a user account, and a locally exploitable root-level privilege escalation vulnerability.
<?xml version="1.0"?>
<scenario xmlns="http://www.github/cliffe/SecGen/scenario"
          xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
          xsi:schemaLocation="http://www.github/cliffe/SecGen/scenario">
  <!-- an example remote storage system, with a remotely exploitable vulnerability that can then be escalated to root -->
  <system>
    <system_name>storage_server</system_name>
    <base platform="linux"/>
    <vulnerability privilege="user_rwx" access="remote" />
    <vulnerability privilege="root_rwx" access="local" />
    <service/>
    <network type="private_network" range="dhcp"/>
  </system>
</scenario>
Note that with the exception of <system_name>, all of the XML elements within <system> will resolve to the addition of a SecGen module (a single module, plus any dependencies and default values). The attributes specified filter down the set of modules to randomly select from. For example, the network card is selected from the available SecGen network card modules that are private_networks with dhcp.
Some modules can be fed input. For example, a vulnerability can be fed information to leak as output. In this case, an NFS share will host a publicly exported file containing the leaked text:
<?xml version="1.0"?>
<scenario xmlns="http://www.github/cliffe/SecGen/scenario"
          xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
          xsi:schemaLocation="http://www.github/cliffe/SecGen/scenario">
  <system>
    <system_name>file_server</system_name>
    <base platform="linux"/>
    <vulnerability module_path=".*nfs_overshare">
      <input into="strings_to_leak">
        <value>Leak this text, and a randomly generated flag</value>
        <generator type="flag_generator"/>
      </input>
    </vulnerability>
    <network type="private_network" range="dhcp"/>
  </system>
</scenario>
Encoders, generators, and literal values can be nested. For example, as above, but the message and flag are first base64 encoded:
<?xml version="1.0"?>
<scenario xmlns="http://www.github/cliffe/SecGen/scenario"
          xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
          xsi:schemaLocation="http://www.github/cliffe/SecGen/scenario">
  <system>
    <system_name>file_server</system_name>
    <base platform="linux"/>
    <vulnerability module_path=".*nfs_overshare">
      <input into="strings_to_leak">
        <encoder name="BASE64 Encoder">
          <input into="strings_to_encode">
            <value>Leak this text, and a randomly generated flag</value>
            <generator type="flag_generator"/>
          </input>
        </encoder>
      </input>
    </vulnerability>
    <network type="private_network" range="dhcp"/>
  </system>
</scenario>
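How nested values, generators, and encoders might be evaluated can be sketched as a small recursive function. This is an assumption-laden illustration rather than SecGen's code: the node Hash layout, the resolve helper, and the flag format produced by the stand-in generator are all hypothetical, and every encoder node is treated as a base64 encoder for brevity.

```ruby
require 'securerandom'

# Hypothetical tree mirroring the nested <input> above: literal values,
# generators, and encoders can each appear wherever a value is expected.
def resolve(node, generators)
  case node[:kind]
  when :value
    [node[:text]]
  when :generator
    [generators.fetch(node[:type]).call]
  when :encoder
    # Resolve the children first, then encode each resulting string.
    inputs = node[:inputs].flat_map { |child| resolve(child, generators) }
    inputs.map { |string| [string].pack('m0') } # base64 without line breaks
  else
    raise ArgumentError, "unknown node kind: #{node[:kind].inspect}"
  end
end

# Stand-in generator; the flag format is an assumption.
generators = { 'flag_generator' => -> { "flag{#{SecureRandom.hex(8)}}" } }

tree = {
  kind: :encoder,
  inputs: [
    { kind: :value, text: 'Leak this text, and a randomly generated flag' },
    { kind: :generator, type: 'flag_generator' }
  ]
}
strings_to_leak = resolve(tree, generators)
# strings_to_leak holds two base64 strings: the message and the flag
```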
SecGen is designed to be easily extendable with modules that define vulnerabilities and other kinds of software, configuration, and content changes.
As stated above, the types of modules supported in SecGen are:
- base: a SecGen module that defines the OS platform (VM template) used to build the VM
- vulnerability: a SecGen module that adds an insecure, hackable state (including realistic software vulnerabilities known to be in the wild, or fabricated hacking challenges)
- service: a SecGen module that adds a (relatively secure) network service
- utility: a SecGen module that adds (relatively secure) software or configuration changes
- network: a virtual network card
- generator: generates output, such as random text
- encoder: receives input, such as text, and performs operations on it to produce output (such as encoding, encryption, or selection)
Each vulnerability module is contained within the modules/vulnerabilities directory tree, which is organised to match the Metasploit Framework (MSF) modules directory structure. For example, the distcc_exec vulnerability module is contained within: modules/vulnerabilities/unix/misc/distcc_exec/.
The root of the module directory always contains a secgen_metadata.xml file and also contains puppet files, which are used to make a system vulnerable.
The secgen_metadata.xml file defines the attributes of the module. In the case of vulnerability modules, this file contains information about the vulnerability, including CVE, privilege level the successful attacker gains, access level required in order to attack (remote vs local), metasploit module that can be used to exploit the vulnerability, CVSS score and vector string, difficulty level, and description.
This information is used to filter module selection for scenarios, and also used to specify modules that conflict with each other or to satisfy dependencies between modules.
Example:
<?xml version="1.0"?>
<vulnerability xmlns="http://www.github/cliffe/SecGen/vulnerability"
               xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
               xsi:schemaLocation="http://www.github/cliffe/SecGen/vulnerability">
  <name>DistCC Daemon Command Execution</name>
  <author>Lewis Ardern</author>
  <module_license>MIT</module_license>
  <description>Distcc has a documented security weakness that enables remote code execution.</description>
  <type>distcc</type>
  <privilege>user_rwx</privilege>
  <access>remote</access>
  <platform>unix</platform>
  <!--optional vulnerability details-->
  <difficulty>medium</difficulty>
  <cve>CVE-2004-2687</cve>
  <cvss_base_score>9.3</cvss_base_score>
  <cvss_vector>AV:N/AC:M/Au:N/C:C/I:C/A:C</cvss_vector>
  <reference>https://www.rapid7.com/db/modules/exploit/unix/misc/distcc_exec</reference>
  <reference>OSVDB-13378</reference>
  <software_name>distcc</software_name>
  <software_license>GPLv2</software_license>
  <!--optional hints-->
  <msf_module>exploit/unix/misc/distcc_exec</msf_module>
  <hint>On a non-standard port</hint>
  <solution>Distcc is vulnerable, and on a high port number.</solution>
  <!--Cannot co-exist with other installations-->
  <conflict>
    <software_name>distcc</software_name>
  </conflict>
</vulnerability>
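A metadata file like the one above could be read into a simple attribute table for filtering. The sketch below uses REXML (bundled with Ruby) purely for illustration; SecGen itself lists Nokogiri among its gems, and the read_metadata helper and the resulting Hash layout are assumptions.

```ruby
require 'rexml/document'

# Collect the top-level elements of a secgen_metadata.xml document into a
# Hash of element name => list of values, so that repeated elements such as
# <reference> are all kept. Elements with children (such as <conflict>)
# become Hashes of child name => text.
def read_metadata(xml)
  root = REXML::Document.new(xml).root
  attributes = Hash.new { |hash, key| hash[key] = [] }
  root.elements.each do |element|
    value =
      if element.has_elements?
        element.elements.to_a.to_h { |child| [child.name, child.text.to_s.strip] }
      else
        element.text.to_s.strip
      end
    attributes[element.name] << value
  end
  attributes
end

# A shortened copy of the example above.
sample = <<~XML
  <?xml version="1.0"?>
  <vulnerability xmlns="http://www.github/cliffe/SecGen/vulnerability">
    <name>DistCC Daemon Command Execution</name>
    <privilege>user_rwx</privilege>
    <reference>https://www.rapid7.com/db/modules/exploit/unix/misc/distcc_exec</reference>
    <reference>OSVDB-13378</reference>
    <conflict>
      <software_name>distcc</software_name>
    </conflict>
  </vulnerability>
XML
metadata = read_metadata(sample)
# metadata['reference'] holds both references; metadata['conflict'] holds one Hash
```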
<name>: The name of the module, with spaces and Title Caps.
<author>: Repeated one or more times for authors of the SecGen module and to acknowledge any authors of adapted Puppet modules from PuppetForge.
<module_license>: The free and open source license the module is released under.
<description>: A description of the module and what it does.
<type>: A general category, in terms of the network protocol used (for example, ftp) if relevant.
<privilege>: The level of privilege an attacker ends up with when exploitation is successful.
- Information leakage: info_leak (e.g. nfs/nfs_overshare)
- Shell access: root_rwx, user_rwx (e.g. local/setuid_nmap, smb/samba_symlink_traversal)
- Read and write access: root_rw, user_rw (e.g. access_control_misconfigurations/uid_vi_root, smb/samba_public_writable_share)
- Read access: root_r (e.g. access_control_misconfigurations/uid_less_root)
As other challenges are added, database leaks will be added as a privilege level option.
<access>: The level of access the attacker needs to carry out the attack: local access, such as an existing shell or user account, or remote, such as a vulnerable network service.
<platform>: What OS(s) the module is compatible with.
<difficulty>: How hard the challenge is.
<cve>: For real vulnerabilities, the CVE where available.
<cvss_base_score>: The CVSS v2 Base Score, as calculated from the CVSS vector.
<cvss_vector>: The CVSS v2 vector string, for example: 'AV:L/AC:H/Au:N/C:N/I:P/A:C'
- Access Vector (AV): L = Local access, A = Adjacent access, N = Network access
- Access Complexity (AC): H = High, M = Medium, L = Low
- Authentication (Au): N = None required, S = Single instance, M = Multi instance
- Confidentiality Impact (C): N = None, P = Partial, C = Complete
- Integrity Impact (I): N = None, P = Partial, C = Complete
- Availability Impact (A): N = None, P = Partial, C = Complete
NIST provides a handy online tool.
<reference>: Repeated for URLs with further information about the vulnerability, exploit, and software. For example, information about the vulnerability, links to exploits, and so on.
<software_name>: Package name of software installed by the Puppet modules (as named in software repositories).
<software_license>: The license of the installed/bundled software.
<msf_module>: A Metasploit module (if one exists) to compromise the vulnerability. For example, "exploit/unix/misc/distcc_exec".
<hint>: A hint to direct the attacker in the right direction.
<solution>: A solution to the challenge.
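As a cross-check when filling in the CVSS fields, the v2 base score can be recomputed from the vector string. The sketch below applies the base equations and metric weights from the CVSS v2 specification; the function name is hypothetical and SecGen is not known to perform this calculation itself.

```ruby
# Metric weights from the CVSS v2 specification.
CVSS2_WEIGHTS = {
  'AV' => { 'L' => 0.395, 'A' => 0.646, 'N' => 1.0 },
  'AC' => { 'H' => 0.35, 'M' => 0.61, 'L' => 0.71 },
  'Au' => { 'M' => 0.45, 'S' => 0.56, 'N' => 0.704 },
  'C' => { 'N' => 0.0, 'P' => 0.275, 'C' => 0.660 },
  'I' => { 'N' => 0.0, 'P' => 0.275, 'C' => 0.660 },
  'A' => { 'N' => 0.0, 'P' => 0.275, 'C' => 0.660 }
}.freeze

# Compute the CVSS v2 base score for a vector such as
# 'AV:N/AC:M/Au:N/C:C/I:C/A:C'. Raises KeyError on unknown metrics or values.
def cvss2_base_score(vector)
  w = vector.split('/').to_h do |part|
    metric, value = part.split(':')
    [metric, CVSS2_WEIGHTS.fetch(metric).fetch(value)]
  end
  impact = 10.41 * (1 - (1 - w['C']) * (1 - w['I']) * (1 - w['A']))
  return 0.0 if impact.zero? # f(Impact) is 0 when there is no impact

  exploitability = 20 * w['AV'] * w['AC'] * w['Au']
  (((0.6 * impact) + (0.4 * exploitability) - 1.5) * 1.176).round(1)
end

score = cvss2_base_score('AV:N/AC:M/Au:N/C:C/I:C/A:C')
# score is 9.3, matching the cvss_base_score in the distcc example
```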
A module may conflict with other modules based on matches to attributes or module_path. Each conflict can have multiple conditions which must all be met for this to be considered a conflict.
For example, to conflict with modules that provide a web server and install apache:
<conflict>
  <type>httpd</type>
  <software_name>apache</software_name>
</conflict>
That example would not conflict with other web servers that don't include "apache" in the software_name.
If multiple <conflict> elements are specified, a match on any one of them is enough to prevent the conflicting module from being selected.
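The matching rule for conflicts (every condition within one <conflict> must match, and any single matching <conflict> is enough) can be sketched as follows. The Hash representation and helper names are hypothetical, and the patterns are treated as unanchored regular expressions, which is an assumption.

```ruby
# A conflict is a Hash of attribute name => regexp source. It applies to a
# candidate only if every one of its conditions matches.
def conflict_applies?(conflict, candidate_attributes)
  conflict.all? do |attr, pattern|
    Array(candidate_attributes[attr]).any? { |value| value.match?(Regexp.new(pattern)) }
  end
end

# A module conflicts with the candidate if any one of its conflicts applies.
def conflicts_with?(conflicts, candidate_attributes)
  conflicts.any? { |conflict| conflict_applies?(conflict, candidate_attributes) }
end

conflicts = [{ 'type' => 'httpd', 'software_name' => 'apache' }]
apache = { 'type' => ['httpd'], 'software_name' => ['apache2'] }
nginx = { 'type' => ['httpd'], 'software_name' => ['nginx'] }

apache_blocked = conflicts_with?(conflicts, apache) # true: both conditions match
nginx_blocked = conflicts_with?(conflicts, nginx)   # false: software_name differs
```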
When creating modules, conflicts should be avoided wherever possible, as they can significantly reduce the randomisation options for complex scenarios, and can cause complications in the resolution of scenarios (which is currently solved via bruteforce).
A module can include <requires> tags to require that other modules satisfying a set of conditions are also added to the scenario. When selecting a module, each of these dependencies is resolved by checking whether a module that satisfies the conditions has already been selected, in which case nothing happens; otherwise a module that satisfies all the conditions is randomly selected and added to the scenario. This is recursive, so a module can require modules that themselves require other modules.
When conflicts occur (say for example, a previously selected module conflicts with all the valid options for resolving a dependency) the scenario is regenerated. This bruteforce approach is fairly effective, but <conflict> tags should be avoided wherever possible because they add complexity and reduce randomisation possibilities.
A module can have multiple <requires>, each of which will ensure a single module fulfills all of the conditions, which are regexp matches against attributes.
For example, for a module that needs to have a repo refresh (apt-get update) first:
<requires>
  <type>update</type>
</requires>
Or for a module that requires apache to be installed by another module (as an alternative to the module installing apache itself):
<requires>
  <type>httpd</type>
  <software_name>apache</software_name>
</requires>
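The recursive dependency resolution described above might look roughly like this. It is a sketch under stated assumptions, not SecGen's resolver: the Candidate struct, the helper names, and the attribute values are hypothetical, and conflict handling and scenario regeneration are left out.

```ruby
# Hypothetical module record: a name, its attributes, and a list of
# <requires> condition sets (attribute name => regexp source).
Candidate = Struct.new(:name, :attributes, :requires)

def satisfies?(mod, conditions)
  conditions.all? do |attr, pattern|
    Array(mod.attributes[attr]).any? { |value| value.match?(Regexp.new(pattern)) }
  end
end

# Add `mod` to `selected`, then resolve each of its <requires> in turn:
# reuse an already selected module if one satisfies the conditions,
# otherwise pick a random match from `available` and recurse into it.
def add_with_dependencies(mod, available, selected, random: Random.new)
  return selected if selected.include?(mod)

  selected << mod
  mod.requires.each do |conditions|
    next if selected.any? { |chosen| satisfies?(chosen, conditions) }

    options = available.select { |candidate| satisfies?(candidate, conditions) }
    raise "no module satisfies #{conditions}" if options.empty?

    add_with_dependencies(options.sample(random: random), available, selected, random: random)
  end
  selected
end

# Illustrative chain: writable_shadow requires apache, which requires update.
update = Candidate.new('update', { 'type' => ['update'] }, [])
apache = Candidate.new('apache', { 'type' => ['httpd'], 'software_name' => ['apache'] },
                       [{ 'type' => 'update' }])
writable_shadow = Candidate.new('writable_shadow', { 'type' => ['access_control_misconfiguration'] },
                                [{ 'type' => 'httpd', 'software_name' => 'apache' }])

selected = add_with_dependencies(writable_shadow, [update, apache, writable_shadow], [])
# selected.map(&:name) is ["writable_shadow", "apache", "update"]
```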
In this (silly) example, writable_shadow requires apache which requires update:
In another silly example, here apache requires ftp, but all ftp modules conflict with writable_shadow:
A module can declare that it uses input it receives. The most common input parameters are "strings_to_encode", and "strings_to_leak". read_fact can also be repeated for any other configuration parameters for the module.
A module definition can specify default inputs to be used when none are specified via the scenario. This means that if a vulnerability module is selected without input (for example, randomly selected from all vulnerabilities), input for that module's parameters can be generated automatically.
For example:
secgen_metadata.xml:
<read_fact>strings_to_leak</read_fact>
<!--if an input is not specified in the scenario-->
<default_input into="strings_to_leak">
  <value>Plain text from the metadata default, destined for strings_to_leak...</value>
</default_input>
<default_input into="some_random_setting">
  <value>true</value>
</default_input>
Note that the scenario could select on and pass through specific parameters:
scenario.xml:
<vulnerability read_fact="strings_to_leak">
  <input into="strings_to_leak">
    <value>LEAK THIS!</value>
  </input>
</vulnerability>
In the above case the "some_random_setting" parameter would take on its default value (["true"]), and the strings leaked would be the value coming from the scenario (["LEAK THIS!"]).
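The way scenario inputs override metadata defaults in this example can be pictured as a Hash merge. This is only a sketch of the described behaviour; the effective_inputs helper is hypothetical.

```ruby
# Inputs given in the scenario replace the module's defaults for the same
# parameter; parameters the scenario does not mention keep their defaults.
def effective_inputs(default_inputs, scenario_inputs)
  default_inputs.merge(scenario_inputs)
end

defaults = {
  'strings_to_leak' => ['Plain text from the metadata default, destined for strings_to_leak...'],
  'some_random_setting' => ['true']
}
scenario = { 'strings_to_leak' => ['LEAK THIS!'] }

inputs = effective_inputs(defaults, scenario)
# inputs['strings_to_leak']     is ["LEAK THIS!"]
# inputs['some_random_setting'] is ["true"]
```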
Parameter values can be randomly selected from a set of options using the random selection encoder module. For example:
secgen_metadata.xml:
<default_input into="some_random_setting">
  <encoder name="Random String Selector">
    <value>true</value>
    <value>false</value>
  </encoder>
</default_input>
As a result, the module would be randomly configured each time it is used, unless a value is explicitly specified in the scenario.
The default inputs can also be constructed using complex nested generators and encoders:
secgen_metadata.xml:
<read_fact>strings_to_leak</read_fact>
<!--if an input is not specified in the scenario-->
<default_input into="strings_to_leak">
  <value>Plain text from the metadata default, destined for strings_to_leak...</value>
  <encoder type="string_encoder">
    <input into="strings_to_encode">
      <!--encode the following strings-->
      <value>Encoded text from the metadata default, destined for strings_to_leak...</value>
      <value>More encoded text from the metadata default, destined for strings_to_leak...</value>
      <generator module_path=".*random.*"/>
    </input>
  </encoder>
</default_input>
Each vulnerability, service, and utility module contains Puppet files which are used to provision the software onto the VMs.
The module directory contains
- a Puppet module
- Puppet entry point (same file name as the module directory, .pp)
This example should help illustrate. Distcc has a documented security weakness that enables remote code execution. The example below comes from modules/vulnerabilities/unix/misc/distcc_exec.
A manifests/ directory contains the Puppet files for a distcc_exec Puppet class.
As is convention, one file for Installation:
class distcc_exec::install{
  package { 'distcc':
    ensure => installed
  }
}
One file for configuration (plus a template file):
class distcc_exec::config{
  file { '/etc/default/distcc':
    require => Package['distcc'],
    ensure  => present,
    owner   => 'root',
    group   => 'root',
    mode    => '0777',
    content => template('distcc_exec/distcc.erb')
  }
}
One file for ensuring the service starts:
class distcc_exec::service{
  service { 'distcc':
    ensure => running
  }
}
So far this is all typical Puppet.
Finally, we add a module entry point, a .pp file with the same name as the module directory:
include distcc_exec::install
include distcc_exec::config
include distcc_exec::service
To learn more about Puppet and understand how to write modules, check out the SecGen Wiki and also http://puppetlabs.com/
Encoders and generators have code that is evaluated at project build time, such as encoding text and generating flags and other content. In each case, this is a Ruby script located within the module directory at secgen_local/local.rb. Although normally called by SecGen, these scripts can be executed directly: they accept all the parameter inputs as command line arguments and return the output in JSON format on stdout. Other human-readable output is written to stderr.
#ruby modules/encoders/string/base64/secgen_local/local.rb --strings_to_encode "encode this" --strings_to_encode "and this"
BASE64 Encoder
Encoding '["encode this", "and this"]'
Encoded: ["ZW5jb2RlIHRoaXM=", "YW5kIHRoaXM="]
["ZW5jb2RlIHRoaXM=","YW5kIHRoaXM="]
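The calling convention shown above (repeated command line arguments in, JSON on stdout, human-readable messages on stderr) can be sketched as a small stand-alone script. This is not the actual base64 module code: the function names are hypothetical and OptionParser is used here only for illustration.

```ruby
require 'json'
require 'optparse'

# Collect every --strings_to_encode argument into an array.
def parse_inputs(argv)
  strings = []
  OptionParser.new do |opts|
    opts.on('--strings_to_encode STRING') { |string| strings << string }
  end.parse(argv)
  strings
end

def encode_all(strings)
  strings.map { |string| [string].pack('m0') } # base64 without line breaks
end

# Human-readable progress goes to `err`; the machine-readable JSON result
# goes to `out`, so a caller can parse stdout without filtering.
def run_encoder(argv, out: $stdout, err: $stderr)
  strings = parse_inputs(argv)
  err.puts 'BASE64 Encoder'
  err.puts "Encoding '#{strings}'"
  encoded = encode_all(strings)
  err.puts "Encoded: #{encoded}"
  out.puts JSON.generate(encoded)
  encoded
end

# Example call, equivalent to the command line shown above:
# run_encoder(['--strings_to_encode', 'encode this', '--strings_to_encode', 'and this'])
```

Keeping the two streams separate means the JSON on stdout stays parseable even when the script prints progress messages.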
By default output is to projects/SecGen_[CurrentTime]/
The project output includes:
- A vagrant configuration for spinning up the boxes.
- A directory containing all the required puppet modules. A Librarian-Puppet file is created to manage modules, and some required modules may be obtained via PuppetForge, and therefore an Internet connection is required when building the project.
- A de-randomised scenario XML file. This is an XML scenario file that can be used to replay these systems. Any randomisation that has been applied should be un-randomised in this output (compared to the original scenario file). This can also be used later for grading, scoring, or giving hints.
The VM building process takes the project output and builds the VMs.
- more modules!
- Windows basebox and vulnerabilities
- CTF-style modules
- automated scoring
- Web-frontend
- variables/datastore
Development team:
- Dr Z. Cliffe Schreuders http://z.cliffe.schreuders.org
- Tom Shaw
- Jason Keighley
- Lewis Ardern -- author of the first proof-of-concept release of SecGen
- Connor Wilson
Many thanks to everyone who has contributed to the project. The above list is not exhaustive; please refer to the GitHub history.
This project is supported by a Higher Education Academy (HEA) learning and teaching in cyber security grant (2015-2017).
We encourage contributions to the project. Please see the wiki for guidance on how to contribute.
Briefly, please fork from github.com/cliffe/SecGen, create a branch, make and commit your changes, then create a pull request.