
SELinux Policy originator design


Purpose

When working with an SELinux-enabled system, privileges are constantly
checked to see whether a specific action is allowed. These privileges are
most often expressed as allow rules, like so:

allow sendmail_t kmail_t:process sigchld;

These rules are not written and maintained in their raw format. Instead, a
higher-level, M4-based language is used in which interfaces are defined
with more human-readable names, like so:

mta_send_mail(kmail_t)

This interface then translates the call into either more interfaces or raw
SELinux policy rules:

interface(`mta_send_mail',`
	gen_require(`
		attribute mta_user_agent;
		type system_mail_t;
		attribute mta_exec_type;
	')

	allow $1 mta_exec_type:lnk_file read_lnk_file_perms;
	corecmd_read_bin_symlinks($1)
	domtrans_pattern($1, mta_exec_type, system_mail_t)

	allow mta_user_agent $1:fd use;
	allow mta_user_agent $1:process sigchld;
	allow mta_user_agent $1:fifo_file rw_fifo_file_perms;

	dontaudit mta_user_agent $1:unix_stream_socket rw_socket_perms;
')

Most policy development currently focuses on adding privileges, not on
removing existing ones. However, it is important to be able to query where
a privilege comes from: not only to verify whether existing rules are still
needed, but also to evaluate potential updates to general policy lines.

This also makes it easier for users to understand how policies came to be as they are.

Requirement

The originator project should be able to map source calls to their final
SELinux policy rules, including the intermediate steps. This includes:

- source file from which the mapping starts (like kmail.te)
- line of the source file that is involved in a call (like kmail.te:42)
- the name of the call (like “mta_send_mail(kmail_t)”) on that line
- the intermediate steps involved when calling this interface (such as “corecmd_read_bin_symlinks(kmail_t)”)
- the final SELinux policy rules (like “allow mta_user_agent kmail_t:process sigchld”)

This data should be queryable in reverse order as well: given a final SELinux policy rule, it must show how that rule came to be.

Use Case 1 – Adding privileges

Look at the following policy statement:

mta_send_mail(kmail_t)

Perhaps, during regular operation of the kmail_t domain, you notice that you need to add the following:

allow sendmail_t kmail_t:process signal;

This can be done as a one-liner locally, or added to a policy. While investigating, you find that the following rule already exists, and you consider that the new rule is probably closely related and should therefore be added to the same statement:

allow sendmail_t kmail_t:process sigchld;

Until now, the policy developer has had no means to find out where this line comes from, so he needs to use a more resource-intensive method (and a lot of knowledge) to track down its origin before adding the new rule.

Having the following information available would speed things up considerably:

 
~$ originator -A -s sendmail_t -t kmail_t -c process -p sigchld;
kmail.te:42
`- mta_send_mail(kmail_t)
   `- allow mta_user_agent kmail_t:process sigchld;

In this output, he finds that the privilege comes from the mta_send_mail(kmail_t) statement in the kmail.te file, line 42. He also sees how the rule came to be: through the allow statement using the mta_user_agent attribute.

Use Case 2 – Reviewing privileges

While reviewing privileges, you notice that a domain holds a privilege you are not familiar with, or you do not understand why it holds it, like the following one:

allow dhcpd_t urandom_device_t:chr_file read;

You are wondering why a DHCP daemon would need to read the random device, so you ask the originator:


~$ originator -A -s dhcpd_t -t urandom_device_t -c chr_file -p read;
dhcp.te:78
+- dev_read_urand(dhcpd_t)
|  `- read_chr_files_pattern(dhcpd_t, device_t, urandom_device_t)
|     `- allow dhcpd_t urandom_device_t:chr_file read_chr_file_perms;
`- sysnet_use_ldap(dhcpd_t) [optional]
   `- dev_read_urand(dhcpd_t)
      `- read_chr_files_pattern(dhcpd_t, device_t, urandom_device_t)
         `- allow dhcpd_t urandom_device_t:chr_file read_chr_file_perms;
domain.te:115
`- dev_read_urand(domain) [tunable: global_ssp=yes]
   `- read_chr_files_pattern(dhcpd_t, device_t, urandom_device_t)
      `- allow dhcpd_t urandom_device_t:chr_file read_chr_file_perms;

From this, you notice that there are three distinct reasons why the privilege is assigned:

1. The dhcpd_t domain is explicitly asked to support reading the urandom device.
2. The dhcpd_t domain might want to use LDAP, in which case it needs to read the urandom device.
3. If the global_ssp boolean is set, all domains are granted read access to urandom.

With this information at hand, you can better understand the need for the rule. Of course, it doesn’t tell you why the privilege is needed (for that you’ll need to look at the commit history of the policies or read the policy descriptions), but it gives you good pointers on where to look.

High-level Design

To support the originator, we expect three “parts” to be needed in the application framework.

Capture plugin

The first part is what is called the “capture plugin”. Its purpose is to take the macro expansions that m4 performs on the policy statements (like mta_send_mail), format them in a useful manner, and send the result to the “workflow” component. During the build process of the SELinux policies, m4 is used to expand the macros (SELinux policy statements) into SELinux rules. Since m4 supports tracing of this expansion, the trace information might be adequate to capture the output and process it further.

The capture plugin should be as lightweight as possible, and easy to “inject” into the build framework used by the reference policy. This can be accomplished by, for instance, editing the main Makefile in the build framework to enable tracing and redirect the trace output to the capture plugin. To stay lightweight, the plugin should merely filter the output into a format that is easy to read by the next component, the “workflow” component. However, it is important that it also identifies the module(s) being built (i.e. base, kmail, ftp, selinuxutil, …), because the module acts as a key for the later processing.
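
As a rough illustration of that injection, the Makefile could invoke a small wrapper instead of m4 itself. The sketch below is hypothetical: the module handling, the flag set, and the use of stderr as transport are all assumptions, not a fixed design.

    import subprocess
    import sys

    # Hypothetical wrapper around m4: perform the normal expansion, but with
    # tracing enabled, and hand the trace to the capture plugin afterwards.
    def capture(module, m4_args):
        # Trace output goes to debug.log, so the expanded policy that the
        # build consumes on stdout is unaffected.
        subprocess.check_call(["m4", "-dtaefl", "-o", "debug.log"] + m4_args)
        with open("debug.log") as trace:
            # The module key (base, kmail, ftp, ...) is emitted first, so the
            # workflow component can group and replace data per module.
            sys.stderr.write("module=%s\n" % module)
            for line in trace:
                sys.stderr.write(line)

    if __name__ == "__main__":
        capture(sys.argv[1], sys.argv[2:])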

Workflow component

The “workflow component” takes the output of the capture plugin and stores it in the database. It must take into account that input (from the capture plugin) can come from different runs (in which case older information should be removed) or from different modules (in which case the data is added instead).

Because it takes the data from the capture plugin, it is the workflow component that interprets the output and structures it correctly. By using a good, uniform structure, it should be possible to easily manage updates to the SELinux policy statement language (and perhaps even support CIL in the future).

The other requirement for the workflow component is to act as the interface to the database. The database itself should never be accessed directly by other components: the workflow component is the only component responsible for handling database calls. This also means that the workflow component must be able to handle query requests coming from the third component, the “query interface”.
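
A minimal sketch of this gateway role, in Python with SQLite (the database suggested later in this document), assuming a source table like the one outlined in the Database section below; all table, column, and method names are illustrative.

    import sqlite3

    class Workflow(object):
        """Hypothetical gateway: no other component opens the database;
        everything goes through store() and, later, query()."""

        def __init__(self, path="originator.db"):
            # Assumes the 'source' table sketched in the Database section exists.
            self.db = sqlite3.connect(path)

        def store(self, module, records):
            # Input from a new run replaces that module's older data;
            # data from other modules stays in place and is merely added to.
            self.db.execute("DELETE FROM source WHERE module = ?", (module,))
            self.db.executemany(
                "INSERT INTO source (module, file, line, statement, arguments) "
                "VALUES (?, ?, ?, ?, ?)",
                [(module,) + tuple(r) for r in records])
            self.db.commit()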

Query interface

The “query interface” is the tool that users will use most to query the captured information. In the use cases above, it is called “originator”. This tool takes the user's query request and passes it on to the workflow component, which performs the necessary database lookups to provide the requested feedback. Because this is the component users interact with, it must be properly documented and support the usual user-side features (argument parsing, --help output, debugging, a manual page, etc.).
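
A skeleton of the user-facing side might look as follows; the option meanings are inferred from the invocations in the use cases above and do not represent a fixed command-line interface.

    import argparse

    # Hypothetical argument handling for the originator tool.
    parser = argparse.ArgumentParser(
        prog="originator",
        description="Show how a SELinux policy rule came to be.")
    parser.add_argument("-A", dest="all", action="store_true",
                        help="show all origins, including optional/tunable paths")
    parser.add_argument("-s", dest="source", help="source type, e.g. sendmail_t")
    parser.add_argument("-t", dest="target", help="target type, e.g. kmail_t")
    parser.add_argument("-c", dest="tclass", help="object class, e.g. process")
    parser.add_argument("-p", dest="perm", help="permission, e.g. sigchld")
    args = parser.parse_args()
    # The parsed query is handed to the workflow component, which performs
    # the database lookups and returns the origin tree for formatting.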

Database

The database should hold at least the following information:

- “Statement” is information on either a direct SELinux rule (allow, typeattribute, dontaudit, …) or an intermediate statement (m4 macro or definition).
- The “File + Line number + Statement” tuple identifies the source of a statement (the top-level information) as well as the statement found on that line.
- The “Statement + Expanded” tuple describes what a statement expands to. This covers not only interfaces and templates, but also definitions such as support directives (see refpolicy/policy/support).
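
As a sketch only, and assuming SQLite as suggested later in the Technical Design, these tuples could map onto tables like the following; table and column names are assumptions, not a fixed schema.

    import sqlite3

    db = sqlite3.connect("originator.db")
    db.executescript("""
    -- one row per statement occurrence in a source file
    -- (the "File + Line number + Statement" tuple)
    CREATE TABLE IF NOT EXISTS source (
        module    TEXT,     -- e.g. 'kmail'; key for replacing data on re-runs
        file      TEXT,     -- e.g. 'kmail.te'
        line      INTEGER,  -- e.g. 42
        statement TEXT,     -- e.g. 'mta_send_mail'
        arguments TEXT      -- e.g. 'kmail_t'
    );
    -- one row per statement occurring in the body of a definition
    -- (the "Statement + Expanded" tuple)
    CREATE TABLE IF NOT EXISTS expansion (
        parent    TEXT,     -- the definition, e.g. 'mta_send_mail'
        child     TEXT,     -- a statement in its body, e.g. 'allow'
        arguments TEXT      -- e.g. 'mta_user_agent $1:process sigchld'
    );
    """)
    db.commit()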

It is important to understand that SELinux policy lines can contain multiple entries, like so:

allow source_t { dest_t other_t }:file { foo bar bleh };

The policy language also supports exclusion parameters, such as the following, which grants all privileges on processes except sigkill:

allow unconfined_domain domain:process ~{ sigkill };
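
Normalising such lines before storage keeps each (source, target, class, permission) combination queryable as a separate row. A hypothetical helper pair (names are illustrative; the full permission list per class would come from seinfo or the policy definitions):

    def expand_set(field):
        """'{dest_t other_t}' -> ['dest_t', 'other_t']; plain names pass through."""
        field = field.strip()
        if field.startswith("{") and field.endswith("}"):
            return field[1:-1].split()
        return [field]

    def expand_complement(field, all_perms):
        """'~{ sigkill }' -> every permission in the class except sigkill."""
        field = field.strip()
        if field.startswith("~"):
            excluded = set(expand_set(field[1:]))
            return [p for p in all_perms if p not in excluded]
        return expand_set(field)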

Possible questions

Why not just parse the M4 files ourselves?

This is a valid alternative, but it means that you need to use the same parsing logic as M4. By using the debugging output of M4, you are certain that you interpret the statements exactly as M4 does. Also, because the plugin focuses on a mere translation towards a certain output format, it is easier to support other input formats: if the policy is later written in a non-M4 language, only the plugin needs to be updated to parse that language, rather than having to write a complete new parser.

Technical Design

Capture plugin

The capture plugin preferably uses the trace or debugging output of m4, parses it and transforms it into stripped, parseable information for the workflow component. The output should mainly be the source information as well as definitions. The expanded macros themselves are possibly not that useful, unless coding the argument parsing within the workflow component proves to be difficult.

The output that the capture plugin must pass on consists of:

- the source file and line (as that is the top of the information we need; also, the source file name acts as an index for changes)
- any definitions that are made within the SELinux policy files

Based on this information, the workflow component can then fill in the database.

The output can be captured from m4 debugging (m4 -dtaefl -o debug.log) without impacting the translation itself. This output then needs to be translated into parseable output for the workflow component. As this is text expression matching, any serious scripting language can be used for it (including Python).
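
For instance, a small Python filter along these lines could strip the trace down to the format the workflow component expects; the exact m4trace line format matched here is an assumption and may need adjusting per m4 version.

    import re
    import sys

    # With the 'f' and 'l' debug flags, GNU m4 prefixes trace lines with the
    # input file and line, e.g. "m4trace:kmail.te:24: -1- mta_send_mail(kmail_t)".
    TRACE = re.compile(r"^m4trace:(?P<file>[^:]+):(?P<line>\d+): -\d+- "
                       r"(?P<call>\w+)\((?P<args>[^)]*)\)")

    for raw in sys.stdin:
        match = TRACE.match(raw)
        if match:
            # Emit the stripped, parseable form consumed by the workflow
            # component, e.g. "kmail.te:24:mta_send_mail(kmail_t)".
            print("%s:%s:%s(%s)" % match.group("file", "line", "call", "args"))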

It is suggested that the output be sent as files to the workflow component (i.e. saved in a temporary location from which the workflow component picks it up) to make debugging easier. For performance reasons, however, it must also be possible to use standard output/input (for instance, writing to stdout and piping into the workflow component).

Workflow component

The workflow component captures the information from the capture plugin and inserts it in the database.

For our component, we suggest using a lightweight yet performant database such as SQLite. This database has the necessary APIs available and is also easily manageable by other means (such as live queries for debugging during development).

It parses input (such as source file / line / call) and translates it into manageable chunks.

Take the following as input:

kmail.te:24:mta_send_mail(kmail_t)

This should get translated into chunks such as:


file=kmail.te
linenumber=24
statement=mta_send_mail
arguments={kmail_t}
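
A minimal sketch of this chunking step, for the call format shown above (real input would need more robust parsing, e.g. nested parentheses):

    import re

    CALL = re.compile(r"^(?P<file>[^:]+):(?P<line>\d+):"
                      r"(?P<stmt>\w+)\((?P<args>.*)\)$")

    def chunk(text):
        match = CALL.match(text)
        return {"file": match.group("file"),
                "linenumber": int(match.group("line")),
                "statement": match.group("stmt"),
                "arguments": [a.strip() for a in match.group("args").split(",")]}

    print(chunk("kmail.te:24:mta_send_mail(kmail_t)"))
    # {'file': 'kmail.te', 'linenumber': 24,
    #  'statement': 'mta_send_mail', 'arguments': ['kmail_t']}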

A statement definition input could be:


        mta_send_mail:startdef
        gen_require(`
                attribute mta_user_agent;
                type system_mail_t;
                attribute mta_exec_type;
        ')
 
        allow $1 mta_exec_type:lnk_file read_lnk_file_perms;
        corecmd_read_bin_symlinks($1)
        domtrans_pattern($1, mta_exec_type, system_mail_t)
 
        allow mta_user_agent $1:fd use;
        allow mta_user_agent $1:process sigchld;
        allow mta_user_agent $1:fifo_file rw_fifo_file_perms;
 
        dontaudit mta_user_agent $1:unix_stream_socket rw_socket_perms;
        enddef

This then gets translated into:


statementdef=mta_send_mail
statement=gen_require
arguments={attribute mta_user_agent;\ntype system_mail_t;\nattribute mta_exec_type;}
 
statementdef=mta_send_mail
statement=allow
source={$1}
target={mta_exec_type}
class={lnk_file}
privileges={read_lnk_file_perms}
 
statementdef=mta_send_mail
statement=corecmd_read_bin_symlinks
arguments={$1}
 
etc.

For querying, the workflow component then works in reverse. For instance, given “allow sendmail_t kmail_t:fd use”, it looks for statement definitions that have:


  statement=allow
  source={sendmail_t | mta_user_agent | $}  (attributes looked up through seinfo or other definitions)
  target={kmail_t | $}                      # $ means any possible argument
  class={fd | $}
  privileges={use | $}

It captures the various statement definitions (including the one from mta_send_mail) that might fit. It then traverses the tree further (finding where each statement definition is used: in other statement definitions, or in source files) until it reaches the source of the call. From the source, it gets the arguments used and fills them in to determine whether this is a valid statement to work with.
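
Sketched in Python against the tables from the Database section, the upward walk could look like the following; matching on the rule's concrete fields (source, class, permission) is elided to keep the traversal visible, and all names remain illustrative.

    def origins(db, name, seen=None):
        """Hypothetical reverse walk: find every source file/line whose
        statement ultimately expands into the given statement name."""
        seen = seen if seen is not None else set()
        # Direct uses of this statement in source files.
        hits = list(db.execute(
            "SELECT file, line, arguments FROM source WHERE statement = ?",
            (name,)))
        # Definitions whose body contains this statement: recurse upwards.
        for (parent,) in db.execute(
                "SELECT DISTINCT parent FROM expansion WHERE child = ?",
                (name,)):
            if parent not in seen:  # guard against cyclic definitions
                seen.add(parent)
                hits += origins(db, parent, seen)
        return hits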

Query interface

As the query interface is mainly a shell on top of the workflow component, it focuses on formatting and representation.