# Introduction

This notebook introduces the library SPAN, a small library designed to make using setools 4 simple in a Jupyter notebook such as this one.

Jupyter notebooks are an interactive environment that lets us write text (in Markdown) and code together. What's powerful is that the code is executable (unless you are viewing this on the web in a read-only mode). That lets you write queries and text together at the same time. You can get a feel for what's possible in this awesome notebook on [Regex Golf from XKCD](http://nbviewer.jupyter.org/url/norvig.com/ipython/xkcd1313.ipynb). There is also the more official (and boring) [introduction](https://jupyter-notebook-beginner-guide.readthedocs.io/en/latest/).

Using setools within Jupyter notebook is an amazingly productive way to do policy analysis. It becomes simple to keep notes alongside any queries you do or, almost more importantly, write simple scripts that allow you to do more powerful policy analysis.

To get started, let's import the library and load a Fedora 25 binary and source policy as an example:

In [None]:
# Import span - this complication is just to handle running this in the development tree
try:
    import span as se
except ImportError:
    import os
    path = os.path.dirname(os.getcwd())
    import sys
    sys.path.insert(0, path)
    import span as se
p = se.load_policy("fedora-25-policy.30")
ps = se.load_refpolicy_source("serefpolicy-fedora-25")

# Example - Protecting Passwords

We'll get to the details of how to use the library soon. But first, let's start with an example to demonstrate some of the power that we get from this environment.

Let's do that by answering a common security question: what applications can write to the shadow file and are any of those applications controllable by users?

But first, did I mention that we can include images in these notebooks?

<img src="https://i.imgur.com/D5LidQ1.jpg">

## Domains That Can Write /etc/shadow

Anyway, for the first part of that question, we can do a simple search for rules using the method terules_query:

In [None]:
passwd_writers = p.terules_query(target="shadow_t", tclass=["file"], perms=["write", "append", "relabelto"])
passwd_writers

A few things to note about this if you are new to Jupyter notebook. First, by default Jupyter will display the output from the last expression, which is why just putting the variable we assigned the results to on its own line caused the display. If we didn't need to use the output later, we could have just omitted assigning the output to a variable.

Next, you'll notice that the output is a nicely formatted table. The results are actually in a Pandas [DataFrame](http://pandas.pydata.org/pandas-docs/stable/dsintro.html). [Pandas](http://pandas.pydata.org) is a common and very powerful data analysis tool for Python. Here it lets us display the data nicely, including allowing sorting (try clicking on the column titles to sort by that column).

It also lets us search further. For example, let's see which domains were allowed access directly to `shadow_t` rather than to an attribute that includes `shadow_t`.

In [None]:
passwd_writers[passwd_writers.target == "shadow_t"]

Just to make certain this is the right type, let's check the file contexts from the source:

In [None]:
print(ps.file_contexts("shadow_t"))

That looks right . . . I guess. I wonder what the heck `/etc/security/opasswd` is??!? Oh well, it covers good, old-fashioned `/etc/shadow`, so let's move on.

## Finding Domain Transitions from Login Domains

Now we have all of the domains that can write to shadow, so let's answer the second part by determining whether any of the login domains can transition to these domains. Just to keep this short, I'm going to just check for the 3 standard login types (normally you would need to figure out whether there were more).

So let's make a list of those types:

In [None]:
login_domains = ["user_t", "sysadm_t", "secadm_t"]

# Just verify that these are in the policy
for domain in login_domains:
    print(p.lookup_type(domain))

Now we can find the domain transitions that are allowed:

In [None]:
user_transitions = p.terules_query(source="user_t", tclass=["process"], perms=["transition"])
user_transitions

## Putting It Together - Accessible Domains That Can Write /etc/shadow
And now we can see if `user_t` is allowed to transition to any of the domains that can write shadow passwords. To do this, we are going to leverage the built-in sets from Python (which are fantastic). You can get a single column from a DataFrame with `DataFrame.column_name` and, because that is iterable, build a set from that. So we build a set from the targets of the user transitions and the source of the password writers.

Once you have the sets, it's simple to perform set intersection (with the `&` operator) to find the types that are in both sets.
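
As a minimal, standalone sketch of that intersection, here are plain Python lists standing in for the two DataFrame columns. The list contents are made up for illustration and are not real query results.

```python
# Made-up stand-ins for two DataFrame columns (not real query results).
transition_targets = ["passwd_t", "updpwd_t", "chfn_t", "passwd_t"]
shadow_writer_sources = ["passwd_t", "updpwd_t", "useradd_t"]

# Building sets drops duplicates; & keeps only names present in both.
reachable_writers = set(transition_targets) & set(shadow_writer_sources)
print(sorted(reachable_writers))  # ['passwd_t', 'updpwd_t']
```

Any iterable works as input to `set()`, which is why a single DataFrame column can be passed in directly.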

In [None]:
user_passwd_writers = set(user_transitions.target) & set(passwd_writers.source)
user_passwd_writers

So, as expected, `passwd_t` and `updpwd_t` can write to shadow and are accessible from `user_t`. `sandbox_domain` is more surprising.

Let's see what types are in that attribute.

In [None]:
p.types_in_attribute("sandbox_domain")

## Automating Checks for All Login Domains

Wait, you should be saying, we're only answering the question for one login domain! Let's build a simple function to do this for all of the login domains (because that nicely shows the power of having a full programming language right here for analysis).

In [None]:
# Yay for stupidly long function names
def check_accessible_domains_that_can_write_shadow(login_domains):
    for login_domain in login_domains:
        # These are the same queries we did above, just using the passed in login domain types as appropriate.
        writers = p.terules_query(target="shadow_t", tclass=["file"], perms=["write", "append", "relabelto"])
        accessible_domains = p.terules_query(source=login_domain, tclass=["process"], perms=["transition"])
        ad_set = set(accessible_domains.target)
        # Add the login domain to see if it has direct access.
        #
        # Since we are doing comparisons it must be the object here and _not_ the
        # string for the type. These kinds of issues crop up occasionally so keep an eye out for them.
        ad_set.add(p.lookup_type(login_domain))
        print("Shadow writers accessible by " + login_domain + ":")
        print(ad_set & set(writers.source))
        
# Notice how I'm referring to a list that we created way up the page?
check_accessible_domains_that_can_write_shadow(login_domains)

Look at that - both `secadm_t` and `sysadm_t` have direct access. Let's see what that looks like.

In [None]:
passwd_writers[passwd_writers.source == "sysadm_t"]

And the moral of the story, kids, is never forget about broad relabeling privileges. Being able to relabel a file to a type has the same security implications as writing the same type. Though, come to think of it, you have to also be able to create a file in `/etc` with the correct name. Let's check that.

In [None]:
p.terules_query(source="sysadm_t", target="etc_t", perms=["add_name"])

## Checking Automatic Transitions

Notice that we only checked for _allowed_ transitions. We didn't see if any were automatic. Let's do that now.

In [None]:
automatic_transitions = p.transrules_query(source="user_t", tclass=["process"])
automatic_transitions

And using the same approach with sets, we can see if any of those transitions are automatic (just for `user_t` for now).

In [None]:
set(automatic_transitions.default) & user_passwd_writers

Hmmm - wasn't one of those an attribute? And the default type of a transition rule can't be an attribute.

Let's try that again, but expand attributes.

In [None]:
expanded_writers = p.expand_attributes(user_passwd_writers)
print("with attributes expanded: " + str(expanded_writers))
        
set(automatic_transitions.default) & expanded_writers

Well - doesn't look like that's an automatic transition (which isn't surprising). But I included this example to remind you to be careful about attributes. The rule searching will check attributes for you by default, but you have to be careful in your own code.
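
To make the attribute pitfall concrete, here is a minimal sketch that does not use SPAN at all. The attribute `demo_attr`, its member types, and the helper `expand` are all hypothetical names invented for this illustration.

```python
# Hypothetical attribute membership, invented for this sketch.
ATTRIBUTE_MEMBERS = {
    "demo_attr": {"demo_a_t", "demo_b_t"},
}

def expand(names):
    """Replace each attribute name with its member types; keep plain types."""
    expanded = set()
    for name in names:
        expanded |= ATTRIBUTE_MEMBERS.get(name, {name})
    return expanded

writers = {"plain_t", "demo_attr"}     # a mix of a type and an attribute
defaults = {"demo_a_t", "other_t"}     # made-up transition default types

# The attribute name itself never equals a type name, so nothing matches.
print(writers & defaults)              # set()
# After expansion the member type is found.
print(expand(writers) & defaults)      # {'demo_a_t'}
```

The first intersection silently returns nothing, which is exactly the kind of false negative to watch for when comparing names in your own code.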

## Wrapping Up - Entrypoints and Userspace Checks

Two last things.

First - let's see the entrypoints for these domains. I'm including this because I _never_ get tired of bringing this up. It's critical to know what code runs in a domain because that's how you know whether you should trust that code with the access granted to the domain.

In [None]:
# This will handle the sandbox_domain attribute for us automatically in that the domain will be matched as well
# as any types with that attribute with rules explicitly referencing them.
p.terules_query(source=user_passwd_writers, tclass=["file"], perms=["entrypoint"])

Well - sandbox_domain is certainly concerning. I'm certain that the magic of containers is all good though.

Let's check the file contexts for the normal ones.

In [None]:
print(ps.file_contexts("passwd_exec_t"))
print(ps.file_contexts("updpwd_exec_t"))

Last thing - there is a userspace permission for changing _other_ users' passwords (which is what really matters here). I know that /bin/passwd checks this, but I'm not certain about things like /sbin/unix_update. But that's what this analysis is for - it tells us what code to go off and audit for trustworthiness.

So let's check the userspace permission for our login types.

In [None]:
p.terules_query(source=login_domains, tclass=["passwd"])

Just what we would expect - `user_t` is not allowed to change other users' passwords.

So at the end of this things are basically what I would have expected with the exception of sandbox_domain (which I'm pretty sure is fine, but I don't understand well enough to know for sure).

# Reference Documentation

Some documentation on what's possible. This isn't exhaustive - mainly because it doesn't cover everything that Setools offers. One important note is that the policy object returned by `se.load_policy` is a subclass of the Setools policy object. All of the public methods from that class are available - you can see them at https://github.com/TresysTechnology/setools/blob/master/setools/policyrep/selinuxpolicy.py.


## Type and Attribute Searching
Find types by name

In [None]:
p.lookup_type("smbd_t")

Find types by regex

In [None]:
p.types_re("smbd")

The return from these functions is an object (even though it is rendered as a string here). You can, for example, show the attributes for a type by calling a method on the returned object.

In [None]:
sorted(p.lookup_type("smbd_t").attributes())

Notice that the output was sorted. That is partly because sorted output is nicer to read, but also because most setools methods return generators instead of lists, which makes their output less convenient in a Jupyter notebook. For example, this is the output from the previous example without sorting.

In [None]:
p.lookup_type("smbd_t").attributes()
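
The same behavior can be seen with any Python generator. This is a minimal sketch in which `attribute_names` is a hypothetical stand-in for a method that yields its results lazily, and the names it yields are just sample strings.

```python
def attribute_names():
    """Hypothetical stand-in for a method that yields results lazily."""
    yield "domain"
    yield "daemon"
    yield "corenet_unlabeled_type"

lazy = attribute_names()
print(lazy)            # shows a generator object, not the values
names = sorted(lazy)   # consumes the generator into a sorted list
print(names)
print(list(lazy))      # [] because the generator is already exhausted
```

Note the last line: a generator can only be consumed once, so store the list if you need the results again.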

Because of this, we provide some convenience functions that simply make the output nicer. For example, find the attributes for a type:

In [None]:
p.attributes_for_type("smbd_t")

Find all of the types in an attribute:

In [None]:
p.types_in_attribute("files_unconfined_type")

Attributes by regex

In [None]:
p.attributes_re("unconfined")

Lookup an attribute by name

In [None]:
p.lookup_typeattr("domain")

Expand attributes in a list (this will be really long). The list can contain both types and attributes - types are passed through to the output list unchanged.

In [None]:
p.expand_attributes([p.lookup_type("smbd_t"), p.lookup_typeattr("domain")])

Lookup types or attributes from a list of strings.

In [None]:
p.lookup_type_or_attrs(["smbd_t", "domain"])

## Roles

Find roles - this is just a convenience wrapper around https://github.com/TresysTechnology/setools/blob/master/setools/rolequery.py.

In [None]:
p.roles_query(name="sysadm_r")

In [None]:
p.types_in_role("sysadm_r")

In [None]:
p.roles_for_type("smbd_t")

## Rule Searching

These two methods are wrappers around an implementation that matches the API for Setools TERuleQuery, so the best documentation is at https://github.com/TresysTechnology/setools/blob/master/setools/terulequery.py.

One major API difference is that the source and target parameters can take a single type/attribute, string, or list.
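
One common way to accept "a single item or a list" is to normalize the argument up front. The helper below, `as_list`, is a hypothetical sketch of that pattern and is not taken from SPAN's source.

```python
def as_list(value):
    """Wrap a single item in a list; turn lists, tuples, and sets into lists."""
    if isinstance(value, (list, tuple, set, frozenset)):
        return list(value)
    return [value]

print(as_list("shadow_t"))                    # ['shadow_t']
print(as_list(["ssh_home_t", "sshd_key_t"]))  # ['ssh_home_t', 'sshd_key_t']
```

The explicit type check matters because a string is itself iterable: calling `list("shadow_t")` would split it into single characters.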

The other difference is speed. This implementation is often 30x faster. It does fully pass the unit tests for the Setools implementation, so it is fast and API compliant. The speedup comes from the use of an index, so the first rule search after a policy is loaded will build the index (which can take a few seconds). Subsequent queries reuse the index.
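
To illustrate why an index helps, here is a minimal sketch of the general idea. It is not SPAN's actual implementation: the rule tuples, `build_target_index`, and `rules_for_target` are all made up for this example.

```python
from collections import defaultdict

# Made-up rules as (source, target, tclass, perms) tuples.
RULES = [
    ("a_t", "x_t", "file", {"read"}),
    ("b_t", "x_t", "file", {"write", "append"}),
    ("a_t", "y_t", "dir", {"search"}),
]

def build_target_index(rules):
    """Group rules by target type so lookups avoid scanning every rule."""
    index = defaultdict(list)
    for rule in rules:
        index[rule[1]].append(rule)
    return index

index = build_target_index(RULES)  # the one-time cost, paid on first use

def rules_for_target(target, perms):
    """Return rules on `target` that grant at least one of `perms`."""
    wanted = set(perms)
    return [rule for rule in index.get(target, []) if rule[3] & wanted]

print(rules_for_target("x_t", ["write"]))
```

Building the index touches every rule once. After that, each query only looks at the rules for the requested target.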

In [None]:
p.terules_query(target="shadow_t", perms=se.file_w_perms)

In [None]:
# Search with a list for the target
p.terules_query(target=["ssh_home_t", "sshd_key_t"], perms=["write", "append"])

In [None]:
p.transrules_query(source="initrc_t", default="smbd_t", tclass=["process"])

## Information Flow

The information flow queries allow you to focus more on the types and object classes without worrying so much about the details of the permissions. You can, instead, think in terms of read, write, or both.

For example, `domain_info_flow` shows all of the object types that a domain can read, write, or both.

In [None]:
# By default this shows writes.
p.domain_info_flow("smbd_t", tclass=["file"])

In [None]:
# Show reads instead
p.domain_info_flow("smbd_t", tclass=["file"], direction="r")

The information flow is weighted by bandwidth on a scale from 1 to 10. A weight of 10 would be something like `read` or `write`, while lower-bandwidth permissions, like `getattr`, would be weighted lower.

Here we set the minimum weight lower and show the additional types that returns.

In [None]:
set(p.domain_info_flow("NetworkManager_t", min_weight=1).Type) - set(p.domain_info_flow("NetworkManager_t").Type)
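
To make the weighting concrete, here is a minimal standalone sketch of filtering by a minimum weight. The weights in `PERM_WEIGHTS` are illustrative only and do not reproduce the real permission map. The inclusive `>=` comparison is also an assumption.

```python
# Illustrative weights only; the real permission map is not reproduced here.
PERM_WEIGHTS = {"read": 10, "write": 10, "getattr": 3, "lock": 1}

def perms_at_or_above(min_weight):
    """Return the permissions whose weight meets the threshold (assumed inclusive)."""
    return {perm for perm, weight in PERM_WEIGHTS.items() if weight >= min_weight}

high_only = perms_at_or_above(10)
everything = perms_at_or_above(1)

# Permissions that only show up once the threshold is lowered.
print(sorted(everything - high_only))  # ['getattr', 'lock']
```

This is the same set-difference pattern as the query above: compute the result at both thresholds and subtract.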

You can also look from the other direction - the perspective of the object.

In [None]:
p.object_info_flow("bin_t")

The concept of read and write works for non-file-like object classes as well.

In [None]:
p.domain_info_flow("smbd_t", tclass=["packet"])

You can see which permissions are included with `info_flow_perms`.

In [None]:
p.info_flow_perms(tclass=["dir"], min_weight=1)

In [None]:
p.info_flow_perms(tclass=["dir"], min_weight=10)

## Summaries

These are a quick way to gather related information about something in the policy.

In [None]:
p.types_summary(p.types_re("smb"))

In [None]:
p.domain_summary("httpd_t")

In [None]:
p.attribute_summary("domain")

In [None]:
p.file_summary("bin_t")

In [None]:
p.packet_summary("dns_client_packet_t")

# Policy Source

Find a type definition

In [None]:
# note the use of print to make this look nice
print(ps.type_def("kernel_t"))

In [None]:
print(ps.attr_def("domain"))

In [None]:
ps.genfscon("selinuxfs")

In [None]:
print(ps.file_contexts("httpd_exec_t"))

Search for rules (this is just grep really)

In [None]:
print(ps.rules_search("allow sshd_t"))
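
For a feel of what a grep-style search does, here is a minimal sketch that assumes it means line-by-line regular expression matching. The policy text and the helper `search_lines` are made up for illustration. How SPAN actually performs the search is not shown here.

```python
import re

# Made-up policy source lines for illustration.
SOURCE = """\
allow demo_t demo_file_t:file { read getattr };
allow other_t demo_file_t:file write;
type_transition demo_t demo_exec_t:process demo_child_t;
"""

def search_lines(text, pattern):
    """Return the lines of `text` that match `pattern`, joined by newlines."""
    regex = re.compile(pattern)
    return "\n".join(line for line in text.splitlines() if regex.search(line))

print(search_lines(SOURCE, r"allow demo_t"))
```

Because this is plain text matching, it knows nothing about attributes or macros, so treat the results as a starting point rather than a complete answer.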

Show an entire module

In [None]:
print(ps.get_module("services/ssh.te"))