
Threat Model

To determine which security properties are necessary, we need to define a threat model. To do this, we will list a variety of example threats and coalesce them into a threat model for our system.

Who are we defending against?

  • censors
    • How do we ensure that the censor can't interfere with our measurements?
    • The only potential harm here is bad measurements
  • surveillance apparatus
    • How do we ensure that the surveillance apparatus can't identify our vantage points?
    • How do we ensure that the surveillance apparatus can't show that the vantage point is measuring censorship? (this is subtly different from the previous point: we relax the problem slightly and let the surveillance apparatus potentially identify the user, but prevent them from being able to prove anything)
    • The potential harm here is harm to users, i.e. getting disappeared.
  • experimenters (wittingly or not)
    • How do we ensure that collaborators don't allow the censor to perturb their measurements? (we answer this with default measurement primitives that are robust to censorship; a sketch follows this list)
    • How do we ensure that collaborators don't allow the surveillance apparatus to identify users? (Maker should prevent most careless problems here, but if the experiment collaborates with the surveillance apparatus, e.g. by sending GET requests for a page owned by the surveillance people, there is very little we can do.)
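
To make the "default measurement primitives" point concrete, here is a minimal sketch of what a censorship-robust primitive might look like (illustrative only, not existing Centinel code): record the raw, timestamped outcome of a TCP connect rather than interpreting it, so an injected RST or a silent drop becomes data instead of a corrupted result.

```python
import errno
import socket
import time


def tcp_connect_probe(host, port=80, timeout=10):
    """Record exactly what happened during a TCP connect, without judging it."""
    result = {"host": host, "port": port, "start": time.time()}
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.settimeout(timeout)
    try:
        s.connect((host, port))
        result["outcome"] = "connected"
    except socket.timeout:
        # Silent drops (common for in-path censors) surface as timeouts.
        result["outcome"] = "timeout"
    except socket.error as e:
        # An injected RST surfaces as ECONNREFUSED during connect (or
        # ECONNRESET on an established flow); keep the errno so the
        # analysis side can tell the cases apart later.
        result["outcome"] = "error"
        result["errno"] = e.errno
        result["errname"] = errno.errorcode.get(e.errno, "unknown")
    finally:
        s.close()
    result["duration"] = time.time() - result["start"]
    return result
```

Because the primitive only records what the network did, a censor who interferes with it produces evidence of interference rather than a silently wrong measurement.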

Example Attacks / Censors

I don't have a great grasp of how governments view censorship measurement, so some of what I have written below is probably wrong. Hopefully the technical elements are correct, but please update both the political and the technical elements as needed.

Also, is there an updated site that has info about the capabilities of different censors? I know that the ONI is over, but are their measurements recent enough to be applicable?

  • China:

    • censorship: high risk. The government is likely to actively try to stop or perturb censorship measurement. The Great Firewall (GFW) is already difficult to measure because a) it uses RSTs for blocking and b) it retains some state from previous connections, so it can be difficult to tell whether a specific request was supposed to be blocked. The GFW also uses DPI, so if they can identify our protocol and are willing to pay the cost of blocking whoever we fate-share with (no one now?), they can block us.
    • surveillance: moderate risk. The government is actively opposed to censorship measurement, but its strategy seems to target high-profile people rather than "needle in the haystack" analysis of GFW logs.
    • experimenters: low risk. The government goes to some lengths to censor, but it doesn't seem to care enough to infiltrate organizations. We are more likely to have a careless collaborator than an infiltrator.
  • Pakistan

    • censorship: low to moderate risk. Censorship seems to vary by ISP.
    • surveillance: low risk. The government seems to have the ability to surveil users, but it generally doesn't seem to care if people measure Internet censorship.
    • experimenters: low to no risk.
  • Rogue Experimenters: here are some potential attacks if the censor/surveillance apparatus tries to run measurements through our platform

    • Destination-based attacks: the censor forces the vantage point/user to connect to a unique URL that they control. This enables them to identify users of the tool/vantage points.
    • Metadata-based attacks: if experiments connect to a very improbable/unique set of sites, the censor could identify users of the tool/vantage points by finding users who have connected to all of them. For example, if we test amnesty.org, bbc.com, and whitehouse.gov on each device and it is very improbable for an ordinary user to visit all of these sites, the censor may be able to fingerprint any user who connects to all of them as a Centinel user. Allowing arbitrary experimenters makes this worse because the censor gets to control part of the list, so they can pick many sites that are improbable for their country.
    • Content-based attacks: if the censor can create their own experiment, they could ship content that uniquely identifies the user. This could be a header like "X-This-is-a-Centinel-user" or something more subtle, like a base64 nonce for the DPI box to look out for. (A vetting sketch that screens for the destination- and content-based variants follows this list.)
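
As a concrete starting point for that vetting, here is a minimal sketch that screens a proposed experiment for the destination- and content-based attacks. The URL set, header whitelist, and function shape are all illustrative, not existing Centinel code; the metadata-based attack requires reasoning about the whole test list across users, which a per-experiment check like this cannot capture.

```python
# URLs that have passed review onto a vetted global test list.
# The entries here are just the examples used above.
VETTED_URLS = {
    "http://amnesty.org/",
    "http://bbc.com/",
    "http://whitehouse.gov/",
}

# Headers a stock HTTP client would send; anything else is suspect.
STANDARD_HEADERS = {"host", "user-agent", "accept", "accept-encoding",
                    "accept-language", "connection"}


def vet_experiment(urls, headers):
    """Return a list of problems with a proposed experiment (empty = passed)."""
    problems = []
    for url in urls:
        # Destination-based attack: an unvetted, unique URL lets the
        # experimenter enumerate vantage points on the server side.
        if url not in VETTED_URLS:
            problems.append("unvetted destination: %s" % url)
    for name in headers:
        # Content-based attack: a custom header such as
        # "X-This-is-a-Centinel-user" tags the client for DPI boxes.
        if name.lower() not in STANDARD_HEADERS:
            problems.append("non-standard header: %s" % name)
    return problems


print(vet_experiment(
    urls=["http://bbc.com/", "http://tracker.example/unique-id"],
    headers={"User-Agent": "Mozilla/5.0", "X-This-is-a-Centinel-user": "1"},
))
```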

Centinel Threat Model

Our threat model will answer the following questions. We may include a concept of varying risk per user, in which case the threat model would change per user, but our system needs to be able to handle the most aggressive threat model (the worst-case assumptions are collected in the sketches after this list).

  • censorship
    • What can the censor see? (i.e. use DPI, store 3 packets of state, etc.)
      • I say that we let the censor see everything at the international gateway, let them use DPI, and let them store a pretty significant amount of memory (say O(n) storage?)
    • How can the censor modify traffic? (i.e. in-path machine that can rewrite packets or on-path machine that has to inject packets)
      • I say we let the censor operate in path. This will give us more interesting problems to deal with and accurately models censors like Iran.
    • Is there certain traffic that the censor can't or won't scan? (fate sharing)
      • For now, let's assume that the censor can't censor all of a large service like Amazon EC2. (I am willing to concede this point and do something p2p for Centinel, but this would be exponentially more work.)
  • surveillance
    • What can the surveillance apparatus see? (i.e. do they have different vantage points than the censor)
      • Let's assume the same vantage points for now.
    • What can the surveillance apparatus store? (i.e. can the surveillance apparatus store full take or just metadata)
      • Let's give the surveillance apparatus the same rules as the released rules for NSA taps -> they don't have the capacity to store full takes forever, so they trash p2p traffic, store metadata (i.e. bro logs) forever, and keep full capture for 3 days
    • What does the surveillance apparatus care about? (both at the machine, i.e. incomplete or error transactions, and at the policy level, i.e. don't care what malware does)
      • They will throw away everything that is high volume and uninteresting -> i.e. p2p traffic; they may also throw away error transactions
  • experimenters
    • Do we want to address the possibility of a rogue experimenter? (i.e. a GFW operator creates a censorship experiment to identify users)
    • Do we want to restrict what experimenters can do? This is an important question not only in terms of the content of the traffic but also its destination. Unlike the other cases, we control what the experimenter is capable of. Perhaps the experimenter attacks make the most sense in the context of the examples above.
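
To pin down the censorship assumptions above, here is a minimal sketch that encodes them as configuration, so measurement primitives can be designed and tested against the most aggressive profile. Field names and values are illustrative, not existing Centinel code.

```python
from collections import namedtuple

CensorModel = namedtuple("CensorModel", [
    "sees_gateway",  # sees everything at the international gateway
    "has_dpi",       # can inspect payloads, not just headers
    "in_path",       # can rewrite packets, not merely inject them
    "state",         # per-flow memory available to the censor
])

# The most aggressive profile argued for above: GFW-style DPI and state,
# Iran-style in-path operation. Per-user risk may relax individual
# fields, but the system itself must handle this profile.
WORST_CASE_CENSOR = CensorModel(
    sees_gateway=True,
    has_dpi=True,
    in_path=True,
    state="O(n)",
)


def effective_model(per_user_model=None):
    """Use a per-user model if one is defined; otherwise assume the worst."""
    return per_user_model or WORST_CASE_CENSOR
```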
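
Similarly, the surveillance retention assumptions above (modeled on the released rules for NSA taps) can be written down directly; the flow representation here is illustrative:

```python
def surveillance_retention(flow, age_days):
    """Return what the surveillance apparatus would still hold for a flow."""
    if flow.get("protocol") == "p2p":
        return "discarded"       # high volume and uninteresting, trashed
    if age_days <= 3:
        return "full_capture"    # complete payload still on disk
    return "metadata_only"       # bro-log-style records, kept forever


# Example: a 5-day-old HTTP flow leaves only metadata behind.
print(surveillance_retention({"protocol": "http"}, age_days=5))
```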