gittuf Design Document

Last Modified: November 27, 2023

Introduction

This document describes gittuf, a security layer for Git repositories. gittuf applies several key properties of The Update Framework (TUF), such as delegations, secure key distribution, key revocation, trust rotation, read / write access control, and namespaces, to Git repositories. This enables owners of the repositories to distribute (and revoke) contributor signing keys and define policies about which contributors can make changes to some namespaces within the repository. gittuf also protects against reference state attacks by extending the Reference State Log design which was originally described in an academic paper. Finally, gittuf can be used as a foundation to build other desirable features such as cryptographic algorithm agility, the ability to store secrets, storing in-toto attestations pertaining to the repository, and more.

This document is scoped to describing how TUF's access control policies are applied to Git repositories. It contains the corresponding workflows for developers and their gittuf clients. Note that gittuf is designed in a manner that enables other security features. These descriptions will be in standalone specifications alongside this one, and will describe modifications or extensions to the "default" workflows in this document.

Definitions

This document uses several terms or phrases in specific ways. These are defined here.

Git Reference (Ref)

A Git reference is a "simple name" that typically points to a particular Git commit. Generally, development in Git repositories is centered on one or more refs, and they're updated as commits are added to the ref under development. By default, Git defines two kinds of refs: branches (heads) and tags. Git allows for the creation of other arbitrary refs in which users can store other information, as long as it is formatted using Git's object types.

Repository
|
|-- refs
|   |
|   |-- heads
|   |   |-- main (refers to commit C)
|   |   |-- feature-x (refers to commit E)
|   |
|   |-- tags
|   |   |-- v1.0 (refers to tag v1.0)
|   |
|   |-- arbitrary
|       |-- custom-ref (formatted as Git object type)
|
|-- objects
    |-- A [Initial commit]
    |-- B [Version 1.0 release]
    |-- C [More changes on main]
    |-- D [Initial commit on feature-x]
    |-- E [More changes on feature-x]
    |-- v1.0 [Tag object referring to commit B]

Actors

In a Git repository, an "actor" is any user who makes changes to the repository. These changes can involve any part of the repository, such as modifying files, branches or tags. In the gittuf system, each actor is identified by a unique signing key that they use to sign their contributions. This means that when a policy allows an actor to make certain changes, it's actually allowing the person who has the specific signing key to make those changes. To maintain security, all actions made in the repository, such as adding or modifying files, are checked for authenticity. This is done by verifying the digital signature attached to the action, which must match the public key associated with the actor who is supposed to have made the change.

State

The term "state" refers to the latest values or conditions of the tracked references (like branches and tags) in a Git repository. These are determined by the most recent entries in the reference state log. Note that when verifying changes in the repository, a workflow may only verify specific references rather than all state updates in the reference state log.

Threat Model

gittuf considers a standard Git setting, i.e., a version control system that manages a repository in a distributed fashion. Individual developers have local copies of the repository, to which they commit their work. There is also a remote repository, hosted on a Git server, which serves as a synchronization point: developers push their contributions to the remote repository, and fetch other developers' contributions from the remote repository. Most organizations that own a repository find it useful to define the developers' read and write permissions on various portions of the repository, based on an access control policy. Unfortunately, Git repositories do not support fine-grained read and write access control policies. This is one place where gittuf adds value.

The threat model departs from this standard Git setting in that gittuf reduces the trust needed in the various system actors. Such a model accounts for the fact that some of these actors may act maliciously.

First, gittuf's design assumes that the Git server which manages the remote Git repository is not trustworthy. The server does respond to push and pull requests from clients (i.e., developers), since a non-responsive server would be easily detected and replaced. However, the server may try to modify data that is stored in the repository or may try to tamper with the client pull and push requests, for example by dropping push requests, or serving stale data to pull requests. Such behavior models a compromised or faulty server.

Second, gittuf's design assumes that developers are not trusted to act within the confines defined by the access control policy. For example, developers may try to write into folders where they are not authorized to write, or may try to merge changes into branches where they do not have permission to do so. They may also try to read parts of the repository which they are not allowed to read. Such behavior models developers who make mistakes, developers whose accounts have been compromised, or even disgruntled employees.

Note: the threat model considers violations of read access control policies as in-scope, but gittuf currently does not protect against such violations. This feature is part of gittuf's roadmap.

gittuf

To begin with, gittuf carves out a namespace for itself within the repository. All gittuf-specific metadata and information are tracked in a separate Git ref, refs/gittuf.

Reference State Log (RSL)

Note: This document presents only a summary of the academic paper and a description of gittuf's implementation of RSL. A full read of the paper is recommended.

The Reference State Log contains a series of entries that each describe some change to a Git ref. Each entry contains the ref being updated, the new location it points to, and a hash of the parent RSL entry. The entry is signed by the actor making the change to the ref.

Given that each entry in effect points to its parent entry using its hash, an RSL is a hash chain. gittuf's implementation of the RSL uses Git's underlying Merkle graph. Generally, gittuf is designed to ensure the RSL is linear but a privileged attacker may be able to cause the RSL to branch, resulting in a forking attack.
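The hash-chain property can be illustrated with a small sketch. This is not gittuf's implementation, which relies on Git commit IDs and Git signatures; the `RSLEntry` class, its field names, and the SHA-256 encoding below are hypothetical, and signatures are omitted.

```python
import hashlib
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RSLEntry:
    ref: str                  # Git ref being updated
    target_id: str            # new location the ref points to
    parent_id: Optional[str]  # hash of the parent RSL entry (None for the first entry)

    def entry_id(self) -> str:
        # Hypothetical encoding; gittuf itself uses Git commit IDs.
        payload = f"{self.ref}\n{self.target_id}\n{self.parent_id or ''}"
        return hashlib.sha256(payload.encode()).hexdigest()


def is_linear_chain(entries: list[RSLEntry]) -> bool:
    """Return True if every entry names the previous entry's ID as its parent."""
    expected_parent = None
    for entry in entries:
        if entry.parent_id != expected_parent:
            return False
        expected_parent = entry.entry_id()
    return True


first = RSLEntry("refs/heads/main", "c0ffee", None)
second = RSLEntry("refs/heads/main", "decade", first.entry_id())
forged = RSLEntry("refs/heads/main", "bad", "not-the-real-parent")

print(is_linear_chain([first, second]))  # chain is intact
print(is_linear_chain([first, forged]))  # parent hash does not match
```

Because each ID covers the parent's ID, changing any earlier entry changes the ID of every entry after it.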

The RSL is tracked at refs/gittuf/reference-state-log, and is implemented as a distinct commit graph. Each commit corresponds to one entry in the RSL, and standard Git signing mechanisms are employed for the actor's signature on the RSL entry. The latest entry is identified using the tip of the RSL Git ref.

Note that the RSL and liveness of the repository in Git remove the need for some traditional TUF roles. As the RSL records changes to other Git refs in the repository, it incorporates TUF's snapshot role properties. At present, gittuf does not include an equivalent to TUF's timestamp role to guarantee the freshness of the RSL. This is because the timestamp role in the context of gittuf at most provides a non-repudiation guarantee for each claim of the RSL's tip. The use of an online timestamp does not guarantee that actors will receive the correct RSL tip. This may evolve in future versions of the gittuf design.

RSL Reference Entries

These entries are the standard variety described above. They contain the name of the reference they apply to and a target ID. As such, they have the following structure.

RSL Entry

ref: <ref name>
targetID: <target ID>

The targetID is typically the ID of a commit for references that are branches. However, for entries that record the state of a Git tag, targetID is the ID of the annotated tag object.
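As a rough illustration, the structure above could be produced and read back as follows. The sketch assumes the entry is stored as text in exactly the layout shown; the function names are hypothetical and not part of gittuf.

```python
def format_reference_entry(ref: str, target_id: str) -> str:
    """Render an entry in the layout shown above."""
    return f"RSL Entry\n\nref: {ref}\ntargetID: {target_id}"


def parse_reference_entry(message: str) -> dict[str, str]:
    """Parse an entry back into its fields, rejecting malformed input."""
    lines = message.strip().splitlines()
    if not lines or lines[0] != "RSL Entry":
        raise ValueError("not an RSL reference entry")
    fields: dict[str, str] = {}
    for line in lines[1:]:
        if not line.strip():
            continue  # skip the blank separator line
        key, sep, value = line.partition(": ")
        if not sep:
            raise ValueError(f"malformed line: {line!r}")
        fields[key] = value
    missing = {"ref", "targetID"} - fields.keys()
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    return fields


entry = format_reference_entry("refs/heads/main", "abc123")
print(parse_reference_entry(entry))
```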

RSL Annotation Entries

Apart from regular entries, the RSL can include annotations that apply to prior RSL entries. Annotations can be used to add more information as a message about a prior entry, or to explicitly mark one or more entries as ones to be skipped. This semantic is necessary when accidental or possibly malicious RSL entries are recorded. Since the RSL history cannot be overwritten, an annotation entry must be used to communicate to gittuf clients to skip the corresponding entries. Annotations have the following schema.

RSL Annotation

entryID: <RSL entry ID 1>
entryID: <RSL entry ID 2>
...
skip: <true/false>
-----BEGIN MESSAGE-----
<message>
------END MESSAGE------
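A minimal parsing sketch for this schema follows, assuming the annotation is stored as text in exactly the layout above. The `Annotation` class and `parse_annotation` function are hypothetical names used only for illustration.

```python
from dataclasses import dataclass

BEGIN = "-----BEGIN MESSAGE-----"
END = "------END MESSAGE------"


@dataclass
class Annotation:
    entry_ids: list[str]  # RSL entries the annotation applies to
    skip: bool            # whether those entries must be skipped
    message: str          # free-form explanation


def parse_annotation(text: str) -> Annotation:
    header, _, rest = text.partition(BEGIN)
    message = rest.partition(END)[0].strip("\n")
    entry_ids: list[str] = []
    skip = False
    for line in header.splitlines():
        key, sep, value = line.partition(": ")
        if not sep:
            continue  # title line or blank line
        if key == "entryID":
            entry_ids.append(value.strip())
        elif key == "skip":
            skip = value.strip() == "true"
    if not entry_ids:
        raise ValueError("annotation must refer to at least one RSL entry")
    return Annotation(entry_ids, skip, message)


SAMPLE = (
    "RSL Annotation\n\n"
    "entryID: aaa\n"
    "entryID: bbb\n"
    "skip: true\n"
    f"{BEGIN}\n"
    "Entries pushed by mistake\n"
    f"{END}\n"
)
annotation = parse_annotation(SAMPLE)
print(annotation.entry_ids, annotation.skip)
```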

Example Entries

TODO: Add example entries with all commit information. Create a couple of regular entries and annotations, paste the outputs of git cat-file -p <ID> here.

Actor Access Control Policies

Note: This section assumes some prior knowledge of the TUF specification.

There are several aspects to defining the access privileges an actor has. First, actors must be established in the repository unambiguously, and gittuf uses TUF's mechanisms to associate actors with their signing keys. TUF metadata distributes the public keys of all the actors in the repository and if a key is compromised, new metadata is issued to revoke its trust.

Second, TUF allows for defining namespaces for the repository. TUF's notion of namespaces aligns with Git's, and TUF namespaces can be used to reason about both Git refs and files tracked within the repository. Namespaces are combined with TUF's delegations to define sets of actors who are authorized to make changes to some namespace. As such, the owner of the repository can use gittuf to define actors representing other contributors to the repository, and delegate to them only the necessary authority to make changes to different namespaces of the repository.

Policies for gittuf access are defined using a subset of TUF roles. The owners of the repository hold the keys used to sign the Root role that delegates trust to the other roles. The top level Targets role and any Targets roles it delegates to contain restrictions on protected namespaces. The specifics of the delegation structure vary from repository to repository as each will have its own constraints.

A typical TUF delegation connects two TUF Targets roles. Therefore, delegations can be represented as a directed graph where each node is a Targets role, and each edge connects the delegating role to a delegatee role for some specified namespace. When verifying or fetching a target, the graph is traversed using the namespaces that match the target until a Targets entry is found for it. The Targets entry contains, among other information, the hashes and length of the target. gittuf applies this namespaced delegations graph traversal to Git and also incorporates RSLs and Git's implicit change tracking mechanisms.

In gittuf, the delegations graph is similarly traversed, except that it explicitly does not expect any Targets metadata to contain a target entry. Instead, the delegation mechanism is used to identify the set of keys authorized to sign the target, such as an RSL entry or commit being verified. Therefore, the delegation graph is used to decide which keys Git actions should trust, but no targets entries are used. Any key to which trust is delegated up to this part of the namespace (including by the last delegation) is trusted to sign the Git actions.

[Figure: Policy delegations]

In this example, the repository administrator grants write permissions to Carol for the main branch, to Alice for the alice-dev branch, and to Bob for the /tests folder (under any of the existing branches).

This mechanism is employed both when verifying RSL entries for Git ref updates and when verifying the commits introduced between two ref updates. The latter allows for defining policies for files and directories tracked by the repository. It also enables repository owners to define closed sets of developers authorized to make changes to the repository. Note that gittuf does not by default use Git commit metadata to identify the actor who created a commit, as that metadata may be trivially spoofed.

Another difference between standard TUF policies and those used by gittuf is a more fundamental difference in expectations of the policies. Typical TUF deployments are explicit about the artifacts they are distributing. Any artifact not listed in TUF metadata is rejected. In gittuf, policies are written only to express restrictions. As such, when verifying changes to unprotected namespaces, gittuf must allow any key to sign for these changes. This means that after all explicit policies (expressed as delegations) are processed, and none apply to the namespace being verified, an implicit allow-rule is applied, allowing verification to succeed.

In summary, a repository secured by gittuf stores the Root role and one or more Targets roles. Further, it embeds the public keys used to verify the Root role's signatures, the veracity of which are established out of band. The metadata and the public keys are stored as Git blobs and updates to them are tracked through a standalone Git commit graph. This is tracked at refs/gittuf/policy. The RSL MUST track the state of this reference so that the policy namespace is protected from reference state attacks. Further, RSL entries are used to identify historical policy states that may apply to older changes.

Attestations

gittuf makes significant use of the signing capability provided by Git for commits and tags. However, it is sometimes necessary to attach more than a single signature to a Git object or repository action. For example, a policy may require more than one developer to sign off and approve a change such as merging something to the main branch. To support these workflows (while also remaining compatible with standard Git clients), gittuf uses the concept of "detached authorizations", implemented using signed in-toto attestations. Attestations are tracked in the custom gittuf namespace: refs/gittuf/attestations.

Reference Authorization

A reference authorization is an attestation that accompanies an RSL reference entry, allowing additional developers to issue signatures authorizing the change to the Git reference in question. Its structure is similar to that of a reference entry:

TargetRef    string
FromTargetID string
ToTargetID   string

The TargetRef is the Git reference the authorization is for, while FromTargetID and ToTargetID record the change in the state of the reference authorized by the attestation (as Git hashes). Unlike a standard RSL reference entry, the attestation explicitly records the prior state of the Git reference. An RSL reference entry does not need to, because that information can be identified implicitly by examining the previous RSL entry for the reference in question. If the authorization is for a brand new reference (say a new branch or tag), FromTargetID must be set to zero.
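The sketch below shows how a reference authorization might be matched against a ref update. The `ReferenceAuthorization` class mirrors the three fields above but is not gittuf's data structure, and it assumes "zero" means the all-zero 40-character SHA-1 object ID. Signature checks on the attestation are left out.

```python
from dataclasses import dataclass
from typing import Optional

ZERO_ID = "0" * 40  # assumption: all-zero SHA-1 ID stands for "no prior state"


@dataclass(frozen=True)
class ReferenceAuthorization:
    target_ref: str
    from_target_id: str
    to_target_id: str


def authorizes(
    auth: ReferenceAuthorization,
    ref: str,
    previous_target_id: Optional[str],  # None when the ref is brand new
    new_target_id: str,
) -> bool:
    """Check that the attestation covers exactly this change to this ref."""
    expected_from = previous_target_id if previous_target_id is not None else ZERO_ID
    return (
        auth.target_ref == ref
        and auth.from_target_id == expected_from
        and auth.to_target_id == new_target_id
    )


new_branch = ReferenceAuthorization("refs/heads/feature", ZERO_ID, "a" * 40)
print(authorizes(new_branch, "refs/heads/feature", None, "a" * 40))
```

Recording both the old and new state ties the authorization to one specific update, so it cannot be reused for a different change to the same ref.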

Reference authorizations are stored in a directory called reference-authorizations in the attestations namespace. Each authorization must have the in-toto predicate type: https://gittuf.dev/reference-authorization/v<VERSION>.

Authentication Evidence Attestations

In certain workflows, it is necessary to authenticate an actor outside of the context of gittuf. For example, later in this document is a description of a recovery mechanism where a gittuf user must create an RSL entry on behalf of another non-gittuf user after authenticating them. gittuf requires evidence of this authentication to be recorded in the repository using an attestation.

Primarily, this attestation is recorded for pushes that are not accompanied by RSL reference entries. As such, this attestation workflow focuses on that scenario. It has the following format:

TargetRef    string
FromTargetID string
ToTargetID   string
PushActor    string
EvidenceType string
Evidence     object

Note that this attestation's schema is a superset of the reference authorization attestation's schema. While that one allows for detached authorizations for a reference update, this one is focused on providing evidence for a push. As such, to identify the push in question, the schema consists of many of the same fields.

The PushActor field identifies the actor who performed the push but did not create an RSL entry. EvidenceType is a string that identifies the type of evidence gathered. It dictates how Evidence must be parsed, as this field is an opaque object that differs from one evidence type to another.

TODO: PushActor has this notion of tracking actors in the policy even if they're not gittuf users. This is somewhat reasonable as this could just be a key ID, which is used just with Git. However, we're fast approaching a separation of actor identifier from their key ID. There's also a TAP for this that we should look at, and think about how OIDC bits can also connect here.

TODO: Add some example evidence types for common scenarios. Push certificate and GitHub API result (subset) ought to do the trick.

Authentication evidence attestations are stored in a directory called authentication-evidence in the attestations namespace. Each attestation must have the in-toto predicate type: https://gittuf.dev/authentication-evidence/v<VERSION>.

Example

Consider project foo's Git repository maintained by Alice and Bob. Alice and Bob are the only actors authorized to update the state of the main branch. This is accomplished by defining a TUF delegation to Alice and Bob's keys for the namespace corresponding to the main branch. All changes to the main branch's state MUST have a corresponding entry in the repository's RSL signed by either Alice or Bob.

Further, foo has another contributor, Clara, who does not have maintainer privileges. This means that Clara is free to make changes to other Git branches but only Alice or Bob may merge Clara's changes from other unprotected branches into the main branch.

Over time, foo grows to incorporate several subprojects with other contributors Dave and Ella. Alice and Bob take the decision to reorganize the repository into a monorepo containing two projects, bar and baz. Clara and Dave work exclusively on bar and Ella works on baz with Bob. In this situation, Alice and Bob retain their privileges to merge changes to the main branch. Further, they set up delegations for each subproject's path within the repository. Clara and Dave are only authorized to work on files within bar/* and Ella is restricted to baz/*. As Bob is a maintainer of foo, he is not restricted to working only on baz/*.
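The example can be approximated with a small rule table. This is a simplified sketch: the rule lists, actor names as plain strings, and the use of `fnmatch`-style patterns are assumptions, and real gittuf policies identify actors by signing keys in TUF metadata. Only the contributors named for each subproject are listed; how maintainers are granted access to subproject paths is not modeled.

```python
from fnmatch import fnmatchcase
from typing import Optional

# Hypothetical rules for project foo: (namespace pattern, authorized actors).
REF_RULES = [("refs/heads/main", {"alice", "bob"})]
FILE_RULES = [("bar/*", {"clara", "dave"}), ("baz/*", {"ella", "bob"})]


def authorized_actors(rules, name: str) -> Optional[set]:
    """Return the actors for the first matching rule, or None if unprotected."""
    for pattern, actors in rules:
        if fnmatchcase(name, pattern):
            return actors
    return None  # implicit allow-rule: no explicit policy applies


def is_allowed(rules, name: str, actor: str) -> bool:
    actors = authorized_actors(rules, name)
    return actors is None or actor in actors


print(is_allowed(REF_RULES, "refs/heads/main", "clara"))       # protected branch
print(is_allowed(REF_RULES, "refs/heads/clara-dev", "clara"))  # unprotected branch
print(is_allowed(FILE_RULES, "bar/src/lib.py", "ella"))        # outside Ella's subproject
```

Returning None for unprotected namespaces keeps "any key may sign" distinct from an empty set of authorized actors.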

Actor Workflows

gittuf does not modify the underlying Git implementation itself. For the most part, developers can continue using their usual Git workflows and add some gittuf specific invocations to update the RSL and sync gittuf namespaces.

Managing gittuf root of trust

The gittuf root of trust is a TUF Root stored in the gittuf policy namespace. The keys used to sign the root role are expected to be securely managed by the owners of the repository. TODO: Discuss detached roots, and root specific protections for the policy namespace.

The root of trust is responsible for managing the root of gittuf policies. Each gittuf policy file is a TUF Targets role. The top level Targets role's keys are managed in the root of trust. All other policy files are delegated to directly or indirectly by the top level Targets role.

$ gittuf trust init
$ gittuf trust add-policy-key
$ gittuf trust remove-policy-key

Note: the commands listed here are examples and not exhaustive. Please refer to gittuf's help documentation for more specific information about gittuf's usage.

Managing gittuf policies

Developers can initialize a policy file if it does not already exist by specifying its name. Further, they must present its signing keys. The policy file will only be initialized if the presented keys are authorized for the policy. That is, gittuf verifies that there exists a path in the delegations graph from the top level Targets role to the newly named policy, and that the delegation path contains the keys presented for the new policy. If this check succeeds, the new policy is created with the default allow-rule.

After a policy is initialized and stored in the gittuf policy namespace, new protection rules can be added to the file. In each instance, the policy file is re-signed, and therefore, authorized keys for that policy must be presented.

$ gittuf policy init
$ gittuf policy add-rule
$ gittuf policy remove-rule

Note: the commands listed here are examples and not exhaustive. Please refer to gittuf's help documentation for more specific information about gittuf's usage.

Recording updates in the RSL

The RSL records changes to the policy namespace automatically. To record changes to other Git references, the developer must invoke the gittuf client and specify the reference. gittuf then examines the reference and creates a new RSL entry.

Similarly, gittuf can also be invoked to create new RSL annotations. In this case, the developer must specify the RSL entries the annotation applies to using the target entries' Git identifiers.

$ gittuf rsl record
$ gittuf rsl annotate

Note: the commands listed here are examples and not exhaustive. Please refer to gittuf's help documentation for more specific information about gittuf's usage.

Syncing gittuf namespaces with the main repository

gittuf clients use the origin Git remote to identify the main repository. As the RSL must be linear with no branches, gittuf employs a variation of the Secure_Fetch and Secure_Push workflows described in the RSL academic paper.

[Figure: Using gittuf with legacy servers]

Note that gittuf can be used even if the main repository is not gittuf-enabled. The repository can host the gittuf namespaces which other gittuf clients can pull from for verification. In this example, a gittuf client with a changeset to commit to the dev branch (step 1) creates in its local repository a new commit object and the associated RSL entry (step 2). These changes are next pushed to a remote Git repository (step 3), from where other gittuf or legacy Git clients pull the changes (step 4).

RSLFetch: Receiving remote RSL changes

Before local RSL changes can be made or pushed, it is necessary to verify that they are compatible with the remote RSL state. If the remote RSL has entries that are unavailable locally, entries made locally will be rejected by the remote. For example, let the local RSL tip be entry A and the new entry be entry C. If the remote has entry B after A with B being the tip, attempting to push C which also comes right after A will fail. Instead, the local RSL must first fetch entry B and then create entry C. This is because entries in the RSL must be made serially. As each entry includes the ID of the previous entry, a local entry that does not incorporate the latest RSL entries on the remote is invalid. The workflow is as follows:

  1. Fetch remote RSL to the local remote tracker refs/remotes/origin/gittuf/reference-state-log.
  2. If the last entry in the remote RSL is the same as the local RSL, terminate successfully.
  3. Perform the verification workflow for the new entries in the remote RSL, incorporating remote changes to the local policy namespace. The verification workflow is performed for each Git reference in the new entries, relative to the local state of each reference. If verification fails, abort and warn user. Note that the verification workflow must fetch each Git reference to its corresponding remote tracker, refs/remotes/origin/<ref>. TODO: discuss if verification is skipped for entries that work with namespaces not present locally.
  4. For each modified Git reference, update the local state. As all the refs have been successfully verified, each ref's remote state can be applied to the local repository, so refs/heads/<ref> matches refs/remotes/origin/<ref>.
  5. Set local RSL to the remote RSL's tip.
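The decision made in steps 2 and 3 can be sketched by modeling each RSL as an ordered list of entry IDs. The function name `plan_rsl_fetch` is hypothetical, and the sketch assumes the local RSL holds no unpushed entries. Verification and ref updates are not shown.

```python
def plan_rsl_fetch(local: list[str], remote: list[str]) -> list[str]:
    """Return the remote entries that still need verification, oldest first.

    An empty list means the local RSL is already up to date.
    """
    # Step 2: same tip on both sides, nothing to do.
    if local[-1:] == remote[-1:]:
        return []
    # The RSL must be linear, so the remote log has to extend the local one.
    if remote[: len(local)] != local:
        raise ValueError("remote RSL does not extend the local RSL (possible fork)")
    # Step 3 operates on exactly these new entries.
    return remote[len(local):]


print(plan_rsl_fetch(["A"], ["A"]))
print(plan_rsl_fetch(["A"], ["A", "B"]))
```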

RSLPush: Submitting local RSL changes

  1. Execute RSLFetch repeatedly until there are no new RSL entries in the remote RSL. Every time there is a remote update, the user must be prompted to fetch and re-apply their changes to the RSL. This process could be automated but user intervention may be needed to resolve conflicts in the refs they modified. Changes to the gittuf policy must be fetched and applied locally.
  2. Verify the validity of the RSL entries being submitted using locally available gittuf policies to ensure the user is authorized for the changes. If verification fails, abort and warn user.
  3. For each new local RSL entry:
    1. Push the RSL entry to the remote. At this point, the remote is in an invalid state as changes to modified Git references have not been pushed. However, by submitting the RSL entry first, other gittuf clients that may be pushing to the repository must wait until this push is complete.
    2. If the entry is a normal entry, push the changes to the remote.
    3. TODO: discuss if RSL entries must be submitted one by one. If yes, RSLFetch probably needs to happen after each push. On the other hand, if all RSL entries are submitted first, other clients can recognize a push is in progress while other Git references are updated.
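The ordering in step 3 can be sketched with an in-memory stand-in for the remote. The `Remote` class and `rsl_push` function are hypothetical; they model only the rule that an RSL entry is accepted when its parent is the current remote tip, and that the ref itself is pushed after the entry.

```python
from typing import Optional


class Remote:
    """In-memory stand-in for the remote repository (illustration only)."""

    def __init__(self) -> None:
        self.rsl: list[str] = []        # RSL entry IDs, oldest first
        self.refs: dict[str, str] = {}  # ref name -> target ID

    def push_rsl_entry(self, entry_id: str, parent_id: Optional[str]) -> bool:
        tip = self.rsl[-1] if self.rsl else None
        if parent_id != tip:
            return False  # entry was not created on top of the remote tip
        self.rsl.append(entry_id)
        return True


def rsl_push(remote: Remote, entry_id: str, parent_id: Optional[str],
             ref: str, target_id: str) -> bool:
    # Step 3.1: submit the RSL entry first.
    if not remote.push_rsl_entry(entry_id, parent_id):
        return False  # caller must run RSLFetch and re-create the entry
    # Step 3.2: then push the ref the entry describes.
    remote.refs[ref] = target_id
    return True


remote = Remote()
print(rsl_push(remote, "A", None, "refs/heads/main", "c1"))  # accepted
print(rsl_push(remote, "C", None, "refs/heads/main", "c3"))  # rejected, parent is stale
print(rsl_push(remote, "B", "A", "refs/heads/main", "c2"))   # accepted
```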

Invoking RSLFetch and RSLPush

While RSLFetch and RSLPush are invoked directly by the user to sync changes with the remote, gittuf executes RSLFetch implicitly when a new RSL entry is recorded. As RSL entries are typically recorded right before changes are submitted to the remote, this ensures that new entries are created using the latest remote RSL.

Verification Workflow

There are several aspects to verification. First, the right policy state must be identified by walking back RSL entries to find the last change to that namespace. Next, authorized keys must be identified to verify that commit or RSL entry signatures are valid.

Identifying Authorized Signers for Protected Namespaces

When verifying a commit or RSL entry, the first step is identifying the set of keys authorized to sign a commit or RSL entry in their respective namespaces. With commits, the relevant namespaces pertain to the files they modify, tracked by the repository. On the other hand, RSL entries pertain to Git refs. Assume the relevant policy state entry is P and the namespace being checked is N. Then:

  1. Validate P's Root metadata using the TUF workflow, ignoring expiration date checks.
  2. Begin traversing the delegations graph rooted at the top level Targets metadata. Set current to top level Targets and parent to Root metadata.
  3. Create empty set K to record keys authorized to sign for N.
  4. While K is empty:
    1. Load and verify signatures of current using keys provided in parent. Abort if signature verification fails.
    2. Identify delegation entry that matches N, D.
    3. If D is the allow-rule:
      1. Explicitly indicate any key is authorized to sign changes as N is not protected. Returning empty K alone is not sufficient.
    4. Else:
      1. If repository contains metadata with the role name in D:
        1. Set parent to current, current to delegatee role.
        2. Continue to next iteration.
      2. Else:
        1. Set K to keys authorized in the delegations entry.
  5. Return K.
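The traversal above can be sketched as follows. The `Delegation` and `Targets` classes, the `ANY_KEY` sentinel, and the `fnmatch`-style patterns are assumptions made for illustration. Signature verification (steps 1 and 4.1) is omitted, and the sketch assumes every role ends with an allow-rule and that the delegation graph has no cycles.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase

ANY_KEY = "ANY_KEY"  # sentinel: namespace is unprotected, any key may sign


@dataclass
class Delegation:
    pattern: str              # namespace pattern the delegation covers
    role: str                 # delegatee role name
    keys: frozenset           # key IDs authorized by this delegation
    allow_rule: bool = False  # True for the implicit allow-rule


@dataclass
class Targets:
    delegations: list


def authorized_keys(policy: dict, namespace: str):
    """Walk the delegations for `namespace`, starting at the top level Targets role."""
    current = policy["targets"]
    while True:
        # Step 4.2: first delegation entry matching the namespace.
        match = next(
            (d for d in current.delegations if fnmatchcase(namespace, d.pattern)),
            None,
        )
        if match is None:
            raise LookupError(f"no delegation matches {namespace!r}")
        if match.allow_rule:
            return ANY_KEY  # step 4.3: an empty set alone would be ambiguous
        if match.role in policy:
            current = policy[match.role]  # step 4.4.1: descend into delegatee role
            continue
        return match.keys  # step 4.4.2: keys listed in the delegation entry


ALLOW = Delegation("*", "allow-rule", frozenset(), allow_rule=True)
policy = {
    "targets": Targets([
        Delegation("refs/heads/main", "protect-main", frozenset({"alice", "bob"})),
        Delegation("bar/*", "bar", frozenset({"clara"})),
        ALLOW,
    ]),
    "bar": Targets([
        Delegation("bar/docs/*", "bar-docs", frozenset({"dave"})),
        ALLOW,
    ]),
}
print(authorized_keys(policy, "refs/heads/main"))
print(authorized_keys(policy, "README.md"))
```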

Verifying Changes Made

In gittuf, verifying the validity of changes is relative. Verification of a new state depends on comparing it against some prior, verified state. For some ref X whose last verified entry in the RSL is S and whose latest available state entry is D:

  1. Fetch all changes made to X between the commit recorded in S and that recorded in D, including the latest commit, into a temporary branch.
  2. Walk back from S until a state entry P is found that updated the gittuf policy namespace. This identifies the policy that was active for changes made immediately after S.
  3. Validate P's metadata using the TUF workflow, ignoring expiration date checks.
  4. Walk back from D until S and create an ordered list of all state updates that targeted either X or the gittuf policy namespace. During this process, all state updates that affect X and the policy namespace must be recorded. Entries pertaining to other refs MAY be ignored. Additionally, all annotation entries must be recorded using a dictionary where the key is the ID of the entry referred to and the value the annotation itself. Each entry referred to in an annotation, therefore, must have a corresponding entry in the dictionary.
  5. The verification workflow has an ordered list of states [I1, I2, ..., In, D] that are to be verified.
  6. For each set of consecutive states starting with (S, I1) to (In, D):
    1. Check if an annotation exists for the second state. If it does, verify if the annotation indicates the state is to be skipped. If true, proceed to the next set of consecutive states.
    2. If second state changes gittuf policy:
      1. Validate new policy metadata using the TUF workflow and P's contents to establish authorized signers for new policy. Ignore expiration date checks. If verification passes, update P to new policy state.
    3. Verify the second state entry was signed by an authorized key as defined in P. If the gittuf policy requires more than one signature, search for a reference authorization attestation for the same change. Verify the signatures on the attestation are issued by authorized keys to meet the threshold, ignoring any signatures from the same key as the one used to sign the entry.
    4. Enumerate all commits between that recorded in the first state and the second state with the signing key used for each commit. Verify each commit's signature using public key recorded in P.
    5. Identify the net or combined set of files modified between the commits in the first and second states as F.
    6. If all commits are signed by the same key, individual commits need not be validated. Instead, F can be used directly. For each path:
      1. Find the set of keys authorized to make changes to the path in P.
      2. Verify key used is in authorized set. If not, terminate verification workflow with an error.
    7. If not, iterate over each commit. For each commit:
      1. Identify the file paths modified by the commit. For each path:
        1. Find the set of keys authorized to make changes to the path in P.
        2. Verify key used is in authorized set. If not, check if path is present in F, as an unauthorized change may have been corrected subsequently. This merely acts as a hint as path may have been also changed subsequently by an authorized user, meaning it is in F. If path is not in F, continue with verification. Else, request user input, indicating potential policy violation.
    8. Set trusted state for X to second state of current iteration.
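Steps 4 and 6.1 can be sketched as below. The sketch covers only how the ordered list of states and the annotation dictionary are built. Signature and path checks are not shown. The dictionary-based entry representation and the `collect_states` name are assumptions made for illustration.

```python
POLICY_REF = "refs/gittuf/policy"


def collect_states(rsl: list, ref: str):
    """Collect the states to verify for `ref` and the entry IDs marked as skipped.

    `rsl` holds the entries after S up to and including D, oldest first.
    Reference entries look like {"id": ..., "ref": ...}; annotation entries
    look like {"id": ..., "entry_ids": [...], "skip": bool}.
    """
    annotations = {}  # referred entry ID -> annotation
    states = []       # reference entries for `ref` or the policy namespace
    for entry in rsl:
        if "entry_ids" in entry:
            for entry_id in entry["entry_ids"]:
                annotations[entry_id] = entry
        elif entry["ref"] in (ref, POLICY_REF):
            states.append(entry)
        # Entries for other refs are ignored.
    skipped = {s["id"] for s in states if annotations.get(s["id"], {}).get("skip")}
    return states, skipped


rsl = [
    {"id": "e1", "ref": "refs/heads/main"},
    {"id": "e2", "ref": "refs/heads/feature"},
    {"id": "e3", "ref": POLICY_REF},
    {"id": "e4", "ref": "refs/heads/main"},
    {"id": "a1", "entry_ids": ["e4"], "skip": True},
    {"id": "e5", "ref": "refs/heads/main"},
]
states, skipped = collect_states(rsl, "refs/heads/main")
print([s["id"] for s in states], skipped)
```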

Recovery

If every user were using gittuf and were performing each operation by generating all of the correct metadata, following the specification, etc., then the procedure for handling each situation is fairly straightforward. However, an important property of gittuf is to ensure that a malicious or erroneous party cannot make changes that impact the state of the repository in a negative manner. To address this, this section discusses how to handle situations where something has not gone according to protocol. The goal is to recover to a "known good" situation which does match the metadata which a set of valid gittuf clients would generate.

Recovery Mechanisms

gittuf uses two basic mechanisms for recovery. We describe these core building blocks for recovery before we discuss the exact scenarios when they are applied and why they provide the desired security properties.

M1: Removing information to reset to known good state

This mechanism is utilized in scenarios where some change is rejected. For example, one or more commits may have been pushed to a branch that do not meet gittuf policy. The repository is updated such that these commits are neutralized and all Git refs match their latest RSL entries. This can take two forms:

  1. The rejected commit is removed and the state of the repo is set to the prior commit which is known to be good. This is used when all rejected commits are together at the end of the commit graph, making it easy to remove all of them.

  2. The rejected commit is reverted: a new commit is introduced that reverses all the changes made in the rejected commit. This is needed when "good" commits that must be retained are interspersed with "bad" commits that must be rejected.

In both cases, new RSL entries and annotations must be used to record the incident and skip the invalid RSL entries corresponding to the rejected changes.

gittuf, by default, prefers the second option, with an explicit revert commit that is tree-same as the last good commit. This ensures that a client can always fast-forward to a fix rather than rewind. If the affected branch were instead reset to a prior good commit, Git clients that have already pulled in the invalid commit would not reset as well. Instead, they would assume they are ahead of the remote in question and would continue to use the bad commit as the latest commit.
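The difference between the two options can be seen with a small ancestry check. The sketch below models the commit graph as a plain dictionary from commit ID to parent IDs. The function names and this representation are assumptions for illustration and are not part of gittuf or Git.

```python
def is_ancestor(parents, ancestor, descendant):
    """Return True if `ancestor` is reachable from `descendant` via parent links."""
    stack, seen = [descendant], set()
    while stack:
        commit = stack.pop()
        if commit == ancestor:
            return True
        if commit in seen:
            continue
        seen.add(commit)
        stack.extend(parents.get(commit, []))
    return False


def client_view(parents, local_tip, remote_tip):
    """Classify how a client at `local_tip` sees a remote branch at `remote_tip`."""
    if local_tip == remote_tip:
        return "up-to-date"
    if is_ancestor(parents, local_tip, remote_tip):
        return "fast-forward"   # the client can move forward to the remote tip
    if is_ancestor(parents, remote_tip, local_tip):
        return "ahead"          # the client believes it has newer commits
    return "diverged"


# History: good <- bad <- revert (the revert is tree-same as "good").
parents = {"good": [], "bad": ["good"], "revert": ["bad"]}
print(client_view(parents, "bad", "good"))     # remote was reset to "good"
print(client_view(parents, "bad", "revert"))   # remote added a revert commit
```

A client that already pulled `bad` sees itself as "ahead" of a remote that was reset, but can "fast-forward" to a remote that carries a revert commit.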

When the gittuf verification workflow encounters an RSL entry for some Git reference that does not meet policy, it looks to see if a subsequent entry for the same reference contains a fix that aligns with the last known good state. Any intermediate entries between the original invalid entry and the fix for the reference in question are also considered to be invalid. Therefore, in addition to the fix RSL entry, gittuf also expects skip annotations for the original invalid entry and intermediate entries for the reference.
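The lookup described above can be sketched as follows. The `Entry` type, the use of a tree ID to decide whether an entry "aligns with the last known good state", and the function names are assumptions for illustration. gittuf's actual data model may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    entry_id: str
    ref: str
    tree_id: str   # tree of the commit the entry points to (assumed field)


def find_fix(entries, invalid_index, good_tree_id):
    """Return (fix_entry_id or None, entry IDs that need skip annotations)."""
    invalid = entries[invalid_index]
    to_skip = [invalid.entry_id]
    for entry in entries[invalid_index + 1:]:
        if entry.ref != invalid.ref:
            continue                      # entries for other refs are unaffected
        if entry.tree_id == good_tree_id:
            return entry.entry_id, to_skip
        to_skip.append(entry.entry_id)    # intermediate entry, also invalid
    return None, to_skip


def is_recovered(entries, invalid_index, good_tree_id, skipped_ids):
    """A fix must exist and every invalid entry must carry a skip annotation."""
    fix, to_skip = find_fix(entries, invalid_index, good_tree_id)
    return fix is not None and set(to_skip) <= set(skipped_ids)
```

For example, if the entry at index 1 is invalid and a later entry for the same ref restores the good tree, `find_fix` returns that later entry together with the invalid entry and any intermediate entries for the ref.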

M2: Create RSL entry on behalf of another user

This mechanism is necessary for adoptions where a subset of developers do not use gittuf. When they submit changes to the main copy of the repository, they do not include RSL entries. Therefore, when a change is pushed to a branch by a non-gittuf user A, a gittuf user B can submit an RSL entry on their behalf. Additionally, the entry must identify the original user and include some evidence about why B thinks the change came from A.

The evidence that the change came from A may be of several types, depending on the context. If user B completely controls the infrastructure hosting that copy of the repository, the evidence could be A's communication with B that submitted the change. For example, if A pushes to B's repository using an SSH key associated with A, B has reasonable guarantees the change was indeed pushed by A. Here, B may be another developer managing a "local" copy of the repository or an online bot used by a self-hosted Git server, where the bot can reason about the communication from A. In cases where this degree of control is unavailable, for example when using a third-party forge such as GitHub, B has no means to reason directly about A's communication with the remote repository. In such cases, B may rely on other data to determine the push was from A, such as the GitHub API for repository activity, which logs all pushes after authenticating the user performing the push.

Note that if A is a Git user who still signs their commits, a commit signature signed with A's key is not sufficient to say A performed the push. Creating a commit is distinct from pushing it to a remote repository, and can be performed by different users. When creating an RSL entry on behalf of another user in gittuf, the push event (which is captured in the RSL) is more important than the commit event.
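As a rough illustration of what an entry created under M2 needs to carry, the sketch below models such an entry and checks that it records push evidence rather than only a commit signature. The field names and the evidence kinds are hypothetical. This document does not define a format for them.

```python
from dataclasses import dataclass

# Hypothetical evidence kinds that speak to the push event itself.
PUSH_EVIDENCE_KINDS = {"ssh-session", "forge-activity-log"}


@dataclass(frozen=True)
class ProxyEntry:
    ref: str
    commit_id: str
    pushed_by: str       # original user A
    recorded_by: str     # gittuf user B creating the entry
    evidence_kind: str
    evidence: str        # free-form description or pointer to the evidence


def check_proxy_entry(entry):
    """Return a list of problems; an empty list means the entry looks complete."""
    problems = []
    if entry.evidence_kind not in PUSH_EVIDENCE_KINDS:
        # A commit signature shows who created the commit, not who pushed it.
        problems.append("evidence does not cover the push event")
    if not entry.evidence.strip():
        problems.append("evidence is empty")
    return problems
```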

Recovery Scenarios

These scenarios are some examples where recovery is necessary.

A change is made without an RSL entry

Bob does not use gittuf and pushes to a branch. Alice notices this as her gittuf client detects a push to the branch without an accompanying RSL entry. She validates that the change came from Bob and creates an RSL entry on his behalf, identifying him and including information about how she verified it was him. Alice applies M2.

An incorrect RSL entry is added

There are several ways in which an RSL entry can be considered "incorrect". If an entry is malformed (structurally), Git may catch it if it's not a valid commit. In such instances, the push from a buggy client is rejected altogether, meaning other users are not exposed to the malformed commit.

Invalid entries that are not rejected by Git must be caught by gittuf. Some examples of such invalid entries are:

  • RSL entry is for a non-existing Git reference
  • Commit recorded in RSL entry does not exist
  • Commit recorded in RSL entry does not match the tip of the corresponding Git reference
  • RSL annotation contains references to RSL entries that do not exist or are not RSL entries (i.e. the annotation points to other commits in the repository)
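The checks listed above could be expressed along these lines. This is a sketch under stated assumptions: the repository is modeled with plain dictionaries and sets, the entry is assumed to be the latest one for its reference, and the function names are illustrative rather than gittuf APIs.

```python
def check_entry(entry, ref_tips, known_commits):
    """Check the latest RSL entry for a ref; return an error string or None.

    entry:         dict with "ref" and "commit_id" keys
    ref_tips:      mapping of ref name -> commit ID at the tip of that ref
    known_commits: set of commit IDs present in the repository
    """
    if entry["ref"] not in ref_tips:
        return "entry is for a non-existing Git reference"
    if entry["commit_id"] not in known_commits:
        return "commit recorded in entry does not exist"
    if ref_tips[entry["ref"]] != entry["commit_id"]:
        return "commit recorded in entry does not match the tip of the reference"
    return None


def check_annotation(target_ids, rsl_entry_ids):
    """Return annotation targets that are not RSL entries (missing or other commits)."""
    return [target for target in target_ids if target not in rsl_entry_ids]
```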

Note that as invalid RSL entries are only created by buggy or malicious gittuf clients, these entries cannot be detected prior to them being pushed to the main repository.

As correctly implemented gittuf clients verify the validity of RSL entries when they pull from the main repository, the user is warned if invalid entries are encountered. The user can then use M1 to invalidate the incorrect entry. Other clients with the invalid entry only need to fetch the latest RSL entries to recover. Additionally, the client that created the invalid entries must switch to a correct implementation of gittuf before further interactions with the main repository.

If the main repository is also gittuf enabled, such incidents can be caught before other users receive the incorrect RSL entries. The repository, though, must not behave like a typical gittuf client. Instead, gittuf's repository behavior is slightly different as RSL entries are submitted before the changes they represent. The repository must wait to receive the full changes rather than immediately rejecting the RSL entry. TODO: the repository-specific behavior needs further discussion.

Consider this example as a representative of this scenario. Bob has a buggy gittuf client and pushes an invalid entry to the main repository. Alice pulls and receives a warning from her gittuf client. Alice reverses Bob's changes, creating an RSL entry for the affected branch if needed, and includes an RSL annotation skipping Bob's RSL entry.

A gittuf access control policy is violated

Bob creates an RSL entry for a branch he's not authorized for by gittuf policy. He pushes a change to that branch. Alice notices this (TODO: decide if Alice needs to be authorized). Alice reverses Bob's change, creating a new RSL entry reflecting that. Alice also creates an RSL annotation marking Bob's entry as one to be skipped. Alice, therefore, uses M1.

Attacker modifies or deletes historical RSL entry

Overwriting or deleting an historical RSL entry is a complicated proposition. Git's content addressable properties mean that a SHA-1 collision is necessary to overwrite an existing RSL entry in the Git object store. Further, the attacker also needs more than push access to the repository as Git will not accept an object it already has in its store. Similarly, deleting an entry from the object store preserves the RSL structure cosmetically but verification workflows that require the entry will fail. This ensures that such an attack is detected, at which point the owners of the repository can restore the RSL state from their local copies.

Also note that while Git uses SHA-1 for its object store, cryptographic signatures are generated and verified using stronger hash algorithms. Therefore, a successful SHA-1 collision for an RSL entry will not go undetected as all entries are signed.
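To make the content-addressing argument concrete, the sketch below computes a Git object ID the way Git does for SHA-1 repositories: the hash covers a header (`<type> <size>` followed by a NUL byte) and then the object's bytes. The entry contents used in the example are placeholders, not the real RSL entry format.

```python
import hashlib


def git_object_id(obj_type, body):
    """Compute the SHA-1 object ID Git assigns to an object of the given type."""
    header = f"{obj_type} {len(body)}".encode() + b"\x00"
    return hashlib.sha1(header + body).hexdigest()


# Placeholder bytes standing in for an RSL entry (a Git commit object).
original = b"example RSL entry contents\n"
tampered = b"Example RSL entry contents\n"

# Any modification yields a different ID, so the tampered object cannot
# replace the original under the same name without a SHA-1 collision.
print(git_object_id("commit", original) != git_object_id("commit", tampered))
```

Because each commit also records the ID of its parent, changing a historical entry changes the IDs of every entry that follows it.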

Dealing with fork* attacks

An attacker may attempt a forking attack where different developers receive different RSL states. This is the case where the attacker wants to rewrite the RSL's history by modifying an historical entry (which also requires a key compromise so the attacker can re-sign the modified entry) or deleting it altogether. To carry out this attack, the attacker must maintain and serve at least two versions of the RSL. This is because at least one developer must have the affected RSL entries--the author of the modified or deleted entries. Maintaining and sending the expected RSL entry for each user is not trivial, especially if multiple users have a version of the RSL without the attack. Also, the attacker may be able to serve multiple versions of the RSL from a central repository they control but any direct interactions between users that have the original RSL and the attacked RSL will expose the attack. These characteristics indicate that while a fork* attack is not impossible, it is highly unlikely to be carried out given its overhead and high chances of detection.
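The observation that direct interaction between users exposes the attack can be sketched as a comparison of two RSL histories. In an append-only log, one honest copy is always a prefix of the other, and any other outcome indicates a fork. The list-of-entry-IDs representation and the function name are assumptions for illustration.

```python
def first_divergence(rsl_a, rsl_b):
    """Return the index where two RSL histories disagree, or None if consistent.

    Each history is a list of entry IDs ordered from oldest to newest.
    Histories are consistent when one is a prefix of the other.
    """
    for index, (entry_a, entry_b) in enumerate(zip(rsl_a, rsl_b)):
        if entry_a != entry_b:
            return index
    return None


alice = ["e1", "e2", "e3", "e4"]
bob_behind = ["e1", "e2"]            # Bob simply has not fetched yet
bob_forked = ["e1", "x2", "x3"]      # Bob was served a rewritten history

print(first_divergence(alice, bob_behind))   # None
print(first_divergence(alice, bob_forked))   # 1
```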

Finally, while gittuf primarily uses TUF's root of trust and delegations, it is possible that TUF's timestamp role can be leveraged to further mitigate fork* attacks. A future version of the gittuf design may explore the use of the timestamp role in this context.

An authorized key is compromised

When a key authorized by gittuf policies is compromised, it must be revoked and rotated so that an attacker cannot use it to sign repository objects. gittuf policies that grant permissions to the key must be updated to revoke the key, possibly adding the actor's new key in the process. Further, if a security analysis shows that the key was used to make malicious changes, those changes must be reverted and the corresponding RSL entries signed with the compromised key must be skipped. This ensures that gittuf clients do not consider attacker-created RSL entries as valid states for the corresponding Git references. Clients that have an older RSL from before the attack can skip past the malicious entries altogether.
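The two parts of this response, updating the policy and identifying entries to skip, can be sketched as follows. The policy is modeled as a mapping from path patterns to sets of key IDs, and RSL entries as `(entry_id, key_id)` pairs. Both representations and the function names are assumptions for illustration, not gittuf's metadata format.

```python
def revoke_key(policy, compromised, replacement=None):
    """Return a new policy without the compromised key.

    Where the compromised key was authorized, the actor's replacement
    key (if any) is added in its place.
    """
    updated = {}
    for pattern, key_ids in policy.items():
        keys = set(key_ids)
        if compromised in keys:
            keys.discard(compromised)
            if replacement is not None:
                keys.add(replacement)
        updated[pattern] = keys
    return updated


def entries_to_skip(entries, compromised, malicious_ids):
    """Entries signed with the compromised key that analysis marked malicious."""
    return [
        entry_id
        for entry_id, key_id in entries
        if key_id == compromised and entry_id in malicious_ids
    ]
```

Entries signed with the key before the compromise are left alone. Only those that the security analysis identifies as malicious receive skip annotations.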