v1.0

twbs · Nov 16, 2014 · a736d2d
1 parent 5364c64
commit a736d2d
Showing 43 changed files with 1,374 additions and 2 deletions.
diff --git a/.gitattributes b/.gitattributes
@@ -0,0 +1,8 @@
+# Enforce Unix newlines
+*.conf  text eol=lf
+*.sbt   text eol=lf
+*.scala text eol=lf
+*.sh    text eol=lf
+*.md    text eol=lf
+*.txt   text eol=lf
+*.yml   text eol=lf
diff --git a/.gitignore b/.gitignore
@@ -0,0 +1,21 @@
+*.class
+*.log
+
+# sbt specific
+.cache/
+.history/
+.lib/
+dist/*
+target/
+lib_managed/
+src_managed/
+project/boot/
+project/plugins/project/
+
+# Scala-IDE specific
+.scala_dependencies
+.worksheet
+.idea
+
+# Safeguard login creds
+src/main/resources/application.conf
diff --git a/.travis.yml b/.travis.yml
@@ -0,0 +1,3 @@
+language: scala
+scala:
+  - 2.10.4
diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
@@ -0,0 +1,23 @@
+Hacking on Savage
+=================
+## How do I build Savage?
+1. [Install sbt](http://www.scala-sbt.org/download.html)
+2. Go to your `savage` directory.
+3. Run `sbt compile`
+
+## How do I run the Savage service locally for test purposes?
+**This method is not recommended for use in production deployments!**
+
+0. Ensure that sbt is installed (see above).
+1. Go to your `savage` directory.
+2. Run `sbt`
+3. At the sbt prompt, enter `re-start 9090` (replace `9090` with whatever port you want the HTTP server to run on) or `re-start` (which will use the default port specified in `application.conf`). Note that running on ports below 1024 requires root privileges (not recommended) or using port mapping.
+
+## How do I generate a single self-sufficient JAR that includes all of the necessary dependencies?
+0. Ensure that sbt is installed (see above).
+1. Go to your `savage` directory.
+2. Run `sbt assembly`
+3. If the build is successful, the desired JAR will be generated as `target/scala-2.10/savage-assembly-1.0.jar`.
+
+## Licensing
+Savage is licensed under The MIT License. By contributing to Savage, you agree to license your contribution under [The MIT License](https://github.com/cvrebert/savage/blob/master/LICENSE.txt).
diff --git a/Dockerfile b/Dockerfile
@@ -0,0 +1,27 @@
+# Written against Docker v1.2.0
+FROM dockerfile/java
+MAINTAINER Chris Rebert <code@rebertia.com>
+
+WORKDIR /
+
+RUN ["apt-get", "install", "git"]
+RUN ["apt-get", "install", "openssh-client"]
+RUN ["useradd", "savage"]
+
+ADD target/scala-2.10/savage-assembly-1.0.jar /app/server.jar
+ADD git-repo /app/git-repo
+
+ADD ssh/id_rsa.pub /home/savage/.ssh/id_rsa.pub
+ADD ssh/id_rsa /home/savage/.ssh/id_rsa
+
+RUN ssh-keyscan -t rsa github.com > /home/savage/.ssh/known_hosts
+
+RUN ["chown", "-R", "savage:savage", "/home/savage/.ssh"]
+RUN ["chown", "-R", "savage:savage", "/app/git-repo"]
+# chmod must happen AFTER chown, due to https://github.com/docker/docker/issues/6047
+RUN ["chmod", "-R", "go-rwx", "/home/savage/.ssh"]
+
+WORKDIR /app/git-repo
+USER savage
+CMD ["java", "-jar", "/app/server.jar", "6060"]
+EXPOSE 6060
diff --git a/LICENSE → LICENSE.txt b/LICENSE → LICENSE.txt
diff --git a/README.md b/README.md
@@ -1,4 +1,124 @@
-savage
+Savage
 ======
+[![Build Status](https://travis-ci.org/cvrebert/savage.svg?branch=master)](https://travis-ci.org/cvrebert/savage)
 
-Service that runs Sauce Labs cross-browser JS tests on Bootstrap pull requests
+Savage is a service that watches for new or updated pull requests on a given GitHub repository. For each pull request, it evaluates whether the changes are "safe" (i.e. we can run a Travis CI build with them with heightened permissions without worrying about security issues) and "interesting" (i.e. would benefit from a Travis CI build with them with heightened permissions), based on which files were modified. If the pull request is "safe" and "interesting", then it initiates a Travis CI build with heightened permissions on a specified GitHub repository. When the Travis CI build completes, it posts a comment with the test results on the pull request. If the test failed, the pull requester can then revise their code to fix the problem.
+
+Savage's original use-case is for running Sauce Labs cross-browser JS tests on pull requests via Travis CI, while keeping the Sauce Labs access credentials secure.
+
+Affectionately named after an experimenter known for "busting" misconceptions, often with explosives.
+
+## Motivation
+(Savage is general enough to be used in other situations, but the following is the specific one it was built for.)
+
+You're a member of a popular open source project that involves front-end Web technologies. Cool.
+
+Specifically, the project involves JavaScript. Because it's a serious project, you have automated cross-browser testing for your JavaScript. You happen to use [Open Sauce](https://saucelabs.com/opensauce) for this.
+
+Unfortunately, [due to certain limitations](http://support.saucelabs.com/entries/25614798-How-can-we-set-up-an-open-source-account-that-runs-tests-on-people-s-pull-requests-), it's not possible to do cross-browser testing on pull requests "the obvious way" via Travis CI without potentially compromising your Sauce login credentials. This means that either (a) cross-browser problems aren't discovered in pull requests until after they've already been merged, or (b) repo collaborators must manually initiate the cross-browser tests on pull requests (and manage the resulting branches, and possibly post comments communicating the test results).
+
+By automating the process of initiating Travis-based Sauce tests and posting the results, cross-browser JavaScript issues can be discovered more quickly and with less work on the part of repo collaborators.
+
+## How it works (for the Open Sauce use-case)
+1. Use GitHub webhooks to listen for new or updated pull requests in a given GitHub repository.
+2. If the pull request does not modify any JavaScript files, ignore it.
+3. Ensure that no sensitive build files (e.g. `.travis.yml`, `Gruntfile.js`) have been modified.
+4. Clone the pull request's branch and push it to a test repo under an autogenerated name.
+5. Travis CI will automatically run a build on the new branch *under the test repo's user*. Thus, this build will have access to Travis secure environment variables; in particular, it will have access to the Sauce Labs credentials.
+6. Use webhooks to track the status of the Travis build.
+7. When the build finishes, post a comment to the GitHub pull request explaining the test results, and delete the corresponding branch.
+
+## Used by
+* ~~[Bootstrap](https://github.com/twbs/bootstrap); see [$GITHUB_BOT_ACCOUNT]~~ (FUTURE)
+* ~~[Video.js](https://github.com/videojs/video.js); see [$GITHUB_BOT_ACCOUNT]~~ (FUTURE, MAYBE)
+
+## Usage
+Using Savage involves two git repos (which can both be the same repo, although that's much less secure):
+* The *main repo*
+  * This repo is the one receiving pull requests
+  * Savage needs its GitHub web hook set up for this repo
+  * Savage does NOT need to be a Collaborator on this repo
+* The *test repo*
+  * The repo that Savage will push test branches to
+  * Travis CI should be set up for this repo
+  * Savage needs to be a Collaborator on this repo, so that it can push branches to it and also delete branches from it
+
+Java 7+ is required to run Savage. For instructions on building Savage yourself, see [the Contributing docs](https://github.com/cvrebert/savage/blob/master/CONTRIBUTING.md).
+
+Savage accepts exactly one optional command-line argument, which is the port number to run its HTTP server on, e.g. `8080`. If you don't provide this argument, the default port specified in `application.conf` will be used. Once you've built the JAR, run e.g. `java -jar savage-assembly-1.0.jar 8080` (replace `8080` with whatever port number you want). Note that running on ports below 1024 requires root privileges (not recommended) or using port mapping.
+
+When running Savage, its working directory should be a non-bare git repo which is a clone of the repo being monitored.
+
+Savage's GitHub webhook must be set up on the main repo that will be receivi
+
+Other settings live in `application.conf`. In addition to the normal Akka and Spray settings, Savage offers the following settings:
+```
+savage {
+    // Port to run on, if not specified via the command line
+    default-port = 6060
+    // Full name of GitHub repo to watch for new pull requests
+    github-repo-to-watch = "twbs/bootstrap"
+    // Full name of GitHub repo to push test branches to
+    github-test-repo = "twbs/bootstrap-tests"
+    // List of Unix file globs constituting the whitelist of safely editable files
+    whitelist = [
+        "**.md",
+        "/bower.json",
+        "/composer.json",
+        "/fonts/**.{eot,ttf,svg,woff}",
+        "/less/**.less",
+        "/sass/**.{sass,scss}",
+        "/js/**.{js,html,css}",
+        "/dist/**.{css,js,map,eot,ttf,svg,woff}",
+        "/docs/**.{html,css,js,map,png,ico,xml,eot,ttf,svg,woff,swf}"
+    ]
+    // List of Unix file globs constituting the watchlist of files
+    //   which trigger a Savage build.
+    // To prevent unnecessary builds, a Savage build isn't triggered
+    // unless the pull request affects a file that matches one of the watchlist globs.
+    file-watchlist = [
+        "/js/**/*.js"
+    ]
+    // Prefix to use for branches that Savage pushes to the test repository.
+    // The branch name is generated by prefixing the pull request number with this prefix.
+    branch-prefix = "savage-"
+    // GitHub login credentials for the Savage bot to use
+    username = throwaway9475947
+    password = XXXXXXXX
+    // This goes in the "Secret" field when setting up the Webhook
+    // in the "Webhooks & Services" part of your repo's Settings.
+    // This string will be converted to UTF-8 for the HMAC-SHA1 computation.
+    // The HMAC is used to verify that Savage is really being contacted by GitHub,
+    // and not by some random hacker.
+    github-web-hook-secret-key = abcdefg
+    // Used as a shared secret in a hashing scheme that's used to verify
+    // that Savage is really being contacted by Travis CI,
+    // and not by some random hacker. For how to find your Travis token,
+    // see http://docs.travis-ci.com/user/notifications/#Authorization-for-Webhooks
+    travis-token = abcdefg
+}
+```
+
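The `whitelist` and `file-watchlist` settings above drive the "safe" and "interesting" decisions described at the top of the README. A minimal sketch of that decision, assuming Java's built-in `glob:` path matcher on a Unix-like file system; the object and method names are hypothetical, and Savage's actual glob semantics may differ from what `java.nio.file` implements:

```scala
import java.nio.file.{FileSystems, Paths}

// Hypothetical helper, not taken from the Savage sources.
object PullRequestFilter {
  private val fs = FileSystems.getDefault

  // Compile each glob string into a predicate on a path string.
  private def matchers(globs: Seq[String]): Seq[String => Boolean] =
    globs.map { glob =>
      val matcher = fs.getPathMatcher("glob:" + glob)
      (path: String) => matcher.matches(Paths.get(path))
    }

  // "Safe": every modified file matches at least one whitelist glob.
  def isSafe(whitelist: Seq[String], modified: Seq[String]): Boolean = {
    val ms = matchers(whitelist)
    modified.forall(path => ms.exists(m => m(path)))
  }

  // "Interesting": at least one modified file matches a watchlist glob.
  def isInteresting(watchlist: Seq[String], modified: Seq[String]): Boolean = {
    val ms = matchers(watchlist)
    modified.exists(path => ms.exists(m => m(path)))
  }
}
```

Under these assumptions a build would be triggered only when both `isSafe` and `isInteresting` return `true` for the list of files the pull request modifies. Paths are written with a leading `/` here to mirror the rooted globs in the config.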
+### GitHub webhook configuration
+
+* Payload URL: `http://your-domain.example/savage/github`
+* Content type: `application/json`
+* Secret: Same as your `github-web-hook-secret-key` config value
+* Which events would you like to trigger this webhook?: "Pull Request"
+
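The secret configured above is what lets the receiving side check that a delivery really came from GitHub. GitHub sends the HMAC-SHA1 of the request body in the `X-Hub-Signature` header, formatted as `sha1=<hex digest>`. A rough sketch of that check using only the JDK; the object and method names are hypothetical, and how Savage itself reads the header and body is not shown in this commit excerpt:

```scala
import java.security.MessageDigest
import javax.crypto.Mac
import javax.crypto.spec.SecretKeySpec

// Hypothetical helper, not taken from the Savage sources.
object GitHubSignature {
  private val Algorithm = "HmacSHA1"

  // Hex-encoded HMAC-SHA1 of the payload, keyed with the UTF-8 bytes of the secret.
  def hmacSha1Hex(secret: String, payload: Array[Byte]): String = {
    val mac = Mac.getInstance(Algorithm)
    mac.init(new SecretKeySpec(secret.getBytes("UTF-8"), Algorithm))
    mac.doFinal(payload).map(b => "%02x".format(b & 0xff)).mkString
  }

  // headerValue is the raw X-Hub-Signature header, e.g. "sha1=<40 hex digits>".
  def isValid(secret: String, payload: Array[Byte], headerValue: String): Boolean = {
    val expected = "sha1=" + hmacSha1Hex(secret, payload)
    // MessageDigest.isEqual is used instead of == to compare the two byte arrays.
    MessageDigest.isEqual(expected.getBytes("UTF-8"), headerValue.getBytes("UTF-8"))
  }
}
```

The HMAC must be computed over the raw request bytes, before any JSON parsing, because re-serialized JSON is not guaranteed to be byte-identical to what GitHub signed.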
+### Travis webhook configuration
+In `.travis.yml`:
+```
+notifications:
+  webhooks:
+    - http://your-domain.example/savage/travis
+```
+
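Notifications sent to the URL above carry an `Authorization` header that the receiver can check against the `travis-token` setting. The sketch below assumes the scheme from the Travis CI webhook documentation of the time: the header is the hex-encoded SHA-256 digest of the repository slug concatenated with the Travis token. The object and method names are hypothetical, and the exact concatenation should be confirmed against the Travis documentation linked in the config comments:

```scala
import java.security.MessageDigest

// Hypothetical helper, not taken from the Savage sources.
object TravisAuthorization {
  // Hex-encoded SHA-256 digest of the UTF-8 bytes of s.
  def sha256Hex(s: String): String =
    MessageDigest.getInstance("SHA-256")
      .digest(s.getBytes("UTF-8"))
      .map(b => "%02x".format(b & 0xff))
      .mkString

  // repoSlug is "owner/name" of the repo Travis built (assumed: the test repo).
  def isValid(repoSlug: String, travisToken: String, authorizationHeader: String): Boolean = {
    val expected = sha256Hex(repoSlug + travisToken)
    MessageDigest.isEqual(expected.getBytes("UTF-8"), authorizationHeader.getBytes("UTF-8"))
  }
}
```

For example, with the sample config values the expected header would be `TravisAuthorization.sha256Hex("twbs/bootstrap-tests" + "abcdefg")`.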
+## Acknowledgments
+We all stand on the shoulders of giants and get by with a little help from our friends. Savage is written in [Scala](http://www.scala-lang.org) and built on top of:
+* [Akka](http://akka.io) & [Spray](http://spray.io), for async processing & HTTP
+* [Eclipse EGit GitHub library](https://github.com/eclipse/egit-github), for working with [the GitHub API](https://developer.github.com/v3/)
+
+## See also
+* [LMVTFY](https://github.com/cvrebert/lmvtfy), Savage's sister bot who does HTML validation
+* [Rorschach](https://github.com/twbs/rorschach), Savage's sister bot who sanity-checks Bootstrap pull requests
diff --git a/SECURITY.md b/SECURITY.md
@@ -0,0 +1,105 @@
+## DISCLAIMER
+The author is not a security expert and this project has not been subjected to a third-party security audit.
+
+## Responsible disclosure; Security contact info
+
+The security of Savage is important to us. We encourage you to report security problems to us responsibly.
+
+Please report all security bugs to `savage {AT} rebertia [DOT] com`. We aim to respond (with at least an acknowledgment) within one business day. We will keep you updated on the bug's status as we work towards resolving it.
+
+We will disclose a problem to the public once it has been confirmed and a fix has been made available. At that point, you will be credited for your discovery in the documentation, in the release announcements, and (if applicable) in the code itself.
+
+As Savage currently lacks corporate backing, we are unfortunately unable to offer bounty payments at this time.
+
+We thank you again for helping ensure the security of Savage by responsibly reporting security problems.
+
+## System model
+
+### System operation
+(Note: PR = pull request)
+
+```
+[GitHub]  >>>(Webhook notification of new/updated PR)>>>  [Savage]
+* Savage verifies that the notification was really from GitHub (and not an impostor)
+    by verifying the HMAC-SHA1 computed using the web hook secret key previously configured with GitHub.
+
+[GitHub]  <<<(Request details about the PR using the PR's HEAD commit's SHA)<<<  [Savage]
+[GitHub]  >>>(Response with details about the PR)>>>  [Savage]
+* Savage checks list of files modified by the PR against the whitelist
+  * If any files are outside of the whitelist, stop further processing.
+
+[GitHub]  <<<(Request for Git data for the PR's HEAD commit via its SHA)<<<  [Savage]
+[GitHub]  >>>(Response with Git data for the PR's HEAD commit)>>>  [Savage]
+* Savage generates a new branch name using the PR number and a specified prefix
+
+[GitHub]  >>>(Fetch refs from PR's GitHub repo)>>> [Savage]
+[GitHub]  <<<(Pushes new branch to test repository using the PR's HEAD commit, referenced via its SHA)<<<  [Savage]
+[GitHub]  >>>(Notifies Travis of the test repository's newly-pushed branch)>>>  [Travis CI]
+* Travis CI runs the build with the privileges of the test repository
+  * Notably, it has access to Travis CI secure environment variables
+
+[Travis CI] >>>(Outcome of build)>>> [Savage]
+* Savage verifies that the notification was really from Travis CI (and not an impostor)
+    by verifying the signature in the `Authorization` header using the secret Travis user token.
+
+[GitHub]  <<<(Post comment on PR regarding build outcome)<<<  [Savage]
+[GitHub]  <<<(Delete branch from test repository)<<<  [Savage]
+```
+
+Remarks:
+At no point do we use the PR's branch name directly. We also delete all fetched branches after the push is completed. This avoids maliciously crafted branch names which could be misinterpreted by other systems and also ensures that the attacker cannot change the contents of the branch out from under us, thus avoiding [TOCTTOU](http://en.wikipedia.org/wiki/Time_of_check_to_time_of_use) vulnerabilities.
+
+## Threat model
+
+### Assumptions
+(These are admittedly generous.)
+* We trust the machine that Savage is running on
+* We trust GitHub
+* We trust Travis CI
+* We trust that the EGit-GitHub library communicates with GitHub securely
+* We assume that the git command binaries are secure so long as they are only invoked with secure arguments
+* We assume that our build scripts are secure (this is outside the scope and control of Savage itself)
+* We assume that the filename whitelist is correct
+
+### Architecture-based threat analysis
+Out of scope per our assumptions:
+* Compromise of GitHub
+* Compromise of Travis CI API
+* Compromise of the machine on which Savage resides
+* Compromise of our outbound communications with GitHub
+* Allowing modification of a sensitive file due to incorrect whitelist settings
+
+Within scope:
+* Impersonating GitHub and delivering a malicious webhook notification
+  * Prevented by our checking of the HMAC-SHA1 signature of the webhook payload
+* Impersonating Travis and delivering a malicious webhook notification
+  * Prevented by our checking of the SHA-256 signature of the webhook payload
+* Shell-related vulnerabilities
+  * Avoided by not using the shell when invoking git; we use Java's `ProcessBuilder`/`Process` instead
+* Compromising the git fetch/push command via malicious input
+  * Avoided by checking that the relevant git-related data isn't fishy
+* Compromising the git branch deletion command via malicious input
+  * The command involves only a Savage-generated branch name, whose computation is simple and which is checked for validity. We believe this thus avoids the vulnerability.
+* Compromising the contents of the posted GitHub comment via malicious input
+  * Avoided by checking that the relevant data from Travis isn't fishy
+
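Several of the in-scope mitigations above come down to two habits: validate every value before it reaches a git command line, and pass arguments as a list to `ProcessBuilder` so no shell ever parses them. A minimal sketch under those assumptions; the object, the method names, and the exact validity rules are hypothetical and are not Savage's actual checks:

```scala
import scala.collection.JavaConverters._

// Hypothetical helper, not taken from the Savage sources.
object GitCommands {
  // A full SHA-1 commit ID: exactly 40 lowercase hex digits.
  def isValidSha(s: String): Boolean = s.matches("[0-9a-f]{40}")

  // Branch name built only from the configured prefix and the PR number,
  // never from the pull request's own branch name.
  def branchName(prefix: String, prNumber: Int): String = {
    require(prefix.matches("[A-Za-z0-9][A-Za-z0-9_-]*"), "unexpected characters in prefix")
    require(prNumber > 0, "PR number must be positive")
    prefix + prNumber
  }

  // Argument list for pushing a specific commit to a new branch on the remote.
  // The remote is assumed to come from trusted configuration.
  def pushCommand(remote: String, sha: String, branch: String): Seq[String] = {
    require(isValidSha(sha), "not a full SHA-1 hex digest")
    Seq("git", "push", remote, s"$sha:refs/heads/$branch")
  }

  // Argument list for deleting the test branch once the build has finished.
  def deleteCommand(remote: String, branch: String): Seq[String] =
    Seq("git", "push", remote, "--delete", branch)

  // Runs the command directly (no shell involved) and returns its exit code.
  def run(command: Seq[String]): Int =
    new ProcessBuilder(command.asJava).inheritIO().start().waitFor()
}
```

Because each argument is passed as its own list element, characters such as `;`, `$`, or backticks in an input are never interpreted by a shell. The validation still matters, since git itself would treat an argument starting with `-` as an option.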
+### Asset-centric threat analysis
+Assets:
+* Savage's GitHub credentials
+  * We don't believe this information is leaked by Savage itself.
+  * We don't believe the git commands can be induced to access the relevant configuration file that has the credentials.
+  * Travis deserializes the API responses as vanilla JSON; it doesn't `eval()` them; spray-json doesn't have any deserialization features that allow the execution of arbitrary code (contrast this with YAML and some of its implementations).
+* Write access to the test GitHub repo
+  * We believe that the various checks that Savage performs on the inputs, and the fact that it is only capable of performing a couple of git operations, prevent malicious access to the test repo.
+* Commenting ability on the main GitHub repo
+  * Savage only uses the commit SHA and the Travis build URL in its comment text, and both of these are checked for validity/safety.
+* Credentials stored in Travis secure environment variables
+  * Under our somewhat generous assumptions, this should be impossible.
+
+## Notes on securing build scripts
+* Beware malicious Git input (branch names, commit messages, author info, etc.)
+* Beware malicious Travis input (e.g. environment variables)
+* Beware potentially-executable data files (e.g. `eval()`ing of JSON, YAML custom type deserialization hooks)
+* Beware the addition of files with maliciously-chosen names
+* Ensure that build scripts are absent from the whitelist
+* Ensure package management control files are absent from the whitelist, to prevent the installation of malicious packages
diff --git a/assembly.sbt b/assembly.sbt
@@ -0,0 +1,4 @@
+import AssemblyKeys._
+
+assemblySettings
+
diff --git a/build.sbt b/build.sbt
@@ -0,0 +1,33 @@
+name := "savage"
+
+version := "1.0"
+
+scalaVersion := "2.10.4"
+
+mainClass := Some("com.getbootstrap.savage.server.Boot")
+
+resolvers ++= Seq("snapshots", "releases").map(Resolver.sonatypeRepo)
+
+libraryDependencies += "org.eclipse.mylyn.github" % "org.eclipse.egit.github.core" % "2.1.5"
+
+libraryDependencies ++= {
+  val akkaV = "2.3.6"
+  val sprayV = "1.3.2"
+  Seq(
+    "io.spray"            %%  "spray-can"     % sprayV,
+    "io.spray"            %%  "spray-routing" % sprayV,
+    "io.spray"            %%  "spray-testkit" % sprayV   % "test",
+    "io.spray"            %%  "spray-json"    % "1.3.1",
+    "com.typesafe.akka"   %%  "akka-actor"    % akkaV,
+    "com.typesafe.akka"   %%  "akka-testkit"  % akkaV    % "test",
+    "org.specs2"          %%  "specs2"        % "2.3.12" % "test"
+  )
+}
+
+scalacOptions := Seq("-unchecked", "-deprecation", "-feature", "-Xlint", "-encoding", "utf8")
+
+scalacOptions in Test ++= Seq("-Yrangepos")
+
+// parallelExecution in Test := false
+
+Revolver.settings
diff --git a/project/plugins.sbt b/project/plugins.sbt
@@ -0,0 +1,3 @@
+addSbtPlugin("io.spray" % "sbt-revolver" % "0.7.2")
+
+addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.11.2")
diff --git a/setup_droplet.sh b/setup_droplet.sh
@@ -0,0 +1,25 @@
+#!/bin/bash
+
+# Step 0: You need to have copied the assembly JAR to savage/target/scala-2.10/savage-assembly-1.0.jar
+# Step 0.1: You need to have the git repo checked out in ./git-repo
+# Step 0.2: The user's SSH public-private keys must be at ./ssh/id_rsa and ./ssh/id_rsa.pub
+
+# set to Pacific Time (for @cvrebert)
+# ln -sf /usr/share/zoneinfo/America/Los_Angeles /etc/localtime
+
+# remove useless crap
+aptitude remove wpasupplicant wireless-tools
+aptitude remove pppconfig pppoeconf ppp
+
+# setup firewall
+ufw default allow outgoing
+ufw default deny incoming
+ufw allow ssh
+ufw allow www
+ufw enable
+ufw status verbose
+
+# setup Docker; written against Docker v1.2.0
+docker build . 2>&1 | tee docker.build.log
+IMAGE_ID="$(tail -n 1 docker.build.log | cut -d ' ' -f 3)"
+docker run -d -p 80:6060 --name savage $IMAGE_ID