Scalable Consistency Adjustable Data Storage

correctly set s3location when bundling new AMI

Conflicts:
	deploylib/src/main/scala/ec2/ec2.scala
latest commit d405f1fb4d
@marmbrus marmbrus authored
avro get rid of debugging output
axer/src/main/scala Commit of axer working tree
bin add back java rebel to sbt.
comm/src Merge branch 'piql' into v2.1.3
config mview test: de-bulkify put for mdcc test
deploylib/src/main correctly set s3location when bundling new AMI
ec2 update location of nexus war
experiments include opt for OptimizedQuery.
matheon/src/main/scala Update to use new RecordParser semantics
optional/src/main/scala merge in optional package for idomatic commandline parsing.
perf/src/main/scala mviews: better summarize fn
piql
project use AWS_DEFAULT_REGION env var for implicit cluster again
scalaengine/src make test cluster mem only again.
scripts bring back setup script.
.ensime Add matheon project with simple zero finding agg and file loader
.gitignore add results to git ignore.
README.md add link to perf source

README.md

SCADS (Scalable Consistency Adjustable Data Storage) is a research prototype distributed storage system used in the RAD Lab and the AMP Lab at UC Berkeley. The goals of the system were first described in our vision paper from CIDR 2009.

SCADS Sub-projects

SCADS is composed of the following sub-projects:

SCADS Core

Other Experiments

Deprecated Projects

The following sub-projects are no longer actively maintained:

  • demo - The RAD Lab final demo used SCADS along with other projects from the RAD Lab to scale web applications written by novice developers over a weekend to hundreds of servers on Amazon EC2.
  • director - The director ensures SLO compliance for storage operations by using machine learning models to dynamically re-provision a SCADS storage cluster based on current and projected workload. More details can be found in the paper from FAST2011.

Third Party Components

  • optional - The optional command line parsing library from paulp with added support for default arguments.

Building

SCADS is built using SBT. The SBT launcher is included in the distribution (bin/sbt) and is responsible for downloading all other required jars (scala library/compiler and dependencies).

SBT commands can be invoked from the command line. For example, to clean and build jar files for the entire SCADS project, you would run the following command:

scads/$ sbt clean package

You can also execute commands on specific sub-projects by specifying <subproject>/<command>. For example:

scads/$ sbt piql/compile
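
To run the same command on several sub-projects, the <subproject>/<command> form can be generated in a loop. The following is a minimal sketch rather than anything shipped with SCADS: the function name sbt_each and the DRY_RUN variable are hypothetical, the sub-project names are the ones used as examples in this README, and it assumes that sbt is on the PATH and accepts several commands in one invocation (as in the clean package example above).

```shell
#!/bin/sh
# Hypothetical helper (not part of SCADS): run one SBT command on several
# sub-projects using the <subproject>/<command> form.
# Usage: sbt_each <command> <subproject>...
# With DRY_RUN=1 the sbt invocation is printed instead of executed.
sbt_each() {
    if [ "$#" -lt 2 ]; then
        echo "usage: sbt_each <command> <subproject>..." >&2
        return 1
    fi
    cmd="$1"
    shift
    args=""
    for sub in "$@"; do
        # Append "<subproject>/<command>" for each requested sub-project.
        args="$args $sub/$cmd"
    done
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "sbt$args"
    else
        # Left unquoted on purpose: each <subproject>/<command> pair must be
        # passed to sbt as its own argument.
        sbt $args
    fi
}

# Example: print the invocation that would compile piql and modeling.
DRY_RUN=1
sbt_each compile piql modeling
```

Unsetting DRY_RUN makes the function call sbt directly; the dry-run mode is only there so the generated command line can be checked first.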

Additionally, if you are going to be running several commands, you can use SBT from an interactive console, which amortizes the cost of starting the JVM and JIT-compiling SBT and the Scala compiler. For example:

scads/$ sbt
[info] Loading project definition from /Users/marmbrus/Workspace/radlab/scads/project
[info] Set current project to scads (in build file:/Users/marmbrus/Workspace/radlab/scads/)
scads:sbt09> project modeling
[info] Set current project to modeling (in build file:/Users/marmbrus/Workspace/radlab/scads/)
modeling:sbt09> compile
[success] Total time: 7 s, completed Sep 11, 2011 5:23:28 PM

Useful Command Reference

  • clean - delete generated files
  • compile - build the current project and all its dependencies
  • doc - compile scaladoc
  • gen-idea - build a project for use in IntelliJ
  • ghpages:push-api-doc - update the api documentation hosted on github pages
  • package - create the jar or war file for the current project
  • project <subproject name> - switch to the specified sub-project; subsequent commands will be run only on this project and its dependencies
  • projects - list all of the available sub-projects
  • publish - publish jars to the radlab nexus repository
  • publish-local - publish jars to your local ivy cache
  • reload - recompiles the project definition and restarts SBT
  • run - run the mainclass for the current sub-project; if there are multiple choices you will be prompted for which one you want to run
  • test - run the testcases for the current sub-project and all its dependencies
  • test-only [test case] - run the specified testcase or only the testcases whose source has changed since tests were last run
  • update - download all managed dependency jars; note that, in contrast to Maven, this must be run explicitly. It only needs to be run once unless dependencies have been added to the project.
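
Because update is never run implicitly, a fresh checkout needs it before the first compile. One way to avoid forgetting it is a small wrapper that prepends update only when a marker file is absent. This is a sketch under stated assumptions, not a SCADS script: the function name sbt_build, the DRY_RUN variable and the marker file name .sbt-updated are all hypothetical, and the marker has to be deleted by hand whenever dependencies are added to the project.

```shell
#!/bin/sh
# Hypothetical wrapper (not part of SCADS): run "update" before the given
# SBT commands only if it has not been recorded as done yet.
# STAMP is an assumed marker file name; remove the file after changing
# the project's dependencies so that "update" runs again.
STAMP="${STAMP:-.sbt-updated}"

# Usage: sbt_build <command>...
# With DRY_RUN=1 the sbt invocation is printed instead of executed.
sbt_build() {
    if [ ! -f "$STAMP" ]; then
        # No marker yet: put "update" in front of the requested commands.
        set -- update "$@"
    fi
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "sbt $*"
    else
        # Record the marker only if the whole sbt run succeeded.
        sbt "$@" && touch "$STAMP"
    fi
}

# Example: show what would be run for a compile-and-test build.
DRY_RUN=1
sbt_build compile test
```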

Deploy Console

Running deploy-console in SBT brings up a scala console containing the deploylib environment for the current project. The deploylib environment contains all jar files necessary for running the current project on a remote machine. For example, to run the experiment comparing two plans for the PIQL intersection query on EC2:

scads/$ sbt modeling/deploy-console
[info] Loading project definition from /Users/marmbrus/Workspace/radlab/scads/project
[info] Set current project to scads (in build file:/Users/marmbrus/Workspace/radlab/scads/)
import deploylib._
import deploylib.ec2._
allJars: Seq[java.io.File] = List(/Users/marmbrus/Workspace/radlab/scads/piql/scadr/target/scala-2.9.1/scadr_2.9.1-2.1.2-SNAPSHOT.jar...
Welcome to Scala version 2.9.1.final (Java HotSpot(TM) 64-Bit Server VM, Java 1.6.0_24).
Type in expressions to have them evaluated.
Type :help for more information.

scala> import edu.berkeley.cs.scads.piql.modeling.Experiments._
scala> import edu.berkeley.cs.scads.piql.modeling.PlanCompare._
scala> cluster.setup(numSlaves = 3)
scala> run