This is a large collection of Scala code assembled for testing static analysis tools such as Alacs.
There are currently 258266 lines of Scala in the corpus.
Line counts are generated using cloc.
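
As a rough cross-check of the figure above, the following is a minimal sketch that counts non-blank lines in `.scala` files under a directory. It is not how cloc works: cloc also excludes comment lines, so this sketch will generally report a higher number. The function name `count_scala_lines` and the choice to read files as UTF-8 are assumptions made for illustration.

```python
from pathlib import Path


def count_scala_lines(root: str) -> int:
    """Return the number of non-blank lines in all .scala files under root.

    Unlike cloc, comment lines are NOT excluded, so treat the result as an
    upper-bound approximation of cloc's code-line count.
    """
    total = 0
    for path in Path(root).rglob("*.scala"):
        if not path.is_file():
            continue
        # errors="replace" keeps the count going if a file is not valid UTF-8.
        with path.open(encoding="utf-8", errors="replace") as handle:
            total += sum(1 for line in handle if line.strip())
    return total


if __name__ == "__main__":
    import sys

    # Hypothetical usage: pass the corpus checkout directory as the argument.
    root = sys.argv[1] if len(sys.argv) > 1 else "."
    print(count_scala_lines(root))
```

Running the script from the top of a checkout with no argument counts everything below the current directory.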
This collection is built using sbt. Compilation takes exactly one step: `sbt test-compile`
For some reason, sbt prints Java unchecked warnings in red, as if they were errors, even though they are only warnings. Don't be alarmed; the corpus will still compile.
NOTE: Some systems (e.g., Linux with home directory encryption) may not be able to build scala-project due to a known limitation of scalac concerning file name lengths.
- scalatest (50k)
- blueeyes (21k)
- specs (20k)
- akka (14k)
- factorie (13k)
- scalaz (10k)
- kiama (10k)
- squeryl (9k)
- scalariform (8k)
- scala-stm (8k)
- scala-query (7k)
- kafka (7k)
- scalaxb (5k)
- gdata-scala-client (4k)
- casbah (4k)
- ensime (4k)
- scala-swing (4k)
- scala-migrations (3k)
- smile (3k)
- flashup (1k)
- scala (177k)
- lift (58k)
- apparat (18k)
- scalate (16k)
- sgine (16k) (`hg clone https://sgine.googlecode.com/hg/ sgine`)
- akka-modules (15k)
- gapt (14k)
- scalanlp (13k)
- scala-ide (12k)
- uniscala (10k)
- scala-refactoring (10k)
- circumflex (6k)
- scala-intellij (67k)
- kojo (15k)
- gizzard (5k)
- flockdb (4k)
- sbt (15k)
- norbert (5k)
- scalax.io (5k)
- swap-scala (5k)
- scalala (4k)
- configgy (4k)
- sweetscala (3k)