Skip to content

Commit

Permalink
README and comments
Browse files Browse the repository at this point in the history
  • Loading branch information
jorgeortiz85 committed Sep 11, 2011
1 parent 791e6ee commit 7d16d44
Show file tree
Hide file tree
Showing 14 changed files with 120 additions and 4 deletions.
59 changes: 59 additions & 0 deletions README
Original file line number Diff line number Diff line change
@@ -0,0 +1,59 @@
Muddl - a data-definition language

Muddl allows you to define schemas for your data, and it generates code to make it easy for you to work with that data in Scala. It comes with two parts: the muddl-compiler is run as part of your build process and generates code, and the muddl-library is added as a dependency to your project and required for this code to compile.



Features:

* Schema definition - Schemas are defined entirely in Scala. This keeps the compiler very simple. It needs to do a little bit of reflection and a lot of code generation, but all the hard work of parsing and defining the schema-language is done by the Scala compiler itself.

* Optional, required, and repeated fields - A field with a base type of A can be optional (returns Option[A]), required (returns A), or repeated (returns Seq[A]). Fields are optional by default.

* Embedded records - A field can consist of an embedded record.

* Pluggable serialization/deserialization - Muddl is serialization-agnostic. Muddl comes with a set of traits you can implement to define your own serialization and deserialization mechanisms. As long as ir can support a few basic concepts (objects, arrays, values) that are roughly isomorphic to JSON, any serialization mechanism should work.

* Type-safe serialization/deserialization - Muddl generates serialization and deserialization traits for each of your records. These traits will have an abstract method for each separate type that record needs in order to serialize/deserialize itself. If you forget to implement one of these methods, your code will not compile. This means if you add a field with a new type to your record, but forget to implement a serializer and a deserializer for this type, your code will fail to compile.

* Immutable by default - Everything in Muddl is immutable by default. However, records do have a corresponding mutable class, in case you need mutability.

* Edit history for mutable records - Mutable records keep track of old and new values for every mutated field. This allows you to hook in logging or tracking mechanisms at the point in which you write a record.

* toString/hashCode/equals - Muddl records include a useful toString implementation, as well as proper implementations for hashCode and equals. For immutable records, hashCode and equals will work very much like they do in case classes. For mutable records, instance equality is used for hashCode and equals.

* Decorators for easy extension - Muddl records very purposefully take a "just the data" philosophy. However, it's often convenient to have derived data or helper methods available with your record object. To facilitate this, Muddl generates decorator classes for each record. You can extend these decorate classes to add your own helper methods to your record objects.

* Type-level reflection - Muddl reifies type-level reflection on fields and records. This can be used to add additional compile-time safety to libraries using Muddl records, or to facilitate run-time reflection on Muddl records.



Non-features:

* Database access - Muddl will not read or write from a database for you. Instead, Muddl gives you serializers and deserializers to use with Muddl records, and lets you take care of reading and writing to the database.



Missing/broken/sub-optimal:

* Enumerations - Muddl does not yet support enumerations, but it will soon. These are unlikely to be based on Scala's Enumeration trait. More likely, they will be a different encoding of enumerations that take advantage of code generation.

* Multiple packages - Muddl generates records in the same package as the one in which their schema was defined. Currently, this works fine as long as you only ever use a single package. This will be fixed and multiple packages will be supported eventually.

* Empty vs missing arrays - Currently, Muddl does not distinguish between an empty array and a missing array. Is this distinction important? Probably yes. However, working with Option[Seq[A]] is painful. Maybe make arrays required parameters by default, and only optional if explicitly marked? Tough design decisions here.

* Foreign keys - Should Muddl recognize foreign keys? This requires the notion of records "with id". This could open the door for increased type safety (type of a Venue id != type of a Checkin id, even though they are both fundamentally ObjectIds) and other features (e.g., priming), but it's unclear what the best way to model this is.

* Performance testing - I'm reasonably confident that Muddl records are both more space efficient and more performant than Lift's Record, but as of yet I've done no testing on this. In particular, the serialization and deserialization patterns were written with little regard for performance and it is very likely that there is room for improvement here (guided by performance tests).

* Builder pattern - Currently, it is impossible to create a new Muddl record from scratch. Only existing records may be wrapped in a mutable wrapper and modified. What's the best way to create a new record? Perhaps with a mutable record that has no underlying immutable record (getting unset required fields would throw). Perhaps with a builder pattern (but this requires generating even more code).

* Default values - Currently, optional fields are all Option[A]. This is somewhat cumbersome to work with, and there is no way to define default values for fields. This can be mitigated with a decorator (make the field in the record be 'addressOpt', then add an 'address' field to the decorator that adds a default value), but that's somewhat tedious. Worth thinking about how to make this better.

* Schema inheritance - I have no idea whether this will work or not right now, I haven't tried it. If it doesn't work, it should be supported.

* Compiler plugin - A tiny compiler plugin could help with code generation (e.g., to discover all the schemas). Not critical, but worth considering.

* sbt plugin - An sbt plugin could make it slightly easier to integrate Muddl with existing sbt builds. Also not critical, but worth considering.

* Field numbering - Currently, all fields in a schema must be numbered. This isn't used anywhere at the moment, but one could imagine it being useful for interop with things like protobuf or Thrift. Worth keeping, or should we get rid of it until we need it?
Original file line number Diff line number Diff line change
@@ -1,5 +1,8 @@
package com.foursquare.muddl.compiler

/**
* This class gives a full description of a record schema's field.
*/
case class AnnotatedFieldSchema(schema: FieldSchema[_], longName: String) {
def number: Int = schema.number
def shortName: String = schema.shortName
Expand Down
Original file line number Diff line number Diff line change
@@ -1,5 +1,8 @@
package com.foursquare.muddl.compiler

/**
* This class gives a full description of a record schema.
*/
case class AnnotatedRecordSchema(
schema: RecordSchema,
className: String,
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -2,6 +2,9 @@ package com.foursquare.muddl.compiler

case class CodeGenResult(annotatedSchema: AnnotatedRecordSchema, code: String)

/**
* This class generates code for a given schema.
*/
object CodeGen {
def codeGen(schema: RecordSchema): CodeGenResult = {
val annotated = Reflection.annotateRecord(schema)
Expand Down
Original file line number Diff line number Diff line change
@@ -1,5 +1,10 @@
package com.foursquare.muddl.compiler

/**
* This class describes a field in a schema.
* Fields must have a number, a short name, and a type.
* Fields can be optional, required, or repeated.
*/
case class FieldSchema[T](
number: Int,
shortName: String,
Expand Down
Original file line number Diff line number Diff line change
@@ -1,6 +1,13 @@
package com.foursquare.muddl.compiler

/**
* Inherit from this class to define a schema.
*/
trait RecordSchema {
/**
* Call this method to define a field in this schema. A field type, a field number, and
* a field short name are all required.
*/
def field[T : Manifest](number: Int, shortName: String): FieldSchema[T] =
new FieldSchema[T](number, shortName, manifest[T], isOptional = true, isRepeated = false)
}
Original file line number Diff line number Diff line change
Expand Up @@ -2,6 +2,10 @@ package com.foursquare.muddl.compiler

import java.lang.reflect.Method

/**
* This class takes RecordSchemas and FieldSchemas and annotates them with some additional
* information based on Java reflection.
*/
object Reflection {
def annotateRecord(schema: RecordSchema): AnnotatedRecordSchema = {
val klass = schema.getClass
Expand All @@ -24,5 +28,6 @@ object Reflection {
AnnotatedFieldSchema(field, method.getName)
}

// TODO(jorge): Write a compiler plugin that can implement this method.
def allSchema: Seq[RecordSchema] = error("synthetic method not defined by compiler plugin")
}
Original file line number Diff line number Diff line change
Expand Up @@ -4,6 +4,10 @@ sealed class DeserializationException(message: String) extends RuntimeException(
case class MissingRequiredFieldException(field: UntypedField) extends DeserializationException("Missing required field: " + field.shortName)
case class CouldntDeserializeException(field: UntypedField, value: Any) extends DeserializationException("Couldn't deserialize field: " + field.shortName + " (value: "+value+")")

/**
* This trait defines the basics of deserialization. Deserializer implementations must support
* Objects, Arrays, and Elements. They must also support required, optional, and repeated fields.
*/
trait Deserializer {
type Obj
type Arr
Expand Down
8 changes: 8 additions & 0 deletions library/src/main/scala/com/foursquare/muddl/Field.scala
Original file line number Diff line number Diff line change
@@ -1,12 +1,20 @@
package com.foursquare.muddl

/**
* UntypedField contains information about one of a Record's fields.
*/
trait UntypedField {
def number: Int
def shortName: String
def longName: String
def optional: Boolean
}

/**
* A Field contains information about one of a Record's fields.
*
* A Field's type includes both the type of its Record and the type of the fields value.
*/
case class Field[T <: Record[T], F](
get: T => F,
number: Int,
Expand Down
Original file line number Diff line number Diff line change
@@ -1,3 +1,6 @@
package com.foursquare.muddl

/**
* A FieldMutation represents a change in the value of one of a Record's Fields.
*/
case class FieldMutation[F](oldValue: F, newValue: F)
3 changes: 3 additions & 0 deletions library/src/main/scala/com/foursquare/muddl/MetaRecord.scala
Original file line number Diff line number Diff line change
@@ -1,5 +1,8 @@
package com.foursquare.muddl

/**
* A MetaRecord has information about a Record's fields.
*/
trait MetaRecord[T <: Record[T]] {
def fields: Seq[Field[T, _]]
}
Original file line number Diff line number Diff line change
@@ -1,8 +1,10 @@
package com.foursquare.muddl

import java.lang.Object
import scala.ScalaObject

/**
* A MutableRecord is a Record that keeps a map of mutations, or changes to one of its fields.
*
* It overrides hashCode and equals to use the system defaults.
*/
trait MutableRecord[T <: Record[T]] extends Record[T] { self: T =>
protected def underlying: T with Record[T]
protected var _mutations: scala.collection.mutable.Map[Field[T, _], FieldMutation[_]] =
Expand Down
9 changes: 8 additions & 1 deletion library/src/main/scala/com/foursquare/muddl/Record.scala
Original file line number Diff line number Diff line change
Expand Up @@ -2,14 +2,21 @@ package com.foursquare.muddl

import scala.runtime.ScalaRunTime

/**
* A Record is very similar to a case class.
*
* Like a case class, it implements Product (including productArity, productElement, and productPrefix).
* Like a case class, it implements toString, hashCode, and equals.
* In addition, it also references a MetaRecord.
*/
trait Record[T <: Record[T]] extends Product { self: T =>
def meta: MetaRecord[T]

override def productArity: Int = meta.fields.size
override def productElement(n: Int): Any = meta.fields(n).get(this)
override def productPrefix: String = this.getClass.getSimpleName
// override def toString = ScalaRunTime._toString(this)
override def toString = meta.fields.map(f => f.shortName + "=" + f.get(this)).mkString(productPrefix + "(", ", ", ")")
override def toString = meta.fields.map(f => f.longName + "=" + f.get(this)).mkString(productPrefix + "(", ", ", ")")
override def hashCode = ScalaRunTime._hashCode(this)
override def equals(that: Any): Boolean = ScalaRunTime._equals(this, that)
}
4 changes: 4 additions & 0 deletions library/src/main/scala/com/foursquare/muddl/Serializer.scala
Original file line number Diff line number Diff line change
@@ -1,5 +1,9 @@
package com.foursquare.muddl

/**
* This trait defines the basics of serialization. Serializer implementations must support
* Objects, Arrays, and Elements.
*/
trait Serializer {
type Elem
type Obj <: Elem
Expand Down

0 comments on commit 7d16d44

Please sign in to comment.