Java 8 annotation processor and framework for deriving algebraic data types constructors, pattern-matching, morphisms, (near future: optics and typeclasses)

Randgalt/derive4j

 
 

Derive4J: Java 8 annotation processor for deriving algebraic data types constructors, pattern matching and more!

Caution: if you are not familiar with Algebraic Data Types (aka "Sum Types" / "Tagged Unions") or the "visitor pattern", you should learn a bit about them before reading further.

This project is specially dedicated to Tony Morris' blog post [Debut with a catamorphism](http://blog.tmorris.net/posts/debut-with-a-catamorphism/index.html). I'm also very thankful to @sviperll and his adt4j project, which was the initial inspiration for Derive4J.

So. What can this project do for us, poor functional programmers stuck with a legacy language called Java? A good deal of what is available for free in better languages like Haskell: pattern matching, laziness... An example being worth a thousand words...

Example: a 'Visitor' for HTTP Request

Let's say we want to model an HTTP request. For the sake of the example, let's say that an HTTP request can either be

  • a GET on a given path
  • a DELETE on a given path
  • a POST of a content body on a given path
  • a PUT of a content body on a given path

and nothing else!

You could then use the corrected visitor pattern and write the following class in Java:

package org.derive4j.example;

import org.derive4j.Data;

/** A data type to model an HTTP request. */
@Data
public abstract class Request {

  /** the Request 'visitor' interface, R being the return type
   *  used by the 'accept' method : */
  interface Cases<R> {
    // A request can either be a 'GET' (of a path):
    R GET(String path);
    // or a 'DELETE' (of a path):
    R DELETE(String path);
    // or a 'PUT' (on a path, with a body):
    R PUT(String path, String body);
    // or a 'POST' (on a path, with a body):
    R POST(String path, String body);
    // and nothing else!
  }

  // the 'accept' method of the visitor pattern:
  public abstract <R> R match(Cases<R> cases);

  /**
   * Alternatively and equivalently to the visitor pattern above, if you prefer a more FP style,
   * you can define a catamorphism instead. (see examples)
   * (most useful for standard data type like Option, Either, List...)
   */
}

Constructors

Without Derive4J, you would have to create subclasses of Request for all four cases. That is, write at the minimum something like:

  public static Request GET(String path) {
    return new Request() {
      @Override
      public <R> R match(Cases<R> cases) {
        return cases.GET(path);
      }
    };
  }

for each case. But thanks to the @Data annotation, Derive4J will do that for you! That is, it will generate a Requests class (the name is configurable; by default the class is generated in target/generated-sources/annotations when using Maven) with four static factory methods (what we call 'constructors' in FP):

  public static Request GET(String path) {...}
  public static Request DELETE(String path) {...}
  public static Request PUT(String path, String body) {...}
  public static Request POST(String path, String body) {...}
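To make the boilerplate concrete, here is a hand-written sketch of what the four factory methods could look like without any code generation. It does not use Derive4J; the names HandWrittenRequest and describe are hypothetical, chosen only for this illustration, and the code Derive4J actually generates may well differ.

```java
// Hypothetical hand-written equivalent of the Request data type (no Derive4J involved).
abstract class HandWrittenRequest {

  interface Cases<R> {
    R GET(String path);
    R DELETE(String path);
    R PUT(String path, String body);
    R POST(String path, String body);
  }

  abstract <R> R match(Cases<R> cases);

  // One anonymous subclass per case; each one calls "its" method of the visitor.
  static HandWrittenRequest GET(String path) {
    return new HandWrittenRequest() {
      @Override <R> R match(Cases<R> cases) { return cases.GET(path); }
    };
  }

  static HandWrittenRequest DELETE(String path) {
    return new HandWrittenRequest() {
      @Override <R> R match(Cases<R> cases) { return cases.DELETE(path); }
    };
  }

  static HandWrittenRequest PUT(String path, String body) {
    return new HandWrittenRequest() {
      @Override <R> R match(Cases<R> cases) { return cases.PUT(path, body); }
    };
  }

  static HandWrittenRequest POST(String path, String body) {
    return new HandWrittenRequest() {
      @Override <R> R match(Cases<R> cases) { return cases.POST(path, body); }
    };
  }

  // Illustrative helper that consumes a request through the visitor.
  static String describe(HandWrittenRequest request) {
    return request.match(new Cases<String>() {
      @Override public String GET(String path) { return "GET " + path; }
      @Override public String DELETE(String path) { return "DELETE " + path; }
      @Override public String PUT(String path, String body) {
        return "PUT " + path + " (" + body.length() + " chars)";
      }
      @Override public String POST(String path, String body) {
        return "POST " + path + " (" + body.length() + " chars)";
      }
    });
  }
}
```

Every new case added to Cases requires another factory method of this shape, which is exactly the repetition the annotation processor is meant to remove.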

You can also ask Derive4J to generate null checks with:

@Data(arguments = ArgOption.checkedNotNull)

equals, hashCode, toString?

Derive4J's philosophy is to be as safe and consistent as possible. That is why Object.{equals, hashCode, toString} are not implemented by generated classes by default. Nonetheless, as a concession to legacy, it is possible to force Derive4J to implement them by declaring them abstract, e.g. by adding the following to your annotated class:

  @Override
  public abstract int hashCode();
  @Override
  public abstract boolean equals(Object obj);
  @Override
  public abstract String toString();

The safer solution would be to never use those methods and use 'type classes' instead, e.g. Equal, Hash and Show. The project Derive4J for Functional Java aims at generating them automatically.

Pattern matching syntax

Now let's say that you want a function that returns the body size of a Request. Without Derive4J you would write something like:

  static final Function<Request, Integer> getBodySize = request -> 
      request.match(new Cases<Integer>() {
        public Integer GET(String path) {
          return 0;
        }
        public Integer DELETE(String path) {
          return 0;
        }
        public Integer PUT(String path, String body) {
          return body.length();
        }
        public Integer POST(String path, String body) {
          return body.length();
        }
      });

With Derive4J you can do that a lot less verbosely, thanks to a generated fluent structural pattern matching syntax! And it performs an exhaustiveness check (you must handle all cases). The above can be rewritten as:

static final Function<Request, Integer> getBodySize = Requests.cases()
      .GET(path          -> 0)
      .DELETE(path       -> 0)
      .PUT((path, body)  -> body.length())
      .POST((path, body) -> body.length());

or even (because you don't care about the GET and DELETE cases):

static final Function<Request, Integer> getBodySize = Requests.cases()
      .PUT((path, body)  -> body.length())
      .POST((path, body) -> body.length())
      .otherwise(0);

Accessors (getters)

Now, pattern matching every time you want to inspect an instance of Request is a bit tedious. For this reason Derive4J generates 'getter' static methods for all fields. For the path and body fields, Derive4J will generate the following methods in the Requests class:

  public static String getPath(Request request){
    return Requests.cases()
        .GET(path          -> path)
        .DELETE(path       -> path)
        .PUT((path, body)  -> path)
        .POST((path, body) -> path)
        .apply(request);
  }
  // returns an Optional because the body is not present in the GET and DELETE cases:
  static Optional<String> getBody(Request request){
    return Requests.cases()
        .PUT((path, body)  -> body)
        .POST((path, body) -> body)
        .otherwiseEmpty()
        .apply(request);
  }

(Actually the generated code is equivalent but more efficient)

Using the generated getBody method, we can rewrite our getBodySize function as:

static final Function<Request, Integer> getBodySize = request ->
      Requests.getBody(request)
              .map(String::length)
              .orElse(0);
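The Optional-returning getter can be imitated by hand with a base visitor whose cases all return a default value, overriding only the cases of interest. The following is a minimal sketch under stated assumptions: MiniRequest and Otherwise are hypothetical names, the type has only two cases to stay short, and nothing here is the code Derive4J generates.

```java
import java.util.Optional;

// Hypothetical two-case variant of Request, written by hand to keep the sketch short.
abstract class MiniRequest {

  interface Cases<R> {
    R GET(String path);
    R POST(String path, String body);
  }

  abstract <R> R match(Cases<R> cases);

  static MiniRequest GET(String path) {
    return new MiniRequest() {
      @Override <R> R match(Cases<R> cases) { return cases.GET(path); }
    };
  }

  static MiniRequest POST(String path, String body) {
    return new MiniRequest() {
      @Override <R> R match(Cases<R> cases) { return cases.POST(path, body); }
    };
  }

  // Visitor whose cases all fall back to one default value ("otherwise").
  static class Otherwise<R> implements Cases<R> {
    private final R defaultValue;
    Otherwise(R defaultValue) { this.defaultValue = defaultValue; }
    @Override public R GET(String path) { return defaultValue; }
    @Override public R POST(String path, String body) { return defaultValue; }
  }

  // Partial getter: only POST is overridden, every other case yields Optional.empty().
  static Optional<String> getBody(MiniRequest request) {
    return request.match(new Otherwise<Optional<String>>(Optional.empty()) {
      @Override public Optional<String> POST(String path, String body) {
        return Optional.of(body);
      }
    });
  }

  static int getBodySize(MiniRequest request) {
    return getBody(request).map(String::length).orElse(0);
  }
}
```

The trade-off of a default-valued visitor is that a newly added case is silently covered by the default, so it gives up the exhaustiveness check for that particular match.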

Functional setters ('withers')

The most painful part of immutable data structures (like the ones generated by Derive4J) is updating them. Scala case classes have copy methods for that. Derive4J generates similar modifier and setter methods in the Requests class:

  public static Function<Request, Request> setPath(String newPath){
    return Requests.cases()
            .GET(path          -> Requests.GET(newPath))
            .DELETE(path       -> Requests.DELETE(newPath))
            .PUT((path, body)  -> Requests.PUT(newPath, body))
            .POST((path, body) -> Requests.POST(newPath, body));
  }
  public static Function<Request, Request> modPath(Function<String, String> pathMapper){
    return Requests.cases()
            .GET(path          -> Requests.GET(pathMapper.apply(path)))
            .DELETE(path       -> Requests.DELETE(pathMapper.apply(path)))
            .PUT((path, body)  -> Requests.PUT(pathMapper.apply(path), body))
            .POST((path, body) -> Requests.POST(pathMapper.apply(path), body));
  }
  public static Function<Request, Request> setBody(String newBody){
    return Requests.cases()
            .GET(path          -> Requests.GET(path))    // identity function for GET
            .DELETE(path       -> Requests.DELETE(path)) // and DELETE cases.
            .PUT((path, body)  -> Requests.PUT(path, newBody))
            .POST((path, body) -> Requests.POST(path, newBody));
  }
  ...

By returning a function, modifiers and setters allow for a lightweight syntax when updating deeply nested immutable data structures.
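Because such setters and modifiers are plain java.util.function.Function values, they can be chained with the standard andThen method. The sketch below illustrates this with a hand-written single-case type; PutRequest and MIGRATE are hypothetical names introduced for this example only, and no Derive4J-generated code is involved.

```java
import java.util.function.Function;

// Hypothetical single-case data type, written by hand (no Derive4J involved).
final class PutRequest {
  final String path;
  final String body;

  PutRequest(String path, String body) {
    this.path = path;
    this.body = body;
  }

  // Setters and modifiers return functions instead of taking the instance as an argument.
  static Function<PutRequest, PutRequest> setBody(String newBody) {
    return request -> new PutRequest(request.path, newBody);
  }

  static Function<PutRequest, PutRequest> modPath(Function<String, String> pathMapper) {
    return request -> new PutRequest(pathMapper.apply(request.path), request.body);
  }

  // Two updates chained into a single reusable function.
  static final Function<PutRequest, PutRequest> MIGRATE =
      modPath(path -> "/v2" + path).andThen(setBody("{}"));
}
```

Applying MIGRATE to a request returns a new instance and leaves the original untouched, which is the property that makes these functions safe to share and reuse.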

First class laziness

Languages like Haskell provide laziness by default, which simplifies a lot of algorithms. In traditional Java you would have to declare a method argument as Supplier<Request> (and do memoization) to emulate laziness. With Derive4J that is no longer necessary, as it generates a lazy constructor that gives you transparent lazy evaluation for all consumers of your data type:

  // the requestExpression will be lazy-evaluated on the first call
  // to the 'match' method of the returned Request instance:
  public static Request lazy(Supplier<Request> requestExpression) {
    ...
  }

Have a look at List for how to implement a lazy cons list in Java using Derive4J (you may also want to see the associated generated code).
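For intuition, a memoizing lazy constructor can be sketched by hand as a subclass that defers to a Supplier on the first match call and caches the result. This is only an illustration under stated assumptions: LazyRequest is a hypothetical one-case type, and the implementation Derive4J generates is not shown in this document and may work differently.

```java
import java.util.function.Supplier;

// Hypothetical hand-written data type with a memoizing lazy constructor.
abstract class LazyRequest {

  interface Cases<R> {
    R GET(String path);
  }

  abstract <R> R match(Cases<R> cases);

  static LazyRequest GET(String path) {
    return new LazyRequest() {
      @Override <R> R match(Cases<R> cases) { return cases.GET(path); }
    };
  }

  // Defers the supplier until the first 'match' call, then caches the result.
  static LazyRequest lazy(Supplier<LazyRequest> expression) {
    return new LazyRequest() {
      private Supplier<LazyRequest> pending = expression;
      private LazyRequest evaluated;

      @Override synchronized <R> R match(Cases<R> cases) {
        if (evaluated == null) {
          evaluated = pending.get();
          pending = null; // release the supplier so it can be garbage collected
        }
        return evaluated.match(cases);
      }
    };
  }
}
```

Consumers only ever see a LazyRequest and call match as usual, so laziness stays invisible to them; the coarse synchronized keyword is a simplification chosen for brevity.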

Flavours

In the example above, we have used the default JDK flavour. Also available are the FJ (Functional Java), Fugue and Javaslang flavours. When using those alternative flavours, Derive4J will use e.g. the specific Option implementations from those projects instead of the JDK Optional class.

Optics (functional lenses)

If you are not familiar with optics, have a look at Monocle (for Scala, but Functional Java provides similar abstractions).

Using Derive4J generated code, defining optics is a breeze (you need to use the FJ flavour by specifying @Data(flavour = Flavour.FJ)):

  /**
   * Lenses: optics focused on a field present for all data type constructors
   * (the getter cannot fail):
   */
  public static final Lens<Request, String> _path = lens(
      Requests::getPath,
      Requests::setPath);
  /**
   * Optional: optics focused on a field that may not be present for all constructors
   * (the getter returns an 'Option'):
   */
  public static final Optional<Request, String> _body = optional(
      Requests::getBody,
      Requests::setBody);
  /**
   * Prism: optics focused on a specific constructor:
   */
  public static final Prism<Request, String> _GET = prism(
      // Getter function
      Requests.cases()
          .GET(fj.data.Option::some)
          .otherwise(Option::none),
      // Reverse Get function (aka constructor)
      Requests::GET);

  // If there is more than one field, we use a tuple as the prism target:
  public static final Prism<Request, P2<String, String>> _POST = prism(
      // Getter:
      Requests.cases()
          .POST((path, body) -> p(path, body))
          .otherwiseNone(),
      // reverse get (construct a POST request given a P2<String, String>):
      p2 -> Requests.POST(p2._1(), p2._2()));
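To see why a getter plus a function-returning setter is all a lens needs, here is a minimal hand-written lens. It is a sketch under stated assumptions: SimpleLens is a hypothetical class invented for this illustration, and it is not Functional Java's Lens type, which is richer than this.

```java
import java.util.function.Function;

// Hypothetical minimal lens; Functional Java's real Lens type is richer than this.
final class SimpleLens<S, A> {
  private final Function<S, A> getter;
  private final Function<A, Function<S, S>> setter;

  SimpleLens(Function<S, A> getter, Function<A, Function<S, S>> setter) {
    this.getter = getter;
    this.setter = setter;
  }

  A get(S source) {
    return getter.apply(source);
  }

  S set(A value, S source) {
    return setter.apply(value).apply(source);
  }

  // Read the focus, transform it, write it back.
  S modify(Function<A, A> f, S source) {
    return set(f.apply(get(source)), source);
  }

  // Focus deeper: this lens goes S -> A, 'next' goes A -> B.
  <B> SimpleLens<S, B> andThen(SimpleLens<A, B> next) {
    return new SimpleLens<S, B>(
        source -> next.get(get(source)),
        value -> source -> set(next.set(value, get(source)), source));
  }
}
```

The setter has the same shape as the setPath function shown earlier (a value in, a Function from structure to structure out), which is why method references to a getter and a setter are enough to build the lens.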

Updating deeply nested immutable data structures

Let's say you want to model a CRM. Each client is a Person who can be contacted either by email, telephone or postal mail. With Derive4J you could write the following:

import org.derive4j.*;
import java.util.function.BiFunction;

@Data
public abstract class Address {
  public abstract <R> R match(@FieldNames({"number", "street"}) 
  			      BiFunction<Integer, String, R> Address);
}

import org.derive4j.Data;

@Data
public abstract class Contact {
    interface Cases<R> {
      R byEmail(String email);
      R byPhone(String phoneNumber);
      R byMail(Address postalAddress);
    }
    public abstract <R> R match(Cases<R> cases);
}

import org.derive4j.*;
import java.util.function.BiFunction;

@Data
public abstract class Person {
  public abstract <R> R match(@FieldNames({"name", "contact"})
                              BiFunction<String, Contact, R> Person);
}

But now we have a problem: all clients have been imported from a legacy database with an off-by-one error in the street number! We must create a function that increments each person's street number (if it exists) by one, without modifying the original data structure (because it is immutable). With Derive4J, writing such a function is trivial:

import java.util.Optional;
import java.util.function.Function;

import static org.derive4j.example.Addresss.Address;
import static org.derive4j.example.Addresss.getNumber;
import static org.derive4j.example.Addresss.modNumber;
import static org.derive4j.example.Contacts.getPostalAddress;
import static org.derive4j.example.Contacts.modPostalAddress;
import static org.derive4j.example.Persons.Person;
import static org.derive4j.example.Persons.getContact;
import static org.derive4j.example.Persons.modContact;

  public static void main(String[] args) {

    Person joe = Person("Joe", Contacts.byMail(Address(10, "Main St")));

    Function<Person, Person> incrementStreetNumber = modContact(
    						       modPostalAddress(
    						         modNumber(number -> number + 1)));
    
    // correctedJoe is a copy of joe with the street number incremented:
    Person correctedJoe = incrementStreetNumber.apply(joe);

    Optional<Integer> newStreetNumber = getPostalAddress(getContact(correctedJoe))
        .map(postalAddress -> getNumber(postalAddress));

    System.out.println(newStreetNumber); // prints "Optional[11]" !!
  }

Popular use-case: domain specific languages

Algebraic data types are particularly well suited for creating DSLs, such as a calculator for arithmetic expressions:

import java.util.function.Function;
import org.derive4j.Data;
import static org.derive4j.example.Expressions.*;

@Data
public abstract class Expression {

	interface Cases<R> {
		R Const(Integer value);
		R Add(Expression left, Expression right);
		R Mult(Expression left, Expression right);
		R Neg(Expression expr);
	}
	
	public abstract <R> R match(Cases<R> cases);

	private static Function<Expression, Integer> eval = Expressions
		.cases()
			.Const(value        -> value)
			.Add((left, right)  -> eval(left) + eval(right))
			.Mult((left, right) -> eval(left) * eval(right))
			.Neg(expr           -> -eval(expr));

	public static Integer eval(Expression expression) {
		return eval.apply(expression);
	}

	public static void main(String[] args) {
		Expression expr = Add(Const(1), Mult(Const(2), Mult(Const(3), Const(3))));
		System.out.println(eval(expr)); // (1+(2*(3*3))) = 19
	}
}

Catamorphisms

Catamorphisms are generated for recursively defined data types, so you can rewrite the above eval method as:

	public static Integer eval(Expression expression) {
		return Expressions
		     .cata(
		        value -> value,
		        (left, right) -> left.get() + right.get(),
		        (left, right) -> left.get() * right.get(),
		        expr -> -expr.get()
		     )
		     .apply(expression);
	}

But beware that for very deep structures it may blow the stack! (unless you make good use of lazy constructors...)
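To show what a catamorphism does and why its depth is bounded by the call stack, here is a hand-written fold over a reduced expression type. It is a sketch only: Expr is a hypothetical type with three cases (Mult is omitted for brevity), the cata signature is an assumption modelled on the usage above, and the code Derive4J generates may differ.

```java
import java.util.function.BiFunction;
import java.util.function.Function;
import java.util.function.Supplier;

// Hypothetical hand-written expression type with a fold ('cata'); no Derive4J involved.
abstract class Expr {

  interface Cases<R> {
    R Const(Integer value);
    R Add(Expr left, Expr right);
    R Neg(Expr expr);
  }

  abstract <R> R match(Cases<R> cases);

  static Expr Const(Integer value) {
    return new Expr() {
      @Override <R> R match(Cases<R> cases) { return cases.Const(value); }
    };
  }

  static Expr Add(Expr left, Expr right) {
    return new Expr() {
      @Override <R> R match(Cases<R> cases) { return cases.Add(left, right); }
    };
  }

  static Expr Neg(Expr expr) {
    return new Expr() {
      @Override <R> R match(Cases<R> cases) { return cases.Neg(expr); }
    };
  }

  // The fold hands each handler a lazy, already-folded result for every recursive field.
  static <R> Function<Expr, R> cata(Function<Integer, R> onConst,
                                    BiFunction<Supplier<R>, Supplier<R>, R> onAdd,
                                    Function<Supplier<R>, R> onNeg) {
    return new Function<Expr, R>() {
      @Override public R apply(Expr e) {
        Function<Expr, R> self = this;
        return e.match(new Cases<R>() {
          @Override public R Const(Integer value) { return onConst.apply(value); }
          @Override public R Add(Expr left, Expr right) {
            return onAdd.apply(() -> self.apply(left), () -> self.apply(right));
          }
          @Override public R Neg(Expr expr) { return onNeg.apply(() -> self.apply(expr)); }
        });
      }
    };
  }

  static Integer eval(Expr expression) {
    return Expr.<Integer>cata(
        value -> value,
        (left, right) -> left.get() + right.get(),
        expr -> -expr.get()).apply(expression);
  }
}
```

Each call to get() on a Supplier re-enters apply for a sub-expression, so the recursion depth follows the depth of the expression tree; that is the source of the stack concern mentioned above.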

But what exactly is generated?

This is a very legitimate question. Here is the Expressions.java file that is generated for the above @Data Expression class.

Parametric polymorphism

... works as expected. For example, you can write the following:

import java.util.function.Function;
import java.util.function.Supplier;
import org.derive4j.Data;

@Data
public abstract class Option<A> {

    public abstract <R> R cata(Supplier<R> none, Function<A, R> some);

    public final <B> Option<B> map(final Function<A, B> mapper) {
        return Options.modSome(mapper).apply(this);
    }
}

=> The generated modifier method modSome allows polymorphic update and is incidentally the functor for our Option!
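A hand-written version makes the "polymorphic update" point visible: the modifier takes a Function<A, B>, so the element type can change from A to B. The sketch below uses a hypothetical Maybe class so as not to clash with the Option above; its modSome is written by hand and is only assumed to mirror the role of the generated method.

```java
import java.util.function.Function;
import java.util.function.Supplier;

// Hypothetical hand-written Option; 'modSome' mirrors the role of the generated modifier.
abstract class Maybe<A> {

  abstract <R> R cata(Supplier<R> none, Function<A, R> some);

  static <A> Maybe<A> none() {
    return new Maybe<A>() {
      @Override <R> R cata(Supplier<R> none, Function<A, R> some) { return none.get(); }
    };
  }

  static <A> Maybe<A> some(A value) {
    return new Maybe<A>() {
      @Override <R> R cata(Supplier<R> none, Function<A, R> some) { return some.apply(value); }
    };
  }

  // Polymorphic update: the element type may change from A to B.
  static <A, B> Function<Maybe<A>, Maybe<B>> modSome(Function<A, B> f) {
    return maybe -> maybe.<Maybe<B>>cata(
        () -> Maybe.<B>none(),
        a -> Maybe.<B>some(f.apply(a)));
  }

  final <B> Maybe<B> map(Function<A, B> mapper) {
    return Maybe.<A, B>modSome(mapper).apply(this);
  }
}
```

With this definition, mapping over none() yields none() and mapping over some(x) yields some(f(x)), which is the behaviour expected of a functor's map.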

Generalized Algebraic Data Types

GADTs are also supported out of the box by Derive4J (within the limitations of the Java type system). Have a look at this gist to see how to define GADTs in Java and how they can help create type-safe DSLs: https://gist.github.com/jbgi/208a1733f15cdcf78eb5

Use it in your project

Derive4J should be declared as a compile-time only dependency (not needed at runtime). So while derive4j is (L)GPL-licensed, the generated code is not linked to derive4j, and thus derive4j can be used in any project (proprietary or not).

Maven:

<dependency>
  <groupId>org.derive4j</groupId>
  <artifactId>derive4j</artifactId>
  <version>0.8.1</version>
  <optional>true</optional>
</dependency>

Gradle:

compile(group: 'org.derive4j', name: 'derive4j', version: '0.8.1', ext: 'jar')

or better using the gradle-apt-plugin:

compileOnly "org.derive4j:derive4j-annotation:0.8.1"
apt "org.derive4j:derive4j:0.8.1"

Contributing

Bug reports and feature requests are welcome, as well as contributions to improve documentation.

Right now the codebase is not ready for external contributions (many blocks of code are more complicated than they should be), so you may want to wait for the resolution of #2 before trying to dig into the codebase.

Contact

jb@giraudeau.info, @jb9i or use the project's GitHub issues.
