Differences between Frege and Haskell

Ingo Wechsung edited this page Apr 23, 2016 · 42 revisions

This document is intended for people who have a working knowledge of Haskell and want to write Frege code. The objective is not to describe each difference in every detail. For this, please consult the Frege Language Reference Manual, that can be obtained from the documentation site. It is rather just an overview with brief explanations. However, if you feel some difference you stumble upon should be included here, please open an issue on the project page.

I refer to the Haskell language as described in the Haskell 2010 Language Report. This document follows in structure the Haskell 2010 Language Report. I give the section numbers for easy reference. Most of the time I will not repeat how things are in Haskell, for example I will not write "In Haskell, this is such and such, whereas in Frege, it is thus." Rather, I just write "thus".

Words and phrases like "not yet", "currently", "at this time" etc. indicate the possibility that the incompatibility they refer to will be eliminated later.

General differences

  • The Foreign Function Interface is missing, instead of this there are language constructs to make Java types and methods usable. All primitive types are just Java types, and so is String.

  • Higher rank polymorphism is supported out of the box. For a detailed explanation of Frege features that correspond to GHC language extensions see this wiki page.

  • Data types act as namespaces within modules, this makes it possible to have record fields whose names do not pollute the top level namespace.

  • The inferred types of let bound functions are not generalised.

Library/Prelude

The Frege-Prelude defines many functions, type classes and types known from Haskell in a similar way. The goal is to implement most of the Haskell 2010 standard library in a most compatible way as far as this makes sense.

Apart from that, everything else will be quite different, because Frege uses the Java APIs whenever possible. At the time being, there is not much library support yet beyond the standard libraries.

Type Classes

Implementation of type classes may be incomplete and or entirely missing. Currently, multi parameter type classes are not supported.

Numerical Type Classes and Data Types

We have type classes Num (which implies Ord and Eq), Integral and Real (which may be either Double or Float) and support for trigonometric functions etc, in Floating. Rational and irrational numbers don't exist yet. A long standing goal is to establish a sound type class hierarchy that will not necessarily match Haskell's.

There are no Enum instances for types Double and Float. This implies that there are no arithmetic sequences with floating point numbers. All such sequences should be possible to produce with iterate.

Monadic type classes

The type class hierarchy is different here. Monad implies Applicative and Bind, Bind and Applicative imply Apply, and Apply implies Functor.

1.4 Namespaces / 5 Modules

The compilation unit is a package or module, whose name is a dot-separated sequence of identifiers, where the last one must start with an uppercase letter. Packages names are used only in import clauses. In all other cases, they are referenced by the simple name that was assigned in the import clause or, if there was none assigned, by the last component of the package name. This is known as name space identifier .

There are no export clauses. Constructors, functions and data types that must not be accessible from other modules can be marked private. A data type declaration can be marked abstract, this is equivalent to making all constructors private.

Execution

Every module, not just Main, can have a function

main :: [String] -> IO a

or

main :: IO a

Execution of the Java class file that results from compiling of such a package will apply this function to the list of command line arguments if it is a function to obtain an IO action, otherwise, it is already an IO action. The runtime then performs this action.

Qualified Import

The equivalent of a qualified Haskell import is:

import foo.bar.Module()

or

import foo.bar.Module as M()

The empty import list prevents pollution of the namespace of the importing module.

2.3 Comments

Comments introduced with {-- (block comment) or --- (line comment) are treated as documentation text. Documentation text looks like a comment, but is syntactically a single token that may occur at certain points in the program only. The text of the comment makes it down to the compiled module (Java class file) from where it can be retrieved later, for example by documentation tools.

2.4 Identifiers and Operators

  • default, deriving, foreign and newtype are not keywords.
  • abstract, derive, false, native, package, private, protected, public and true are keywords.
  • pure and mutable are keywords if and only if the next token is the keyword native.
  • Currently, no identifier may start with an underscore except the identifier _. The latter may only appear in patterns, where it signals an uninteresting value, as usual.
  • The only operator allowed as data constructor is :
  • ! and ? are unary operators. Unary operators bind yet stronger than function application, so that f !a is parsed as f (!a)
  • Frege has 18 precedence levels, 16 can be assigned to user defined operators.

2.5 Numeric Literals

Frege accepts Java style numeric literals, except hexadecimal floating point literals. In addition, integer literals can have trailing groups of 3 digits separated by underscores.

A decimal integer literal followed by the suffix n or N has type Integer, for other numeric literals one gets the Frege type by capitalizing the first letter of the name of the type that the literal would have in Java (i.e. int -> Int, float -> Float).

Flexibly typed numeric literals

Numeric literals are not overloaded. The Num class has functions

fromInt :: Num a => Int -> a
fromIntegral :: (Integral i, Num r) => i -> r

that can be used to overload integer literals manually.

However, there are easements in the typing of integer decimal literals:

  1. If the type checker can infer that the type must not be Int but some different type t, the literal simply gets converted to type t. It is an error if t is not an instance of Num.
  2. If the type checker infers that the literal should have some unknown type t that is an instance of type class Real the literal is converted to Double and t is unified with Double. For example, the type of 1/2 is Double.
  3. If there is a type signature that forces the type to be polymorphic, the literal is replaced by an application of fromInt to the literal.
  4. If none of the above rules are applicable, the type is Int

For double literals without the suffix 'd' or 'D', the following rules apply:

  1. If the type checker can infer that the type must not be Double but some different type t, the literal simply gets converted to type t. It is an error if t is not an instance of Real.
  2. If there is a type signature that forces the type to be polymorphic, the literal is replaced by an application of fromDouble :: Real n => Double -> n to the literal.
  3. In all other cases, it is Double.

Practical consequences are that flexibly typed numeric literals determine the type of an expression only if the type is not determined by other means, like a type signature or application of a function that demands a certain type, hence a "do what I mean" approach. Moreover, in the presence of type signatures, almost full compatibility with Haskell is achieved.

Numeric literals with a suffix as well as octal or hexadecimal integer literals are not felxibly typed.

2.6 Character and String Literals

Character and string literals are the same as in Java. A string literal denotes a value of type String. A string value is not a list of characters but an instance of the Java class java.lang.String.

There exists a type class ListSource in the Prelude that abstracts over types one can make ordinary lists from by providing a toListoperation. String is an instance of this class, as well as Maybe and arrays. List comprehension generators accept instances of ListSource right from the arrow.

Besides the more general and lazy toList, the Prelude functions packed and unpacked convert eagerly from lists of characters to strings and back.

Regular expression literals

A sequence of characters included in grave accent marks ´ denotes a value of type Regex, which is the Frege incarnation of java.util.regex.Pattern. When used as patterns, Regex literals will check String arguments, like in

foo ´^A´ = "string starts with an 'A'"
foo _    = "string does not start with an 'A'"

3 Expressions, Patterns

  • do expressions are terms (known as aexp in the Haskell 2010 Report). This often allows us to write monadic expressions that appear as function arguments without using the dollar operator.
  • There is a whole range of syntactic sugar for field access, test for field existence, nondestructive update of data values with fields and nondestructive change of values through application of functions to fields. This applies to algebraic data types that have constructors with field labels.
  • If e has type T u1 ... uk, the expression e.f is transformed to (T.f e), i.e. the function f from namespace T applied to e. Otherwise, if f is a type class operation of type class C it means (C.f e). This is so because every type (constructor) name and class name is associated with a separate name space.
  • The function composition operator is (U+2022 BULLET). However, to minimize the incompatibility, a dot that is not part of another operator will be treated as if it were just if: the previous token was a ( or the next token is a ) or the . is enclosed in whitespace on both sides.
  • There is currently no notation to make a pattern irrefutable.
  • The precedence rules for patterns are subtly different from Haskell's, but one easy rule is enough to understand them: In Frege, patterns are syntactically just expressions (in fact, there is not even a separate grammar rule for patterns), though of course only a subset of all possible expressions make valid patterns.

A consequence is that if the left hand side of a binding is syntactically an application of some function or operator, it will be interpreted as function binding for that function except if the operator is @, which is lexically a right associative operator that binds weaker than : and treated specially so as to allow traditional pattern bindings like in

let qs@(q:_) = ....

Note that the parentheses are not needed in Frege.

Hence, f y@(x:xs) = ... is taken as pattern binding, even if f y is not a valid pattern (which causes this construct to fail, but only after it is parsed). If one meant to define a function f that takes a list y with head x and tail xs, one writes f (y@x:xs) = ....

Likewise, application of ! is syntactically an unary expression, not an application, therefore a construct like !p = ... is interpreted as (strict) binding for pattern p and

f !x !y = ...

binds f to a function that takes 2 strict arguments.

The rule of thumb is to put sub patterns and arguments of function bindings in parentheses when they are more complex than !v, because the syntax works exactly like on the right hand side.

4.2 User-Defined Datatypes

  • No constraints can be specified in constructor declarations.
  • Infix notation can not be used in constructor declarations (but is possible in constructor application).
  • Every type declared with data owns a namespace. The data declaration can have a where clause, and value bindings in the scope of that where clause go in the type’s namespace. Qualified access of those bindings is possible, the qualifier is the type constructor name.
  • Field labels are not global, but live in the namespace associated with the type. Hence, if type T has a (data constructor that has a) field m and the type of e is an application of T, then e.m accesses the field and T.m is a function that extracts the field from any value v whose type is an application of T, provided v was constructed with a data constructor in whose field list m appears.
  • In constructors with record field syntax, the record fields can have polymorphic function types, even higher ranked ones. Note that type variables that did not appear as arguments to the type constructor must be protected by a forall, e.g.:

    data ListFunctions = LF { rev, tl :: forall a.[a]->[a] }

  • There is no deriving clause. See below for derive declarations that serve the same purpose.

  • Strictness flags on individual fields are possible, by writing a ! before the field name. In traditional constructor definitions there are no field labels, hence such constructors are all lazy.
  • Any Java class or interface can be declared as an abstract Frege types. A value of such a type can only be constructed and inspected by native functions.

4.2.3 Datatype Renamings

There is no newtype keyword. An algebraic data type with exactly one constructor with exactly one field serves the same purpose. It follows, that in Frege there is no equivalent for Haskell's

data Lifted a = L a

4.3.2 Instance Declarations

  • Type synonyms in instance declarations are allowed.
  • The arguments to a type constructor need not be distinct type variables, they can also be type applications or the same type variable can be used more than once. However, since at most one instance per class and type constructor may be in place, this is not generally recommended. In addition, I am not sure if this will work with multi-parameter classes and instances (not yet implemented), so the best bet is to stick with the Haskell 2010 rules.

"All in one" instance declarations.

Say we have something like:

class A a where
    foo :: a -> something
class A b => B b where
    bar :: b -> something
class B c => C c where
    baz :: c -> something

Then if we want to write an instance of C for some type T, we can just write:

instance C T where
    foo = ...
    bar = ...
    baz = ...

This is equivalent to

instance A T where
    foo = ...
instance B T where
    bar = ...
instance C T where
    baz = ....

The advantage being that the "all in one" form will compile as long as the "overall" interface of C (here foo, bar, baz) does not change. The library maintainer could alter the hierarchy behind C, for example by splitting B into B1 and B2 or adding another super class to A. OTOH, the library consumer does not even have to know that A and B exist.

4.3.3 Derived Instances

A variant of the instance declaration uses the keyword derive in place of instance and lacks a where clause. It serves the same purpose as the deriving keyword in Haskell, but because it is independent of the data declaration it can occur everywhere an instance declaration can, even in a different module. Hence, class, data and derive declarations could be in up to 3 different modules.

If the instantiated type is given without constraints, derive automatically assumes a simple one. For example, one can have

data T a b c = .....
derive Eq  (T a b c)
-- short for derive (Eq a, Eq b, Eq c) => Eq (T a b c)

Currently, instances for Eq, Ord, Show, Enum and Bounded can be derived. Derived Show instances currently do not show field labels/record syntax.

8 Foreign Function Interface

Frege has language constructs to

  • use any existing Java primitive, class or interface type as abstract data type. Actually, all basic types like Bool, Char, Int, Long, Integer, Float, Double, String, Regex are just abstract datatypes borrowed from Java.
  • make existing java methods and even java operators available as Frege functions. These are known as native functions .

While the interoperability with other languages is not strictly restricted to Java it certainly can't cross the border of the JVM.

Miscellaneous

The standard type class hierarchy deviates from the Haskell standard, especially for numeric classes and for monadic ones, where we follow the Monad is an Applicative is a Functor approach.

Functions can have higher ranked types, provided an appropriate annotation was given. The compiler can check higher rank types, but cannot infer them.

There is no monomorphism restriction. Beware of polymorphic top level "values", they will turn into functions at runtime.

The types of un-annotated let bindings will not be generalised. If one needs a polymorphic local function, an annotation is needed. Note that there are cases where it is not possible to write the type of such a function, the Haskell 2010 report has examples of this.

Likewise, values bound by a lambda or case alternative are doomed to be monomorphic unless there is a type annotation that allows the compiler to infer a polymorphic type.

Several GHC language options correspond to features in Frege, please consult the detailed list.

Here are three ways to achieve polymorphism for a lambda bound value:

-- compiler knows that rev is polymorphic from data definition (see above)
foo (LF{rev}) = (rev [1,2], rev['a', 'b'])

-- compiler learns type of rev from bars type signature
bar :: (forall a.[a]->[a]) -> ([Int], [Char])
bar rev = (rev [1,2], rev['a', 'b'])

-- compiler learns type of rev from pattern signature
baz (rev :: forall a.[a]->[a]) = (rev [1,2], rev ['a', 'b'])

This works similarly in case alternatives and with higher rank types.