Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

A New Design for Tuples in Rascal #154

Closed
PaulKlint opened this issue Mar 1, 2013 · 16 comments
Closed

A New Design for Tuples in Rascal #154

PaulKlint opened this issue Mar 1, 2013 · 16 comments

Comments

@PaulKlint
Copy link
Member

A New Design for Tuples in Rascal

Current Design

The current tuple type is defined by the type constructor tuple with a list of ordered (optionally named) fields:
tuple[t1 n1, t2 n2, … ].

An example is the type tuple[str name, int age]. The assignment B = <”Belle”, 3> assigns a tuple of this type to variable B.

The elements of a tuple can be accessed by

  • Position:B[0].
  • Field name: B.name (provided that the field name is present)

The following operations are available for tuples:

  • Tuple concatenation, using the + operator.
  • Tuple pattern matching.

Problems to be addressed

  • Before the introduction of keyword parameters. Function calls and tuples where nearly identical (except for varyadic arguments). We want to restore this similarity in the presence of keyword parameters.
  • We want to share keyword parameters across function calls (see Allow to pass the keywords parameters to another function #94).
  • Allow row polymorphism, in order to write functions on relations with a varying number fields (= columns). This has many applications, one of them implementing the “grammar of graphics” approach to drawing charts.

Proposed Design

The proposed design can be summarized as: “add unordered keyword fields with default value to tuples and introduce tuple splicing”. More precisely, the arguments of the tuple constructor may be (in this order):

  • A possibly empty, ordered, list of positional, typed and named fields.
  • A possibly empty, unordered, list of keyword fields with default values.

The elements of a tuple value can be accessed as follows:

  • Positional arguments can be accessed by position and by name (when present).
  • Keyword fields can only be accessed by their name.

The following operations are available on these enhanced tuples:

  • Concatenation: positional fields are concatenated in order (as before), and the keyword fields are merged (with incompatible redefinitions being forbidden).
  • Tuple splicing: splice one tuple in another one. This preserves the order of positional fields and also takes care of placing positional fields befire keyword fields.
  • Tuple pattern matching, also allowing tuple splicing. We impose the restriction that a multi-variable may only occur as the last element of a tuple pattern.

For completeness sake, we may want support tuple substraction as well.

Some examples

Introduce CourseGrade to represent the grades for a course:

alias CourseGrade = tuple[str name, 
                          list[int] attempts, 
                          bool obligatory = true, 
                          prerequisites = true];

Create a CourseGrade value:

CourseGrade g = <“algebra II”, [4, 5, 7], obligatory = false>;

Define a function that computes average grades:

real avg(set[CourseGrade] grades) = 
     (0 | it + last(g.attempts) | 
          g <- grades, size(g.attempts) > 0) / size(grades));

Tuple concatenation:

< “algebra II”, [4, 5, 7], obligatory = false> + <”john”, parttime = true> ⇒
< “algebra II”,  [4, 5, 7], “John”, obligatory = false, parttime = true>

Tuple splicing:
Assume A = <[4, 5, 7], prerequisites = true>, then

<”algebra II”, obligatory = true, *A> ⇒ 
< “algebra II”, [4, 5, 7], obligatory = true, prerequisites = true>

Tuple Matching:
The match

<str name, obligatory=O, *R> := 
< “algebra II”, [4, 5, 7], obligatory = true, prerequisites = true> 

will succeed and creates the bindings:

  • name == “algebra II”,
  • O == true,
  • R = <[4,5,7], prerequisites= true>

Tuple Matching

We make the concept of tuple matching now a bit more precise. A tuple pattern P and a tuple value T, match iff

  • If P contains no multi-variable:
    • They have the same number of positional fields and corresponding positional fields match.
    • Each keyword field in P matches with a keyword field with the same name in T.
  • If P does contain one multi-variable:
    • Each positional field in P matches with a corresponding positional field in T.
    • Each keyword field in P matches with a keyword field with the same name in T.
      *Unmatched postional fields and keyword fields are collected in a tuple that is bound to the multi-variable.
  • If P contains more than one multi-variable:
    • Not allowed

The following example matches will succeed:

  • <3, 4, b=true> := <3, 4, x=3.5, b=true>
  • <3, *R> := <3, 4, x=3.5, b=true> and R will be bound to <4, x=3.5, b=true>
  • <b=true, *R> := <3, 4, x=3.5, b=true> and R will be bound to <3, 4, x=3.5>

Function calls

Disregarding varyadic arguments (see the discussion below), function signatures and tuples are identical in the proposed design. Both consist of an ordered list of positional parameters followed by an unordered list of keyword parameters/fields. Consider the following function:

real scale1(real x, real factor = 1.0, bool mul= true) = 
mul ? x * factor : x / factor; 

It can easily be rewritten as:

alias Scale = tuple[real x, real factor = 1.0, bool mul = true];
real scale2(Scale s) = s.mul ? s.x * s.factor : s.x / s.factor;

The only inconvenience is that we have to prefix every use of a field with “s.” To solve this, we want tuple splicing at the level of argument lists (which is completely natural if we consider the argument list as a tuple). For instance,

real scale3(*Scale s) = mul ? x * factor : x / factor;

The function scale3 can be called in two ways:

  • x1 = scale3(3.5, factor=3);
  • Scale s = <3.5, factor=3); x1 = scale3(s);

Now suppose the arguments of a function are coming both from direct arguments of the function and from arguments coming from a tuple argument. Then we have to concatenate the direct arguments and the spliced-in tuple arguments as follows:

real scale4(bool printit=false, *Scale s) {
  r = mul ? x * factor : x / factor;
  if(printit) println(“scale4 returns <r>”);
  return r;
}

Typical calls to scale4 are:

  • x1 = scale4(3.5, printit=true, factor=3);
  • Scale s = <3.5, factor=3); x1 = scale4(printit=true, s);

Examples

Domain

The domain function for relations of any arity looks like this:

set[&A] domain(rel[&A, *&REST] R) = { a | <a, *r> <- R };

Range

The range function for relations of any arity looks like this:

rel[&REST] range(rel[&A, *&REST] R) = { r | <a, *r> <- R };

Observe that the precise type of the range is returned, e.g.,

range({<1, “a”, false>, <2, “b”, true>}) => 
rel[str,bool]: {<“a”, false>, < “b”, true>}

Compose

The composition of two relations (both of arity of at least 2) could be defined as:

rel[&A, &C, *&R1, *&R2] compose(rel[&A, &B, *&R1] ab, rel[&B, &C, *&R2] bc) =
  { <a, c, *r1, *r2>  | <a, b, *r1> <- ab, <b, c, *r2> <- bc};

Note that this is an interesting example since the result type contains two multi-variables.

Move (using a positional field)

Define a move function that is applicable to any tuple that has a first field x and returns a new tuple with x incremented by 1. An example of such a tuple type is:

alias Point = tuple[int x, int y];

&T move(tuple[int x, *&R rest] p) = <p.x+1, *p.rest>;

Move (using a keyword field)

Define a move function that is applicable to any tuple that has a keyword field x, such as in:

alias Point = tuple[int y, int x=0];

&T move(tuple[int x = xdefault, *&R rest] p) = <x = p.x + 1, *p.rest> : p;

Davy’s Example

This example (see #94) is about passing keyword parameters to a recursive call of the same function. The solution is to use tuple splicing in the function signature.

alias Options = tuple[bool recursive = true, 
                      bool checkDependencies = true, 
                      bool calculateDuplicates = true];

public set[X] collect(Y src, *Options o) {
  if (file(f) := Y) {
    // do some stuff or not
    return x(..);
  }
  else if (recursive) {
    return { *collect(s, o) | s <- Y.files};
  }
  return {};
}

call: collect(|...|, calculateDuplicates = true);

Tijs’ Example

This is similar to Davy’s example in a slightly different setting:

alias Env = map[str,int];
alias EvalArgs = tuple[Env env, bool debug = false, 
                       bool trace = false, int optLevel = 0];

int eval(add(E1, E2), *EvalArgs a) = eval(E1, a) + eval(e2, a);
int eval(sub(E1, E2), *EvalArgs a) = eval(E1, a) - eval(e2, a);

Grammar of Graphics

The Grammar of Graphics is way of organizing data to gradually transform input data into a format that makes it simple to draw charts. Essentially, given relations are extended with auxiliary columns to represent the drawing information. Here is an example function geoms that takes a dataset as input and extends it with a field geom that indicates whether a circle or square should be drawn at each position (described by the required keyword fields x and y):

lrel[int x = xdefault, int y = ydefault, str geom, *&Rest] 
    geoms(lrel(int x = xdefault, int y = xdefault, *&Rest] dataset)   
    = [<x=xval, y=yval, geom=select(xval,yval), r>  | 
       <int x = xval, int y = yval, *r> <- dataset 
      ];

str select(int x, int y) = x < 0 ? “circle” : “square”;

Definition and Implementation

PDB

The following changes to the PDB are needed:

  • The tuple type has to be extended with keyword fields.
  • Concatenation on tuples has to be generalized.

Note: a more general issue is where to store this information (in the type or in the value). The implementation of keyword parameters/fields should be uniform for nodes, constructors and tuples.

Syntax for tuple type

The current definition of tuple type is:

TupleType = “tuple” “[“ {TypeArg “,”}+ “]”;
TypeArg   = Type | Type Name;

The new definition will be:

TupleType = “tuple” “[“ {TypeArg “,”}+ “]”;
TypeArg     = Type | Type Name | Type Name “=” Expression | “*” Type ;

Notes:

  • We may want to impose a restriction on the order of TypeArgs (e.g., positional ones first).
  • This definition allows more than one multi-variable, and they may occur at any position in the type. Some restrictions may be necessary here as well (e.g. allow only one multi-variable in a pattern)
  • The above definition opens up the possibility to have splicing for othe types as well (but does this have a meaningful interpretation?)

Rascal Interpreter

This will affect the following parts of the Rascal interpreter:

  • Tuple expressions
  • Tuple matching
  • Calls
  • Possibly nodes and ADTs (in order to harmonize the implementation)

Other Tools

  • Regeneration of the parser.
  • Extension of the type checker

Staging of Implementation

  • Implement extended tuples in PDB.
  • Extend Rascal syntax
  • Implement tuple expressions and tuple matching in Rascal interpreter
  • Adapt call code to use new tuples.
  • Adapt code for nodes, constructors, etc. to harmonize implementation.
  • Extend type checker.

Remaining Issues

Varyadic arguments for functions

In principle, one could add varyadic fields to tuples as well (using the same notation as we have for varyadic function arguments) but this makes it hard to infer the tuple type. For instance, consider <”abc”, 1, 2, 3> is its type:

  • tuple[str,int,int,int], or rather
  • tuple[str, int …], or rather
  • tuple[str, list[int]].

To avoid this confusion, we have not added varyadic field to tuples.

But how about functions with varyadic arguments? There are two solution directions:

  • Remove them from the language:
    • Pro: argument lists and tuples become indentical and this has many benefits both at the language level and at the implementation level.
    • Con: Makes some users unhappy (although it turns out that varyadic arguments are not used that much).
  • Allow them after an optional spliced tuple argument:
    • Pro: less changes to the language.
    • Con: We can achieve less uniformity.

The prototypical example of varyadic arguments is a print function:
void print(value v …) { … }

With a call like: print(1, “bc”, false);

If we remove varyadic arguments then the following alternatives are available to define the print function:

  • void print(list[value] vals) { … }; Call: print([1, “bc”, false]);
  • void print([*value vals]) { … } ; Call: print([1, “bc”, false]);
  • void print(*tuple[*&X] vals){ … }; Call: print(1, “bc”, false);

In the first two styles, vals is bound to a list of values, in the last one, to a tuple of value fields.

@ghost ghost assigned PaulKlint Mar 1, 2013
@jurgenvinju
Copy link
Member

Cool stuff. Lets talk about tuple equality. When are they equal and when
not?

I would prefer that tuples that do not have a given keyword parameter to be
equal to tuples that have the keyword parameter but set to default. This
would solve problems such as:

tuple[int w=0, int h = 0] t = <h=1>;

The tuple literal does not know or need to be aware of the fact that it
will be assigned to a larger tuple. Then:

tuple[int w=0, int h = 0] u = <h=1, w=0>;

u == t should be true even though the runtime tuple type of t does not see
or have the default value for w.

A solution would be to store the value of u the same as t, namely without
the value for w.

This came out of a discussion with Nastya.

On Friday, March 1, 2013, Paul Klint wrote:

A New Design for Tuples in Rascal Current Design

The current tuple type is defined by the type constructor tuple with a
list of ordered (optionally named) fields:
tuple[t1 n1, t2 n2, ... ].

An example is the type tuple[str name, int age]. The assignment B =
<"Belle", 3> assigns a tuple of this type to variable B.

The elements of a tuple can be accessed by

  • Position:B[0].
  • Field name: B.name (provided that the field name is present)

The following operations are available for tuples:

  • Tuple concatenation, using the + operator.
  • Tuple pattern matching.

Problems to be addressed

Proposed Design

The proposed design can be summarized as: "add unordered keyword fields
with default value to tuples and introduce tuple splicing". More precisely,
the arguments of the tuple constructor may be (in this order):

  • A possibly empty, ordered, list of positional, typed and named
    fields.
  • A possibly empty, unordered, list of keyword fields with default
    values.

The elements of a tuple value can be accessed as follows:

  • Positional arguments can be accessed by position and by name (when
    present).
  • Keyword fields can only be accessed by their name.

The following operations are available on these enhanced tuples:

  • Concatenation: positional fields are concatenated in order (as
    before), and the keyword fields are merged (with incompatible redefinitions
    being forbidden).
  • Tuple splicing: splice one tuple in another one. This preserves the
    order of positional fields and also takes care of placing positional fields
    befire keyword fields.
  • Tuple pattern matching, also allowing tuple splicing. We impose the
    restriction that a multi-variable may only occur as the last element of a
    tuple pattern.

For completeness sake, we may want support tuple substraction as well.
Some examples

Introduce CourseGrade to represent the grades for a course:

alias CourseGrade = tuple[str name,
list[int] attempts,
bool obligatory = true,
prerequisites = true];

Create a CourseGrade value:

CourseGrade g = <"algebra II", [4, 5, 7], obligatory = false>;

Define a function that computes average grades:

real avg(set[CourseGrade] grades) =
(0 | it + last(g.attempts) |
g <- grades, size(g.attempts) > 0) / size(grades));

Tuple concatenation:

< "algebra II", [4, 5, 7], obligatory = false> + <"john", parttime = true> =>
< "algebra II", [4, 5, 7], "John", obligatory = false, parttime = true>

Tuple splicing:
Assume A = <[4, 5, 7], prerequisites = true>, then

<"algebra II", obligatory = true, *A> =>
< "algebra II", [4, 5, 7], obligatory = true, prerequisites = true>

Tuple Matching:
The match

<str name, obligatory=O, *R> :=
< "algebra II", [4, 5, 7], obligatory = true, prerequisites = true>

will succeed and creates the bindings:

  • name == "algebra II",
  • O == true,
  • R = <[4,5,7], prerequisites= true>

Tuple Matching

We make the concept of tuple matching now a bit more precise. A tuple
pattern P and a tuple value T, match iff

If P contains no multi-variable:
- They have the same number of positional fields and corresponding
positional fields match.
- Each keyword field in P matches with a keyword field with the
same name in T.
-

If P does contain one multi-variable:
- Each positional field in P matches with a corresponding positional
field in T.
- Each keyword field in P matches with a keyword field with the
same name in T. *Unmatched postional fields and keyword fields are
collected in a tuple that is bound to the multi-variable.
-

If P contains more than one multi-variable:
- Not allowed

The following example matches will succeed:

  • <3, 4, b=true> := <3, 4, x=3.5, b=true>
  • <3, *R> := <3, 4, x=3.5, b=true> and R will be bound to <4, x=3.5,
    b=true>
  • <b=true, *R> := <3, 4, x=3.5, b=true> and R will be bound to <3, 4,
    x=3.5>

Function calls

Disregarding varyadic arguments (see the discussion below), function
signatures and tuples are identical in the proposed design. Both consist of
an ordered list of positional parameters followed by an unordered list of
keyword parameters/fields. Consider the following function:

real scale1(real x, real factor = 1.0, bool mul= true) =
mul ? x * factor : x / factor;

It can easily be rewritten as:

alias Scale = tuple[real x, real factor = 1.0, bool mul = true];
real scale2(Scale s) = s.mul ? s.x * s.factor : s.x / s.factor;

The only inconvenience is that we have to prefix every use of a field with
"s." To solve this, we want tuple splicing at the level of argument lists
(which is completely natural if we consider the argument list as a tuple).
For instance,

real scale3(*Scale s) = mul ? x * factor : x / factor;

The function scale3 can be called in two ways:

  • x1 = scale3(3.5, factor=3);
  • Scale s = <3.5, factor=3); x1 = scale3(s);

Now suppose the arguments of a function are coming both from direct
arguments of the function and from arguments coming from a tuple argument.
Then we have to concatenate the direct arguments and the spliced-in tuple
arguments as follows:

real scale4(bool printit=false, *Scale s) {
r = mul ? x * factor : x / factor;
if(printit) println("scale4 returns ");
return r;
}

Typical calls to scale4 are:

  • x1 = scale4(3.5, printit=true, factor=3);
  • Scale s = <3.5, factor=3); x1 = scale4(printit=true, s);

Examples Domain

The domain function for relations of any arity looks like this:

set[&A] domain(rel[&A, *&REST] R) = { a | <a, *r> <- R };

Range

The range function for relations of any arity looks like this:

rel[&REST] range(rel[&A, *&REST] R) = { r | <a, *r> <- R };

Observe that the precise type of the range is returned, e.g.,

range({<1, "a", false>, <2, "b", true>}) =>
rel[str,bool]: {<"a", false>, < "b", true>}

Compose

The composition of two relations (both of arity of at least 2) could be
defined as:

rel[&A, &C, *&R1, *&R2] compose(rel[&A, &B, *&R1] ab, rel[&B, &C, *&R2] bc) =
{ <a, c, *r1, *r2> | <a, b, *r1> <- ab, <b, c, *r2> <- bc};

Note that this is an interesting example since the result type contains
two multi-variables.
Move (using a positional field)

Define a move function that is applicable to any tuple that has a first
field x and returns a new tuple with x incremented by 1. An example of such
a tuple type is:

alias Point = tuple[int x, int y];

&T move(tuple[int x, *&R rest] p) = <p.x+1, *p.rest>;

Move (using a keyword field)

Define a move function that is applicable to any tuple that has a keyword
field x, such as in:

alias Point = tuple[int y, int x=0];

&T move(tuple[int x = xdefault, *&R rest] p) = <x = p.x + 1, *p.rest> : p;

Davy's Example

This example (see #94 #94) is
about passing keyword parameters to a recursive call of the same function.
The solution is to use tuple splicing in the function signature.

alias Options = tuple[bool recursive = true,
bool checkDependencies = true,
bool calculateDuplicates = true];

public set[X] collect(Y src, *Options o) {
if (file(f) := Y) {
// do some stuff or not
return x(..);
}
else if (recursive) {
return { *collect(s, o) | s <- Y.files};
}
return {};
}

call: collect(|...|, calculateDuplicates = true);

Tijs' Example

This is similar to Davy's example in a slightly different setting:

alias Env = map[str,int];
alias EvalArgs = tuple[Env env, bool debug = false,
bool trace = false, int optLevel = 0];

int eval(add(E1, E2), *EvalArgs a) = eval(E1, a) + eval(e2, a);
int eval(sub(E1, E2), *EvalArgs a) = eval(E1, a) - eval(e2, a);

Grammar of Graphics

The Grammar of Graphics is way of organizing data to gradually transform
input data into a format that makes it simple to draw charts. Essentially,
given relations are extended with auxiliary columns to represent the
drawing information. Here is an example function geoms that takes a
dataset as input and extends it with a field geom that indicates whether
a circle or square should be drawn at each position (described by the
required keyword fields x and y):

lrel[int x = xdefault, int y = ydefault, str geom, *&Rest]
geoms(lrel(int x = xdefault, int y = xdefault, *&Rest] dataset)
= [<x=xval, y=yval, geom=select(xval,yval), r> |
<int x = xval, int y = yval, *r> <- dataset
];

str select(int x, int y) = x < 0 ? "circle" : "square";

Definition and Implementation PDB

The following changes to the PDB are needed:

  • The tuple type has to be extended with keyword fields.
  • Concatenation on tuples has to be generalized.

Note: a more general issue is where to store this information (in the type
or in the value). The implementation of keyword parameters/fields should be
uniform for nodes, constructors and tuples.
Syntax for tuple type

The current definition of tuple type is:

TupleType = "tuple" "[" {TypeArg ","}+ "]";
TypeArg = Type | Type Name;

The new definition will be:

TupleType = "tuple" "[" {TypeArg ","}+ "]";
TypeArg = Type | Type Name | Type Name "=" Expression | "*" Type ;

Notes:

  • We may want to impose a restriction on the order of TypeArgs (e.g.,
    positional ones first).
  • This definition allows more than one multi-variable, and they may
    occur at any position in the type. Some restrictions may be necessary here
    as well (e.g. allow only one multi-variable in a pattern)
  • The above definition opens up the possibility to have splicing for
    othe types as well (but does this have a meaningful interpretation?)

Rascal Interpreter

This will affect the following parts of the Rascal interpreter:

  • Tuple expressions
  • Tuple matching
  • Calls
  • Possibly nodes and ADTs (in order to harmonize the implementation)

Other Tools

  • Regeneration of the parser.
  • Extension of the type checker

Staging of Implementation

  • Implement extended tuples in PDB.
  • Extend Rascal syntax
  • Implement tuple expressions and tuple matching in Rascal interpreter
  • Adapt call code to use new tuples.
  • Adapt code for nodes, constructors, etc. to harmonize implementation.
  • Extend type checker.

Remaining Issues Varyadic arguments for functions

In principle, one could add varyadic fields to tuples as well (using the
same notation as we have for varyadic function arguments) but this makes it
hard to infer the tuple type. For instance, consider <"abc", 1, 2, 3> is
its type:

  • tuple[str,int,int,int], or rather
  • tuple[str, int ...], or rather
  • tuple[str, list[int]].

To avoid this confusion, we have not added varyadic field to tuples.

But how about functions with varyadic arguments? There are two solution
directions:

  • Remove them from the language:
    • Pro: argument lists and tuples become indentical and this has
      many benefits both at the language level and at the implementation level.
    • Con: Makes some users unhappy (although it turns out that
      varyadic arguments are not used that much).
      • Allow them after an optional spliced tuple argument:
    • Pro: less changes to the language.
    • Con: We can achieve less uniformity.

The prototypical example of varyadic arguments is a print function:
void print(value v ...) { ... }

With a call like: print(1, "bc", false);

If we remove varyadic arguments then the following alternatives are
available to define the print function:

void print(list[value] vals) { ... }; Call: print([1, "bc", false]);

void print([*value vals]) { ... } ; Call: print([1, "bc", false]);

void print(tuple[&X] vals){ ... }; Call: print(1, "bc", false);

In the first two styles, vals is bound to a list of values, in the last
one, to a tuple of value fields.

Reply to this email directly or view it on GitHubhttps://github.com//issues/154
.

Jurgen Vinju

@jurgenvinju
Copy link
Member

I think variadic argument lists are most useful in languages that do not
have list or array literals. I could do without for simplicity sake.
Overloading is problematic with these things in the type checker for
example.

On Friday, March 1, 2013, Paul Klint wrote:

A New Design for Tuples in Rascal Current Design

The current tuple type is defined by the type constructor tuple with a
list of ordered (optionally named) fields:
tuple[t1 n1, t2 n2, ... ].

An example is the type tuple[str name, int age]. The assignment B =
<"Belle", 3> assigns a tuple of this type to variable B.

The elements of a tuple can be accessed by

  • Position:B[0].
  • Field name: B.name (provided that the field name is present)

The following operations are available for tuples:

  • Tuple concatenation, using the + operator.
  • Tuple pattern matching.

Problems to be addressed

Proposed Design

The proposed design can be summarized as: "add unordered keyword fields
with default value to tuples and introduce tuple splicing". More precisely,
the arguments of the tuple constructor may be (in this order):

  • A possibly empty, ordered, list of positional, typed and named
    fields.
  • A possibly empty, unordered, list of keyword fields with default
    values.

The elements of a tuple value can be accessed as follows:

  • Positional arguments can be accessed by position and by name (when
    present).
  • Keyword fields can only be accessed by their name.

The following operations are available on these enhanced tuples:

  • Concatenation: positional fields are concatenated in order (as
    before), and the keyword fields are merged (with incompatible redefinitions
    being forbidden).
  • Tuple splicing: splice one tuple in another one. This preserves the
    order of positional fields and also takes care of placing positional fields
    befire keyword fields.
  • Tuple pattern matching, also allowing tuple splicing. We impose the
    restriction that a multi-variable may only occur as the last element of a
    tuple pattern.

For completeness sake, we may want support tuple substraction as well.
Some examples

Introduce CourseGrade to represent the grades for a course:

alias CourseGrade = tuple[str name,
list[int] attempts,
bool obligatory = true,
prerequisites = true];

Create a CourseGrade value:

CourseGrade g = <"algebra II", [4, 5, 7], obligatory = false>;

Define a function that computes average grades:

real avg(set[CourseGrade] grades) =
(0 | it + last(g.attempts) |
g <- grades, size(g.attempts) > 0) / size(grades));

Tuple concatenation:

< "algebra II", [4, 5, 7], obligatory = false> + <"john", parttime = true> =>
< "algebra II", [4, 5, 7], "John", obligatory = false, parttime = true>

Tuple splicing:
Assume A = <[4, 5, 7], prerequisites = true>, then

<"algebra II", obligatory = true, *A> =>
< "algebra II", [4, 5, 7], obligatory = true, prerequisites = true>

Tuple Matching:
The match

<str name, obligatory=O, *R> :=
< "algebra II", [4, 5, 7], obligatory = true, prerequisites = true>

will succeed and creates the bindings:

  • name == "algebra II",
  • O == true,
  • R = <[4,5,7], prerequisites= true>

Tuple Matching

We make the concept of tuple matching now a bit more precise. A tuple
pattern P and a tuple value T, match iff

If P contains no multi-variable:
- They have the same number of positional fields and corresponding
positional fields match.
- Each keyword field in P matches with a keyword field with the
same name in T.
-

If P does contain one multi-variable:
- Each positional field in P matches with a corresponding positional
field in T.
- Each keyword field in P matches with a keyword field with the
same name in T. *Unmatched postional fields and keyword fields are
collected in a tuple that is bound to the multi-variable.
-

If P contains more than one multi-variable:
- Not allowed

The following example matches will succeed:

  • <3, 4, b=true> := <3, 4, x=3.5, b=true>
  • <3, *R> := <3, 4, x=3.5, b=true> and R will be bound to <4, x=3.5,
    b=true>
  • <b=true, *R> := <3, 4, x=3.5, b=true> and R will be bound to <3, 4,
    x=3.5>

Function calls

Disregarding varyadic arguments (see the discussion below), function
signatures and tuples are identical in the proposed design. Both consist of
an ordered list of positional parameters followed by an unordered list of
keyword parameters/fields. Consider the following function:

real scale1(real x, real factor = 1.0, bool mul= true) =
mul ? x * factor : x / factor;

It can easily be rewritten as:

alias Scale = tuple[real x, real factor = 1.0, bool mul = true];
real scale2(Scale s) = s.mul ? s.x * s.factor : s.x / s.factor;

The only inconvenience is that we have to prefix every use of a field with
"s." To solve this, we want tuple splicing at the level of argument lists
(which is completely natural if we consider the argument list as a tuple).
For instance,

real scale3(*Scale s) = mul ? x * factor : x / factor;

The function scale3 can be called in two ways:

  • x1 = scale3(3.5, factor=3);
  • Scale s = <3.5, factor=3); x1 = scale3(s);

Now suppose the arguments of a function are coming both from direct
arguments of the function and from arguments coming from a tuple argument.
Then we have to concatenate the direct arguments and the spliced-in tuple
arguments as follows:

real scale4(bool printit=false, *Scale s) {
r = mul ? x * factor : x / factor;
if(printit) println("scale4 returns ");
return r;
}

Typical calls to scale4 are:

  • x1 = scale4(3.5, printit=true, factor=3);
  • Scale s = <3.5, factor=3); x1 = scale4(printit=true, s);

Examples Domain

The domain function for relations of any arity looks like this:

set[&A] domain(rel[&A, *&REST] R) = { a | <a, *r> <- R };

Range

The range function for relations of any arity looks like this:

rel[&REST] range(rel[&A, *&REST] R) = { r | <a, *r> <- R };

Observe that the precise type of the range is returned, e.g.,

range({<1, "a", false>, <2, "b", true>}) =>
rel[str,bool]: {<"a", false>, < "b", true>}

Compose

The composition of two relations (both of arity of at least 2) could be
defined as:

rel[&A, &C, *&R1, *&R2] compose(rel[&A, &B, *&R1] ab, rel[&B, &C, *&R2] bc) =
{ <a, c, *r1, *r2> | <a, b, *r1> <- ab, <b, c, *r2> <- bc};

Note that this is an interesting example since the result type contains
two multi-variables.
Move (using a positional field)

Define a move function that is applicable to any tuple that has a first
field x and returns a new tuple with x incremented by 1. An example of such
a tuple type is:

alias Point = tuple[int x, int y];

&T move(tuple[int x, *&R rest] p) = <p.x+1, *p.rest>;

Move (using a keyword field)

Define a move function that is applicable to any tuple that has a keyword
field x, such as in:

alias Point = tuple[int y, int x=0];

&T move(tuple[int x = xdefault, *&R rest] p) = <x = p.x + 1, *p.rest> : p;

Davy's Example

This example (see #94 #94) is
about passing keyword parameters to a recursive call of the same function.
The solution is to use tuple splicing in the function signature.

alias Options = tuple[bool recursive = true,
bool checkDependencies = true,
bool calculateDuplicates = true];

public set[X] collect(Y src, *Options o) {
if (file(f) := Y) {
// do some stuff or not
return x(..);
}
else if (recursive) {
return { *collect(s, o) | s <- Y.files};
}
return {};
}

call: collect(|...|, calculateDuplicates = true);

Tijs' Example

This is similar to Davy's example in a slightly different setting:

alias Env = map[str,int];
alias EvalArgs = tuple[Env env, bool debug = false,
bool trace = false, int optLevel = 0];

int eval(add(E1, E2), *EvalArgs a) = eval(E1, a) + eval(e2, a);
int eval(sub(E1, E2), *EvalArgs a) = eval(E1, a) - eval(e2, a);

Grammar of Graphics

The Grammar of Graphics is way of organizing data to gradually transform
input data into a format that makes it simple to draw charts. Essentially,
given relations are extended with auxiliary columns to represent the
drawing information. Here is an example function geoms that takes a
dataset as input and extends it with a field geom that indicates whether
a circle or square should be drawn at each position (described by the
required keyword fields x and y):

lrel[int x = xdefault, int y = ydefault, str geom, *&Rest]
geoms(lrel(int x = xdefault, int y = xdefault, *&Rest] dataset)
= [<x=xval, y=yval, geom=select(xval,yval), r> |
<int x = xval, int y = yval, *r> <- dataset
];

str select(int x, int y) = x < 0 ? "circle" : "square";

Definition and Implementation PDB

The following changes to the PDB are needed:

  • The tuple type has to be extended with keyword fields.
  • Concatenation on tuples has to be generalized.

Note: a more general issue is where to store this information (in the type
or in the value). The implementation of keyword parameters/fields should be
uniform for nodes, constructors and tuples.
Syntax for tuple type

The current definition of tuple type is:

TupleType = "tuple" "[" {TypeArg ","}+ "]";
TypeArg = Type | Type Name;

The new definition will be:

TupleType = "tuple" "[" {TypeArg ","}+ "]";
TypeArg = Type | Type Name | Type Name "=" Expression | "*" Type ;

Notes:

  • We may want to impose a restriction on the order of TypeArgs (e.g.,
    positional ones first).
  • This definition allows more than one multi-variable, and they may
    occur at any position in the type. Some restrictions may be necessary here
    as well (e.g. allow only one multi-variable in a pattern)
  • The above definition opens up the possibility to have splicing for
    othe types as well (but does this have a meaningful interpretation?)

Rascal Interpreter

This will affect the following parts of the Rascal interpreter:

  • Tuple expressions
  • Tuple matching
  • Calls
  • Possibly nodes and ADTs (in order to harmonize the implementation)

Other Tools

  • Regeneration of the parser.
  • Extension of the type checker

Staging of Implementation

  • Implement extended tuples in PDB.
  • Extend Rascal syntax
  • Implement tuple expressions and tuple matching in Rascal interpreter
  • Adapt call code to use new tuples.
  • Adapt code for nodes, constructors, etc. to harmonize implementation.
  • Extend type checker.

Remaining Issues Varyadic arguments for functions

In principle, one could add varyadic fields to tuples as well (using the
same notation as we have for varyadic function arguments) but this makes it
hard to infer the tuple type. For instance, consider <"abc", 1, 2, 3> is
its type:

  • tuple[str,int,int,int], or rather
  • tuple[str, int ...], or rather
  • tuple[str, list[int]].

To avoid this confusion, we have not added varyadic field to tuples.

But how about functions with varyadic arguments? There are two solution
directions:

  • Remove them from the language:
    • Pro: argument lists and tuples become indentical and this has
      many benefits both at the language level and at the implementation level.
    • Con: Makes some users unhappy (although it turns out that
      varyadic arguments are not used that much).
      • Allow them after an optional spliced tuple argument:
    • Pro: less changes to the language.
    • Con: We can achieve less uniformity.

The prototypical example of varyadic arguments is a print function:
void print(value v ...) { ... }

With a call like: print(1, "bc", false);

If we remove varyadic arguments then the following alternatives are
available to define the print function:

void print(list[value] vals) { ... }; Call: print([1, "bc", false]);

void print([*value vals]) { ... } ; Call: print([1, "bc", false]);

void print(tuple[&X] vals){ ... }; Call: print(1, "bc", false);

In the first two styles, vals is bound to a list of values, in the last
one, to a tuple of value fields.

Reply to this email directly or view it on GitHubhttps://github.com//issues/154
.

Jurgen Vinju

@jurgenvinju
Copy link
Member

Is row polymorphism only for the nonpositional fields? Otherwise i could
imagine issues.

On Friday, March 1, 2013, Paul Klint wrote:

A New Design for Tuples in Rascal Current Design

The current tuple type is defined by the type constructor tuple with a
list of ordered (optionally named) fields:
tuple[t1 n1, t2 n2, ... ].

An example is the type tuple[str name, int age]. The assignment B =
<"Belle", 3> assigns a tuple of this type to variable B.

The elements of a tuple can be accessed by

  • Position:B[0].
  • Field name: B.name (provided that the field name is present)

The following operations are available for tuples:

  • Tuple concatenation, using the + operator.
  • Tuple pattern matching.

Problems to be addressed

Proposed Design

The proposed design can be summarized as: "add unordered keyword fields
with default value to tuples and introduce tuple splicing". More precisely,
the arguments of the tuple constructor may be (in this order):

  • A possibly empty, ordered, list of positional, typed and named
    fields.
  • A possibly empty, unordered, list of keyword fields with default
    values.

The elements of a tuple value can be accessed as follows:

  • Positional arguments can be accessed by position and by name (when
    present).
  • Keyword fields can only be accessed by their name.

The following operations are available on these enhanced tuples:

  • Concatenation: positional fields are concatenated in order (as
    before), and the keyword fields are merged (with incompatible redefinitions
    being forbidden).
  • Tuple splicing: splice one tuple in another one. This preserves the
    order of positional fields and also takes care of placing positional fields
    befire keyword fields.
  • Tuple pattern matching, also allowing tuple splicing. We impose the
    restriction that a multi-variable may only occur as the last element of a
    tuple pattern.

For completeness sake, we may want support tuple substraction as well.
Some examples

Introduce CourseGrade to represent the grades for a course:

alias CourseGrade = tuple[str name,
list[int] attempts,
bool obligatory = true,
prerequisites = true];

Create a CourseGrade value:

CourseGrade g = <"algebra II", [4, 5, 7], obligatory = false>;

Define a function that computes average grades:

real avg(set[CourseGrade] grades) =
(0 | it + last(g.attempts) |
g <- grades, size(g.attempts) > 0) / size(grades));

Tuple concatenation:

< "algebra II", [4, 5, 7], obligatory = false> + <"john", parttime = true> =>
< "algebra II", [4, 5, 7], "John", obligatory = false, parttime = true>

Tuple splicing:
Assume A = <[4, 5, 7], prerequisites = true>, then

<"algebra II", obligatory = true, *A> =>
< "algebra II", [4, 5, 7], obligatory = true, prerequisites = true>

Tuple Matching:
The match

<str name, obligatory=O, *R> :=
< "algebra II", [4, 5, 7], obligatory = true, prerequisites = true>

will succeed and creates the bindings:

  • name == "algebra II",
  • O == true,
  • R = <[4,5,7], prerequisites= true>

Tuple Matching

We make the concept of tuple matching now a bit more precise. A tuple
pattern P and a tuple value T, match iff

If P contains no multi-variable:
- They have the same number of positional fields and corresponding
positional fields match.
- Each keyword field in P matches with a keyword field with the
same name in T.
-

If P does contain one multi-variable:
- Each positional field in P matches with a corresponding positional
field in T.
- Each keyword field in P matches with a keyword field with the
same name in T. *Unmatched postional fields and keyword fields are
collected in a tuple that is bound to the multi-variable.
-

If P contains more than one multi-variable:
- Not allowed

The following example matches will succeed:

  • <3, 4, b=true> := <3, 4, x=3.5, b=true>
  • <3, *R> := <3, 4, x=3.5, b=true> and R will be bound to <4, x=3.5,
    b=true>
  • <b=true, *R> := <3, 4, x=3.5, b=true> and R will be bound to <3, 4,
    x=3.5>

Function calls

Disregarding varyadic arguments (see the discussion below), function
signatures and tuples are identical in the proposed design. Both consist of
an ordered list of positional parameters followed by an unordered list of
keyword parameters/fields. Consider the following function:

real scale1(real x, real factor = 1.0, bool mul= true) =
mul ? x * factor : x / factor;

It can easily be rewritten as:

alias Scale = tuple[real x, real factor = 1.0, bool mul = true];
real scale2(Scale s) = s.mul ? s.x * s.factor : s.x / s.factor;

The only inconvenience is that we have to prefix every use of a field with
"s." To solve this, we want tuple splicing at the level of argument lists
(which is completely natural if we consider the argument list as a tuple).
For instance,

real scale3(*Scale s) = mul ? x * factor : x / factor;

The function scale3 can be called in two ways:

  • x1 = scale3(3.5, factor=3);
  • Scale s = <3.5, factor=3); x1 = scale3(s);

Now suppose the arguments of a function are coming both from direct
arguments of the function and from arguments coming from a tuple argument.
Then we have to concatenate the direct arguments and the spliced-in tuple
arguments as follows:

real scale4(bool printit=false, *Scale s) {
r = mul ? x * factor : x / factor;
if(printit) println("scale4 returns ");
return r;
}

Typical calls to scale4 are:

  • x1 = scale4(3.5, printit=true, factor=3);
  • Scale s = <3.5, factor=3); x1 = scale4(printit=true, s);

Examples Domain

The domain function for relations of any arity looks like this:

set[&A] domain(rel[&A, *&REST] R) = { a | <a, *r> <- R };

Range

The range function for relations of any arity looks like this:

rel[&REST] range(rel[&A, *&REST] R) = { r | <a, *r> <- R };

Observe that the precise type of the range is returned, e.g.,

range({<1, "a", false>, <2, "b", true>}) =>
rel[str,bool]: {<"a", false>, < "b", true>}

Compose

The composition of two relations (both of arity of at least 2) could be
defined as:

rel[&A, &C, *&R1, *&R2] compose(rel[&A, &B, *&R1] ab, rel[&B, &C, *&R2] bc) =
{ <a, c, *r1, *r2> | <a, b, *r1> <- ab, <b, c, *r2> <- bc};

Note that this is an interesting example since the result type contains
two multi-variables.
Move (using a positional field)

Define a move function that is applicable to any tuple that has a first
field x and returns a new tuple with x incremented by 1. An example of such
a tuple type is:

alias Point = tuple[int x, int y];

&T move(tuple[int x, *&R rest] p) = <p.x+1, *p.rest>;

Move (using a keyword field)

Define a move function that is applicable to any tuple that has a keyword
field x, such as in:

alias Point = tuple[int y, int x=0];

&T move(tuple[int x = xdefault, *&R rest] p) = <x = p.x + 1, *p.rest> : p;

Davy's Example

This example (see #94 #94) is
about passing keyword parameters to a recursive call of the same function.
The solution is to use tuple splicing in the function signature.

alias Options = tuple[bool recursive = true,
bool checkDependencies = true,
bool calculateDuplicates = true];

public set[X] collect(Y src, *Options o) {
if (file(f) := Y) {
// do some stuff or not
return x(..);
}
else if (recursive) {
return { *collect(s, o) | s <- Y.files};
}
return {};
}

call: collect(|...|, calculateDuplicates = true);

Tijs' Example

This is similar to Davy's example in a slightly different setting:

alias Env = map[str,int];
alias EvalArgs = tuple[Env env, bool debug = false,
bool trace = false, int optLevel = 0];

int eval(add(E1, E2), *EvalArgs a) = eval(E1, a) + eval(e2, a);
int eval(sub(E1, E2), *EvalArgs a) = eval(E1, a) - eval(e2, a);

Grammar of Graphics

The Grammar of Graphics is way of organizing data to gradually transform
input data into a format that makes it simple to draw charts. Essentially,
given relations are extended with auxiliary columns to represent the
drawing information. Here is an example function geoms that takes a
dataset as input and extends it with a field geom that indicates whether
a circle or square should be drawn at each position (described by the
required keyword fields x and y):

lrel[int x = xdefault, int y = ydefault, str geom, *&Rest]
geoms(lrel(int x = xdefault, int y = xdefault, *&Rest] dataset)
= [<x=xval, y=yval, geom=select(xval,yval), r> |
<int x = xval, int y = yval, *r> <- dataset
];

str select(int x, int y) = x < 0 ? "circle" : "square";

Definition and Implementation PDB

The following changes to the PDB are needed:

  • The tuple type has to be extended with keyword fields.
  • Concatenation on tuples has to be generalized.

Note: a more general issue is where to store this information (in the type
or in the value). The implementation of keyword parameters/fields should be
uniform for nodes, constructors and tuples.
Syntax for tuple type

The current definition of tuple type is:

TupleType = "tuple" "[" {TypeArg ","}+ "]";
TypeArg = Type | Type Name;

The new definition will be:

TupleType = "tuple" "[" {TypeArg ","}+ "]";
TypeArg = Type | Type Name | Type Name "=" Expression | "*" Type ;

Notes:

  • We may want to impose a restriction on the order of TypeArgs (e.g.,
    positional ones first).
  • This definition allows more than one multi-variable, and they may
    occur at any position in the type. Some restrictions may be necessary here
    as well (e.g. allow only one multi-variable in a pattern)
  • The above definition opens up the possibility to have splicing for
    othe types as well (but does this have a meaningful interpretation?)

Rascal Interpreter

This will affect the following parts of the Rascal interpreter:

  • Tuple expressions
  • Tuple matching
  • Calls
  • Possibly nodes and ADTs (in order to harmonize the implementation)

Other Tools

  • Regeneration of the parser.
  • Extension of the type checker

Staging of Implementation

  • Implement extended tuples in PDB.
  • Extend Rascal syntax
  • Implement tuple expressions and tuple matching in Rascal interpreter
  • Adapt call code to use new tuples.
  • Adapt code for nodes, constructors, etc. to harmonize implementation.
  • Extend type checker.

Remaining Issues Varyadic arguments for functions

In principle, one could add varyadic fields to tuples as well (using the
same notation as we have for varyadic function arguments) but this makes it
hard to infer the tuple type. For instance, consider <"abc", 1, 2, 3> is
its type:

  • tuple[str,int,int,int], or rather
  • tuple[str, int ...], or rather
  • tuple[str, list[int]].

To avoid this confusion, we have not added varyadic field to tuples.

But how about functions with varyadic arguments? There are two solution
directions:

  • Remove them from the language:
    • Pro: argument lists and tuples become indentical and this has
      many benefits both at the language level and at the implementation level.
    • Con: Makes some users unhappy (although it turns out that
      varyadic arguments are not used that much).
      • Allow them after an optional spliced tuple argument:
    • Pro: less changes to the language.
    • Con: We can achieve less uniformity.

The prototypical example of varyadic arguments is a print function:
void print(value v ...) { ... }

With a call like: print(1, "bc", false);

If we remove varyadic arguments then the following alternatives are
available to define the print function:

void print(list[value] vals) { ... }; Call: print([1, "bc", false]);

void print([*value vals]) { ... } ; Call: print([1, "bc", false]);

void print(tuple[&X] vals){ ... }; Call: print(1, "bc", false);

In the first two styles, vals is bound to a list of values, in the last
one, to a tuple of value fields.

Reply to this email directly or view it on GitHubhttps://github.com//issues/154
.

Jurgen Vinju

@PaulKlint
Copy link
Member Author

Row polymorphism works both for positional and keyword fields. This is illustrated by the example Tuple matching. In

<str name, obligatory=O, *R> := 
< “algebra II”, [4, 5, 7], obligatory = true, prerequisites = true> 

the success of the match is determined by a positional field (name) and a keyword field (obligatory) and
R will be bound to <[4,5,7], prerequisites= true>, i.e., also fields of both kinds.

@PaulKlint
Copy link
Member Author

I agree with the treatement of equality as @jurgenvinju proposes. It differs from how this is done in the current implementation of default values (the default case is also stored in the runtime value).

@jurgenvinju
Copy link
Member

Ok that looks good. I am worried about field projection from a relation
where not all tuples have all fields. Will we special case this to add
default values?

On Saturday, March 2, 2013, Paul Klint wrote:

Row polymorphism works both for positional and keyword fields. This is
illustrated by the example Tuple matching. In

<str name, obligatory=O, *R> :=
< “algebra II”, [4, 5, 7], obligatory = true, prerequisites = true>

the success of the match is determined by a positional field (name) and a
keyword field (obligatory) and
R will be bound to <[4,5,7], prerequisites= true>, i.e., also fields of
both kinds.


Reply to this email directly or view it on GitHubhttps://github.com//issues/154#issuecomment-14325632
.

Jurgen Vinju

@PaulKlint
Copy link
Member Author

Can you be more precise? The phrase "a relation where not all tuples have all fields" is unclear. In the tuples as proposed:

  • all fields are always "present" (irrespective of where the field's value is stored in the case that the value is equal to the default value).
  • all positional fields have to be specified when the tuple value is created.
  • keyword fields may be specified when the tuple value is created.

It may be that you are just worrying about implementation details. Given

alias CourseGrade = tuple[str name, 
                          list[int] attempts, 
                          bool obligatory = true, 
                          prerequisites = true];
CourseGrade g = <“algebra II”, [4, 5, 7], obligatory = false>;

theng.obligatory will just return false. This will also be the case for tuples that occur in (list) relations and the values of those fields will occur in any operation/selection we apply to that relation.

@swatbot
Copy link

swatbot commented Mar 2, 2013

Yes, only details that i am worried abput.

On Saturday, March 2, 2013, Paul Klint wrote:

Can you be more precise? The phrase "a relation where not all tuples have
all fields" is unclear. In the tuples as proposed:

  • all fields are always "present" (irrespective of where the field's
    value is stored in the case that the value is equal to the default value).
  • all positional fields have to be specified when the tuple value is
    created.
  • keyword fields may be specified when the tuple value is created.

It may be that you are just worrying about implementation details. Given

alias CourseGrade = tuple[str name,
list[int] attempts,
bool obligatory = true,
prerequisites = true];
CourseGrade g = <“algebra II”, [4, 5, 7], obligatory = false>;

theng.obligatory will just return false. This will also be the case for
tuples that occur in (list) relations and the values of those fields will
occur in any operation/selection we apply to that relation.


Reply to this email directly or view it on GitHubhttps://github.com//issues/154#issuecomment-14329068
.

Jurgen Vinju

@atzeus
Copy link

atzeus commented Mar 4, 2013

Looks good! I'm a bit confused on when a field is positional or not. Is a field a keyword field (i.e. not positional) if it has a default value? If so, the fact that a field has a default value is part of the type? Or is the default value itself part of the type. If so are two types with the same fields but different default values equalivalent or not?

Also, i am in favour of removing varargs, it's just too messy :)

@atzeus
Copy link

atzeus commented Mar 4, 2013

I propose we make the default values part of the type (they have to be), but do not include them in type equality, that is two tuple types are the same if they have the same keyword arguments, with the same types, but their default values may be different. This then allows us to have multiple aliases for the same type with different default values.

@jurgenvinju
Copy link
Member

@atzeus that was my first thought to, but that it not going to be modular and representations will differ at run-time, so sharing is impossible to get fast. They should be part of static types, but not of dynamic run-time types in order to allow transparent substitution if the values are equal.

Multiple aliases for the same type with different default values is already possible without this BTW. The static type system decides which default value to produce based on the receiver type. The run-time then checks whether the value has the keyword parameter or not. If it does, it projects this value from the tuple, if not the expression has been annotated by the type checker with necessary default value.

@tvdstorm
Copy link
Member

tvdstorm commented Mar 4, 2013

One remark (unfortunately, not inline): tuple concatenation should work as map concat: values of keyword params in right-hand side override the values of the keyword args in the left if there overlapping keys.

@tvdstorm
Copy link
Member

tvdstorm commented Mar 4, 2013

Would it be possible to extend tuple types with keyword params later? E.g. given

alias EvalArgs = tuple[Env env, bool trace = false];

And then in another module:

alias EvalArgs = tuple[Env env, Heap heap = (), bool trace = false];

Maybe a bit weird using alias but definitely useful.

@swatbot
Copy link

swatbot commented Mar 4, 2013

I think that works out of the box due to shadowing and subtyping.

On Monday, March 4, 2013, Tijs van der Storm wrote:

Would it be possible to extend tuple types with keyword params later? E.g.
given

alias EvalArgs = tuple[Env env, bool trace = false;

And then in another module:

alias EvalArgs = tuple[Env env, Heap heap = (), bool trace = false];

Maybe a bit weird using alias but definitely useful.


Reply to this email directly or view it on GitHubhttps://github.com//issues/154#issuecomment-14371858
.

Jurgen Vinju

@PaulKlint
Copy link
Member Author

@atzeus yes a field (or parameter) is a keyword field/parameter since it has a default value and that default value is part of the type.

@jurgenvinju / @tvdstorm I am not yet completely sure how this aliasing business will work out but let's be optimistic.

@tvdstorm you are right about tuple concatenation.

@PaulKlint
Copy link
Member Author

Further Analysis of Proposed Tuple Types

_Anastasia Izmaylova, Paul Klint_

Abstract

In “A new Design for Tuples in Rascal” (see #154) we have proposed extensions to Rascal tuples. This design aims at (1) combining conventional tuples and extensible records, and (2) unifying concepts of tuples and function/constructor signatures with keyword parameters, which will enable more concise and extensible code. Here we discuss some implications for static and dynamic semantics. First, we outline our design, and then we give concrete examples to demonstrate the design and some implications.

Outline of the design

The basic idea of the extension is to allow additional fields in tuple types and tuple values in the style of keyword parameters. We call them _keyword fields_. Thus, we enrich tuple semantics with

  • Keyword fields without default value.
    • This implies that: (a) tuple values may have unordered, labelled elements (fields); and (b) tuple types may have keyword fields that are unordered, labelled and typed.
    • _Note_ that types inferred for tuple literals and tuple expressions are always tuple types without default value.
    • _Note_ that types with keyword fields without default value require the presence of the respective keyword fields in tuple values, and the types of the field values are subtypes of their declared types.
    • _Note_ that subtyping for keyword fields implies both width subtyping (subtype tuple types always have more keyword fields) and depth subtyping. The resulting subtyping for enriched tuples combines now both subtyping for positional elements and keyword fields.
  • Keyword fields with default value.
    • Tuple types may have keyword fields that are unordered, labelled, and typed and specify a default value.
    • _Note_ that tuple types with keyword fields with default value exist only statically.
    • _Note_ that tuple types with keyword fields with default value do not require the presence of the respective fields in tuple values. A tuple type with a keyword field with default value is a supertype of the tuple type with the same keyword field but without default value.
    • _Interestingly, if the declared type of an expression is a tuple type with a keyword field with default value, and that field is present in the actual tuple value of the expression, then there is not necessarily a subtype relation between the declared field type and the actual field value. This follows from the motivation of introducing keyword fields with default value in tuple types - to allow _type substitutability as follows: (1) the expression of a type that has more keyword fields (with or without default values) can be used in place where a type with less keyword fields is required; _and_ (2) the expression of a type that has less keyword fields can still be used in place where a type with more keyword fields and specified default values is required. So, in general, such substitutability is possible if dropping keyword fields with default values from both types gives the same type. This brings the notion of _overriding_ of tuple fields with statically declared keyword fields with default value.
  • Patterns, splicing and row polymorphism are also defined for tuples with keyword fields with or without default values.

Summary of design constraints

There are a number of global design constraints:

  • CONS1: the typing rules for tuples, constructors and functions should be as similar as possible.
  • CONS2: whenever possible, the rules should favour static typing over dynamic typing.
  • CONS3: the rules for tuples should enable inheritance-like behaviour.
  • CONS4: the dynamic type of a variable or expression is always a subtype of its static type.
  • CONS5: dynamic values that are literally identical, should always be equal (under ==).

Discussion on tuple types and equality of tuple values

  1. Tuple types:
    • Where will the type information (incl. default values) be represented?
    • What are the types inferred from tuple literals or tuple expressions?
    • How is the lub on tuple types defined?
    • How is subtyping on tuple types defined?
  2. How is equality on tuple values defined?

Tuple types

We consider here the design alternative: default values are represented in the static type of a tuple.

We make a distinction between

  • types that have a required keyword field but no default value, e.g., tuple[bool b]: such types describe domains of values that _are required to_ have this field with the value of a type being a subtype.
  • types that have a keyword field and specify a default value, e.g., tuple[bool b=true]: such types describe domains of values that _may have_ or _may not have_ the field with this name. At runtime, when the field is absent in a value, its type and value are given by the respective declared type and default value. _Note_ that a field _may be overridden_: if the field is present in a value but the type of the value is not a subtype of the declared keyword field type, then the field, as declared, is overridden by the statically declared keyword field with default value. (Later examples, see Implications for assignment, will motivate this).

In the following we give simple running examples. To avoid notational clutter in this exploratory phase, we use:

  • T1 = tuple[int] (such tuple types exist as declared, inferred and dynamic types)
  • T2 = tuple[int; bool b] (such tuple types ensure the presence of keyword fields in tuple values, e.g., the field b is required; T2 exists as a declared, inferred or dynamic type)
  • T3 = tuple[int; bool b = true] (such types only exist as declared or inferred types)
  • T4 = tuple[int; bool b = false] (such types only exist as declared or inferred types)

Note that we temporarily separate positional elements and keyword fields with a ;. A better syntax has still to be proposed.

LUB (least upper bound)

The following are true for our running examples:

  • lub(T1, T2) = T1 (the keyword field is dropped, since not present in T1)
  • lub(T1, T3) = T3 (the keyword field and its default value is preserved)
  • lub(T1, T4) = T4 (idem)
  • lub(T2, T3) = T3
  • lub(T2, T4) = T4
  • lub(T3, T4) = T1 (default values may conflict, the keyword field is dropped in the lub)

Subtyping

The intuition behind subtyping is that the smaller type corresponds with a “smaller” set (subset) of values (in an informal sense, the value sets may be infinite). This gives the following ordering:

  • T2 <: T1 (T2 requires a keyword field to be present in values and, therefore, represents a more specific, smaller, set of values than T1)

  • T2 <: T3 and T2 <: T4 (T2 requires keyword field to be present in values and, therefore, is more specific than T3 and T4)

  • T1 <: T3 and T3 <: T1

    _AND_

  • T1 <: T4 and T4 <: T1

_Note_ that subtyping under the latter two bullets defines substitutability when the expression of a type with less keyword fields may be used in place where a type with more keyword fields _with_ default values is required. From this, the following is also true: T3 <: T1 <: T4. The intuition behind is that all these types, T1, T3, T4, statically represent the same domain of values, e.g., <1>, <1, b = false>, <1, b = true>, and _ALSO_ <1, b = "a">, <1, b = 0>, etc. T3 and T4 just specify extra information that will be available at runtime. It is really just like with static labels, e.g., tuple[int] <: tuple[int age;] && tuple[int age;] <: tuple[int], both types are freely substitutable, tuple[int age;] just propagates the knowledge of label-to-index mapping at runtime, e.g., age -> 0.

Associating a type with a tuple expression

When associating a type with a tuple expression, we have to rely on local information; hence, we are limited in what we can infer:

  • <1> : tuple[int]
  • <1, b = true> : tuple[int; bool b] (_note_ that the default value is not part of the inferred type)
  • <1, b = true, b = "a"> : tuple[int; str b] (_note_ that when duplicate keyword fields occur, the rightmost prevails; see Splicing for more examples).

Assignment

The design alternative we discuss is to associate with each value v a static and dynamic type, written as v : <static_type(v), dynamic_type(v)>. The static type can provide a default value when applicable.

Examples:

  • tuple[int; bool b=true] v1 = <1>; ==> v1 == <1> : <tuple[int; bool b=true], tuple[int]>
  • tuple[int; bool b=false] v2 = <1>; ==> v2 == <1> : <tuple[int; bool b=false], tuple[int]>
  • tuple[int; bool b=false] v3 = v1; ==> v3 == <1> : <tuple[int; bool b=false], tuple[int]>

We expect that: v1 == v2

Implications for assignment

  • tuple[int; bool b = true] and tuple[int; str b = "a"] are also substitutable, which follows from:

tuple[int; bool b] <: tuple[int] <: tuple[int; bool b = true] <: tuple[int] <: tuple[int; str b = "a"]

It might look weird for some cases, reasonable for others, and unavoidable for the rest, e.g.,

tuple[int; bool b = true] t1 = <1>; ==> t1.b == true;
tuple[int; str b = "a"] t2 = t1; ==> t2.b == "a"

The above _does_ look reasonable and useful.

tuple[int; str b = "a"] t3 = <1, b = true>; ==> t3.b == "a";

(str b = "a" overrides bool b = true, unsound, otherwise)

The above _does_ look unintuitive, but, on the other hand, it is type-safe and
sound.

Compare the latter with:

tuple[int] t3 = <1, b = true>; and tuple[int, str b = "a"] t4 = t3;

Another example of such behaviour:

rel[int] r1 = { <1>, <1, b = false>, <1, b = 0>, <1, b = "a">};
rel[int, str b = "b"] r2 = r1;
[ e.b | e <- r2 ] == [ "b", "b", "a" ];

Also,

rel[int] r1 = {<1>, <1, b = false>, <1, b = 0>};
rel[int, str b = "a"] r2 = r1;
[ e.b | e <- r2 ] == ["a", "a", "a"];

The above _is_ unavoidable - there are some ignorable attributes possible, but we do not know anything about them.

One final example:

tuple[int] t1 = <1, kwf = false>;
tuple[int, str kwf = "a"] t2 = t1; ==> t2.kwf == "a".

t2.kwf += "b" ; ==> t2 == <1, kwf = "ab"> 

To summarize: it makes sense to ignore keyword fields with default values in subtyping, which does not lead to unsoundness as runtime behavior depends on the full extra static information, e.g., bool b = true, or str b = "a", the rest is sound.

Equality

Equality follows the way values are represented: values are equal (==) when

  • Their actual runtime values are equal.
  • Their types are "compatible" (to be made precise).

Two tuple values are _always equal_ when either (1) their actual runtime values are identical, or (2) when they are not identical, but they would have become identical after inserting default values according to their respective static types.

Here are some representative examples for E1 : T1 == E2 : T2 (we use <?> to denote an arbitrary default value)

E1 T1 E2 T2 Result
<1> <tuple[int],tuple[int]> <1> <tuple[int],tuple[int]> true
<1> <tuple[int;bool b=<?>],tuple[int]> <1> <tuple[int],tuple[int]> true
<1> <tuple[int;bool b=<?>],tuple[int]> <1> <tuple[int;bool b=<?>],tuple[int]> true
<1,b=true> <tuple[int;bool b],tuple[int;bool b]> <1> <tuple[int],tuple[int]> false
<1,b=true> <tuple[int;bool b],tuple[int;bool b]> <1> <tuple[int;bool b=true],tuple[int]> true
<1,b=true> <tuple[int;bool b],tuple[int;bool b]> <1> <tuple[int;bool b=false],tuple[int]> false
<1,b=false> <tuple[int;bool b],tuple[int; bool b]> <1> <tuple[int;bool b=false],tuple[int]> true
<1,b=false> <tuple[int;bool b],tuple[int;bool b]> <1,b=false> <tuple[int;bool b=true],tuple[int]> true
<1,b=true> <tuple[int;bool b],tuple[int;bool b]> <1,b=false> <tuple[int;bool b=false],tuple[int]> false

Matching

We discuss matching also by some representative examples. There are a number of issues to resolve:

  • _Issue 1 :_ in a pattern match P := S, should extra keyword fields in the subject S, which do not occur in the pattern P, prevent a successful match?
  • _Issue 2 :_ we need a mechanism to match an unknown number of positional elements and fields in a subject.

Examples:

<3, 4, b = true> := <3, 4, x = 3.5, b = true>; (1)

Type LHS: tuple[int, int; bool b].

Type RHS: tuple[int, int; real x, bool b].

RHS <: LHS, so the pattern match is type correct.

Result: true.

Discussion: this example illustrates _Issue 1_ mentioned above. There are two designs possible here: (1) as above, where unmatched keyword fields in the subject do not affect the match, or (2) require always a full structural match, like <3, 4, b = true, *R> as a pattern.

In the following examples we assume the first choice.

<3, 4, b = V> := <3, 4, x = 3.5, b = true>; (2)

Type LHS: tuple[int, int; value b].

Type RHS: tuple[int, int; real x, bool b].

RHS <: LHS, so the pattern match is type correct.

Result: true && V == true.

<3, 4, b = V> := <3, 4, x = 3.5>; (3)

Type LHS: tuple[int, int; value b].

Type RHS: tuple[int, int; real x].

LHS and RHS are incomparable => static error.

<3, 4, b = V ? true> := <3, 4, x = 3.5>; (4)

New notation: b = P ? true is a pattern that matches P with the value of the field b in the subject when present, otherwise, it matches P with the default value true, thus, binding all variables in P. In this particular case, P is a single, untyped variable pattern V.

Type LHS: tuple[int, int; bool b = true].

Type RHS: tuple[int, int; real x].

RHS <: LHS, so the pattern match is type correct.

Result: true && V == true.

<3, 4, b = V ? true> := <3, 4, x = 3.5, b = false>; (5)

Type LHS: tuple[int, int; bool b = true].

Type RHS: tuple[int, int; real x, bool b].

RHS <: LHS, so the pattern match is type correct.

Result: true && V == false.

<3, 4, b = V ? "a"> := <3, 4, x = 3.5, b = false>; (6)

Type LHS: tuple[int, int; str b = "a"].

Type RHS: tuple[int, int; real x, bool b].

RHS <: LHS, so the pattern match is type correct.

Result: true && V == "a" (_note_ also the field overriding here).

<3, *R> := <3, 4, x = 3.5, b = true>; (7)

Type LHS: tuple[int, ROW].

Type RHS: tuple[int, int; real x, bool b].

RHS <: LHS, so it should match.

Result: true && R == <4, x = 3.5, b = true> && R: tuple[int; real x, bool b].

Discussion: this example illustrates _Issue 2_ mentioned above. The subtype relation has to be adapted to take care of ROWs.

We do this here by examples:

tuple[bool b = true, ROW] <: tuple[bool b = true, real x = 3.5] 

is true for ROW = tuple[real x = 3.5].

tuple[bool b= true, str s= "a", ROW1] <: tuple[bool b= true, real x= 3.5, ROW2] 

is true for ROW1 = tuple[real x = 3.5] and ROW2 = tuple[str s = "a"].

<b = true, *R> := <3, 4, x = 3.5, b = true>; (8)

Type LHS: tuple[; bool b, ROW].

Type RHS: tuple[int, int; real x, bool b].

RHS <: LHS, so it should match.

Result: true && R == <3, 4, x = 3.5> && R : tuple[int, int; real x].

<b = true, *R> := <3, 4, x = 3.5>; (9)

Type LHS: tuple[; bool b, ROW].

Type RHS: tuple[int, int; real x].

LHS and RHS are incomparable => static error.

<b = V ? true, *R> := <3, 4, x = 3.5>; (10)

Type LHS: tuple[; bool b = true, ROW].

Type RHS: tuple[int, int; real x].

RHS <: LHS, so it should match.

Result: true && R == <3, 4, x = 3.5> && R : tuple[int, int; real x] && V == true.

Splicing and overriding

Since we use overriding in cases of keyword fields with the same name, we get:

< *<b = true>, *<b = "a">, *<b = 0> > ==> <b = 0> : tuple[int b]

A similar example:

tuple[int] t1 = <1, b = true>;
tuple[int] t2 = <2, b = "a">;
tuple[int] t3 = <3, b = 0>;

with the same outcome:

< *t1, *t2, *t3 > ==> <1,2,3,b = 0> : tuple[int, int, int; int b]

Other examples:

Example 1:

Given:

<1> : <tuple[int], tuple[int]>;
<1, b = true> : <tuple[int; bool b = true], tuple[int]>;

we get:

{<1>, <1, b = true>} : <set[lub(tuple[int], tuple[int; bool b = true])], 
set[lub(tuple[int], tuple[int])]> 
==> {<1>, <1, b = true>} : <rel[int; bool b = true], rel[int]>

Example 2:

Given:

rel[int, bool b = false] r = {<1>, <1, b = true>} 

we get:

{<1>, <1,b = true>} : <rel[int], rel[int]> 
&& r : <rel[int, bool b = false], rel[int]>;
for(elem <- r) append elem.b;

is perfectly ok, since the type of the element value <1>, <rel[int, bool b=false], rel[int]>,
provides enough information to compute the selection of field b.

Example 3:

Given:

alias R1 = tuple[;str name = "none];

alias R2 = tuple[;str name = "none", int age = 0];

alias R3 = tuple[;str name = "none", int age = 0; str street = "nowhere"];

then we have R3 <: R2 <: R1.

Example 4:

Here is a description of the Figure data type and some operations on them.

alias Props = tuple[int width = 1, int height = 1, str color = "white", 
int lineWidth = 1,..];

data Point = point(int x = 0, int y = 0);

data Shape = line(Point org, Point end, *Props props);
data Shape = rect(Point org, *Props props);
data Shape = circle(Point org, real radius, *Props props);

Shape moveTo(Shape s, Point newOrg) { s.origin = newOrg; return s; }

Shape scale(line(p1, p2, props), int f) 
= line(p1, point(f * (p2.x-p1.x), f * (p2.y - p1. y), props);

Shape scale(rect(p1, width=w, height=h, *props), int f) 
= rect(p1, width=f*w, height=f*h, props);

Shape scale(circle(p1, radius, *props), int f) = circle(p1, f * radius, props);

An alternative -- and probably more consistent -- design is to represent all parameters (x, y, radius, etc.) as keyword parameters.

alias Props = tuple[int x = 0, int y = 0, int radius = 0, int width = 1, 
int height = 1, str color = "white", int lineWidth = 1, ..];

data Point = point(int x = 0, int y = 0);

data Shape = line(*Props props);
data Shape = rect(*Props props);
data Shape = circle(*Props props);

Shape moveTo(Shape s, Point newOrg) { s.x=newOrg.x; s.y=newOrg.y; return s; }

Shape scale(line(width=w,height=h, *Props props), int f) 
= line(width=f*w, height-f*h, props);

Shape scale(rect(width=w, height=h, *props), int f) 
= rect(width=f*w, height=f*h, props);

Shape scale(circle(radius=r, *props), int f) 
= circle(radius= f * r, props);

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants