Nemerle language

IlyaGerasimets edited this page Aug 5, 2012 · 2 revisions

Author: Chistyakov Vladislav
Original: http://rsdn.ru/article/Nemerle/TheNemerleLanguage.xml
Translator: Alexey Badalov

This article opens a series devoted to teaching the Nemerle programming language. Existing articles about this language assume the programmer’s familiarity with the Microsoft .NET framework and the C# programming language. On the other hand, this series is targeted at people familiar with neither one nor the other and could be used for teaching programming as such. People new to programming might require assistance of someone more experienced.

General introduction

Kernighan and Ritchie’s book “The C Programming Language” was my source of inspiration for this work. Many years ago, I learned to program from that book. I really liked its structure. Instead of taking a problem and picking it apart, looking at small details along the way, the book gives the reader the minimum foundation required to program in C and then opens up one new aspect of the language after another. Moreover, the book does not rely on the abstract examples so loved by functional programming adepts, but uses examples that are simple, yet still practical. Understanding that Kernighan and Ritchie have set the bar very high, I decided not to compete with them in creativity, but simply to repeat their methodology for Nemerle.
Nemerle is a whole new language. It is different from C in many ways, but this should not prevent us from using the same approach and even the same examples as the famous “The C Programming Language”.

Language overview

Nemerle was envisioned as a high-level programming language with support for object-oriented programming, functional programming, and metaprogramming.
The language’s predecessors are C#, ML, and LISP. Nemerle is probably closest to C#. A programmer familiar with C# can read a couple of introductory articles, spend two or three days experimenting, and begin writing code in Nemerle. Of course, to master all the language has to offer, one would have to spend at least a month, but even that period can hardly be called long for studying a powerful modern programming language.
If you are unfamiliar with C# and the .NET platform, then you will need more time to learn Nemerle. But it is still not all that difficult. The language is designed to make it easy to start programming quickly. There is no need to know everything about the language to write programs of use to yourself and others.
This work is targeted at people with no C# programming experience. In principle, it could be used to learn programming from the ground up, but then you would need help from someone who already knows programming, and you should have some basic understanding of how the computer and the programs you run on it work.

Why Nemerle?

“Why learn this particular language, when there is a plethora of others (including on the .NET platform)?” you might ask. The answer is very simple. This is one of the best and most interesting programming languages created to this day and it is very easy and pleasant to learn. I have no doubt that this is the best language created for the .NET platform so far. This language supports almost all features present in other programming languages (except, maybe, logic programming, mainly represented by the Prolog programming language), so if you master this language, you will find it easier to learn others.

HINT
By the way, I am convinced that a good programmer should know at least three programming languages. So, if you have a choice of which language to learn, learn any, and then a couple more ;). It is also desirable to have these languages be as different as possible. I would recommend getting familiar with the following ones: Nemerle, Haskell, C++, Prolog, Erlang, and Ruby. After this, most other languages will look like dialects of these.
Of course, there are some concepts that have not made it into Nemerle. For example, “dependent types” (a whole new concept, currently only available in experimental languages, like Agda 2), or continuations (available in a number of languages). But there are not many of them. Besides, some concepts (such as continuations) could be implemented using Nemerle macros.

15 reasons for someone who knows other languages and is considering learning another
1. Nemerle is one of the most modern and powerful programming languages available today.
2. Nemerle supports most of the popular programming paradigms and lets you learn them all within a framework of one language.
3. Nemerle is fairly simple to learn, but unlike other simple programming languages, such as Ruby and Python, it is statically typed. On the one hand, this produces fairly efficient code (comparable to C/C++ and no slower than C#), and on the other, it catches many errors before the program is even run.
4. Even though Nemerle supports such interesting paradigms as FP (functional programming) and MP (metaprogramming), it does not require a totally different frame of mind (like Haskell and LISP). Moreover, learning Nemerle makes it much easier to study those others. And there is no magic here. It is just that Nemerle does not sacrifice any common tools to make the language developer’s job easier.
5. Nemerle’s design pays great attention to consistency and intuitiveness of the language. Even though C# is considered a fairly intuitively clear language, forum discussions often point out very strange and illogical behaviours of the C# compiler. In most such cases, Nemerle behaves the way the programmer presumes.
6. Most of the operators in the language are macros written in the language itself. The base language is quite compact.
7. The language supports type inference, which makes it possible to omit types in code in 99.9% of cases. This makes it considerably easier to learn.
8. Support of an intelligent IDE (a Microsoft Visual Studio extension). IDE support makes it much easier to write and understand code.
9. Nemerle is completely compatible with C# and VB at the library level. All components and libraries available in .NET can be used from Nemerle. Moreover, if the public interface does not use algebraic types, a Nemerle library can be used from C# or VB without any changes or special considerations, which are sometimes needed with languages that were adapted to the platform instead of being initially designed for it.
10. Nemerle is exceptionally well-suited for describing complex logic. Such powerful tools as pattern matching, algebraic data types, and macros let you make the solution significantly simpler and easier to understand than in languages lacking such tools (which includes the popular languages C#, Java, and VB). Moreover, the result can be placed into libraries and used in projects developed in other languages.
11. Nemerle has a powerful metaprogramming system, which lets you automate code generation inside Nemerle projects and use the generated code in other languages.
12. Nemerle is an extensible language, with which you can make your wildest dreams come true.
13. Nemerle is an innovative, but not at all experimental language. It is very practical.
14. Nemerle is an open project with a very liberal licence. Its compiler and all the included modules can be freely included in projects, modified, or studied.
15. Finally, Nemerle is quite an expressive and consistent language. It is very nice to write in. Most programmers writing in Nemerle find it more expressive and consistent than those other popular languages they used before.

How to begin using the language?

The most convenient way to write in Nemerle is through the “Microsoft Visual Studio Integration” (or VS, for short). Still, VS, like any other IDE (integrated development environment), hides most of the project handling subtleties the beginner should know about. This is why I will begin by using the command-line compiler.

Installer

In order to be able to use Nemerle, you have to install it on your machine. You can do this by downloading and running the installer (get it here), or by building the compiler from sources (build instructions).
If you use the installer, then it is best to use the default installation path (%ProgramFiles%\Nemerle).

ncc.exe

The Nemerle command-line compiler is called ncc.exe. By default, it is located in %ProgramFiles%\Nemerle (for instance, on my machine, this is “C:\Program Files\Nemerle”).
The libraries for “Microsoft VS integration” and the run-time library Nemerle.dll, which is required for running Nemerle programs, are also located there. In order to make using ncc simpler, add the path %ProgramFiles%\Nemerle to the environment variable PATH (look here for instructions). Then you will be able to call ncc from any folder on your machine.
Ncc takes a list of files to compile and a set of keys as arguments. The keys should be prefixed with the symbol “-” (minus) or “/” (slash). The full list of compilation keys can be retrieved by entering the following in the command line:

ncc -h

Ncc has many keys, but only the following are important for us right now:

Key                  Description
-no-color            Suppresses the special escape codes used for colouring the output under Linux. Under Windows, if no special support is available, these escape symbols mess up the compiler output. So, if you do not have this support, always specify -no-color in the command line.
-out:STRING          Sets the output file name (i.e. the name of the executable exe-file or dll). Replace “STRING” with the file name. If you omit this key, the compiler will generate the executable as “out.exe”.
-reference:STRING    References an external library (called an assembly in .NET).
-target:STRING       Sets the output file type: exe (executable file) or library.
-nostdmacros         Omits the standard Nemerle library macros.
-macros:STRING       Adds a macro assembly reference.

Simple introduction

In this section, we would prefer to avoid delving too deeply into the subtleties of the language and getting mired in details and formal rules. Instead, we would like to demonstrate the most important elements of the language (its base) using real programs.
We intentionally stop short of giving a full description of the language or even a rigorous description of its parts (of course, the samples will be correct). The primary goal is to get you to the point of being able to write useful programs in Nemerle as quickly as possible. In order to reach this goal, we will focus on the basics: basic console input/output, arithmetic, variables, operators, code flow constructs, and functions. We intentionally leave out many elements of Nemerle that are of primary importance for writing large programs, including user-defined types, most of the rich Nemerle operator library, several code flow operators, and numerous other details.
Of course, the first examples might seem less brief and elegant than if we wrote them using the full power of Nemerle. Don’t mind this! As we progress, we will rewrite them, each time making them shorter and more expressive.

First example – “Hello, World!”

As rightly noted in “The C Programming Language” (which largely inspired this work and serves as its prototype), “the only way to learn a new programming language is by writing programs in it. The first program to write is the same for all languages: print the words hello, world”.
You have to create a file with a “.n” extension (without the quotation marks) and an appropriate name (such as “HelloWorld.n”). Paste the following code into it (program text is called code or source code):

using System.Console;

WriteLine("Hello, World!");

Call the compiler with the file name and the necessary compilation keys:

ncc -no-color HelloWorld.n -out:HelloWorld.exe

And execute the resulting program “HelloWorld.exe”.

NOTE
If you are using Mono (an independent .NET implementation supported by Novell and compatible with Windows and Linux), then program execution requires you to enter “mono HelloWorld.exe” in the command line.

If you do everything right, then after executing “HelloWorld.exe”, you will see the following in the console (that is, on the computer screen):

Hello, World!

Figure 1. HelloWorld compilation.

Try to change the program code and see what error messages and warnings the compiler gives you.
What does this example demonstrate? Surprisingly, quite a lot. First of all, it shows that Nemerle, like most other modern compiled languages, has no built-in means for input/output or anything else (except a basic set of arithmetic operators).

Functions

Everything you do in Nemerle is done by calling functions (or methods, but more about this a little later). In this case, we use the system function WriteLine (declared in .NET), which outputs a single line of text and a newline symbol to the console (which makes subsequent output begin on a new line). If we did not want to end the line, we could use the function Write.

NOTE
Try to replace the name of the function WriteLine by Write and see how this changes the console output.

A function call looks like its name followed by a list of arguments enclosed in parentheses. If the function has no arguments, then nothing is written between the parentheses (they are left empty).
You can write your own functions (more on this later) or use libraries (written by other programmers and stored in binary assembly files).
Functions and other actions can be executed sequentially. In this case, the separate actions (expressions or function calls) should be separated by “;” (semicolon). This behaviour is analogous to the Pascal programming language. However, you can also write the “;” sign at the end of each action (as is commonly done in C-like languages, such as C#, Java, and C++).
Next example:

using System.Console;

WriteLine("Hello, World!");
WriteLine("World! Hello!");

outputs:

Hello, World!
World! Hello!

The last semicolon can be safely omitted.

HINT
Try to omit the semicolon from the first or the second line and see what the compiler has to say about that. See what happens if you add extra semicolons.

At the same time, let’s look at another console function: Write. The difference is that it does not end the output with a newline, which makes subsequent output begin exactly where the previous output ends. The following example demonstrates this:

using System.Console;

Write("Hello");
Write(", World");
WriteLine("!");

outputs the following line, same as the initial example:

Hello, World!

Strings and string literals

Nemerle strings, as in all other .NET languages, are represented by special objects. We will discuss what objects and their types are later. For now, just consider objects as values that you can manipulate and that can have different types.
To create some predetermined (static) text, we use string literals (also known as “string constants”). Nemerle supports three types of string literals, but for the moment, it will be enough for us to use just one: the simple string literal.
The simple string literal is just a string of symbols enclosed in a pair of double quotation marks (“"”). Such a string should be written on a single line and cannot directly contain the symbol “"”. In order to add this symbol to a string, you need to use the so-called escape notation. This just means that you should put the symbol “\” (escape symbol) in front of the symbol you want to add. Of course, the symbol “\” also cannot be used by itself (since the compiler would perceive it as a control symbol), so if you want to use this symbol, you have to double it.
This type of string literal is often called the C-string, since it was first used in the language “C”.
We will go over the other string literal types later. If you want to know more right now, take a look at the “String literals” section of the “Extended description” section.
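To make the escape rules above concrete, here is a minimal sketch. The sample texts are made up for illustration, and the lines starting with “//” are comments, which the compiler ignores:

```nemerle
using System.Console;

// \" puts a double quotation mark inside the string.
WriteLine("She said: \"Hello, World!\"");

// \\ puts a single backslash inside the string.
WriteLine("C:\\Program Files\\Nemerle");
```

The first call should print the sentence with the quotation marks around Hello, World! and the second one should print the path with single backslashes.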

Namespaces and types

I think everything in the “Hello, World!” example should be clear, except for the line:

using System.Console;

If you remove this line, the compiler will give you the error message:

HelloWorld.n:3:1:3:10: error: unbound name `WriteLine'

This means that the compiler could not find the function name.
If all functions existed in a common (global) namespace, programmers would find it very difficult to use library functions, since their names would have to be made unique.
In order to use the same function names without making a mess, we have namespaces and modules (more precisely types, but we’ll discuss them later).
Functions can be declared in types (such as modules) or inside other functions. Types can be declared in namespaces, other types, and the global (nameless) namespace. For example, the function WriteLine is declared in the module Console, which is declared in the namespace System. The module Console groups all functions having to do with console input/output, while System groups all system types (that is, basic for the system). For example, the “string” class is declared in the System namespace.
In general, it is important for you to remember that namespaces serve to prevent name conflicts, while types serve to group functions (and other elements).
The line “using System.Console;” opens, so to speak, the type “System.Console”. After this, the contents of this type can be used directly. In principle, we could reach the necessary functions without opening any types or namespaces. The initial example could be rewritten in the following manner:

System.Console.WriteLine("Hello, World!");

This way of writing names is called fully qualified.

HINT
I will use bold italic to highlight new terms that are worth memorizing.

In addition, you can partially qualify names. For example, let’s take the initial example and rewrite it to use a partially qualified name:

using System;

Console.WriteLine("Hello, World!");

The previous two examples will have the same console output as the very first one.

Arithmetic

Nemerle lets you perform arithmetic computations almost the same way you used to write them in your school notebook.
For example, to convert 40 degrees Fahrenheit to Celsius (using the formula “C = 5/9 * (F - 32)”), we could write the following program:

using System.Console;

WriteLine(5.0 / 9.0 * (40.0 - 32.0));

Place it into the file f2c.n, compile it, and execute (you should be able to do this by now).
This code outputs the following to the console:

4.44444444444444

It might seem strange that the numbers are written with “.0”. This is necessary because in Nemerle, as in other C-like languages, integer numbers are written differently from floating-point numbers. Moreover, their division and multiplication are done differently. If you divide an integer by another integer, you get a third integer; the fractional part is lost. So, if you divide 5 / 9:

using System.Console;

WriteLine(5 / 9);

you get 0, not 0.555555!

NOTE
This not entirely intuitive behaviour is due to the fact that arithmetic in Nemerle, same as in its ancient predecessor C, is tied to the architecture of the processor running the program. Modern processors have separate integer computation units (ALU, Arithmetic logic unit) and units (commonly referred to as coprocessors) for floating point operations.
Floating point numbers are such that it is not always possible to get the required precision in computations and sometimes even rudimentary operations can produce unexpected results. If you want to learn more, see the article: What Every Computer Scientist Should Know About Floating Point Arithmetic.
Many languages with dynamic typing (often referred to as scripting languages) operate with a more abstract notion of a number to save the programmers from the pesky details. Unfortunately this has a strongly negative impact on code performance. Nemerle is a statically typed, compiled language. In this respect, it shares the approach of the lower-level languages, such as C, not scripts. Fortunately this only very rarely leads to any problems in practice.

If even a single argument of an expression is a floating-point number, the other argument will be automatically converted to the larger type (the type that can hold values from a greater range). Type conversion is performed by the compiler. In the end, we get the most efficient code. This way, thanks to automatic type conversion, instead of writing the expression “5.0 / 9.0 * (40.0 - 32.0)” we could write “5.0 / 9 * (40 - 32)” or “5 / 9.0 * (40 - 32)”. All the other values would be converted by the compiler. Still, the more clearly you state your intent, the less the reader has to guess. And the less there is to guess (all other things being equal), the easier it is for a programmer to understand what the code means and, therefore, the less likely a mistake becomes. By the way, if we wrote “5 / 9 * (40.0 - 32)”, the compiler would only have converted the result of the integer division, yielding an incorrect (from our point of view) result.
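The four spellings discussed above can be compared side by side. This is only a small sketch that relies on the conversion rules as described in the preceding paragraph; the file name (say, conv.n) is up to you:

```nemerle
using System.Console;

WriteLine(5.0 / 9.0 * (40.0 - 32.0)); // every operand is a floating-point number
WriteLine(5.0 / 9 * (40 - 32));       // 9 and the integer difference are converted
WriteLine(5 / 9.0 * (40 - 32));       // 5 and the integer difference are converted
WriteLine(5 / 9 * (40.0 - 32));       // 5 / 9 is an integer division and gives 0 first
```

If the rules hold as described, the first three lines should print the same value as the f2c.n program, while the last line should print 0, because the integer division happens before any conversion takes place.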

Local functions

Functions are convenient for putting some part of a computation into a “black box” that can then be used without looking inside it. Function use in Nemerle (and in other languages) is one of the most important ways of dealing with the potential complexity of large programs. If functions are organized the right way, they make it possible to ignore how the job is done and focus on what they do. Nemerle is not called a functional language for nothing. Function use in it is tuned to perfection. It is not only easy and convenient to create functions (same as in C), but it is just as easy to manipulate them. They can be passed into other functions, stored in variables, and even combined into new functions. You might often encounter functions that are only several lines long and are called just once. They are created to clarify some part of the program. The function name alone can be quite descriptive. Combined with a list of parameters, a function can be an invaluable source of information about the program.
Thus, it is not a very good idea to write formulas in the same place you need to do the computation. It is much better to place the formula into a function. The following program does exactly this:

using System.Console;

def fahrenheitToCelsius(fahrenheit)
{
  5.0 / 9 * (fahrenheit - 32)
}

WriteLine(fahrenheitToCelsius(40));

WARNING
If you know other imperative languages from the C lineage, you are probably a bit surprised by the absence of the “return” operator in the function. This is a language feature, owing to its functional nature. Nemerle has no “return”, “break”, “continue”, or other imperative control flow instructions that interrupt functions or loops. Instead, Nemerle has a construct that replaces all of these operators and does even more. The operators “return”, “break”, and “continue” can be implemented as macros based on this construct. Since loops themselves are implemented as macros in Nemerle, I will tell you more about this construct when we get to study macros (probably, the most fun part of Nemerle).

This code also outputs this to the console:

4.44444444444444

Even though I am nearly sure that you do not yet know the full function declaration syntax, you can understand what is going on here and have probably noticed that code written this way is easier to understand.
The keyword “def” lets the compiler distinguish a function declaration from a function call, since we declared a local function, i.e. a function declared inside an expression.
Even though the body of the function consists of a single expression, it is enclosed in curly brackets (braces). Braces enclose the so-called blocks. A block can contain zero, one, or several expressions (separated by semicolons). A function body is always a block. If you forget to enclose the expression in braces, the compiler will produce the error message:

f2c.n:4:3:4:6: error: parse error near double literal: expected `{' 
at the beginning of function body

Our computation became more structured, but we cannot say the same about console output. Let’s change this situation and make it more attractive. For this, we will use the formatting built into the WriteLine function.
In Nemerle, functions declared in types, unlike local functions (such as the one in the above example), can be overloaded. This means it is possible to have more than one function with the same name. The compiler can tell them apart by the number and types of their parameters. The WriteLine method has many overloads; among them is one that takes a format string as the first parameter and the values to be referenced in the format string as the second (and possibly third, fourth, etc.) parameters.
Let’s output information in the following format:

Fahrenheit:   40 Celsius:   4.4

The following example does this:

using System.Console;

def fahrenheitToCelsius(fahrenheit)
{
  5.0 / 9 * (fahrenheit - 32)
}

WriteLine("Fahrenheit: {0, 4} Celsius: {1, 5:##0.0}", 40, 
  fahrenheitToCelsius(40));

A little bit about formatted output
As you can see, “40” is output in place of “{0, 4}” and “4.4” in place of “{1, 5:##0.0}”, with the numbers right-aligned. This is called formatted output. Each pair of braces inside the string describes a placeholder, which is replaced by one of the WriteLine parameters following the format string. The first number in the placeholder determines which parameter’s value is substituted: “0” means the second parameter (the first after the format string), “1” means the third parameter, and so on. The subsequent placeholder fields are optional, but we need them. After the comma, we specify the field width: if the value takes up fewer symbols than this number, the remaining positions are filled with spaces. A colon begins the format section (also optional). I will not describe all the possible formats, since there are quite a few of them. Everything to do with formats can be found in MSDN (Microsoft’s general documentation) in the notes for the String.Format function. I will only say that the format of the second placeholder lets us omit unnecessary decimal places and output only one.

WARNING
Nemerle (and .NET in general) starts numbering from zero. Beware!
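
The placeholder parts described above can be tried out one at a time. The following is a small sketch; the square brackets in the format strings are ordinary text, added only to make the field boundaries visible, and the value 4.44444 is made up for the illustration:

```nemerle
using System.Console;

WriteLine("[{0}]", 40);                         // index only: no padding
WriteLine("[{0, 4}]", 40);                      // index and width: padded to 4 symbols
WriteLine("[{1, 5:##0.0}] [{0}]", 40, 4.44444); // placeholders may come in any order
```

The second line should show 40 preceded by two spaces, and the third one should show the fractional value rounded to one decimal place inside a field of five symbols, followed by 40. The decimal separator in the output depends on the regional settings of your system.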

Let’s take the problem up a notch. Instead of one value, let’s output 16, with a 20-degree step:

Fahrenheit:    0 Celsius: -17.8
Fahrenheit:   20 Celsius:  -6.7
Fahrenheit:   40 Celsius:   4.4
Fahrenheit:   60 Celsius:  15.6
Fahrenheit:   80 Celsius:  26.7
Fahrenheit:  100 Celsius:  37.8
Fahrenheit:  120 Celsius:  48.9
Fahrenheit:  140 Celsius:  60.0
Fahrenheit:  160 Celsius:  71.1
Fahrenheit:  180 Celsius:  82.2
Fahrenheit:  200 Celsius:  93.3
Fahrenheit:  220 Celsius: 104.4
Fahrenheit:  240 Celsius: 115.6
Fahrenheit:  260 Celsius: 126.7
Fahrenheit:  280 Celsius: 137.8
Fahrenheit:  300 Celsius: 148.9

Of course, we could simply copy this line sixteen times:

WriteLine("Fahrenheit: {0, 4} Celsius: {1, 5:##0.0}", 40, fahrenheitToCelsius(40));

and change temperature values in Fahrenheit. But seeing 16 nearly identical copies of text might hurt the fragile spirit of the developer striving towards harmony in everything to do with code. Instead, it is better to write another function, which will output strings with the necessary step. Any function can call any other function, including itself. This lets us create a cycle or, to be exact, recursion. Here is how it looks:

using System.Console;

def fahrenheitToCelsius(fahrenheit)
{
  5.0 / 9 * (fahrenheit - 32)
}

def loop(fahrenheit)
{
  WriteLine("Fahrenheit: {0, 4} Celsius: {1, 5:##0.0}", fahrenheit, fahrenheitToCelsius(fahrenheit));

  loop(fahrenheit + 20);
}

loop(0);

Pay special attention to the line “loop(fahrenheit + 20);” inside the function loop. This is a recursive function call (i.e. the function calling itself).

NOTE
The recursive call works very simply. When control flow reaches the point of the recursive call, the function is called again, but with new parameter values (in our program, the value of the only parameter “fahrenheit” is incremented by 20). Then the process repeats (in our example, the new value is again output to the console and another recursive call occurs). We do not get a stack overflow in this case (the stack is the memory area storing function arguments and return addresses), since Nemerle optimizes tail recursion.
There is also another view of recursive functions: the mathematical one. You can read about it here. The mathematical view of recursion is similar to the image you get on a TV screen from a camera pointed at the same screen. Maybe this analogy will help you understand recursion better.

I suppose many of you have guessed that this code will output something entirely different from what we wanted. Moreover, this program simply will not terminate! You will have to press Ctrl+C to interrupt the program. Otherwise, it will keep scrolling the console window for a long time, printing more and more values.
In order to stop the recursion, we need some means of controlling the computation flow depending on the value of the argument passed to the function.
If you know any other programming languages, then you have probably guessed that this is usually the if/else operator. This is true, but there is no such operator in Nemerle. :)
On the other hand, Nemerle has the operator match. This is the so-called pattern matching operator. It lets you compare a value with one or more patterns and perform some actions when a pattern matches the value.
Here is how the correct version of our example looks using this operator:

using System.Console;

def fahrenheitToCelsius(fahrenheit)
{
  5.0 / 9 * (fahrenheit - 32)
}

def loop(fahrenheit)
{
  WriteLine("Fahrenheit: {0, 4} Celsius: {1, 5:##0.0}", fahrenheit, fahrenheitToCelsius(fahrenheit));

  match (fahrenheit)
  {
    | 300 => ()
    | _   => loop(fahrenheit + 20);
  }
}

loop(0);

The way the operator works is very simple. It takes a value as its only parameter and its body (consisting of a block) contains a list of patterns. Each pattern begins with the symbol “|” and is comprised of a single pattern expression. It can be followed by another pattern or by “=>”. The “=>” is then followed by expressions/actions (one or more), which are computed, if one of the patterns preceding the “=>” matches the value (the one in the parentheses following the keyword “match”).
In our program, the fahrenheit parameter is the matched expression, while the expressions “300” and “_” are the patterns. The underscore (“_”) is something special in Nemerle. It is called the wildcard symbol. When the wildcard is used as a pattern, it matches any value. So, if we put “| _” first in the pattern list, it will always be matched, and the other patterns never will be. But the Nemerle compiler is not so dumb as to permit such a situation. The compiler will see that all of the patterns except the first are meaningless and emit the corresponding warning. On the other hand, if the wildcard pattern is placed at the end of the list, it will only match the value when none of the preceding patterns do. Hence, the order of patterns in the match statement can be very important (later we will see cases in which it is not important).
In our case, the pattern “_” is exactly what we need! After all, if the value of fahrenheit is not equal to 300, then we need to recursively call the current function again. Only if the value is equal to 300, do we need to stop the recursion.
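To see several patterns and the wildcard working together outside of the recursion example, consider the following sketch. The function name describe and the texts it returns are hypothetical and serve only as an illustration:

```nemerle
using System.Console;

// Returns a short description of a temperature given in degrees Fahrenheit.
def describe(fahrenheit)
{
  match (fahrenheit)
  {
    | 32      => "water freezes"
    | 212     => "water boils"
    | 0 | 100 => "a round number"   // two patterns share one branch
    | _       => "nothing special"  // the wildcard goes last
  }
}

WriteLine(describe(32));
WriteLine(describe(100));
WriteLine(describe(40));
```

The three calls should print “water freezes”, “a round number”, and “nothing special”. Every branch produces a string, so the whole match operator, and therefore the function, produces a string as well.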

NOTE
The term “stop recursion” should be explained separately. There is no operation that stops recursion. After all, recursion means just calling the same function. So, in order to “stop recursion” we merely need to “not call the same function”.

However, there is a catch! All branches of the match operator must return values of the same type (this makes it possible to use match inside expressions). But we have nothing to do in the first branch! And not doing anything implies not having any operators that do something. We’ve got a conflict! We must return something, but cannot perform any actions.
Thankfully, there is the expression “()”, which means “nothing”. This expression also performs no actions. By writing it after the “=>”, we tell the compiler that we deliberately ask it to do nothing. Besides this, the compiler infers that the result of the match operator is also “nothing”, and since the match operator is the last operator of the function, the function’s return type is also inferred as “nothing”. This is exactly what we need, since the function “loop” is not supposed to return anything; its purpose is to output information to the console.

NOTE
Functions returning nothing or, in other words, not returning anything are called procedures in some languages. But Nemerle does not make a distinction between functions and procedures. Procedures are created for their side effects. In our case, the console output is the side effect.

The job is done, but there are some nuances. The code of our function presumes that it gets the correct value as input (in the range from 0 to 300). If we go back and feed not “0” into the loop, but, say, “400”, then our function never terminates. How to remedy this situation?
For this, we can use the “greater than or equal to” and “less than” comparison operators and one more “match” operator:

using System.Console;

def fahrenheitToCelsius(fahrenheit)
{
  5.0 / 9 * (fahrenheit - 32)
}

def loop(fahrenheit)
{
  WriteLine("Fahrenheit: {0, 4} Celsius: {1, 5:##0.0}", fahrenheit, fahrenheitToCelsius(fahrenheit));

  match (match (fahrenheit >= 0) { | true  => fahrenheit < 300 | false => false })
  {
    | true  => loop(fahrenheit + 20);
    | false => ()
  }
}

loop(400);

Let’s pick the nested match operator apart:

match (fahrenheit >= 0) { | true  => fahrenheit < 300 | false => false }

HINT
Nemerle operators can be written in one line or several. The way they are laid out and their indentation are needed only to make the code more readable and depend on the programmer’s preferences and (possibly) the coding standards in the group of programmers to which the programmer belongs.
If you have no code formatting preferences, then try to follow the formatting style used in this work. In any case, using some formatting style will make it easier to read code for you and for others. Try to avoid messy formatting, use a consistent style, don’t create overly long lines (longer than the width of the editor’s window on an average monitor, about 80-120 characters), and use spaces or tabs for block body indentation.

The first thing I would like to draw your attention to is that one match operator is nested in another. This is a very important feature of this operator, since it enables you to write code consisting only of expressions. Moreover, as you will see later, this feature enables you to use match as a building block for more compact operators.
The second thing is the use of the operators “>=” and “<”. These are “comparison operators” (see the table below to understand how comparison operators work).

| Comparison operator | Name | Examples (result follows “=>”) |
|---|---|---|
| < | “less than” | 1 < 2 => true, 1 < 1 => false, 2 < 1 => false |
| <= | “less than or equal to” | 1 <= 2 => true, 1 <= 1 => true, 2 <= 1 => false |
| > | “greater than” | 1 > 2 => false, 1 > 1 => false, 2 > 1 => true |
| == | “equal” | 1 == 2 => false, 1 == 1 => true, 2 == 1 => false |
| >= | “greater than or equal to” | 1 >= 2 => false, 1 >= 1 => true, 2 >= 1 => true |
| != | “not equal” | 1 != 2 => true, 1 != 1 => false, 2 != 1 => true |

If fahrenheit is greater than or equal to zero, the “>=” operator returns “true”, which matches the pattern “| true”, triggering the expression “fahrenheit < 300”. It, in turn, is true when fahrenheit is less than three hundred; otherwise it is false. If the value of fahrenheit is less than zero, the operator “>=” returns false, which matches the pattern “false”. In this case, the value of the match operator is “false” (since the expression “false” follows the “=>”).
This way, if the value of fahrenheit is greater than or equal to zero and less than three hundred, the match will return “true”, otherwise it will return “false”.
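To check this reasoning, the nested match can be wrapped in a function of its own and tried on a few values. The name “inRange” is hypothetical, chosen only for this sketch.

```nemerle
using System.Console;

def inRange(fahrenheit)
{
  match (fahrenheit >= 0)
  {
    | true  => fahrenheit < 300  // at or above zero: the answer depends on the upper bound
    | false => false             // below zero: out of range regardless of the upper bound
  }
}

WriteLine(inRange(-20));  // below the range
WriteLine(inRange(0));    // lower bound, inside the range
WriteLine(inRange(280));  // inside the range
WriteLine(inRange(300));  // upper bound, outside the range
```

Only the second and third calls should report a true value.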
Some might consider this expression to be unwieldy. I gladly agree! As people nowadays say: TL;DR. After all, we only need to connect two operators with an “and”. In our code, it would make sense to use the appropriate operator, one that takes two logical values and returns “true” only when both are “true”, and “false” otherwise. This operator, “&&”, is built into all C-like languages (C, C#, Java, and C++), but Nemerle does not have it. Compare:

match (fahrenheit >= 0)
{
  | true  => fahrenheit < 300
  | false => false
}

and

fahrenheit >= 0 && fahrenheit < 300

Of course, Nemerle code is more verbose and, therefore, less expressive!
But there is also a benefit. The language has fewer constructs, and is thus easier to understand and remember.

NOTE
Niklaus Wirth — the creator of the famous Pascal programming language (as well as a number of other less famous languages) and generally an IT celebrity — considers simplicity a programming language’s main virtue, from the educational point of view.

But such an advantage can only be enjoyed for a short while. Later on, it becomes a serious detriment! If the language could not be extended, then many (including me) would have considered such simplicity a serious flaw in the language (as happens in practice with Wirth’s later creations: languages Oberon and Oberon 2). Fortunately, Nemerle is an extensible language, and the main extension tool is – macros! They let us kill two birds with one stone: have a simple and concise basic language, while still allowing Nemerle programmers to use the most convenient constructs (for specific applications). But, all in due time – soon I will show you how to be more expressive with macros. For now, let’s look at some more code.
If the match results in value “true”, control flow reaches the pattern:

  | true  => loop(fahrenheit + 20);

of the outer match operator. This causes a recursive call of the loop function.
However, if the value of the nested operator is “false”, then the following pattern is reached:

  | false => ()

which leads to the end of the loop function’s execution and, therefore, the whole program.
This way, an error in the argument (value 400, instead of 0) will not lead to an infinite loop. The program will simply print the following line once:

Fahrenheit  400 Celsius 204,4

Notice that the last pattern does not use the wildcard “_” and that the pattern’s type is different. In the first example, it was an integer value, but in this one it is a boolean (logical). This is another built-in data type in Nemerle (and .NET). It can take one of only two values: “true” and “false”. This data type is not compatible with any other. So, if you try to enter an expression of another type, such as:

...
match (fahrenheit)
{
  | true  => loop(fahrenheit + 20);
  | false => ()
}
...

the compiler will tell you that it expected something other than what it was given:

f2c.n:15:7:15:11: error: expected bool, got int+ in matched value: 
System.Boolean is not a subtype of System.Int32 [simple require]

(meaning: expected a boolean, but got an integer)

Problems:
1. Rewrite the code of the loop function so that an error message is printed to the console when the value is outside the range 0 .. 300.
2. Rewrite the code of the loop function in such a way that the function takes two parameters: the start and the end of a range, checks to make sure that the former is less than the latter by at least 20, and only then produces a list of values for the given range.
3. Write a program for printing a conversion table from Celsius to Fahrenheit (i.e. the reverse table).

Built-in data types

The compiler message in the previous section:

f2c.n:15:7:15:11: error: expected bool, got int+ in matched value: 
System.Boolean is not a subtype of System.Int32 [simple require]

contains the words “bool”, “System.Boolean”, “int”, and “System.Int32”. If you have never programmed in a statically typed language, note that these are data types. Where did they come from? They were inferred by the compiler. The Nemerle compiler is smart enough to infer the types of parameters (and variables, as we will see later) based on the values with which they are initialized and the context in which they are used. The compiler sees that the parameter fahrenheit is used in the match operator, all patterns of which have a boolean type (the boolean type in Nemerle is named “bool”). It also sees that 20 is added to the parameter and infers that the parameter must be an integer value (a 32-bit integer value in Nemerle has type “int”). The compiler tries to unify these types, but it can’t, so it produces the error message saying “I expected bool, but got an int+”.
I have something to say about that “+”, but will do it later, when the time is right. For now, I will instead explain System.Boolean, System.Int32, and where they come from. System.Boolean and System.Int32 are the names of types “bool” and “int”, respectively, used in .NET. This is to say that Nemerle has the corresponding aliases for the types System.Boolean and System.Int32. So, they are the same types; it is just that one part of the compiler uses the alias names, while another uses the .NET type names.
Here is a list of basic .NET types and their Nemerle aliases:

| .NET type | Nemerle alias | Description |
|---|---|---|
| System.Byte | byte | Unsigned integer, 8-bit |
| System.SByte | sbyte | Signed integer, 8-bit |
| System.Int16 | short | Signed integer, 16-bit |
| System.UInt16 | ushort | Unsigned integer, 16-bit |
| System.Int32 | int | Signed integer, 32-bit |
| System.UInt32 | uint | Unsigned integer, 32-bit |
| System.Int64 | long | Signed integer, 64-bit |
| System.UInt64 | ulong | Unsigned integer, 64-bit |
| System.Single | float | Floating-point number, 32-bit |
| System.Double | double | Floating-point number, 64-bit |
| System.Decimal | decimal | High-precision floating-point number |
| System.String | string | String |
| System.Object | object | Any type can be converted to this one |
| System.Boolean | bool | Boolean type (true, false) |
| System.Char | char | Character. A 16-bit integer representing one character of a string |
| System.Void | void | No type |

Floating point numbers store not only the integer part of a number, but also the fractional part, for example, 4.555 or 123.1. The fractional part does not have a fixed length (such as 4 significant digits); the position of the decimal point can change, or “float”. This is where the name “floating point number” comes from.

NOTE
Thanks to this storage format, floating point numbers can take values from a very large range. But at the same time, they sacrifice some precision, since a number cannot have both a very large integer part and a precise fractional part. It can be large, but approximate. In other words, precision is traded for size. As everywhere in life, there are compromises.
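A quick experiment makes this trade-off visible. The sketch below relies on the fact that a 64-bit “double” keeps roughly 15 to 17 significant decimal digits; the particular numbers are chosen only for illustration.

```nemerle
using System.Console;

// 10^16 already uses up the available digits, so adding 1 cannot be stored
// and the comparison is expected to report that the two values are equal.
WriteLine(10000000000000000.0 + 1.0 == 10000000000000000.0);

// A small number has digits to spare for its fractional part.
WriteLine(0.5 + 0.25);
```

The first line should report a true value: the large number absorbed the added 1 without changing.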

The void type is a special type that is not even a type, really. It is used when we need to state that a function does not return any value or, as we said above, returns “nothing”. This type cannot be set for parameters or variables. However, an expression can have a void type. The () expression used in the previous section has this very type. In addition, the loop function has void type.
Nemerle also has user-defined types (types that could be defined in source code) and functional types. But these are higher matters that we will deal with a little later on.
For now, you need to understand that every value in Nemerle has a type.

Type inference and explicit specification

Types can be inferred by the compiler or explicitly specified by the programmer.
It is usually unnecessary to explicitly specify types for local functions and inside function bodies, although it is sometimes helpful. Global functions, however, need their types to be specified. How is it done? Very simply. Here is the example from a previous section, with parameter and return types explicitly stated:

using System.Console;

def fahrenheitToCelsius(fahrenheit : int) : double
{
  5.0 / 9 * (fahrenheit - 32)
}

def loop(fahrenheit : int) : void
{
  WriteLine("Fahrenheit: {0, 4} Celsius: {1, 5:##0.0}", fahrenheit, fahrenheitToCelsius(fahrenheit));

  match (match (fahrenheit >= 0)
         {
           | true  => fahrenheit < 300
           | false => false
         })
  {
    | true  => loop(fahrenheit + 20);
    | false => ()
  }
}

loop(0);

HINT
Take a look. The nested match operator is formatted differently. It is still an expression; only the formatting is different!

If a type is specified explicitly, the compiler will use it, instead of trying to infer a type from initialization or usage. This can improve error messages and sometimes makes it easier to find type-related errors.
In addition, Nemerle lets you state types not only for function parameters and variables, but for any expressions. For example, if you need to use the floating point constant “5.0”, you can state the type in the corresponding expression part:

def fahrenheitToCelsius(fahrenheit : int) : double
{
  (5 : double) / 9 * (fahrenheit - 32)
}

The type specification syntax is “expression : type”.

NOTE
The parentheses in the example are required, because otherwise the compiler will see the expression “double / 9” as a type, which is wrong. In many other cases, the parentheses are unnecessary.
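The effect of a type specification on an expression is easy to observe with division. In the sketch below (written for this illustration), the first line divides two integers, while the second turns the constant 1 into a “double” first, in the same way as “(5 : double)” in the example above.

```nemerle
using System.Console;

WriteLine(1 / 3);             // both operands are integers: the fractional part is discarded
WriteLine((1 : double) / 3);  // the left operand is a double: the fractional part is kept
```

The first line prints 0. The second prints a value close to one third (the decimal separator depends on the system’s regional settings).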

HINT
If you encounter a situation, in which the compiler gives you a type-related error (such as in the preceding section), but you do not understand what the compiler wants from you (what it means, exactly), you might want to specify types explicitly in places related to the error. This will help the compiler determine the location and cause of the error. After you find the error and fix it, you can remove the type specifications (although, you can leave them in — up to you).

NOTE
Situations demanding type specification are very rare, but they exist. They are usually related to code ambiguities, where there are two equally good alternatives and the compiler is unable to decide which one to choose.

Macros

First macro

In the beginning of this article, I mentioned that the first examples might not be very expressive, but we would improve them as we go. In the temperature conversion example I used a nested match operator. Along the way, I noted that our code was less expressive than analogous code in C-like languages and promised to improve it. Now is the time to do this.
To start, let’s compile that program of ours, which displayed a temperature conversion table, forcibly disabling the macros included in the standard library. For this, we need to add the compiler key “-nostdmacros”. Do this and run the program. The command line should look something like this:

ncc -nostdmacros -no-color f2c.n -out:f2c.exe

HINT
By the way, you can simplify compilation by creating a batch file to call the compiler and, should compilation succeed, run the program. If you use Windows, you need to create a file with the extension “.cmd”, such as “c.cmd”, and the following content:

@echo off
ncc -nostdmacros -no-color %1.n -out:%1.exe
if %errorlevel% == 0 %1.exe

The program no longer works correctly? Don’t worry, this is only due to operator priorities, which are specified in the standard library. In order to correct the mistake, you merely have to put the expression “5.0 / 9” in parentheses:

(5.0 / 9) * (fahrenheit - 32)

Do this and try to compile the code again. Does it work? Then let’s move on.
Now we have to create another source file for our macros. Let’s name it “MyMacros.n”.
The macro implementing the operator “&&” from other C-like languages will look like this:

macro @&& (e1, e2) 
{
  <[ 
    match ($e1)
    {
      | true  => $e2
      | false => false
    }
  ]>
}

Write this in MyMacros.n and compile it as a library. The command line should look something like this:

ncc -no-color -target:library MyMacros.n -ref:Nemerle.Compiler.dll -out:MyMacros.dll

What changed in the command line, besides the source file name (MyMacros.n)?
First, we added the key “-target:library”. It tells the compiler to generate a library, instead of an executable file. Second, we changed the output file’s extension (to dll, which stands for Dynamic Link Library). Third, we added the key “-ref:Nemerle.Compiler.dll”, which tells the compiler to add a reference to the “Nemerle.Compiler.dll” library. Nemerle.Compiler.dll is a library implementing the Nemerle compiler. The ncc.exe tool, which we have been using to compile our programs all this time, is merely a command line interface. All of the compiler’s internal logic is implemented in Nemerle.Compiler.dll, which is what ncc.exe uses.
Macros are compiler plugins (optional modules). They use many of the types declared in the compiler. This is why macros need a reference to the compiler library.

NOTE
Notice that this is the first time we use types from an external (requiring an explicit reference) library.

So, if you did everything right, running the aforementioned command should create the file MyMacros.dll in the current folder. This is the assembly containing our first macro.
Now, let’s add this assembly to our temperature conversion program.
In order to do this, add the key “-macros:MyMacros.dll” to the command line compiling the program. The command line should look something like this:

ncc -nostdmacros -no-color f2c.n -macros:MyMacros.dll -out:f2c.exe

HINT
Once again, it would make sense to create a batch file to compile MyMacros.dll, then build the program, then run it. Here is the “c.cmd” batch file for doing this:

@echo off
set PATH=%ProgramFiles%\Nemerle;%PATH%

rem %1 – program file name
rem %2 – macro library name
if "%1" == "" goto usage
if "%2" == "" goto usage

rem Compile the macro library
ncc -no-color -target:library %2.n -ref:Nemerle.Compiler.dll -out:%2.dll
rem Exit on compilation failure
if NOT %errorlevel% == 0 exit /B %errorlevel%

rem Compile the program
ncc -nostdmacros -no-color %1.n  -macros:%2.dll -out:%1.exe
rem Exit on compilation failure
if NOT %errorlevel% == 0 exit /B %errorlevel%


rem Run the program
%1.exe
rem Terminate the batch file
exit /B 0

rem Output batch file usage
:usage
echo usage: c.cmd ProgramFileName MacrosFileName

HINT
The given batch file can be used as follows:

c f2c MyMacros

Now we can use the “&&” macro in our program. Replace the nested match in the following way:

using System.Console;

def fahrenheitToCelsius(fahrenheit)
{
  (5.0 / 9) * (fahrenheit - 32)
}

def loop(fahrenheit)
{
  WriteLine("Fahrenheit: {0, 4} Celsius: {1, 5:##0.0}", fahrenheit, fahrenheitToCelsius(fahrenheit));

  match (fahrenheit >= 0 && fahrenheit < 300)
  {
    | true  => loop(fahrenheit + 20);
    | false => ()
  }
}

loop(0);

Let’s now compile the program. Don’t forget to first compile the macro assembly with the “&&” macro source code. The MyMacros.dll assembly should be in the same folder as the source file f2c.n. The easiest way to do this is to use the batch file shown previously.
Unfortunately, you get an error message:

f2c.n:13:10:13:59: error: in argument #1 of <.s, needed a int, got bool-: 
System.Boolean is not a subtype of System.Int32 [simple require]

The problem is again with operator priorities. In order to get the code to compile, enclose the comparison operators in parentheses:

match ((fahrenheit >= 0) && (fahrenheit < 300))

Now, if you compile the example and run it, it works as it should.
The priority problem can be solved by adding the following lines at the beginning of the MyMacros.n file:

using Nemerle.Internal;

[assembly: OperatorAttribute("Nemerle.Core", "*",  false, 260, 261)]
[assembly: OperatorAttribute("Nemerle.Core", "/",  false, 260, 261)]
[assembly: OperatorAttribute("Nemerle.Core", "+",  false, 240, 241)]
[assembly: OperatorAttribute("Nemerle.Core", "-",  false, 240, 241)]
[assembly: OperatorAttribute("Nemerle.Core", "<",  false, 210, 211)]
[assembly: OperatorAttribute("Nemerle.Core", ">",  false, 210, 211)]
[assembly: OperatorAttribute("Nemerle.Core", "<=", false, 210, 211)]
[assembly: OperatorAttribute("Nemerle.Core", ">=", false, 210, 211)]
[assembly: OperatorAttribute("Nemerle.Core", "==", false, 165, 166)]
[assembly: OperatorAttribute("Nemerle.Core", "!=", false, 165, 166)]
[assembly: OperatorAttribute("Nemerle.Core", "&&", false, 160, 161)]
[assembly: OperatorAttribute("Nemerle.Core", "||", false, 150, 151)]

macro @&& (e1, e2) 
{
  <[ 
    match ($e1)
    {
      | true  => $e2
      | false => false
    }
  ]>
} 

This is a description of operator priorities, set using the global attribute OperatorAttribute. We will talk about attributes in later chapters; for now, they are not important. You only need to understand what the parameters of this attribute mean. The first parameter sets the namespace in which the operators are declared. “Nemerle.Core” is the system namespace, open by default. The second parameter is the operator name. The third says whether the operator is unary (true) or binary (false). A unary operator has only one argument. For example, if we need to change the sign of a number, we write “-x”. The minus in this case is a unary operator. On the other hand, if we are subtracting one number from another (“2 - 1”), then we use a binary operator. The next two parameters set the operand priorities (one for each side of the operator): the lower the number, the lower the priority. This way, we give the lowest priority (150) to the operator “||” and the highest priority (260) to “*” and “/”. Operators that are given the same numbers, such as “*” and “/”, have equal priority.

NOTE
Priority values are not consecutive, in order to let others add priorities for other operators that should go between the existing ones.

Now, if we remove the parentheses we just added to our example and try to compile it, it compiles successfully and produces the familiar table when executed.
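To make the role of the numbers concrete, here is a sketch of how the condition is grouped once the priorities listed above are in effect. It assumes that an “&&” macro (ours or the standard one) is available; the function names are hypothetical.

```nemerle
using System.Console;

def inRange(fahrenheit)
{
  // ">=" and "<" have priority 210, "&&" has 160,
  // so both comparisons are grouped before "&&" is applied.
  fahrenheit >= 0 && fahrenheit < 300
}

def inRangeExplicit(fahrenheit)
{
  // The same grouping, written out with parentheses.
  (fahrenheit >= 0) && (fahrenheit < 300)
}

WriteLine(inRange(100));
WriteLine(inRangeExplicit(100));
WriteLine(inRange(400));
WriteLine(inRangeExplicit(400));
```

Both functions should give the same answer for every argument, since the parentheses only spell out what the priorities already imply.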

Delving into the first macro

Let’s figure out our first macro:

macro @&& (e1, e2) 
{
  <[ 
    match ($e1)
    {
      | true  => $e2
      | false => false
    }
  ]>
}

What is a macro? What is it made of? What does it do?
A macro is a function that receives code fragments as its parameters and returns (generates) another code fragment. The returned fragment is usually made from templates stored in the macro and fragments received as parameters.
The only obvious difference between a macro and a function is that a macro is preceded by the keyword “macro”.
The macro body (traditionally enclosed in a block) can contain any Nemerle expressions, including those using other macros (compiled and available during this macro’s compilation).
In our case, the macro’s body consists of the expression:

<[ 
  match ($e1)
  {
    | true  => $e2
    | false => false
  }
]>

There is some magic in this expression! The code (the match operator, in this case) enclosed in the brackets “<[ ]>” is not interpreted by the compiler as code to be converted into executable bits, but as a sort of code template to be converted into a special internal representation used by the compiler. This representation is called an AST (Abstract Syntax Tree). This is the type of the macro’s parameters and of the value it returns.
Such a template is called a quotation in Nemerle. It is as if we are quoting code. You probably noticed that there are expressions “$e1” and “$e2” in the quotation. These refer to the corresponding macro parameters (“e1” and “e2”). The “$” symbol is needed to tell the compiler that this is not quoted code, but a reference to something outside the quotation. These expressions are substituted into the quotation (in the place they are referenced) and thus form the resulting code.
The program code feeds two expressions into the macro:

fahrenheit >= 0 

and

fahrenheit < 300

They are substituted in place of “$e1” and “$e2” and form the following code:

  match (fahrenheit >= 0)
  {
    | true  => fahrenheit < 300
    | false => false
  }

This is exactly the code that we used to have to write by hand.
This way, one could say that a macro is a kind of syntactic sugar for converting one (usually simpler) syntax into another.
Of course, this is a simplified view of macros, but it is sufficient for us to use the advantages of macros and make our code more terse and expressive.
An attentive reader could notice that two things remained unexplained:
1. What does the “@” symbol in front of “&&” mean?
2. How does the compiler know that the macro declares an operator?
The “@” is a special symbol used by the language to say that a name is just an identifier, not anything special (such as an operator or keyword).
For instance, we cannot name something (such as a function or its parameter) in our program “def” or “match”. But if we begin such a name with the “@” symbol, the compiler will know that it is a name, not a keyword and let us use it. And the name will be stored in program metadata as if it had no prefix “@”.
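As a sketch of this rule, the hypothetical function below has a parameter whose name would otherwise be the keyword “def”. Both the function and the parameter name are invented for this illustration.

```nemerle
using System.Console;

def increment(@def)
{
  // "@def" is an ordinary parameter here, not the "def" keyword
  @def + 1
}

WriteLine(increment(41));
```

Assuming the compiler accepts the escaped name as described, the program should print 42.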

NOTE
Compiling a Nemerle program creates a so-called assembly, which contains not only executable code, but also additional information about the program (type, function description, etc). This information is called “metadata”.

When it comes to operators, first, the compiler treats any sequence of the symbols ‘=’, ‘<’, ‘>’, ‘@’, ‘^’, ‘&’, ‘-’, ‘+’, ‘|’, ‘*’, ‘/’, ‘$’, ‘%’, ‘!’, ‘?’, ‘~’, ‘.’, ‘:’, and ‘#’ as an operator name. Second, the compiler lets you use macros with one or two parameters as operators. This way, you can create operators with word names (such as “and” and “or”).
Any macro can be used as a function (using the function call syntax). This way, we could have used our “&&” macro like this:

...
match (@&&(fahrenheit >= 0,  fahrenheit < 300))
...

The compiler would have understood us.

WARNING
Take care to understand macros correctly!!
It is very important to keep in mind that the macro is not called when the control flow reaches the point in the program where the macro is used, but when the compiler compiles the code!
The macro gets source code as its parameters and forms different source code, which is then compiled into executable code and run during program execution.
It is incorrect to say that a macro is called by a program. It is correct to say that a macro is expanded during program compilation.
Remember this when you talk about macros and you will soon understand what a powerful and convenient tool they are.

In order to get a better understanding of macros, try the following problems:

NOTE
1. Write the macro “||” implementing the “logical or” operation.
2. Add console output above the quotation and see when they are printed. Separate expressions with semicolons, like in normal functions.
3. Add console output, but place it inside the quotation. See when and how many times it is called.

On this note, we finish our first introduction to macros. Nemerle supports several types of macros and offers powerful tools for processing code inside them. To understand why we need them and how they work, we will have to take a closer look at Nemerle itself.

Intermediate summary

It might surprise you, but those language features that we saw are already sufficient for writing fairly non-trivial programs. Of course, it is fairly difficult to write a complicated program with the limited arsenal of operators that I showed you without knowing the libraries. In the subsequent chapters we will address this difficulty. So, don’t get the impression that Nemerle is a limited language. We are just getting started.

References