Semantics of INCLUDE (this has to do with namespaces) #1467

stefjoosten · 2024-02-16T17:33:36Z

Problem

Namespaces require semantics that prepare us to work with distributed systems and allow us to do data migrations. So far, we have generated information systems with one unified namespace. Up to Ampersand version 5.0, the semantics of the INCLUDE statement is set union. To support data migration, we need to support three systems, one of which has an INCLUDE relation with the two others.

Requirements

Proposed solution

In issue #850 we decided to borrow Haskell's module mechanism, with one file per module. Each file starts with a MODULE statement, so let's replace Ampersand's CONTEXT statement with the MODULE statement. Without any INCLUDE statements, Ampersand compiles the entire file into one information system containing a dataset, a schema, and a set of interfaces. So it compiles a module called ${\tt bar}$ to a triple $\langle D_{\tt bar}, S_{\tt bar}, F_{\tt bar}\rangle$. With an INCLUDE statement, we need to define that every identifier in the included module is known in the including module under the prefix " ${\tt bar.}$ ". To define this renaming, we need an operator $\downarrow$, used only for defining the semantics in the compiler:
${\tt x\downarrow y\ =\ x<>}$ "." ${\tt<>y}$
I will overload this operator to work for information systems, datasets, schemas, interface sets, and their constituent elements as well, meaning that $x\downarrow y$ prefixes the name $x$ together with a dot to every identifier in the namespace of $y$. For example, if $y$ contains the name client, then $x\downarrow y$ contains the name x.client on every qualifying occurrence of client in $y$.
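To make the prefixing concrete, here is a minimal sketch of $\downarrow$ on plain strings and on sets of names. It is written in Python for illustration only (the compiler itself is Haskell), and the function names `qualify` and `qualify_all` are hypothetical, not part of Ampersand.

```python
from __future__ import annotations


def qualify(prefix: str, name: str) -> str:
    """x ↓ y for a single identifier: the prefix, a dot, then the name."""
    return f"{prefix}.{name}"


def qualify_all(prefix: str, names: set[str]) -> set[str]:
    """The overloaded form: apply the prefix to every name in a namespace."""
    return {qualify(prefix, name) for name in names}


if __name__ == "__main__":
    print(qualify("x", "client"))  # x.client
```

Overloading for datasets, schemas, and interface sets would then amount to applying `qualify` to every identifier they contain.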

Let ${\tt foo}$ and ${\tt bar}$ be information systems. Each has a dataset, a schema, and zero or more interfaces.
Let $D_{\tt foo}$ and $D_{\tt bar}$ be datasets. Let $S_{\tt foo}$ and $S_{\tt bar}$ be schemas. Let $F_{\tt foo}$ and $F_{\tt bar}$ be sets of interfaces. Now we can define the system ${\tt foo\ INCLUDES\ bar}$ as:

$D_{\tt foo\ INCLUDES\ bar}\ =\ D_{\tt foo}\cup {\tt bar}\downarrow D_{\tt bar}$

$S_{\tt foo\ INCLUDES\ bar}\ =\ S_{\tt foo}\cup {\tt bar}\downarrow S_{\tt bar}$

$F_{\tt foo\ INCLUDES\ bar}\ =\ F_{\tt foo}\cup {\tt bar}\downarrow F_{\tt bar}$

For the datasets, this means that all relation names and concept names in ${\tt bar}$ are prefixed with ${\tt bar}$. Atoms are left alone. In the schema of ${\tt bar}$, all rule names, relation names, concept names, pattern names, and view names are prefixed with ${\tt bar}$. All rule names, relation names, concept names, and interface names from $F_{\tt bar}$ are prefixed with ${\tt bar}$.
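The three equations can be sketched as one function over a simplified model. Everything here is an assumption made for illustration: a dataset is a dictionary from relation name to a set of atom pairs, schemas and interface sets are plain sets of names, and `System`, `prefix_system`, and `include` are hypothetical names. The sketch is in Python rather than the compiler's Haskell, and it assumes the two name sets do not clash.

```python
from dataclasses import dataclass


@dataclass
class System:
    dataset: dict     # relation name -> set of (source atom, target atom) pairs
    schema: set       # names declared in the schema
    interfaces: set   # interface names


def prefix_system(prefix, system):
    """bar ↓ <D, S, F>: prefix every name; atoms are left alone."""
    return System(
        dataset={f"{prefix}.{rel}": set(pairs) for rel, pairs in system.dataset.items()},
        schema={f"{prefix}.{name}" for name in system.schema},
        interfaces={f"{prefix}.{name}" for name in system.interfaces},
    )


def include(foo, bar_name, bar):
    """foo INCLUDES bar: component-wise union of foo with the prefixed bar.

    Assumes no name of foo coincides with a prefixed name of bar.
    """
    qualified = prefix_system(bar_name, bar)
    return System(
        dataset={**foo.dataset, **qualified.dataset},
        schema=foo.schema | qualified.schema,
        interfaces=foo.interfaces | qualified.interfaces,
    )
```

For example, including a `bar` whose dataset holds a relation `account` yields a relation `bar.account` with exactly the same atom pairs.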

Name clashes can still occur. If, for example, system ${\tt foo}$ contains a name bar.account and ${\tt bar}$ contains a name account, the dataset $D_{\tt foo\ INCLUDES\ bar}$ has a name clash. We will forbid that, so the union is guaranteed to be disjoint.
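A minimal sketch of how such a clash could be rejected, assuming namespaces are modelled as plain sets of names. The function `disjoint_union` and its error message are hypothetical; the sketch is in Python, not the compiler's Haskell.

```python
def disjoint_union(foo_names, bar_prefix, bar_names):
    """Union of foo's names with bar's prefixed names; reject any overlap."""
    qualified = {f"{bar_prefix}.{name}" for name in bar_names}
    clashes = foo_names & qualified
    if clashes:
        raise ValueError(f"name clash on INCLUDE: {sorted(clashes)}")
    return foo_names | qualified
```

With `foo_names = {"bar.account"}` and `bar_names = {"account"}` this raises, which is the situation described above.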

Alias

In the current implementation, two relation declarations with the same name, source, and target are treated as the same relation. I don't mind keeping that behaviour, but it does not work across the INCLUDE mechanism (because we forbid name clashes). I propose to make this explicit with an ALIAS statement, for example:

ALIAS client, bar.client

This statement presumes that aliases have the same type, or else we get type errors. Needless to say, the ALIAS statement can also work inside one namespace. It is not linked to the INCLUDE mechanism. Aliasing works for concepts and relations, but not for other named entities.
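One way to picture ALIAS is as an equivalence on names that is only allowed between names of the same type. The sketch below is an assumption-laden illustration in Python (not the compiler's Haskell): types are treated as opaque values, and the class `Aliases` with its methods is hypothetical.

```python
class Aliases:
    """Tracks which declared names have been aliased to each other."""

    def __init__(self, types):
        self.types = dict(types)                   # name -> opaque type
        self.parent = {name: name for name in types}

    def find(self, name):
        """Return the representative of the alias class containing name."""
        while self.parent[name] != name:
            name = self.parent[name]
        return name

    def alias(self, a, b):
        """ALIAS a, b: merge the two alias classes, or fail with a type error."""
        if self.types[a] != self.types[b]:
            raise TypeError(f"cannot alias {a} and {b}: types differ")
        self.parent[self.find(b)] = self.find(a)

    def same(self, a, b):
        return self.find(a) == self.find(b)
```

Because the check only compares types, the same code covers an alias inside one namespace and an alias across an INCLUDE.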

Consequences

This mechanism excludes cyclic INCLUDE-dependencies. I expect the proposed mechanism to meet the requirements of the migration mechanism, but I will leave that to @sjcjoosten to verify. I hope that this include-relation between information systems is transitive. If not, I would like to fix that, so we can draw an include-graph of the system.
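As an illustration of the include-graph, the sketch below computes everything a module includes, directly or indirectly, and rejects cycles. It assumes the graph is given as a dictionary from module name to the modules it includes. The function name is hypothetical, and the sketch is in Python rather than the compiler's Haskell.

```python
def transitive_includes(graph, module, _path=()):
    """All modules reachable from module via INCLUDE; raise on a cycle."""
    if module in _path:
        cycle = " -> ".join(_path + (module,))
        raise ValueError(f"cyclic INCLUDE: {cycle}")
    reached = set()
    for dep in graph.get(module, ()):
        reached.add(dep)
        reached |= transitive_includes(graph, dep, _path + (module,))
    return reached
```

For a graph where `foo` includes `bar` and `bar` includes `baz`, the result for `foo` is `{"bar", "baz"}`, which is the transitive reading hoped for above.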

If module ${\tt foo}$ includes module ${\tt bar}$, we currently implement both ${\tt foo}$ and ${\tt bar}$ on the same database. For distributed systems, we will have to allow them to be implemented on different databases. I suggest we do that in another issue.


hanjoosten · 2024-02-17T15:14:05Z

I don't get this. What problem is there to be solved?

stefjoosten · 2024-02-18T07:06:49Z

The problem is that we have no agreed-upon semantics of the namespace stuff. So how are we going to build it first-time-right?

stefjoosten assigned sjcjoosten Feb 16, 2024

stefjoosten changed the title ~~Multiple datasets~~ Semantics of INCLUDE (this has to do with namespaces) Feb 16, 2024

stefjoosten assigned hanjoosten and Michiel-s Feb 16, 2024
