# Identity, Immutability, and .NET Polyglot Notebooks

Michael L Perry
https://michaelperry.net

Learning objectives:

- Learn to create documentation in Polyglot Notebooks
- Understand how immutable data structures work in .NET
- Recognize the benefits of immutable data structures
- Differentiate between intrinsic and extrinsic identity
- Convert a database to use intrinsic identity
- Model a system using immutable records
- Project immutable history into current state
- Determine how state changes as history evolves

Follow along at https://github.com/michaellperry/Immutability

## Create documentation in Polyglot Notebooks

Install Visual Studio Code from https://code.visualstudio.com/.
Search the extensions marketplace for "Polyglot Notebooks".

Create a new file using the "Polyglot Notebook: Create new blank notebook" command.
Choose the `.ipynb` file extension.

Create Markdown and Code cells.
Click the language selector in the bottom right corner of the cell to change the language.

Type some code in a code cell.
Press Ctrl+Enter to run the code.
If the last line of code is an expression, the result is displayed below the cell.

## Immutable data structures in .NET

Immutability is the conscious decision to not change data.
It is a design choice that has many benefits.

In .NET, we can create immutable data structures using records.

In [1]:
record Person(string name, DateTime dateOfBirth) {}

var michael = new Person("Michael Perry", new DateTime(1971, 5, 10));

michael

| Property | Value |
| --- | --- |
| name | Michael Perry |
| dateOfBirth | 1971-05-10 00:00:00Z |


In [2]:
int AgeOf(Person person)
{
  var today = DateTime.Today;
  var age = today.Year - person.dateOfBirth.Year;
  if (person.dateOfBirth > today.AddYears(-age)) age--;
  return age;
}

AgeOf(michael)
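
Records also give us two conveniences that suit immutable data: a `with` expression that builds a modified copy, and equality based on values.
Here is a minimal sketch, assuming a hypothetical `PersonSketch` record that mirrors `Person` so that the cell runs on its own.

```csharp
using System;

// PersonSketch is a hypothetical copy of the Person record above.
record PersonSketch(string name, DateTime dateOfBirth);

var before = new PersonSketch("Michael Perry", new DateTime(1971, 5, 10));

// A with-expression builds a new record; the original is left untouched.
var after = before with { name = "Michael L Perry" };

// Records compare by value, so an identical copy equals the original.
var copy = before with { };

Console.WriteLine(before.name);      // Michael Perry
Console.WriteLine(after.name);       // Michael L Perry
Console.WriteLine(before == copy);   // True
Console.WriteLine(before == after);  // False
```

The positional properties are init-only, so an assignment such as `before.name = "..."` does not compile.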

.NET also provides immutable collections.
A useful one is `ImmutableArray<T>`.

In [3]:
using System.Collections.Immutable;

var letters = ImmutableArray.Create('a', 'b', 'c');

letters

You cannot change an immutable collection.
Calling `Add` returns a new collection and leaves `letters` as it was.

In [4]:
letters.Add('d');

letters

Instead, you create new immutable collections from existing ones.

In [5]:
var moreLetters = letters.Add('d');

moreLetters
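
The two cells above can be combined into one check.
This is a small sketch with variable names of my own choosing (`original`, `extended`), comparing the lengths before and after.

```csharp
using System;
using System.Collections.Immutable;

var original = ImmutableArray.Create('a', 'b', 'c');

// The return value is discarded, so nothing observable happens.
original.Add('d');

// Keeping the return value gives us a second, longer array.
var extended = original.Add('d');

Console.WriteLine(original.Length);  // 3
Console.WriteLine(extended.Length);  // 4
```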

## Tic-Tac-Toe Example

One of the best reasons to use immutable data structures is to search a space.
For example, in a game of Tic-Tac-Toe, we can search the space of all possible moves.
I've created a Tic-Tac-Toe API that uses immutable data structures.

In [6]:
#r "TicTacToe\bin\Debug\net7.0\TicTacToe.dll"

using TicTacToe;

In [7]:
var game = Game.Empty
  .Play(4);

game.Html

In [8]:
game = game.Play(1);

game.Html

Because we are using immutable data structures, I can produce the next state of the game without destroying the previous state.
This makes it easier to search the space of all possible moves.

In [9]:
game.EmptySquares.Select(square => game.Play(square).Html)

Output: seven Tic-Tac-Toe boards (index 0 through 6), one for each empty square where X could play next.
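
The same branching can be sketched without the Tic-Tac-Toe API.
This example assumes a hypothetical board stored as an `ImmutableArray<char>` of nine squares, which is not how `Game` is necessarily implemented.

```csharp
using System;
using System.Collections.Immutable;
using System.Linq;

// A hypothetical board: nine squares, ' ' meaning empty (not the Game type from TicTacToe.dll).
var start = ImmutableArray.CreateRange(Enumerable.Repeat(' ', 9))
    .SetItem(4, 'X')
    .SetItem(1, 'O');

// One candidate board per empty square. SetItem returns a new array each time.
var candidates = Enumerable.Range(0, 9)
    .Where(square => start[square] == ' ')
    .Select(square => start.SetItem(square, 'X'))
    .ToList();

Console.WriteLine(candidates.Count);                  // 7
Console.WriteLine(start.Count(mark => mark != ' '));  // 2, the starting board is unchanged
```

Every candidate shares the same starting point, and none of them can interfere with another.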


We can write a function that evaluates a game and predicts who will win.
If the game is an immediate win, then we return the winner.
If there are no more moves, then we return a draw.
Otherwise, we recursively evaluate all possible moves for the next player.

If one of those is a win for the current player, then we assume they will make that winning move.
If there is no win, but there is a draw, then we assume the current player will make that move.
Otherwise, we assume the opponent will win.

In [11]:
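// Predict the winner of a game, assuming both players choose their best move.
// Symbol.Empty represents a draw.
Symbol Evaluate(Game game)
{
  // An immediate win ends the search.
  var winner = game.Winner;
  if (winner != Symbol.Empty)
    return winner;
  // No winner and no moves left: a draw.
  if (!game.EmptySquares.Any())
    return Symbol.Empty;

  // Evaluate every possible move for the next player.
  var outcomes = game.EmptySquares
    .Select(square => game.Play(square))
    .Select(nextGame => Evaluate(nextGame))
    .ToImmutableArray();
  // The next player takes a win if there is one, and otherwise a draw.
  if (outcomes.Any(outcome => outcome == game.NextPlayer))
    return game.NextPlayer;
  if (outcomes.Any(outcome => outcome == Symbol.Empty))
    return Symbol.Empty;
  // Every remaining outcome is a win for the opponent.
  return outcomes.First();
}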

game.EmptySquares.Select(square => game.Play(square))
  .Select(nextGame => nextGame.HtmlWithOutcome(Evaluate(nextGame)))

Output: the same seven boards, each labeled with its predicted outcome.

| index | predicted outcome |
| --- | --- |
| 0 | X wins |
| 1 | X wins |
| 2 | X wins |
| 3 | X wins |
| 4 | X wins |
| 5 | Draw |
| 6 | X wins |
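
The source of `TicTacToe.dll` is not shown here, so the following is only a sketch of how such a search could be made self-contained.
`BoardSketch` is a hypothetical stand-in for `Game`, and a `char` takes the place of `Symbol`, with `' '` meaning both an empty square and a draw.

```csharp
using System;
using System.Collections.Generic;
using System.Collections.Immutable;
using System.Linq;

// BoardSketch is a hypothetical stand-in for Game from TicTacToe.dll.
record BoardSketch(ImmutableArray<char> Squares)
{
    public static readonly BoardSketch Empty =
        new(ImmutableArray.CreateRange(Enumerable.Repeat(' ', 9)));

    static readonly int[][] Lines =
    {
        new[] { 0, 1, 2 }, new[] { 3, 4, 5 }, new[] { 6, 7, 8 },
        new[] { 0, 3, 6 }, new[] { 1, 4, 7 }, new[] { 2, 5, 8 },
        new[] { 0, 4, 8 }, new[] { 2, 4, 6 },
    };

    // X moves first, so X is next whenever an even number of squares is filled.
    public char NextPlayer => Squares.Count(s => s != ' ') % 2 == 0 ? 'X' : 'O';

    public IEnumerable<int> EmptySquares =>
        Enumerable.Range(0, 9).Where(i => Squares[i] == ' ');

    // Playing returns a new board; this one is never modified.
    public BoardSketch Play(int square) =>
        new(Squares.SetItem(square, NextPlayer));

    public char Winner
    {
        get
        {
            foreach (var line in Lines)
            {
                var mark = Squares[line[0]];
                if (mark != ' ' && mark == Squares[line[1]] && mark == Squares[line[2]])
                    return mark;
            }
            return ' ';
        }
    }
}

// Same strategy as Evaluate above: prefer a win, then a draw, else concede.
char EvaluateSketch(BoardSketch board)
{
    if (board.Winner != ' ') return board.Winner;
    if (!board.EmptySquares.Any()) return ' ';

    var outcomes = board.EmptySquares
        .Select(square => EvaluateSketch(board.Play(square)))
        .ToImmutableArray();
    if (outcomes.Contains(board.NextPlayer)) return board.NextPlayer;
    if (outcomes.Contains(' ')) return ' ';
    return outcomes.First();
}

// The position from the cells above: X in the center, O in square 1.
var position = BoardSketch.Empty.Play(4).Play(1);
Console.WriteLine(EvaluateSketch(position)); // X
```

The recursion never has to undo a move, because `Play` leaves the board it was called on untouched.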


## Identity

When using a SQL database, we often use a primary key to identify a row.
That primary key is often a number that is incremented for each new row.

In [18]:
#r "nuget:Microsoft.DotNet.Interactive.SqlServer,*-*"

#!connect mssql --kernel-name school "Persist Security Info=False; Integrated Security=true; Initial Catalog=School; Server=localhost; TrustServerCertificate=True;"

Kernel added: #!sql-school

In [25]:
#!sql-school
-- Create the Course table if it doesn't exist
IF NOT EXISTS (SELECT * FROM sysobjects WHERE name='Course' and xtype='U')

BEGIN
  CREATE TABLE Course (
    CourseID int IDENTITY(1,1) PRIMARY KEY,
    Code nvarchar(10) NOT NULL,
    Title nvarchar(100) NOT NULL,
    Credits int NOT NULL
  );
END

Commands completed successfully.

Imagine that we have an API that creates a new course whenever the caller POSTs a request.
We use this SQL to insert new rows into this table.

In [29]:
#!sql-school
-- Create a new course with identifier CPSC-301
INSERT INTO Course (Code, Title, Credits)
VALUES ('CPSC-301', 'Analysis of Algorithms', 3);

(1 row affected)

Here's the set of courses that we have in our database.

In [42]:
#!sql-school
SELECT * FROM Course;

(1 row affected)

| CourseID | Code | Title | Credits |
| --- | --- | --- | --- |
| 1 | CPSC-301 | Analysis of Algorithms | 3 |


What happens if that API call times out?
We might retry the call, and then we would have two rows for the same course.
Try running that SQL again and see what happens.

How can we prevent duplication?
The answer has to do with identity.

### Intrinsic and extrinsic identity

The primary key is not part of the data.
Notice that it was not included in the INSERT statement.
It didn't come from the caller.
This is extrinsic identity.

Maybe there's an identity that is intrinsic to the data.
How about the course code?

To document and enforce this, we create a unique index on the course code.

In [35]:
#!sql-school
TRUNCATE TABLE Course;

CREATE UNIQUE INDEX IX_Course_Code ON Course (Code);

Commands completed successfully.

With this index in place, our original INSERT statement will fail if we try to insert the same course code twice.
It's better to let this INSERT statement succeed but do nothing if the course code already exists, so that a retried request is harmless.

In [41]:
#!sql-school
INSERT INTO Course (Code, Title, Credits)
SELECT 'CPSC-301', 'Analysis of Algorithms', 3
WHERE NOT EXISTS (SELECT 1 FROM Course WHERE Code = 'CPSC-301');

(0 rows affected)
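
The difference between the two INSERT statements can be simulated in memory.
This is a sketch only: the list and dictionary stand in for the table, and the helper names `InsertWithGeneratedId` and `InsertIfAbsent` are hypothetical.

```csharp
using System;
using System.Collections.Generic;

// Extrinsic identity: the store assigns the next ID, so every call adds a row.
var rowsById = new List<(int CourseId, string Code, string Title, int Credits)>();
void InsertWithGeneratedId(string code, string title, int credits) =>
    rowsById.Add((rowsById.Count + 1, code, title, credits));

InsertWithGeneratedId("CPSC-301", "Analysis of Algorithms", 3);
InsertWithGeneratedId("CPSC-301", "Analysis of Algorithms", 3); // retry after a timeout

// Intrinsic identity: the course code is the key, so a repeat call changes nothing.
var rowsByCode = new Dictionary<string, (string Title, int Credits)>();
bool InsertIfAbsent(string code, string title, int credits) =>
    rowsByCode.TryAdd(code, (title, credits));

var firstAttempt = InsertIfAbsent("CPSC-301", "Analysis of Algorithms", 3);
var retryAttempt = InsertIfAbsent("CPSC-301", "Analysis of Algorithms", 3);

Console.WriteLine(rowsById.Count);    // 2
Console.WriteLine(rowsByCode.Count);  // 1
Console.WriteLine(retryAttempt);      // False
```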

The course code is an intrinsic identity.
Since it identifies the course, we should not be able to change it.
This record has some parts that are immutable -- the course code -- and some parts that are mutable -- the title and credits.

See where this is going?

## Modeling a system using immutable records

Let's split the record apart.
The parts that are immutable stay with the Course, and are its intrinsic identity.
We can do this with a C# record.

In [43]:
record Course(string code) {}

The parts that are mutable become separate C# records.

In [44]:
record CourseTitle(Course course, string title) {}
record CourseCredits(Course course, int credits) {}

Let's bring in a library that lets us query these records using LINQ.

In [46]:
#r "nuget:Jinaga, 0.4.0"
#r "nuget:Jinaga.Graphviz, 0.4.0"

using Jinaga;
using Jinaga.Graphviz;

In Jinaga, these immutable records are called Facts.

In [48]:
[FactType("School.Course")]
record Course(string code) {}

[FactType("School.CourseTitle")]
record CourseTitle(Course course, string title) {}

[FactType("School.CourseCredits")]
record CourseCredits(Course course, int credits) {}

Renderer.RenderTypes(typeof(Course), typeof(CourseTitle), typeof(CourseCredits))

Let's create a few fact instances.

In [49]:
var j = JinagaClient.Create();

var ananysisOfAlgorithms = await j.Fact(new Course("CPSC-301"));
var aatitle = await j.Fact(new CourseTitle(ananysisOfAlgorithms, "Analysis of Algorithms"));
var aacredits = await j.Fact(new CourseCredits(ananysisOfAlgorithms, 3));

Renderer.RenderFacts(ananysisOfAlgorithms, aatitle, aacredits)

We can find the title of the course using LINQ.

In [51]:
var titlesOfCourse = Given<Course>.Match((course, facts) =>
  facts.OfType<CourseTitle>()
    .Where(title => title.course == course)
    .Select(title => title.title)
);

var aatitles = await j.Query(ananysisOfAlgorithms, titlesOfCourse);
aatitles

Now let's change the title of the course.

In [52]:
var aatitle2 = await j.Fact(new CourseTitle(ananysisOfAlgorithms, "Algorithms and Data Structures"));

Renderer.RenderFacts(ananysisOfAlgorithms, aatitle, aatitle2)

This course now has two titles.
Go back and run the LINQ query again.
Which one is correct?

### Modeling replacement

To figure out which one is correct, we need to know which one replaced the other.
We can do this by adding a `prior` property to the fact.

In [55]:
[FactType("School.CourseTitle")]
record CourseTitle(Course course, string title, CourseTitle[] prior) {}

[FactType("School.CourseCredits")]
record CourseCredits(Course course, int credits, CourseCredits[] prior) {}

Renderer.RenderTypes(typeof(Course), typeof(CourseTitle), typeof(CourseCredits))

Now we can indicate that the second title replaces the first.

In [59]:
j = JinagaClient.Create();
var aatitle = await j.Fact(new CourseTitle(ananysisOfAlgorithms, "Analysis of Algorithms", new CourseTitle[] { }));
var aatitle2 = await j.Fact(new CourseTitle(ananysisOfAlgorithms, "Algorithms and Data Structures", new CourseTitle[] { aatitle }));

Renderer.RenderFacts(ananysisOfAlgorithms, aatitle, aatitle2)

Now we can clearly see that the second title is the correct one.
It is the one at the bottom of the graph.

We can modify the LINQ query to find the title that has no successor.

In [60]:
var titlesOfCourse = Given<Course>.Match((course, facts) =>
  facts.OfType<CourseTitle>()
    .Where(title => title.course == course &&
      !facts.Any<CourseTitle>(next => next.prior.Contains(title)))
    .Select(title => title.title)
);

var aatitles = await j.Query(ananysisOfAlgorithms, titlesOfCourse);
aatitles
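
To see what the "no successor" rule computes, here is a sketch in plain C# with no Jinaga involved.
The record and function names (`CourseSketch`, `TitleSketch`, `CurrentTitles`) are hypothetical, and the in-memory array stands in for the fact store.

```csharp
using System;
using System.Linq;

// Plain C# stand-ins for the facts above (hypothetical names, no Jinaga involved).
record CourseSketch(string Code);
record TitleSketch(CourseSketch Course, string Title, TitleSketch[] Prior);

// The current titles are those that no other title lists as a prior.
string[] CurrentTitles(CourseSketch course, TitleSketch[] history) =>
    history
        .Where(title => title.Course == course &&
            !history.Any(next => next.Prior.Contains(title)))
        .Select(title => title.Title)
        .ToArray();

var algorithms = new CourseSketch("CPSC-301");
var original = new TitleSketch(algorithms, "Analysis of Algorithms", Array.Empty<TitleSketch>());
var renamed = new TitleSketch(algorithms, "Algorithms and Data Structures", new[] { original });

Console.WriteLine(string.Join(", ", CurrentTitles(algorithms, new[] { original })));
// Analysis of Algorithms
Console.WriteLine(string.Join(", ", CurrentTitles(algorithms, new[] { original, renamed })));
// Algorithms and Data Structures
```

Nothing was updated or deleted; the current title is a projection of the history.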

## Determine how state changes as history evolves

One of the things that immutable data structures are good for is searching a space, as we saw with Tic-Tac-Toe.
We can search the space of specifications to find out how the state of the system changes as we add new facts.
Let's start by looking at the specification that we just wrote.

In [61]:
titlesOfCourse.ToString()

(course: School.Course) {
    title: School.CourseTitle [
        title->course: School.Course = course
        !E {
            next: School.CourseTitle [
                next->prior: School.CourseTitle = title
            ]
        }
    ]
} => title.title


Now let's look at its inverses.

In [62]:
titlesOfCourse.ComputeInverses().Select(inverse => inverse.InverseSpecification.ToString())

When we add a new course title, we have to do two things.
First, we add the new course title.
Second, we remove the old course title.
These two inverses describe how that happens.
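
The effect of those two steps on the projected state can be sketched as follows.
This is not how Jinaga applies inverses internally; `TitleChange` and `Apply` are hypothetical names used only to illustrate "add the new, remove the old".

```csharp
using System;
using System.Collections.Immutable;

// TitleChange is a hypothetical shape for a new title fact: its text plus the titles it replaces.
record TitleChange(string Title, ImmutableArray<string> PriorTitles);

// Add the new title and remove the ones it replaces; the input set is not modified.
ImmutableHashSet<string> Apply(ImmutableHashSet<string> current, TitleChange change) =>
    current.Except(change.PriorTitles).Add(change.Title);

var none = ImmutableHashSet<string>.Empty;
var afterFirst = Apply(none,
    new TitleChange("Analysis of Algorithms", ImmutableArray<string>.Empty));
var afterSecond = Apply(afterFirst,
    new TitleChange("Algorithms and Data Structures", ImmutableArray.Create("Analysis of Algorithms")));

Console.WriteLine(string.Join(", ", afterFirst));   // Analysis of Algorithms
Console.WriteLine(string.Join(", ", afterSecond));  // Algorithms and Data Structures
```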

This is all possible because specifications are also immutable data structures.