# Demo: Data Types
All data in a computer program has a **type**, which is part of a type system that classifies data and constrains operations. Besides (sort of) standardizing how things are stored, type systems enable detection of mismatches (e.g., passing a string where a number is expected) before or during execution.

**Type System Approaches:**
* **Dynamically typed languages** (e.g., [Ruby](https://www.ruby-lang.org/en/) or [Python](https://www.python.org/)) perform type checks at runtime, offering flexibility but deferring error detection.
* **Statically typed languages** (e.g., [C](https://gcc.gnu.org/) or [Java](https://www.java.com/en/)) perform type checks at compile-time, usually based on explicit declarations, enabling optimization and early error prevention.
* **Julia's hybrid approach** combines the best of both worlds: it's dynamically typed like Python but performs type inference and generates specialized machine code like C. While dynamically typed at the language level, Julia's just-in-time compiler generates optimized machine code specialized to the concrete types of each call's arguments.
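
To make the hybrid idea concrete, here is a minimal sketch. The function name `double` is hypothetical (it is not part of Julia or of this demo), and `Base.return_types` is a compiler introspection helper rather than something you would call in everyday code:

```julia
# `double` is a hypothetical helper used only for this illustration
double(x) = 2x

# Dynamic typing: the same variable can be rebound to values of different types
value = 3
type_before = typeof(value)      # an integer type
value = 3.0
type_after = typeof(value)       # a floating-point type

# One generic definition, called with two different argument types
int_result = double(3)
float_result = double(3.0)

# Ask the compiler which return type it infers for each concrete signature
inferred_for_int = Base.return_types(double, (Int64,))
inferred_for_float = Base.return_types(double, (Float64,))

println(type_before, " -> ", type_after)
println(inferred_for_int, " ", inferred_for_float)
```

No type annotations were written, yet the compiler can determine a concrete return type for each argument type.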

**What we'll cover:** We'll explore primitive types (numbers, text), collection types (arrays, tuples, sets and dictionaries), and custom composite types, examining their memory representation and manipulation in Julia.
___

## Primitive Data Types
Primitive data types are the basic building blocks provided by programming languages. They're atomic, meaning they're not composed of other types, and represent simple values like numbers, characters, and true/false values.

>**Julia vs. Python storage:**
>Julia's primitive types (`Int64`, `Float64`, `Bool`, `Char`) are stored as _unboxed machine types_ that the compiler uses directly.
> Python's built-in types (int, float, bool) are stored as heap-allocated [PyObject instances](https://docs.python.org/3/c-api/structures.html#c.PyObject), which adds significant overhead because each value requires extra memory for metadata, pointer indirection to access the actual data, and garbage collection tracking.
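
As a quick check of the Julia side of this comparison, the sketch below asks Julia how it classifies a few built-in types. It only uses the standard functions `isprimitivetype`, `isbitstype`, and `sizeof`; the variable names are made up for illustration:

```julia
# Check how Julia classifies a few built-in types
for T in (Int64, Float64, Bool, Char)
    println(rpad(string(T), 8), " primitive: ", isprimitivetype(T),
            "  isbits: ", isbitstype(T), "  bytes: ", sizeof(T))
end

# Summaries of the checks above
all_primitive = all(isprimitivetype, (Int64, Float64, Bool, Char))
all_bits = all(isbitstype, (Int64, Float64, Bool, Char))

# String is not a plain-bits type; its data lives behind a reference
string_is_bits = isbitstype(String)
```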

Let's start with integers and [the `Bool` type](https://docs.julialang.org/en/v1/base/numbers/#Core.Bool), which represents a boolean value, either `true` or `false`.

### Integer and Boolean Types
An __integer__ type represents whole numbers $x\in\mathbb{Z}$ (positive, negative and zero). Integers are implemented using a _fixed‐width_ binary form (e.g., 32‐ or 64‐bit), so only a finite range of values can be represented. A __boolean (Bool)__ type represents truth values (`true` and `false`); although a single bit would suffice in principle, Julia stores a `Bool` as 1 $\times$ byte, i.e., an 8-bit integer value.
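
A small sketch of what "fixed-width" means in practice; the variable names are illustrative only:

```julia
# Storage size in bytes of a few fixed-width types
int32_bytes = sizeof(Int32)   # 4
int64_bytes = sizeof(Int64)   # 8
bool_bytes = sizeof(Bool)     # 1

# Fixed width implies a finite range ...
largest = typemax(Int64)
smallest = typemin(Int64)

# ... and arithmetic that passes the end of the range wraps around silently
wrapped = largest + 1
println(wrapped == smallest)
```

Julia does not check integer arithmetic for overflow by default, which is why the last line wraps instead of raising an error.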

__Integers__: Let's look at some integers.

In [4]:
x = 2; # select a whole number ... -2, -1, 0, 1, 2, ...

_What type is `x` in Julia?_ We can find the type of something using [the `typeof(...)` method](https://docs.julialang.org/en/v1/base/base/#Core.typeof):

In [6]:
typeof(x) # this returns the type of the argument

Int64

What is the bitstring of $x$? The bitstring gives the literal bit representation of a primitive type. We can get the bitstring of $x$ using [the `bitstring(...)` method in Julia](https://docs.julialang.org/en/v1/base/numbers/#Base.bitstring). This method shows us the bit pattern of the integer argument:

In [8]:
bitstring(x) # gives the bit values stored in the memory "boxes"

"0000000000000000000000000000000000000000000000000000000000000010"

__Boolean__: Now let's look at some Boolean types $\mathbb{B} = \left\{\text{true},\text{false}\right\}$. A variable $x\in\mathbb{B}$ can take on either `true` or `false` values. However, we typically model these values as a special type of 8$\times$bit (1 $\times$ byte) integer value.

In [10]:
flag = true; # the flag variable can take on values of {true | false}

_What type is `flag` in Julia?_ We can find the type of something using [the `typeof(...)` method](https://docs.julialang.org/en/v1/base/base/#Core.typeof):

In [12]:
typeof(flag)

Bool

_What is the bitstring of the `flag` variable?_ Let's use [the `bitstring(...)` method in Julia](https://docs.julialang.org/en/v1/base/numbers/#Base.bitstring) to see what our Boolean `flag` variable looks like:

In [14]:
bitstring(flag) # this should be 8xbits wide

"00000001"

___

### Floating point types
Floating-point types model real numbers using three components according to [the IEEE 754 standard](https://en.wikipedia.org/wiki/IEEE_754): a sign bit, an exponent (the scale), and a significand (the fraction). This allows representation of both fractional values and very large or small magnitudes.
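
The three components can be pulled apart by hand. The sketch below assumes the IEEE 754 binary64 layout (1 sign bit, 11 exponent bits with a bias of 1023, 52 fraction bits) and only handles normal numbers; the variable names are illustrative:

```julia
x = 54.13
bits = bitstring(x)                 # 64 characters for a Float64

# Split the pattern into the three IEEE 754 binary64 fields
sign_bit = bits[1:1]                # 1 bit
exponent_bits = bits[2:12]          # 11 bits
fraction_bits = bits[13:64]         # 52 bits

# Decode each field (valid for normal numbers, not zero, subnormals, Inf or NaN)
sign = sign_bit == "0" ? 1.0 : -1.0
exponent = parse(Int64, exponent_bits; base = 2) - 1023      # remove the bias
fraction = parse(Int64, fraction_bits; base = 2) / 2.0^52    # value in [0, 1)

# value = sign × (1 + fraction) × 2^exponent
reconstructed = sign * (1 + fraction) * 2.0^exponent
println(exponent, " ", reconstructed == x)
```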

> __Julia versus Python floating point numbers__: Julia provides three standard IEEE-754 floating-point types that trade off precision for storage: `Float16` (half-precision), `Float32` (single-precision), and `Float64` (double-precision). Python's built-in `float` type is always 64-bit double-precision.

Let's look at a couple of examples. First, here's a 64-bit number (Julia's default):

In [None]:
let
    x = 54.13; # default: in Julia, the default floating point number is 64-bit.
    bitstring(x)
end

"0100000001001011000100001010001111010111000010100011110101110001"

The same numerical value stored in 32-bits has a different memory layout:

In [None]:
let
    x = 54.13 |> Float32 # we "cast" x to be stored as a Float32, not 64-bits!
    bitstring(x) # gives us a string with the bit pattern
end

"01000010010110001000010100011111"

___

### Character Types
Text on computers is composed of characters, and here's the key insight: characters are just special integers! Each character, whether a letter, digit, punctuation mark, or control code, gets stored as an integer corresponding to a specific encoding scheme. Traditional systems used [ASCII](https://en.wikipedia.org/wiki/ASCII) with one byte per character, while modern systems use [Unicode encodings like UTF-8 or UTF-16](https://en.wikipedia.org/wiki/Unicode) to represent a much wider range of characters.

> __Encodings:__ Character encodings define the mapping between textual symbols and numeric code points (unique integers), enabling text to be stored and transmitted as bytes. Julia's [Char type](https://docs.julialang.org/en/v1/base/strings/#Core.Char) follows this pattern: each character is a 4-byte (32-bit) value that represents a Unicode code point (converting a `Char` to `UInt32` yields that code point), allowing manipulation either individually or as part of Strings.

Let's explore [the `Char` type in Julia](https://docs.julialang.org/en/v1/base/strings/#Core.Char) (notice the single quotes):

In [49]:
c = '🍣' # example Unicode character in Julia. See: https://docs.julialang.org/en/v1/manual/unicode-input/

'🍣': Unicode U+1F363 (category So: Symbol, other)

What is the code point (special, unique integer) for the character `c`?

In [62]:
code = UInt32(c) # extract code point as UInt32 (4 x bytes)

0x0001f363

_Hmmm, what?_ That's a strange-looking integer! The `code::UInt32` is a [hexadecimal number](https://en.wikipedia.org/wiki/Hexadecimal), i.e., a number written in base 16. The giveaway (which is a convention) is the `0x` prefix. We'll dig into these numbers and count in base b later. 
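
To see that the hexadecimal output is an ordinary integer, here is a short sketch that converts between characters, code points, and their decimal and hexadecimal spellings. The variable names are illustrative:

```julia
c = '🍣'
code = UInt32(c)                        # the code point as an unsigned 32-bit integer

# The hexadecimal literal and the decimal number are the same value
decimal_value = Int(code)               # 127843
hex_digits = string(code, base = 16)    # "1f363"

# Conversions work in both directions
same_char = Char(0x0001f363)
letter_code = Int('a')                  # 97

# In a String, the character is stored as UTF-8 code units (bytes)
n_bytes = ncodeunits("🍣")
```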

Can we see the data that each byte contains? Yes! Let's use [the `reinterpret(...)` method](https://docs.julialang.org/en/v1/base/arrays/#Base.reinterpret) and break the 4 bytes into four 1-byte blocks!

In [90]:
reinterpret(Tuple{UInt8, UInt8, UInt8, UInt8}, code) |> collect

4-element Vector{UInt8}:
 0x63
 0xf3
 0x01
 0x00

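This factors the 32-bit (4 × byte) number into four 8-bit (1 × byte) numbers. Notice that the bytes are listed from right to left relative to the hexadecimal literal: the first element, `0x63`, is the rightmost (least significant) byte of `0x0001f363`, whereas the bitstrings of the integers and floating-point numbers above are printed with the most significant bit first.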
> __Endianness:__ This ordering relates to [Endianness](https://en.wikipedia.org/wiki/Endianness), which describes how computers store the bytes of multi-byte values. In little-endian systems (like most x86/x86-64 and ARM machines), the least significant byte comes first in memory, while big-endian systems store the most significant byte first. So when we reinterpret 0x0001F363 as four UInt8s on a little-endian machine, we get: `[0x63,0xF3,0x01,0x00]`
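
The byte order of the machine you are running on can be checked directly. This is a minimal sketch using the standard `ENDIAN_BOM` constant and the `bswap` function; the variable names are illustrative:

```julia
code = 0x0001f363                              # a UInt32 literal

# View the 4 bytes of the value in the order they sit in memory
bytes = collect(reinterpret(UInt8, [code]))

# Julia exposes the host byte order through the ENDIAN_BOM constant
is_little_endian = ENDIAN_BOM == 0x04030201

# bswap reverses the byte order of a value
swapped = bswap(code)

println(is_little_endian ? "little-endian" : "big-endian", ": ", bytes)
```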

Characters thus represent our first example of a collection type, an ordered collection of smaller components, in this case a stack of 1 × byte (8-bit) blocks! Let's dig deeper into collection types.

___

## Collection Types
A collection type is a composite data structure aggregating multiple values, often of the same or related types, into a single container (e.g., tuples, arrays, sets, and dictionaries). It is not a primitive type but a container built from other values, which may themselves be primitives or other collections. Let's look at a few examples of collections, starting with one that we have already seen (sort of), namely [Tuples](https://docs.julialang.org/en/v1/manual/functions/#Tuples).

### Tuples
A tuple is an immutable, ordered collection of elements that can hold a fixed number of items, potentially of different types. Once created, their size and contents cannot be changed, making tuples useful for grouping related values without the overhead of a mutable container.

> **Julia tuple memory layout:** Every tuple in Julia is an immutable composite object with a type that encodes its length and element types (e.g., `Tuple{Int64, Float64}`). The memory layout is a contiguous block of fields: if all elements are "isbits" (plain data such as primitives), the tuple itself is isbits and can be unboxed (often in registers or on the stack). However, any element of a non-isbits type (like a String) is stored in the tuple as a [pointer to a heap-allocated object](https://en.wikipedia.org/wiki/Pointer_(computer_programming)), with the fields aligned and stored sequentially.

Let's explore tuples with a concrete example. Since [Tuple types](https://docs.julialang.org/en/v1/base/base/#Core.Tuple) are immutable, they can't be changed once constructed:

In [304]:
example_tuple = let
    tuple = (18,36.6); # populate with data
end;

What is the type of the `example_tuple` variable? Let's use [the `typeof(...)` method](https://docs.julialang.org/en/v1/base/base/#Core.typeof) to find out.

In [306]:
typeof(example_tuple)

Tuple{Int64, Float64}
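
Because both element types are plain data, the whole tuple is plain data too. The sketch below checks this with the standard `isbits` and `sizeof` functions; `mixed_tuple` is a made-up counterexample:

```julia
example_tuple = (18, 36.6)

# All fields are plain bits, so the tuple itself is a plain-bits value
plain = isbits(example_tuple)
total_bytes = sizeof(example_tuple)         # 8 bytes + 8 bytes

# A tuple holding a String is not plain bits (the String is stored by reference)
mixed_tuple = (18, "thirty-six")
mixed_plain = isbits(mixed_tuple)
```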

Let's confirm this by trying to change a value in the `example_tuple::Tuple{Int64, Float64}` variable. This should blow up, because [Tuples in Julia](https://docs.julialang.org/en/v1/base/base/#Core.Tuple) are immutable.

In [308]:
example_tuple[1] = 6 # this will blow up!

LoadError: MethodError: no method matching setindex!(::Tuple{Int64, Float64}, ::Int64, ::Int64)
The function `setindex!` exists, but no method is defined for this combination of argument types.
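
If a different value is needed, one option is to build a new tuple rather than modify the old one. A minimal sketch (the name `updated_tuple` is illustrative):

```julia
example_tuple = (18, 36.6)

# Build a new tuple that reuses the untouched elements
updated_tuple = (6, example_tuple[2:end]...)

println(example_tuple, " -> ", updated_tuple)
```

The original tuple is left unchanged; only the new binding holds the modified value.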

In [309]:
bitstring(example_tuple) # We can't get the bitstring directly, a Tuple is not primitive!

LoadError: ArgumentError: Tuple{Int64, Float64} not a primitive type

However, we can get the elements of `example_tuple` and their bit layouts by [indexing into the Tuple](https://docs.julialang.org/en/v1/base/base/#Core.Tuple):

In [311]:
bitstring(example_tuple[2]) # get the bitstring of the second component

"0100000001000010010011001100110011001100110011001100110011001101"

We can see the raw bytes associated with the `example_tuple::Tuple{Int64,Float64}` using [the `reinterpret(...)` method](https://docs.julialang.org/en/v1/base/arrays/#Base.reinterpret):

In [331]:
v = reinterpret(NTuple{16,UInt8}, example_tuple) |> collect # we have 16 8-bit blocks (128 bits in total)

16-element Vector{UInt8}:
 0x12
 0x00
 0x00
 0x00
 0x00
 0x00
 0x00
 0x00
 0xcd
 0xcc
 0xcc
 0xcc
 0xcc
 0x4c
 0x42
 0x40

In [329]:
bitstring(v[1])

"00010010"

___

### Arrays
An array is a contiguous, ordered collection of elements of the same type, allowing constant-time access to its elements via integer indices. In most languages, arrays occupy a single block of memory, with element access computed as the base address plus the index times the memory size of each element.
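
The address arithmetic can be observed directly, with the caveat that raw pointers are rarely needed in everyday Julia code. The sketch below is for illustration only; it uses the standard `pointer` and `unsafe_load` functions, and because Julia is 1-based the offset of element `i` is `(i - 1) * sizeof(T)`:

```julia
a = [10.0, 20.0, 30.0, 40.0]

# Keep `a` alive while we work with raw addresses
byte_offset, loaded = GC.@preserve a begin
    base_address = pointer(a)          # address of a[1]
    third_address = pointer(a, 3)      # address of a[3]
    # distance in bytes, and the value read straight from memory
    (Int(third_address - base_address), unsafe_load(base_address, 3))
end

expected_offset = (3 - 1) * sizeof(Float64)
println(byte_offset, " ", expected_offset, " ", loaded)
```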

> **Julia vs. Python arrays:** [Julia's `Array{T}` type](https://docs.julialang.org/en/v1/base/arrays/#Core.Array-Tuple%7BNothing,%20Any%7D) is a built-in container whose elements all share the declared element type `T`; it is `1`-indexed and stored [in column-major order](https://en.wikipedia.org/wiki/Row-_and_column-major_order). Python's native lists are heterogeneous and zero-indexed, while [NumPy's homogeneous arrays](https://numpy.org/doc/stable/reference/generated/numpy.array.html) are zero-indexed and row-major by default (implemented in a separate C library rather than the core language).
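
Column-major order is easy to see on a small matrix. In this sketch the matrix `A` and the variable names are illustrative:

```julia
A = [1 2 3;
     4 5 6]          # 2 rows × 3 columns

# Linear indexing walks down each column first (column-major order)
flattened = vec(A)
second_linear = A[2]          # same element as A[2, 1]

# Indices start at 1
first_index = firstindex(flattened)
println(flattened)
```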

Arrays in both Julia and Python are mutable, meaning elements can be changed after we populate the array. Let's explore a Julia array:

In [31]:
a = rand(10) # build a 10-element random array

10-element Vector{Float64}:
 0.6670346527338153
 0.7210793942041424
 0.9188057083789035
 0.8142557374010451
 0.2918320624537132
 0.2641116095445314
 0.8686475773152065
 0.8580170767062709
 0.5204650303731037
 0.1654722041586395

We access the elements of an array by passing the index of the element in square brackets, e.g., `a[3]` returns the third element in Julia (because it is `1`-based):

In [33]:
a[3]

0.9188057083789035

Arrays are __mutable__, i.e., we can change them after we build them. For example:

In [36]:
a[3] = π

π = 3.1415926535897...

In [38]:
a

10-element Vector{Float64}:
 0.6670346527338153
 0.7210793942041424
 3.141592653589793
 0.8142557374010451
 0.2918320624537132
 0.2641116095445314
 0.8686475773152065
 0.8580170767062709
 0.5204650303731037
 0.1654722041586395

Arrays in Julia are `1`-based. This is a somewhat controversial design choice.
> __Hot take!!__ Julia's choice of `1`-based indexing, unlike most modern programming languages, which use `0`-based indexing (like C, Python, and Java), is a __super headache__! However, it is a deliberate design decision grounded in mathematical consistency, readability, and domain alignment. In short, there are a bunch of arguments as to why this is a good idea.

What happens if we try to grab an element that is _outside_ the array?

In [None]:
a[11] # we are asking for the stuff stored at 11, but we only have 10 items!

LoadError: BoundsError: attempt to access 10-element Vector{Float64} at index [11]
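
When an index might be out of range, there are ways to avoid the error. The sketch below shows a few standard options (`checkbounds`, `get`, `eachindex`); the fallback value `NaN` is an arbitrary choice for illustration:

```julia
a = rand(10)

# Ask whether an index is valid before using it
valid_10 = checkbounds(Bool, a, 10)
valid_11 = checkbounds(Bool, a, 11)

# get returns a fallback value instead of throwing when the index is out of bounds
fallback = get(a, 11, NaN)

# Prefer eachindex / firstindex / lastindex over hard-coded index ranges
index_range = (firstindex(a), lastindex(a))
total = sum(a[i] for i in eachindex(a))
```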

___

### Sets and Dictionaries
A [Set type](https://docs.julialang.org/en/v1/base/collections/#Base.Set) is an unordered collection of unique elements that supports fast membership checks, insertions, and removals. A [Dictionary (or map) is an associative container](https://docs.julialang.org/en/v1/base/collections/#Base.Dict) that stores key–value pairs, allowing lookup, insertion, and deletion of values based on their unique keys.

> **Julia vs. Python collections:** Julia's `Set{T}` and `Dict{K,V}` are parametric containers, meaning every element in a `Set` has the same type `T`, and every key–value pair in a `Dict` has types `K` and `V`. The type parameters themselves are unrestricted: `T`, `K`, and `V` can be any type. In contrast, Python's built-in `set` and `dict` are inherently heterogeneous (each slot holds a generic `object` reference); Julia offers the same flexibility only when the parameters are widened explicitly, e.g., `Set{Any}` or `Dict{Any,Any}`.
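
The effect of the type parameters can be seen in a short sketch. The dictionaries `typed` and `mixed` are illustrative names:

```julia
# A Dict with concrete key and value types only accepts matching entries
typed = Dict{Int64, String}()
typed[1] = "first line"

# Storing a value of the wrong type fails because it cannot be converted to String
rejected = try
    typed[2] = 42
    false
catch
    true
end

# Widening the parameters to Any allows mixed keys and values, much like a Python dict
mixed = Dict{Any, Any}()
mixed[1] = "an integer key"
mixed["two"] = 2.0
mixed[:three] = [3, 3, 3]
```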

Let's build a few examples of set and dictionary collection types:

In [56]:
d = let

    d = Dict{Int64, String}(); # creates a dictionary that models text in a file.
    d[1] = "This is the first line in a text file";
    d[2] = "This is the second line in a text file";
    d[3] = "This is the last line in a text file";

    d
end

Dict{Int64, String} with 3 entries:
  2 => "This is the second line in a text file"
  3 => "This is the last line in a text file"
  1 => "This is the first line in a text file"

We can access the values stored in a dictionary by passing in the `key` pointing to a `value`, i.e., to get line `2`, we would write:

In [60]:
d[2]

"This is the second line in a text file"

Dictionaries (in general) do __not__ maintain order (more on this later). For example, we inserted line `1` first, but it is printed last: the order when the dictionary was printed is 2, 3, 1. Likewise, there is no notion of order in a set.
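
When order matters, one option is to impose it explicitly. The sketch below rebuilds the dictionary from above and visits it in key order; the placeholder text `"<no such line>"` is made up for illustration:

```julia
d = Dict{Int64, String}(
    1 => "This is the first line in a text file",
    2 => "This is the second line in a text file",
    3 => "This is the last line in a text file",
)

# Check for a key, or supply a default for a missing one
has_line_2 = haskey(d, 2)
missing_line = get(d, 4, "<no such line>")

# Impose an order explicitly by sorting the keys before visiting them
ordered_keys = sort(collect(keys(d)))
ordered_lines = [d[k] for k in ordered_keys]
```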

Consider the `s::Set{Char}` example:

In [75]:
s = let

    s = Set{Char}(); # empty at this point
    push!(s, 'a'); # we add items to set using push!
    push!(s, 'b');
    push!(s, 'c');
    push!(s, 'd');

    s
end

Set{Char} with 4 elements:
  'a'
  'c'
  'd'
  'b'

We can't access a particular item in the `s::Set{Char}` set by passing in an index (or key) because these ideas don't exist for sets. Instead, we can use [the `pop!(...)` method](https://docs.julialang.org/en/v1/base/collections/#Base.pop!) to remove and return an arbitrary element from a set (which element we get is an implementation detail, and the element is no longer in the set afterward):

In [80]:
pop!(s)

'c': ASCII/Unicode U+0063 (category Ll: Letter, lowercase)

All the typical mathematical operations on sets, such as intersection, union, or membership checks, are implemented in most modern programming languages, including Julia; [see the documentation for operations on sets in Julia](https://docs.julialang.org/en/v1/base/collections/#Set-Like-Collections).
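
A brief sketch of these operations in Julia, using two small illustrative sets:

```julia
a = Set(['a', 'b', 'c', 'd'])
b = Set(['c', 'd', 'e'])

common = intersect(a, b)        # elements in both sets
combined = union(a, b)          # elements in either set
only_in_a = setdiff(a, b)       # elements of a that are not in b

has_a = 'a' in a                # membership check
is_subset = issubset(Set(['c']), a)
```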

___

## Custom composite types
Custom composite types are user-defined data structures that aggregate multiple fields (possibly of different types) under a single name, enabling encapsulation of related data. Think of them as custom containers that you design to hold exactly the data you need for your specific problem.

> **Language differences:** In Julia, these are declared [using the struct keyword](https://docs.julialang.org/en/v1/manual/types/#Composite-Types) with a list of named fields, similar to C. Python uses classes with attributes and methods, which is a more object-oriented approach.

We'll explore this topic in much greater depth later, but for now, let's build some simple examples to illustrate how composite types work in Julia.

In [108]:
struct MyStudentModel

    # data -
    firstname::String # fields hold the data, they have names and types
    lastname::String
    id::Int64

    MyStudentModel(f,l,id) = new(f,l,id); # constructor
end

Now we can create an instance of our `MyStudentModel` type by calling the constructor:

In [95]:
model = MyStudentModel("Test", "Student", 1234)

MyStudentModel("Test", "Student", 1234)

We access the data stored in our composite type using dot syntax:

In [99]:
model.id # returns the value stored in the id field

1234

Here's a key point: because we used the `struct` keyword, our student model is immutable. Once we build it, we cannot change any of the data stored in the model. Let's see what happens when we try:

In [106]:
model.id = 5678 # we are trying to change an immutable struct.

LoadError: setfield!: immutable struct of type MyStudentModel cannot be changed
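
With an immutable type, a change is expressed by constructing a new value. In this sketch `StudentRecord` is a hypothetical stand-in with the same fields as the model above, defined here only so the example is self-contained:

```julia
# `StudentRecord` is a hypothetical stand-in for the immutable model above
struct StudentRecord
    firstname::String
    lastname::String
    id::Int64
end

original = StudentRecord("Test", "Student", 1234)

# "Changing" an immutable value means constructing a new one from the old fields
updated = StudentRecord(original.firstname, original.lastname, 5678)

println(ismutable(original), " ", original.id, " ", updated.id)
```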

Sometimes we need to modify our data after creating it. For these cases, we can create mutable composite types by adding the `mutable` keyword when declaring the struct:

In [None]:
mutable struct MyMutableStudentModel

    # data -
    firstname::String # fields hold the data, they have names and types
    lastname::String
    id::Int64

    MyMutableStudentModel() = new(); # builds an empty model

end

We create mutable composite types the same way as immutable ones, by calling the constructor. However, since we can modify the fields afterward, we often use an empty constructor that creates an uninitialized instance, then populate the fields one by one. Until a field is set, its contents are undefined, so populate every field before reading it:

In [119]:
mutable_model = MyMutableStudentModel();
mutable_model.id = 6789;
mutable_model.firstname = "Firstname";
mutable_model.lastname = "Lastname";
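
The uninitialized state can be inspected with the standard `isdefined` function. In this sketch `DraftStudent` is a hypothetical stand-in with the same fields and empty constructor as the mutable model above:

```julia
# `DraftStudent` is a hypothetical stand-in for the mutable model above
mutable struct DraftStudent
    firstname::String
    lastname::String
    id::Int64
    DraftStudent() = new()      # leaves every field uninitialized
end

draft = DraftStudent()

# Reference fields (such as String) start out undefined; reading one would throw
defined_before = isdefined(draft, :firstname)

draft.firstname = "Firstname"
draft.lastname = "Lastname"
draft.id = 6789
defined_after = isdefined(draft, :firstname)
```

A constructor that takes all fields as arguments avoids this intermediate state, at the cost of needing every value up front.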

___