In [None]:
%matplotlib inline

In [None]:
import math
from typing import Callable
from IPython.display import SVG, display
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
import numpy as np

# Analysis of ballistic deposition induced films

**PHYS 395 project 2;**
**Matt Wiens - #301294492**

## Notebook setup 

The first command here sets the default figure size to be a bit larger than normal. The second command sets it so all figure output areas are expanded by default.

In [None]:
# Set default plot size
plt.rcParams["figure.figsize"] = (12, 9)

In [None]:
%%javascript
IPython.OutputArea.auto_scroll_threshold = 9999

# Citations

Due to the format of Jupyter notebooks, I'm going to list my citations at the start of the notebook in this section. Usually citations would be at the end of a paper, but I find that due to the length of Jupyter notebooks, it's easier to scroll *up* through content you've already seen than to scroll *down* through lots of content you haven't yet seen.

Citations will be listed by embedding the first three letters of the first author's last name in square brackets followed (optionally) by a page or page range (e.g., [Lan] corresponds to the book Landau was first author on). The citations are as follows:

+ **[Ban]** Banerjee, K., Shamanna, J., & Ray, S. (2014). Surface morphology of a modified ballistic deposition model. Physical Review. E, Statistical, Nonlinear, and Soft Matter Physics, 90(2), 22111.
+ **[Bar]** Barabási, A.-L., & Stanley, H. E. (1995). Fractal Concepts in Surface Growth. Cambridge University Press. https://doi.org/10.1017/CBO9780511599798
+ **[Lan]** Landau, R. H., Páez, M. J., & Bordeianu, C. C. (2015). Computational Physics: Problem Solving with Python (3rd ed.). Wiley-VCH.
+ **[Li]** Li, J., Du, Q., & Sun, C. (2009). An improved box-counting method for image fractal dimension estimation. Pattern Recognition, 42(11), 2460–2469. https://doi.org/10.1016/j.patcog.2009.03.001
+ **[Man]** Mandelbrot, B. B. (1983). The Fractal Geometry of Nature. Henry Holt and Company.

# Introduction

In this notebook we'll look at how we can model and understand forms of particle aggregation using "ballistic deposition" models. To get a sense of what we mean by particle aggregation, imagine a manufacturing process in which particles are evaporated and form a film on some surface. Although the physics governing such a process is complicated, it turns out that we can accurately model how such films form with straightforward reasoning and simple mathematics [Bar, p.19]. The forms of particle aggregation we investigate in this notebook are important in the fabrication of nanomaterials and nanodevices, with many applications in medicine [Ban, p.1].

What we will investigate in this notebook is the geometry and growth of films on cylindrical domains with cross-sectional "rings". To be specific with our geometry, we will take our domain $D$ to be given by

\begin{equation}
    D = \{(x, y): \left\Vert (x, y) \right\Vert_2 = r\} \times [0, +\infty)
\end{equation}

for some radius $r$, where $\left\Vert\cdot\right\Vert_2$ is the $L^2$ norm. This is equivalent to a box with infinite height where we have periodic boundary conditions in the horizontal direction. To give a visual representation of such a domain, taking $r = 1$, consider the below plot:

In [None]:
# Plot example domain
fig = plt.figure()
ax = fig.add_subplot(111, projection="3d")

ts = np.linspace(0, 2 * np.pi, 64)
zs = np.linspace(0, 10, 2)

ts, zs = np.meshgrid(ts, zs)

xs = np.cos(ts)
ys = np.sin(ts)

ax.plot_surface(xs, ys, zs)

# Cosmetics
ax.set_xlabel(r"$x$")
ax.set_ylabel(r"$y$")
ax.set_zlabel(r"$z$");

# Set informative perspective
azim = -25.372434017595253
elev = 54.22668240850038
ax.view_init(azim=azim, elev=elev)

I will emphasize that the $z$ values have a minimum value of $0$, but $z$ goes to $+ \infty$ in the positive direction.

# Theory

## Overview

First, we will discuss fractal geometry and some of its features. We will then look at how we can simulate different forms of particle aggregation using ballistic deposition models. Finally, we will concern ourselves with the geometry and growth of these simulated particle aggregates and explore the connection the growth behaviour has to "correlation length" (this last part will be done after our analysis).

## Fractals

Strictly speaking, a fractal is a set for which the Hausdorff dimension strictly exceeds the topological dimension [Man, p.15] (more on this soon). The point of discussing fractals in a general setting is to introduce and demonstrate the concept of Hausdorff dimension (sometimes called "fractal dimension").

We will see that the type of particle aggregation films we touched on in the introduction are actually fractals, and understanding fractal geometry is important in characterizing the geometry and growth of these films.

### Hausdorff dimension

For brevity we will not give a formal treatment of Hausdorff dimension or topological dimension, and instead give an informal treatment. (Note that the formal treatment requires an early graduate-level understanding of mathematical analysis.) One can think of Hausdorff dimension as characterizing how much "space" a set fills up. For example, to determine the Hausdorff dimension for equilateral polygons with side length $L$ in $N$-dimensional space, we can relate the Hausdorff dimension $d_f$ to the $N$-dimensional volume $V$ with

\begin{equation}
    V \propto L^{d_f}
    .
\end{equation}

(This formula is a generalization of one presented in [Lan, p.384].) For example, a line segment with side length $L$ in dimension $1$ has Hausdorff dimension $d_f = 1$ because the length $l$ of the segment trivially has the dependence

\begin{equation}
    l = L^1
    .
\end{equation}

In dimension $2$, squares and equilateral triangles with side length $L$ have Hausdorff dimension $d_f = 2$ because the areas $A$ of these objects have the dependence

\begin{equation}
    A \propto L^2
    .
\end{equation}

The generalization of this is that no equilateral polygons are fractal, since their Hausdorff dimensions are equal to their topological dimensions. (Note also that circles and cubes have this property too, by considering the characteristic length as the radius $r$, instead of the side length $L$ for equilateral polygons.) In our analysis, we will demonstrate fractal objects whose Hausdorff dimension exceeds their topological dimension (although bear in mind that pretty much everything in this notebook is fractal).

## Ballistic deposition models

Here we will describe the two models of ballistic deposition we will analyze in this notebook. For more details on either of these models see [Lan, pp.390-391, pp.395-396].

### Ballistic deposition model (BD)

The first ballistic deposition model we consider we will simply call the *ballistic deposition model* (hereafter abbreviated as BD). In the BD model, we will model particles as being two dimensional squares with unit side length. Then, we will partition the x-y portion of our domain $D$ (the cross-sectional ring) into $N$ equally spaced "sites" with width equal to the particle width. Viewing this in three dimensions, each site will form a column extending from $z = 0$ to $z = +\infty$. We will define the height of each column as the highest $z$ value which a particle occupies in that column (if there is no such particle then the height is $0$).

The model works as follows:

1. we randomly drop a particle from $z = +\infty$ into a column $c_1$ that has height $h_{c_1}$
2. the particle "falls" down the column $c_1$ and

    + (a) if the particle encounters another particle in an adjacent column $c_2$ on its descent, with the height of the other particle being at $h_{c_2} > h_{c_1}$, it "sticks" to that particle and gets deposited in column $c_1$ at height $h_{c_2}$;
    + (b) otherwise, the particle sticks to the highest particle in its column, so it gets deposited in column $c_1$ at height $h_{c_1} + 1$.
    
3. repeat
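The sticking rule in step 2 can be sketched as a small function. This is a minimal illustration, assuming the column heights are stored in a NumPy array with periodic neighbours; the name `landing_height` is hypothetical and is not used elsewhere in this notebook.

```python
import numpy as np


def landing_height(heights: np.ndarray, site: int) -> float:
    """Height at which a particle dropped into `site` is deposited (BD rule)."""
    n = len(heights)

    # Periodic neighbours on the ring
    left = heights[(site - 1) % n]
    right = heights[(site + 1) % n]
    neighbour_max = max(left, right)

    if neighbour_max > heights[site]:
        # Case (a): the particle sticks to the side of a taller neighbour
        return neighbour_max

    # Case (b): the particle lands on top of its own column
    return heights[site] + 1


heights = np.array([0.0, 3.0, 1.0, 1.0, 0.0])

print(landing_height(heights, 2))  # taller neighbour at height 3
print(landing_height(heights, 3))  # no taller neighbour, lands on its own column
print(landing_height(heights, 0))  # neighbours wrap around to sites 4 and 1
```

Note that a neighbour of equal height does not trigger case (a): the particle only sticks sideways when the neighbouring column is strictly taller.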

### Correlated ballistic deposition model (CBD)

The second ballistic deposition model we will look at is the *correlated ballistic deposition model* (hereafter abbreviated CBD). The CBD model is similar to the BD model except that after steps 1 and 2 in the BD model, the particle is deposited (or discarded) with a probability that depends on its distance $d$ from the most recent previously deposited particle. Let $P(d)$ denote this probability. Then, we have the following steps:

1. we randomly drop a particle from $z = +\infty$ into a column $c_1$ that has height $h_{c_1}$
2. the particle "falls" down the column $c_1$ and

    + (a) if the particle encounters another particle in an adjacent column $c_2$ on its descent, with the height of the other particle being at $h_{c_2} > h_{c_1}$, it "sticks" to that particle and *potentially* gets deposited in column $c_1$ at height $h_{c_2}$;
    + (b) otherwise, the particle sticks to the highest particle in its column, and *potentially* gets deposited in column $c_1$ at height $h_{c_1} + 1$.
    
3. sample the probability distribution $\text{Bernoulli}(P(d))$: if the sampled value is $1$, deposit the particle (otherwise discard it)
4. repeat

This model is useful in modeling particle interactions when particles are dropped together (although not at exactly the same time). For a "Coulomb-type" attraction, we can take

\begin{equation}
    P(d) := \min \left\{\frac{c}{d^2}, 1 \right\}
\end{equation}

where $c$ is a scaling constant.

As a finer point on calculating distances, since our domain $D$ is cylindrical, we take the distance $d$ to be the shortest distance between the two points *along* the cylinder (hence we do not calculate the distance as *through* the cylinder).
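The distance along the cylinder and the Coulomb-type probability can be sketched as follows. This is a minimal illustration: the function names `ring_distance` and `coulomb_probability` are hypothetical, and the default $c = 5$ is simply the value used in the simulations later in this notebook.

```python
import math


def ring_distance(
    site_a: int, height_a: float, site_b: int, height_b: float, num_sites: int
) -> float:
    """Shortest distance between two particles measured along the cylinder."""
    dx = abs(site_a - site_b)
    # Going around the ring the other way may be shorter
    dx = min(dx, num_sites - dx)
    return math.hypot(dx, height_a - height_b)


def coulomb_probability(d: float, c: float = 5.0) -> float:
    """Coulomb-type deposition probability P(d) = min(c / d^2, 1)."""
    return min(c / d**2, 1.0)


# Sites 1 and 298 on a ring of 300 sites are 3 sites apart, not 297
d = ring_distance(1, 0.0, 298, 0.0, num_sites=300)

print(d)
print(coulomb_probability(d))
print(coulomb_probability(1.0))
```

With $c = 5$, any candidate within a distance of $\sqrt{5} \approx 2.24$ of the previous particle is always deposited.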

### Surface parameters

When analyzing the geometry and growth of the particle aggregate films generated by the BD and CBD models, we will restrict ourselves to analyzing the surface of the film.

Recall that in each ballistic deposition model, we have $N$ sites, and each of these sites has an associated column. We will denote $h(i, t)$ to be the height of the $i$th column (corresponding to the $i$th site) at time $t$. (Recall that the height of a column is the highest $z$ value that a particle occupies in that column, or $0$, if there are no particles in the column.)

We will be interested in two quantities. The first is the *mean height* of the surface, $\bar{h}(t)$, defined by

\begin{equation}
    \bar{h}(t) := \frac{1}{N} \sum_{i = 1}^N h(i, t)
    .
\end{equation}

The second is the *interface width* $w(t)$, which is the root mean square fluctuation in height defined by

\begin{equation}
    w(t) := \sqrt{\frac{1}{N} \sum_{i = 1}^N \left( h(i, t) - \bar{h}(t) \right)^2}
    .
\end{equation}

The interface width characterizes the "roughness" of the surface [Bar, p.22].
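As a quick numerical illustration of these two definitions, here is a sketch on a small, made-up array of column heights (the values are arbitrary and chosen only so that the arithmetic is easy to check by hand).

```python
import numpy as np

# Arbitrary example heights for N = 8 sites
heights = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])

# Mean height: (1 / N) * sum of h(i, t)
mean_height = heights.mean()

# Interface width: root mean square deviation from the mean height
width = np.sqrt(np.mean((heights - mean_height) ** 2))

print(mean_height, width)

# The interface width is the population standard deviation of the heights
print(np.isclose(width, np.std(heights)))
```

Because $w(t)$ is just the population standard deviation of the heights, `np.std` (with its default `ddof=0`) gives the same number.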

# Analysis

Now that we've covered the necessary theory, let's get into the analysis.

## Illustrating fractal behaviour

Now we will look at general fractal behaviour. The point of looking at fractals in a general sense is to demonstrate the Hausdorff dimension and to see how fractals can arise from simple algorithms.

### The Sierpiński triangle

The first example we'll consider is the Sierpiński triangle. Classically, you can generate this object by starting with any equilateral triangle and subdividing it into four smaller equilateral triangles of the same size (this subdivision is unique). After removing the "central" triangle, apply a similar subdivision to each of the remaining triangles, and repeat this process ad infinitum. Wikipedia has a good illustration of the first few steps of the procedure as shown below (the image was created by Wikipedia users Saperoud and Wereon). 

In [None]:
svg_url = "https://upload.wikimedia.org/wikipedia/commons/0/05/Sierpinski_triangle_evolution.svg"
display(SVG(url=svg_url))

The Sierpiński triangle has topological dimension $1$ (one way to think about this without getting into the mathematical definition of topological dimension is to recognize that it has no area). However, its Hausdorff dimension $d_f$ is given by

\begin{equation}
    d_f = \frac{\log3}{\log2} \approx 1.585
    .
\end{equation}

One way to solve for $d_f$ is by recognizing that if we double the starting side length $L$ of the equilateral triangle then we create $2$ additional copies of the object and so by our above formula of the Hausdorff dimension we require

\begin{equation}
    3 L^{d_f} = (2 L)^{d_f}
    ,
\end{equation}

which, after solving for $d_f$, gives us the desired result. Hence we have shown the Hausdorff dimension to be strictly greater than the topological dimension, and so the Sierpiński triangle is a fractal that "fills more space" than a one dimensional object but less than a two dimensional object.
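The same argument works for any object made of $n$ copies of itself when its side length is scaled by a factor $s$: requiring $n L^{d_f} = (s L)^{d_f}$ gives $d_f = \log n / \log s$. A minimal sketch of this calculation follows (the function name `similarity_dimension` is hypothetical).

```python
import math


def similarity_dimension(num_copies: int, scale_factor: float) -> float:
    """Solve num_copies * L^d = (scale_factor * L)^d for d."""
    return math.log(num_copies) / math.log(scale_factor)


# Sierpinski triangle: doubling the side length gives 3 copies
print(similarity_dimension(3, 2))

# Line segment: doubling the length gives 2 copies
print(similarity_dimension(2, 2))

# Filled square: doubling the side length gives 4 copies
print(similarity_dimension(4, 2))
```

The line segment and the square recover their ordinary dimensions of $1$ and $2$, while the Sierpiński triangle lands strictly in between.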

### Computing Hausdorff dimension

It's also possible to compute fractal dimension by approximating fractals computationally through formulas called "chaos games". Two such examples are shown below. The Sierpiński triangle can be approximated by following the chaos game in [Lan, p.384] and the dimension can be estimated using so-called "box-counting" methods (see [Li] or [Lan, pp.392-393]). For the second example shown below, the three dimensional Barnsley fern (generated by the chaos game in [Lan, pp.387-389]) has no explicit mathematical construction and the Hausdorff dimension can only be estimated computationally.
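To make the box-counting idea concrete, here is a minimal sketch of a basic box-counting estimate for a two dimensional point set. It is not the improved method of [Li], only the plain version: cover the points with boxes of side $\epsilon$, count the occupied boxes $n(\epsilon)$, and fit the slope in $n(\epsilon) \propto \epsilon^{-d_f}$. The function name `box_counting_dimension` is hypothetical, and the right-angled triangle is an assumption made so that the boxes line up with the construction; for general point sets the estimate is noisier.

```python
import numpy as np


def box_counting_dimension(points: np.ndarray, box_sizes: np.ndarray) -> float:
    """Estimate the box-counting dimension of a 2D point set."""
    counts = []
    for size in box_sizes:
        # Assign each point to a box of side `size`, then count distinct boxes
        boxes = np.floor(points / size).astype(int)
        counts.append(len(np.unique(boxes, axis=0)))

    # n(size) ~ size^(-d), so d is minus the slope on a log-log scale
    slope, _ = np.polyfit(np.log(box_sizes), np.log(counts), 1)
    return -slope


# Chaos game on a right-angled triangle (seeded for reproducibility)
rng = np.random.default_rng(0)
vertices = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])

num_points = 20000
points = np.zeros((num_points, 2))
for i in range(1, num_points):
    points[i] = 0.5 * (points[i - 1] + vertices[rng.integers(3)])

# Box side lengths 1/2, 1/4, ..., 1/32
box_sizes = 0.5 ** np.arange(1, 6)
estimate = box_counting_dimension(points, box_sizes)

print(estimate, np.log(3) / np.log(2))
```

The estimate should come out close to $\log 3 / \log 2$. The range of box sizes matters: boxes much smaller than the typical spacing between points just count individual points, which drags the estimate down.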

#### Sierpiński triangle (chaos game)

In [None]:
# Vertices of the triangle (isosceles rather than exactly equilateral;
# the chaos game works for any triangle)
vertices = np.array([[0.0, 0.0], [0.5, 1.0], [1.0, 0.0]])

# Pick a random point within the triangle to start
s, t = np.sort(np.random.random(2))
init_point = np.array(
    [
        s * vertices[0, 0] + (t - s) * vertices[1, 0] + (1 - t) * vertices[2, 0],
        s * vertices[0, 1] + (t - s) * vertices[1, 1] + (1 - t) * vertices[2, 1],
    ]
)

In [None]:
# Set up an array with all the points we'll compute
num_points = 25000

points = np.zeros((num_points, 2))
points[0, :] = init_point

In [None]:
# Perform the simulation
for i in range(1, num_points):
    r = np.random.randint(3)
    
    points[i, :] = 0.5 * (points[i - 1, :] + vertices[r, :])

In [None]:
# Plot simulation results
_, ax = plt.subplots(figsize=(7, 7))

ax.scatter(x=points[:, 0], y=points[:, 1], s=0.5);

#### Barnsley fern (chaos game)

In [None]:
# Set up array to store data
num_points = 25000

points = np.zeros((num_points, 3))

# Initial point is fixed
points[0, :] = np.array([0.5, 0.0, -0.2])

In [None]:
# Setup mode transformation matrices
mode_1_mat = np.array([[0.0, 0.0, 0.0], [0.0, 0.18, 0.0], [0.0, 0.0, 0.0]])
mode_2_mat = np.array([[0.85, 0.0, 0.0], [0.0, 0.85, 0.1], [0.0, -0.1, 0.85]])
mode_3_mat = np.array([[0.2, -0.2, 0.0], [0.2, 0.2, 0.0], [0.0, 0.0, 0.3]])
mode_4_mat = np.array([[-0.2, 0.2, 0.0], [0.2, 0.2, 0.0], [0.0, 0.0, 0.3]])

# Constants to add after each matrix product
mode_1_consts = np.array([0.0, 0.0, 0.0])
mode_2_consts = np.array([0.0, 1.6, 0.0])
mode_3_consts = np.array([0.0, 0.8, 0.0])
mode_4_consts = np.array([0.0, 0.8, 0.0])

# Bundle up all the matrices and constants
mode_mats = [mode_1_mat, mode_2_mat, mode_3_mat, mode_4_mat]
mode_consts = [mode_1_consts, mode_2_consts, mode_3_consts, mode_4_consts]

# Probabilities of obtaining each mode
mode_probabilities = np.array([0.1, 0.6, 0.15, 0.15])

In [None]:
# Perform the simulation
for i in range(1, num_points):
    mode = np.random.choice([0, 1, 2, 3], p=mode_probabilities)

    points[i, :] = mode_mats[mode] @ points[i - 1, :] + mode_consts[mode]

In [None]:
# Plot simulation results
fig = plt.figure(figsize=(7, 7))
ax = fig.add_subplot(111, projection="3d")

# The ordering of the x,y,z points were chosen to get the best
# default view.
ax.scatter(ys=points[:, 0], zs=points[:, 1], xs=points[:, 2], s=0.5);

## Ballistic deposition simulation

Having touched on the Hausdorff dimension, we will now move on to ballistic deposition. Here we will show plots of what is generated by the BD and CBD models discussed in the theory section above. Although we are simulating on our cylindrical-like domain $D$, note that using a "box" domain gives very similar results.

### Ballistic deposition model

First we'll simulate the (uncorrelated) ballistic deposition model using $300$ sites and simulating the deposition of $25000$ particles. Note that having $300$ sites means the radius $r$ for our domain $D$ is given by

\begin{equation}
    r \approx \frac{300}{2 \pi} \approx 47.7
\end{equation}

in units of particle box length.
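As a small sketch of this arithmetic, the radius follows from setting the circumference $2 \pi r$ equal to the number of unit-width sites, and each site can then be assigned a position on the ring. Placing the site centres at equally spaced angles is an assumption made here for illustration; the simulations below only ever use the site index.

```python
import numpy as np

num_sites = 300

# Circumference = number of unit-width sites = 2 * pi * r
radius = num_sites / (2 * np.pi)

# Angle and (x, y) position of the centre of each site on the ring
angles = 2 * np.pi * (np.arange(num_sites) + 0.5) / num_sites
xs = radius * np.cos(angles)
ys = radius * np.sin(angles)

print(round(radius, 1))
```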

In [None]:
# Number of particles to simulate
num_particles = 25000

# Number of sites on ring substrate
num_substrate_sites = 300

# Array of heights at each substrate site
heights = np.zeros(num_substrate_sites)

# Array of particle positions (once deposited)
points = np.zeros((num_particles, 2))

In [None]:
# Functions to deal with boundary conditions
left_site = lambda site: (site - 1) % num_substrate_sites
right_site = lambda site: (site + 1) % num_substrate_sites

In [None]:
# Perform the simulation
for i in range(num_particles):
    site = np.random.randint(num_substrate_sites)

    # Determine the maximum height of the neighbouring sites
    neighbour_max_height = np.array(
        [heights[s] for s in [left_site(site), right_site(site)]]
    ).max()

    # Determine where to place the particle
    if heights[site] >= neighbour_max_height:
        heights[site] += 1
    else:
        heights[site] = neighbour_max_height

    points[i, :] = np.array([site, heights[site]])

In [None]:
# Plot simulation results
_, ax = plt.subplots()

ax.scatter(x=points[:, 0], y=points[:, 1], s=1.5)

# Cosmetics
ax.set_xlim([0, num_substrate_sites])
ax.set_ylim(ymin=0)

ax.set_xlabel("substrate site")
ax.set_ylabel("height")
ax.set_aspect("equal")

### Correlated ballistic deposition model

Now we'll simulate the correlated ballistic deposition model (again) using $300$ sites and simulating the deposition of $25000$ particles. We will take the probability $P(d)$ that a particle is deposited based on its distance from the previously deposited particle to be a Coulomb-like attraction given by

\begin{equation}
    P(d) := \min \left\{ \frac{c}{d^2}, 1 \right\}
    ,
\end{equation}

where we have taken the scaling constant $c = 5$ in the simulation below.

Note that this simulation can potentially take much longer to run than the previous simulation, since we will have many more iterations than the number of particles we simulate (due to many of them being discarded by our correlation constraint).

In [None]:
# Number of particles to simulate
num_particles = 25000

# Number of sites on ring substrate
num_substrate_sites = 300

# Array of heights at each substrate site
heights = np.zeros(num_substrate_sites)

# Array of particle positions (once deposited)
points = np.zeros((num_particles, 2))

In [None]:
# Functions to deal with boundary conditions
left_site = lambda site: (site - 1) % num_substrate_sites
right_site = lambda site: (site + 1) % num_substrate_sites

In [None]:
# Perform the simulation
i = 0

while i < num_particles:
    site = np.random.randint(num_substrate_sites)

    # Determine the maximum height of the neighbouring sites
    neighbour_max_height = np.array(
        [heights[s] for s in [left_site(site), right_site(site)]]
    ).max()

    # Determine where to place the particle
    if heights[site] >= neighbour_max_height:
        candidate_height = heights[site] + 1
    else:
        candidate_height = neighbour_max_height

    candidate_point = np.array([site, candidate_height])

    # Determine whether to deposit the particle
    # based on its distance from the previous particle
    if i == 0:
        # First particle always gets placed
        pass
    else:
        # Calculate distance from previous particle. We need
        # to be careful with periodic boundary conditions here.
        dx = abs(points[i - 1, 0] - site)
        mindx = min(dx, num_substrate_sites - dx,)
        dsquared = mindx ** 2 + (points[i - 1, 1] - candidate_height) ** 2

        if np.random.random() < 5 / dsquared:
            pass
        else:
            continue

    # Deposit the particle
    heights[site] = candidate_height
    points[i, :] = candidate_point

    i += 1

In [None]:
# Plot simulation results
_, ax = plt.subplots()

ax.scatter(x=points[:, 0], y=points[:, 1], s=1.5)

# Cosmetics
ax.set_xlim([0, num_substrate_sites])
ax.set_ylim(ymin=0)

ax.set_xlabel("substrate site")
ax.set_ylabel("height")
ax.set_aspect("equal")

## Ballistic deposition interface width growth

Now that we've seen what the simulations for ballistic depositions generate, let's look at how the interface width grows with time.

### Defining a general ballistic deposition function

Since we're going to be computing a lot of data, we'll define a function which takes in

+ the number of sites $N$
+ the number of particles to simulate
+ (optionally) a function $P(d)$ to determine the correlation-dependent probability in the CBD model

and returns an array with the $i$th index giving the interface width $w(i)$ after the $i$th particle has been deposited.

In [None]:
def calculate_new_mean(
    num_sites: int, old_mean: float, old_value: int, new_value: int
) -> float:
    """Calculates new mean given a change in one value."""
    return old_mean + (new_value - old_value) / num_sites


def calculate_interface_width(
    num_sites: int, heights: np.ndarray, mean_height: float
) -> float:
    """Calculates the interface width."""
    return math.sqrt(1 / num_sites * np.sum((heights - mean_height) ** 2))

In [None]:
def bd_interface_widths(
    num_sites: int, num_particles: int, prob_fn: Callable[[float], float] = None
) -> np.ndarray:
    """Simulates a BD model and returns an array of interface widths."""
    # Functions to deal with boundary conditions
    left_site = lambda site: (site - 1) % num_sites
    right_site = lambda site: (site + 1) % num_sites

    # Array of heights at each site
    heights = np.zeros(num_sites)

    # Current mean height
    mean_height = 0

    # Array of interface widths to return
    interface_widths = np.zeros(num_particles)

    # Perform the simulation
    i = 0

    # Last particle position (if running CBD)
    last_position = None

    while i < num_particles:
        site = np.random.randint(num_sites)

        # Determine the maximum height of the neighbouring sites
        neighbour_max_height = np.array(
            [heights[s] for s in [left_site(site), right_site(site)]]
        ).max()

        # Determine where to place the particle
        if heights[site] >= neighbour_max_height:
            candidate_height = heights[site] + 1
        else:
            candidate_height = neighbour_max_height

        candidate_point = np.array([site, candidate_height])

        # Determine whether to deposit the particle
        # based on its distance from the previous particle
        # (if running CBD)
        if prob_fn is not None:
            if last_position is None:
                # First particle always gets placed
                pass
            else:
                # Calculate distance from previous particle. We need
                # to be careful with periodic boundary conditions here.
                dx = abs(last_position[0] - site)
                mindx = min(dx, num_sites - dx)

                d = math.sqrt(mindx ** 2 + (last_position[1] - candidate_height) ** 2)

                if np.random.random() < prob_fn(d):
                    pass
                else:
                    continue

            last_position = np.array([site, candidate_height])

        # Calculate new mean height
        mean_height = calculate_new_mean(
            num_sites, mean_height, heights[site], candidate_height
        )

        # Increase the height of the column
        heights[site] = candidate_height

        # Calculate the interface width
        interface_widths[i] = calculate_interface_width(num_sites, heights, mean_height)

        i += 1

    return interface_widths

### BD interface growth

Now we'll look at how the interface width in the BD model grows with time. First we'll generate some data, depositing $10^5$ particles.

In [None]:
num_sites = 200
num_particles = 100000

In [None]:
bd_data = bd_interface_widths(num_sites=num_sites, num_particles=num_particles)

Let's plot the interface width versus time (each unit of time is a deposited particle) on a log-log scale.

In [None]:
_, ax = plt.subplots()

ax.scatter(x=range(num_particles), y=bd_data, s=0.5)

# Set log log scaling and proper view
ax.set_xscale("log")
ax.set_yscale("log")

ax.set_xlim([10, num_particles])
ax.set_ylim([0.1, 10])

# Cosmetics
ax.set_xlabel(r"$t$")
ax.set_ylabel(r"$w(t)$");

What you should be seeing is that the interface width eventually saturates!

Note: if you *don't* see that in the above plot that was generated, try running it again. Due to the relatively low number of sites, randomness can play a role in the data that is generated (even then if you use enough particles you should still see the saturation eventually, even if it's noisy); increasing sites will mitigate this, but that also means you would need to increase the number of particles as well to reach the saturation point (more on this soon). There's a performance trade-off for "better quality".

We can see that there are two regimes: initially the interface width $w(t)$ increases as a power of time (since it appears linear on a log-log plot), that is, for some exponent $\beta$ we have that initially (say for $t < t_x$ where we will call $t_x$ the "crossover time")

\begin{equation}
    w(t) \propto t^\beta \quad \text{for } t < t_x.
\end{equation}

However, for $t > t_x$ the interface width saturates at some value $w_{sat}$.
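To illustrate how $\beta$ and $w_{sat}$ could be read off such a curve, here is a minimal sketch using a least-squares line fit on a log-log scale. The data below is synthetic, not simulation output: `beta_true`, `t_x`, and the noise level are arbitrary illustrative values, and in practice the crossover time would have to be estimated rather than known in advance.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for w(t): power-law growth up to a crossover time,
# then saturation, with multiplicative noise. These values are arbitrary.
beta_true = 0.3
t_x = 1000

t = np.arange(1, 10001)
w = np.minimum(t, t_x) ** beta_true * np.exp(rng.normal(0.0, 0.05, t.size))

# In the growth regime, log w = beta * log t + constant
growth = (t >= 10) & (t < t_x)
beta_fit, _ = np.polyfit(np.log(t[growth]), np.log(w[growth]), 1)

# Estimate the saturation value as the mean width well after the crossover
w_sat = w[t > 2 * t_x].mean()

print(beta_fit, w_sat)
```

The fitted slope should recover a value close to `beta_true`, and `w_sat` should be close to `t_x ** beta_true`.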

Why does saturation happen? It's unintuitive that a random system like this would saturate (at least it is for me). Before answering this question, let's examine what effect the number of sites plays. (Note: the following code block might take a while to run.)

In [None]:
# Number of sites to test
num_sites_list = [50, 100, 150, 200, 250, 300]

num_particles = 100000

bd_big_data = np.empty((len(num_sites_list), num_particles))

for idx, num_sites in enumerate(num_sites_list):
    bd_big_data[idx, :] = bd_interface_widths(
        num_sites=num_sites, num_particles=num_particles
    )

In [None]:
_, ax = plt.subplots()

for data in bd_big_data:
    ax.scatter(x=range(num_particles), y=data, s=0.1)

# Set log log scaling and proper view
ax.set_xscale("log")
ax.set_yscale("log")

ax.set_xlim([10, num_particles])
ax.set_ylim([0.1, 10])

# Cosmetics
ax.set_xlabel(r"$t$")
ax.set_ylabel(r"$w(t)$")

ax.legend(["%d sites" % num_sites for num_sites in num_sites_list], markerscale=20);

Now admittedly, this plot is hard to look at. To get another view, let's plot all of them separately.

In [None]:
_, axes = plt.subplots(3, 2, figsize=(17, 17), sharex=True, sharey=True)

for ax, data, num_sites in zip(axes.flatten(), bd_big_data, num_sites_list):
    ax.scatter(x=range(num_particles), y=data, s=0.1)

    # Set log log scaling and proper view
    ax.set_xscale("log")
    ax.set_yscale("log")

    ax.set_xlim([10, num_particles])
    ax.set_ylim([0.1, 10])

    # Cosmetics
    ax.set_xlabel(r"$t$")
    ax.set_ylabel(r"$w(t)$")
    ax.set_title("%d sites" % num_sites)

Here we can see a clear finite size effect. As the number of sites $N$ increases, the system takes longer to saturate (i.e., the crossover time $t_x$ increases), and the interface width saturation value $w_{sat}$ increases. It is not clear from these plots whether the exponent $\beta$ characterizing the growth regime is affected; it would appear not to be.

It turns out that both $t_x$ and $w_{sat}$ increase linearly with the number of sites $N$ on a log-log scale (see [Bar, p.23]), and hence there exist exponents $\alpha$ and $\gamma$ such that

\begin{align}
    w_{sat}(N) &\propto N^\alpha, \\
    t_x(N) &\propto N^\gamma
    .
\end{align}

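Noting that $N$ is simply the circumference of our cylindrical domain, the first relation has the same form as the scaling law $V \propto L^{d_f}$ from our discussion of Hausdorff dimension, with $\alpha$ (usually called the roughness exponent) playing the role of $d_f$ for the interface width.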

There's a bit more to the story than what I've written above; for example, it turns out that

\begin{equation}
    \gamma = \frac{\alpha}{\beta}
    ;
\end{equation}

for the interested reader, see [Bar, pp.23-25]. One observation I will emphasize is that saturation is a finite size effect! According to the above equations for $w_{sat}(N)$ and $t_x(N)$, if we take the number of sites $N$ (or equivalently, the radius of our cylindrical domain) to go to infinity, then the interface width (the roughness of the surface) will grow indefinitely and never saturate.
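As a heuristic sketch of where the relation $\gamma = \alpha / \beta$ can come from (assuming the growth regime and the saturated regime meet continuously at the crossover time $t_x$), we can match the two expressions for the interface width at $t = t_x$:

\begin{equation}
    t_x^\beta \propto w(t_x) \approx w_{sat} \propto N^\alpha
    \quad \Longrightarrow \quad
    t_x \propto N^{\alpha / \beta}
    .
\end{equation}

Comparing this with $t_x(N) \propto N^\gamma$ then gives $\gamma = \alpha / \beta$.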

Let's get back to a question I asked (but didn't answer) above: why does the system saturate in the first place? It turns out that sites on the surface are correlated. Each new particle that gets deposited due to a neighbouring site at greater height is essentially transmitting information about its neighbour to the *other* neighbour. Hence in small systems, sites will equilibrate faster and the surface will be less rough because information about local surface heights is transmitted across the whole system sooner. In larger systems, information takes longer to spread. The typical distance over which heights are correlated is known as the correlation length; it grows with time, and saturation sets in once it becomes comparable to the size of the system, which for us is proportional to the number of sites $N$.

(Note that this idea of correlation length is a new concept for me, so I might not be explaining/understanding it that well. See [Bar, pp.25-27] for likely a better explanation.)

### CBD interface growth

Now we'll step back and look at how the interface width in the CBD model grows with time. Here we'll only plot $4 \cdot 10^3$ particles since this simulation takes significantly longer than the BD simulation.

In [None]:
num_sites = 200
num_particles = 4000

In [None]:
cbd_data = bd_interface_widths(
    num_sites=num_sites, num_particles=num_particles, prob_fn=lambda d: 5 / d ** 2
)

Let's plot the interface width versus time (each unit of time is a deposited particle) on a log-log scale.

In [None]:
_, ax = plt.subplots()

ax.scatter(x=range(num_particles), y=cbd_data, s=0.5)

# Set log log scaling and proper view
ax.set_xscale("log")
ax.set_yscale("log")

ax.set_xlim([10, num_particles])
ax.set_ylim([0.1, 10])

# Cosmetics
ax.set_xlabel(r"$t$")
ax.set_ylabel(r"$w(t)$");

We can see here that, predictably, the interface width grows much more rapidly as a function of time than in the BD model (if this is not intuitive then refer back to the simulation plots for the CBD model above). Due to the computational intensity of this model, unfortunately I don't have the computational power to see what happens with large numbers of particles (think $10^6$). This is really an open question for me: will the system saturate in the same way the BD model does? I want to say that it depends on the correlation probability function $P(d)$. One can imagine setting $P(d)$ to be such that the new particle essentially needs to be adjacent to or on top of the previously deposited particle. Surely such a system would never saturate?

# Discussion

In this notebook we've investigated how ballistic deposition induced films and their growth can be characterized. In doing so, we've discussed concepts that relate to ballistic deposition such as Hausdorff dimension, finite size effects, and correlation length. However, there is much we haven't yet investigated, and that I'm still interested in. I'll break the "future directions" part into three topics I'm most interested in.

When discussing the growth of the interface width in the BD model we saw that we had

\begin{align}
    w(N, t) &\propto t^\beta \qquad \text{for } t < t_x, \\
    w_{sat}(N) &\propto N^\alpha, \\
    t_x(N) &\propto N^\gamma
    .
\end{align}

The data we had to work with was extremely noisy (due to the randomness of the simulation and relatively small number of sites). How can we fit those parameters to noisy data in a way that essentially cancels out the randomness? Perhaps we can smooth the data by averaging across multiple simulations? As an alternative, certainly by hand we can draw on the log-log plots and determine the exponents easily enough (we just need to draw lines over our data); is there a computational equivalent to this by-hand drawing?
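One possible approach to the averaging idea is sketched below: run several independent BD simulations, average the interface width at each time step, and fit a line to the averaged curve on a log-log scale. This is a sketch under stated assumptions, not a tuned analysis. The helper `bd_widths` is a hypothetical, stripped-down stand-in for `bd_interface_widths`, the sizes are small illustrative values, and the fit window (from one deposited layer to the end of the run) is assumed to lie within the growth regime.

```python
import numpy as np


def bd_widths(
    num_sites: int, num_particles: int, rng: np.random.Generator
) -> np.ndarray:
    """Interface width after each deposited particle in one BD run."""
    heights = np.zeros(num_sites)
    widths = np.zeros(num_particles)

    for i in range(num_particles):
        site = rng.integers(num_sites)
        neighbour_max = max(
            heights[(site - 1) % num_sites], heights[(site + 1) % num_sites]
        )
        # For integer heights this is the BD rule: stick to a strictly
        # taller neighbour, otherwise land on top of the column
        heights[site] = max(heights[site] + 1, neighbour_max)
        widths[i] = heights.std()

    return widths


rng = np.random.default_rng(0)
num_sites, num_particles, num_runs = 200, 4000, 5

# Average the interface width over independent runs to reduce noise
runs = np.array(
    [bd_widths(num_sites, num_particles, rng) for _ in range(num_runs)]
)
mean_widths = runs.mean(axis=0)

# Fit log w = beta * log t + constant, skipping the first deposited layer
t = np.arange(1, num_particles + 1)
window = t >= num_sites
beta_fit, _ = np.polyfit(np.log(t[window]), np.log(mean_widths[window]), 1)

print(beta_fit)
```

The line fit plays the role of drawing a straight line over the log-log plot by hand. Increasing `num_runs` smooths the averaged curve further, at the cost of run time.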

Secondly, when discussing the Hausdorff dimension earlier on, I mentioned that we can approximate it computationally using box-counting methods on fractals generated computationally. As an alternative to fitting the curves we plotted above, can we determine the exponent $\alpha$ using these box-counting methods based on our computed points on the ballistic deposition films?

Lastly, I really want to know what happens in the CBD model with regard to saturation. I think the key here is to find a way to simplify the algorithm to make it less computationally intensive, so, as a starting point, we can produce plots equivalent to what we did in the BD model.