# Visualization and modeling

Python activities to complement [*Measurements and their Uncertainties*](https://www.oupcanada.com/catalog/9780199566334.html) (*MU*), Chapter 5, "Data visualization and reduction."

* [Preliminaries](#Preliminaries)
* [Importing and exporting data](#Importing-and-exporting-data)
    * [Export data to a file](#Export-data-to-a-file)
    * [Import data from a file](#Import-data-from-a-file)
    * [Exercise 1](#Exercise-1)
* [Basic plotting](#Basic-plotting)
    * [Example: Demonstrating Ohm's law](#Example&#58;-Demonstrating-Ohm's-law)
        * [Assign data variables](#Assign-data-variables)
        * [Choose the independent and dependent variables](#Choose-the-independent-and-dependent-variables)
        * [Choose the functional relationship to show](#Choose-the-functional-relationship-to-show)
        * [Choose appropriate scales for the axes](#Choose-appropriate-scales-for-the-axes)
        * [Label the axes](#Label-the-axes)
        * [Compare with model](#Compare-with-model)
        * [Annotate plot](#Annotate-plot)
        * [Export and check format](#Export-and-check-format)
        * [Adjust format as necessary](#Adjust-format-as-necessary)
* [Linearizing relationships](#Linearizing-relationships)
    * [Semilogarithmic plots](#Semilogarithmic-plots)
    * [Log-log plots](#Log-log-plots)
* [Summary](#Summary)

## Preliminaries
Before proceeding with this notebook you should review the topics from the [previous notebook](4.0-Error-propagation.ipynb) and read *MU* Ch. 5, "Data visualization and reduction," with the following [goals](A.0-Reading-goals.ipynb#Data-visualization-and-reduction) in mind.

1. Be able to recall the "Guidelines for plotting data" in Sec. 5.1, and apply them to your own graphs.
2. Be able to compute appropriate error bars for data in a graph.
3. Be able to assess the quality of a fit from the fraction of data points that lie within one standard error bar from the fitted curve.
4. Recognize that a least-squares fit to a line can be computed from the data using Eqs. (5.1) - (5.6).
5. Be able to explain the meaning and significance of the following terms:
    1. Interpolate;
    2. Extrapolate;
    3. Aliasing;
    4. Residual;
    5. Method of least squares; and
    6. Goodness-of-fit parameter.
6. Be able to explain why the likelihood $P\left(m,c\right)$ in (5.8) is maximized when *&chi;*<sup>2</sup> in (5.9) is minimized, and discuss how this provides a rationale for using *&chi;*<sup>2</sup> to determine optimal fit parameters.
7. Know how to graph data to help identify systematic errors.

The following code cell includes previously used initialization commands that we will need here.

In [None]:
import matplotlib.pyplot as plt
import numpy as np
from numpy.random import default_rng

%matplotlib inline

## Importing and exporting data
Normally we record experimental data in a computer file of some sort. The NumPy function [`savetxt`](https://numpy.org/doc/stable/reference/generated/numpy.savetxt.html) saves a NumPy array as formatted text, and [`genfromtxt`](https://numpy.org/doc/stable/reference/generated/numpy.genfromtxt.html) reads data from a formatted text file into a NumPy array.

### Export data to a file
Consider the example given in *MU* Sec. 5.1.2, the period *T* of a pendulum as a function of its length *L*. Use the random number generator to simulate period measurements with a constant standard error *&alpha;<sub>T</sub>*&nbsp;=&nbsp;0.05&nbsp;s, with pendulum lengths ranging from 10&nbsp;cm to 100&nbsp;cm. Use `T` for the model prediction, and `Tm` for the simulated measurements.

In [None]:
# Assign variables
g = 9.8  # (m/s^2)
L = np.arange(10, 105, 10)  # (cm)
L_m = L / 100  # (m)
T = 2 * np.pi * np.sqrt(L_m / g)  # (s)
alpha_t = 0.05  # (s)

# Seed RNG and simulate data
rng = default_rng(0)

Tm = T + alpha_t * rng.normal(size=np.size(T))

# Show results
print("L = ", L)
print()
print("Tm = ", Tm)
print()
print("alpha_t = ", alpha_t)

Now let us save this data as text in a tabular format. We will use the CSV (comma-separated values) file format, which is recognized by many software packages. In this format each column of data represents a different variable, and each row holds one set of corresponding values for those variables. Usually it is best to include one or more header lines of [metadata](https://en.wikipedia.org/wiki/Metadata) to describe what each column means, measurement units, etc.

The [`savetxt`](https://numpy.org/doc/stable/reference/generated/numpy.savetxt.html) function expects tabular data to be organized in a two-dimensional array, so we use the [`stack`](https://numpy.org/doc/stable/reference/generated/numpy.stack.html) function to arrange the 1D arrays `L`, `Tm`, and `alpha_t_vec` as columns in the 2D array `data_out`. The `axis=1` option indicates that the arrays will be stacked along the second dimension, or columnwise (the first dimension, `axis=0`, corresponds to the rows of a 2D array). We create the 1D array `alpha_t_vec` from the scalar `alpha_t` by multiplying `alpha_t` by an array of [`ones`](https://numpy.org/doc/stable/reference/generated/numpy.ones.html) that has the same size as `L` and `Tm`; we can automatically determine this size by using `.` notation to access the [`shape`](https://numpy.org/doc/stable/reference/generated/numpy.shape.html#numpy.shape) attribute.

In [None]:
# Show shape of L
print("Shape of L:", L.shape)

# Create alpha_t_vec with same size as L,
# with all elements equal to alpha_t
alpha_t_vec = alpha_t * np.ones(L.shape)

# Create 2D array with L, Tm, alpha_t_vec as columns
data_out = np.stack((L, Tm, alpha_t_vec), axis=1)

# Show data_out
print("data_out = ", data_out)

# Save data with 2 digit precision, an explanatory header, and comma-delimited
np.savetxt("data/pendulum.csv", data_out, fmt="%5.2f", header="L (cm), T (s), alpha_T (s)", delimiter=",")
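To see what the `axis` argument does, the following sketch stacks three short arrays both ways and compares the resulting shapes. The arrays `a`, `b`, and `c` are illustrative placeholders, not the pendulum data, and `np.column_stack` is shown only as an equivalent shortcut for 1D inputs.

```python
import numpy as np

# Illustrative 1D arrays (placeholders, not the pendulum data)
a = np.array([1.0, 2.0, 3.0, 4.0])
b = np.array([10.0, 20.0, 30.0, 40.0])
c = 0.5 * np.ones(a.shape)

# axis=0 stacks the arrays as rows: shape (3, 4)
rows = np.stack((a, b, c), axis=0)

# axis=1 stacks the arrays as columns: shape (4, 3)
cols = np.stack((a, b, c), axis=1)

# For 1D inputs, column_stack gives the same result as stack with axis=1
cols_alt = np.column_stack((a, b, c))

print("rows.shape =", rows.shape)
print("cols.shape =", cols.shape)
print("Same as column_stack:", np.array_equal(cols, cols_alt))
```

The columnwise arrangement is the one that `savetxt` needs here, because each line of the CSV file should hold one value from each variable.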

You can now verify with your operating system and a text editor that the file `pendulum.csv` in the directory `data` has the specified format. Alternatively, we can show the file contents from within this Jupyter notebook by using the built-in Python function [`open`](https://docs.python.org/3/library/functions.html#open) to create the [file object](https://docs.python.org/3/glossary.html#term-file-object) `file`, which provides Python with methods for interacting with the file that it points to. The [`read`](https://docs.python.org/3/tutorial/inputoutput.html#methods-of-file-objects) method returns the contents of the file. Following [recommended practice](https://docs.python.org/3/tutorial/inputoutput.html#reading-and-writing-files), we use the [`with`/`as`](https://docs.python.org/3/reference/compound_stmts.html#with) keywords to assign the output of `open("data/pendulum.csv")` to the `file` variable. This tells the Python interpreter to execute the subsequent `contents = file.read()` statement within a special [context](https://docs.python.org/3/reference/datamodel.html#context-managers) that will handle file errors gracefully if they occur. If `file.read()` produces an error, for example, Python will close the file before returning the error and stopping execution.

In [None]:
# Show file contents
with open("data/pendulum.csv") as file:
    contents = file.read()

print(contents)
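As a small check of this behavior, the sketch below inspects the `closed` attribute of the file object inside and after the `with` block. The temporary directory, the file name `example.csv`, and its contents are hypothetical and are used only so that the example runs on its own.

```python
import os
import tempfile

# Work in a temporary directory so that no real data file is touched
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "example.csv")

    # Write a small file to read back (hypothetical contents)
    with open(path, "w") as file:
        file.write("# x, y\n1.00, 2.00\n")

    # Read the file inside a with block
    with open(path) as file:
        contents = file.read()
        closed_inside = file.closed

    # The file object still exists here, but the file has been closed
    closed_outside = file.closed

print(contents)
print("Closed inside the with block:", closed_inside)
print("Closed after the with block: ", closed_outside)
```

The file is closed automatically when the `with` block ends, so there is no need to call `file.close()` explicitly.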

### Import data from a file
The variable `contents` represents the data as a long text string, not a numerical array. The [`genfromtxt`](https://numpy.org/doc/stable/reference/generated/numpy.genfromtxt.html) function provides a way to generate a NumPy array from formatted text input. We set `skip_header=1` to ignore the first line and set `delimiter=','` to treat commas as delimiters between array elements.

In [None]:
# Load data from file into array
data_in = np.genfromtxt("data/pendulum.csv", skip_header=1, delimiter=",")
print(data_in)
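Note that the values read back from the file are not identical to the values that were saved, because the `fmt="%5.2f"` option rounds each number to two decimal places. The following sketch illustrates this with a round trip through an in-memory text buffer; using `io.StringIO` in place of a file on disk and the array name `values` are assumptions made here so that the example is self-contained.

```python
import io

import numpy as np

rng = np.random.default_rng(0)

# Illustrative data: 5 rows, 2 columns of random numbers
values = rng.normal(size=(5, 2))

# Save to an in-memory text buffer instead of a file on disk
buffer = io.StringIO()
np.savetxt(buffer, values, fmt="%5.2f", header="a, b", delimiter=",")

# Rewind the buffer and read the text back into an array
buffer.seek(0)
values_in = np.genfromtxt(buffer, skip_header=1, delimiter=",")

# Rounding to 2 decimal places changes each value by at most 0.005
max_diff = np.max(np.abs(values_in - values))
print("Largest round-trip difference:", max_diff)
```

If the rounding error matters for your analysis, choose a format with more digits, such as `fmt="%.6e"`.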

Once we have loaded the data as a 2D array we can use indexing to assign individual columns to separate variables.

In [None]:
L_in = data_in[:, 0]
Tm_in = data_in[:, 1]
alpha_t_in = data_in[:, 2]

print("L_in = ", L_in)
print()
print("Tm_in = ", Tm_in)
print()
print("alpha_t_in = ", alpha_t_in)

Alternatively, we can use the optional `unpack` keyword to tell `genfromtxt` to direct each text column to a different variable.

In [None]:
L_unpack, Tm_unpack, alpha_t_unpack = np.genfromtxt("data/pendulum.csv", skip_header=1, delimiter=",", unpack=True)

print("L_unpack = ", L_unpack)
print()
print("Tm_unpack = ", Tm_unpack)
print()
print("alpha_t_unpack = ", alpha_t_unpack)

### Exercise 1
Create NumPy arrays `current_out`, `voltage_out`, and `alpha_voltage_out` that contain the data described in *MU* Prob. (5.3), and use `savetxt` to save the data to a CSV file `resistance.csv` in the `data` directory with a header that describes the column format. Finally, use `genfromtxt` to read back the contents of the file into the new variables `current_in`, `voltage_in`, and `alpha_voltage_in`.

## Basic plotting

### Example: Demonstrating Ohm's law
Consider the data tabulated in Prob. (5.3), taken from an experiment to verify Ohm's law. In the following code cells we will develop a plot that demonstrates Ohm's law and compares the resistance measurements to our expectations for a 100&nbsp;&Omega; resistor.

#### Assign data variables
Start by assigning the data to [NumPy arrays](https://numpy.org/doc/stable/reference/arrays.ndarray.html) (recall from the [Programming notes 3](4.0-Error-propagation.ipynb#Programming-notes-3) section in the last notebook that NumPy arrays support mathematical operations that are not defined for Python lists).

In [None]:
# Load data
current, voltage, alpha_voltage = np.genfromtxt("data/resistance.csv", skip_header=1, delimiter=",", unpack=True)

#### Choose the independent and dependent variables
The problem states that the voltmeter precision is 0.01 mV, and that the uncertainty in the current is negligible. This suggests we use the current as the independent variable and use an [`errorbar`](https://matplotlib.org/stable/api/_as_gen/matplotlib.pyplot.errorbar.html?highlight=errorbar) plot to show the voltage uncertainty.

#### Choose the functional relationship to show
Ohm's law is a linear relationship, so we will just plot the voltage directly as a function of current. Later we will see how to demonstrate certain types of nonlinear relationships by *linearizing* the data.

In [None]:
# Plot data
plt.errorbar(current, voltage, yerr=alpha_voltage)
plt.show()

#### Choose appropriate scales for the axes
Ohm's law states that the voltage should be not just linear in the current, but *proportional* to it, so a line through the data should pass through the origin. We can then provide the viewer with an implicit test of the model by extending the plot range to place the origin at the lower left corner of the plot. The upper limits of both the voltage and current are less critical, but they are both close to a power of 10, so we will adjust these, too.

Note also that the default behavior of `errorbar` is to plot the data points *without markers* and to connect them with straight line segments. Generally it is better to represent data like this *with markers* that are large enough for the viewer to see easily and to leave them *disconnected* so that we can use lines to represent a model fit to the data. The error bars are very small on the scale of the plot, so we will switch over to the regular [`plot`](https://matplotlib.org/stable/api/_as_gen/matplotlib.pyplot.plot.html) command—we can inform the reader of the voltage uncertainty with text.

In [None]:
# Plot data with circular markers
plt.plot(current, voltage, "o")

# Set axis limits to include origin
plt.xlim(0, 100)
plt.ylim(0, 10)
plt.show()

#### Label the axes
Now label the axes, including appropriate units. Avoid units that force the tick labels to have excessive zeros after the decimal point (e.g., 0.0001, 0.0002, ...) or that require the entire range to be represented with scientific notation (e.g., 1.0 &times; 10<sup>9</sup>, 2.0 &times; 10<sup>9</sup>, ..., which would be represented as 1.0, 2.0, ..., with the 10<sup>9</sup> located at the end of the axis where it could easily be missed). Matplotlib supports $\LaTeX$ expressions enclosed by dollar signs, so we write `$\mu$A` for microamperes. Matplotlib also supports [unicode](https://en.wikipedia.org/wiki/Unicode), so you can enter the `µ` with your keyboard if it is available.

In [None]:
# Plot data with circular markers
plt.plot(current, voltage, "o")

# Set axis limits to include origin
plt.xlim(0, 100)
plt.ylim(0, 10)

# Add axis labels
plt.xlabel(r"Current ($\mu$A)")
plt.ylabel("Voltage (mV)")
plt.show()

#### Compare with model
We will discuss how to fit a model to data shortly; for now, we just compare the data with the hypothesis that the resistance is 100&nbsp;&Omega;. If you look carefully you will see that the data fall consistently below the model line, indicating that the measured resistance is slightly less than 100&nbsp;&Omega;. This discrepancy would be revealed more clearly in a residual plot, which we will discuss below.

In [None]:
# Plot data with circular markers
plt.plot(current, voltage, "o")

# Model behavior expected for R = 100 ohms
R_model = 100
current_model = np.linspace(0, 100)
voltage_model = R_model * (1e-6 * current_model) * 1e3  # (mV)

# Plot model with a solid line
plt.plot(current_model, voltage_model, "-")

# Set axis limits to include origin
plt.xlim(0, 100)
plt.ylim(0, 10)

# Add axis labels
plt.xlabel(r"Current ($\mu$A)")
plt.ylabel("Voltage (mV)")
plt.show()
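As a brief preview of the residual idea, the sketch below computes the difference between each measurement and the model prediction, both in millivolts and in units of the voltage uncertainty. The measurements here are simulated with an assumed resistance `R_true` and an assumed uncertainty of 0.01 mV; they are hypothetical and are not the data from *MU* Prob. (5.3).

```python
import numpy as np

# Hypothetical measurements: simulated for illustration only,
# NOT the data from MU Prob. (5.3)
rng = np.random.default_rng(1)
R_true = 98.0  # (ohms), assumed for this sketch
alpha_voltage = 0.01  # (mV), assumed voltage uncertainty
current = np.arange(10.0, 100.0, 10.0)  # (uA)
voltage = R_true * (1e-6 * current) * 1e3  # (mV)
voltage = voltage + alpha_voltage * rng.normal(size=current.size)

# Model prediction for R = 100 ohms
R_model = 100.0
voltage_model = R_model * (1e-6 * current) * 1e3  # (mV)

# Residuals: measurement minus model, also in units of the uncertainty
residual = voltage - voltage_model
residual_norm = residual / alpha_voltage

print("Residuals (mV):", np.round(residual, 3))
print("Normalized residuals:", np.round(residual_norm, 1))
```

Residuals that are negative on average and grow in magnitude with the current, as in this simulated case, point to a resistance that is smaller than the model value.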

#### Annotate plot
How you annotate a plot will depend on the context. Normally a laboratory notebook should include figures with titles, while research papers and formal laboratory reports should include explanatory information in captions instead of titles. Oral presentation slides typically have their own titles, but sometimes it is useful to include titles for individual figures in a group. Legends are usually acceptable in any context as long as they don't occupy too much space—otherwise, use a caption to describe marker and line associations.

In [None]:
# Plot data with circular markers
plt.plot(current, voltage, "o")

# Model behavior expected for R = 100 ohms
R_model = 100
current_model = np.linspace(0, 100)
voltage_model = R_model * (1e-6 * current_model) * 1e3  # (mV)

# Plot model with a solid line
plt.plot(current_model, voltage_model, "-")

# Set axis limits to include origin
plt.xlim(0, 100)
plt.ylim(0, 10)

# Add axis labels
plt.xlabel(r"Current ($\mu$A)")
plt.ylabel("Voltage (mV)")

# Add legend
plt.legend(["Data", r"100 $\Omega$"])

# Add title
plt.title("Measured current-voltage vs. expected for 100 ohms")
plt.show()

#### Export and check format
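The command [`tight_layout`](https://matplotlib.org/stable/api/_as_gen/matplotlib.pyplot.tight_layout.html) adjusts the padding around the axes so that the tick labels, axis labels, and title fit inside the figure area, and [`savefig`](https://matplotlib.org/stable/api/_as_gen/matplotlib.pyplot.savefig.html) writes the figure to a file. Open the exported file to check that it looks the way you intended.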

In [None]:
# Plot data with circular markers
plt.plot(current, voltage, "o")

# Model behavior expected for R = 100 ohms
R_model = 100
current_model = np.linspace(0, 100)
voltage_model = R_model * (1e-6 * current_model) * 1e3  # (mV)

# Plot model with a solid line
plt.plot(current_model, voltage_model, "-")

# Set axis limits to include origin
plt.xlim(0, 100)
plt.ylim(0, 10)

# Add axis labels
plt.xlabel(r"Current ($\mu$A)")
plt.ylabel("Voltage (mV)")

# Add legend
plt.legend(["Data", r"100 $\Omega$"])

# Add title
plt.title("Resistance measurement example")

# Adjust padding so that labels fit within the figure
plt.tight_layout()

# Save PDF for notebook
plt.savefig("resistance_notebook.pdf")
plt.show()

#### Adjust format as necessary
A report that uses 8.5" x 11" paper with 1" margins has a 6.5" text width. We set the figure width to 80% of this and set the height to yield an overall 3:2 aspect ratio. Normally the font sizes for the axis labels, tick labels, and legend should be the same as or smaller than that of the report text, which we assume here is 12 [points](https://en.wikipedia.org/wiki/Point_%28typography%29) (1 point = 1/72 inch). We also explicitly set the data marker size to 5 points and the model line width to 1.25 points; normally, the default marker size is 6 points and the default line width is 1.5 points. Finally, we set [`zorder=3`](https://matplotlib.org/stable/gallery/misc/zorder_demo.html) in the first call to `plot` to put the markers on top of the model line (alternatively, we could also just change the order of the `plot` commands).

In [None]:
# Set figure width to 80% of the 6.5" text width
w_text = 6.5
w = 0.8 * w_text
h = (2.0 / 3.0) * w
fig = plt.figure(figsize=(w, h))

# Plot data with circular markers
# Set zorder=3 to put markers on top of model curve
# Set marker size to 5 points
plt.plot(current, voltage, "o", markersize=5, zorder=3)

# Model behavior expected for R = 100 ohms
R_model = 100
current_model = np.linspace(0, 100)
voltage_model = R_model * (1e-6 * current_model) * 1e3  # (mV)

# Plot model with a solid line
plt.plot(current_model, voltage_model, "-", linewidth=1.25)

# Set axis limits to include origin
plt.xlim(0, 100)
plt.ylim(0, 10)

# Set tick label font size
plt.xticks(fontsize=10)
plt.yticks(fontsize=10)

# Add axis labels
plt.xlabel(r"Current ($\mu$A)", fontsize=12)
plt.ylabel("Voltage (mV)", fontsize=12)

# Add legend
plt.legend(["Data", r"100 $\Omega$"], fontsize=10)

# Adjust padding so that labels fit within the figure
plt.tight_layout()

# Save PDF for report
plt.savefig("resistance_report.pdf")
plt.show()
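The sizing arithmetic above can be checked with a few lines of plain Python. This is only a sketch of the unit conversions; the variable names `w_page`, `margin`, `marker_in`, and `font_in` are introduced here for illustration.

```python
# Page geometry in inches (8.5" x 11" paper with 1" margins)
w_page = 8.5
margin = 1.0
w_text = w_page - 2 * margin

# Figure size: 80% of the text width with a 3:2 aspect ratio
w = 0.8 * w_text
h = (2.0 / 3.0) * w

# Convert typographic points to inches (1 point = 1/72 inch)
points_per_inch = 72
marker_in = 5 / points_per_inch
font_in = 12 / points_per_inch

print(f"Text width: {w_text:.2f} in")
print(f"Figure size: {w:.2f} in x {h:.2f} in")
print(f"5 pt marker: {marker_in:.3f} in")
print(f"12 pt font: {font_in:.3f} in")
```

Because `figsize` is given in inches and font, marker, and line sizes are given in points, a figure that is included in the report at its native size keeps these proportions on the page.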

## Linearizing relationships
Linearizing a graph can be a very effective way to demonstrate the validity of certain functional relationships between two quantities. Consider first the example given in *MU* Sec. 5.1.2, the period *T* of a pendulum as a function of its length *L*. Use the random number generator to simulate period measurements with a constant standard error *&alpha;<sub>T</sub>*&nbsp;=&nbsp;0.05&nbsp;s, with pendulum lengths ranging from 10&nbsp;cm to 100&nbsp;cm. Use `T` for the model prediction, and `Tm` for the simulated measurements.

In [None]:
# Assign variables
g = 9.8  # (m/s^2)
L = np.arange(10, 105, 10)  # (cm), 10 cm to 100 cm inclusive
T = 2 * np.pi * np.sqrt((L / 100) / g)  # (s)
alpha_t = 0.05  # (s)

# Seed RNG and simulate data
rng = default_rng(0)

Tm = T + alpha_t * rng.normal(size=np.size(T))

# Make plot
plt.errorbar(L, Tm, yerr=alpha_t, fmt="o")
plt.xlabel("Length (cm)")
plt.ylabel("Period (s)")
plt.show()

Now plot `Tm**2` versus `L` to linearize the graph. Remember that we also need to propagate the uncertainty in `Tm` to produce the correct uncertainty in `Tm**2`; we'll use the calculus-based method of Sec. 4.1.2.

In [None]:
Tm_sq = Tm**2
alpha_tm_sq = 2 * alpha_t * Tm

plt.errorbar(L, Tm_sq, alpha_tm_sq, fmt="o", markersize=5)
plt.xlabel("Length (cm)")
plt.ylabel("(Period)$^2$ (s$^2$)")
plt.show()
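One way to build confidence in the propagated uncertainty is to compare it with a direct simulation. The sketch below does this for a single illustrative period, `T0 = 1.0` s (an assumed value, not one of the simulated pendulum measurements), by squaring a large number of simulated measurements and taking the standard deviation of the results.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed illustrative values: a 1 s period with 0.05 s standard error
T0 = 1.0  # (s)
alpha_t = 0.05  # (s)

# Calculus-based propagation: alpha_Z = |dZ/dT| * alpha_T, with Z = T**2
alpha_calc = 2 * T0 * alpha_t

# Monte Carlo check: square many simulated measurements and
# take the standard deviation of the results
n_samples = 200_000
T_sim = T0 + alpha_t * rng.normal(size=n_samples)
alpha_mc = np.std(T_sim**2)

print(f"Calculus-based: {alpha_calc:.4f} s^2")
print(f"Monte Carlo:    {alpha_mc:.4f} s^2")
```

The two estimates should agree closely as long as the uncertainty is small compared with the period, which is the condition under which the calculus-based approximation holds.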

### Semilogarithmic plots
Another common linearization method uses the [`semilogy`](https://matplotlib.org/stable/api/_as_gen/matplotlib.pyplot.semilogy.html) function, which is useful for demonstrating an *exponential* functional dependence. Consider, for example, the voltage across a 47 nF capacitor as it discharges from an initial value of 1 V through a 100 k&Omega; resistor. Use the random number generator to add Gaussian noise with a standard deviation of 10 microvolts.

In [None]:
x = np.linspace(0, 100)  # ms
R = 100e3  # Ω
C = 47e-9  # F
tau = R * C * 1000  # ms
V0 = 1  # V
alpha_v = 10e-6  # V

V = V0 * np.exp(-x / tau)
Vm = V + alpha_v * rng.normal(size=np.size(V))

First plot the data using linear scales.

In [None]:
plt.plot(x, Vm, "o", x, V, "r-")
plt.xlabel("Time (ms)")
plt.ylabel("Voltage (V)")
plt.show()

Now plot the data using a semilogarithmic scale.

In [None]:
plt.semilogy(x, Vm, "o", x, V, "r-")
plt.xlabel("Time (ms)")
plt.ylabel("Voltage (V)")
plt.show()

The linear plot shows little variation for *t* greater than 20 ms, while the semilogarithmic plot allows this data to be visualized much more effectively. The model function appears as a straight line on the `semilogy` plot, while the data follows a straight line for *t* less than about 50 ms. As *t* increases above 50 ms, the noise begins to dominate and the voltage fluctuates randomly with a 10 microvolt width about zero. About half of these fluctuations have negative values that are excluded from the plot because their logarithm is undefined.

Most of the remaining positive values are greater than 1 microvolt: with 10 microvolt Gaussian noise centered at zero, it is more than ten times more likely that the voltage will be greater than one microvolt than between one microvolt and zero.
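The probability estimate above can be checked with the cumulative distribution function of the normal distribution, written here in terms of the error function from the Python standard library. The helper `normal_cdf` and the name `v_cut` are introduced for this sketch only.

```python
from math import erf, sqrt


def normal_cdf(z):
    """Cumulative distribution function of the standard normal distribution."""
    return 0.5 * (1 + erf(z / sqrt(2)))


alpha_v = 10e-6  # (V) noise standard deviation
v_cut = 1e-6  # (V) threshold of one microvolt
z_cut = v_cut / alpha_v

# Probability that the noise lies between zero and one microvolt
p_small = normal_cdf(z_cut) - normal_cdf(0)

# Probability that the noise exceeds one microvolt
p_large = 1 - normal_cdf(z_cut)

ratio = p_large / p_small
print(f"P(0 < V < 1 uV) = {p_small:.4f}")
print(f"P(V > 1 uV)     = {p_large:.4f}")
print(f"Ratio           = {ratio:.1f}")
```

The ratio comes out a little above eleven, consistent with the statement that values above one microvolt are more than ten times as likely as values between zero and one microvolt.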

### Using a graph to see trends in the data
Reproduce Fig. 5.5. The text gives the fit parameters, and we can estimate the noise level from the tail in Fig. 5.5(b).

In [None]:
# Assign true parameter values
tau = [8.67, 67]
a = [6.5e7, 0.5e7]

alpha_n = 1e5

# Generate evenly spaced points for t
n_points = 200
t = np.linspace(0, 60, n_points)

# Simulate measurements
n = a[0] * np.exp(-t / tau[0]) + a[1] * np.exp(-t / tau[1])

nm = n + alpha_n * rng.normal(size=len(n))

# Plot the results on linear scales
plt.plot(t, nm / 1e6, "o")
plt.xlabel("Time (seconds)")
plt.ylabel("Number of atoms/$10^6$")
plt.text(55, 60, "(a)")
plt.show()


# Plot the same results on a semilogarithmic scale
plt.semilogy(t, nm / 1e6, "o")
plt.xlabel("Time (seconds)")
plt.ylabel("Number of atoms/$10^6$")
plt.text(55, 60, "(b)")
plt.show()

To show the trend lines in Fig. 5.5(c), fit separately to the data for *t < 20* and *t > 40*. Use [Boolean array indexing](https://numpy.org/doc/stable/user/basics.indexing.html#boolean-array-indexing) to select data in these two ranges. To see how Boolean array indexing works, consider the following expressions.

In [None]:
seq = np.arange(0, 10)
seq_lim = 5
print(seq)
# Boolean mask: True where the value is > seq_lim
print(seq > seq_lim)
# Return all the values that are > seq_lim
print(seq[seq > seq_lim])
# Boolean mask: True where the value is < seq_lim
print(seq < seq_lim)
# Return all the values that are < seq_lim
print(seq[seq < seq_lim])

The expression `seq > seq_lim` evaluates the inequality for each element of `seq` to assign a `True` or `False` value to the corresponding element in a Boolean array of the same size. The expression `seq[seq > seq_lim]`, then, returns only those elements of `seq` for which `seq > seq_lim` is `True`.
Now use this to fit to the data for `t < 20` and `t > 40`.
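One possible approach is sketched below: select each time range with a Boolean mask, then fit a straight line to the logarithm of the data with `np.polyfit`. The unweighted straight-line fit is an assumption made here for simplicity and is not necessarily the procedure used to produce Fig. 5.5(c); the simulation parameters are repeated so that the example runs on its own.

```python
import numpy as np

rng = np.random.default_rng(0)

# Same simulation parameters as in the cell above
tau = [8.67, 67]
a = [6.5e7, 0.5e7]
alpha_n = 1e5
t = np.linspace(0, 60, 200)
n = a[0] * np.exp(-t / tau[0]) + a[1] * np.exp(-t / tau[1])
nm = n + alpha_n * rng.normal(size=len(n))

# Boolean masks that select the early and late time ranges
early = t < 20
late = t > 40

# Fit a straight line to log(nm) versus t in each range.
# This unweighted polyfit is a simple sketch, not the fit used in MU.
slope_early, intercept_early = np.polyfit(t[early], np.log(nm[early]), 1)
slope_late, intercept_late = np.polyfit(t[late], np.log(nm[late]), 1)

# The decay time in each range is the negative reciprocal of the slope
print(f"Early-time decay constant: {-1 / slope_early:.1f} s")
print(f"Late-time decay constant:  {-1 / slope_late:.1f} s")
```

Both exponentials contribute in each range, so the fitted decay times differ somewhat from the parameter values in `tau`; the trend lines are still useful for showing that the data contain two distinct time scales.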

Your turn: make a plot of the current through an ideal diode as a function of applied voltage.

In [None]:
v = np.arange(0, 0.6, 0.001)  # V

k_b = 1.3806e-23  # J/K
q_e = 1.6022e-19  # C
T_r = 296  # Room temperature, in kelvin

v_t = k_b * T_r / q_e  # Thermal voltage, in volts
i_s = 1e-12  # Saturation current, in amperes

i = i_s * (np.exp(v / v_t) - 1)
plt.plot(v, i)
plt.xlabel("Voltage (V)")
plt.ylabel("Current (A)")
plt.show()

### Log-log plots
Yet another way to linearize data is to use a [`loglog`](https://matplotlib.org/stable/api/_as_gen/matplotlib.pyplot.loglog.html) plot. These are useful for identifying a power-law relationship between two quantities, i.e.,

$$y = A x^\alpha $$

where $A$ and $\alpha$ are unknown. Taking the logarithm of both sides yields a linear relationship between $\log y$ and $\log x$, with slope $\alpha$ and intercept $\log A$:

$$\log y = \log A + \alpha \log x $$

Use `loglog` to demonstrate this with the `L`, `Tm` data. Show the data as circles (`o`), and add a red line (`r-`) to show the model behavior, *T*, and a blue dashed line (`b--`) to show `sqrt(L)`, to demonstrate the advantage of using `loglog` to identify a power-law functional dependence. The data fall approximately on a straight line, with a *slope* similar to the plot of `sqrt(L)` versus `L`.

In [None]:
plt.loglog(L, Tm, "o", L, T, "r-", L, np.sqrt(L), "b--")
plt.xlabel("Length (cm)")
plt.ylabel("Period (s)")
plt.show()
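The slope on a log-log plot can also be estimated numerically. The following sketch fits a straight line to $\log T$ versus $\log L$ with `np.polyfit`, for both the model and the simulated measurements. The unweighted fit ignores the measurement uncertainties, so treat it as a rough estimate of the exponent rather than a proper least-squares analysis; the pendulum simulation is repeated so that the example is self-contained.

```python
import numpy as np

rng = np.random.default_rng(0)

# Recreate the simulated pendulum data
g = 9.8  # (m/s^2)
L = np.arange(10, 105, 10)  # (cm)
T = 2 * np.pi * np.sqrt((L / 100) / g)  # (s)
alpha_t = 0.05  # (s)
Tm = T + alpha_t * rng.normal(size=np.size(T))

# Slope of log(T) versus log(L) estimates the exponent alpha.
# This unweighted fit ignores the uncertainties; it is only a sketch.
exponent_model, log_a_model = np.polyfit(np.log(L), np.log(T), 1)
exponent_data, log_a_data = np.polyfit(np.log(L), np.log(Tm), 1)

print(f"Exponent from model:          {exponent_model:.3f}")
print(f"Exponent from simulated data: {exponent_data:.3f}")
```

The model exponent is 1/2 by construction, and the exponent from the simulated data should be close to it, which is the numerical counterpart of the parallel lines in the `loglog` plot.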

## Summary
Here is a list of what you should be able to do after completing this notebook.
* Use the NumPy function [`savetxt`](https://numpy.org/doc/stable/reference/generated/numpy.savetxt.html) to save a NumPy array as formatted text.
* Use the NumPy function [`genfromtxt`](https://numpy.org/doc/stable/reference/generated/numpy.genfromtxt.html) to read data from a formatted text file into a NumPy array.
* Use the built-in Python keyword [`with`](https://docs.python.org/3/tutorial/inputoutput.html#reading-and-writing-files) to create a context manager for working with files.
* Use the built-in Python function [`open`](https://docs.python.org/3/library/functions.html#open) to create a [file object](https://docs.python.org/3/glossary.html#term-file-object).
* Use the [`read`](https://docs.python.org/3/tutorial/inputoutput.html#methods-of-file-objects) method for file objects.
* Use the NumPy function [`stack`](https://numpy.org/doc/stable/reference/generated/numpy.stack.html) to arrange a set of NumPy arrays into a higher-dimensional NumPy array.
* Use the [`shape`](https://numpy.org/doc/stable/reference/generated/numpy.shape.html#numpy.shape) attribute of a NumPy array to determine its size.
* Use functions from the pyplot library to create, format, and save plots.
* Linearize a graph by applying a transformation to one or both axes.
* Interpret the functional relationship that is represented by a linearized graph.
* Use the [`semilogy`](https://matplotlib.org/stable/api/_as_gen/matplotlib.pyplot.semilogy.html) and [`loglog`](https://matplotlib.org/stable/api/_as_gen/matplotlib.pyplot.loglog.html) functions from the pyplot library to linearize exponential and power-law relationships, respectively.
* Use [Boolean array indexing](https://numpy.org/doc/stable/user/basics.indexing.html#boolean-array-indexing) to select a subset of a NumPy array.

##### About this notebook
© J. Steven Dodge, 2019. The notebook text is licensed under CC BY 4.0. See more at [Creative Commons](https://creativecommons.org/licenses/by/4.0/). The notebook code is open source under the [MIT License](https://opensource.org/licenses/MIT).