
# Plot Types - Continuous Data

## Scatter Plots

The ```plt.scatter``` command ([manual page](https://matplotlib.org/stable/api/_as_gen/matplotlib.pyplot.scatter.html)) allows us to produce a scatter plot. The first two arguments describe the x and y coordinates of the points respectively. The first entry in x pairs with the first entry in y to give the coordinates of the first point, and so on. As a result, these sequences must have the same number of entries.

We can also specify the size of the points using the ```s``` optional argument. If this value is a scalar, the size applies to all points. If it's a sequence, it must have the same number of entries as the x and y sequences, and each entry applies to the corresponding point. Accepting either a scalar or a sequence is a fairly common pattern in ```matplotlib```: in the documentation, look out for phrases like "float or array-like".
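
As a small sketch of the sequence form (the coordinates and sizes here are made-up illustration values), each entry of ```s``` sets the size of the matching point. Note that ```s``` is measured in points squared, so it describes the marker's area rather than its width.

```python
import matplotlib.pyplot as plt

# Illustrative coordinates for 4 points
x = [0, 4, -2, 10]
y = [5, 1, 10, 2]

# One size per point, in the same order as x and y
sizes = [20, 60, 120, 240]

# plt.scatter returns the collection of plotted points
points = plt.scatter(x, y, s=sizes)

# The collection stores one size for each point
print(points.get_sizes())
```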

The ```marker``` optional argument accepts a string which selects the shape used to draw each point. The available markers are listed [here](https://matplotlib.org/stable/api/markers_api.html#module-matplotlib.markers).

In [None]:
import matplotlib.pyplot as plt

# Specify the x and y coordinates of the 4 points
x = [0, 4, -2, 10]
y = [5, 1, 10, 2]

# Define the variable which will specify the size of the points
# This may be a scalar, which applies to all points,
# or a sequence, with one value given to each point
s = 100
#s = [20, 20, 20, 100]

# Plot the scatter plot
# Provide x and y to give coordinates
# provide s to set the size(s)
# Set marker to set the marker type for the points
plt.scatter(x, y, s=s, marker="x")

## Line Plots

A line plot can be created using ```plt.plot``` ([manual page](https://matplotlib.org/stable/api/_as_gen/matplotlib.pyplot.plot.html#matplotlib.pyplot.plot)). Just like the scatter plot, the first two arguments give the x and y coordinates of a series of points which define the line.

We can also set the scale of the y-axis to be logarithmic by writing ```plt.yscale("log")```. Other options for the scaling of the y-axis can be found [here](https://matplotlib.org/stable/api/_as_gen/matplotlib.pyplot.yscale.html).

When we use the ```plot``` command we can set a number of properties of the figure and the line. We can set the colour of the line using the ```color``` optional argument. Possible values are recorded [here](https://matplotlib.org/stable/gallery/color/named_colors.html).

We can set the pattern of the line (solid, dashed, dotted and so on) using the ```linestyle``` optional argument, with possible options described [here](https://matplotlib.org/stable/api/_as_gen/matplotlib.lines.Line2D.html#matplotlib.lines.Line2D.set_linestyle). The ```linewidth``` optional argument accepts a numerical value which sets the width of the line.
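
As an illustrative sketch (the data and the vertical offsets are invented for the example), the four shorthand line styles can be compared side by side by plotting the same line several times:

```python
import matplotlib.pyplot as plt

x = [0, 1, 2, 3]

# Shorthand strings for solid, dashed, dash-dot and dotted lines
styles = ["-", "--", "-.", ":"]

lines = []
for offset, style in enumerate(styles):
    # Shift each line upwards so they don't overlap
    y = [value + offset for value in x]
    # plt.plot returns a list of lines; we plotted one, so take the first
    line = plt.plot(x, y, color="black", linestyle=style, linewidth=2)[0]
    lines.append(line)
```

The longer names ```"solid"```, ```"dashed"```, ```"dashdot"``` and ```"dotted"``` are accepted as equivalents of the shorthand strings.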

In [None]:
import matplotlib.pyplot as plt

# Set up the values of the points to define the line
x = [-10, -5, 0, 5, 10, 15, 20]
y = [0.1, 0.3, 1.1, 3.5, 10.3, 31, 111]

# Set the y-axis to be logarithmic
plt.yscale("log")

# Plot the line
# Set the colour to red, the line style to dash-dot and the line width to 5
plt.plot(x, y, color="red", linestyle="-.", linewidth=5)

### Multiple lines

By using the ```plt.plot``` command multiple times, multiple lines can be placed on a single figure. By specifying the ```linestyle``` and ```color``` independently, we can distinguish the lines. Note that, as we haven't specified a ```linewidth```, it falls back to the default value (1.5 in Matplotlib's default style).

We can also set a label for each line, which can then be included on the figure using the ```legend``` command. Note that, in the labels here, we've inserted some subscript text. This works because text inside a pair of dollar signs in Matplotlib is interpreted in "mathmode" (the same as mathmode in LaTeX). This allows the creation of a number of [different mathematical effects](https://matplotlib.org/stable/tutorials/text/mathtext.html), including subscript text, which is written as an underscore followed by text in a pair of curly brackets.
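
A minimal sketch of mathmode labels follows; the curves and label text are invented for illustration. Writing the labels as raw strings (prefixed with ```r```) stops Python from treating backslashes, such as the one in ```\frac```, as escape characters.

```python
import matplotlib.pyplot as plt

x = [0, 1, 2, 3, 4]

# An underscore gives a subscript, a caret gives a superscript
plt.plot(x, [v ** 2 for v in x], label=r"$y_{1} = x^{2}$")
plt.plot(x, [v / 2 for v in x], label=r"$y_{2} = \frac{1}{2}x$")

# plt.legend returns the legend, which holds the label text
legend = plt.legend()
labels = [text.get_text() for text in legend.get_texts()]
```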

We can also specify the exact values of the ticks of the x-axis using ```plt.xticks``` and providing a sequence containing the desired values.

In [None]:
import matplotlib.pyplot as plt

# Set up the x-values. These can be used for both lines
x = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
# Set up the y-values for each line
y1 = [10, 8, 6.4, 5, 4, 3.2, 2.5, 2, 1.6, 1.3, 1]
y2 = [0, 0.5, 1, 1.5, 2, 2.5, 3, 3.5, 4, 4.5, 5]

# Specify the values of the xticks
plt.xticks([0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10])

# Plot each line
# Give each line a unique appearance and label
plt.plot(x, y1, color="blue", linestyle="-", label="CaCO$_{3}$")
plt.plot(x, y2, color="red", linestyle=":", label="H$_{2}$O $\\frac{1}{2}$")
# Add the legend
plt.legend()

## Error Bars

It's possible to plot data points including error bars using the ```plt.errorbar``` command ([manual page](https://matplotlib.org/stable/api/_as_gen/matplotlib.pyplot.errorbar.html)). This receives arguments giving the x and y coordinates of the points, and we may optionally specify the size of the x and y error bars using the ```xerr``` and ```yerr``` optional arguments. Common values for these arguments include a sequence (with one entry for each point) or a single value specifying the error for all points.
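
As a hedged sketch of the sequence form, the example below builds a ```yerr``` list in which each error is 5% of the corresponding y value, while a single number is used for ```xerr```. The data and the 5% figure are made up for illustration.

```python
import matplotlib.pyplot as plt

x = [1, 2, 3, 4]
y = [10, 20, 40, 80]

# One error per point: 5% of each y value
yerr = [0.05 * value for value in y]

# A single value applies the same x error to every point
plt.errorbar(x, y, xerr=0.1, yerr=yerr)
```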

By default, the points will be joined with a line. However, this can be suppressed by giving the optional argument ```fmt``` the value ```"."``` which causes the points themselves to be represented with a small circle instead.

As detailed in the documentation, the error bars can be customised extensively. For example, the ```capsize``` optional argument sets the length of the caps at the ends of the error bars.

In [None]:
import matplotlib.pyplot as plt

# Create the coordinates of the points
x = [1, 2, 3, 4]
y = [0, 2.1, 3.8, 6.1]

# Create the errors in the x direction
xerr = [0.1, 0, 0.2, 0.1]

# Plot the error bar plot
# Provide the x and y coordinates of the points
# Provide the unique errors in the x direction using xerr
# Provide the uniform errors in the y direction using yerr
# Use fmt="." to suppress the line and mark each point with a small dot
# Set the capsize to create the lines at the end of the error bars
plt.errorbar(x, y, xerr=xerr, yerr=0.2, fmt=".", capsize=3)

## Exercise

A biologist is conducting an experiment in which they measure the number of bacteria in a growth medium as a function of time. They obtain the following measurements of the bacteria population:

| Time (h) | Population |
|----------|------------|
| 0        | 1          |
| 3        | 2.2        |
| 6        | 3.9        |
| 9        | 8.5        |
| 12       | 16.4       |
| 15       | 30         |
| 18       | 65         |
| 21       | 135        |
| 24       | 240        |

They estimate that these observations have an error in time of 30 minutes and an error in population of $\pm$10%.

They also produce a line of best fit with the equation:

$p(t) = \textrm{e} ^{0.231t}$

where $p(t)$ is the population and $t$ is the time measured in hours.
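
As a quick sanity check on this equation (a worked calculation, not part of the biologist's data), the fit predicts that the population doubles roughly every 3 hours:

$t_{2} = \frac{\ln 2}{0.231} \approx 3.0 \ \textrm{h}$

After 24 hours, or about eight doublings, it predicts $p(24) = \textrm{e}^{0.231 \times 24} = \textrm{e}^{5.544} \approx 256$, which is close to the observed value of 240.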

Create a figure with an ```errorbar``` plot and a simple line ```plot``` to compare the line of best fit and the observations. 

Then, customise your plot using some of the commands we've seen so far. Decide for yourself which options display the data most clearly.

### Extension

The biologist then calculates the confidence interval around their line of best fit. The interval is the region whose population is within 8% either side of the line of best fit. Read the [documentation](https://matplotlib.org/stable/api/_as_gen/matplotlib.pyplot.fill_between.html) of the ```fill_between``` function and add a shaded region to represent the confidence interval.

Hint: the "alpha" of a colour represents how opaque it is.
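
As a minimal sketch of how a shaded band is built (using a made-up straight line and a fixed half-width of 0.5, not the exercise data), ```plt.fill_between``` takes the x values followed by the lower and upper y values of the region:

```python
import matplotlib.pyplot as plt

x = [0, 1, 2, 3, 4]

# An illustrative straight line and a band of +/- 0.5 around it
centre = [2 * value + 1 for value in x]
lower = [c - 0.5 for c in centre]
upper = [c + 0.5 for c in centre]

# Draw the line itself
plt.plot(x, centre, color="blue")

# Shade the region between the lower and upper bounds
# A low alpha keeps the band mostly transparent
band = plt.fill_between(x, lower, upper, color="blue", alpha=0.2)
```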


In [None]:
import matplotlib.pyplot as plt
import math

# Set up the lists to hold the time and the population
time = [0, 3, 6, 9, 12, 15, 18, 21, 24]
population = [1, 2.2, 3.9, 8.5, 16.4, 30, 65, 135, 240]

# Create the list to hold the errors in the population
yerr = []
for p in population:
  yerr.append(p * 0.1)

# You could make an argument either way for whether the y-axis should be logarithmic
# Comment it out and decide for yourself which looks better
plt.yscale("log")

# Create the lists to hold the y-coordinates of the fit and the upper and lower y-coordinates of the confidence interval
fit = []
shaded_lower = []
shaded_upper = []

# Calculate the y-coordinates of the fit and the upper and lower y-coordinates of the confidence interval
for t in time:
  f = math.exp(0.231 * t)
  fit.append(f)
  shaded_lower.append(f * 0.92)
  shaded_upper.append(f * 1.08)

# Set the ylabel, xlabel and title
plt.ylabel("Bacteria Population (Arbitrary Units)")
plt.xlabel("Time (h)")
plt.title("Bacteria Population as a Function of Time")

# Plot the error bars
# Provide an array for the yerr
# Provide a single value for the constant xerr
# Give it a label so it shows up in the legend
plt.errorbar(time, population, yerr=yerr, xerr=0.5, fmt=".", label="Observations")

# Plot the line of best fit
# Give it a label so it shows up in the legend
plt.plot(time, fit, label="Fit", color="red")

# Create the shaded area for the confidence interval
# Provide "time" as the x values
# Provide the upper and lower bounds of the y-dimension
# "alpha" represents how opaque the region is (on a scale of 0-1). Set it to a low value so it doesn't obscure the line/points
# Make it the same colour as the fit line to show they're related
# Give it a label so it shows up in the legend
plt.fill_between(time, shaded_lower, shaded_upper, alpha=0.3, color="red", label="Confidence Interval")

# Place the legend
plt.legend()