<p style='text-align: center'><a href=https://www.biozentrum.uni-wuerzburg.de/cctb/research/supramolecular-and-cellular-simulations/>Supramolecular and Cellular Simulations</a> (Prof. Fischer)<br>Center for Computational and Theoretical Biology - CCTB<br>Faculty of Biology, University of Würzburg</p>

<p style='text-align: center'><br><br>We are looking forward to your comments and suggestions. Please send them to <a href=mailto:sabine.fischer@uni.wuerzburg.de>sabine.fischer@uni.wuerzburg.de</a><br><br></p>

<h1><p style='text-align: center'> Introduction to Python </p></h1>

## Plots

We will now take an extensive look at the `matplotlib` package for visualization in Python. `matplotlib` is a multi-platform data visualization library built on NumPy arrays, and designed to work with the broader SciPy stack. One of Matplotlib’s most important features is its ability to play well with many operating systems and graphics backends. Matplotlib supports dozens of backends and output types, which means you can count on it to work regardless of which operating system you are using or which output format you wish. This cross-platform, everything-to-everyone approach has been one of the great strengths of `matplotlib`. It has led to a large user base, which in turn has led to an active developer base and `matplotlib`'s powerful tools and ubiquity within the scientific Python world.

To make plots with `matplotlib.pyplot` we first have to import the library. We do this with the command `import matplotlib.pyplot as plt`, so that we don't have to write `matplotlib.pyplot` every time. <br>
We also `import numpy as np`, as we will need it later on.

In [None]:
import matplotlib.pyplot as plt
import numpy as np

### 1. Line Plot

To make a simple line plot, the pyplot method `plot()` is used. It needs at least one argument: the y-axis values, given as a list or array (or a variable that holds one). If two positional arguments are given, the first is used for the x-axis and the second for the y-axis, and both must have the same length. If only the y values are given, they are plotted against x values that start at `0` and increase by `1` for every y value. To display the plot we use the `show()` method. In our notebooks the plot usually appears without `show()`, but in a regular Python script you need it to get the figure displayed. Sometimes you want to save a figure to a file rather than only display it. For this, the method `savefig('name.png')` is used.

In [None]:
plt.plot([5,7,10,14])
plt.savefig('test.png')
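
To check the default x values described above, the following minimal sketch inspects the line object that `plot()` returns. The variable names (`lines`, `line`, `x_values`, `y_values`) are chosen for illustration only.

```python
import matplotlib.pyplot as plt

# plot() returns a list with one line object per plotted line
lines = plt.plot([5, 7, 10, 14])
line = lines[0]

# Without an x argument, the x values are the indices 0, 1, 2, ...
x_values = list(line.get_xdata())
y_values = list(line.get_ydata())

print(x_values)
print(y_values)
# In a script, call plt.show() here to display the figure
```

The printed x values are simply the positions of the y values in the list, which is why a separate x argument is only needed when the data points are not evenly spaced or do not start at `0`.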

Now a basic first plot is done. But a plot without a title and labeled axes is of little use. To give the plot a title and axis labels we use the methods `title()`, `xlabel()` and `ylabel()`.

In [None]:
plt.plot([1,2,3,4],[5,7,10,14])
plt.title('first plot')
plt.xlabel('X label')
plt.ylabel('Y label')

It is also possible to change the size of our figure using the method `figure()` and passing a tuple `(width, height)`, given in inches, to the argument `figsize`. Another way to customize the plot is to use the methods `xlim()` and `ylim()` to set limits for the x- or y-axis.

In [None]:
plt.figure(figsize=(15,5))
plt.plot([1,2,3,4],[5,7,10,14])
plt.title('first plot')
plt.xlabel('X label')
plt.ylabel('Y label')
plt.xlim(0,5)
plt.ylim(3,16)
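
As a quick check of what `figsize`, `xlim()` and `ylim()` actually set, the sketch below reads the values back from the figure. It assumes nothing beyond the calls used above; the variable names are illustrative.

```python
import matplotlib.pyplot as plt

fig = plt.figure(figsize=(15, 5))  # 15 inches wide, 5 inches high
plt.plot([1, 2, 3, 4], [5, 7, 10, 14])
plt.xlim(0, 5)
plt.ylim(3, 16)

# The figure size can be read back in inches
width, height = fig.get_size_inches()

# Called without arguments, xlim() and ylim() return the current limits
x_limits = plt.xlim()
y_limits = plt.ylim()

print(width, height)
print(x_limits, y_limits)
```

Reading the limits back like this can be handy when you want to extend the automatically chosen range by a small margin instead of hard-coding it.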

To further format our plot we can pass more keyword arguments to the `plot()` function. With them it is possible to adjust the style of the line <br>(`linestyle=`), the color (`color=`), the marker for the data points (`marker=`) and many more: <br>( https://matplotlib.org/api/_as_gen/matplotlib.pyplot.plot.html ). <br>
It is also possible to give a line a label (`label=`), which is shown in the legend when you call the method `legend()`. This is especially helpful when you have more than one graph in a plot, which is often the case.

In [None]:
plt.plot([1,2,3,4],[5,7,10,14], color='forestgreen', linestyle='--', marker='*', label='line 1')
plt.plot([1,2,3,4],[12,7,5,4], color='tomato', linestyle='-', marker='s', label='line 2')
plt.title('first plot')
plt.xlabel('X label')
plt.ylabel('Y label')
plt.legend()

Sometimes it is also useful to add a horizontal or vertical line to highlight a value or to mark a boundary. We can do this with the methods `axhline()` and `axvline()`. These lines can be customized in the same way as the `plot()` lines. Another method that may be used is `grid()`, which takes a boolean (`True` or `False`) as argument.

In [None]:
plt.plot([1,2,3,4],[5,7,10,14], color='forestgreen', linestyle='--', marker='*', label='line 1')
plt.plot([1,2,3,4],[12,7,5,4], color='tomato', linestyle='-', marker='s', label='line 2')
plt.title('first plot')
plt.xlabel('X label')
plt.ylabel('Y label')
plt.axhline(7, linestyle='--')
plt.axvline(2, c='black')
plt.grid(True)
plt.legend()
plt.show()

### 2. Bar Plot
Bar plots are one of the most common types of graphs and show values associated with categorical variables. In pyplot, the methods for this purpose are `bar()` for vertical bars and `barh()` for horizontal bars. These methods take two arguments: the categories and the values. To customize the bars we can again use keyword arguments as in `plot()`, for example color or label, and many other arguments such as error bars (`xerr=`/`yerr=`) with caps (`capsize=`) or the width of the bars (`width=`; for `barh()` the bar thickness is set with `height=`) ( https://matplotlib.org/api/_as_gen/matplotlib.pyplot.bar.html ).

In [None]:
category=['A', 'B', 'C', 'D']
values=[44,55,32,41]
error=[5,8,7,9]
plt.bar(category, values, color=['red','blue','green','orange'], width=0.8, yerr=error, capsize=3, edgecolor='black', lw=2)
plt.title('first barplot')
plt.xlabel('category')
plt.ylabel('value')
plt.show()

In [None]:
plt.barh(category, values, color=['red','blue','green','orange'], xerr=error, capsize=3, edgecolor='black', lw=2)
plt.title('first barplot')
plt.ylabel('category')
plt.xlabel('value')
plt.show()

### 3. Histogram
Histograms are a very common type of plot that shows how frequently values occur: the data is grouped into bins and the number of values in each bin is plotted. Especially in statistics, histograms are important for getting to know the distribution of the data, for example a normal or a uniform distribution. In `pyplot` histograms are made with the method `hist()`, which takes an array of values and the number of bins as arguments. Further optional arguments for customization include color, edgecolor, label and many more (https://matplotlib.org/api/_as_gen/matplotlib.pyplot.hist.html).  <br>
To generate some data for a histogram we use the command `np.random.randn(1000)`, which draws 1000 random numbers from a standard normal distribution.

In [None]:
x=np.random.randn(1000)
plt.hist(x, bins=10, color="royalblue", edgecolor='black', lw=2)
plt.title('First histogram')
plt.xlabel('Random data')
plt.ylabel('Frequency')
plt.show()
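
To see what "number of bins" means in practice, the following sketch looks at the values that `hist()` returns. The names `counts`, `bin_edges` and `patches` are illustrative; the data is random, so the individual counts differ from run to run.

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.random.randn(1000)

# hist() returns the count per bin, the bin edges and the drawn bars
counts, bin_edges, patches = plt.hist(x, bins=10, color="royalblue", edgecolor="black")

print(len(counts))     # one count per bin
print(len(bin_edges))  # one more edge than there are bins
print(counts.sum())    # every value falls into exactly one bin
```

Because each value lands in exactly one bin, the counts always add up to the number of data points, regardless of how many bins are chosen.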

### 4. Scatter Plot
Scatter plots are widely used and very helpful for visualizing regression problems. Here we can use the method `scatter()`. Its required and optional arguments are very similar to those of `plot()` (https://matplotlib.org/api/_as_gen/matplotlib.pyplot.scatter.html).

In [None]:
x=[3,5,6,1,8,11]
y=[7,2,5,4,9,3]
plt.scatter(x,y, marker ='^', color='chartreuse')
plt.title('First scatterplot')
plt.xlabel('X-value')
plt.ylabel('Y-value')
plt.show()

### 5. Pie Chart
One more type of chart is the pie chart, which can be made using the method `pie()`. We can also pass arguments to customize our pie chart: draw a shadow (`shadow=`), pull individual wedges out of the pie (`explode=`) or rotate the position of the first wedge (`startangle=`) (https://matplotlib.org/api/_as_gen/matplotlib.pyplot.pie.html). Often it is necessary to adjust the axes with `axis('equal')` to get an actually round pie chart.

In [None]:
Firms=['Firm A', 'Firm B', 'Firm C', 'Firm D', 'Firm E']
share=[20, 35, 15, 10 ,20]
Explode=[0,0.1,0,0,0]
plt.pie(share, explode=Explode, labels=Firms, shadow=True, startangle=45)
plt.axis('equal')
plt.legend()
plt.show()
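
One further option that is not used above is `autopct=`, which prints each wedge's percentage of the total onto the chart. The sketch below is only an illustration; the lower-case variable names and the format string `"%1.1f%%"` (one decimal place followed by a percent sign) are choices made for this example.

```python
import matplotlib.pyplot as plt

firms = ["Firm A", "Firm B", "Firm C", "Firm D", "Firm E"]
share = [20, 35, 15, 10, 20]

# With autopct set, pie() also returns the text objects of the percentage labels
wedges, texts, autotexts = plt.pie(share, labels=firms, autopct="%1.1f%%", startangle=45)
plt.axis("equal")

# Collect the percentage strings that were written onto the wedges
percent_labels = [t.get_text() for t in autotexts]
print(percent_labels)
```

The percentages are computed from the values relative to their sum, so the input does not have to add up to 100.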

### 6. Box-Whisker Plot
Like a histogram, a box-whisker plot is a way of graphically showing a distribution of values. In pyplot the method `boxplot()` creates a box-whisker plot. The method needs the input data as an array (to get a single box) or as a sequence of arrays (to get several boxes). Optional arguments can again customize the plot; for example, we can make a horizontal box plot with `vert=False` (https://matplotlib.org/api/_as_gen/matplotlib.pyplot.boxplot.html).

In [None]:
A=np.random.randn(10)
B=np.random.randn(10)
C=np.random.randn(10)
print(A)
plt.boxplot([A,B,C], labels=['A','B','C'], vert=False)
plt.title('first boxplot')
plt.ylabel('Category')
plt.xlabel('Value')
plt.show()
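
To connect the drawing to the numbers behind it: with the default settings, the box spans the first to the third quartile and the line inside the box marks the median. The sketch below computes these values with NumPy and compares the median with the line that `boxplot()` draws. The variable names are illustrative and the data is random.

```python
import numpy as np
import matplotlib.pyplot as plt

data = np.random.randn(100)

# Quartiles and median computed directly from the data
q1, median, q3 = np.percentile(data, [25, 50, 75])

# boxplot() returns a dictionary of the drawn elements
result = plt.boxplot(data)

# For a vertical box plot, the y position of the median line is the median
drawn_median = result["medians"][0].get_ydata()[0]

print(q1, median, q3)
print(drawn_median)
```

Comparing the computed and the drawn values is a simple way to convince yourself of what the plot shows before interpreting real data.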

### 7. How to make graphics with several plots


Often you have several diagrams that you want to plot into one figure. For example, we could fit two of the plots we already made into one figure. For this, the method `subplot()` can be used. `subplot()` takes three arguments: `nrows`, `ncols` and `index`. They indicate the number of rows, the number of columns and the index of the subplot, which starts at `1`. To give every plot a title we call `title()` once per subplot, and `suptitle()` gives the whole figure a title.

In [None]:
plt.subplot(1,2,1)
plt.plot([1,2,3,4],[5,7,10,14], color='forestgreen', linestyle='--', marker='*', label='line 1')
plt.plot([1,2,3,4],[12,7,5,4], color='tomato', linestyle='-', marker='s', label='line 2')
plt.title('first plot')
plt.xlabel('X label')
plt.ylabel('Y label')
plt.legend()

plt.subplot(1,2,2)
plt.boxplot([A,B,C], labels=['A','B','C'], vert=False)
plt.title('first boxplot')
plt.ylabel('Category')
plt.xlabel('Value')


plt.suptitle('first sub-plot')
plt.show()
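
The function `subplots()` (plural) creates the figure and all of its axes in one call and returns them as objects. The sketch below is a minimal illustration of this style with made-up data; note that the axes methods are called `set_title()`, `set_xlabel()` and so on instead of `title()` and `xlabel()`.

```python
import matplotlib.pyplot as plt

# One row, two columns: axes is an array with two entries
fig, axes = plt.subplots(nrows=1, ncols=2, figsize=(10, 4))

# Left plot: a line plot
axes[0].plot([1, 2, 3, 4], [5, 7, 10, 14], color="forestgreen", marker="*", label="line 1")
axes[0].set_title("left plot")
axes[0].set_xlabel("X label")
axes[0].set_ylabel("Y label")
axes[0].legend()

# Right plot: a bar plot with illustrative values
axes[1].bar(["A", "B", "C"], [3, 5, 2], color="tomato")
axes[1].set_title("right plot")

fig.suptitle("two plots in one figure")
fig.tight_layout()  # avoid overlapping labels
```

Working with the axes objects makes it explicit which subplot each command applies to, which helps once a figure contains more than two or three plots.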


## Exercises

### Always make pretty plots (title, axis labeling, nice colors, legend, axis limits, grid (where appropriate), ...) !!!

Execute the first cell to get some data you can plot.

In [None]:
import json
import pickle
import numpy as np

with open('Temperatur_Station3.json', 'r') as fp:
    averagetemp_Station3 = json.load(fp)
with open('Temperatur_Station44.json', 'r') as fp:
    averagetemp_Station44 = json.load(fp)
with open('Temperatur_Station73.json', 'r') as fp:
    averagetemp_Station73 = json.load(fp)
with open('Niederschlag_Station3.json', 'r') as fp:
    rainfall_Station3 = json.load(fp)
with open('Niederschlag_Station44.json', 'r') as fp:
    rainfall_Station44 = json.load(fp)
with open('Niederschlag_Station73.json', 'r') as fp:
    rainfall_Station73 = json.load(fp)
Histolist=100+15*np.random.randn(10000)
loglist=[np.exp(-i*0.1) for i in range(1,81)]
with open('PlotData.pkl', 'rb') as fp:
    list3D = pickle.load(fp)
Months=['Jan','Feb','Mar','Apr','May','Jun','Jul','Aug','Sep','Oct','Nov','Dec']

### <p style='color: green'>easy</p>
1. Import `matplotlib.pyplot` as `plt`

2. Plot average temperatures per month at the three different stations against the months (Months) in three different plots. 
     - Once as scatter plot with legend, different colors, markers and grid.
     - Once as line plot with legend, different colors, markers and grid.

3. Make a bar graph of the rainfall per month at station 3.

4. Make a boxplot of the temperatures at each station.

5. Make a single boxplot that includes the temperatures of all three stations. Add a dashed, horizontal line at 9.5°C.

6. Make a histogram of `Histolist`, which contains generated data of an IQ distribution, with `100` bins. Add a black vertical line at `100`.

7. Use the internet to find a keyword argument for `hist()` that plots the probability density instead of absolute occurrences, i.e. the area under all bars combined should be exactly `1`. Use this to make the same histogram with `20` bins.

### <p style='color: red'>hard</p>

8. Make a plot of `loglist`, which could resemble the decline of something over time (just come up with something for your labels), without and with a logarithmic y-axis (use the internet).

9. Make a diagram in which `averagetemp_Station73` is plotted as a line graph and `rainfall_Station73` is plotted as a bar graph. Try to get the axis for rainfall on the left and for temperature on the right side of the diagram.

10. Make a plot with the rainfall of all three stations in one bar graph where the three different bars are shown for every month.

11. `list3D` contains an array of points in 3-dimensions. Make a 3D-Scatterplot.