# Welcome to the Beginner Python Workshop 

**Topic: Scripting and plotting with a dataset**

This notebook gives you a basic introduction to the Python world. Some of the topics mentioned below are also covered in the [tutorials and tutorial videos](https://github.com/GuckLab/Python-Workshops/tree/main/tutorials).

Eoghan O'Connell, Guck Division, MPL, 2021

In [6]:
# notebook metadata you can ignore!
info = {"workshop": "05",
        "topic": ["solution", "exercise",
                  "scripting", "plotting", "pandas",
                  "matplotlib", "csv", "iris", "data",
                  "curve fitting"],
        "version" : "0.0.3"}

### How to use this notebook

- Click on a cell (each box is called a cell) and hit "shift+enter" to run it.
- You can run the cells in any order, but later cells rely on the imports and variables defined in earlier ones.
- The output of runnable code is printed below the cell.
- Check out this [Jupyter Notebook Tutorial video](https://www.youtube.com/watch?v=HW29067qVWk).

See the help tab above for more information!


# What is in this Workshop?
In this notebook we cover:
- How to do the exercise from Workshop 05

In [7]:
# import necessary modules
%matplotlib nbagg
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
from scipy.optimize import curve_fit

### Exercises

- Scripting
  - You have an Excel spreadsheet and want to fit your data with a polynomial curve
     - Filter the data, do some maths on the columns
  - You need to create a plot displaying this curve fit, along with the following
     - Subfigures describing the data, axis labels, error bars
        - Histogram
        - Grouped plots (box, violin)
       
     - Saved figure needs to be publication-ready resolution
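The exercise asks for a polynomial curve fit, while the cells in this notebook fit a straight line. As a minimal sketch of the polynomial case, assuming NumPy's `polyfit` is an acceptable stand-in for `curve_fit` and using made-up data rather than the iris measurements:

```python
import numpy as np

# hypothetical, noise-free data that follows y = 2x^2 - 3x + 1
x = np.linspace(-2, 2, 21)
y = 2 * x**2 - 3 * x + 1

# fit a 2nd-degree polynomial; coefficients are returned highest power first
coeffs = np.polyfit(x, y, deg=2)

# evaluate the fitted polynomial at the original x positions
y_fit = np.polyval(coeffs, x)
print(coeffs)
```

With real, noisy data the recovered coefficients will only approximate the underlying curve, and the degree is a choice you have to justify.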


In [8]:
df = pd.read_csv(r"../data/iris.csv")

In [74]:
print(df["class"].unique()) # prints the unique class names in the "class" column

['Iris-setosa' 'Iris-versicolor' 'Iris-virginica']


In [10]:
df.tail() # shows the last five rows of the table

Unnamed: 0,sepallength,sepalwidth,petallength,petalwidth,class
145,6.7,3.0,5.2,2.3,Iris-virginica
146,6.3,2.5,5.0,1.9,Iris-virginica
147,6.5,3.0,5.2,2.0,Iris-virginica
148,6.2,3.4,5.4,2.3,Iris-virginica
149,5.9,3.0,5.1,1.8,Iris-virginica


In [11]:
df.plot.scatter(x="petallength", y="petalwidth", alpha=0.6)

<AxesSubplot:xlabel='petallength', ylabel='petalwidth'>

In [12]:
df.plot.bar(x="petallength", y="petalwidth", alpha=0.6)

<AxesSubplot:xlabel='petallength'>

In [13]:
df.plot.box(x="petallength", y="petalwidth")

<AxesSubplot:>
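For the grouped box plot mentioned in the exercise list, one option is `DataFrame.boxplot` with its `by` argument. The sketch below is only an illustration: the table `demo` is made up, and the `Agg` backend is an assumption so that the code also runs outside a notebook (inside this notebook the `%matplotlib nbagg` line takes its place).

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, no window needed
import matplotlib.pyplot as plt
import pandas as pd

# hypothetical miniature stand-in for the iris table
demo = pd.DataFrame({
    "petalwidth": [0.2, 0.3, 0.2, 1.3, 1.5, 1.4, 2.1, 2.3, 1.9],
    "class": ["Iris-setosa"] * 3 + ["Iris-versicolor"] * 3 + ["Iris-virginica"] * 3,
})

fig, ax = plt.subplots()
# draw one box per unique value in the "class" column
demo.boxplot(column="petalwidth", by="class", ax=ax)
ax.set_ylabel("petalwidth")

n_groups = demo["class"].nunique()
```

Grouping by the `class` column avoids having to know which rows belong to which species.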

In [14]:
df.plot.scatter(x="petallength", y="petalwidth", alpha=0.6)

<AxesSubplot:xlabel='petallength', ylabel='petalwidth'>

In [15]:
def linear_fit(x, m, c):
    y = (m * x) + c
    return y

In [16]:
x_data = df['petallength']
y_fit = linear_fit(x=x_data, m=0.5, c=0)
print(y_fit)

0      0.70
1      0.70
2      0.65
3      0.75
4      0.70
       ... 
145    2.60
146    2.50
147    2.60
148    2.70
149    2.55
Name: petallength, Length: 150, dtype: float64


In [17]:
x_data = df['petallength']
y_data = df['petalwidth']
popt, pcov = curve_fit(linear_fit, x_data, y_data)

In [18]:
popt

array([ 0.41641913, -0.36651405])

In [19]:
pcov

array([[ 9.24009095e-05, -3.47304220e-04],
       [-3.47304220e-04,  1.59114368e-03]])
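`popt` holds the fitted values of `m` and `c`, and `pcov` is their estimated covariance matrix. A common way to turn `pcov` into one-standard-deviation uncertainties is to take the square root of its diagonal. A minimal sketch, with the numbers copied from the two outputs above:

```python
import numpy as np

# values copied from the popt and pcov outputs above
popt = np.array([0.41641913, -0.36651405])
pcov = np.array([[9.24009095e-05, -3.47304220e-04],
                 [-3.47304220e-04, 1.59114368e-03]])

# one-standard-deviation uncertainty of each fitted parameter
perr = np.sqrt(np.diag(pcov))

for name, value, err in zip(["m", "c"], popt, perr):
    print(f"{name} = {value:.3f} +/- {err:.3f}")
```

These uncertainties could also feed the error bars that the exercise asks for.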

In [20]:
plt.figure()  # creates a new plot figure
plt.scatter(x_data, y_data, alpha=0.2, color='black')
plt.plot(x_data, linear_fit(x_data, *popt), "r-.")
# save before plt.show(); outside a notebook the figure can be empty after show()
plt.savefig(r"../data/Liner_fit_plot_3.png", dpi=300)
plt.show()
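The `dpi=300` argument is what gives the saved image a publication-ready resolution. The sketch below shows the same save step in a form that also runs outside a notebook. The `Agg` backend, the placeholder data, and the temporary output path are all assumptions made for illustration.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, no window needed
import matplotlib.pyplot as plt

fig, ax = plt.subplots(figsize=(4, 3))
ax.plot([0, 1, 2], [0, 1, 4], "r-.")  # placeholder data

# hypothetical output path inside a temporary directory
out_path = os.path.join(tempfile.mkdtemp(), "demo_fit.png")

# save through the figure object, then release it
fig.savefig(out_path, dpi=300)
plt.close(fig)
```

Calling `savefig` on the figure object rather than on `plt` makes it explicit which figure is written to disk.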

In [48]:
plt.figure()  # creates a new plot figure
plt.scatter(x_data, y_data, alpha=0.2, color='black')
plt.plot(x_data, linear_fit(x_data, *popt), "r-.")

# label the axes with the names of the plotted columns
plt.xlabel(x_data.name + " in cm")
plt.ylabel(y_data.name)
plt.title("our fancy graph", fontsize=25)

plt.xticks(range(0, 8), ["0 cm", "1 cm", "2 cm", "3 cm", "4 cm", "5 cm", "6 cm", "7 cm"], rotation=-45)
plt.tight_layout()
plt.show()
#plt.savefig(r"../data/Liner_fit_plot_3.png", dpi=300)

In [49]:
# fig, axs = plt.subplots(2, 2, figsize=(5, 5))
# ax1, ax2, ax3, ax4 = axs.flatten()

fig = plt.figure()
ax1 = fig.add_subplot(211)
ax2 = fig.add_subplot(234)
ax3 = fig.add_subplot(235)
ax4 = fig.add_subplot(236)

linestyle = ["-", "--", "-.", "-"]

df.plot(ax=ax1, style=linestyle)
df[0:50].plot(ax=ax2, style=linestyle, legend=None)
df[50:100].plot(ax=ax3, style=linestyle, legend=None)
df[100:150].plot(ax=ax4, style=linestyle, legend=None)

ax3.set_xlabel("Versicolor Measurement")  # ax3 shows rows 50-100
ax1.set_ylabel("Length (cm)")
ax2.set_ylabel("Length (cm)")

ax1.set_ylim((-0.5, 10.5))
ax1.legend(ncol=2, loc="upper left")

plt.tight_layout()
plt.show()

In [88]:
# get the mean of the chosen measurement for each class
# (select the column first so the text column "class" is not averaged)
measurement = "sepalwidth"
mean_setosa = df[0:50][measurement].mean()
mean_versicolor = df[50:100][measurement].mean()
mean_virginica = df[100:150][measurement].mean()
print(mean_setosa)


3.4180000000000006
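The slices above rely on the rows being sorted by class in blocks of 50. A sketch of an alternative that does not depend on the row order is `groupby`. The table `demo` here is made up and only mimics the column names of `iris.csv`.

```python
import pandas as pd

# hypothetical miniature table with the same column names as iris.csv
demo = pd.DataFrame({
    "sepalwidth": [3.5, 3.0, 3.2, 2.8, 3.3, 2.7],
    "class": ["Iris-setosa", "Iris-setosa",
              "Iris-versicolor", "Iris-versicolor",
              "Iris-virginica", "Iris-virginica"],
})

# mean of one measurement per class, independent of the row order
class_means = demo.groupby("class")["sepalwidth"].mean()
print(class_means)
```

The result is a Series indexed by class name, so a single value can be read with `class_means["Iris-setosa"]`.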


In [104]:
fig = plt.figure(figsize = (5,5))
ax1 = fig.add_subplot(211)
ax2 = fig.add_subplot(234)
ax3 = fig.add_subplot(235)
ax4 = fig.add_subplot(236)


ax1.scatter(x_data, y_data, alpha=0.2, color='black')
ax1.plot(x_data, linear_fit(x_data, *popt), "r-.")
ax1.set_title("subtitle 1")

ax2.bar(df["class"].unique(), [mean_setosa, mean_versicolor, mean_virginica])
# rotate the existing tick labels; set_xticklabels on its own triggers a warning
ax2.tick_params(axis="x", labelrotation=90)

plt.suptitle("Whatever name")
plt.tight_layout()
plt.show()
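The exercise also asks for error bars, which the bar chart above does not show. One possible approach, sketched here on made-up data with the `Agg` backend assumed, is to pass the per-class standard deviation to the `yerr` argument of `bar`:

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, no window needed
import matplotlib.pyplot as plt
import pandas as pd

# hypothetical miniature table with the same column names as iris.csv
demo = pd.DataFrame({
    "sepalwidth": [3.5, 3.0, 3.2, 2.8, 3.3, 2.7],
    "class": ["Iris-setosa", "Iris-setosa",
              "Iris-versicolor", "Iris-versicolor",
              "Iris-virginica", "Iris-virginica"],
})

# mean and standard deviation of the measurement for each class
stats = demo.groupby("class")["sepalwidth"].agg(["mean", "std"])

fig, ax = plt.subplots()
# bar height is the mean, error bar length is the standard deviation
ax.bar(list(stats.index), stats["mean"].to_numpy(),
       yerr=stats["std"].to_numpy(), capsize=4)
ax.tick_params(axis="x", labelrotation=90)
ax.set_ylabel("sepalwidth")
fig.tight_layout()
```

Whether the standard deviation or the standard error is the right quantity depends on what the figure is meant to communicate.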



