# Solar Cells
This notebook will help you complete the data analysis needed for the Solar Cell lab and is split into 4 parts (A-D):
- A: Unit conversions
- B: Create the P vs V plot
- C: Do a parabolic fit
- D: Find the maximum power

It is *essential* to run each of the cells ***sequentially*** and make sure to read all instructions and comments carefully. Happy coding! 

This notebook is also **interactive**, and as such there are lines where you must write the code yourself to complete the tasks. Sections where you must write your own code are numbered and labelled with the heading:

## Task #0:
Detailed instructions for each coding task are written below its heading. These tasks are **in addition** to adding appropriate filenames where indicated.

There are **six** tasks in this notebook.

Please make sure to read **all** of the instructions and comments **very** carefully; this is essential to making sure your code runs smoothly and error-free.

***Happy Coding!***

## Part A: Unit Conversions

## Task #1
### Save your data in an Excel sheet

In the folder where this notebook is stored, you'll find a spreadsheet named **solar_cell_data.xlsx**. The spreadsheet has two columns labelled:
- Voltage
- Resistance

**Please do not change the column names. This will cause the code to not run properly.**

You will need to:

1. Download the Excel file
2. Enter your data in the respective columns
3. Save the file
4. Reupload the file into the same folder. **Do not change the name of the file**
5. Click "overwrite" when the dialog asks.

We will now load the data into Python and plot it.

### Load and view the data in Python using `pandas`

The data is loaded using a Python module known as `pandas`, which stores the data in a *dataframe*. In the blocks below, we will go through how to load the data and how to view it.

In [None]:
# First we need to load the python modules we need
# This is just pandas for now
import pandas as pd

# Now we need to load the data using the pandas read_excel function
# This function reads the data from an excel file and loads it into a pandas dataframe
# This is called 'solar_cell_data' here
solar_cell_data = pd.read_excel('solar_cell_data.xlsx', engine='openpyxl')

# Print the first few rows of the dataframe to look at the data
print(solar_cell_data.head())

As we can see, the dataframe shows the same information as Excel. Since it also preserves the column headings, we can use those to look at subsets of the data and call each column separately if necessary. This is important since we will need to do calculations using the stored values.

The syntax to do this is as follows:
1. First, write the name of the variable that represents the *dataframe*
2. Add square brackets
3. Add quotation marks inside the square brackets
4. Inside the quotes, enter the *exact* name of the column

So, for example, if I wanted to view just the *Voltage* I would use this code:

In [None]:
# Viewing only the Voltage column
print(solar_cell_data['Voltage'])

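We can also view multiple columns at the same time by passing a *list* of column names, separated by commas. Note the double square brackets: the outer pair selects from the dataframe and the inner pair creates the list.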

In [None]:
# Let's view both the Voltage and the Resistance columns
print(solar_cell_data[['Voltage', 'Resistance']])
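
The difference between single and double square brackets can be seen in a small sketch. The dataframe `example_data` and its numbers are made up purely for illustration; they are not lab data.

```python
import pandas as pd

# Hypothetical measurements, only for illustration (not real lab data)
example_data = pd.DataFrame({
    "Voltage": [0.10, 0.20, 0.30],
    "Resistance": [10.0, 20.0, 30.0],
})

# Single brackets with one column name return a one-dimensional Series
one_column = example_data["Voltage"]

# Double brackets pass a list of names and return a smaller DataFrame
two_columns = example_data[["Voltage", "Resistance"]]

print(type(one_column).__name__)
print(type(two_columns).__name__)
print(two_columns.shape)
```

A single column comes back as a `Series`, which is what we want when doing arithmetic on the stored values.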

### Calculate the power

We'll start by saving the voltage and resistance in separate variables so it is easier to calculate the power. Run the code block below to do this.

In [None]:
# Import numpy, which we will need for numerical calculations later on
import numpy as np 

# We'll need the voltage so we can extract that
voltage = solar_cell_data["Voltage"]

# We also need the resistance
resistance = solar_cell_data["Resistance"]

# Let's print these to see what they look like
print(voltage.head())
print(resistance.head())

## Task #2

The power can be calculated by looking at the equation for electric power and using Ohm's law to substitute the current:

$$P = IV = V^2/R$$

- I added an *incomplete* variable for the power (labeled `power`). 
- On the *right* side of the equals sign, fill in the correct formula for the `power` using the variables for `voltage` and `resistance`. 

If you need to remember the syntax for operations in Python, they are in the **Tutorial** but as a reminder:
- Adding: `+`
- Subtracting: `-`
- Multiplication: `*`
- Division: `/`
- Exponent: `**`
- Parentheses: `()`
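
As a quick reminder of how these operators combine, here is a small sketch with plain numbers (unrelated to the lab data) showing why parentheses matter:

```python
# Exponents are evaluated before multiplication and division
no_parens = 2 * 3 ** 2       # 3 ** 2 is computed first
with_parens = (2 * 3) ** 2   # parentheses force the multiplication first

# Division and multiplication are evaluated left to right
left_to_right = 8 / 2 * 4
grouped = 8 / (2 * 4)

# A leading minus sign is applied after the exponent
neg_after = -3 ** 2
neg_before = (-3) ** 2

print(no_parens, with_parens)      # 18 36
print(left_to_right, grouped)      # 16.0 1.0
print(neg_after, neg_before)       # -9 9
```

When in doubt, add parentheses around the part you want calculated first.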

In [None]:
# Calculate the power using the voltage and resistance
# Add the equation for power after the equals sign
power = 

# Print the first few rows of the power to test
print(power.head())

## Part B: Create the P vs V plot
### Plotting using `matplotlib`

To create plots in Python, we will use a library called `matplotlib`. It is a very extensive library that can do a lot of powerful tasks, but we will only use some of its features for this course.

#### Creating a plot canvas
The first thing we want to do when plotting is to make sure our plot is well formatted. To do this, we will create a *canvas*. Think of this like hitting the **scatter plot** button in Excel to create an empty plot. We'll then label our axes.
  
When you run this cell, you should see an empty plot with nicely formatted axes.

In [None]:
# Import matplotlib for plotting
import matplotlib.pyplot as plt

# Create a canvas for the figure that's 8 inches wide and 6 inches tall
plt.figure(figsize=(8, 6))

# This labels the axis and also changes their fontsize
plt.xlabel("Voltage", fontsize=16)
plt.ylabel("Power", fontsize=16)

# This changes the fontsize of the ticks on the axis
plt.tick_params(labelsize=14)

# This shows the plot
plt.show()

#### Adding data to the plot
This is done by calling `plt.plot` and having the x and y data in parentheses. Therefore, the general format for adding data to a plot looks like this:  
`plt.plot(x, y)`

For example, if I wanted to plot Voltage vs Resistance, it would look like this (see the line starting with `rv_plot,`):

In [None]:
# Import matplotlib for plotting
import matplotlib.pyplot as plt

# Create a canvas for the figure that's 8 inches wide and 6 inches tall
plt.figure(figsize=(8, 6))

# Adding data to the plot
# I am doing voltage and resistance as an example
rv_plot, = plt.plot(resistance, voltage)

# This labels the axis and also changes their fontsize
plt.xlabel("Resistance", fontsize=16)
plt.ylabel("Voltage", fontsize=16)

# This changes the fontsize of the ticks on the axis
plt.tick_params(labelsize=14)

# This shows the plot
plt.show()

## Task #3

### Now it's your turn

This task is in two parts:
- Add the axis labels by replacing `X AXIS LABEL HERE` and `Y AXIS LABEL HERE` with your x and y-axis labels respectively. Make sure to add your labels *in between the quotes*, and don't forget to add the **proper units**!
- Replace the `x` and `y` in `pv_plot, = plt.plot(x, y, 'o', markersize=8, label='Data')` with the correct variables for your x and y data on the graph.

In [None]:
# We need matplotlib for plotting the data
import matplotlib.pyplot as plt

# Create the plot canvas
plt.figure(figsize=(8, 6))

# Set the size of the ticks
plt.tick_params(labelsize=14)

# Label the axes
# Replace 'X AXIS LABEL HERE' and 'Y AXIS LABEL HERE' with the correct labels
# Add your label in between the quotation marks
# Also remember to include the units!
plt.xlabel('X AXIS LABEL HERE', fontsize=16)
plt.ylabel('Y AXIS LABEL HERE', fontsize=16)

# Now to plot as a scatter plot
# Replace 'x' and 'y' with the correct variables
pv_plot, = plt.plot(x, y, 'o', markersize=8, label='Data')

# Showing the plot
plt.tight_layout()
plt.show()

## Part C: Do a parabolic fit

#### Doing a non-linear fit using lmfit
You'll notice that this is a *non-linear* fit, specifically a quadratic fit:

$$y = ax^2 + bx + c$$

We will use a library known as `lmfit` and use the `Model` class to do what's known as a *curve fit* for this.

So we need to define a *function* that defines the equation that we're fitting, in our case it's the quadratic equation above.

For the scope of this course, we will not go too in depth on functions. However, you should know that functions are a nice way to define numerical equations. Functions take in variables (inputs), perform calculations, and then *return* the output.

Please refer to the python tutorial to learn more about the parts of a function.
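
As a minimal sketch of the parts of a function, here is a made-up example for a straight line. The name `line` and its inputs `m` and `k` are hypothetical and are not used anywhere else in this notebook.

```python
# "def" starts the definition; the names in parentheses are the inputs
def line(x, m, k):
    # the calculation happens inside the indented function body
    y = m * x + k
    # "return" hands the result back to whoever called the function
    return y

# Calling the function with specific input values
result_demo = line(x=2.0, m=3.0, k=1.0)
print(result_demo)  # 7.0
```

Once a function is defined, it can be reused with any inputs, which is why it is a convenient way to hand an equation to a fitting routine.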

In [None]:
# Install lmfit (if it is not installed already), then import the Model class
!pip install lmfit
from lmfit import Model

# Now to define the polynomial
# The inputs are the x values, and the coefficients a, b, and c
# The output is the y values from the polynomial
def polynomial(x, a, b, c):
    # here we are calculating y = ax^2 + bx + c
    y = a*x**2 + b*x + c
    # return the y values
    return y

#### Creating a fit model

We will create a variable for the model of the fit, where we tell lmfit which function we want to fit and what our independent variable is. We then have to create the parameters of the fit, i.e. the values we are trying to obtain by fitting this function (`a`, `b`, and `c`). Finally, we have to add some guesses as to what we think the values *might* be. These guesses do not have to be close; they are just a starting point for the model.

## Task #4
Replace `INDEPENDENT VARIABLE HERE` with the correct independent variable. This is a little tricky: I'm not asking for the independent variable from your experimental data, but rather what it's defined as in the `polynomial` function above.

In other words, from the options below, which one is the independent variable for the polynomial fit equation?
- x
- a
- b
- c

When you have the correct option, replace the text in between the quotes.

So for example, if I think 'a' is the independent variable, my argument would read `independent_vars='a'`

In [None]:
# Now to create the model and store it as a variable
# Replace 'INDEPENDENT VARIABLE HERE' with the correct variable
# Keep the quotation marks!!!
polyfit = Model(polynomial, independent_vars="INDEPENDENT VARIABLE HERE")

# Some initial guesses for the coefficients: these do not have to be correct
coeff = polyfit.make_params(a=1, b=1, c=1)

#### Fit the data!
We can now call our fitting model `polyfit` and obtain the regression results. You do not need to know the details of how this is done but for reference, each argument means:
- `data` = the "y" variables, i.e. the power
- `params` = the guesses for the parameters (our `coeff` variable)
- `method` = what fitting method we're using; in this course we will only use `'leastsq'`
- `x` = the independent variable (**Note:** this will change depending on what we define as our independent variable)
- `scale_covar` = this makes sure the errors obtained are scaled correctly

The total fit is stored in a variable labeled `result`.
After fitting, we'll extract the fitted coefficients, their errors, and the $r^2$ value.
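
To make the last two quantities less of a black box, here is a rough sketch of the arithmetic behind them using made-up numbers. The arrays `measured` and `predicted` and the matrix `covar_demo` are hypothetical stand-ins, not output from lmfit.

```python
import numpy as np

# Hypothetical measured values and model predictions (illustration only)
measured = np.array([1.0, 2.0, 3.0, 4.0])
predicted = np.array([1.1, 1.9, 3.2, 3.8])

# Residuals are the differences between the fit and the data
residuals = predicted - measured

# r^2 = 1 - (variance of the residuals) / (variance of the data)
r_squared_demo = 1 - residuals.var() / np.var(measured)

# Hypothetical 2x2 covariance matrix for two fitted parameters
covar_demo = np.array([[0.04, 0.01],
                       [0.01, 0.09]])

# The diagonal holds the variances; their square roots are the standard errors
errors_demo = np.sqrt(np.diag(covar_demo))

print(f"r^2 = {r_squared_demo:.2f}")
print(errors_demo)
```

An $r^2$ close to 1 means the residuals are small compared with the spread of the data.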

In [None]:
# Now to fit the data
result = polyfit.fit(data=power, x=voltage, params=coeff, method='leastsq', nan_policy='propagate', scale_covar=True)

# get the fitted coefficients
a_fit = result.params['a'].value
b_fit = result.params['b'].value
c_fit = result.params['c'].value

# And their errors
a_err, b_err, c_err = np.sqrt(np.diag(result.covar))

# Calculate the rsquared value
r_squared = 1 - result.residual.var() / np.var(power)

# Show that the fit is successful and print the rsquared value
print(f'Fit successful!\nr^2 = {r_squared}')

#### Find the fitted x and y-values for plotting

Using the fitted coefficients, we can visualize the "goodness of fit" by plotting both the fitted data and the experimental data on the same plot.  

To do this, we need to calculate the "fitted y-values" by using the calculated fitted coefficients. Additionally, we want to extrapolate the fit so that it spans a longer range than the measurements we took: this will help us better visualize how the two variables relate to each other and how good our fit is.

In [None]:
# First we want to create a smooth curve to represent the fit
# So we need to create a range of x values to plot the fit
# This will range from minimum voltage - 10% of the min voltage to the maximum voltage + 20% of the max voltage
voltage_fit = np.linspace(voltage.min()*0.90, voltage.max()*1.20, 1000)

# Now to calculate the associated power values using the fitted coefficients
power_fit = polynomial(x=voltage_fit, a=a_fit, b=b_fit, c=c_fit)

### Plotting experimental & fitted data

This is very similar to your previous Power vs Voltage plot, but this time you are also adding your fitted line!

## Task #5
There are three parts to this task:
- Same as in Task #3, add the x and y-axis labels by replacing the `X/Y AXIS LABEL HERE` inside the quotes and add your units
- For the *Experimental Data*: replace the `x` and `y` with the appropriate variable names for plotting
- For the *Fitted Data*: replace the `y` with the appropriate variable name for plotting; I've already added the x variable as `voltage_fit`

In [None]:
# create the plot canvas
plt.figure(figsize=(8, 6))

# set the size of the ticks
plt.tick_params(labelsize=14)

# Label the axes
# Replace 'X AXIS LABEL HERE' and 'Y AXIS LABEL HERE' with the correct labels
# Add your label in between the quotation marks
# Also remember to include the units!
plt.xlabel("X AXIS LABEL HERE", fontsize=16)
plt.ylabel("Y AXIS LABEL HERE", fontsize=16)

# Plotting our specific data

# Plotting the experimental data as a scatter plot
# The "o" argument tells matplotlib to plot the data points as circles
# The markersize argument changes the size of the circles
# The label argument is used to create a legend
# The color argument changes the color of the data points

# Replace the "x" and "y" with the correct variables: remember, variable names are not in quotes
exp_pv_plot, = plt.plot(x, y, "o", markersize=8, label="Experimental Data", color="red")

# Now we do the same for the fitted data but as a line
# The "-" argument tells matplotlib to plot the data points as a line
# everything else is the same, we do not need a markersize for a line

# Replace the "y" with the correct variables: remember, variable names are NOT in quotes
# I added the x variable as voltage_fit already
# In the label, we are also adding the r_squared value
fit_pv_plot, = plt.plot(voltage_fit, y, "-", label=f"Fitted Data\n$R^2$={r_squared:0.3g}", color="black")

# Now we add the legend
plt.legend(fontsize=16)

# This makes sure that the plot is formatted correctly
plt.tight_layout()

plt.show()

## Part D: Finding the Maximum Output

Now we have to find the maximum output using some algebra. First we need to find the vertex, or turning point, $h$:

$$h = -\frac{b}{2a}$$

This formula was found by solving for $x$ when $\frac{\mathrm{d}y}{\mathrm{d}x} = 0$.
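
For reference, the intermediate steps can be written out, starting from the quadratic $y = ax^2 + bx + c$ used in Part C:

$$\frac{\mathrm{d}y}{\mathrm{d}x} = 2ax + b = 0 \quad \Rightarrow \quad x = -\frac{b}{2a} = h$$

The turning point is a *maximum* only when the parabola opens downward, which is the case when the second derivative is negative:

$$\frac{\mathrm{d}^2y}{\mathrm{d}x^2} = 2a < 0 \quad \Rightarrow \quad a < 0$$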

Then we need to find $f(h)$, so the y-value at x=h. This can be done easily using the `polynomial` function that we defined above!

## Task #6
1. Write down the code to properly calculate `h`, the vertex, using the formula defined above. Fill in the code after `h =`
    Remember, the `a` and `b` you need should be the *fitted parameters*, a.k.a. `a_fit` and `b_fit`. Make sure to also add the correct parentheses and negative sign(s)!
2. Next, replace each `ADD YOUR UNITS HERE` with the appropriate units of power and voltage (respectively) from your measurement

In [None]:
# Finding the vertex of the parabola using the formula above
# Fill in the formula after the equals sign to find the vertex
# Make sure to use the correct variable names: a_fit, b_fit, c_fit
# And remember your order of operations!
h =

# now to find the maximum power which is the y value at the vertex
# so instead of the whole voltage array we just use "h"
max_power = polynomial(x=h, a=a_fit, b=b_fit, c=c_fit)

# now to print the results with some fancy syntax
# Replace the ADD YOUR UNITS HERE with the appropriate units from your measurement
print(f'The maximum power is {max_power:.2f} ADD YOUR UNITS HERE at {h:.2f} ADD YOUR UNITS HERE of voltage')

Check that this makes sense according to your graph!
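
Besides reading the value off the graph, one possible numerical cross-check is to look for the largest value on a finely sampled curve. The sketch below uses made-up coefficients (`a_demo`, `b_demo`, `c_demo`) rather than your fitted ones. In this notebook the analogous arrays would be `voltage_fit` and `power_fit`, assuming the plotted range actually contains the peak.

```python
import numpy as np

# Hypothetical coefficients for a downward-opening parabola (illustration only)
a_demo, b_demo, c_demo = -2.0, 4.0, 1.0

# Evaluate the parabola on a fine grid of x values
x_grid = np.linspace(0.0, 2.0, 2001)
y_grid = a_demo * x_grid**2 + b_demo * x_grid + c_demo

# Index of the largest y value on the grid
i_max = np.argmax(y_grid)

print(f"Maximum of about {y_grid[i_max]:.2f} at x = {x_grid[i_max]:.2f}")
```

The grid-based estimate is only as precise as the grid spacing, so expect it to agree with the algebraic result to a few decimal places rather than exactly.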