<a href="https://colab.research.google.com/github/wallynovak/FPLC_analysis_Gemini_guided/blob/main/FPLC_analysis_Gemini_guided.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Plotting FPLC trace data and identifying peaks

The purpose of running the S200 column on the FPLC is to determine quaternary structure differences between the wild-type protein and any mutants. We can identify differences by looking for rightward shifts in the major peaks: a peak that elutes at a larger volume indicates that the protein runs smaller.

The exercises below guide you through using Gemini to:

1. Read a csv (comma-separated values) file
2. Set the baseline volume and absorbance values to zero
3. Plot the data
4. Analyze and print the peaks

You should compare class peak data in the 30-50 mL range to make the most accurate determination of what wild-type fumarase looks like on an FPLC. Be sure to compare this with your protein gel results as well. Can you assign any peaks?





<font color ="green">**Step 1. Import a csv file**</font>

1. Click the file folder icon at left to display your files.
2. Drag your .csv file from your computer to the Colab folder area to upload it.
3. To use Gemini AI to code, click the word "generate" in the code cell below.
4. You may have to click "ok" in some pop-ups before the Gemini window opens. In the Gemini window, tell Gemini you want to "read a csv file into dataframe called df".
5. Click "Accept" when Gemini completes the code. You may then need to edit the code so that it uses the name of your csv file.
6. After editing your input csv file name, run the cell below.


If your code includes the line:

```
print(df.head())
```

then you should see the first few lines of your csv file above. The data may or may not start at 0 ml and probably do not start at 0 mAU.
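
For reference, the code Gemini produces for this step will probably resemble the sketch below. It is only an illustration, not the exact output you will get. The column names `ml` and `mAU` match those used later in this notebook, but the rows of numbers are invented placeholders, and the in-memory text (`csv_source`, a hypothetical name) stands in for your uploaded file. In Colab you would pass your own file name to `pd.read_csv` instead.

```python
import io
import pandas as pd

# Placeholder text standing in for an uploaded FPLC export (values are invented).
# In Colab, pass your own file name to pd.read_csv instead of csv_source.
csv_source = io.StringIO(
    "ml,mAU\n"
    "0.52,-1.8\n"
    "0.54,-1.7\n"
    "0.56,-1.9\n"
)

df = pd.read_csv(csv_source)  # read the comma-separated data into a DataFrame
print(df.head())              # show the first rows to confirm the import worked
```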

<font color ="green">**Step 2. Set minima to zero for clarity**</font>

1. In the code cell below, tell Gemini you want to "shift all the values in the ml column to start at zero and also shift all values in the mAU column to start at zero."
2. Accept and run this code.

If your code includes the line:

```
print(df.head())
```

then you should see the first few lines of your shifted data above. The ml column should start at 0. The mAU column may or may not start at 0, but none of its values should be negative.
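
One way the shift might be written is sketched below, assuming each column is shifted by subtracting its own minimum. This is an assumption about what Gemini will generate: it may instead subtract the value in the first row, which gives the same result for `ml` when volumes only increase but can differ for `mAU`. The small DataFrame uses invented placeholder values in place of your data.

```python
import pandas as pd

# Invented placeholder values standing in for the DataFrame from Step 1.
df = pd.DataFrame({"ml": [0.52, 0.54, 0.56, 0.58],
                   "mAU": [-1.8, -1.7, -1.9, 3.2]})

# Subtract each column's minimum so that its smallest value becomes zero.
df["ml"] = df["ml"] - df["ml"].min()
df["mAU"] = df["mAU"] - df["mAU"].min()

print(df.head())
```

With a minimum-based shift, the lowest mAU reading becomes zero, so the first row of the mAU column is zero only if the lowest reading happens to be the first one.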

<font color ="green">**Step 3. Plot your data**</font>

1. In the code cell below, tell Gemini you want to "plot this data with ml as the X-axis and mAU as the Y-axis."
2. Accept and run this code.

You should see a plot of your trace above. Peak positions can be estimated by eye from the graph, but we can use Python to identify the peaks and peak heights more precisely.
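
As a rough guide to what the plotting code for this step may look like, here is a minimal matplotlib sketch. The single-peak trace is synthetic (a Gaussian invented for illustration, not a real measurement), and Gemini's actual output may use different styling or function calls.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Synthetic single-peak trace standing in for your data (not a real measurement).
ml = np.linspace(0, 60, 601)
df = pd.DataFrame({"ml": ml, "mAU": 100 * np.exp(-((ml - 40) ** 2) / 8)})

# Plot volume on the X-axis and absorbance on the Y-axis.
fig, ax = plt.subplots(figsize=(10, 6))
ax.plot(df["ml"], df["mAU"])
ax.set_xlabel("ml")
ax.set_ylabel("mAU")
ax.grid(True)
plt.show()
```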

<font color ="green">**Step 4. Identifying peaks**</font>

1. Run the code below to identify your peaks.

In [None]:
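from scipy.signal import find_peaks
import matplotlib.pyplot as plt

# Find local maxima in the absorbance trace; `peaks` holds their row positions.
# height, distance (in data points), and prominence filter out small noise bumps.
peaks, _ = find_peaks(df['mAU'], height=0.01, distance=5, prominence=0.05)

plt.figure(figsize=(10, 6))
plt.plot(df['ml'], df['mAU'])
# .iloc selects rows by position, which is what find_peaks returns.
plt.plot(df['ml'].iloc[peaks], df['mAU'].iloc[peaks], "x")  # Marks peaks with 'x'
plt.xlabel('ml')
plt.ylabel('mAU')
plt.grid(True)
plt.show()

# Print the elution volume and height of each peak.
for i in peaks:
    print("Peak found at %f ml with a peak height of %f mAU." % (df['ml'].iloc[i], df['mAU'].iloc[i]))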

In the plot above, the peaks found should be indicated with an <font color="orange">x</font>.

Additionally, you should see a list of your peaks and peak heights.
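
To focus on the 30-50 mL range mentioned at the start of this notebook, the peak list can be filtered by volume. The sketch below shows one possible way to do that. It uses a synthetic two-peak trace (invented for illustration, not real FPLC data) so that it runs on its own, and the names `peak_ml`, `in_range`, and `window_peaks` are hypothetical. The peak-finding settings are the same as in the cell above.

```python
import numpy as np
import pandas as pd
from scipy.signal import find_peaks

# Synthetic trace with Gaussian peaks centred at 20 ml and 40 ml (illustration only).
ml = np.linspace(0, 60, 601)
mAU = 50 * np.exp(-((ml - 20) ** 2) / 8) + 100 * np.exp(-((ml - 40) ** 2) / 8)
df = pd.DataFrame({"ml": ml, "mAU": mAU})

# Same peak-finding settings as the cell above.
peaks, _ = find_peaks(df["mAU"], height=0.01, distance=5, prominence=0.05)

# Keep only the peaks whose volume lies between 30 and 50 ml.
peak_ml = df["ml"].iloc[peaks]
in_range = (peak_ml >= 30) & (peak_ml <= 50)
window_peaks = peaks[in_range.to_numpy()]

for i in window_peaks:
    print("Peak in the 30-50 ml window at %f ml with a height of %f mAU."
          % (df["ml"].iloc[i], df["mAU"].iloc[i]))
```

With your own data, you would skip the synthetic trace and apply the filtering lines to the `df` and `peaks` you already have.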

Copy the graph and the list of peaks into Word. Print them and place the printout in your notebook.

Be sure to compare this with your protein gel results as well. Can you assign any peaks?