# Automatic protocol generation

This notebook demonstrates an automatic protocol generation pipeline using Google GenAI.

Converting videos or text to protocols involves the following steps:

* Uploading files to Google Cloud Storage (GCS)
* Generating a protocol
* Generating a benchmark dataset
* Evaluating the protocol generation process

This notebook contains example code for the main functions of all these steps.

In [None]:
from __future__ import annotations

%reload_ext autoreload
%autoreload 2

import sys
from pathlib import Path

# Make the parent directory importable so that `proteomics_lab_agent` can be found
path_to_append = Path.cwd().parent
sys.path.append(str(path_to_append))

In [None]:
# Uploading a single file to GCS
import os

from google.cloud import storage
from proteomics_lab_agent.sub_agents import utils

bucket_name = os.getenv("GOOGLE_CLOUD_STORAGE_BUCKET")
storage_client = storage.Client()
bucket = storage_client.bucket(bucket_name)

path = "/Users/patriciaskowronek/Documents/documentation_agent_few_shot_examples/benchmark_dataset/protocols/Diluting_protocolCorrect.MP4"
file_path, file_uri, filename, blob = utils.upload_file_from_path_to_gcs(
    path, bucket, "input_for_protocol"
)
print(file_path, file_uri, filename, blob.metadata)
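
`os.getenv` returns `None` when `GOOGLE_CLOUD_STORAGE_BUCKET` is not set, and the resulting failure may only surface later, during the upload. A minimal sketch of an early guard follows; `require_env` is a hypothetical helper introduced here for illustration and is not part of `proteomics_lab_agent`.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable or raise a clear error."""
    value = os.getenv(name)
    if not value:
        # Fail early with a message that names the missing variable
        msg = f"Environment variable {name!r} is not set or empty."
        raise RuntimeError(msg)
    return value


# Hypothetical usage in the cell above:
# bucket_name = require_env("GOOGLE_CLOUD_STORAGE_BUCKET")
```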

In [None]:
# Uploading a folder to GCS
import os

from google.cloud import storage
from proteomics_lab_agent.sub_agents import utils

bucket_name = os.getenv("GOOGLE_CLOUD_STORAGE_BUCKET")
storage_client = storage.Client()
bucket = storage_client.bucket(bucket_name)

folder_path = "/Users/patriciaskowronek/Documents/documentation_agent_few_shot_examples/benchmark_dataset/protocols"
file_paths = []
for root, _dirs, files in os.walk(folder_path):
    for file in files:
        path = Path(root) / file
        file_path, file_uri, filename, blob = utils.upload_file_from_path_to_gcs(
            path, bucket, "input_for_protocol"
        )
        print(file_path, file_uri, filename, blob.metadata)
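
`os.walk` yields every file under the folder, including hidden files such as `.DS_Store` on macOS, so the loop above uploads all of them. If only videos should be uploaded, the files could be filtered first. The sketch below assumes a hypothetical helper `iter_video_files` and an assumed suffix set (`.mp4`, `.mov`, matching the file types referenced in this notebook).

```python
from __future__ import annotations

from collections.abc import Iterator
from pathlib import Path

# Assumption: only these video formats are relevant for protocol generation
VIDEO_SUFFIXES = {".mp4", ".mov"}


def iter_video_files(folder: str | Path) -> Iterator[Path]:
    """Yield video files under `folder` recursively, skipping hidden files."""
    for path in sorted(Path(folder).rglob("*")):
        if (
            path.is_file()
            and not path.name.startswith(".")
            and path.suffix.lower() in VIDEO_SUFFIXES
        ):
            yield path
```

Each yielded path could then be passed to `utils.upload_file_from_path_to_gcs` in place of the paths produced by the `os.walk` loop.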

In [None]:
# Generate a protocol based on video path
from IPython.display import Markdown
from proteomics_lab_agent.sub_agents.protocol_generator_agent import agent

result = agent.generate_protocols(
    query="""Generate a protocol based on this video "gs://ai-proteomics-advisor/input_for_protocol/ConnectingColumnSampleLine_protocolCorrect.MP4"."""
)

In [None]:
Markdown(result["protocol"])
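
The query above embeds a `gs://` URI in a fixed prompt sentence. When generating protocols for several uploaded videos, the query can be composed from its parts. This is a sketch only: `build_video_query` is a hypothetical helper, and the prompt wording is copied from the cell above.

```python
def build_video_query(bucket_name: str, folder: str, filename: str) -> str:
    """Compose the video prompt used above for a file stored in GCS."""
    file_uri = f"gs://{bucket_name}/{folder.strip('/')}/{filename}"
    return f'Generate a protocol based on this video "{file_uri}".'


query = build_video_query(
    "ai-proteomics-advisor",
    "input_for_protocol",
    "ConnectingColumnSampleLine_protocolCorrect.MP4",
)
print(query)
```

The resulting string could be passed as `query` to `agent.generate_protocols`.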

In [None]:
# Generate a protocol based on text
# ruff: noqa: RUF001

from IPython.display import Markdown
from proteomics_lab_agent.sub_agents.protocol_generator_agent import agent

result2 = agent.generate_protocols(
    query="""Generate a protocol based on these notes:
    #Magnetic beads - Sample preparation

     Aliquot 10 µl bead slurry
    Wash 2x with ddH2O
    Remove supernatant
    Add 100 µl buffer as defined in table
    Dependent on what buffer you want to use for the beads → either specified by the bead protocol itself or Philipps buffer screen
    Add 10 µl plasma
    or platelet contaminated plasma if needed
    Incubate 30 min at 37°C
    Centrifuge at 2000 g, discard supernatant
    Wash with 100 µl buffer
    Discard supernatant
    Wash with 100 µl buffer
    Discard supernatant
    50 µl lysis buffer
    100 mM Tris, 40 mM CAA, 10 mM TCEP
    for 40 ml buffer: 149.6 mg CAA, 526.6 mg Tris and 0.8 mL 0.5M TCEP
    or
    for 40 ml buffer: 149.6 mg CAA, 526.6 mg Tris and 100.1 mg TCEP → check pH !!!!!
    10 min 90°C
    Spin down
    Add 10 µl of trypsin/lysC stock: per sample: 8 µl lysis buffer, 1 µl trypsin, 1 µl lysC per sample
    1200 µl lysis buffer, 150 µl trypsin, 150 µl lysC
    Digest for XX h
    Add 50 µl 0.2% TFA
    Mix
    Transfer 100 µl on a fresh plate, discard bead plate
    Approximation: 10 µl of plasma using the enrichment should results in 800 ng peptides; Load 20 µl of the transferred 100 µl digest (is approx. 160 ng)
    """
)
result2

In [None]:
Markdown(result2["protocol"])

In [None]:
result2["protocol_generation_time"]

In [None]:
# Generating the benchmark_data.csv
# ruff: noqa: RUF003

import pandas as pd

benchmark_dataset_path = "/Users/patriciaskowronek/Documents/proteomics_lab_agent/eval/eval_protocol_generation/benchmark_data.csv"

df_benchmark_data = pd.read_csv(benchmark_dataset_path, index_col=0)
list_new_rows = [
    {
        "eval_set_name": "ConnectingColumnSampleLine",
        "protocol_type": "specialized_equipment",
        "activity_type": "column_handling",
        "user_prompt": """Generate a protocol based on this video \"gs://ai-proteomics-advisor/input_for_protocol/ConnectingColumnSampleLine_protocolCorrect.MP4\".""",
        "input_type": "video",
        "ground_truth_protocol": "# Connecting IonOpticks Column & sample line of Evosep\n\n## Abstract\nThis protocol explains how to connect an IonOpticks column inserted into a timsTOF Ultra with an sample line of an Evosep.\n\n\n## Materials\n\n### Equipment\n- IonOpticks Column\n- timsTOF Ultra Mass Spectrometer\n  - Equipped with UltraSource ion source\n  - Equipped with column oven for temperature control\n- Evosep One LC System\n  - with sample line\n- NanoViper Adapter (black)\n- Pliers\n\n\n## Procedure\n*Estimated timing: 3 minutes*\n\n1. Verfy that the instrument is in standing by mode. If not press the on/off button to switch from operate mode to standing by mode.\n\n### Connect column and sample line\n2. Verify that an IonOpticks column is already inserted into the UltraSource of the timsTOF Ultra.\n3. Attach a black NanoViper adapter to the LC sample line.\n4. Remove any access liquid at the top of the nanoViper of the sample line for instance by snipping it off.\n5. Hold the column fititng of the IonOpticks column with a pliers.\n6. Hand-tighten the NanoViper of the LC sample line with the column fitting (Figure 1).\n   CRITICAL STEP: Be careful not to overtighten the connection between the IonOpticks column and the timsTOF Ultra. Otherwise you can damage the column and the LC sample line.\n7. Remove the NanoViper adapter so that the oven can be tightly closed.\n8. Draw the oven closer to the UltraSource, secure it with the screw on the bottom of the oven (Figure 2).\n9. Lift and place the metal grounding screw at the column-sample line connection to establish proper ESI spray grounding with the column oven.\n10. Close the lid of the oven.\n11. Verify the temperature at 50°C for IonOpticks columns, as indicated by three illuminated LEDs on the column oven, respectively.\n\n### Switch timsTOF to operate and idle flow\n12. In timsControl, transition the instrument to the operation mode by using the on/off button.\n13. 
Navigate to Hystar and ensure that the idle flow is active. If not right-click on the Evosep logo, choosing 'preparation', and then selecting 'idle flow' and 'Run'.\n14. Return to timsControl. Check the MS signal. It should be around 2.5x10^7.\n\n\n## Expected Results\n- In timsControl, signal intensity should be above 10^7\n- Stable signal in in timsControl (in the windows for Mobilogram (timsView) and mass spectrum)\n\n\n## Figures\n### Figure 1: Connection of a sample line of a LC system with a separation column and emitter.**\n\n### Figure 2: UltraSource.**\n\n\n## References\n1. Skowronek, P., Wallmann, G., Wahle, M. et al. An accessible workflow for high-sensitivity proteomics using parallel accumulation–serial fragmentation (PASEF). Nat Protoc (2025). https://doi.org/10.1038/s41596-024-01104-w\n",
        "ai_protocol": "# Connecting an IonOpticks Aurora Series Column to a Bruker timsTOF for LC-MS/MS Analysis\n\n## Abstract\nThis protocol provides a step-by-step guide for the correct physical and electrical connection of an IonOpticks Aurora Series UHPLC column to the CaptiveSpray Ultra ion source of a Bruker timsTOF series mass spectrometer. The procedure ensures a secure, leak-free fluidic connection and proper grounding, which are essential for generating a stable electrospray. Additionally, the protocol outlines the necessary software commands in Bruker timsControl and HyStar to initiate mobile phase flow and confirm system readiness for data acquisition.\n\n## Materials\n\n### Equipment\n| Equipment | Description | Source/Cat. No. |\n|---|---|---|\n| Mass Spectrometer | Bruker timsTOF SCP, Pro, or similar model equipped with a CaptiveSpray Ultra ion source. | Bruker Daltonik |\n| LC System | An LC system capable of nano-flow rates, such as the Evosep One. | Evosep |\n| UHPLC Column | IonOpticks Aurora Series UHPLC column with integrated emitter. | IonOpticks |\n| Pliers | Needle-nose or similar pliers for securely holding the column fitting. | Standard lab supplier |\n| Gloves | Nitrile or latex laboratory gloves. | Standard lab supplier |\n| Nano-connector adapter | (Optional) A tool to aid in finger-tightening nano-fittings. | - |\n\n### Software\n| Software | Version | Purpose | Source/Link |\n|---|---|---|---|\n| Bruker timsControl | Not specified in video | Instrument control and data monitoring for timsTOF systems. | Provided with Bruker timsTOF systems |\n| Bruker HyStar | Not specified in video | Manages the LC system and autosampler, and coordinates with timsControl. | Provided with Bruker LC-MS systems |\n\n## Procedure\n*Estimated timing: 5-10 minutes (including heating time)*\n\n### System Preparation and State Verification\n\n1.  In the Bruker timsControl software, verify the status of the mass spectrometer. 
Ensure the instrument is in **‘Standby’** mode. If the status is ‘Operating’, click the corresponding button to switch it to ‘Standby’ before proceeding.\n2.  **CRITICAL STEP:** IonOpticks columns with integrated emitters should not be subjected to mobile phase flow for extended periods without an applied voltage (i.e., while the MS is in 'Standby'). This can lead to poor spray quality or column damage. Minimize the time between starting the idle flow and beginning the analysis.\n\n### Physical Connection of the Column\n\n3.  Wear clean laboratory gloves.\n4.  Confirm that the IonOpticks column is already seated correctly within the column oven of the CaptiveSpray Ultra source.\n5.  Carefully take the fused silica sample line originating from the LC system. Inspect the nano-connector at its tip and wipe it clean with a gloved finger or a lint-free wipe to remove any particulates or residual solvent.\n6.  Use pliers to gently but firmly grip the stationary metal fitting on the IonOpticks column. This prevents the column from twisting during tightening.\n7.  Align the sample line's nano-connector with the column fitting and carefully screw it in by hand until it is **finger-tight**.\n\n    > **! CAUTION:** Do not use tools to tighten the fitting and avoid overtightening. Overtightening can crush the capillary or damage the threads of the column fitting, leading to leaks and poor chromatographic performance.\n\n### Positioning, Grounding, and Heating\n\n8.  Verify the physical position of the column oven assembly. If necessary, loosen the positioning screw and adjust the assembly to bring the column's emitter tip as close as possible to the ion source's transfer capillary without touching it.\n9.  Ensure the grounding screw on the column oven is making firm contact with the metal column fitting. A proper electrical ground is essential for a stable electrospray. For longer columns, alternative grounding points on the oven assembly may be used.\n10. 
Close the lid of the column oven. A distinct click confirms it is securely shut.\n11. Observe the three green LEDs on the side of the column oven. The LEDs will blink while the oven is heating to its setpoint (e.g., 50 °C). The lights will become solid green once the target temperature has been reached and is stable.\n\n### Establishing Electrospray and Signal\n\n12. In the timsControl software, switch the instrument's status from **‘Standby’** to **‘Operating’**. This will apply the necessary voltages to the ion source.\n13. Navigate to the Bruker HyStar software.\n14. Initiate the mobile phase flow to the column. Right-click within the sample table and select **Idle flow > Run** from the context menu. This will start a low-flow rate to equilibrate the column and establish a spray.\n15. Return to the timsControl software. Monitor the **Chromatogram View** for a stable Total Ion Chromatogram (TIC) and the **Spectrum View** for a consistent ion signal.\n\n## Expected Results\n- A secure, leak-free connection between the sample line and the UHPLC column.\n- The column oven is closed and has reached its set temperature, indicated by solid green LEDs.\n- A stable TIC signal is observed in timsControl, confirming a robust and consistent electrospray. The system is now ready for data acquisition.\n\n## Figures\n\n### Figure 1: Connecting the Sample Line to the Column\n![Placeholder for an image showing the three key steps of the physical connection: (A) Holding the column fitting steady with pliers. (B) Hand-tightening the sample line's nano-connector into the column fitting. (C) The final, finger-tight connection with the grounding screw in contact with the fitting.](placeholder_image_url)\n***(A) The column fitting is held with pliers to prevent rotation. (B) The sample line is screwed in until finger-tight. 
(C) The completed connection, ensuring the grounding screw makes contact.***\n\n### Figure 2: Software Workflow for Initiating Spray\n![Placeholder for a composite image showing screenshots of the software interfaces: (A) The 'Standby' button in timsControl. (B) The 'Idle flow -> Run' command in the HyStar right-click menu. (C) The timsControl interface showing a stable TIC signal after the system is set to 'Operating' and idle flow has commenced.](placeholder_image_url)\n***(A) Ensure the mass spectrometer is in 'Standby' in timsControl before making connections. (B) Use the HyStar menu to start the 'Idle flow'. (C) After switching to 'Operating' mode, monitor for a stable signal in timsControl.***\n",
        "user_protocol_rating": {
            "Completeness": 5,
            "Technical Accuracy": 5,
            "Logical Flow": 5,
            "Safety": 4,
            "Formatting": 2,
        },
        "comments": "",
    },
    {
        "eval_set_name": "Diluting",
        "protocol_type": "regular_wetlab",
        "activity_type": "liquid_handling",
        "user_prompt": """Generate a protocol based on this video \"gs://ai-proteomics-advisor/input_for_protocol/Diluting_protocolCorrect.MP4\".""",
        "input_type": "video",
        "ground_truth_protocol": "# Dispensing Protocol\n\n## Abstract\nThis protocol describes how to pipette first 400 uL buffer A and then 100 uL buffer B in a Eppendorf tube.\n\n\n## Materials\n\n### Equipment\n- Eppendorf tube\n- Pipette\n\n### Reagents\n- Formic acid (FA)\n  - ! CAUTION: This liquid may be corrosive. It is harmful and can cause damage if direct contact occurs.\n- Acetonitrile\n  - ! CAUTION: This liquid is highly flammable and can be harmful if contact occurs.\n\n### Reagent setup\n- Buffer A: Consists of 0.1% (vol/vol) FA.\n- Buffer B: Consists of 0.1% (vol/vol) FA/99.9% (vol/vol) acetonitrile.\n\n\n## Procedure\n*Estimated timing: 1 minute*\n\n1. Open the bottles of Buffer A and Buffer B.\n2. Adjust the pipette to 400 μL setting.\n3. Attach a pipette tip to the pipette.\n4. Pipette 400 μL of Buffer A from the bottle into the Eppendorf tube.\n5. Adjust the pipette to 100 μL setting.\n6. Discard the used pipette tip in the appropriate waste container.\n7. Attach a pipette tip to the pipette.\n8. Pipette 100 μL of Buffer B from the bottle into the Eppendorf tube.\n9. Discard the used pipette tip in the appropriate waste container.\n10. Mix the solution with a vortexer.\n11. Lable the Eppendorf tube.\n\n\n## Expected Results\nApproximatly 500 uL liquid in the tube\n",
        "ai_protocol": "",
        "user_protocol_rating": {},
        "comments": "",
    },
    {
        "eval_set_name": "DisconnectColumn",
        "protocol_type": "specialized_equipment",
        "activity_type": "column_handling",
        "user_prompt": """Generate a protocol based on this video \"gs://ai-proteomics-advisor/input_for_protocol/DisconnectColumn_protocolCorrect.MP4\".""",
        "input_type": "video",
        "ground_truth_protocol": """# Disconnecting a IonOpticks column from an Evosep System\n\n## Abstract\nThis protocol describes how to disconnect an IonOpticks column from an Evosep liquid chromatography system when the column is installed in the UltraSource of a timsTOF mass spectrometer.\n\n\n## Materials\n\n### Equipment\n- IonOpticks Column\n- timsTOF Mass Spectrometer:\n  - Equipped with UltraSource ion source\n- Evosep One LC System\n  - with sample line\n- NanoViper Adapter (black)\n- Pliers\n\n\n## Procedure\n*Estimated timing: 1 minute*\n\n### Verify timsTOF is in standby mode:\n1. Check timsControl software status in top left corner.\n2. If in "Operate" mode, click the power symbol to transition to "Standby" (Figure 1, top left).\n\n### Disconnect IonOpticks column and Evosep:\n3. Open the lid of the column oven (column toaster).\n4. Loosen the securing screw at the sample line - column connection (Figure 2F), which has the purpose to establish proper ESI spray grounding with the column oven. Lift it up and move the metal grounding screw away from the column-sample line connection.\n5. Attach the NanoViper adapter to the sample line for easier handling.\n6. Hold the column fitting with pliers for easier handling.\n7. Unscrew the NanoViper connector to detach the sample line from the IonOpticks column by turning it counter clock-wise.\n8. Position the sample line ensuring it does not bend. One method is to place it over the transparent bumper of the Evosep.\n\n\n## Expected Results\n- The timsTOF is in standby mode\n- The column should be completely detached from the Evosep\n\n\n## Figures\n### Figure 1: TimsControl\n\n### Figure 2: IonOpticks column connected with UltraSource of timsTOF and Evosep\n\n## References\n1. Bruker Daltonics. (2023). timsTOF User Manual. [Link to user manual]\n2. Skowronek, P., Wallmann, G., Wahle, M. et al. An accessible workflow for high-sensitivity proteomics using parallel accumulation–serial fragmentation (PASEF). 
Nat Protoc (2025). https://doi.org/10.1038/s41596-024-01104-w\n""",
        "ai_protocol": "",
        "user_protocol_rating": {},
        "comments": "",
    },
    {
        "eval_set_name": "ESIsourceToUltraSource",
        "protocol_type": "specialized_equipment",
        "activity_type": "ion_source_operation",
        "user_prompt": """Generate a protocol based on this video \"gs://ai-proteomics-advisor/input_for_protocol/ESIsourceToUltraSource_protocolCorrect_CapillaryPushedIn.MP4\".""",
        "input_type": "video",
        "ground_truth_protocol": "# Change source: ESI source to UltraSource\n\n\n## Abstract\nThis protocol describes the procedure for switching from the ESI source to UltraSource.\n\n\n## Materials\n\n### Equipment\n- timsTOF Ultra Mass Spectrometer:\n  - Equipped with an ESI ion source\n  - UltraSource ion source ready to attach\n- IonOpticks Column\n- Evosep One LC System with sample line\n- NanoViper Adapter (black)\n- Pliers\n\n\n## Procedure\n*Estimated timing: less than 10 minute*\n\n### Switch timsTOF to standby\n1. In timsControl, verfy if the instrument is in standing by mode. If not press the on/off button to switch from operate mode to standy by mode  (Figure 1).\n2. Verify that the syringe is inactive in the tab 'Source' under 'Syringe Pump'. If not click on 'Stop'.\n3. In the 'source' and 'source type' sections of timsControl, choose 'CaptiveSpray' but do not activate it yet.\n\n### Remove ESI source\n4. At the instrument, disconnect the peak connector of the sample tubing (Figure 2).\n5. Disconnect the nebulizer N₂ line.\n6. Remove the ESI source\xa0housing by hinging it out like a regular door.\n7. Put on laboratory gloves.\n8. Remove the spray shield\n   ! CAUTION: The spray shield is hot.\n9. Remove the capillary cap.\n   ! CAUTION: The capillary cap is hot.\n10. Inspect the capillary position. If it appears to be partially pulled out, gently push it back into proper position without blocking the gas flow.\n\n### Mount UltraSource\n11. Hinge the UltraSource door in and close it (Figure 3).\n12. Slide the UltraSource housing onto the source door and secure it by flipping the handles located at the top right and bottom left by 180°.\n13. Connect the filter tubing to the source.\n\n### Connect column and sample line\n14. Verify that an IonOpticks column is already inserted into the UltraSource of the timsTOF Ultra.\n15. Verify if the LC sample line has a black NanoViper adapter attached. 
If not, locate an adapter and securely attach it to the sample line.\n16. Remove any access liquid at the top of the nanoViper of the sample line for instance by snipping it off.\n17. Hold the column fititng of the IonOpticks column with a pliers.\n18. Hand-tighten the NanoViper of the LC sample line with the column fitting (Figure 3).\n   CRITICAL STEP: Be careful not to overtighten the connection between the IonOpticks column and the timsTOF Ultra. Otherwise you can damage the column and the LC sample line.\n19. Draw the oven closer to the UltraSource, secure it with the screw on the bottom of the oven (Figure 2).\n20. Remove the NanoViper adapter so that the oven can be tightly closed.\n21. Lift and place the metal grounding screw at the column-sample line connection to establish proper ESI spray grounding with the column oven.\n22. Close the lid of the oven.\n23. Connect the oven to the electrical power supply.\n24. Set the temperature at 50°C for IonOpticks columns, as indicated by three illuminated LEDs on the column oven. Blincking light indicates that the oven is heating up.\n    CRITICAL STEP: Ensure that the IonOpticks column is not left connected to the LC for an extended period unless the mass spectrometer is in operate mode; otherwise, debris may accumulate on the emitter, lead to spitting.\n\n### Switch timsTOF to operate and idle flow\n25. In timsCOntrol, activate the CaptiveSpray function in timsControl (Figure 1)\n26. Transition the instrument to the operation mode by using the on/off button.\n27. Navigate to Hystar and ensure that the idle flow is active. If not right-click on the Evosep logo, choosing 'preparation', and then selecting 'idle flow' and 'Run'.\n28. Return to timsControl and check the MS signal. It should be around 2.5x10^7.\n29. Monitor the internal pressures in the timsTOF device. The funnel pressur is acceptable within a 9-11 mbar range. Aim for a 10 mbar funnel pressure. 
Adjust the 'funnel pressure' wheel if necessary by turning counter clock-wise to reduce the pressure.\n\n\n## Expected Results\n- In timsControl, signal intensity should be above 10^7\n- Stable signal in timsControl (in the windows for Mobilogram (timsView) and mass spectrum)\n\n\n## Figures\n**Figure 1: TimsControl settings before changing source.**\n\n**Figure 2: Apollo ESI source and gas flow pressure control wheels.**\n\n**Figure 3: UltraSource.**\n\n\n## References\n1. Skowronek, P., Wallmann, G., Wahle, M. et al. An accessible workflow for high-sensitivity proteomics using parallel accumulation–serial fragmentation (PASEF). Nat Protoc (2025). https://doi.org/10.1038/s41596-024-01104-w\n2. timsTOF user manual\n",
        "ai_protocol": "",
        "user_protocol_rating": {},
        "comments": "",
    },
    {
        "eval_set_name": "Evotip",
        "protocol_type": "regular_wetlab",
        "activity_type": "sample_preparation",
        "user_prompt": """Generate a protocol based on this video \"gs://ai-proteomics-advisor/input_for_protocol/Evotip_protocolCorrect.MP4\".""",
        "input_type": "video",
        "ground_truth_protocol": "# Centrifuge Protocol\n\n\n## Abstract\nThis protocol describes how to spin Evotips.\n\n\n## Materials\n\n### Equipment\n- Evotips\n- Evotip holders wilth solvent reservior\n- Centrifuge\n\n\n## Procedure\n*Estimated timing: 1 minute*\n\n1. Equaly distribute Evotips onto two Evotip holders. For instance 12 Evotips each. Do this either manually or by using a multichannel pipette.\n2. Switch on the centrifuge.\n3. Place the Evotip holders within centrifuge holders at opposite positions.\n4. Spin the Evotips at 800g for 1 minute in a centrifuge.\n\n\n## Expected Results\nA balanced centrifuge that spins at 800g for 1 minute.\n",
        "ai_protocol": "",
        "user_protocol_rating": {},
        "comments": "",
    },
    {
        "eval_set_name": "Pipette",
        "protocol_type": "regular_wetlab",
        "activity_type": "liquid_handling",
        "user_prompt": """Generate a protocol based on this video \"gs://ai-proteomics-advisor/input_for_protocol/Pipette_protocolCorrect.MP4\".""",
        "input_type": "video",
        "ground_truth_protocol": "# Dispensing Protocol\n\n\n## Abstract\nThis protocol describes how to pipette 20 uL buffer A in a PCR Plate from A1 to A12.\n\n\n## Materials\n\n### Equipment\n- PCR Plate\n- Pipette\n\n### Reagents\n- Formic acid (FA)\n  ! CAUTION: This liquid may be corrosive. It is harmful and can cause damage if direct contact occurs.\n\n### Reagent setup\n- Buffer A: Consists of 0.1% (vol/vol) FA. The buffers are stable for at least 6 months at room temperature as long as they are protected from sunlight.\n\n\n## Procedure\n*Estimated timing: 1 minute*\n\n1. Open the bottle of Buffer A.\n2. Adjust the pipette to 20 μL setting.\n3. Attach a pipette tip to the pipette.\n4. Pipette 20 μL of Buffer A from the bottle into well A1.\n5. Continue dispensing 20 μL of Buffer A into each consecutive well from A2 to A12.\n6. Discard the used pipette tip in the appropriate waste container.\n\n\n## Expected Results\nApproximatly 20 uL liquid in A1 to A12 in a PCR Plate.\n",
        "ai_protocol": "",
        "user_protocol_rating": {},
        "comments": "",
    },
    {
        "eval_set_name": "PlaceEvotips",
        "protocol_type": "specialized_equipment",
        "activity_type": "sample_preparation",
        "user_prompt": """Generate a protocol based on this video \"gs://ai-proteomics-advisor/input_for_protocol/PlaceEvotips_protocolCorrect.MP4\".""",
        "input_type": "video",
        "ground_truth_protocol": "# Placing Evotips in Evotip Boxes on the Evosep One System\n\n\n## Abstract\nThis protocol describes the proper procedure for inspecting Evotips and placing Evotips in Evotip boxes on the liquid chromatography system Evosep One.\n\n\n## Materials\n\n### Equipment\n- Evotips\n  - Single-use stage tips for sample injection\n  - Rack layout: Two columns (left and right)\n  - Left column (top to bottom): S1, S2, S3\n  - Right column (top to bottom): S4, S5, S6\n  - Within each box: Standard 96-well format with A1 (top left), A12 (top right), H12 (bottom right)\n- Evotip Boxes\n  - 96-well format (A1-H12) (Figure 1)\n- Evosep One System\n  - Liquid chromatography system\n\n### Reagents\n- Formic acid (FA)\n  ! CAUTION: This liquid may be corrosive. It is harmful and can cause damage if direct contact occurs.\n\n### Reagent setup\n- Buffer A: Consists of 0.1% (vol/vol) FA. The buffers are stable for at least 6 months at room temperature as long as they are protected from sunlight.\n\n\n## Procedure\n*Estimated timing: less than 1 minute*\n\n1. Visually inspect the Evotip box intended for S1 to verify it's filled with Buffer A solution to a minimum depth of 1 cm. Verification of the Evotip box placed at S3 is not necessary as it will only contain blanks.\n2. Place Evotip Box at S1 within the rack system of the Evosep instrument. Ensure each box is firmly seated in its designated position.\n3. Place an empty Evotip Box for Blank tips at S3. Ensure each box is firmly seated in its designated position.\n4. Inspect each Evotip before placement to verify its condition. Properly prepared Evotips should display a pale-colored SPE material disc with visible solvent above it (Figure 2).\n   **CRITICAL STEP**: Discard any Evotips showing signs of dryness or displaying a white-colored disc, as these conditions indicate compromised functionality that could affect sample analysis.\n5. 
Place the verified Evotips into the prepared Evotip boxes at S1 from A1 to A6.\n6. Place empty Evotips, called Blanks, at S3 from A1 to A6.\n7. Document the precise position of each placed Evotip.\n\n\n## Expected Results\nWhen the procedure is performed correctly, you should observe:\n- Properly seated Evotip boxes in the rack system\n- Visible Buffer A solution in boxes (1 cm depth)\n- All non-blank Evotips showing pale-colored SPE material discs & clear solvent meniscus above each SPE disc of each Evotip\n- Accurate documentation of tip positions: Evotips that are placed at S1 from A1 to A6 and blanks placed at S3 from A1 to A6.\n\n\n## Figures\n\n### Figure 1: Evosep positions\n[Placeholder for close-up of single Evotip box showing well positions (A1-H12)]\n\n### Figure 2: Evotip Quality Assessment\n- Most Evotips: Properly hydrated Evotip with pale-colored disc and visible solvent\n- Orange-highlighted Evotip: Compromised Evotip showing white/dry disc\n\n\n## References\n1. Evosep One - User Guide: https://www.evosep.com/wp-content/uploads/2024/06/Evosep-One-User-Guide-v18.pdf\n2. Sample loading protocol for Evotips: https://www.evosep.com/wp-content/uploads/2020/03/Sample-loading-protocol.pdf\n",
        "ai_protocol": "",
        "user_protocol_rating": {},
        "comments": "",
    },
    {
        "eval_set_name": "QueueSamples",
        "protocol_type": "specialized_software",
        "activity_type": "starting_measurment",
        "user_prompt": """Generate a protocol based on this video \"gs://ai-proteomics-advisor/input_for_protocol/QueueSamples_protocolCorrect.mov\".""",
        "input_type": "video",
        "ground_truth_protocol": """# Queue and measure samples in HyStar\n\n\n## Abstract\nThis protocol explains how to queue sample in HyStar for LC-MS measurement.\n\n\n## Materials\n\n### Software\n| Software | Version | Purpose | Source/Link |\n|----------|---------|---------|------------|\n| HyStar | 6.0 | Controls LC and MS | On every instrument computer |\n\n\n## Procedure\n*Estimated timing: 6 minutes*\n\n1. Verify the Evotips were placed on top of the Evosep LC system. In this example, 5 ng HeLa Evotips were placed at S1 from A1 to A6 and blanks at S3 from A1 to A6.\n    Note: Blank Evotis can be unused and dry Evotips.\n2. Verify that the TIMS device in TimsControl is calibrated. The TIMS device should be calibrated each time before you start a sample queue.\n3. In Hystar, navigate to the \'Acquisition\' tab.\n4. Either select \'New\', and subsequently choose \'LC-MS sample table\' (Figure 1) to generate a new sample table or choose an already existing sample table by pressing the arrow down button when hovering over the sample table name in the left sample table column.\n5. In both cases, copy already existing sample table entries to modify them.\n6. Adjust the sample ID so that it follows this pattern: currentDate_massSpec_user_sampleType_projectID_ sampleName. Typical examples for sampleType: "SA_blank", "MA_HeLa", "DIAMA_HeLa".\n7. For performance evaluation of the LC-MS system, we recommend to queue one blank, three dda-PASEF runs, three dia-PASEF runs and ending with another blank.\n    Critical step: Always start the queue with a sacrificial Evotip such as a blank tip as the first run might have altered chromatographic conditions that could introduce technical errors to your measurements.\n8. Verify the column autocompletion settings with right-click on a field in the column \'vial\' such as S1-A1 in Figure 1 and select \'Configure\'. 
The arrows allow one to define the direction in which the vial positions on the 96-well are autocompleted when dragging values similar to Excel in the sample table. Decide whether the values should increase from A1-A12 indicated by arrows pointing to right. Ensure that the tray type is set to \'Evosep\' and slots 1-6 are designated as \'96Evotip\'. Press \'OK\'.\n9. Match the Evotip position with the sample\'s location in the Evotip box: From S1 from A1 to A6 and blanks at S3 from A1 to A6. To do this, press the arrow next to the value in the \'vial\' column (Figure 1). Select the position where the first Evotip is placed, for instance S1 A1. You can then either specify all remaining positions automatically by dragging the values (similar to Excel\'s auto-fill function) or specify each position individually.\n10. Specify a \'path\' folder for storing the raw files.\n11. Choose an existing separation method or create a new one. To create a new separation method, right-click on the separation method field, select \'new method\' followed by \'edit method\'. Choose the method type listed under \'name\' and then press \'OK\'. Specify its name and save it.\n12. Set the injection method to \'standard\'.\n13. At \'MS method\', load either dda-PASEF and dia-PASEF maintenance methods to check the LC-MS performance or the method intended to be used for measuring the study.\n14. Stop the idle flow on the Evosep by right-clicking on the Evosep logo and selecting \'Cancel maintenance procedure\'.\n15. Save the sample table.\n16. Right-click somewhere on the top row of the freshly defined sample table entries and select \'upload sample conditions\' to pre-check if all LC and MS method can be loaded correctly. The status should be change to loaded.\n17. Press \'start\' and \'start sequence\'.\n\n\n## Expected Results\nTypical chromatogram shape with MS TIC (green line) increasing drastically around 4 minutes and another increase (wash-out) at 15 minutes. 
Higher MS2 intensities (red line) for dia-PASEF.\n\n\n## Figures\n### Figure 1: Hystar\n[Placeholder for Screenshot of reaauired Hystar settings]\n\n\n## References\n1. Skowronek, P., Wallmann, G., Wahle, M. et al. An accessible workflow for high-sensitivity proteomics using parallel accumulation–serial fragmentation (PASEF). Nat Protoc (2025). https://doi.org/10.1038/s41596-024-01104-w\n2. TODO: Refer to Hystar manual\n3. TODO: Maybe upload video which shows protocol in action\n""",
        "ai_protocol": "",
        "user_protocol_rating": {},
        "comments": "",
    },
    {
        "eval_set_name": "TimsCalibration",
        "protocol_type": "specialized_software",
        "activity_type": "calibration",
        "user_prompt": """Generate a protocol based on this video \"gs://ai-proteomics-advisor/input_for_protocol/TimsCalibration_protocolCorrect.mov\".""",
        "input_type": "video",
        "ground_truth_protocol": """# Calibrate the TIMS device\n\n\n## Abstract\nThis protocol explains how to calibrate a TIMS device in timsControl.\nImportant: We recommend to calibrate the TIMS device each time before a sample queue is started.\n\n\n## Materials\n\n### Software\n| Software | Version | Purpose | Source/Link |\n|----------|---------|---------|------------|\n| timsControl | 6.0 (latest) | Software to control timsTof mass spectrometers | Available on every instrument computer |\n\n\n## Procedure\n*Estimated timing: 3 minutes*\n\nCritical step: If the instrument has been operated with another source than the UltraSource or CaptiveSpray source, it is highly recommended to wait for 3 hours before initiating the TIMS calibration process. This waiting period permits the temperature and pressure to stabilise, thus ensuring a steady TIMS calibration.\n1. Use timsControl to load a performance evaluation method - for instance a dda-PASEF or dia-PASEF method. The method should have an ion mobility range that matches the ion mobility range of the method intended to be use in the study. For proteomics experiments, we typically use an ion mobility range from 0.7 to 1.3 1/K₀, hence the 1/K₀ start and end values can be set to 0.7 and 1.3, respectively (Figure 1: 6, 7). This consistency in ion mobility range allows for the integration of quality control (QC) runs into the sample table as reference points without necessitating recalibration of the TIMS tunnel.\n2. To specifically calibrate a narrow ion mobility range, activate the locked sign at 1/K₀ end (Figure 1: 7).\n3. Adjust the 1/K₀ start from 0.7 to 0.85. This alters the ion mobility range from 0.7-1.3 1/K₀ to 0.85-1.45 1/K₀. The aim here is to shift the ion mobility range without modifying the interval, enabling all three calibrants to be used for linear calibration while maintaining a constant TIMS potential.\n4. Wait until the TIC in Chromatogram View is stable. This can take up to 15 minutes.\n5. 
Switch the scan mode to \'MS\', set MS averaging to 30 and deactivate the polygon heatmap (Figure 1: 4, 9, 14).\n6. In timsControl, navigate to \'calibration\', then \'mobility\'.\n7. **Not included in video** From reference lists, select the list \'[ESI] Tuning Mix ES-TOF (ESI)\' that contains the calibrant masses 622, 922, 1221.\n8. Specify the linear mode and 5% as detection range and ±0.1 Da as width (Figure 2).\n9. Proceed by selecting \'calibrate\' (Figure 2).\n10. To verify that the calibrants have been picked correctly at the center, click on them in the reference list.\n11. If they are not picked in the center, make adjustments by clicking on the peak in the TIMS view window.\n12. If the score is at 100%, press accept.\n13. Select "Method" > "Load Recent", select the same method, and then click "Discard changes" in the pop-up window.\n14. Set MS averaging to 1.\n\n\n## Expected Results\n- The score in the tab "Calibration" should be at 100%\n\n\n## Figures\n### Figure 1: TimsControl settings\n\n### Figure 2: Ion mobility calibration\n\n## Tables\nTable 1: Gas Flow Parameters for UltraSource\n\n| Instrument Type | Calibrant 922 Voltage [V] |\n|----------------|---------------------------|\n| timsTOF Pro, SCP | 160 |\n| timsTOF HT, Ultra | 200 |\n\n## References\n1. Skowronek, P., Wallmann, G., Wahle, M. et al. An accessible workflow for high-sensitivity proteomics using parallel accumulation–serial fragmentation (PASEF). Nat Protoc (2025). https://doi.org/10.1038/s41596-024-01104-w\n2. TODO: Refer to user manual of timsTof\n3. TODO: Maybe upload video which shows calibration in action\n""",
        "ai_protocol": "",
        "user_protocol_rating": {},
        "comments": "",
    },
    {
        "eval_set_name": "UltraSourceToESIsource",
        "protocol_type": "specialized_equipment",
        "activity_type": "ion_source_operation",
        "user_prompt": """Generate a protocol based on this video \"gs://ai-proteomics-advisor/input_for_protocol/UltraSourceToESIsource_protocolCorrect.MP4\".""",
        "input_type": "video",
        "ground_truth_protocol": "# Change source: UltraSource to ESI source\n\n\n## Abstract\nThis protocol describes the procedure for switching from the UltraSource to the ESI source.\n\n\n## Materials\n\n### Equipment\n- timsTOF Ultra Mass Spectrometer:\n  - Equipped with an UltraSource ion source\n  - ESI source on the side to attach\n\n\n## Procedure\n*Estimated timing: less than 10 minute*\n\n### Switch TimsControl to 'Standby' mode\n1. Verify that the IonOpticks column from an Evosep System is disconnected. If not disconnect them.\n   Critical step: This could be an opportune moment to perform the weekly LC system maintenance.\n2. In timsControl, verfy that the software is in standby mode\n3. Navigate to the 'Source', then proceed to 'SourceType' and select 'ESI'. However, do not activate the source at this moment (Figure 1).\n\n### Remove UltraSource\n4. Disconnect the power supply of the oven.\n5. Disconnect the filter tubing.\n6. Rotate the top right and bottom left handles of the UltraSource housing by 180 degrees.\n7. Slide off the UltraSource housing from both the source door and the glass capillary.\n8. Set the UltraSource housing on the bench.\n   Critical step: Never attempt to open the source door while the UltraSource is connected with the glass capillary (Figure 2). Such an action will cause breakage of the glass capillary.\n9. Remove the source door by opening it and unhinging it, similar to the motion of a normal door.\n10. Set it aside (Figure 2).\n\n### Mount Apollo ESI source\n11. Put on gloves.\n12. Attach the capillary cap securely to the glass capillary (Figure 3)\n   Critical step: Try not to block the vacuum flow to avoid contamination of the timsTOF for instance by blocking the whole of the capillary cap.\n13. Attach the spray shield to the desolvation stage housing.\n14. Hinging the ESI source\xa0housing into position as it would be a door and close it.\n15. 
Connect the sample inlet of the ESI source and the peak tubing lines, which originate from the syringe, by turning it clock-wise.\n16. Connect the nebulizer gas inlet to the N₂ line.\n\n### Prepare the setup by loading the syringe with Tuning Mix\n17. Remove old solvent.\n18. Withdraw new Tuning Mix liquid.\n19. Ensure it is devoid of air bubbles.\n20. Connect the syringe to the peak tubing of the sample line.\n21. Mount this syringe within the external syringe pump setup. The golden button allows to move the syringe holder to arrange the syringe as in Figure 3.\n22. Press some solved out of the syringe to fill the sample line.\n\n### Switch TimsControl to 'Operate' mode\n23. Activate the ESI source within the TimsControl software (Figure 1).\n24. Transition the instrument into 'operate' mode by clicking on the on/off symbol.\n25. Begin flow at the syringe, ensuring that the following settings are in place in the tab 'Source' (Figure 1): Syringe: Hamilton 500 µL; Flow Rate: 3 µL/min. Press start. Keep the button next to start/stop pressed until a signal is observed in the 'Chromatogram View' and 'TIMS View' windows, which usually happens within 30 seconds.\n\n## Expected Results\n- Signal intensity should reach approximately 1.5x10^7\n- Stable signal in in timsControl (in the windows for Mobilogram (timsView) and mass spectrum)\n\n## Figures\n### Figure 1: TimsControl settings before changing source\n\n### Figure 2: UltraSource.\n\n### Figure 3: Apollo ESI source and gas flow pressure control wheels.\n\n## References\n1. Skowronek, P., Wallmann, G., Wahle, M. et al. An accessible workflow for high-sensitivity proteomics using parallel accumulation–serial fragmentation (PASEF). Nat Protoc (2025). https://doi.org/10.1038/s41596-024-01104-w\n2. TODO: timsTOF user manual\n",
        "ai_protocol": "",
        "user_protocol_rating": {},
        "comments": "",
    },
    {
        "eval_set_name": "MagneticBeadBasedEnrichmentPlasma",
        "protocol_type": "regular_wetlab",
        "activity_type": "plasma_enrichment",
        "user_prompt": """Generate a protocol based on these notes: #Magnetic beads - Sample preparation\n\n Aliquot 10 µl bead slurry\nWash 2x with ddH2O\nRemove supernatant\nAdd 100 µl buffer as defined in table\nDependent on what buffer you want to use for the beads → either specified by the bead protocol itself or Philipps buffer screen\nAdd 10 µl plasma \nor platelet contaminated plasma if needed\nIncubate 30 min at 37°C\nCentrifuge at 2000 g, discard supernatant\nWash with 100 µl buffer\nDiscard supernatant\nWash with 100 µl buffer\nDiscard supernatant\n50 µl lysis buffer\n100 mM Tris, 40 mM CAA, 10 mM TCEP\nfor 40 ml buffer: 149.6 mg CAA, 526.6 mg Tris and 0.8 mL 0.5M TCEP\nor\nfor 40 ml buffer: 149.6 mg CAA, 526.6 mg Tris and 100.1 mg TCEP → check pH !!!!!\n10 min 90°C\nSpin down\nAdd 10 µl of trypsin/lysC stock: per sample: 8 µl lysis buffer, 1 µl trypsin, 1 µl lysC per sample\n1200 µl lysis buffer, 150 µl trypsin, 150 µl lysC\nDigest for XX h\nAdd 50 µl 0.2% TFA\nMix\nTransfer 100 µl on a fresh plate, discard bead plate\nApproximation: 10 µl of plasma using the enrichment should results in 800 ng peptides; Load 20 µl of the transferred 100 µl digest (is approx. 160 ng)""",
        "input_type": "text",
        "ground_truth_protocol": "Protocol: Magnetic Bead-Based Protein Enrichment and On-Bead Digestion from Plasma\n\n## Abstract\nThis protocol details a method for the enrichment of proteins from low-volume plasma samples using magnetic beads. The procedure involves protein binding, washing, on-bead lysis with simultaneous reduction and alkylation, followed by enzymatic digestion. The resulting peptide mixture is suitable for subsequent analysis by liquid chromatography-mass spectrometry (LC-MS).\n\n## Materials\n\n### Equipment\n-   Pipettes (P10, P200, P1000) and corresponding pipette tips\n-   Microcentrifuge tubes (1.5 mL) or a 96-well plate\n-   Magnetic rack compatible with tubes or plates\n-   Thermomixer or heat block capable of reaching 90°C with shaking\n-   Centrifuge (for plates or tubes)\n\n### Reagents\n-   Magnetic beads (e.g., MagResyn HILIC, ReSyn Biosciences)\n-   Human plasma\n-   **Binding/Wash Buffer:** The composition is dependent on the specific application or bead manufacturer's instructions.\n-   **Lysis Buffer (100 mM Tris, 40 mM CAA, 10 mM TCEP):** To prepare 40 mL, combine:\n    -   526.6 mg Tris base\n    -   149.6 mg 2-chloroacetamide (CAA)\n    -   0.8 mL of 0.5 M TCEP solution, or 100.1 mg of solid TCEP\n    -   Adjust to a final volume of 40 mL with ddH₂O.\n    ! **CRITICAL STEP:** If using solid TCEP, verify and adjust the pH of the final buffer to ~8.0.\n-   **Enzymes:** Sequencing-grade Trypsin and Lys-C protease stocks (e.g., 0.5 µg/µL).\n-   **Digestion Mix:** Prepare fresh before use. For each sample, mix:\n    -   8 µL Lysis Buffer\n    -   1 µL Trypsin stock\n    -   1 µL Lys-C stock\n-   **Acidification Solution:** 0.2% (v/v) Trifluoroacetic acid (TFA) in ddH₂O.\n-   Double-distilled water (ddH₂O)\n\n## Procedure\n*Estimated timing: 1.5 hours + overnight digestion*\n\n### Bead Preparation and Protein Binding\n1.  Aliquot 10 µL of the magnetic bead slurry into a microcentrifuge tube or well of a 96-well plate.\n2.  
Place the tube/plate on a magnetic rack to capture the beads. Once the supernatant is clear, carefully remove and discard it.\n3.  Wash the beads by adding 100 µL of ddH₂O, resuspending, capturing the beads on the magnetic rack, and discarding the supernatant. Repeat this step once for a total of two washes.\n4.  Resuspend the washed beads in 100 µL of the selected Binding/Wash Buffer.\n5.  Add 10 µL of plasma to the bead suspension.\n6.  Incubate the mixture for 30 minutes at 37°C with gentle shaking to facilitate protein binding.\n\n### Washing and On-Bead Digestion\n7.  Following incubation, capture the beads on the magnetic rack and discard the supernatant.\n8.  Wash the beads twice with 100 µL of Binding/Wash Buffer per wash. For each wash, resuspend the beads completely before capturing them on the magnet and discarding the supernatant.\n9.  Add 50 µL of Lysis Buffer to the washed beads.\n10. Incubate for 10 minutes at 90°C with shaking. This step lyses the bound proteins and simultaneously performs reduction and alkylation.\n11. Briefly centrifuge the tube/plate to collect condensation, then allow it to cool to room temperature.\n12. Add 10 µL of the freshly prepared Digestion Mix to each sample.\n13. Incubate overnight (e.g., 12-16 hours) at 37°C with gentle shaking to digest the proteins.\n\n### Peptide Recovery\n14. To stop the digestion, add 50 µL of 0.2% TFA solution to each sample and mix well by pipetting. The final volume is approximately 110 µL.\n15. Place the tube/plate on the magnetic rack to capture the beads.\n16. Carefully transfer 100 µL of the supernatant, which contains the final peptide solution, to a fresh tube or plate for LC-MS analysis. 
Discard the plate containing the residual beads.\n\n## Expected Results\n-   Processing 10 µL of plasma with this method should yield approximately 800 ng of total peptides.\n-   For a standard LC-MS analysis, loading 20 µL of the final 100 µL peptide solution corresponds to an injection of approximately 160 ng of peptide material.\n\n## Figures\n### Figure 1: Workflow for Magnetic Bead-Based Proteomic Sample Preparation\n[Placeholder for a diagram illustrating the key steps: (A) Bead washing and protein binding from plasma. (B) On-bead lysis, reduction, and alkylation. (C) Enzymatic digestion. (D) Peptide elution and separation from beads.]",
        "ai_protocol": "# Protocol: On-Bead Protein Digestion of Plasma Samples for Proteomic Analysis\n\n## Abstract\nThis protocol details a robust method for the enrichment and subsequent proteolytic digestion of proteins directly on magnetic beads. The procedure is optimized for plasma samples and involves protein binding to functionally-specific magnetic beads, stringent washing steps to remove non-specific binders, and an efficient on-bead digestion using a Trypsin/Lys-C mixture. The resulting peptide mixture is suitable for direct analysis by liquid chromatography-mass spectrometry (LC-MS/MS). This approach minimizes sample handling losses and is readily adaptable for various affinity-based proteomics workflows, such as targeted protein pulldowns or analysis of post-translationally modified sub-proteomes.\n\n## Materials\n\n### Equipment\n- **Magnetic separator:** A magnetic rack or device compatible with 1.5 mL microcentrifuge tubes or 96-well plates.\n- **Centrifuge:** A microcentrifuge capable of reaching at least 2,000 x g, preferably refrigerated.\n- **Thermomixer or Heat Block:** Capable of incubation at 37°C and 90°C with shaking.\n- **Pipettes:** A set of calibrated single-channel pipettes (e.g., P20, P200, P1000) and corresponding tips.\n- **Microcentrifuge tubes or 96-well plates:** Low-protein-binding 1.5 mL tubes or 96-well PCR/storage plates (e.g., Eppendorf, cat. no. 0030124332).\n\n### Reagents\n- **Magnetic Beads:** The choice of magnetic beads is dependent on the experimental goal (e.g., streptavidin-coated for biotinylated targets, antibody-conjugated for immunoprecipitation). The protocol assumes a slurry format.\n- **Human Plasma:** Collected in EDTA or citrate tubes. 
Can be used fresh or previously frozen at -80°C.\n    * **Optional:** Platelet-contaminated plasma can be used if relevant to the experimental question.\n- **Deionized, distilled water (ddH₂O):** Ultrapure, MS-grade.\n- **Binding/Wash Buffer:** This buffer is user-defined and dependent on the chemistry of the magnetic beads and the protein interaction being studied. Refer to the manufacturer's instructions for the beads or a validated buffer screen (e.g., \"Philipp's buffer screen\").\n- **Lysis and Digestion Buffer (LDB):** 100 mM Tris, 10 mM TCEP, 40 mM Chloroacetamide (CAA).\n    * To prepare 40 mL:\n        * Tris base: 526.6 mg\n        * Chloroacetamide (CAA): 149.6 mg ! **CAUTION:** CAA is toxic and a carcinogen. Handle with appropriate personal protective equipment (PPE).\n        * 0.5 M TCEP solution: 0.8 mL\n        * Add ddH₂O to a final volume of 40 mL.\n    * **Alternative preparation from solids:** Combine Tris and CAA as above, then add 114.7 mg of TCEP-HCl. Adjust pH to ~8.0-8.5 before adding water to the final volume. **CRITICAL** Always verify the pH after preparation.\n- **Trypsin, MS-grade:** (e.g., Promega, cat. no. V5111). Resuspend to a stock concentration of 0.5 µg/µL in a suitable buffer (e.g., 50 mM acetic acid).\n- **Lys-C, MS-grade:** (e.g., Promega, cat. no. V1671). Resuspend to a stock concentration of 0.5 µg/µL in a suitable buffer (e.g., ddH₂O).\n- **Trifluoroacetic acid (TFA):** (e.g., Thermo Fisher Scientific, cat. no. 28904). Prepare a 0.2% (v/v) solution in ddH₂O. ! **CAUTION:** TFA is corrosive. Handle in a fume hood with appropriate PPE.\n\n## Procedure\n*Estimated timing: ~1.5 hours hands-on time, ~12-18 hours for overnight digestion.*\n\n### Bead Preparation and Protein Binding\n*Timing: ~45 minutes*\n\n1.  In a microcentrifuge tube, aliquot 10 µL of the magnetic bead slurry.\n2.  Place the tube on a magnetic separator. 
Once the beads are captured and the supernatant is clear, carefully aspirate and discard the supernatant.\n3.  Wash the beads by removing the tube from the magnet, adding 200 µL of ddH₂O, and gently resuspending the beads.\n4.  Recapture the beads on the magnetic separator and discard the supernatant. Repeat this wash step once for a total of two washes.\n5.  Resuspend the washed beads in 100 µL of the user-defined Binding/Wash Buffer.\n6.  Add 10 µL of plasma to the bead suspension.\n    * **CRITICAL STEP:** Ensure plasma is fully thawed and vortexed before addition. If starting with frozen plasma, centrifuge at high speed (e.g., 14,000 x g, 10 min, 4°C) to pellet any cryoprecipitates and use the clear supernatant.\n7.  Incubate the mixture for 30 minutes at 37°C with gentle shaking in a thermomixer to facilitate protein binding.\n\n### Washing\n*Timing: ~15 minutes*\n\n8.  Following incubation, briefly centrifuge the tube at 2,000 x g for 1 minute to pellet the beads.\n    * **Alternative:** Use a magnetic separator to capture the beads. Centrifugation is recommended to ensure collection of all beads.\n9.  Place the tube on the magnetic separator, allow the beads to be captured, and carefully discard the supernatant.\n10. Wash the beads by adding 100 µL of Binding/Wash Buffer and gently resuspending.\n11. Capture the beads on the magnetic separator and discard the supernatant.\n12. Repeat steps 10-11 once for a total of two washes. After the final wash, remove the supernatant as completely as possible.\n    * **CRITICAL STEP:** Complete removal of the wash buffer is essential to avoid interfering with the subsequent digestion step. Use a small-volume pipette tip to remove any residual liquid after the main aspiration.\n\n### On-Bead Digestion\n*Timing: ~25 minutes hands-on, 4 hours to overnight for digestion*\n\n13. Resuspend the washed beads in 50 µL of Lysis and Digestion Buffer (LDB).\n14. 
Incubate the sample for 10 minutes at 90°C with shaking in a thermomixer to denature, reduce, and alkylate the bound proteins.\n15. After incubation, briefly centrifuge the tube to collect any condensate from the lid and walls. Let the sample cool to room temperature (~5-10 minutes).\n16. Prepare a fresh Trypsin/Lys-C enzyme master mix. For each sample, combine:\n    * Lysis and Digestion Buffer (LDB): 8 µL\n    * Trypsin stock (0.5 µg/µL): 1 µL\n    * Lys-C stock (0.5 µg/µL): 1 µL\n    * *Note: Prepare a master mix for all samples, including a 10% excess to account for pipetting inaccuracies. For example, for 10 samples, mix 88 µL LDB, 11 µL Trypsin, and 11 µL Lys-C.*\n17. Add 10 µL of the enzyme master mix to each sample. The total volume is now ~60 µL.\n18. Digest the proteins by incubating the samples overnight (12-16 hours) at 37°C with gentle shaking.\n    * **PAUSE POINT:** The digestion time can be optimized. A shorter digestion of 4 hours may be sufficient for some applications.\n\n### Sample Acidification and Peptide Collection\n*Timing: ~10 minutes*\n\n19. To stop the digestion, add 50 µL of 0.2% TFA solution to each sample. Mix well by pipetting up and down. The final volume will be approximately 110 µL.\n20. Place the tubes on the magnetic separator to capture the beads.\n21. Carefully transfer 100 µL of the supernatant, which now contains the digested peptides, to a fresh, clean 96-well plate or microcentrifuge tube. Discard the plate/tubes containing the beads. 
The sample is now ready for LC-MS/MS analysis or further fractionation.\n\n## Expected Results\n-   Starting with 10 µL of human plasma, this enrichment and digestion procedure is expected to yield approximately 800 ng of total peptides in the final 110 µL volume.\n-   A standard injection volume for LC-MS/MS analysis would be 20 µL of the final peptide solution, corresponding to a load of approximately 150-160 ng of peptides on the column.\n\n## Figures\n### Figure 1: Workflow for On-Bead Digestion of Plasma Proteins\n[Placeholder for a flowchart diagram illustrating the main steps: (A) Bead Washing, (B) Protein Binding from Plasma, (C) Post-binding Washes, (D) On-bead Lysis and Digestion, and (E) Peptide Elution and Collection.]",
        "user_protocol_rating": {
            "Completeness": 5,
            "Technical Accuracy": 3,
            "Logical Flow": 5,
            "Safety": 4,
            "Formatting": 5,
        },
        "comments": "",
    },
]

df_benchmark_data = pd.DataFrame(list_new_rows)
# df_benchmark_data = pd.concat([df_benchmark_data, pd.DataFrame([new_row_dict])], ignore_index=True)
df_benchmark_data.to_csv(benchmark_dataset_path, index=False)
df_benchmark_data
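
Writing the benchmark table with `to_csv` stores each `user_protocol_rating` dict as its string representation, so reading the file back yields text rather than dicts. Below is a minimal sketch of how those cells could be parsed again and averaged. `parse_rating` and `mean_rating` are hypothetical helper names, not part of `proteomics_lab_agent`, and the sketch assumes the cells contain plain Python literals as produced above.

```python
import ast
import math


def parse_rating(cell):
    """Turn a CSV cell back into a rating dict; empty or missing cells become {}."""
    # pandas reads empty cells as NaN (a float), so treat that as "no rating".
    if cell is None or (isinstance(cell, float) and math.isnan(cell)):
        return {}
    if isinstance(cell, dict):
        return cell
    text = str(cell).strip()
    if not text:
        return {}
    # literal_eval only accepts Python literals, so it does not execute code.
    rating = ast.literal_eval(text)
    if not isinstance(rating, dict):
        raise ValueError(f"Expected a dict of ratings, got {type(rating).__name__}")
    return rating


def mean_rating(rating):
    """Average the 1-5 scores of one protocol; None when nothing was rated."""
    if not rating:
        return None
    return sum(rating.values()) / len(rating)


example_cell = "{'Completeness': 5, 'Technical Accuracy': 3, 'Logical Flow': 5, 'Safety': 4, 'Formatting': 5}"
example_rating = parse_rating(example_cell)
example_mean = mean_rating(example_rating)
```

After reloading with `pd.read_csv(benchmark_dataset_path)`, the helper could be applied column-wise, for example `df["user_protocol_rating"].apply(parse_rating)`.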

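The evaluation prompt in the following cell spells out a 14-column comparison table by hand, one row per protocol section. As a sketch under stated assumptions, the rows could instead be generated from the criteria names so that every row has the same number of cells. `CRITERIA`, `build_table_header`, and `build_table_row` are illustrative names introduced here, not part of the Vertex AI SDK or `proteomics_lab_agent`.

```python
CRITERIA = ["Completeness", "Technical Accuracy", "Logical Flow", "Safety", "Formatting"]


def build_table_header(criteria):
    """Return the markdown header line and separator line for the comparison table."""
    cells = ["Section", "Ground Truth Protocol", "AI-Generated Protocol"]
    for name in criteria:
        cells += [f"{name} Rating (1-5)", f"{name} Explanation"]
    cells.append("Notes")
    header = "| " + " | ".join(cells) + " |"
    separator = "|" + "-|" * len(cells)
    return header, separator


def build_table_row(section, criteria):
    """Return one placeholder row with a rating and explanation cell per criterion."""
    cells = [section, "[Text from ground truth]", "[Text from AI protocol]"]
    for _ in criteria:
        cells += ["[1-5]", "[Explanation]"]
    cells.append("[Additional observations or comments]")
    return "| " + " | ".join(cells) + " |"


header, separator = build_table_header(CRITERIA)
sections = ["Title", "Abstract", "Expected Results", "Figures", "References"]
table = "\n".join([header, separator] + [build_table_row(s, CRITERIA) for s in sections])
```

The resulting `table` string could be interpolated into the instruction text; adding or renaming a criterion then only requires changing `CRITERIA`.
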
In [None]:
# Experimental evaluation strategy using Vertex AI's built-in eval pipeline.
# In the end, I chose a more elaborate custom approach to better capture nuances.
import pandas as pd
from dotenv import load_dotenv
from IPython.display import Markdown
from vertexai.evaluation import EvalTask, PointwiseMetric, PointwiseMetricPromptTemplate
from vertexai.preview.evaluation import notebook_utils

load_dotenv()

EXPERIMENT_NAME = "video-eval-experiment"

protocol_quality_metric = PointwiseMetric(
    metric="protocol_quality",
    metric_prompt_template=PointwiseMetricPromptTemplate(
        criteria={
            "completeness": "The AI-generated protocol covers the content of the ground truth protocol. Note what is present in both protocols, what is present in the ground truth but missing from the AI-generated protocol, and what is present in the AI-generated protocol but not in the ground truth.",
            "technical_accuracy": "The AI-generated protocol demonstrates scientific understanding by properly distinguishing between different techniques and equipment and using appropriate scientific terminology.",
            "logical_flow": "The workflow maintains the chronological sequence of the procedure.",
            "safety": "The appropriate identification and emphasis of critical cautions, warnings, and safety measures.",
            "formatting": "The AI-generated protocol matches the formatting of the ground truth protocol.",
        },
        rating_rubric={
            "5": "Very good: The AI-generated protocol demonstrates exceptional quality in this aspect, with no significant flaws or omissions. Fully meets or exceeds the ground truth protocol.",
            "4": "Good: The AI-generated protocol demonstrates strong quality in this aspect, with only minor shortcomings that don't significantly impact effectiveness. Closely aligns with the ground truth protocol.",
            "3": "Ok: The AI-generated protocol contains most essential elements but has noticeable differences from the ground truth protocol that might slightly impact its effectiveness.",
            "2": "Bad: The AI-generated protocol has significant deficiencies in multiple criteria from the ground truth, missing or wrongly displaying important information that would likely impact its effectiveness.",
            "1": "Very bad: The AI-generated protocol fails to meet minimum standards in this aspect, with fundamental flaws or critical omissions that render the content potentially unusable or unsafe.",
        },
        instruction="""You are an expert evaluator. Your task is to evaluate the quality of a protocol generated by an AI model by comparing it to a ground truth protocol. You will conduct a systematic, section-by-section analysis to determine how well the AI-generated protocol aligns with the ground truth protocol. You should first read the protocols carefully and then evaluate the quality of the AI-generated protocol based on the evaluation criteria below.\n\n
        ## Evaluation Steps

        * Step 1: Read the 'Ground truth protocol' thoroughly and write it down again word-by-word (Verbatim).

        * Step 2: Read the 'AI-generated protocol' thoroughly and write it down again word-by-word (Verbatim).

        * Step 3: Compare each section of the AI-generated protocol with its counterpart in the ground truth protocol:

        1. Title
        2. Abstract
        3. Materials
        4. Each step from Procedure
        5. Expected Results
        6. Figures
        7. References

        For each section, note how well it fulfills the evaluation criteria (Completeness, Technical Accuracy, Logical Flow, Safety, Formatting) using the 1-5 scale based on the Rating Rubric. Treat the ground truth as the gold standard and the more trustworthy protocol.

        * Step 4: Comparative Analysis Table
        Create a table summarizing your findings:

        | Section | Ground Truth Protocol | AI-Generated Protocol | Completeness Rating (1-5) | Completeness Explanation | Technical Accuracy Rating (1-5) | Technical Accuracy Explanation | Logical Flow Rating (1-5) | Logical Flow Explanation | Safety Rating (1-5) | Safety Explanation | Formatting Rating (1-5) | Formatting Explanation | Notes |
        |-|-|-|-|-|-|-|-|-|-|-|-|-|-|
        | Title | [Text from ground truth] | [Text from AI protocol] | [1-5] | [Explanation] | [1-5] | [Explanation] | [1-5] | [Explanation] | [1-5] | [Explanation] | [1-5] | [Explanation] | [Additional observations or comments] |
        | Abstract | [Text from ground truth] | [Text from AI protocol] | [1-5] | [Explanation] | [1-5] | [Explanation] | [1-5] | [Explanation] | [1-5] | [Explanation] | [1-5] | [Explanation] | [Additional observations or comments] |
        | Materials - e.g. Equipment | [Text from ground truth] | [Text from AI protocol] | [1-5] | [Explanation] | [1-5] | [Explanation] | [1-5] | [Explanation] | [1-5] | [Explanation] | [1-5] | [Explanation] | [Additional observations or comments] |
        | Materials - e.g. Reagents | [Text from ground truth] | [Text from AI protocol] | [1-5] | [Explanation] | [1-5] | [Explanation] | [1-5] | [Explanation] | [1-5] | [Explanation] | [1-5] | [Explanation] | [Additional observations or comments] |
        | Procedure - Step 1 | [Text from ground truth] | [Text from AI protocol] | [1-5] | [Explanation] | [1-5] | [Explanation] | [1-5] | [Explanation] | [1-5] | [Explanation] | [1-5] | [Explanation] | [Additional observations or comments] |
        | Procedure - Step 2 | [Text from ground truth] | [Text from AI protocol] | [1-5] | [Explanation] | [1-5] | [Explanation] | [1-5] | [Explanation] | [1-5] | [Explanation] | [1-5] | [Explanation] | [Additional observations or comments] |
        | Procedure - Step 3 | [Text from ground truth] | [Text from AI protocol] | [1-5] | [Explanation] | [1-5] | [Explanation] | [1-5] | [Explanation] | [1-5] | [Explanation] | [1-5] | [Explanation] | [Additional observations or comments] |
        | Procedure - Step 4 | [Text from ground truth] | [Text from AI protocol] | [1-5] | [Explanation] | [1-5] | [Explanation] | [1-5] | [Explanation] | [1-5] | [Explanation] | [1-5] | [Explanation] | [Additional observations or comments] |
        | Procedure - Step 5 | [Text from ground truth] | [Text from AI protocol] | [1-5] | [Explanation] | [1-5] | [Explanation] | [1-5] | [Explanation] | [1-5] | [Explanation] | [1-5] | [Explanation] | [Additional observations or comments] |
        | [Continue for all steps] | | | | | | | | | | | | | |
        | Expected Results | [Text from ground truth] | [Text from AI protocol] | [1-5] | [Explanation] | [1-5] | [Explanation] | [1-5] | [Explanation] | [1-5] | [Explanation] | [1-5] | [Explanation] | [Additional observations or comments] |
        | Figures | [Text from ground truth] | [Text from AI protocol] | [1-5] | [Explanation] | [1-5] | [Explanation] | [1-5] | [Explanation] | [1-5] | [Explanation] | [1-5] | [Explanation] | [Additional observations or comments] |
        | References | [Text from ground truth] | [Text from AI protocol] | [1-5] | [Explanation] | [1-5] | [Explanation] | [1-5] | [Explanation] | [1-5] | [Explanation] | [1-5] | [Explanation] | [Additional observations or comments] |

        * Step 5: Overall Summary:
        Compare the two protocols across all sections, both overall and separately for each criterion: Completeness, Technical Accuracy, Logical Flow, Safety, Formatting.

        * Step 6: Overall Rating
        Provide an overall rating (1-5) based on step 5 and the Rating Rubric.
        """,
    ),
)

eval_dataset = pd.DataFrame(
    {
        "prompt": [
            'Generate a protocol based on this video "gs://ai-proteomics-advisor/input_for_protocol/ConnectingColumnSampleLine_protocolCorrect.MP4".',
        ],
        "response": [
            "# Connecting an IonOpticks Aurora Series Column to a Bruker timsTOF for LC-MS/MS Analysis\n\n## Abstract\nThis protocol provides a step-by-step guide for the correct physical and electrical connection of an IonOpticks Aurora Series UHPLC column to the CaptiveSpray Ultra ion source of a Bruker timsTOF series mass spectrometer. The procedure ensures a secure, leak-free fluidic connection and proper grounding, which are essential for generating a stable electrospray. Additionally, the protocol outlines the necessary software commands in Bruker timsControl and HyStar to initiate mobile phase flow and confirm system readiness for data acquisition.\n\n## Materials\n\n### Equipment\n| Equipment | Description | Source/Cat. No. |\n|---|---|---|\n| Mass Spectrometer | Bruker timsTOF SCP, Pro, or similar model equipped with a CaptiveSpray Ultra ion source. | Bruker Daltonik |\n| LC System | An LC system capable of nano-flow rates, such as the Evosep One. | Evosep |\n| UHPLC Column | IonOpticks Aurora Series UHPLC column with integrated emitter. | IonOpticks |\n| Pliers | Needle-nose or similar pliers for securely holding the column fitting. | Standard lab supplier |\n| Gloves | Nitrile or latex laboratory gloves. | Standard lab supplier |\n| Nano-connector adapter | (Optional) A tool to aid in finger-tightening nano-fittings. | - |\n\n### Software\n| Software | Version | Purpose | Source/Link |\n|---|---|---|---|\n| Bruker timsControl | Not specified in video | Instrument control and data monitoring for timsTOF systems. | Provided with Bruker timsTOF systems |\n| Bruker HyStar | Not specified in video | Manages the LC system and autosampler, and coordinates with timsControl. | Provided with Bruker LC-MS systems |\n\n## Procedure\n*Estimated timing: 5-10 minutes (including heating time)*\n\n### System Preparation and State Verification\n\n1.  In the Bruker timsControl software, verify the status of the mass spectrometer. 
Ensure the instrument is in **‘Standby’** mode. If the status is ‘Operating’, click the corresponding button to switch it to ‘Standby’ before proceeding.\n2.  **CRITICAL STEP:** IonOpticks columns with integrated emitters should not be subjected to mobile phase flow for extended periods without an applied voltage (i.e., while the MS is in 'Standby'). This can lead to poor spray quality or column damage. Minimize the time between starting the idle flow and beginning the analysis.\n\n### Physical Connection of the Column\n\n3.  Wear clean laboratory gloves.\n4.  Confirm that the IonOpticks column is already seated correctly within the column oven of the CaptiveSpray Ultra source.\n5.  Carefully take the fused silica sample line originating from the LC system. Inspect the nano-connector at its tip and wipe it clean with a gloved finger or a lint-free wipe to remove any particulates or residual solvent.\n6.  Use pliers to gently but firmly grip the stationary metal fitting on the IonOpticks column. This prevents the column from twisting during tightening.\n7.  Align the sample line's nano-connector with the column fitting and carefully screw it in by hand until it is **finger-tight**.\n\n    > **! CAUTION:** Do not use tools to tighten the fitting and avoid overtightening. Overtightening can crush the capillary or damage the threads of the column fitting, leading to leaks and poor chromatographic performance.\n\n### Positioning, Grounding, and Heating\n\n8.  Verify the physical position of the column oven assembly. If necessary, loosen the positioning screw and adjust the assembly to bring the column's emitter tip as close as possible to the ion source's transfer capillary without touching it.\n9.  Ensure the grounding screw on the column oven is making firm contact with the metal column fitting. A proper electrical ground is essential for a stable electrospray. For longer columns, alternative grounding points on the oven assembly may be used.\n10. 
Close the lid of the column oven. A distinct click confirms it is securely shut.\n11. Observe the three green LEDs on the side of the column oven. The LEDs will blink while the oven is heating to its setpoint (e.g., 50 °C). The lights will become solid green once the target temperature has been reached and is stable.\n\n### Establishing Electrospray and Signal\n\n12. In the timsControl software, switch the instrument's status from **‘Standby’** to **‘Operating’**. This will apply the necessary voltages to the ion source.\n13. Navigate to the Bruker HyStar software.\n14. Initiate the mobile phase flow to the column. Right-click within the sample table and select **Idle flow > Run** from the context menu. This will start a low-flow rate to equilibrate the column and establish a spray.\n15. Return to the timsControl software. Monitor the **Chromatogram View** for a stable Total Ion Chromatogram (TIC) and the **Spectrum View** for a consistent ion signal.\n\n## Expected Results\n- A secure, leak-free connection between the sample line and the UHPLC column.\n- The column oven is closed and has reached its set temperature, indicated by solid green LEDs.\n- A stable TIC signal is observed in timsControl, confirming a robust and consistent electrospray. The system is now ready for data acquisition.\n\n## Figures\n\n### Figure 1: Connecting the Sample Line to the Column\n![Placeholder for an image showing the three key steps of the physical connection: (A) Holding the column fitting steady with pliers. (B) Hand-tightening the sample line's nano-connector into the column fitting. (C) The final, finger-tight connection with the grounding screw in contact with the fitting.](placeholder_image_url)\n***(A) The column fitting is held with pliers to prevent rotation. (B) The sample line is screwed in until finger-tight. 
(C) The completed connection, ensuring the grounding screw makes contact.***\n\n### Figure 2: Software Workflow for Initiating Spray\n![Placeholder for a composite image showing screenshots of the software interfaces: (A) The 'Standby' button in timsControl. (B) The 'Idle flow -> Run' command in the HyStar right-click menu. (C) The timsControl interface showing a stable TIC signal after the system is set to 'Operating' and idle flow has commenced.](placeholder_image_url)\n***(A) Ensure the mass spectrometer is in 'Standby' in timsControl before making connections. (B) Use the HyStar menu to start the 'Idle flow'. (C) After switching to 'Operating' mode, monitor for a stable signal in timsControl.***\n",
        ],
        "reference": [
            "# Connecting IonOpticks Column & sample line of Evosep\n\n## Abstract\nThis protocol explains how to connect an IonOpticks column inserted into a timsTOF Ultra with an sample line of an Evosep.\n\n\n## Materials\n\n### Equipment\n- IonOpticks Column\n- timsTOF Ultra Mass Spectrometer\n  - Equipped with UltraSource ion source\n  - Equipped with column oven for temperature control\n- Evosep One LC System\n  - with sample line\n- NanoViper Adapter (black)\n- Pliers\n\n\n## Procedure\n*Estimated timing: 3 minutes*\n\n1. Verfy that the instrument is in standing by mode. If not press the on/off button to switch from operate mode to standing by mode.\n\n### Connect column and sample line\n2. Verify that an IonOpticks column is already inserted into the UltraSource of the timsTOF Ultra.\n3. Attach a black NanoViper adapter to the LC sample line.\n4. Remove any access liquid at the top of the nanoViper of the sample line for instance by snipping it off.\n5. Hold the column fititng of the IonOpticks column with a pliers.\n6. Hand-tighten the NanoViper of the LC sample line with the column fitting (Figure 1).\n   CRITICAL STEP: Be careful not to overtighten the connection between the IonOpticks column and the timsTOF Ultra. Otherwise you can damage the column and the LC sample line.\n7. Remove the NanoViper adapter so that the oven can be tightly closed.\n8. Draw the oven closer to the UltraSource, secure it with the screw on the bottom of the oven (Figure 2).\n9. Lift and place the metal grounding screw at the column-sample line connection to establish proper ESI spray grounding with the column oven.\n10. Close the lid of the oven.\n11. Verify the temperature at 50°C for IonOpticks columns, as indicated by three illuminated LEDs on the column oven, respectively.\n\n### Switch timsTOF to operate and idle flow\n12. In timsControl, transition the instrument to the operation mode by using the on/off button.\n13. 
Navigate to Hystar and ensure that the idle flow is active. If not right-click on the Evosep logo, choosing 'preparation', and then selecting 'idle flow' and 'Run'.\n14. Return to timsControl. Check the MS signal. It should be around 2.5x10^7.\n\n\n## Expected Results\n- In timsControl, signal intensity should be above 10^7\n- Stable signal in in timsControl (in the windows for Mobilogram (timsView) and mass spectrum)\n\n\n## Figures\n### Figure 1: Connection of a sample line of a LC system with a separation column and emitter.**\n\n### Figure 2: UltraSource.**\n\n\n## References\n1. Skowronek, P., Wallmann, G., Wahle, M. et al. An accessible workflow for high-sensitivity proteomics using parallel accumulation–serial fragmentation (PASEF). Nat Protoc (2025). https://doi.org/10.1038/s41596-024-01104-w\n",
        ],
    }
)

print("Running video evaluation...")
eval_result = EvalTask(
    dataset=eval_dataset, metrics=[protocol_quality_metric], experiment=EXPERIMENT_NAME
).evaluate()
notebook_utils.display_eval_result(eval_result)

# Show the judge model's explanation for the first (and here only) dataset row
explanation = eval_result.metrics_table.iloc[0]["protocol_quality/explanation"]
display(Markdown(explanation))


# Clean-up helper (run after the evaluation has finished)
from vertexai.evaluation import EvalResult


def cleanup_experiment(eval_result: EvalResult) -> None:
    """Delete the experiment run that was created for this evaluation."""
    from google.cloud import aiplatform

    # The run and experiment names are stored in the metadata of the result
    aiplatform.ExperimentRun(
        run_name=eval_result.metadata["experiment_run"],
        experiment=eval_result.metadata["experiment"],
    ).delete()
    print("Experiment run deleted successfully")


cleanup_experiment(eval_result)
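
The cell above displays the explanation for the first row of the metrics table only. When the evaluation dataset holds several rows, it can help to collect the score and explanation of every row. The following is a minimal sketch, not part of the pipeline: `summarize_metric` is a hypothetical helper, the `protocol_quality/score` column name is an assumption (only `protocol_quality/explanation` appears above), and the small DataFrame is a made-up stand-in for `eval_result.metrics_table`.

```python
import pandas as pd


def summarize_metric(metrics_table: pd.DataFrame, metric: str) -> list[str]:
    """Return one Markdown snippet per evaluated row (hypothetical helper)."""
    score_col = f"{metric}/score"  # assumed column name
    explanation_col = f"{metric}/explanation"
    snippets = []
    for idx, row in metrics_table.iterrows():
        # Fall back to "n/a" if the table has no score column
        score = row.get(score_col, "n/a")
        snippets.append(f"### Row {idx} (score: {score})\n\n{row[explanation_col]}")
    return snippets


# Stand-in for eval_result.metrics_table; the values are invented for illustration
demo_table = pd.DataFrame(
    {
        "protocol_quality/score": [4, 2],
        "protocol_quality/explanation": ["Steps complete.", "Two steps missing."],
    }
)
snippets = summarize_metric(demo_table, "protocol_quality")
for snippet in snippets:
    print(snippet)
```

In the notebook, each snippet could be passed to `display(Markdown(...))` instead of `print`.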

In [None]:
# Custom evaluation strategy
import logging
import sys

# force=True removes existing root handlers first, so re-running this cell
# does not produce duplicate log lines.
logging.basicConfig(
    level=logging.INFO,  # Use DEBUG for maximum detail
    format="%(asctime)s - %(name)s - %(levelname)s - %(message)s",
    stream=sys.stdout,
    force=True,
)

import pandas as pd
from proteomics_lab_agent.sub_agents.protocol_generator_agent import agent

from eval.eval_protocol_generation.evaluator import _run_single_evaluation

df_benchmark_data = pd.read_csv(
    "/Users/patriciaskowronek/Documents/proteomics_lab_agent/eval/eval_protocol_generation/benchmark_data.csv"
)
# Evaluate a single benchmark entry: iloc[1] selects the second row (0-based index)
results = _run_single_evaluation(
    df_benchmark_data.iloc[1],
    "dummy",
    1,
    {
        "name": "regular_function",
        "function": agent.generate_protocols,
    },
)
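
The call above evaluates a single benchmark row. To cover the whole benchmark, the same kind of call can be wrapped in a loop that collects one result per row and records failures instead of stopping at the first error. This is a sketch under stated assumptions: `evaluate_all_rows` and `evaluate_row` are hypothetical names, `evaluate_row` stands in for a wrapper around `_run_single_evaluation` (whose signature and return value are not shown in this notebook), and the demo data and its `input` column are invented.

```python
from typing import Any, Callable

import pandas as pd


def evaluate_all_rows(
    df: pd.DataFrame, evaluate_row: Callable[[pd.Series], Any]
) -> pd.DataFrame:
    """Apply evaluate_row to every row and keep going when a row fails."""
    records = []
    for idx, row in df.iterrows():
        try:
            records.append({"row": idx, "result": evaluate_row(row), "error": None})
        except Exception as exc:  # record the failure and continue with the next row
            records.append({"row": idx, "result": None, "error": str(exc)})
    return pd.DataFrame(records)


def demo_evaluate_row(row: pd.Series) -> int:
    """Hypothetical stand-in for a wrapper around the real evaluation call."""
    if not row["input"]:
        raise ValueError("empty input")
    return len(row["input"])


# Invented demo data; a real run would load the benchmark CSV instead
demo_benchmark = pd.DataFrame({"input": ["a.mp4", "", "bc.mp4"]})
summary = evaluate_all_rows(demo_benchmark, demo_evaluate_row)
print(summary)
```

Collecting errors per row keeps a long benchmark run from being lost because a single entry fails.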