# MIHCSME OMERO - Quick Start Demo

A simplified demonstration of the MIHCSME OMERO package with robust error handling.

## 1. Imports and Setup

In [1]:
from pathlib import Path
import json
import pandas as pd

from mihcsme_omero import parse_excel_to_model
from mihcsme_omero.models import (
    MIHCSMEMetadata,
    AssayCondition,
    InvestigationInformation,
    StudyInformation,
    AssayInformation,
)

print("✅ Imports successful!")

✅ Imports successful!


## 2. Parse Excel File

Update the path below to point to your MIHCSME Excel file.

In [2]:
# Update this path!
excel_path = Path("../MIHCSME Template_MH.xlsx")

if not excel_path.exists():
    raise FileNotFoundError(f"Excel file not found: {excel_path.absolute()}")

print(f"📄 Parsing: {excel_path.name}")
metadata = parse_excel_to_model(excel_path)

print("\n✅ Successfully parsed!")
print(f"   Wells: {len(metadata.assay_conditions)}")
print(f"   Reference sheets: {len(metadata.reference_sheets)}")

📄 Parsing: MIHCSME Template_MH.xlsx

✅ Successfully parsed!
   Wells: 72
   Reference sheets: 4


## 3. Inspect Investigation Information

In [3]:
print("📋 Investigation Information:\n")
for group_name, fields in metadata.investigation_information.groups.items():
    print(f"{group_name}:")
    for key, value in fields.items():
        print(f"  • {key}: {value}")
    print()

📋 Investigation Information:

DataOwner:
  • First Name: Mazene
  • Last Name: Hochane
  • User name: hochanem
  • Institute: Universiteit Leiden
  • E-Mail Address: test@leidenuniv.nl
  • ORCID investigator: https://orcid.org/0000-0002-7990-6010

InvestigationInformation:
  • Project ID: 1337
  • Investigation Title: What are we seeing here
  • Investigation internal ID: WAWSH
  • Investigation description: cells



## 4. Inspect Study Information

In [5]:
print("🔬 Study Information:\n")
for group_name, fields in metadata.study_information.groups.items():
    print(f"{group_name}:")
    for key, value in fields.items():
        print(f"  • {key}: {value}")
    print()

🔬 Study Information:

Study:
  • Study Title: Microscopy investiation
  • Study internal ID: 1337
  • Study Description: Interesting stuff
  • Study Key Words: [microscopy, high-content screening]

Biosample:
  • Biosample description: IPSc
  • Biosample Organism: Human
  • Number of cell lines used: 1

Library:
  • Library File Name: whatever.xlsx
  • Library File Format: xlsx
  • Library Type: List
  • Quality Control Description: Nothing

Protocols:
  • HCS library protocol: http://eln

Plate:
  • Plate type: uclear
  • Plate type Manufacturer: Geiner
  • Plate type Catalog number: 1337



## 5. Inspect Assay Information

In [6]:
print("🧪 Assay Information:\n")
for group_name, fields in metadata.assay_information.groups.items():
    print(f"{group_name}:")
    for key, value in fields.items():
        print(f"  • {key}: {value}")
    print()

🧪 Assay Information:

Assay:
  • Assay Title: Look no further
  • Assay internal ID: 1234
  • Assay Type: high content analysis of cells

ImageData:
  • Image number of pixelsX: 512
  • Image number of pixelsY: 512
  • Image number of  z-stacks: 7
  • Image number of channels: 3
  • Image number of timepoints: 1
  • Image sites per well: 1

ImageAcquisition:
  • Microscope id: 3444

Specimen:
  • Channel 1 visualization method: Hoechst 33258
  • Channel 1 entity: DNA
  • Channel 1 label: Nuclei
  • Channel 1 id: 0
  • Channel 2 visualization method: EGFP
  • Channel 2 entity: H2B
  • Channel 2 label: Chromatin
  • Channel 2 id: 1



## 6. View Assay Conditions (Wells)

Display well-level metadata in a DataFrame.

In [7]:
# Convert to DataFrame for easy viewing
conditions_data = []
for condition in metadata.assay_conditions:
    row = {
        "Plate": condition.plate,
        "Well": condition.well,
        **condition.conditions,
    }
    conditions_data.append(row)

df = pd.DataFrame(conditions_data)
print(f"📊 Assay Conditions ({len(df)} wells):\n")
df.head(10)

📊 Assay Conditions (72 wells):



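|   | Plate | Well | Concentration | Unit | RepID |
|---|---|---|---|---|---|
| 0 | plate_day_7 | B01 | 5.0 | uM | 1 |
| 1 | plate_day_7 | B02 | 5.0 | uM | 1 |
| 2 | plate_day_7 | B03 | 6.0 | uM | 1 |
| 3 | plate_day_7 | B04 | 6.0 | uM | 1 |
| 4 | plate_day_7 | B05 | 6.5 | uM | 1 |
| 5 | plate_day_7 | B06 | 6.9 | uM | 1 |
| 6 | plate_day_7 | B07 | 7.3 | uM | 1 |
| 7 | plate_day_7 | B08 | 7.7 | uM | 1 |
| 8 | plate_day_7 | B09 | 8.1 | uM | 1 |
| 9 | plate_day_7 | B10 | 8.5 | uM | 1 |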
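
The flattening step above can be sketched without the package or pandas. This is a minimal illustration, not the library's implementation: `WellRecord` is a hypothetical stand-in for `AssayCondition`, and the second record deliberately omits `RepID` to show how a missing condition could be surfaced as `None` so that every row shares the same columns.

```python
from dataclasses import dataclass, field


@dataclass
class WellRecord:
    """Hypothetical stand-in for AssayCondition (plate, well, conditions)."""
    plate: str
    well: str
    conditions: dict = field(default_factory=dict)


def to_rows(records):
    """Flatten records into dicts that all share the same set of columns."""
    # Union of every condition key, sorted for a stable column order.
    keys = sorted({key for record in records for key in record.conditions})
    rows = []
    for record in records:
        row = {"Plate": record.plate, "Well": record.well}
        for key in keys:
            # None marks a condition that this well does not define.
            row[key] = record.conditions.get(key)
        rows.append(row)
    return rows


records = [
    WellRecord("plate_day_7", "B01", {"Concentration": 5.0, "Unit": "uM", "RepID": 1}),
    WellRecord("plate_day_7", "B02", {"Concentration": 5.0, "Unit": "uM"}),
]
rows = to_rows(records)
print(rows[1])
```

A list of dicts shaped like this can be passed directly to `pd.DataFrame(rows)`, which is what the cell above does with the real objects.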


## 7. Get Unique Plates

In [8]:
plates = df['Plate'].unique()
print(f"📋 Unique plates: {list(plates)}")
print("\n🔢 Wells per plate:")
print(df['Plate'].value_counts())

📋 Unique plates: ['plate_day_7']

🔢 Wells per plate:
Plate
plate_day_7    72
Name: count, dtype: int64


## 8. Get All Condition Keys

In [9]:
all_keys = set()
for condition in metadata.assay_conditions:
    all_keys.update(condition.conditions.keys())

print(f"🔑 Condition keys found ({len(all_keys)}):")
for key in sorted(all_keys):
    print(f"  • {key}")

🔑 Condition keys found (3):
  • Concentration
  • RepID
  • Unit
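
Beyond listing the keys, it can be useful to check whether every well defines every key. The sketch below is one possible approach using only the standard library; the `wells` list is made-up stand-in data shaped like `condition.conditions`, not values read from the template.

```python
from collections import Counter

# Stand-in data: one dict per well, shaped like condition.conditions.
wells = [
    {"Concentration": 5.0, "Unit": "uM", "RepID": 1},
    {"Concentration": 6.0, "Unit": "uM", "RepID": 1},
    {"Concentration": 6.5, "Unit": "uM"},  # RepID intentionally missing
]

# Count how many wells define each condition key.
key_counts = Counter(key for conditions in wells for key in conditions)

# Keys that are absent from at least one well.
incomplete = {key: n for key, n in key_counts.items() if n < len(wells)}

for key in sorted(key_counts):
    print(f"{key}: defined in {key_counts[key]} of {len(wells)} wells")
```

With the real metadata, the same counter could be fed from `metadata.assay_conditions` by iterating over `condition.conditions` for each well.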


## 9. Export to JSON

Save the parsed metadata to JSON format.

In [10]:
output_json = Path("metadata_export.json")

with open(output_json, "w") as f:
    json.dump(metadata.model_dump(), f, indent=2)

print(f"✅ Exported to: {output_json.absolute()}")
print(f"   Size: {output_json.stat().st_size / 1024:.1f} KB")

✅ Exported to: /var/home/maartenpaul/Documents/GitHub/MIHCSME_OMERO/examples/metadata_export.json
   Size: 52.3 KB
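
A quick way to gain confidence in an export is to read the file back and compare it with what was written. The following is a minimal sketch under stated assumptions: `payload` is a small hypothetical dictionary standing in for `metadata.model_dump()`, and a temporary directory is used so nothing is left on disk.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical stand-in for metadata.model_dump(); the real structure is larger.
payload = {
    "assay_conditions": [
        {"plate": "plate_day_7", "well": "B01",
         "conditions": {"Concentration": 5.0, "Unit": "uM", "RepID": 1}},
    ],
}

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "metadata_export.json"
    # Write and re-read with an explicit encoding so the result does not
    # depend on the platform default.
    path.write_text(json.dumps(payload, indent=2), encoding="utf-8")
    reloaded = json.loads(path.read_text(encoding="utf-8"))

round_trip_ok = reloaded == payload
print(f"Round trip preserved the data: {round_trip_ok}")
```

Note that this comparison only holds for JSON-native types (strings, numbers, booleans, lists, dicts, and `None`); values such as dates or tuples would need converting before export.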


## 10. Convert to OMERO Dictionary Format

In [11]:
# Convert to legacy OMERO format
omero_dict = metadata.to_omero_dict()

print("📦 OMERO Dictionary Structure:")
print(f"\n   Top-level keys: {list(omero_dict.keys())}")
print(f"\n   Investigation groups: {list(omero_dict.get('InvestigationInformation', {}).keys())}")
print(f"   Study groups: {list(omero_dict.get('StudyInformation', {}).keys())}")
print(f"   Assay groups: {list(omero_dict.get('AssayInformation', {}).keys())}")
print(f"\n   Total wells in AssayConditions: {len(omero_dict.get('AssayConditions', []))}")

📦 OMERO Dictionary Structure:

   Top-level keys: ['InvestigationInformation', 'StudyInformation', 'AssayInformation', 'AssayConditions', '_fbbiVisualizationMethods', '_fbbiImagingMethods', '_efo_studytypes', '_efo_assaytypes']

   Investigation groups: ['DataOwner', 'InvestigationInformation']
   Study groups: ['Study', 'Biosample', 'Library', 'Protocols', 'Plate']
   Assay groups: ['Assay', 'ImageData', 'ImageAcquisition', 'Specimen']

   Total wells in AssayConditions: 72
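
The nested section, group, and field layout shown above can be flattened into simple triples for inspection. This is an illustrative sketch only: the `section/group` label format is a hypothetical choice for readability and is not claimed to be how the package names things in OMERO, and `example` is a small stand-in for `omero_dict`.

```python
def flatten_groups(section_name, groups):
    """Turn {group: {key: value}} into (label, key, value) triples."""
    return [
        (f"{section_name}/{group}", key, str(value))
        for group, fields in groups.items()
        for key, value in fields.items()
    ]


# Small stand-in for omero_dict, using group names from the output above.
example = {
    "InvestigationInformation": {
        "DataOwner": {"Institute": "Universiteit Leiden"},
        "InvestigationInformation": {"Project ID": 1337},
    },
}

triples = []
for section, groups in example.items():
    triples.extend(flatten_groups(section, groups))

for label, key, value in triples:
    print(f"{label} | {key} = {value}")
```

Applying this to the real dictionary would require skipping `AssayConditions` and the underscore-prefixed reference entries, since the output above shows those as top-level keys alongside the three grouped sections.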


## 11. Create Metadata Programmatically

Example of creating metadata from scratch in Python.

In [None]:
# Create a simple metadata object
custom_metadata = MIHCSMEMetadata(
    investigation_information=InvestigationInformation(
        groups={
            "Project": {
                "Investigation Title": "Demo Investigation",
                "Investigation Description": "Created programmatically",
            }
        }
    ),
    study_information=StudyInformation(
        groups={
            "Study": {
                "Study Title": "Demo Study",
                "Study Description": "Example study",
            }
        }
    ),
    assay_information=AssayInformation(
        groups={
            "Assay": {
                "Assay Title": "Demo Assay",
                "Assay Type": "High Content Screening",
            }
        }
    ),
    assay_conditions=[
        AssayCondition(
            plate="DemoPlate",
            well="A1",  # Automatically normalized to "A01"
            conditions={
                "Compound": "DMSO",
                "Concentration": "0.1%",
            },
        ),
        AssayCondition(
            plate="DemoPlate",
            well="A2",
            conditions={
                "Compound": "Drug X",
                "Concentration": "10 μM",
            },
        ),
    ],
)

print("✅ Created custom metadata")
print(f"   Wells: {len(custom_metadata.assay_conditions)}")
print(f"   Well names (auto-normalized): {[c.well for c in custom_metadata.assay_conditions]}")

## 12. Test Well Name Validation

In [12]:
print("✅ Valid well formats (auto-normalized):\n")
valid_wells = ["A1", "A01", "B12", "P48"]
for well in valid_wells:
    condition = AssayCondition(plate="Test", well=well, conditions={})
    print(f"   '{well}' → '{condition.well}'")

print("\n❌ Invalid well formats:\n")
invalid_wells = ["Q1", "A49", "AA1", "A0"]
for well in invalid_wells:
    try:
        condition = AssayCondition(plate="Test", well=well, conditions={})
        print(f"   '{well}': Unexpected success!")
    except ValueError as e:
        print(f"   '{well}': {str(e)[:60]}")

✅ Valid well formats (auto-normalized):

   'A1' → 'A01'
   'A01' → 'A01'
   'B12' → 'B12'
   'P48' → 'P48'

❌ Invalid well formats:

   'Q1': 1 validation error for AssayCondition
well
  Value error, In
   'A49': 1 validation error for AssayCondition
well
  Value error, In
   'AA1': 1 validation error for AssayCondition
well
  Value error, In
   'A0': 1 validation error for AssayCondition
well
  Value error, In
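
The behaviour seen above can be approximated with a small standalone function. This is a sketch inferred only from the examples in this cell, not the package's actual validator: it assumes rows `A` through `P`, columns 1 through 48, and zero-padding of the column to two digits. `normalize_well` is a hypothetical name.

```python
import re

# Assumption: one row letter A-P followed by one or two digits.
WELL_PATTERN = re.compile(r"([A-P])([0-9]{1,2})")


def normalize_well(name: str) -> str:
    """Return the well name with a zero-padded column, e.g. 'A1' -> 'A01'."""
    match = WELL_PATTERN.fullmatch(name)
    if match is None:
        raise ValueError(f"Invalid well name: {name!r}")
    row, column = match.group(1), int(match.group(2))
    # Assumption: columns run from 1 to 48 inclusive.
    if not 1 <= column <= 48:
        raise ValueError(f"Column out of range in well name: {name!r}")
    return f"{row}{column:02d}"


for well in ["A1", "A01", "B12", "P48"]:
    print(f"{well!r} -> {normalize_well(well)!r}")

for well in ["Q1", "A49", "AA1", "A0"]:
    try:
        normalize_well(well)
    except ValueError as error:
        print(f"{well!r}: {error}")
```

Using `fullmatch` rather than `match` means trailing characters are rejected, which is why `AA1` fails on the pattern itself while `A49` and `A0` fail on the range check.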


## Summary

This notebook demonstrated:

✅ Parsing MIHCSME Excel files  
✅ Inspecting metadata (Investigation, Study, Assay)  
✅ Working with well-level conditions  
✅ Exporting to JSON  
✅ Converting to OMERO format  
✅ Creating metadata programmatically  
✅ Well name validation  

### Next Steps:

1. **CLI Usage**: Try the command-line interface
   ```bash
   mihcsme parse file.xlsx --output metadata.json
   mihcsme validate file.xlsx
   ```

2. **OMERO Upload**: Use `upload_metadata_to_omero()` to push metadata to OMERO

3. **Integration**: Incorporate into your analysis workflows