# 🦠 Viral AI Variants Explorer

Query viral variants from the VirusSeq collection on Viral AI.

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/mfiume/omics-ai-python-library/blob/main/Viral_AI_Variants_Explorer.ipynb)

In [None]:
# Install and import
!pip install git+https://github.com/mfiume/omics-ai-python-library.git --quiet
from omics_ai import list_collections, list_tables, get_schema_fields, query
from IPython.display import display  # explicit import; used below to render DataFrames
import pandas as pd
print("Ready for viral genomics!")

In [None]:
# Connect to Viral AI
collections = list_collections("viral")
print(f"Viral AI: {len(collections)} collections")

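# Find VirusSeq (default to None so a missing collection does not raise StopIteration)
virusseq = next((c for c in collections if c['slugName'] == 'virusseq'), None)
if virusseq is None:
    raise ValueError("Collection 'virusseq' not found on Viral AI")
print(f"Found: {virusseq['name']}")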

In [None]:
# List tables in VirusSeq
tables = list_tables("viral", "virusseq")
print(f"VirusSeq tables: {len(tables)}")
for table in tables:
    print(f"- {table['display_name']} ({table['size']} rows)")

In [None]:
# Query variants table
print("Querying viral variants...")
result = query("viral", "virusseq", "collections.virusseq.variants", limit=10)

data = result['data']
print(f"Retrieved {len(data)} variants")

# Display as table
if data:
    df = pd.DataFrame(data)
    print(f"\nViral variants table ({df.shape[0]} rows, {df.shape[1]} columns):")
    display(df.head(10))

In [None]:
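# Query with filters - a specific chromosome (contig)
# NOTE: "chr1" is an example value; check the table's `chrom` values for the names it actually uses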
filters = {
    "chrom": [{
        "operation": "EQ",
        "value": "chr1",
        "type": "STRING"
    }]
}

print("Querying chromosome 1 variants...")
result_filtered = query("viral", "virusseq", "collections.virusseq.variants",
                        filters=filters, limit=10)

data_filtered = result_filtered['data']
print(f"Retrieved {len(data_filtered)} variants from chromosome 1")

# Display filtered results as table
if data_filtered:
    df_filtered = pd.DataFrame(data_filtered)
    print(f"\nChromosome 1 variants ({df_filtered.shape[0]} rows, {df_filtered.shape[1]} columns):")
    display(df_filtered.head(10))
else:
    print("No variants found on chromosome 1")

In [None]:
# Summary analysis
print("Summary of viral variants data:")
if 'df' in locals() and not df.empty:
    print(f"- Total variants queried: {len(df)}")
    print(f"- Data columns: {len(df.columns)}")
    
    # Show data types of the first five columns
    print("- Sample data types:")
    for col in df.columns[:5]:
        print(f"  {col}: {df[col].dtype}")
    
    # List up to three numeric columns, if any exist
    numeric_cols = df.select_dtypes(include=['number']).columns
    if len(numeric_cols) > 0:
        print(f"- Numeric columns: {list(numeric_cols)[:3]}...")
else:
    print("- No data available for analysis")
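Beyond column types, a quick tally of values in one column is often a useful next step. The following is a minimal sketch using only the standard library. It assumes the rows are dictionaries, like the entries of `result['data']` appear to be; `count_by` is a hypothetical helper (not part of `omics_ai`), and the sample rows are made up for illustration, not real VirusSeq output.

```python
from collections import Counter


def count_by(rows, column):
    """Count how many rows carry each value of `column`; rows lacking it are skipped."""
    return Counter(row[column] for row in rows if column in row)


# Made-up rows for illustration only (not real VirusSeq data)
sample_rows = [
    {"chrom": "chr1", "pos": 100},
    {"chrom": "chr1", "pos": 250},
    {"chrom": "chr2", "pos": 75},
    {"pos": 10},  # no "chrom" key, so it is ignored by the tally
]

counts = count_by(sample_rows, "chrom")
print(counts.most_common())
```

With real data, `count_by(data, "chrom")` would give the same kind of tally for the rows retrieved by the query cell.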

## Done!

You've successfully queried viral genomics data from the VirusSeq collection.

**Filter examples:**
- Chromosome: `{"chrom": [{"operation": "EQ", "value": "chr1", "type": "STRING"}]}`
- Position: `{"pos": [{"operation": "GT", "value": 1000000, "type": "INTEGER"}]}`
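
Both filters above share one shape: a field name mapped to a list of condition dictionaries. A small helper can build that shape without retyping the nesting. This is a minimal sketch; `make_filter` is a hypothetical name, not a function from `omics_ai`, and it assumes only the dictionary layout shown in the two examples.

```python
def make_filter(field, operation, value, value_type):
    """Build a single-field filter dict in the shape used by the examples above.

    Hypothetical helper; not part of the omics_ai library.
    """
    return {field: [{"operation": operation, "value": value, "type": value_type}]}


# Reproduce the two example filters
chrom_filter = make_filter("chrom", "EQ", "chr1", "STRING")
pos_filter = make_filter("pos", "GT", 1000000, "INTEGER")

print(chrom_filter)
print(pos_filter)
```

Either dictionary can then be passed as the `filters=` argument to `query`, as in the filtered query cell.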

**Links:**
- [VirusSeq Collection](https://viral.ai/collections/virusseq/)
- [GitHub](https://github.com/mfiume/omics-ai-python-library)