## 1. Installation and Setup

First, ensure you have SciCoDa installed:

```bash
pip install scicoda
```

For full functionality, including the PDB Chemical Component Dictionary (CCD):

```bash
pip install "scicoda[ccd]"
```

**Important:** CCD datasets are **not bundled** with the package because of their size (~70 MB). The first call to `scicoda.pdb.ccd()` downloads and processes them automatically; this one-time setup may take a few minutes and requires the `ccd` optional dependencies.
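
Before running the examples, it can help to confirm which packages are visible in the current environment. The sketch below uses only the standard library. The helper names `is_installed` and `installed_version` are hypothetical and not part of SciCoDa, and the check covers only the top-level `scicoda` and `polars` packages, since the exact contents of the `ccd` extra are not listed in this guide.

```python
from importlib import metadata, util
from typing import Optional


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be located without importing it."""
    return util.find_spec(module_name) is not None


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed distribution version, or None if it is missing."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


# Report the state of the two packages this guide imports.
for name in ("scicoda", "polars"):
    print(f"{name}: installed={is_installed(name)}, version={installed_version(name)}")
```

If either line reports `installed=False`, rerun the corresponding `pip install` command above.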

In [None]:
# Import the package
import scicoda
import polars as pl

# Set Polars configuration for better display
pl.Config.set_tbl_rows(10)
pl.Config.set_tbl_cols(-1)

print("SciCoDa successfully imported!")

## 2. Periodic Table Data

SciCoDa provides comprehensive periodic table data from PubChem with additional properties from the Blue Obelisk repository.

In [None]:
# Get the complete periodic table
periodic_table = scicoda.atom.periodic_table()

print(f"Total elements: {len(periodic_table)}")
print(f"\nColumns: {periodic_table.columns}")
print("\nFirst 5 elements:")
periodic_table.head()

### Exploring Element Properties

In [None]:
# Get properties of a specific element (Carbon)
carbon = periodic_table.filter(pl.col("symbol") == "C")
print("Carbon properties:")
carbon

In [None]:
# Get all noble gases
noble_gases = periodic_table.filter(pl.col("block") == "noble gas")
print("Noble gases:")
noble_gases.select(["z", "symbol", "name", "mass", "vdwr"])

In [None]:
# Find elements with specific properties
# Example: Elements with high electronegativity (> 3.0)
high_en = periodic_table.filter(
    pl.col("en_pauling") > 3.0
).select(["symbol", "name", "en_pauling", "block"])

print("Elements with high electronegativity (> 3.0):")
high_en

In [None]:
# Get all elements in a given period (here, period 3)
period_3 = periodic_table.filter(
    pl.col("period") == 3
).select(["z", "symbol", "name", "group", "block", "mass"])

print("Period 3 elements:")
period_3

### Statistical Analysis

In [None]:
# Calculate average properties by block
block_stats = periodic_table.group_by("block").agg([
    pl.count("symbol").alias("count"),
    pl.col("mass").mean().alias("avg_mass"),
    pl.col("vdwr").mean().alias("avg_vdwr"),
    pl.col("en_pauling").mean().alias("avg_electronegativity")
]).sort("count", descending=True)

print("Average properties by block:")
block_stats

## 3. AutoDock Atom Types

AutoDock atom types are essential for molecular docking simulations. SciCoDa provides these definitions with hydrogen bonding properties.

In [None]:
# Get AutoDock atom types
autodock_types = scicoda.atom.autodock_atom_types()

print(f"Total AutoDock atom types: {len(autodock_types)}")
print("\nAll AutoDock atom types:")
autodock_types

In [None]:
# Filter hydrogen bond acceptors
hb_acceptors = autodock_types.filter(pl.col("hbond_acceptor"))
print("Hydrogen bond acceptor atom types:")
hb_acceptors

In [None]:
# Filter hydrogen bond donors
hb_donors = autodock_types.filter(pl.col("hbond_donor"))
print("Hydrogen bond donor atom types:")
hb_donors

In [None]:
# Get atom types by element
carbon_types = autodock_types.filter(pl.col("element") == "C")
print("Carbon atom types in AutoDock:")
carbon_types.select(["type", "description", "hbond_acceptor", "hbond_donor"])

## 4. PDB Chemical Component Dictionary

The Chemical Component Dictionary (CCD) contains detailed information about all chemical components found in PDB structures.

**Note:** The following examples require `pip install "scicoda[ccd]"`. The CCD datasets (~70 MB) are not bundled with the package; on first use they are downloaded automatically from the PDB in CIF format and processed, a one-time setup that may take a few minutes.

### Basic Component Information

In [None]:
# Get information about all components in the CCD
comp = scicoda.pdb.ccd()
print("All components information:")
comp

In [None]:
# Get information about ATP (adenosine triphosphate)
# When specific components are requested, SciCoDa evaluates the query lazily:
# only the matching rows are loaded from disk, not the entire dataset.
comp_atp = scicoda.pdb.ccd("ATP")
print("ATP component information:")
comp_atp

In [None]:
# Get multiple components at once
nucleotides = scicoda.pdb.ccd(
    ["ATP", "ADP", "AMP", "GTP", "GDP"],
)
print("Nucleotide information:")
nucleotides.select(["id", "name", "formula", "formula_weight"])

### Atom Details

In [None]:
# Get atom information for ATP
atp_atoms = scicoda.pdb.ccd("ATP", category="chem_comp_atom")
print(f"ATP has {len(atp_atoms)} atoms")
print("\nFirst 10 atoms:")
atp_atoms.head(10)

In [None]:
# Count atoms by element in ATP
atom_counts = atp_atoms.group_by("type_symbol").agg([
    pl.len().alias("count")  # pl.len() replaces the deprecated no-argument pl.count() (Polars >= 0.20.5)
]).sort("count", descending=True)

print("Atom composition of ATP:")
atom_counts

### Bond Information

In [None]:
# Get bond information for ATP
atp_bonds = scicoda.pdb.ccd("ATP", category="chem_comp_bond")
print(f"ATP has {len(atp_bonds)} bonds")
print("\nFirst 10 bonds:")
atp_bonds.head(10)

In [None]:
# Count bond types
bond_counts = atp_bonds.group_by("value_order").agg([
    pl.len().alias("count")  # pl.len() replaces the deprecated no-argument pl.count() (Polars >= 0.20.5)
]).sort("count", descending=True)

print("Bond types in ATP:")
bond_counts

### Chemical Identifiers and Descriptors

In [None]:
# Get chemical identifiers (e.g. systematic names)
atp_identifiers = scicoda.pdb.ccd(
    comp_id="ATP",
    category="pdbx_chem_comp_identifier"
)
print("ATP chemical identifiers:")
atp_identifiers

In [None]:
# Get chemical descriptors (SMILES, InChI, InChIKey)
atp_descriptors = scicoda.pdb.ccd(
    comp_id="ATP",
    category="pdbx_chem_comp_descriptor"
)
print("ATP chemical descriptors:")
atp_descriptors

### Amino Acid Components

In [None]:
# Get information about amino acids
# Note: Use variant="aa" for amino acid components
amino_acids = scicoda.pdb.ccd(
    comp_id=["ALA", "GLY", "VAL", "LEU", "ILE"],
    category="chem_comp",
    variant="aa"
)
print("Amino acid information:")
amino_acids.select(["id", "name", "formula", "type"])

## 5. Advanced Usage

### Combining Data Sources

In [None]:
# Example: Get van der Waals radii for elements in a compound

# Get atoms in ATP
atp_atoms = scicoda.pdb.ccd(comp_id="ATP", category="chem_comp_atom")

# Get periodic table
ptable = scicoda.atom.periodic_table()

# Join to get van der Waals radii
atp_with_radii = atp_atoms.join(
    ptable.select(["symbol", "vdwr", "vdwr_bo"]),
    left_on="type_symbol",
    right_on="symbol",
    how="left"
)

print("ATP atoms with van der Waals radii:")
atp_with_radii.select(["atom_id", "type_symbol", "charge", "vdwr", "vdwr_bo"])

## Summary

This quickstart guide covered:

1. ✅ Loading and exploring periodic table data
2. ✅ Accessing AutoDock atom type definitions
3. ✅ Querying the PDB Chemical Component Dictionary
4. ✅ Combining data from multiple sources

For more information:
- Read the [README.md](README.md) for detailed API documentation
- Check the inline docstrings: `help(scicoda.atom.periodic_table)`
- Report issues or contribute at the GitHub repository

Happy coding! 🚀