# CSV Query Template (BMC Discovery)

Use this notebook to load a pre-exported CSV (e.g. from a DisMAL query) into pandas, optionally rename columns and coerce selected columns to integers, and re-save it with a `Discovery Instance` column prepended.


## Requirements

The template relies on `pandas` and the Python standard library. Uncomment the `%pip install` line in the cell below if pandas is missing from your environment.


In [None]:
# %pip install -q pandas

import pandas as pd
from pathlib import Path
from typing import Dict, List


In [None]:
# --- User Inputs ---------------------------------------------------------
CSV_PATH = Path('path/to/query_results.csv')   # Update to your CSV export
DISCOVERY_INSTANCE = 'prod'                    # Label to insert into the dataset
OUTPUT_DIR = Path('output_csv')                # Where to write the processed CSV
OUTPUT_FILENAME = 'query_results.csv'          # Name of the output file

COLUMN_MAP: Dict[str, str] = {
    # Example: 'Original Column Name': 'Renamed Column'
}

NUMERIC_COLUMNS: List[str] = []  # Columns to coerce to integer (if present)


## Load the CSV
The helpers below read the CSV from disk and apply the optional column renaming and integer conversion.


In [None]:
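def load_csv(path: Path) -> pd.DataFrame:
    """Read the CSV at `path`, expanding a leading `~` first."""
    expanded = path.expanduser()
    if not expanded.exists():
        raise FileNotFoundError(f"CSV not found: {expanded}")
    return pd.read_csv(expanded)

def apply_column_map(df: pd.DataFrame) -> pd.DataFrame:
    """Rename columns per COLUMN_MAP; keys that match no column are ignored."""
    return df.rename(columns=COLUMN_MAP) if COLUMN_MAP else df

def convert_numeric_columns(df: pd.DataFrame) -> pd.DataFrame:
    """Coerce each NUMERIC_COLUMNS entry present in `df` to nullable Int64."""
    converted = df.copy()
    for col in NUMERIC_COLUMNS:
        if col in converted.columns:
            # Unparseable values become <NA>; fractional values make the Int64 cast raise.
            converted[col] = pd.to_numeric(converted[col], errors='coerce').astype('Int64')
    return converted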
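
To see what the renaming and integer coercion do, here is a minimal sketch on a small in-memory frame. The column names (`Host Name`, `CPU Count`) and their values are invented for illustration, and the map and column list are local variables (`column_map`, `numeric_columns`) instead of the notebook's globals, so the snippet runs on its own.

```python
import pandas as pd

# Hypothetical sample standing in for a CSV export; names and values are illustrative only.
sample = pd.DataFrame({
    "Host Name": ["alpha", "beta", "gamma"],
    "CPU Count": ["4", "8", "unknown"],  # one entry that cannot be parsed as a number
})

column_map = {"Host Name": "hostname", "CPU Count": "cpu_count"}
numeric_columns = ["cpu_count", "not_in_frame"]  # columns absent from the frame are skipped

renamed = sample.rename(columns=column_map)
for col in numeric_columns:
    if col in renamed.columns:
        # Unparseable entries become NaN, then <NA> once cast to the nullable Int64 dtype.
        renamed[col] = pd.to_numeric(renamed[col], errors="coerce").astype("Int64")

print(renamed.dtypes)
print(renamed)
```

The nullable `Int64` dtype is what lets the column stay integer-typed while still holding missing values; a plain `int64` column cannot represent them.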


In [None]:
df = load_csv(CSV_PATH)
df.insert(0, 'Discovery Instance', DISCOVERY_INSTANCE)
df = apply_column_map(df)
df = convert_numeric_columns(df)
display(df.head(10))
print(f"Rows loaded: {len(df)}")


## Save processed CSV
The output directory is created if it does not exist. An existing file with the same name is overwritten.


In [None]:
OUTPUT_DIR.mkdir(parents=True, exist_ok=True)
output_path = OUTPUT_DIR / OUTPUT_FILENAME
df.to_csv(output_path, index=False)
print(f'Saved to {output_path}')
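
One way to confirm the save step is to read the file back and compare it with what was written. The sketch below assumes a made-up two-row frame and writes into a temporary directory instead of `OUTPUT_DIR`, so it leaves nothing behind.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Illustrative frame; in the notebook this would be the processed `df`.
frame = pd.DataFrame({
    "Discovery Instance": ["prod", "prod"],
    "hostname": ["alpha", "beta"],
})

with tempfile.TemporaryDirectory() as tmp:
    out_dir = Path(tmp) / "output_csv" / "nested"  # does not exist yet
    out_dir.mkdir(parents=True, exist_ok=True)     # creates intermediate directories too
    out_path = out_dir / "query_results.csv"
    frame.to_csv(out_path, index=False)            # index=False avoids an extra unnamed column

    reloaded = pd.read_csv(out_path)

print(list(reloaded.columns))
print(len(reloaded) == len(frame))
```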


---
### Notes
- Ensure `CSV_PATH` points to a valid CSV export from your query or report.
- Adjust `COLUMN_MAP` and `NUMERIC_COLUMNS` as needed for your dataset.
- You can extend the template with additional pandas transformations or visualisations.
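
As one possible extension, the sketch below drops duplicate rows and counts rows per instance and category. The `os_type` column and all values are hypothetical; substitute whatever columns your export contains.

```python
import pandas as pd

# Hypothetical processed data; `os_type` is an invented column for illustration.
frame = pd.DataFrame({
    "Discovery Instance": ["prod", "prod", "prod", "prod"],
    "hostname": ["alpha", "beta", "gamma", "beta"],
    "os_type": ["Linux", "Windows", "Linux", "Windows"],
})

# Drop exact duplicate rows, then count the remaining rows per instance and OS type.
deduped = frame.drop_duplicates()
summary = (
    deduped.groupby(["Discovery Instance", "os_type"])
    .size()
    .reset_index(name="row_count")
)
print(summary)
```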
