Notebook 1: Setup & Raw Data Ingestion

* Goal: Load the raw PRIS data, understand its structure, and confirm our project's objective.

Here is the Python code for this notebook. It loads the file you've provided, Reactor Database - master(pris.iaea.xlsx - in.csv, and performs the initial inspection: the column names, the first five rows, and the dtypes with non-null counts.
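
One thing to watch before running the cell: the file name quoted above ends in "- in.csv", while the path in the code ends in ".xlsx" and is passed to pd.read_excel. If it is ever unclear which format a file really is, one option is to look at its content instead of its extension. The following is only a minimal sketch, not part of the notebook itself. The helper names detect_table_format and load_table are hypothetical. It relies on .xlsx workbooks being ZIP archives, so a legacy .xls file would be misclassified as CSV.

```python
from pathlib import Path
import zipfile


def detect_table_format(path):
    """Return 'excel' if the file is a ZIP container (as .xlsx is), else 'csv'.

    Hypothetical helper: it inspects the file content and ignores the extension.
    """
    path = Path(path)
    if not path.is_file():
        raise FileNotFoundError(path)
    return "excel" if zipfile.is_zipfile(path) else "csv"


def load_table(path):
    """Load a table with the pandas reader that matches the detected format."""
    # Imported here so that detect_table_format stays usable without pandas.
    import pandas as pd

    if detect_table_format(path) == "excel":
        return pd.read_excel(path)
    return pd.read_csv(path)
```

With this in place, df_raw = load_table(file_path) could stand in for the pd.read_excel call, assuming file_path is defined as in the cell below.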

In [2]:
import pandas as pd
import numpy as np

# Set display options for better inspection
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', 100)

print("--- Notebook 1: Setup & Raw Data Ingestion ---")

# 1. Define the file path
file_path = 'Reactor Database - master(pris.iaea.xlsx'

# 2. Load the dataset
try:
    df_raw = pd.read_excel(file_path)
    print(f"Successfully loaded file: {file_path}")

    # 3. Initial Inspection: Column Names
    print("\n--- All Column Names ---")
    print(df_raw.columns.tolist())

    # 4. Initial Inspection: First 5 Rows
    print("\n--- First 5 Rows (head) ---")
    print(df_raw.head())

    # 5. Initial Inspection: Data Info (Types and Nulls)
    print("\n--- DataFrame Info (Dtypes and Non-Null Counts) ---")
    df_raw.info()

    print("\nNotebook 1 complete. We have loaded and inspected the raw data.")

except FileNotFoundError:
    print(f"Error: The file '{file_path}' was not found.")
except Exception as e:
    print(f"An error occurred: {e}")

--- Notebook 1: Setup & Raw Data Ingestion ---
Successfully loaded file: Reactor Database - master(pris.iaea.xlsx

--- All Column Names ---
['Country', 'Reactor name', 'Status', 'Type', 'Model', 'Owner', 'Operator', 'Net Capacity, MWe', 'Design Net Capacity, MWe', 'Gross Capacity, MWe', 'Thermal Capacity, MWt', 'Construction Start Date', 'First Critically Date', 'Grid Connection Date', 'Commercial Operation Date', 'Electricity Supplied, TW.h', 'Energy Availability Factor', 'Operation Factor', 'Energy Unavailability Factor', 'Load Factor', 'Year', 'Electricity Supplied, GW.h', 'Reference Unit Power, MW', 'Annual Time On Line, h', 'Operation Factor.1', 'Energy Availability Factor Annual', 'Energy Availability Factor Cumulative', 'Load Factor Annual', 'Load Factor Cumulative']

--- First 5 Rows (head) ---
     Country Reactor name       Status  Type     Model  \
0  ARGENTINA     ATUCHA-1  Operational  PHWR  PHWR KWU   
1  ARGENTINA     ATUCHA-1  Operational  PHWR  PHWR KWU   
2  ARGENTINA