# Notebook 5: Isotope Identification

## Introduction

### Gamma-Ray Spectroscopy

Radioactive isotopes emit characteristic gamma rays at specific energies. By identifying these "fingerprints" in a spectrum, we can determine which isotopes are present.

### NORM (Naturally Occurring Radioactive Material)

Common natural radioactive materials:
- **K-40**: Potassium-40 (1460.8 keV) - in soil, concrete, bananas
- **U-238 decay chain**: Multiple gamma lines (Ra-226, Pb-214, Bi-214)
- **Th-232 decay chain**: Tl-208 (2614.5 keV), Ac-228, others

### Applications

1. **Environmental monitoring**: Background radiation assessment
2. **Nuclear security**: Illicit material detection
3. **Medical**: Isotope verification
4. **Industrial**: Quality control, contamination detection

### Isotope Identification Workflow

1. **Energy calibration**: Convert ADC → keV (from Notebook 2)
2. **Peak finding**: Locate gamma-ray photopeaks
3. **Peak fitting**: Gaussian fit for accurate centroid and FWHM
4. **Library matching**: Compare peaks to isotope library
5. **Activity estimation**: Calculate isotope activity
6. **Decay chain analysis**: Identify parent/daughter relationships

### Common Isotopes and Their Signatures

| Isotope | Primary Energies (keV) | Half-life | Common Source |
|---------|------------------------|-----------|---------------|
| K-40 | 1460.8 | 1.25 × 10⁹ y | Soil, concrete |
| Cs-137 | 661.7 | 30.2 y | Medical, industrial |
| Co-60 | 1173.2, 1332.5 | 5.27 y | Industrial |
| Tl-208 | 2614.5, 583.2 | 3.05 min | Th-232 chain |
| Bi-214 | 609.3, 1120.3, 1764.5 | 19.9 min | U-238 chain |
| Pb-214 | 242.0, 295.2, 351.9 | 26.8 min | U-238 chain |

### Learning Objectives

1. Find peaks in gamma spectra
2. Fit Gaussian functions to peaks
3. Match peaks to isotope library
4. Identify decay chains
5. Estimate isotope activities
6. Handle complex multi-isotope spectra

In [None]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from scipy import signal, optimize
from scipy.stats import norm

plt.style.use('seaborn-v0_8-darkgrid')
plt.rcParams['figure.figsize'] = (14, 7)
np.random.seed(42)

print("✓ Libraries imported")

## 1. Isotope Library Database

In [None]:
# Comprehensive isotope library
ISOTOPE_LIBRARY = {
    'K-40': {
        'energies': [1460.8],
        'intensities': [10.67],  # % per decay
        'half_life': '1.25 Gy',
        'category': 'NORM'
    },
    'Cs-137': {
        'energies': [661.7],
        'intensities': [85.1],
        'half_life': '30.2 y',
        'category': 'Anthropogenic'
    },
    'Co-60': {
        'energies': [1173.2, 1332.5],
        'intensities': [99.9, 100.0],
        'half_life': '5.27 y',
        'category': 'Industrial'
    },
    'Na-22': {
        'energies': [511.0, 1274.5],
        'intensities': [180.0, 99.9],  # 511 gets double from annihilation
        'half_life': '2.60 y',
        'category': 'Calibration'
    },
    # U-238 decay chain
    'Ra-226': {
        'energies': [186.2],
        'intensities': [3.59],
        'half_life': '1600 y',
        'category': 'U-238 chain'
    },
    'Pb-214': {
        'energies': [242.0, 295.2, 351.9],
        'intensities': [7.43, 18.5, 35.6],
        'half_life': '26.8 m',
        'category': 'U-238 chain'
    },
    'Bi-214': {
        'energies': [609.3, 1120.3, 1764.5],
        'intensities': [45.5, 14.9, 15.4],
        'half_life': '19.9 m',
        'category': 'U-238 chain'
    },
    # Th-232 decay chain
    'Pb-212': {
        'energies': [238.6, 300.1],
        'intensities': [43.6, 3.18],
        'half_life': '10.6 h',
        'category': 'Th-232 chain'
    },
    'Tl-208': {
        'energies': [583.2, 860.6, 2614.5],
        'intensities': [85.0, 12.5, 99.8],
        'half_life': '3.05 m',
        'category': 'Th-232 chain'
    },
    'Ac-228': {
        'energies': [338.3, 911.2, 969.0],
        'intensities': [11.3, 25.8, 15.8],
        'half_life': '6.15 h',
        'category': 'Th-232 chain'
    },
    # Medical isotopes
    'I-131': {
        'energies': [364.5],
        'intensities': [81.5],
        'half_life': '8.02 d',
        'category': 'Medical'
    },
    'Tc-99m': {
        'energies': [140.5],
        'intensities': [89.0],
        'half_life': '6.01 h',
        'category': 'Medical'
    }
}

print(f"✓ Isotope library loaded with {len(ISOTOPE_LIBRARY)} isotopes")
print(f"\nCategories:")
categories = {}
for iso, data in ISOTOPE_LIBRARY.items():
    cat = data['category']
    categories[cat] = categories.get(cat, 0) + 1

for cat, count in sorted(categories.items()):
    print(f"  {cat}: {count} isotopes")
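
Library matching scans every gamma line of every isotope, so it can help to view the library as one list of lines sorted by energy. The sketch below does this for a three-isotope excerpt of the library; `flatten_library` and `library_excerpt` are illustrative names, not part of the notebook's code.

```python
# Excerpt of the library above, repeated so the example runs on its own
library_excerpt = {
    'K-40':   {'energies': [1460.8], 'intensities': [10.67]},
    'Bi-214': {'energies': [609.3, 1120.3, 1764.5], 'intensities': [45.5, 14.9, 15.4]},
    'Tl-208': {'energies': [583.2, 860.6, 2614.5], 'intensities': [85.0, 12.5, 99.8]},
}

def flatten_library(library):
    """Return (energy_keV, intensity_percent, isotope) tuples sorted by energy.

    Hypothetical helper for illustration only.
    """
    lines = [
        (energy, intensity, isotope)
        for isotope, data in library.items()
        for energy, intensity in zip(data['energies'], data['intensities'])
    ]
    return sorted(lines)

flat_lines = flatten_library(library_excerpt)
for energy, intensity, isotope in flat_lines:
    print(f"{energy:7.1f} keV  {intensity:6.2f} %  {isotope}")
```

Sorted this way, neighbouring lines such as Tl-208 at 583.2 keV and Bi-214 at 609.3 keV end up adjacent, which makes potential overlaps easier to spot.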

## 2. Generate Synthetic Multi-Isotope Spectrum

In [None]:
def generate_multi_isotope_spectrum(isotopes, activities, 
                                   acquisition_time=3600,
                                   efficiency=0.01,
                                   resolution_percent=7.0):
    """
    Generate realistic multi-isotope spectrum
    
    Parameters:
    -----------
    isotopes : list of str
        Isotope names from library
    activities : list of float
        Activities in Bq
    acquisition_time : float
        Measurement time in seconds
    efficiency : float
        Detector efficiency (0-1)
    resolution_percent : float
        Relative energy resolution (FWHM / E, in %), applied at every energy
    
    Returns:
    --------
    energies : array
        Simulated photon energies (keV): photopeaks plus continuum
    """
    all_energies = []
    
    for isotope, activity in zip(isotopes, activities):
        if isotope not in ISOTOPE_LIBRARY:
            continue
        
        data = ISOTOPE_LIBRARY[isotope]
        
        # For each gamma line
        for energy, intensity in zip(data['energies'], data['intensities']):
            # Expected count rate
            count_rate = activity * (intensity / 100.0) * efficiency
            expected_counts = count_rate * acquisition_time
            
            # Poisson fluctuation
            n_counts = np.random.poisson(expected_counts)
            
            # Generate photon energies with Gaussian broadening
            sigma = (resolution_percent / 100.0) * energy / 2.355
            peak_energies = np.random.normal(energy, sigma, n_counts)
            
            all_energies.extend(peak_energies)
    
    # Add Compton continuum (simplified)
    n_compton = int(len(all_energies) * 2)  # More Compton than photopeaks
    compton_energies = np.random.exponential(300, n_compton)
    compton_energies = compton_energies[compton_energies < 2000]
    
    all_energies.extend(compton_energies)
    
    return np.array(all_energies)

# Generate NORM spectrum (typical environmental background)
norm_isotopes = ['K-40', 'Pb-214', 'Bi-214', 'Tl-208', 'Ac-228']
norm_activities = [100, 50, 50, 30, 20]  # Bq

energies_norm = generate_multi_isotope_spectrum(
    norm_isotopes, norm_activities, 
    acquisition_time=3600,  # 1 hour
    efficiency=0.02
)

print(f"✓ Generated NORM spectrum")
print(f"  Total counts: {len(energies_norm)}")
print(f"  Energy range: {energies_norm.min():.1f} - {energies_norm.max():.1f} keV")
print(f"  Expected isotopes: {', '.join(norm_isotopes)}")
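
The photopeak model inside `generate_multi_isotope_spectrum` is a plain product (activity × branching intensity × efficiency × time), so the expected peak sizes can be checked by hand. The sketch below reproduces that arithmetic for three of the simulated lines. `expected_photopeak_counts` is a hypothetical helper, and the constant 2% efficiency is the same simplification the simulation makes; a real detector's efficiency depends on energy.

```python
import math

def expected_photopeak_counts(activity_bq, intensity_percent,
                              efficiency, acquisition_time_s):
    """Mean number of photopeak counts for one gamma line (illustrative helper)."""
    return activity_bq * (intensity_percent / 100.0) * efficiency * acquisition_time_s

# (label, activity in Bq, line intensity in %) taken from the NORM example above
simulated_lines = [
    ('K-40 1460.8',   100, 10.67),
    ('Bi-214 609.3',   50, 45.5),
    ('Tl-208 2614.5',  30, 99.8),
]

expected = {}
for label, activity, intensity in simulated_lines:
    n = expected_photopeak_counts(activity, intensity, 0.02, 3600)
    expected[label] = n
    # Poisson statistics: the standard deviation is sqrt(mean)
    print(f"{label:<15} {n:8.1f} ± {math.sqrt(n):.1f} counts")
```

The K-40 line, for instance, yields about 768 expected counts in one hour, despite K-40 having the highest activity in the mix, because only 10.67% of decays emit the 1460.8 keV gamma.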

## 3. Visualize Spectrum

In [None]:
# Create histogram
counts, bins = np.histogram(energies_norm, bins=2000, range=(0, 3000))
bin_centers = (bins[:-1] + bins[1:]) / 2

fig, ax = plt.subplots(figsize=(15, 7))

ax.plot(bin_centers, counts, linewidth=1.5, color='blue')
ax.fill_between(bin_centers, counts, alpha=0.3, color='blue')

ax.set_xlabel('Energy (keV)', fontsize=13, fontweight='bold')
ax.set_ylabel('Counts', fontsize=13, fontweight='bold')
ax.set_title('NORM Gamma Spectrum - Multiple Isotopes', fontsize=15, fontweight='bold')
ax.set_yscale('log')
ax.grid(True, alpha=0.3)
ax.set_xlim(0, 3000)

# Mark known peaks
known_peaks = [
    (1460.8, 'K-40'),
    (609.3, 'Bi-214'),
    (1764.5, 'Bi-214'),
    (2614.5, 'Tl-208')
]

for energy, label in known_peaks:
    ax.axvline(energy, color='red', linestyle='--', alpha=0.7, linewidth=1.5)
    ax.text(energy, ax.get_ylim()[1]*0.5, label, 
           rotation=90, va='bottom', ha='right', fontsize=10, color='red')

plt.tight_layout()
plt.show()

print("✓ Spectrum plotted with known peak markers")

## 4. Peak Finding Algorithm

In [None]:
def find_peaks_in_spectrum(bin_centers, counts, 
                          prominence_factor=5.0,
                          distance_kev=20,
                          min_energy=50):
    """
    Find peaks in gamma spectrum
    
    Parameters:
    -----------
    bin_centers : array
        Energy bin centers (keV)
    counts : array
        Spectrum counts
    prominence_factor : float
        Minimum peak prominence (multiples of sqrt(background))
    distance_kev : float
        Minimum distance between peaks (keV)
    min_energy : float
        Minimum energy to consider (keV)
    
    Returns:
    --------
    peak_energies : array
        Energies of found peaks
    peak_counts : array
        Counts at peak positions
    """
    # Smooth spectrum
    from scipy.ndimage import gaussian_filter1d
    smoothed = gaussian_filter1d(counts, sigma=2)
    
    # Estimate background
    # Use percentile to be robust to peaks
    background = np.percentile(smoothed, 10)
    
    # Minimum prominence
    min_prominence = prominence_factor * np.sqrt(background + 1)
    
    # Convert distance to bins (find_peaks requires distance >= 1)
    bin_width = bin_centers[1] - bin_centers[0]
    distance_bins = max(1, int(distance_kev / bin_width))
    
    # Find peaks
    peaks, properties = signal.find_peaks(
        smoothed,
        prominence=min_prominence,
        distance=distance_bins
    )
    
    # Filter by minimum energy
    peak_energies = bin_centers[peaks]
    peak_counts = counts[peaks]
    
    valid = peak_energies > min_energy
    peak_energies = peak_energies[valid]
    peak_counts = peak_counts[valid]
    
    # Sort by energy
    sort_idx = np.argsort(peak_energies)
    peak_energies = peak_energies[sort_idx]
    peak_counts = peak_counts[sort_idx]
    
    return peak_energies, peak_counts

# Find peaks
peak_energies, peak_counts = find_peaks_in_spectrum(bin_centers, counts)

print(f"✓ Found {len(peak_energies)} peaks")
print(f"\nTop 10 peaks by intensity:")

# Sort by counts for display
top_indices = np.argsort(peak_counts)[::-1][:10]
for i, idx in enumerate(top_indices):
    print(f"  {i+1:2d}. {peak_energies[idx]:7.1f} keV  (counts: {peak_counts[idx]:5.0f})")
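
To see how the prominence and distance criteria behave in isolation, the following sketch runs `scipy.signal.find_peaks` on a small noise-free test spectrum. The two peak positions, the flat background of 10 counts, and the 1 keV binning are made-up values chosen for illustration.

```python
import numpy as np
from scipy import signal

# Noise-free test spectrum: flat background plus two Gaussian peaks
x = np.arange(0, 1000, 1.0)  # 1 keV bins (assumed for this example)
background_level = 10.0
y = (background_level
     + 200 * np.exp(-0.5 * ((x - 300) / 10) ** 2)
     + 80 * np.exp(-0.5 * ((x - 700) / 15) ** 2))

# Same threshold rule as find_peaks_in_spectrum: k * sqrt(background + 1)
min_prominence = 5.0 * np.sqrt(background_level + 1)

peak_idx, props = signal.find_peaks(y, prominence=min_prominence, distance=20)

for i, prom in zip(peak_idx, props['prominences']):
    print(f"peak at {x[i]:.0f} keV, prominence {prom:.1f}")
```

On real, noisy data the smoothing step in `find_peaks_in_spectrum` matters: without it, statistical fluctuations produce many local maxima that the prominence threshold then has to reject.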

## 5. Peak Fitting for Accurate Centroid

In [None]:
def fit_gaussian_peak(bin_centers, counts, peak_energy, 
                     fit_width_kev=100):
    """
    Fit Gaussian to peak for accurate centroid
    """
    # Select region
    mask = np.abs(bin_centers - peak_energy) < fit_width_kev
    x = bin_centers[mask]
    y = counts[mask]
    
    if len(x) < 5:
        return None
    
    # Gaussian + linear background
    def model(e, amp, mu, sigma, bg_slope, bg_offset):
        gaussian = amp * np.exp(-0.5 * ((e - mu) / sigma)**2)
        background = bg_slope * e + bg_offset
        return gaussian + background
    
    try:
        # Initial guess
        p0 = [
            y.max() - y.min(),  # amplitude
            peak_energy,  # mean
            20,  # sigma (keV)
            0,  # bg slope
            y.min()  # bg offset
        ]
        
        popt, pcov = optimize.curve_fit(model, x, y, p0=p0)
        
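        amp, mu, sigma, bg_slope, bg_offset = popt
        sigma = abs(sigma)  # the model is symmetric in the sign of sigma
        fwhm = 2.355 * sigma
        
        # Calculate uncertainties
        perr = np.sqrt(np.diag(pcov))
        
        # The Gaussian integral is in counts*keV; divide by bin width for counts
        bin_width = x[1] - x[0]
        
        return {
            'centroid': mu,
            'centroid_err': perr[1],
            'amplitude': amp,
            'fwhm': fwhm,
            'resolution': fwhm / mu * 100,  # %
            'net_area': amp * sigma * np.sqrt(2 * np.pi) / bin_width
        }
    except (RuntimeError, ValueError):
        # curve_fit raises RuntimeError when the fit does not converge
        return None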

# Fit all found peaks
fitted_peaks = []

for peak_e in peak_energies:
    fit_result = fit_gaussian_peak(bin_centers, counts, peak_e)
    if fit_result:
        fitted_peaks.append(fit_result)

print(f"✓ Successfully fitted {len(fitted_peaks)} / {len(peak_energies)} peaks")
print(f"\nFitted peak details:")
print(f"{'Energy (keV)':<15} {'FWHM (keV)':<12} {'Resolution (%)':<15} {'Net Area'}")
print("-" * 60)

for fit in sorted(fitted_peaks, key=lambda x: x['centroid'])[:10]:
    print(f"{fit['centroid']:8.2f} ± {fit['centroid_err']:.2f}   "
          f"{fit['fwhm']:8.2f}     {fit['resolution']:8.2f}       {fit['net_area']:8.0f}")
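
The relation between fitted parameters and peak area is easy to get wrong, so here is a minimal, self-contained check on noise-free data. It assumes the same Gaussian-plus-linear-background model as above; the "true" parameter values and the 1.5 keV bin width are invented for the test. The analytic integral `amp * sigma * sqrt(2π)` has units of counts·keV when the histogram is in counts per bin, so it is divided by the bin width to obtain counts.

```python
import numpy as np
from scipy import optimize

def gauss_lin(e, amp, mu, sigma, slope, offset):
    """Gaussian peak on a linear background."""
    return amp * np.exp(-0.5 * ((e - mu) / sigma) ** 2) + slope * e + offset

bin_width = 1.5                              # keV per bin (assumed)
x = np.arange(500, 720, bin_width)
truth = dict(amp=120.0, mu=609.3, sigma=18.0, slope=-0.02, offset=30.0)
y = gauss_lin(x, **truth)                    # noise-free counts per bin

# Deliberately imperfect starting values
p0 = [y.max() - y.min(), 600.0, 20.0, 0.0, y.min()]
popt, _ = optimize.curve_fit(gauss_lin, x, y, p0=p0)

amp, mu, sigma = popt[0], popt[1], abs(popt[2])
fwhm = 2.355 * sigma
net_counts = amp * sigma * np.sqrt(2 * np.pi) / bin_width

print(f"centroid = {mu:.2f} keV, FWHM = {fwhm:.2f} keV, net area = {net_counts:.0f} counts")
```

With noisy data, passing `sigma=np.sqrt(np.maximum(y, 1))` to `curve_fit` weights each bin by its approximate Poisson uncertainty, which is one common choice rather than the only one.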

## 6. Isotope Identification via Library Matching

In [None]:
def match_peaks_to_library(fitted_peaks, isotope_library, 
                          tolerance_kev=10.0):
    """
    Match observed peaks to isotope library
    
    Parameters:
    -----------
    fitted_peaks : list of dict
        Fitted peak parameters
    isotope_library : dict
        Isotope database
    tolerance_kev : float
        Maximum energy difference for match (keV)
    
    Returns:
    --------
    matches : list of dict
        One entry per matched peak, with candidates sorted by confidence
    """
    matches = []
    
    for peak in fitted_peaks:
        peak_energy = peak['centroid']
        peak_candidates = []
        
        # Search library
        for isotope, data in isotope_library.items():
            for lib_energy, intensity in zip(data['energies'], data['intensities']):
                energy_diff = abs(peak_energy - lib_energy)
                
                if energy_diff < tolerance_kev:
                    # Confidence based on energy match and intensity
                    energy_match_score = 1.0 - (energy_diff / tolerance_kev)
                    # Cap at 1.0: Na-22's 511 keV line is listed at 180%
                    intensity_score = min(intensity / 100.0, 1.0)
                    
                    confidence = (energy_match_score + intensity_score) / 2.0
                    
                    peak_candidates.append({
                        'isotope': isotope,
                        'library_energy': lib_energy,
                        'energy_diff': energy_diff,
                        'intensity': intensity,
                        'confidence': confidence,
                        'category': data['category']
                    })
        
        if peak_candidates:
            # Sort by confidence
            peak_candidates.sort(key=lambda x: x['confidence'], reverse=True)
            
            matches.append({
                'peak_energy': peak_energy,
                'peak_area': peak['net_area'],
                'matches': peak_candidates
            })
    
    return matches

# Perform identification
identifications = match_peaks_to_library(fitted_peaks, ISOTOPE_LIBRARY)

print(f"✓ Matched {len(identifications)} peaks to library")
print(f"\nIsotope Identification Results:\n")
print(f"{'Peak (keV)':<12} {'Best Match':<15} {'Lib. E (keV)':<13} {'Diff (keV)':<12} {'Confidence'}")
print("=" * 75)

for match in identifications[:15]:  # Show the first 15 matched peaks
    if match['matches']:
        best = match['matches'][0]
        print(f"{match['peak_energy']:8.2f}     "
              f"{best['isotope']:<15} "
              f"{best['library_energy']:8.2f}      "
              f"{best['energy_diff']:6.2f}       "
              f"{best['confidence']:.3f}")
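
Because the confidence score averages an energy term and an intensity term, a weak line can never score highly, however well its energy matches. The sketch below mirrors the formula used in `match_peaks_to_library` for a single peak and library line; `match_confidence` is a hypothetical standalone helper written for this illustration.

```python
def match_confidence(peak_kev, lib_kev, intensity_percent, tolerance_kev=10.0):
    """Confidence for one peak/line pair, or None if outside the tolerance."""
    diff = abs(peak_kev - lib_kev)
    if diff >= tolerance_kev:
        return None
    energy_score = 1.0 - diff / tolerance_kev
    intensity_score = intensity_percent / 100.0
    return (energy_score + intensity_score) / 2.0

# Perfect energy match on a weak line (K-40, 10.67 %)
k40 = match_confidence(1460.8, 1460.8, 10.67)
# 4.5 keV off on a strong line (Tl-208, 99.8 %)
tl208 = match_confidence(2610.0, 2614.5, 99.8)
# Outside the 10 keV window
miss = match_confidence(640.0, 661.7, 85.1)

print(f"K-40:   {k40:.3f}")
print(f"Tl-208: {tl208:.3f}")
print(f"miss:   {miss}")
```

A perfectly placed K-40 peak scores about 0.55, below the 0.7 threshold used for annotation in Section 9, while a Tl-208 peak 4.5 keV off still scores about 0.77. Treat the score as a ranking heuristic, not a probability.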

## 7. Identify Present Isotopes

In [None]:
# Count isotope occurrences
isotope_scores = {}

for match in identifications:
    if match['matches']:
        best = match['matches'][0]
        isotope = best['isotope']
        confidence = best['confidence']
        
        if isotope not in isotope_scores:
            isotope_scores[isotope] = {
                'count': 0,
                'total_confidence': 0,
                'category': best['category'],
                'matched_energies': []
            }
        
        isotope_scores[isotope]['count'] += 1
        isotope_scores[isotope]['total_confidence'] += confidence
        isotope_scores[isotope]['matched_energies'].append(match['peak_energy'])

# Calculate average confidence
for isotope in isotope_scores:
    isotope_scores[isotope]['avg_confidence'] = \
        isotope_scores[isotope]['total_confidence'] / isotope_scores[isotope]['count']

# Sort by number of matched peaks
identified_isotopes = sorted(isotope_scores.items(), 
                            key=lambda x: (x[1]['count'], x[1]['avg_confidence']), 
                            reverse=True)

print("\nIdentified Isotopes (ranked by matched peaks, then confidence):\n")
print(f"{'Isotope':<12} {'Category':<15} {'# Peaks':<10} {'Avg Conf.':<12} {'Matched Energies (keV)'}")
print("=" * 90)

for isotope, data in identified_isotopes:
    energies_str = ', '.join([f"{e:.1f}" for e in data['matched_energies'][:5]])
    if len(data['matched_energies']) > 5:
        energies_str += '...'
    
    print(f"{isotope:<12} {data['category']:<15} {data['count']:<10} "
          f"{data['avg_confidence']:<12.3f} {energies_str}")

print(f"\n✓ Identified {len(identified_isotopes)} isotopes")

## 8. Decay Chain Analysis

In [None]:
# Analyze decay chains
decay_chains = {
    'U-238 chain': ['Ra-226', 'Pb-214', 'Bi-214'],
    'Th-232 chain': ['Pb-212', 'Tl-208', 'Ac-228']
}

print("Decay Chain Analysis:\n")

for chain, members in decay_chains.items():
    found_members = [iso for iso in members if iso in isotope_scores]
    
    if found_members:
        print(f"\n{chain}:")
        print(f"  Found {len(found_members)}/{len(members)} members")
        print(f"  Present isotopes: {', '.join(found_members)}")
        
        # Two or more detected members support the chain assignment.
        # (A secular-equilibrium test would also compare their activities.)
        if len(found_members) >= 2:
            print(f"  → Chain is active (multiple daughters detected)")
    else:
        print(f"\n{chain}: Not detected")

print(f"\n✓ Decay chain analysis complete")
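
The workflow lists activity estimation, and a secular-equilibrium check needs it: in equilibrium, chain members such as Pb-214 and Bi-214 should have equal activities. A minimal sketch follows, inverting the count model used to generate the spectrum. The net areas are invented numbers, `activity_from_area` is a hypothetical helper, and the energy-independent efficiency is a simplification.

```python
import math

def activity_from_area(net_counts, intensity_percent, efficiency, acquisition_time_s):
    """Activity in Bq from a net photopeak area (illustrative helper)."""
    return net_counts / ((intensity_percent / 100.0) * efficiency * acquisition_time_s)

# Hypothetical net areas (counts) for one line of each isotope
n_pb214, n_bi214 = 1250, 1600
a_pb214 = activity_from_area(n_pb214, 35.6, 0.02, 3600)  # Pb-214, 351.9 keV
a_bi214 = activity_from_area(n_bi214, 45.5, 0.02, 3600)  # Bi-214, 609.3 keV

ratio = a_bi214 / a_pb214
# Relative uncertainty of the ratio from Poisson counting statistics only
ratio_rel_err = math.sqrt(1.0 / n_pb214 + 1.0 / n_bi214)
in_equilibrium = abs(ratio - 1.0) < 2 * ratio_rel_err

print(f"Pb-214: {a_pb214:.1f} Bq, Bi-214: {a_bi214:.1f} Bq")
print(f"ratio = {ratio:.3f} ± {ratio * ratio_rel_err:.3f}, "
      f"consistent with equilibrium: {in_equilibrium}")
```

The two-sigma criterion is an arbitrary choice for this example; a real analysis would also propagate the uncertainties on efficiency and branching intensity.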

## 9. Visualize Identified Spectrum

In [None]:
fig, ax = plt.subplots(figsize=(16, 8))

# Plot spectrum
ax.plot(bin_centers, counts, linewidth=1.5, color='blue', alpha=0.7)
ax.fill_between(bin_centers, counts, alpha=0.2, color='blue')

# Annotate identified peaks
colors = {
    'U-238 chain': 'red',
    'Th-232 chain': 'green',
    'NORM': 'orange',
    'Anthropogenic': 'purple',
    'Industrial': 'brown'
}

y_max = ax.get_ylim()[1]

for match in identifications:
    if match['matches']:
        best = match['matches'][0]
        # Only show high-confidence matches. Note: weak lines such as K-40
        # (10.67% intensity) score at most ~0.55 and are never annotated.
        if best['confidence'] > 0.7:
            energy = match['peak_energy']
            isotope = best['isotope']
            category = best['category']
            color = colors.get(category, 'gray')
            
            ax.axvline(energy, color=color, linestyle='--', alpha=0.6, linewidth=1.5)
            ax.text(energy, y_max*0.7, f"{isotope}\n{energy:.0f}", 
                   rotation=90, va='bottom', ha='right', 
                   fontsize=9, color=color, fontweight='bold')

ax.set_xlabel('Energy (keV)', fontsize=13, fontweight='bold')
ax.set_ylabel('Counts', fontsize=13, fontweight='bold')
ax.set_title('Annotated Gamma Spectrum - Isotope Identification', 
            fontsize=15, fontweight='bold')
ax.set_yscale('log')
ax.grid(True, alpha=0.3)
ax.set_xlim(0, 3000)

# Add legend
from matplotlib.patches import Patch
legend_elements = [Patch(facecolor=color, label=cat) 
                  for cat, color in colors.items()]
ax.legend(handles=legend_elements, loc='upper right', fontsize=11)

plt.tight_layout()
plt.show()

print("✓ Annotated spectrum created")

## Summary

### Key Results

1. **Peak Finding**: Successfully identified major photopeaks in spectrum
2. **Isotope Matching**: Matched peaks to isotope library with confidence scores
3. **Decay Chains**: Detected presence of U-238 and Th-232 decay chains
4. **NORM Identification**: Identified common naturally occurring radioactive materials

### Practical Applications

**Environmental Monitoring**:
- Background radiation assessment
- NORM contamination detection
- Building material screening

**Nuclear Security**:
- Illicit material detection
- Border monitoring
- Threat source identification

**Medical/Industrial**:
- Isotope verification
- Quality control
- Contamination surveys

### Limitations and Challenges

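1. **Energy Resolution**: Scintillators have limited resolution (the ~7% simulated here is typical of NaI(Tl) at 662 keV; organic scintillators are worse and show little or no photopeak)
   - Peak overlap is common
   - Use a high-resolution detector such as HPGe for reliable identification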

2. **Compton Interference**: Strong Compton continuum can hide weak peaks

3. **Energy Calibration**: Accurate calibration is critical
   - Even a ±1-2 keV error can cause misidentification of closely spaced lines (e.g., Pb-212 at 238.6 keV vs. Pb-214 at 242.0 keV)

4. **Library Completeness**: Must include all expected isotopes

5. **Shielding Effects**: Can distort spectrum
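
The resolution limitation can be made concrete by asking which library lines lie closer together than one FWHM. The sketch below does this for a few lines from the library, assuming a constant relative resolution as in the simulation. The 0.2% figure is only a rough stand-in for an HPGe-class detector, and `overlapping_pairs` is a hypothetical helper.

```python
from itertools import combinations

# A few lines from the isotope library (isotope, energy in keV)
lines = [
    ('Pb-212', 238.6), ('Pb-214', 242.0),
    ('Tl-208', 583.2), ('Bi-214', 609.3), ('Cs-137', 661.7),
]

def overlapping_pairs(lines, resolution_percent):
    """Pairs of lines separated by less than one FWHM at their mean energy."""
    pairs = []
    for (iso_a, e_a), (iso_b, e_b) in combinations(lines, 2):
        fwhm = resolution_percent / 100.0 * (e_a + e_b) / 2.0
        if abs(e_a - e_b) < fwhm:
            pairs.append((iso_a, iso_b))
    return pairs

pairs_7pct = overlapping_pairs(lines, 7.0)   # resolution used in this notebook
pairs_hpge = overlapping_pairs(lines, 0.2)   # rough HPGe-like value (assumption)

print("7.0 % resolution:", pairs_7pct)
print("0.2 % resolution:", pairs_hpge)
```

At 7% the Pb-212/Pb-214 and Tl-208/Bi-214 pairs are unresolved, which is why single-peak identifications in those regions deserve extra caution.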

### Best Practices

- **Use multiple peaks** for identification (single peak can be ambiguous)
- **Check decay chains** for consistency
- **Consider context** (expected sources in environment)
- **Validate with known sources**
- **Monitor energy calibration** stability
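
One simple way to act on the "use multiple peaks" advice is to compute, for each candidate isotope, the fraction of its library lines that have a matching observed peak. This is a sketch of that idea, not the method used earlier in the notebook; the observed peak energies are invented and `line_coverage` is a hypothetical helper.

```python
def line_coverage(observed_kev, library_lines, tolerance_kev=10.0):
    """Fraction of each isotope's lines that have a peak within the tolerance."""
    coverage = {}
    for isotope, energies in library_lines.items():
        hits = sum(
            any(abs(peak - energy) < tolerance_kev for peak in observed_kev)
            for energy in energies
        )
        coverage[isotope] = hits / len(energies)
    return coverage

library_lines = {
    'Bi-214': [609.3, 1120.3, 1764.5],
    'Cs-137': [661.7],
    'Co-60':  [1173.2, 1332.5],
}
observed = [607.9, 1118.0, 1462.1, 1766.0]  # hypothetical fitted centroids

coverage = line_coverage(observed, library_lines)
for isotope, fraction in coverage.items():
    print(f"{isotope:<8} {fraction:.0%} of library lines matched")
```

An isotope with several lines but only one match is a weaker candidate than one whose lines are all present. Weak lines may legitimately fall below the detection threshold, so low coverage is a warning sign rather than proof of absence.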

### Next Steps

Notebook 6 will explore deep learning approaches that use raw waveforms for improved pulse-shape discrimination (PSD).