# ENDF Data Retrieval and Caching in PLEIADES

This notebook demonstrates how to use the PLEIADES nuclear data management system to efficiently retrieve and cache ENDF (Evaluated Nuclear Data File) resonance data for nuclear calculations.

## What You'll Learn

1. How to configure the PLEIADES nuclear data cache
2. Two different methods for retrieving ENDF data:
   - **DIRECT Method**: Downloads complete ENDF files and extracts resonance data
   - **API Method**: Uses the IAEA EXFOR API to download only the resonance section

## Benefits of Each Approach

- **DIRECT Method**: 
  - Gets the complete ENDF file (all sections)
  - Useful when you need the entire dataset
  - Higher storage requirements

- **API Method**:
  - Downloads only the resonance data section
  - Faster and more efficient
  - Lower storage requirements
  - Ideal when you only need resonance parameters

In [1]:
import time
import shutil
from pathlib import Path
from pleiades.nuclear.manager import NuclearDataManager, EndfLibrary
from pleiades.nuclear.models import DataRetrievalMethod
from pleiades.utils.config import get_config
import logging

# Configure logging to see debug information
logging.basicConfig(level=logging.INFO)

In [2]:
# Get configuration and show available settings
config = get_config()
print(f"Cache directory: {config.nuclear_data_cache_dir}")
print(f"Available data sources: {config.nuclear_data_sources}")

# Create output directory for parameter files
output_dir = Path("./endf_test")
output_dir.mkdir(exist_ok=True)
print(f"Output directory: {output_dir.absolute()}")

# Option to clear the cache (set to True to clear)
CLEAR_CACHE = True  # Change to True if you want to clear the cache

if CLEAR_CACHE:
    user_input = input("This will delete all cached ENDF files. Are you sure? (yes/no): ")
    if user_input.lower() == "yes":
        # Remove all files in the cache directory
        if config.nuclear_data_cache_dir.exists():
            shutil.rmtree(config.nuclear_data_cache_dir)
            print("Cache directory cleared.")
            # Recreate the empty directory
            config.nuclear_data_cache_dir.mkdir(parents=True, exist_ok=True)
    else:
        print("Cache clearing skipped. Note that notebook outputs may differ from descriptions due to existing cached files.")

Cache directory: /Users/8cz/.pleiades/nuclear_data
Available data sources: {'DIRECT': 'https://www-nds.iaea.org/public/download-endf', 'API': 'https://www-nds.iaea.org/exfor/servlet'}
Output directory: /Users/8cz/github.com/PLEIADES/notebook/tutorial/endf_test
Cache directory cleared.


In [3]:
# Initialize NuclearDataManager
nuclear_manager = NuclearDataManager()
print(f"Default ENDF library: {nuclear_manager.default_library}")

# List cache directories
cache_root = config.nuclear_data_cache_dir
print("Cache structure:")
if cache_root.exists():
    for method_dir in cache_root.iterdir():
        if method_dir.is_dir():
            print(f"  Method: {method_dir.name}")
            for library_dir in method_dir.iterdir():
                if library_dir.is_dir():
                    file_count = len(list(library_dir.glob("*.dat")))
                    print(f"    Library: {library_dir.name} ({file_count} files)")

Default ENDF library: EndfLibrary.ENDF_B_VIII_0
Cache structure:
  Method: DataRetrievalMethod.DIRECT
    Library: EndfLibrary.JENDL_5 (0 files)
    Library: EndfLibrary.TENDL_2021 (0 files)
    Library: EndfLibrary.JEFF_3_3 (0 files)
    Library: EndfLibrary.ENDF_B_VIII_1 (0 files)
    Library: EndfLibrary.ENDF_B_VIII_0 (0 files)
    Library: EndfLibrary.CENDL_3_2 (0 files)
  Method: DataRetrievalMethod.API
    Library: EndfLibrary.JENDL_5 (0 files)
    Library: EndfLibrary.TENDL_2021 (0 files)
    Library: EndfLibrary.JEFF_3_3 (0 files)
    Library: EndfLibrary.ENDF_B_VIII_1 (0 files)
    Library: EndfLibrary.ENDF_B_VIII_0 (0 files)
    Library: EndfLibrary.CENDL_3_2 (0 files)


## Isotope Setup

First, we'll load information about the U-238 isotope which will be used throughout this tutorial.

In [4]:
isotope_str = "U-238"  # Test with U-238
print(f"Testing with isotope: {isotope_str}")

# Get isotope info
isotope_info = nuclear_manager.isotope_manager.get_isotope_info(isotope_str)
print(f"Isotope details: {isotope_info}")

Testing with isotope: U-238
Searching for mass.mas20 in cached files for FileCategory.ISOTOPES: {PosixPath('/Users/8cz/github.com/PLEIADES/src/pleiades/nuclear/isotopes/files/mass.mas20'), PosixPath('/Users/8cz/github.com/PLEIADES/src/pleiades/nuclear/isotopes/files/neutrons.list'), PosixPath('/Users/8cz/github.com/PLEIADES/src/pleiades/nuclear/isotopes/files/isotopes.info')}
Checking file: mass.mas20
Found file: /Users/8cz/github.com/PLEIADES/src/pleiades/nuclear/isotopes/files/mass.mas20
Searching for isotopes.info in cached files for FileCategory.ISOTOPES: {PosixPath('/Users/8cz/github.com/PLEIADES/src/pleiades/nuclear/isotopes/files/mass.mas20'), PosixPath('/Users/8cz/github.com/PLEIADES/src/pleiades/nuclear/isotopes/files/neutrons.list'), PosixPath('/Users/8cz/github.com/PLEIADES/src/pleiades/nuclear/isotopes/files/isotopes.info')}
Checking file: mass.mas20
Checking file: neutrons.list
Checking file: isotopes.info
Found file: /Users/8cz/github.com/PLEIADES/src/pleiades/nuclear/iso

## Method 1: DIRECT Download

This method downloads the complete ENDF file and extracts only the resonance parameters.
- Downloads entire ENDF file (all sections)
- Extracts only the resonance parameter lines
- Stores the complete file in cache for future use

In [5]:
%%time
# Download and extract resonance file (first time should download)
print("\nDIRECT Method - First download (should fetch from remote):")
output_path = nuclear_manager.download_endf_resonance_file(
    isotope=isotope_info,
    library=EndfLibrary.ENDF_B_VIII_0,
    output_dir=str(output_dir),
    method=DataRetrievalMethod.DIRECT,
    use_cache=True
)
print(f"Output file: {output_path}")

INFO:pleiades.nuclear.manager:Downloading complete ENDF data from https://www-nds.iaea.org/public/download-endf/ENDF-B-VIII.0/n/n_9237_92-U-238.zip
INFO:pleiades.nuclear.manager:This will download and cache the entire ENDF file (all sections)



DIRECT Method - First download (should fetch from remote):


INFO:pleiades.nuclear.manager:Extracting resonance parameters from complete ENDF file
INFO:pleiades.nuclear.manager:Resonance parameters extracted and written to endf_test/092-U-238.B-VIII.0.par
INFO:pleiades.nuclear.manager:Note: Complete ENDF file was downloaded and resonance data was extracted (DIRECT method)


Output file: endf_test/092-U-238.B-VIII.0.par
CPU times: user 115 ms, sys: 63.9 ms, total: 179 ms
Wall time: 2.12 s


In [6]:
# Examine the cached file
cache_file_path = nuclear_manager._get_cache_file_path(
      method=DataRetrievalMethod.DIRECT,
      library=EndfLibrary.ENDF_B_VIII_0,
      isotope=isotope_info,
      mat=isotope_info.material_number
  )
print(f"Cache file path: {cache_file_path}")
print(f"Exists in cache: {cache_file_path.exists()}")

if cache_file_path.exists():
    # Get file size
    file_size = cache_file_path.stat().st_size
    print(f"Cache file size: {file_size / 1024:.2f} KB")

    # Preview first few lines
    print("\nFirst 5 lines of cached file:")
    with open(cache_file_path, 'r') as f:
        for i, line in enumerate(f):
            if i >= 5:
                break
            print(line.strip())

Cache file path: /Users/8cz/.pleiades/nuclear_data/DataRetrievalMethod.DIRECT/EndfLibrary.ENDF_B_VIII_0/n_9237_92-U-238.dat
Exists in cache: True
Cache file size: 15741.04 KB

First 5 lines of cached file:
Retrieved by E4-util: 2018/02/07,18:09:12                            1 0  0    0
9.223800+4 2.360058+2          1          1          0          19237 1451    1
0.000000+0 0.000000+0          0          0          0          69237 1451    2
1.000000+0 3.000000+7          0          0         10          89237 1451    3
0.000000+0 0.000000+0          0          0        566        2469237 1451    4


In [7]:
%%time

print("\nDIRECT Method - Second download (should use cache):")
output_path2 = nuclear_manager.download_endf_resonance_file(
    isotope=isotope_info,
    library=EndfLibrary.ENDF_B_VIII_0,
    output_dir=str(output_dir),
    method=DataRetrievalMethod.DIRECT,
    use_cache=True
)
print(f"Output file: {output_path2}")

# Check output file size
output_size = output_path2.stat().st_size
print(f"Output file size: {output_size / 1024:.2f} KB (only resonance parameters)")

# Compare with full ENDF file
full_file_size = cache_file_path.stat().st_size if cache_file_path.exists() else 0
if full_file_size > 0:
    print(f"Size difference: The resonance-only file is {output_size/full_file_size*100:.2f}% of the full ENDF file size")

INFO:pleiades.nuclear.manager:Using cached ENDF data from /Users/8cz/.pleiades/nuclear_data/DataRetrievalMethod.DIRECT/EndfLibrary.ENDF_B_VIII_0/n_9237_92-U-238.dat
INFO:pleiades.nuclear.manager:Extracting resonance parameters from complete ENDF file
INFO:pleiades.nuclear.manager:Resonance parameters extracted and written to endf_test/092-U-238.B-VIII.0.par
INFO:pleiades.nuclear.manager:Note: Complete ENDF file was downloaded and resonance data was extracted (DIRECT method)



DIRECT Method - Second download (should use cache):
Output file: endf_test/092-U-238.B-VIII.0.par
Output file size: 655.75 KB (only resonance parameters)
Size difference: The resonance-only file is 4.17% of the full ENDF file size
CPU times: user 33.4 ms, sys: 12.3 ms, total: 45.7 ms
Wall time: 43 ms


## Method 2: API-Based Retrieval

This method uses the IAEA EXFOR API to directly download only the resonance section of the ENDF file.
- Only downloads the neutron resonance section (much smaller file)
- More efficient when you only need resonance parameters
- Still supports caching for improved performance on repeated use

In [8]:
# Examine the API cached file
api_cache_file_path = nuclear_manager._get_cache_file_path(
      method=DataRetrievalMethod.API,
      library=EndfLibrary.ENDF_B_VIII_0,
      isotope=isotope_info,
      mat=isotope_info.material_number
  )
print(f"API cache file path: {api_cache_file_path}")
print(f"Exists in cache: {api_cache_file_path.exists()}")

if api_cache_file_path.exists():
    # Get file size
    file_size = api_cache_file_path.stat().st_size
    print(f"API cache file size: {file_size / 1024:.2f} KB")

    # Compare with DIRECT method cache size
    if cache_file_path.exists():
        direct_size = cache_file_path.stat().st_size
        print(f"Size comparison: API cache is {file_size/direct_size*100:.2f}% the size of DIRECT cache")

    # Preview first few lines
    print("\nFirst 5 lines of API cached file:")
    with open(api_cache_file_path, 'r') as f:
        for i, line in enumerate(f):
            if i >= 5:
                break
            print(line.strip())

API cache file path: /Users/8cz/.pleiades/nuclear_data/DataRetrievalMethod.API/EndfLibrary.ENDF_B_VIII_0/n_092-U-238_9237_resonance.dat
Exists in cache: False


In [9]:
%%time
# Second download using API method (should use cache)
print("\nAPI Method - Second download (should use cache):")
output_path_api2 = nuclear_manager.download_endf_resonance_file(
    isotope=isotope_info,
    library=EndfLibrary.ENDF_B_VIII_0,
    output_dir=str(output_dir),
    method=DataRetrievalMethod.API,
    use_cache=True
)
print(f"Output file: {output_path_api2}")

# Check output file size
api_output_size = output_path_api2.stat().st_size if output_path_api2.exists() else 0
if api_output_size > 0 and output_size > 0:
    print(f"Output file comparison: Both API and DIRECT methods produce the same resonance-only output files")
    print(f"API output size: {api_output_size / 1024:.2f} KB")
    print(f"DIRECT output size: {output_size / 1024:.2f} KB")

INFO:pleiades.nuclear.manager:Searching for neutron resonance data for U-238 in ENDF/B-VIII.0 via IAEA EXFOR API
INFO:pleiades.nuclear.manager:Note: This will download ONLY the resonance section, not the complete ENDF file



API Method - Second download (should use cache):


INFO:pleiades.nuclear.manager:Found neutron resonance data (SectID: 19173028) for U-238 in INDEN-Aug2023
INFO:pleiades.nuclear.manager:Downloaded neutron resonance data section for U-238
INFO:pleiades.nuclear.manager:Cached as: /Users/8cz/.pleiades/nuclear_data/DataRetrievalMethod.API/EndfLibrary.ENDF_B_VIII_0/n_092-U-238_9237_resonance.dat
INFO:pleiades.nuclear.manager:API method returned resonance data directly, no extraction needed
INFO:pleiades.nuclear.manager:Resonance parameters written to endf_test/092-U-238.B-VIII.0.par
INFO:pleiades.nuclear.manager:Note: Only resonance data was downloaded (API method)


Output file: endf_test/092-U-238.B-VIII.0.par
Output file comparison: Both API and DIRECT methods produce the same resonance-only output files
API output size: 278.60 KB
DIRECT output size: 655.75 KB
CPU times: user 18.8 ms, sys: 7.89 ms, total: 26.7 ms
Wall time: 1.58 s


## Performance Comparison

Let's compare the performance of both methods by downloading data for another isotope (U-235).
We'll measure the time taken for each method and examine the cache sizes.

In [10]:
isotope_str2 = "U-235"
print(f"\nTesting with another isotope: {isotope_str2}")

isotope_info2 = nuclear_manager.isotope_manager.get_isotope_info(isotope_str2)
print(f"Isotope details: {isotope_info2}")


Testing with another isotope: U-235
Isotope details: IsotopeInfo class for: U-235


In [11]:
%%time
# Download U-235 data using DIRECT method
print("Downloading U-235 data using DIRECT method...")
start_time = time.time()
output_direct_u235 = nuclear_manager.download_endf_resonance_file(
    isotope=isotope_info2,
    library=EndfLibrary.ENDF_B_VIII_0,
    output_dir=str(output_dir),
    method=DataRetrievalMethod.DIRECT,
    use_cache=True
)
direct_time = time.time() - start_time
print(f"DIRECT method time: {direct_time:.2f} seconds")
print(f"Output file: {output_direct_u235}")

INFO:pleiades.nuclear.manager:Downloading complete ENDF data from https://www-nds.iaea.org/public/download-endf/ENDF-B-VIII.0/n/n_9228_92-U-235.zip
INFO:pleiades.nuclear.manager:This will download and cache the entire ENDF file (all sections)


Downloading U-235 data using DIRECT method...


INFO:pleiades.nuclear.manager:Extracting resonance parameters from complete ENDF file
INFO:pleiades.nuclear.manager:Resonance parameters extracted and written to endf_test/092-U-235.B-VIII.0.par
INFO:pleiades.nuclear.manager:Note: Complete ENDF file was downloaded and resonance data was extracted (DIRECT method)


DIRECT method time: 2.76 seconds
Output file: endf_test/092-U-235.B-VIII.0.par
CPU times: user 219 ms, sys: 177 ms, total: 396 ms
Wall time: 2.76 s


In [12]:
%%time
# Download U-235 data using API method
print("Downloading U-235 data using API method...")
start_time = time.time()
output_api_u235 = nuclear_manager.download_endf_resonance_file(
    isotope=isotope_info2,
    library=EndfLibrary.ENDF_B_VIII_0,
    output_dir=str(output_dir),
    method=DataRetrievalMethod.API,
    use_cache=True
)
api_time = time.time() - start_time
print(f"API method time: {api_time:.2f} seconds")
print(f"Output file: {output_api_u235}")

# Compare times
if direct_time > 0 and api_time > 0:
    print(f"\nPerformance comparison:")
    print(f"DIRECT method: {direct_time:.2f} seconds")
    print(f"API method: {api_time:.2f} seconds")
    
    if direct_time > api_time:
        print(f"API method was {direct_time/api_time:.2f}x faster")
    else:
        print(f"DIRECT method was {api_time/direct_time:.2f}x faster")

INFO:pleiades.nuclear.manager:Searching for neutron resonance data for U-235 in ENDF/B-VIII.0 via IAEA EXFOR API
INFO:pleiades.nuclear.manager:Note: This will download ONLY the resonance section, not the complete ENDF file


Downloading U-235 data using API method...


INFO:pleiades.nuclear.manager:Found neutron resonance data (SectID: 19172584) for U-235 in INDEN-Aug2023
INFO:pleiades.nuclear.manager:Downloaded neutron resonance data section for U-235
INFO:pleiades.nuclear.manager:Cached as: /Users/8cz/.pleiades/nuclear_data/DataRetrievalMethod.API/EndfLibrary.ENDF_B_VIII_0/n_092-U-235_9228_resonance.dat
INFO:pleiades.nuclear.manager:API method returned resonance data directly, no extraction needed
INFO:pleiades.nuclear.manager:Resonance parameters written to endf_test/092-U-235.B-VIII.0.par
INFO:pleiades.nuclear.manager:Note: Only resonance data was downloaded (API method)


API method time: 1.76 seconds
Output file: endf_test/092-U-235.B-VIII.0.par

Performance comparison:
DIRECT method: 2.76 seconds
API method: 1.76 seconds
API method was 1.57x faster
CPU times: user 21.6 ms, sys: 8.39 ms, total: 30 ms
Wall time: 1.76 s


## Cache Structure Summary

Let's examine the cache directory structure to see how files are organized and how the two methods compare in terms of storage requirements.

In [13]:
print("Cache content summary with filename patterns:")
cache_root = config.nuclear_data_cache_dir
total_direct_size = 0
total_api_size = 0

if cache_root.exists():
    for method_dir in cache_root.iterdir():
        if method_dir.is_dir():
            method_size = 0
            print(f"\nMethod: {method_dir.name}")
            for library_dir in method_dir.iterdir():
                if library_dir.is_dir():
                    files = list(library_dir.glob("*.dat"))
                    lib_size = sum(f.stat().st_size for f in files)
                    method_size += lib_size
                    
                    if files:
                        print(f"  Library: {library_dir.name}")
                        print(f"  Total size: {lib_size / 1024:.2f} KB")
                        for file in files:
                            file_size = file.stat().st_size
                            print(f"    - {file.name} ({file_size / 1024:.2f} KB)")
                            
                            # Check filename pattern
                            if "_" in file.name and file.name.split("_")[1].isdigit():
                                print(f"      Pattern: MAT_FIRST (non-zero-padded Z)")
                            else:
                                print(f"      Pattern: ELEMENT_FIRST (zero-padded Z)")
            
            # Add to method totals
            if method_dir.name == 'DIRECT':
                total_direct_size = method_size
            elif method_dir.name == 'API':
                total_api_size = method_size
            
            print(f"  Total {method_dir.name} cache size: {method_size / (1024*1024):.2f} MB")
    
    # Overall comparison
    if total_direct_size > 0 and total_api_size > 0:
        print(f"\nOverall storage comparison:")
        print(f"Total DIRECT cache: {total_direct_size / (1024*1024):.2f} MB")
        print(f"Total API cache: {total_api_size / (1024*1024):.2f} MB")
        print(f"API method uses {total_api_size/total_direct_size*100:.2f}% of the DIRECT method's storage")

Cache content summary with filename patterns:

Method: DataRetrievalMethod.DIRECT
  Library: EndfLibrary.ENDF_B_VIII_0
  Total size: 54604.15 KB
    - n_9228_92-U-235.dat (38863.12 KB)
      Pattern: MAT_FIRST (non-zero-padded Z)
    - n_9237_92-U-238.dat (15741.04 KB)
      Pattern: MAT_FIRST (non-zero-padded Z)
  Total DataRetrievalMethod.DIRECT cache size: 53.32 MB

Method: DataRetrievalMethod.API
  Library: EndfLibrary.ENDF_B_VIII_0
  Total size: 539.63 KB
    - n_092-U-235_9228_resonance.dat (261.04 KB)
      Pattern: ELEMENT_FIRST (zero-padded Z)
    - n_092-U-238_9237_resonance.dat (278.60 KB)
      Pattern: ELEMENT_FIRST (zero-padded Z)
  Total DataRetrievalMethod.API cache size: 0.53 MB


## Conclusion: Choosing the Right Method

Both methods have their advantages:

- **DIRECT Method**:
  - Downloads the complete ENDF file
  - Good when you need access to other sections beyond resonance data
  - Higher storage requirements
  - May take longer for initial download

- **API Method**:
  - Downloads only the resonance section
  - Faster downloads and lower storage requirements
  - Ideal when you only need resonance parameters for calculations
  - Particularly useful for large-scale processing of many isotopes

### Usage Guidelines:
- Use **DIRECT** when you need complete ENDF files or multiple sections
- Use **API** when you only need resonance parameters and want to optimize performance and storage

Both methods benefit from caching, which significantly speeds up subsequent retrievals.