# ENSDF and RIPL Parsing

This notebook demonstrates how to use the ENSDF utilities to parse the RIPL level files and generate the ENSDF CSV files. It also shows how to use the RIPL level-parameters file to cut off the ENSDF CSV file.

In [52]:
import sys
sys.path.append("../..")

import nucml.datasets as nuc_data
import nucml.ensdf.parsing_utilities as ensdf_utils

In [88]:
import importlib
importlib.reload(nuc_data)
importlib.reload(ensdf_utils)

<module 'nucml.ensdf.parsing_utilities' from '../..\\nucml\\ensdf\\parsing_utilities.py'>

# Parsing ENSDF/RIPL Data

The ENSDF data is extracted from the RIPL-formatted zXXX.dat files.

In [89]:
# the `nuc_data` module contains all unique exfor elements
elements = nuc_data.exfor_elements

The `get_ripl_names()` function returns the paths to all `.dat` files in the levels directory. We will use these to create our dataset.

In [91]:
ripl_levels_dir = "../RIPL_3/levels/levels/"
names = ensdf_utils.get_ripl_names(ripl_levels_dir)

INFO:root:RIPL: Searching ../RIPL_3/levels/levels/ directory for .dat files...
INFO:root:RIPL: Finished. Found 118 .dat files.
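
For readers who want to see what such a search involves, the sketch below walks a directory and collects every `.dat` file, mirroring the commented-out legacy loop in the PREVIOUS section of this notebook. It is an approximation under that assumption, not the library's implementation; `find_dat_files` is a hypothetical helper name.

```python
import os


def find_dat_files(directory):
    """Return a sorted list of paths to every .dat file under `directory`."""
    paths = []
    for root, _dirs, files in os.walk(directory):
        for name in files:
            if name.endswith(".dat"):
                paths.append(os.path.join(root, name))
    # Plain lexicographic sort (an assumption; the legacy code used natsorted).
    return sorted(paths)
```

Calling `find_dat_files("../RIPL_3/levels/levels/")` should then give a list comparable to `names` above, although the ordering may differ from the library's.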


We can now extract the header of each file, which contains useful information such as the number of levels per isotope.

In [57]:
tmp_directory = "../CSV_Files/"
ensdf_utils.get_header(names, tmp_directory)

INFO:root:HEADER: Extracting ENSDF headers ...
INFO:root:HEADER: Finished. Saved to ../tmp/all_ensdf_headers_formatted.csv
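
The header lines are fixed-width text, so turning them into a delimited file amounts to inserting a separator at known column positions. The sketch below reproduces the approach used by the commented-out legacy code in the PREVIOUS section (semicolons at positions `[5, 10, 15, 20, 25, 30, 35, 47]`). Whether `get_header()` does exactly this internally is an assumption, and `insert_separators` is a hypothetical helper name.

```python
def insert_separators(line, positions, sep=";"):
    """Insert `sep` before the characters at the given original column positions."""
    chars = list(line)
    for offset, pos in enumerate(positions):
        # Each earlier insertion shifts later columns right by one character.
        chars.insert(pos + offset, sep)
    return "".join(chars)


# Column positions taken from the legacy header-formatting code in this notebook.
HEADER_POSITIONS = [5, 10, 15, 20, 25, 30, 35, 47]

print(insert_separators("abcdefghij", [2, 5]))  # -> ab;cde;fghij
```

The `offset` term is what keeps the split points aligned with the original columns as the line grows.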


We now specify the directory where the element-wise ENSDF files will be stored. Within whatever directory is specified, the `generate_elemental_ensdf()` function creates three directories, each holding a different type of file.

In [60]:
elemental_dir = "../Elemental_ENSDF/"
ensdf_utils.generate_elemental_ensdf(names, tmp_directory, elemental_dir)

INFO:root:GEN UTILS: Directory already exists. Re-initializing...
INFO:root:GEN UTILS: Directory restarted.
INFO:root:ENSDF Elemental: Extracting ENSDF data per element with header...
INFO:root:GEN UTILS: Directory already exists. Re-initializing...
INFO:root:GEN UTILS: Directory restarted.
INFO:root:ENSDF Elemental: Removing header from ENSDF elemental files...
INFO:root:GEN UTILS: Directory already exists. Re-initializing...
INFO:root:GEN UTILS: Directory restarted.
INFO:root:ENSDF Elemental: Formatting files...
INFO:root:ENSDF Elemental: Finished formating data.
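
The extraction step relies on the header counts: in the legacy code in the PREVIOUS section, each isotope's block is taken as the header line plus `Nol + Nog` following lines (number of levels plus number of gammas). The sketch below illustrates that slicing on a list of lines. It assumes this block layout, and `extract_isotope_block` is a hypothetical helper, not a function of the library.

```python
def extract_isotope_block(lines, start, n_levels, n_gammas, keep_header=True):
    """Slice one isotope's block out of the lines of a levels file.

    `start` is the index of the isotope's header line. The block is assumed
    to span the header plus `n_levels + n_gammas` data lines.
    """
    first = start if keep_header else start + 1
    return lines[first:start + 1 + n_levels + n_gammas]


toy = ["HEADER A", "level 1", "gamma 1", "level 2", "HEADER B", "level 1"]
print(extract_isotope_block(toy, 0, n_levels=2, n_gammas=1))
# -> ['HEADER A', 'level 1', 'gamma 1', 'level 2']
```

Passing `keep_header=False` corresponds to the "no header" variant of the elemental files.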


The elemental files contain all known nuclear levels. If you only need the ground states, a CSV file containing just those can be created with the `get_stable_states()` utility.

In [67]:
ensdf_utils.get_stable_states(names, tmp_directory)

STABLE STATES: Extracting stable states from .dat files...
STABLE STATES: Formatting text file...
STABLE STATES: Finished.
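
In the legacy code in the PREVIOUS section, the ground state is read as the line immediately following each isotope's header. The sketch below collects those lines for a set of isotope symbols. It assumes the first level listed after a header is the ground state, and `collect_ground_states` is a hypothetical name used only for illustration.

```python
def collect_ground_states(lines, symbols):
    """Map each symbol to the line that follows its first matching header."""
    ground_states = {}
    for symbol in symbols:
        # Stop one line early so that `lines[i + 1]` always exists.
        for i, line in enumerate(lines[:-1]):
            if line.startswith(symbol):
                ground_states[symbol] = lines[i + 1]
                break
    return ground_states
```

Symbols with no matching header are simply absent from the returned dictionary.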


With the elemental files created, we can now generate a single ENSDF file containing all nuclear levels for all isotopes.

In [69]:
ensdf_utils.generate_ensdf_csv(tmp_directory, elemental_dir)

INFO:root:Creatign DataFrame with Basic ENSDF data ...
INFO:root:Finished creating list of dataframes.


# Cutting ENSDF using RIPL CutOff Parameters

We can remove nuclear levels above the cut-off given in the RIPL `levels-param.data` file. If you wish to use your own cut-off parameters, modify `levels-param.data` while preserving the original formatting; otherwise this script will not work.

First, we parse the `levels-param.data` file from RIPL:

In [None]:
ripl_level_params_dir = "../RIPL_3/levels/"
ensdf_utils.get_level_parameters(ripl_level_params_dir, saving_directory=tmp_directory)

Now we can use the generated cut-off CSV file to remove nuclear levels from the previously created `ensdf.csv` file.

In [87]:
elemental_hf_dir = "../Elemental_ENSDF/Elemental_ENSDF_no_Header_F/"
ensdf_utils.generate_cutoff_ensdf(tmp_directory, elemental_hf_dir)

INFO:root:ENSDF CutOff: Loading ENSDF and RIPL parameters...
INFO:root:ENSDF CutOff: Cutting off ENSDF...
INFO:root:ENSDF CutOff: Finished.
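
The cut-off rule in the legacy code in the PREVIOUS section is simple: for each isotope, keep only the first `N_Max_Lev_Complete` levels, and keep the ground state alone when that number is zero. The sketch below applies the same rule to plain Python dictionaries. It illustrates the rule only and is not the implementation of `generate_cutoff_ensdf()`; `apply_level_cutoff` is a hypothetical name.

```python
def apply_level_cutoff(levels_by_isotope, n_max_by_isotope):
    """Keep the first N levels of each isotope, and at least the ground state.

    Raises KeyError if an isotope has no entry in `n_max_by_isotope`.
    """
    trimmed = {}
    for isotope, levels in levels_by_isotope.items():
        n_max = n_max_by_isotope[isotope]
        # A cut-off of 0 still keeps the first (ground-state) level.
        trimmed[isotope] = levels[:max(n_max, 1)]
    return trimmed


levels = {"X": [0.0, 0.5, 1.2, 2.0], "Y": [0.0, 0.8]}
print(apply_level_cutoff(levels, {"X": 2, "Y": 0}))
# -> {'X': [0.0, 0.5], 'Y': [0.0]}
```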


# PREVIOUS

In [22]:
# # Search all files within the ENSDF directory
# directory = "../ENSDF_Files/"

# print("Searching directory for RIPL ENSDF files...")
# names = []
# for root, dirs, files in os.walk(directory):
#     for file in files:
#         if file.endswith(".dat"):
#             names.append(os.path.join(root, file))
            
# print("Gathered {} RIPL ENSDF files.".format(len(names)))
# names = natsorted(names)

# # We use the list of documents to extract only the data we need
# print("Extracting ENSDF headers ...")
# for i in names:
#     with open(i) as infile, open(resulting_files_dir + 'all_ensdf_headers.txt', 'a') as outfile:
#         for line in infile:
#             for z in elements:
#                 if z in line.split():
#                     outfile.write(line)
# print("Finished extracting headers.")

# # Using the document with all data, we insert semicolon separators at fixed column positions
# print("Formatting ENSDF header data...")
# with open(resulting_files_dir + "all_ensdf_headers.txt") as infile, open(resulting_files_dir + 'all_ensdf_headers_formatted.csv', 'w') as outfile:
#     for line in infile:
#         if line.strip():
#             string = list(line)
#             for i, j in enumerate([5, 10, 15, 20, 25, 30, 35, 47]):
#                 string.insert(i + j, ';')
#             outfile.write("".join(string))
# print("Finished formating data.")

# ensdf_index_col = ["SYMB", "A", "Z", "Nol", "Nog", "Nmax", "Nc", "Sn", "Sp"]
# ensdf_index = pd.read_csv(os.path.join(saving_directory, "all_ensdf_headers_formatted.csv"), names=ensdf_index_col, sep=";")
# ensdf_index["Text_Filenames"] = ensdf_index["SYMB"].apply(lambda x: x.strip())

# Verify that all EXFOR isotopes have information available in the ENSDF database.

# len(elements) == len(ensdf_index.SYMB.unique())

# element_list_endf = ensdf_index.SYMB.tolist() # string that files start with
# element_list_names = ensdf_index.Text_Filenames.tolist() # same strings but stripped

# ensdf_index.head()

### Extracting ENSDF Data per Element

In [366]:
# print("Extracting ENSDF data per element with header ...")
# for e in element_list_endf:
#     for i in names:
#         with open(i, "r") as infile, open(("Elemental_ENSDF/" + str(e).strip() + '.txt'), 'a') as outfile:
#             lines = infile.readlines()
#             for z, line in enumerate(lines):
#                 if line.startswith(str(e)):
#                     for y in range(0, 1 + ensdf_index[ensdf_index["SYMB"] == e][["Nol"]].values[0][0] + ensdf_index[ensdf_index["SYMB"] == e][["Nog"]].values[0][0]):
#                         outfile.write(lines[z + y])
# print("Finished extracting data per element with header.")

# "./Elementa_ENSDF/".strip("/") + "_v1/"

# os.path.join("./Elementa_ENSDF/", "").strip("/") + "_v1/"

Extracting ENSDF data per element with header ...
Finished extracting data per element with header.


### Extracting Stable States Only

In [305]:
# print("Extracting stable states ...")
# for e in element_list_endf:
#     for i in names:
#         with open(i, "r") as infile, open((resulting_files_dir + "ensdf_stable_state.txt"), 'a') as outfile:
#             lines = infile.readlines()
#             for z, line in enumerate(lines):
#                 if line.startswith(str(e)):
#                     outfile.write(e + lines[1 + z])
# print("Finished extracting stable states.")

# print("Formatting ENSDF stable states file ...")
# with open(resulting_files_dir + "ensdf_stable_state.txt") as infile, open(resulting_files_dir + 'ensdf_stable_state_formatted.csv', 'w') as outfile:
#     for line in infile:
#         if line.strip():
#             string = list(line)
#             for i, j in enumerate([5, 10, 19, 25, 28, 39, 42, 44, 46, 59, 68, 71, 74, 85, 93, 96, 107, 115]):
#                 string.insert(i + j, ';')
#             outfile.write("".join(string))
# print("Finished formating data.")

Extracting stable states ...
Finished extracting REACTION NOTATION.


### Extracting ENSDF Data per Element without Header

In [367]:
# print("Extracting ENSDF data per element without header ...")
# for e in element_list_endf:
#     for i in names:
#         with open(i, "r") as infile, open(("Elemental_ENSDF_v2/" + str(e).strip() + '.txt'), 'a') as outfile:
#             lines = infile.readlines()
#             for z, line in enumerate(lines):
#                 if line.startswith(str(e)):
#                     for y in range(1, 1 + ensdf_index[ensdf_index["SYMB"] == e][["Nol"]].values[0][0] + ensdf_index[ensdf_index["SYMB"] == e][["Nog"]].values[0][0]):
#                         outfile.write(lines[z + y])
# print("Finished extracting data per element without header.")

# print("Formatting ENSDF data...")
# for i in element_list_names:
#     with open("Elemental_ENSDF_v2/" + i + ".txt") as infile, open("Elemental_ENSDF_v3/" + i + ".txt", 'w') as outfile:
#         for line in infile:
#             if line.strip():
#                 string = list(line)
#                 for i, j in enumerate([4, 15, 20, 23, 34, 37, 39, 43, 54, 65, 66]):
#                     string.insert(i + j, ';')
#                 outfile.write("".join(string))
# print("Finished formating data.")

Extracting ENSDF data per element without header ...
Finished extracting data per element without header.


### Making DataFrame for ENSDF Inference

In [16]:
# print("Creatign DataFrame with Basic ENSDF data ...")
# appended_data = []
# ensdf_cols = ["Level_Number", "Level_Energy", "Spin", "Parity", "Half_Life", 
#               "Number_Gammas", "Flag_Spin", "Flag_Energy", "Other", "Other2", "Other3", "Other4"]

# for e in element_list_names:
#     with open("./ENSDF/Elemental_ENSDF_v3/" + e + ".txt", "r") as infile:
#         element_ensdf = pd.read_csv(infile, sep=";", names=ensdf_cols)
#         element_ensdf["Level_Number"] = element_ensdf["Level_Number"].astype(str)
#         element_ensdf["Level_Number"] = element_ensdf["Level_Number"].apply(lambda x: x.strip())
#         element_ensdf["Level_Number"] = element_ensdf["Level_Number"].replace(to_replace="", value=np.nan)
#         element_ensdf = element_ensdf.dropna().reset_index(drop=True)
#         element_ensdf["Element_w_A"] = e
#         appended_data.append(element_ensdf)
# print("Finished creating list of dataframes.")

# appended_data = pd.concat(appended_data)

# appended_data = appended_data[["Level_Number", "Level_Energy", "Spin", "Parity", "Element_w_A"]]

# appended_data.head()

# len(appended_data["Element_w_A"].value_counts())

# appended_data_2 = pd.merge(appended_data, df[["Target_Protons", "Target_Neutrons", "Atomic_Mass_Micro", "Target_Mass_Number", "Element", "Element_w_A"]].drop_duplicates(subset=['Target_Protons', 'Target_Neutrons']), on='Element_w_A')

# appended_data.shape[0] == appended_data_2.shape[0]

# appended_data_2.to_csv("./ENSDF/ensdf_v1.csv", index=False)

# appended_data_2 = pd.read_csv("./ENSDF/ensdf_v1.csv")

# This dataset is for ENSDF prediction.

# appended_data_2.head()

# appended_data_2[appended_data_2.Target_Protons == 92]

Creatign DataFrame with Basic ENSDF data ...
Finished creating list of dataframes.


### Adding Stable 

In [39]:
# columns_ensdf = ["Element_w_A", "N1", "Elv[MeV]", "spin", "parity", "state_half_life", "Ng", "J", "unc", "spins", "nd", 
#                  "m", "percent", "mode", "other", "other1", "other2", "other3", "other4"]
# ensdf_final = pd.read_csv(resulting_files_dir + "ensdf_stable_state_formatted.csv", names=columns_ensdf, sep=";")
# ensdf_final["spin"] = ensdf_final["spin"].replace(to_replace=-1.0, value=3.5) 
# ensdf_final["parity"] = ensdf_final["parity"].replace(to_replace=0, value=1.0)
# ensdf_final["Element_w_A"] = ensdf_final["Element_w_A"].apply(lambda x: x.strip())
# ensdf_final = ensdf_final[["Element_w_A", "spin", "parity"]]

# df2 = pd.merge(df, ensdf_final, on='Element_w_A')

# df2.to_csv("../ML_Data/working_xs_v2_unsk.csv", index=False)

# Cutoff Energy

In [None]:
# # Using the document with all data, we insert semicolon separators at fixed column positions
# print("Formatting ENSDF cutoff data...")
# with open(resulting_files_dir + "levels-param.data.txt") as infile, open(resulting_files_dir + 'cut_off_ensdf_energies.csv', 'w') as outfile:
#     for line in infile:
#         if line.strip():
#             string = list(line)
#             for i, j in enumerate([4, 8, 11, 21, 31, 41, 51, 55, 59, 63, 67, 76, 85, 96, 98, 100, 104, 116]):
#                 string.insert(i + j, ';')
#             outfile.write("".join(string))
# print("Finished formating cutoff data.")

In [29]:
# cut_off_cols = ["Z", "A", "Element", "Temperature_MeV", "Temperature_U", "Black_Shift", 
#                 "Black_Shift_U", "N_Lev_ENSDF", "N_Max_Lev_Complete", "Min_Lev_Complete", 
#                 "Num_Lev_Unique_Spin", "E_Max_N_Max", "E_Num_Lev_U_Spin", "Other", "Other2", 
#                 "Flag", "Nox", "Other3", "Other4", "Spin_Cutoff"]
# cut_off = pd.read_csv("./ENSDF/Resulting_Files/cut_off_ensdf_energies.csv", names=cut_off_cols, sep=";")

# cut_off.tail()

# cut_off = cut_off[["Z", "A", "Element", "N_Lev_ENSDF", "N_Max_Lev_Complete", "E_Max_N_Max"]]
# cut_off["Element"] = cut_off["Element"].apply(lambda x: x.strip())
# cut_off["Element_w_A"] = cut_off["A"].astype(str) + cut_off["Element"]
# cut_off = cut_off[~cut_off.Element.str.contains(r'\d')]

# print("Reading data into dataframe...")
# df = pd.read_csv("./ENSDF/ensdf_v1.csv")
# print("Data read into dataframe!")

# # Converting specific columns to datatype 'string'
# str_cols = ["Spin", "Parity", "Element_w_A", "Element"]
# df[str_cols] = df[str_cols].astype('category')

# # Converting remaining columns to numeric type. 
# for col in list(df.columns):
#     if col not in str_cols:
#         df[col] = df[col].astype(float)

# # Converting proton, neutron and mass number features to integers
# int_cols = ["Level_Number", "Target_Protons", "Target_Neutrons", "Target_Mass_Number"]
# df[int_cols] = df[int_cols].astype(int)

# basic_cols = ["Level_Number", "Level_Energy", "Target_Protons", "Target_Neutrons", "Atomic_Mass_Micro", "Element_w_A"]
# df = df[basic_cols]

# element_list_names = df.Element_w_A.unique()

# print("Creatign Cut-off Dataframe ...")
# appended_data = []
# ensdf_cols = ["Level_Number", "Level_Energy", "Spin", "Parity", "Half_Life", 
#               "Number_Gammas", "Flag_Spin", "Flag_Energy", "Other", "Other2", "Other3", "Other4"]

# for e in element_list_names:
#     with open("./ENSDF/Elemental_ENSDF_v3/" + e + ".txt", "r") as infile:
#         element_ensdf = pd.read_csv(infile, sep=";", names=ensdf_cols)
#         element_ensdf["Level_Number"] = element_ensdf["Level_Number"].astype(str)
#         element_ensdf["Level_Number"] = element_ensdf["Level_Number"].apply(lambda x: x.strip())
#         element_ensdf["Level_Number"] = element_ensdf["Level_Number"].replace(to_replace="", value=np.nan)
#         element_ensdf = element_ensdf.dropna().reset_index(drop=True)
#         element_ensdf["Element_w_A"] = e
#         x = cut_off[cut_off.Element_w_A == e].N_Max_Lev_Complete.values[0]
#         if x == 0:
#             element_ensdf = element_ensdf.iloc[0:1]
#         else:
#             element_ensdf = element_ensdf.iloc[0:x]
#         appended_data.append(element_ensdf)
# print("Finished creating list of dataframes.")

# appended_data = pd.concat(appended_data)
# appended_data = appended_data[["Level_Number", "Level_Energy", "Spin", "Parity", "Element_w_A"]]

# appended_data_2 = pd.merge(appended_data, df[["Target_Protons", "Target_Neutrons", "Atomic_Mass_Micro", "Element_w_A"]].drop_duplicates(subset=['Target_Protons', 'Target_Neutrons']), on='Element_w_A')

# appended_data_2.to_csv("./ENSDF/ensdf_v2.csv", index=False)

Unnamed: 0,Z,A,Element,Temperature_MeV,Temperature_U,Black_Shift,Black_Shift_U,N_Lev_ENSDF,N_Max_Lev_Complete,Min_Lev_Complete,Num_Lev_Unique_Spin,E_Max_N_Max,E_Num_Lev_U_Spin,Other,Other2,Flag,Nox,Other3,Other4,Spin_Cutoff
3348,117,293,17,0.0,0.0,0.0,0.0,1,1,1,1,0.0,0.0,,,,0,,0.0,
3349,118,293,18,0.0,0.0,0.0,0.0,1,1,1,1,0.0,0.0,,,,0,,0.0,
3350,117,294,17,0.0,0.0,0.0,0.0,1,1,1,1,0.0,0.0,,,,0,,0.0,
3351,118,294,18,0.0,0.0,0.0,0.0,1,1,1,1,0.0,0.0,,,,0,,0.0,
3352,118,295,18,0.0,0.0,0.0,0.0,1,1,1,1,0.0,0.0,,,,0,,0.0,
