# Processing C4 Files and Generating usable EXFOR Data

This notebook is the template on which the `.csv_creator()` function is based. For an actual example of extracting and processing EXFOR data, see notebook number 1.

In [5]:
import os
import sys
import logging
import pandas as pd
import numpy as np
import importlib

pd.set_option('display.max_columns', 500)
pd.set_option('display.max_rows', 50)
sys.path.append("../../")

import nucml.exfor.parsing_utilities as exfor_parsing
import nucml.general_utilities as gen_utils
import nucml.objects.objects as objects

In [6]:
# FOR PROTOTYPE
logger = logging.getLogger()
logger.setLevel(logging.DEBUG)

In [7]:
importlib.reload(exfor_parsing)
importlib.reload(gen_utils)
importlib.reload(objects)
print("Libraries reloaded.")

Libraries reloaded.


# All-in-one Function:  Extracting C4 Content

`NucML` comes with the `.get_all()` function. This function will save all information needed for later compilation in two directories: a directory where the heavy resulting files will be stored (given by the `heavy_path`) and a temporary folder where files generated during the processing are stored (given by the `tmp_path`). 

Be sure to specify the full path to a non-existent directory, because the script will erase and re-create whatever path it is given.

The function extracts the experimental data available in each file in the `neutrons` directory and formats it so that it can be read by Python. The data layout is fixed by the EXFOR format: the files are fixed-width and contain no delimiter, whereas tools like `pandas` most easily read character-delimited files. Because of this, we insert commas at positions 5, 11, 12, 15, 19, 22, 31, 40, 49, 58, 67, 76, 85, 94, 95, 122, and 127.
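
The delimiter insertion described above can be sketched in a few lines of plain Python. This is a minimal illustration, not the `NucML` implementation: the function names are hypothetical, and treating the quoted positions as 0-based slice offsets is an assumption.

```python
# Cut positions quoted in the text above. Interpreting them as 0-based
# slice offsets is an assumption made for this sketch.
CUT_POSITIONS = [5, 11, 12, 15, 19, 22, 31, 40, 49, 58, 67, 76, 85, 94, 95, 122, 127]


def split_fixed_width(line, positions=CUT_POSITIONS):
    """Split a fixed-width record into fields at the given offsets."""
    # Pad short lines so that every record yields the same number of fields.
    padded = line.rstrip("\n").ljust(positions[-1])
    bounds = [0] + list(positions) + [len(padded)]
    return [padded[start:end] for start, end in zip(bounds[:-1], bounds[1:])]


def insert_delimiters(line, positions=CUT_POSITIONS, delimiter=","):
    """Return the record with a delimiter inserted at every cut position."""
    return delimiter.join(split_fixed_width(line, positions))


# Toy record with two cut positions to show the mechanics.
print(insert_delimiters("ABCDEFGHIJ", positions=[2, 5]))  # AB,CDE,FGHIJ
```

The `delimiter` argument is left configurable because short references such as `D.F.MEASDAY,ET.AL. (66)` contain commas themselves; the `pd.read_csv` call later in this notebook reads the file with `sep=";"`.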

The EXFOR files contain more information in the header of each experimental campaign, which also needs to be extracted. Some of these features can be extracted using the following keywords:

- #AUTHOR1
- #YEAR
- #INSTITUTE 
- #TITLE
- #REFERENCE
- #DATE
- #REACTION
- #DATA (for the number of data points in each experimental campaign)

All of these produce light files, so we store them in `tmp_path`.
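
As a rough sketch of keyword-based extraction, the helper below collects the text that follows a keyword on each header line. The function name and the sample lines are hypothetical; the header layout (keyword, whitespace, value) is an assumption based on the keywords listed above.

```python
def extract_keyword_values(lines, keyword):
    """Collect the text following `keyword` on every line that starts with it."""
    values = []
    for line in lines:
        parts = line.split(None, 1)
        # Compare the whole first token so "#DATA" does not match "#DATASET".
        if parts and parts[0] == keyword:
            values.append(parts[1].strip() if len(parts) > 1 else "")
    return values


# Illustrative header lines (layout and point count assumed, not from a real file).
sample = [
    "#DATASET    11152002",
    "#AUTHOR1    D.F.Measday+",
    "#YEAR       1966",
    "#DATA       5",
    "    1     1   3   1",
]

print(extract_keyword_values(sample, "#DATA"))     # ['5']
print(extract_keyword_values(sample, "#AUTHOR1"))  # ['D.F.Measday+']
```

Matching on the full first token rather than on a prefix matters here because several keywords share a prefix, for example `#DATA` and `#DATASET`.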

Each of these items can be extracted one at a time. We can get the same data faster by using the following optimized function.

In [11]:
exfor_directory = "../C4_Files/neutrons_2019_07_18/" # Path to the c4 files
mode = 'neutrons' # type of data that you are extracting

# I am going to define two different paths 
# Notice that if size is not an issue you can define both directories as the same one.
tmp_dir = "../tmp/"
heavy_dir = "../CSV_Files/"

# This will be appended to the previous directories
tmp_path = os.path.join(tmp_dir, "Extracted_Text_" + mode + "/")
heavy_path = os.path.join(heavy_dir, "EXFOR_" + mode + "/")

ame_dir = "../../AME/CSV_Files/"

In [12]:
# Gets a list of all .c4 file names from the EXFOR directory
c4_list = exfor_parsing.get_c4_names(exfor_directory)
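# Pass heavy_path (not heavy_dir): the file is read back from heavy_path below,
# and the directory given here is erased and re-created.
exfor_parsing.get_all(c4_list, heavy_path, tmp_path)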

INFO:root:C4: Searching ../C4_Files/neutrons_2019_07_18/ directory for .c4 files...
INFO:root:C4: Finished. Found 623 .c4 files.
INFO:root:GEN UTILS: Directory does not exists. Creating...
INFO:root:GEN UTILS: Directory created.
INFO:root:GEN UTILS: Directory does not exists. Creating...
INFO:root:GEN UTILS: Directory created.
INFO:root:EXFOR: Extracting experimental data, authors, years, institutes, and dates...
INFO:root:EXFOR: Finished extracting experimental data, authors, years, institutes, and dates.
INFO:root:EXFOR: Extracting titles, references, and number of data points per experiment...
INFO:root:EXFOR: Finished extracting titles, references, and number of data points per experiment.
INFO:root:EXFOR: Formatting experimental data...
INFO:root:EXFOR: Finished formating experimental data.
INFO:root:EXFOR: Finished.


# Formatting the Extracted Data

The extracted data is of little use in its current state. We need to bring it together into a CSV file that we can use for any purpose. As mentioned, this notebook goes step by step through the `csv_creator()` function, for which it serves as the template. For an actual example, see the next notebook.

### Cleaning Data

The data contains whitespace and special characters that we need to deal with. Additionally, some columns that appear to have no value actually hold a string of spaces. Pandas does not recognize these as NaN values, so we have to handle them manually. We will also drop the references, the YY, and the SubEntry Number.

In [13]:
colnames = ["Projectile", "Target_ZA", "Target_Metastable_State", "MF", "MT", "Product_Metastable_State", \
            "EXFOR_Status", "Center_of_Mass_Flag", "Energy",  "dEnergy",  "Data", "dData",   "Cos/LO",   "dCos/LO", \
            "ELV/HL",  "dELV/HL", "I78", "Short_Reference", "EXFOR_Accession_Number", "EXFOR_SubAccession_Number", \
            "EXFOR_Pointer"]
df = pd.read_csv(os.path.join(heavy_path, "all_cross_sections_v1.txt"), names=colnames, header=None, 
                 index_col=False, sep=";")



In [14]:
df.head()

Unnamed: 0,Projectile,Target_ZA,Target_Metastable_State,MF,MT,Product_Metastable_State,EXFOR_Status,Center_of_Mass_Flag,Energy,dEnergy,Data,dData,Cos/LO,dCos/LO,ELV/HL,dELV/HL,I78,Short_Reference,EXFOR_Accession_Number,EXFOR_SubAccession_Number,EXFOR_Pointer
0,1,1,,3,1,,D,,8.8200+7,882000.0,0.03,1.5232-3,,,,,,"D.F.MEASDAY,ET.AL. (66)",11152,2,
1,1,1,,3,1,,D,,9.8100+7,981000.0,0.0291,1.5162-3,,,,,,"D.F.MEASDAY,ET.AL. (66)",11152,2,
2,1,1,,3,1,,D,,1.1000+8,1100000.0,0.0279,1.4147-3,,,,,,"D.F.MEASDAY,ET.AL. (66)",11152,2,
3,1,1,,3,1,,D,,1.1960+8,1196000.0,0.0264,1.4031-3,,,,,,"D.F.MEASDAY,ET.AL. (66)",11152,2,
4,1,1,,3,1,,D,,1.2940+8,1294000.0,0.0256,1.3972-3,,,,,,"D.F.MEASDAY,ET.AL. (66)",11152,2,


In [15]:
# make string version of original column
df['Target_ZA'] = df['Target_ZA'].astype(str)

# Making Sure all rows have the same number of values
max_length = 5
df.Target_ZA = df.Target_ZA.apply(lambda x: '0'*(max_length - len(x)) + x)

# Target feature is formatted as ZZAAA
df['Z'] = df['Target_ZA'].str[0:2].astype(int).fillna(0)
df['A'] = df['Target_ZA'].str[2:5].astype(int).fillna(0)

# Calculating number of neutrons = mass number - protons
df['N'] = df['A'] - df["Z"]
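
The ZZAAA decoding in the cell above can also be sketched for a single value without pandas. The helper name is hypothetical. Unlike the cell above, it slices from the right, so a code with a three-digit Z would also decode; whether such codes occur in the data is not checked here.

```python
def split_target_za(target_za):
    """Decode a ZZAAA target code into a (Z, A, N) tuple."""
    code = str(target_za).strip().zfill(5)  # left-pad with zeros to five digits
    z = int(code[:-3])  # everything before the last three digits
    a = int(code[-3:])  # the last three digits
    return z, a, a - z  # neutrons = mass number - protons


print(split_target_za(1))      # (0, 1, 1)
print(split_target_za(92235))  # (92, 235, 143)
```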

We assume that unknown `Target_Metastable_State` values do not denote the ground state. Instead, they are filled with `All_or_Total` per IAEA instructions. The same mapping is applied to `Product_Metastable_State`.

In [16]:
df["Target_Metastable_State"].unique()

array([' ', 'M', '1', '2'], dtype=object)

In [17]:
df["Product_Metastable_State"].unique()

array([' ', 'L', 'M', 'G', '2', '?', '+', '1'], dtype=object)

In [18]:
metastate_dict = {" ": "All_or_Total", "G": "Ground", "1": "M1", "2": "M2", "3": "M3", "4": "M4", 
                  "5": "M5", "?": "Unknown", "+": "More_than_1", "T": "All_or_Total"}
df = df.replace({"Target_Metastable_State": metastate_dict, "Product_Metastable_State": metastate_dict})

In [19]:
df["Target_Metastable_State"].unique()

array(['All_or_Total', 'M', 'M1', 'M2'], dtype=object)

In [20]:
df["Product_Metastable_State"].unique()

array(['All_or_Total', 'L', 'M', 'Ground', 'M2', 'Unknown', 'More_than_1',
       'M1'], dtype=object)

Next we expand the `EXFOR_Status` codes. For the frame feature (`Center_of_Mass_Flag`), we assume that unknown values are `Lab` for the lab frame.

In [21]:
df["EXFOR_Status"].unique()

array(['D', ' ', 'A', 'P', 'C', 'O', 'U', 'R'], dtype=object)

In [22]:
exfor_status_dict = {"U":"Un_normalized", "A":"Approved_by_Author", "C":"Correlated", "D":"Dependent", 
                     "O":"Outdated", "P":"Preliminary", "R":"Re_normalized", "S":"Superseded", " ":"Other"}
df = df.replace({"EXFOR_Status": exfor_status_dict})

In [23]:
df["EXFOR_Status"].unique()

array(['Dependent', 'Other', 'Approved_by_Author', 'Preliminary',
       'Correlated', 'Outdated', 'Un_normalized', 'Re_normalized'],
      dtype=object)

In [24]:
df['Center_of_Mass_Flag'].unique()

array([' ', 'C'], dtype=object)

In [25]:
df = df.replace({"Center_of_Mass_Flag": {"C":"Center_of_Mass", " ":"Lab"}})

In [26]:
df['Center_of_Mass_Flag'].unique()

array(['Lab', 'Center_of_Mass'], dtype=object)

In [27]:
df.head()

Unnamed: 0,Projectile,Target_ZA,Target_Metastable_State,MF,MT,Product_Metastable_State,EXFOR_Status,Center_of_Mass_Flag,Energy,dEnergy,Data,dData,Cos/LO,dCos/LO,ELV/HL,dELV/HL,I78,Short_Reference,EXFOR_Accession_Number,EXFOR_SubAccession_Number,EXFOR_Pointer,Z,A,N
0,1,1,All_or_Total,3,1,All_or_Total,Dependent,Lab,8.8200+7,882000.0,0.03,1.5232-3,,,,,,"D.F.MEASDAY,ET.AL. (66)",11152,2,,0,1,1
1,1,1,All_or_Total,3,1,All_or_Total,Dependent,Lab,9.8100+7,981000.0,0.0291,1.5162-3,,,,,,"D.F.MEASDAY,ET.AL. (66)",11152,2,,0,1,1
2,1,1,All_or_Total,3,1,All_or_Total,Dependent,Lab,1.1000+8,1100000.0,0.0279,1.4147-3,,,,,,"D.F.MEASDAY,ET.AL. (66)",11152,2,,0,1,1
3,1,1,All_or_Total,3,1,All_or_Total,Dependent,Lab,1.1960+8,1196000.0,0.0264,1.4031-3,,,,,,"D.F.MEASDAY,ET.AL. (66)",11152,2,,0,1,1
4,1,1,All_or_Total,3,1,All_or_Total,Dependent,Lab,1.2940+8,1294000.0,0.0256,1.3972-3,,,,,,"D.F.MEASDAY,ET.AL. (66)",11152,2,,0,1,1


## Fixing Numerical Feature Formatting

In [28]:
# Defining Numerical Columns to Fix and casting them as strings
cols = ["Energy", "dEnergy", "Data", "dData", "Cos/LO", "dCos/LO", "ELV/HL", "dELV/HL"]
df[cols] = df[cols].astype(str)

In [29]:
# df[cols] = df[cols].replace(to_replace="         ", value="0.0000000")
df[cols] = df[cols].replace(to_replace="         ", value=np.nan)

# We now strip values that may contain quotation marks and leading and trailing spaces
for col in cols:
    df[col] = df[col].str.strip("\"")
    df[col] = df[col].str.strip()
    
# df[cols] = df[cols].replace(to_replace="", value="0.0000000")
df[cols] = df[cols].replace(to_replace="", value=np.nan)

In [30]:
# For the numerical values we know per formatting that each of them should be 9 characters in length
max_length = 9

for col in cols:
    df[col] = df[col].apply(lambda x: x if pd.isnull(x) else ' '*(max_length - len(x)) + x) 

In [31]:
# Add appropriate formatting for Python to recognize it as numerical
for col in cols:
    new_col = []
    values = df[col].values
    for x in values:
        if pd.isnull(x):
            new_col.append(x)
        elif "+" == x[7]:
            y = x[0:7]
            z = x[7:]
            new_col.append(y + "E" + z)
        elif "+" == x[6]:
            y = x[0:6]
            z = x[6:]
            new_col.append(y + "E" + z)
        elif "-" == x[7]:
            y = x[0:7]
            z = x[7:]
            new_col.append(y + "E" + z)
        elif "-" == x[6]:
            y = x[0:6]
            z = x[6:]
            new_col.append(y + "E" + z)
        else:
            new_col.append(x)
    df[col] = new_col
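
The cell above inserts the missing `E` by checking fixed character positions. As an alternative sketch (a different technique, not the one used by `csv_creator()`), a regular expression can find a trailing signed exponent regardless of its position. The function name is hypothetical.

```python
import math
import re

# A sign followed by digits at the end of the string, preceded by a digit
# or a decimal point: this is the exponent written without the letter "E".
_EXPONENT = re.compile(r"(?<=[\d.])([+-]\d+)$")


def c4_to_float(text):
    """Convert a C4 numeric field such as '8.8200+7' to a float (NaN if blank)."""
    cleaned = text.strip().strip('"').strip()
    if not cleaned:
        return math.nan
    return float(_EXPONENT.sub(r"E\1", cleaned))


print(c4_to_float("8.8200+7"))   # 88200000.0
print(c4_to_float("1.5232-3"))   # 0.0015232
print(c4_to_float("882000.0"))   # 882000.0
print(c4_to_float("         "))  # nan
```

Values that already parse as floats are left untouched, because the lookbehind only matches an exponent that directly follows a digit or a decimal point.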

In [32]:
# We now convert the columns to numerical
for col in cols:
    df[col] = df[col].astype(float)
    print("Finish converting {} to float.".format(col))

Finish converting Energy to float.
Finish converting dEnergy to float.
Finish converting Data to float.
Finish converting dData to float.
Finish converting Cos/LO to float.
Finish converting dCos/LO to float.
Finish converting ELV/HL to float.
Finish converting dELV/HL to float.


## Specifying Categorical Columns

In [33]:
cat_cols = ["Target_Metastable_State", "MF", "MT", "I78", "Product_Metastable_State", "Center_of_Mass_Flag"]

# Converting all columns to strings and stripping whitespace
for col in cat_cols:
    df[col] = df[col].astype(str)
    df[col] = df[col].str.strip("\"")
    df[col] = df[col].str.strip()

In [34]:
df.I78.unique()

array(['', 'E2', 'LVL', 'EXC', 'DE2', 'HL'], dtype=object)

In [35]:
df = df.replace({"I78": {"E2":"Secondary_Energy", "LVL":"Level", "HL":"Half_Life", "DLV":"Level_Range", 
                         "EXC":"Excitation", "DE2":"Secondary_Energy_Range", "MIN":"Minimum_Energy", 
                         "MAX":"Maximum_Energy", "":"Other"}})

In [36]:
df.I78.unique()

array(['Other', 'Secondary_Energy', 'Level', 'Excitation',
       'Secondary_Energy_Range', 'Half_Life'], dtype=object)

In [37]:
df.drop(columns=['Target_ZA'], inplace=True)

In [38]:
df.head()

Unnamed: 0,Projectile,Target_Metastable_State,MF,MT,Product_Metastable_State,EXFOR_Status,Center_of_Mass_Flag,Energy,dEnergy,Data,dData,Cos/LO,dCos/LO,ELV/HL,dELV/HL,I78,Short_Reference,EXFOR_Accession_Number,EXFOR_SubAccession_Number,EXFOR_Pointer,Z,A,N
0,1,All_or_Total,3,1,All_or_Total,Dependent,Lab,88200000.0,882000.0,0.03,0.001523,,,,,Other,"D.F.MEASDAY,ET.AL. (66)",11152,2,,0,1,1
1,1,All_or_Total,3,1,All_or_Total,Dependent,Lab,98100000.0,981000.0,0.0291,0.001516,,,,,Other,"D.F.MEASDAY,ET.AL. (66)",11152,2,,0,1,1
2,1,All_or_Total,3,1,All_or_Total,Dependent,Lab,110000000.0,1100000.0,0.0279,0.001415,,,,,Other,"D.F.MEASDAY,ET.AL. (66)",11152,2,,0,1,1
3,1,All_or_Total,3,1,All_or_Total,Dependent,Lab,119600000.0,1196000.0,0.0264,0.001403,,,,,Other,"D.F.MEASDAY,ET.AL. (66)",11152,2,,0,1,1
4,1,All_or_Total,3,1,All_or_Total,Dependent,Lab,129400000.0,1294000.0,0.0256,0.001397,,,,,Other,"D.F.MEASDAY,ET.AL. (66)",11152,2,,0,1,1


## Appending Additional Information from EXFOR

In [39]:
# Reading experiments reaction notation
df1 = pd.read_csv(os.path.join(tmp_path, "reaction_notations.txt"), delim_whitespace=True, header=None)
df1.columns = ["Reaction", "Reaction_Notation"]

# Reading Experiment Titles
df2 = pd.read_csv(os.path.join(tmp_path, "titles.txt"), sep="#TITLE      ", header=None, engine="python")
df2.columns = ["Keyword", "Title"]

# Reading Data Points per Experiment
df3 = pd.read_csv(os.path.join(tmp_path, "data_points_per_experiment_refined.txt"),  delim_whitespace=True, header=None)
df3.columns = ["Data", "Multiple"]

# Reading Experiment Year
df4 = pd.read_csv(os.path.join(tmp_path, "years.txt"), delim_whitespace=True, header=None)
df4.columns = ["Keyword", "Year"]

# Reading Experiment Author
df5 = pd.read_csv(os.path.join(tmp_path, "authors.txt"), sep="    ", header=None, engine="python")
df5.columns = ["Keyword", "Author"]

# Reading Experiment Institute
df6 = pd.read_csv(os.path.join(tmp_path, "institutes.txt"), sep="  ", header=None, engine="python")
df6.columns = ["Keyword", "Institute"]

# Reading Experiment Date
df7 = pd.read_csv(os.path.join(tmp_path, "dates.txt"), delim_whitespace=True, header=None)
df7.columns = ["Keyword", "Date"]

# Reading Experiment Reference
df8 = pd.read_csv(os.path.join(tmp_path, "references.txt"), sep="#REFERENCE  ", header=None, engine="python")
df8.columns = ["Keyword", "Reference"]

# Reading Dataset Number
df9 = pd.read_csv(os.path.join(tmp_path, "dataset_num.txt"), sep="#DATASET    ", header=None, engine="python")
df9.columns = ["Keyword", "Dataset_Number"]

# Reading EXFOR entry number
df10 = pd.read_csv(os.path.join(tmp_path, "entry.txt"), sep="#ENTRY      ", header=None, engine="python")
df10.columns = ["Keyword", "EXFOR_Entry"]

# Reading reference code
df11 = pd.read_csv(os.path.join(tmp_path, "refcode.txt"), sep="#REF-CODE   ", header=None, engine="python")
df11.columns = ["Keyword", "Reference_Code"]

In [40]:
# Merging Datapoints, notation and titles and expanding based on datapoints
logging.info("EXFOR CSV: Expanding information based on the number of datapoints per experimental campaign...")
pre_final = pd.concat([df3, df1, df2, df4, df5, df6, df7, df8, df9, df10, df11], axis=1)
final = pre_final.reindex(pre_final.index.repeat(pre_final.Multiple))
final['position'] = final.groupby(level=0).cumcount() + 1

INFO:root:EXFOR CSV: Expanding information based on the number of datapoints per experimental campaign...
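
The expansion step in the cell above repeats each campaign's metadata once per data point. The following toy example, with made-up author names and counts, shows the same `reindex`/`repeat` pattern on a small frame.

```python
import pandas as pd

# Hypothetical per-campaign metadata: one row per experimental campaign.
meta = pd.DataFrame({"Author": ["A.Author+", "B.Author+"], "Multiple": [3, 2]})

# Repeat every row `Multiple` times so that there is one row per data point.
expanded = meta.reindex(meta.index.repeat(meta["Multiple"]))

# Number the data points within each campaign, starting at 1.
expanded["position"] = expanded.groupby(level=0).cumcount() + 1
expanded = expanded.reset_index(drop=True)

print(expanded)
```

The expanded frame has `meta["Multiple"].sum()` rows, which is why its length can later be compared against the number of rows in the cross-section data.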


In [41]:
# # Extracting projectile and outgoing particle
# final["reaction_notation"] = final.Type.str.extract('.*\((.*)\).*')

# final["reaction_notation2"] = final["reaction_notation"].apply(lambda x: x.split(')')[0])
# final = pd.concat([final, final["reaction_notation2"].str.split(',', expand=True)], axis=1)

In [42]:
# # Formatting Columns
# new_columns = list(final.columns)
# new_columns.extend(["Projectile", "Out"])
# final.columns = new_columns

In [43]:
# Indexing only required information and saving file
final = final[["Reaction_Notation", "Title", "Year", "Author", "Institute", "Date", "Reference", 
               "Dataset_Number", "EXFOR_Entry", "Reference_Code"]]

In [44]:
# Verify that both DataFrames have the same number of rows.
assert df.shape[0] == final.shape[0], "Row count mismatch between data and metadata."

# Reset Indexes to make copying faster
df = df.reset_index(drop=True)
final = final.reset_index(drop=True)

In [45]:
final.head()

Unnamed: 0,Reaction_Notation,Title,Year,Author,Institute,Date,Reference,Dataset_Number,EXFOR_Entry,Reference_Code
0,"0-NN-1(N,TOT),,SIG","NEUTRON TOTAL CROSS SECTIONS FOR NEUTRONS, PRO...",1966,D.F.Measday+,(1USAHRV),19800804,"Jour. Nuclear Physics Vol.85, p.142, 1966",11152002,11152,"(J,NP,85,142,6609)"
1,"0-NN-1(N,TOT),,SIG","NEUTRON TOTAL CROSS SECTIONS FOR NEUTRONS, PRO...",1966,D.F.Measday+,(1USAHRV),19800804,"Jour. Nuclear Physics Vol.85, p.142, 1966",11152002,11152,"(J,NP,85,142,6609)"
2,"0-NN-1(N,TOT),,SIG","NEUTRON TOTAL CROSS SECTIONS FOR NEUTRONS, PRO...",1966,D.F.Measday+,(1USAHRV),19800804,"Jour. Nuclear Physics Vol.85, p.142, 1966",11152002,11152,"(J,NP,85,142,6609)"
3,"0-NN-1(N,TOT),,SIG","NEUTRON TOTAL CROSS SECTIONS FOR NEUTRONS, PRO...",1966,D.F.Measday+,(1USAHRV),19800804,"Jour. Nuclear Physics Vol.85, p.142, 1966",11152002,11152,"(J,NP,85,142,6609)"
4,"0-NN-1(N,TOT),,SIG","NEUTRON TOTAL CROSS SECTIONS FOR NEUTRONS, PRO...",1966,D.F.Measday+,(1USAHRV),19800804,"Jour. Nuclear Physics Vol.85, p.142, 1966",11152002,11152,"(J,NP,85,142,6609)"


In [46]:
# Assign newly extracted data to main dataframe
df["Reaction_Notation"] = final["Reaction_Notation"]
df["Title"] = final["Title"]
df["Year"] = final["Year"]
df["Author"] = final["Author"]
df["Institute"] = final["Institute"]
df["Date"] = final["Date"]
df["Reference"] = final["Reference"]
df["Dataset_Number"] = final["Dataset_Number"]
df["EXFOR_Entry"] = final["EXFOR_Entry"]
df["Reference_Code"] = final["Reference_Code"]

The DataFrame must still contain 6007126 rows at this point.

In [47]:
# df = df[df.N != -1]
# df["Reference"] = df["Author"] + " " + df["Reference"]
# df = df.drop(columns=["Refer", "Author"])

df.Title = df.Title.fillna("No Title Found. Check EXFOR.")
df.Reference = df.Reference.fillna("No Reference Found. Check EXFOR.")
df.Short_Reference = df.Short_Reference.fillna("No Reference Found. Check EXFOR.")
df.Reference_Code = df.Reference_Code.fillna("No Reference Code Found. Check EXFOR.")
df.Author = df.Author.fillna("No Author Found. Check EXFOR.")
df.EXFOR_Pointer = df.EXFOR_Pointer.fillna("No Pointer")

In [48]:
import numbers

In [49]:
df.EXFOR_Pointer = df.EXFOR_Pointer.apply(lambda x: str(int(x)) if isinstance(x, numbers.Number) else x)
df.Date = df.Date.apply(lambda x: str(x)[:4] + "/" + str(x)[4:6] + "/" + str(x)[6:])
df.EXFOR_SubAccession_Number = df.EXFOR_SubAccession_Number.astype(int)
df.Institute = df.Institute.apply(lambda x: x.replace("(", "").replace(")", ""))

In [50]:
df = df.replace({'Projectile': {1: "neutron", 1001: "proton", 2003:"helion", 0:"gamma", 1002:"deuteron", 2004:"alpha"}})

if df.Projectile.unique()[0] == "neutron":
    Projectile_Z, Projectile_A, Projectile_N = 0, 1, 1
elif df.Projectile.unique()[0] == "proton":
    Projectile_Z, Projectile_A, Projectile_N = 1, 1, 0
elif df.Projectile.unique()[0] == "helion":
    Projectile_Z, Projectile_A, Projectile_N = 2, 3, 1
elif df.Projectile.unique()[0] == "gamma":
    Projectile_Z, Projectile_A, Projectile_N = 0, 0, 0
elif df.Projectile.unique()[0] == "deuteron":
    Projectile_Z, Projectile_A, Projectile_N = 1, 2, 1
elif df.Projectile.unique()[0] == "alpha":
    Projectile_Z, Projectile_A, Projectile_N = 2, 4, 2
df["Projectile_Z"] = Projectile_Z
df["Projectile_A"] = Projectile_A
df["Projectile_N"] = Projectile_N
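
The `if`/`elif` chain above inspects only the first projectile found in the DataFrame. As a sketch of a more compact alternative, the same codes and values can be kept in a single lookup table, which could then be applied per row if a file ever mixed projectiles. The table and function names are hypothetical; the codes and (Z, A) values are the ones used in the cell above.

```python
# Projectile code -> (name, Z, A), taken from the mapping in the cell above.
PROJECTILES = {
    1: ("neutron", 0, 1),
    1001: ("proton", 1, 1),
    1002: ("deuteron", 1, 2),
    2003: ("helion", 2, 3),
    2004: ("alpha", 2, 4),
    0: ("gamma", 0, 0),
}


def describe_projectile(code):
    """Return the projectile name and its Z, A, and N for a projectile code."""
    name, z, a = PROJECTILES[code]
    return {"Projectile": name, "Projectile_Z": z, "Projectile_A": a, "Projectile_N": a - z}


print(describe_projectile(1))
print(describe_projectile(2003))
```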

In [51]:
df.head()

Unnamed: 0,Projectile,Target_Metastable_State,MF,MT,Product_Metastable_State,EXFOR_Status,Center_of_Mass_Flag,Energy,dEnergy,Data,dData,Cos/LO,dCos/LO,ELV/HL,dELV/HL,I78,Short_Reference,EXFOR_Accession_Number,EXFOR_SubAccession_Number,EXFOR_Pointer,Z,A,N,Reaction_Notation,Title,Year,Author,Institute,Date,Reference,Dataset_Number,EXFOR_Entry,Reference_Code,Projectile_Z,Projectile_A,Projectile_N
0,neutron,All_or_Total,3,1,All_or_Total,Dependent,Lab,88200000.0,882000.0,0.03,0.001523,,,,,Other,"D.F.MEASDAY,ET.AL. (66)",11152,2,No Pointer,0,1,1,"0-NN-1(N,TOT),,SIG","NEUTRON TOTAL CROSS SECTIONS FOR NEUTRONS, PRO...",1966,D.F.Measday+,1USAHRV,1980/08/04,"Jour. Nuclear Physics Vol.85, p.142, 1966",11152002,11152,"(J,NP,85,142,6609)",0,1,1
1,neutron,All_or_Total,3,1,All_or_Total,Dependent,Lab,98100000.0,981000.0,0.0291,0.001516,,,,,Other,"D.F.MEASDAY,ET.AL. (66)",11152,2,No Pointer,0,1,1,"0-NN-1(N,TOT),,SIG","NEUTRON TOTAL CROSS SECTIONS FOR NEUTRONS, PRO...",1966,D.F.Measday+,1USAHRV,1980/08/04,"Jour. Nuclear Physics Vol.85, p.142, 1966",11152002,11152,"(J,NP,85,142,6609)",0,1,1
2,neutron,All_or_Total,3,1,All_or_Total,Dependent,Lab,110000000.0,1100000.0,0.0279,0.001415,,,,,Other,"D.F.MEASDAY,ET.AL. (66)",11152,2,No Pointer,0,1,1,"0-NN-1(N,TOT),,SIG","NEUTRON TOTAL CROSS SECTIONS FOR NEUTRONS, PRO...",1966,D.F.Measday+,1USAHRV,1980/08/04,"Jour. Nuclear Physics Vol.85, p.142, 1966",11152002,11152,"(J,NP,85,142,6609)",0,1,1
3,neutron,All_or_Total,3,1,All_or_Total,Dependent,Lab,119600000.0,1196000.0,0.0264,0.001403,,,,,Other,"D.F.MEASDAY,ET.AL. (66)",11152,2,No Pointer,0,1,1,"0-NN-1(N,TOT),,SIG","NEUTRON TOTAL CROSS SECTIONS FOR NEUTRONS, PRO...",1966,D.F.Measday+,1USAHRV,1980/08/04,"Jour. Nuclear Physics Vol.85, p.142, 1966",11152002,11152,"(J,NP,85,142,6609)",0,1,1
4,neutron,All_or_Total,3,1,All_or_Total,Dependent,Lab,129400000.0,1294000.0,0.0256,0.001397,,,,,Other,"D.F.MEASDAY,ET.AL. (66)",11152,2,No Pointer,0,1,1,"0-NN-1(N,TOT),,SIG","NEUTRON TOTAL CROSS SECTIONS FOR NEUTRONS, PRO...",1966,D.F.Measday+,1USAHRV,1980/08/04,"Jour. Nuclear Physics Vol.85, p.142, 1966",11152002,11152,"(J,NP,85,142,6609)",0,1,1


In [52]:
element_w_a = objects.load_zan()
element_w_a = pd.DataFrame.from_dict(element_w_a, orient='index')
element_w_a.loc['12019'] = ['Heavy Water', 19, 1, 20, "Heavy Water"]

In [53]:
df = df.merge(element_w_a, on=['N', 'Z', 'A'], how='left')

In [54]:
df[["EXFOR_Accession_Number", "Dataset_Number", "EXFOR_Entry"]]  = df[["EXFOR_Accession_Number", "Dataset_Number", "EXFOR_Entry"]].astype(str)

In [55]:
# # Save Dataframe
# df.to_csv(heavy_dir + "/EXFOR_neutrons_ORIGINAL.csv", index=False)

csv_name = os.path.join(heavy_path, "EXFOR_" + mode + "_ORIGINAL.csv")
logging.info("EXFOR CSV: Saving EXFOR CSV file to {}...".format(csv_name))
df.to_csv(csv_name, index=False)

INFO:root:EXFOR CSV: Saving EXFOR CSV file to ../CSV_Files/EXFOR_neutrons/EXFOR_neutrons_ORIGINAL.csv...


In [56]:
df_copy = df.copy()

In [57]:
# df_original = df.copy()

## Merging EXFOR and AME Data

In [58]:
# csv_name = os.path.join(heavy_dir, "EXFOR_" + mode + "_ORIGINAL.csv")
# df = pd.read_csv(csv_name)

In [59]:
df_workxs = df.copy()

In [60]:
df_workxs.columns

Index(['Projectile', 'Target_Metastable_State', 'MF', 'MT',
       'Product_Metastable_State', 'EXFOR_Status', 'Center_of_Mass_Flag',
       'Energy', 'dEnergy', 'Data', 'dData', 'Cos/LO', 'dCos/LO', 'ELV/HL',
       'dELV/HL', 'I78', 'Short_Reference', 'EXFOR_Accession_Number',
       'EXFOR_SubAccession_Number', 'EXFOR_Pointer', 'Z', 'A', 'N',
       'Reaction_Notation', 'Title', 'Year', 'Author', 'Institute', 'Date',
       'Reference', 'Dataset_Number', 'EXFOR_Entry', 'Reference_Code',
       'Projectile_Z', 'Projectile_A', 'Projectile_N', 'Isotope', 'Element'],
      dtype='object')

In [61]:
masses = pd.read_csv(ame_dir + "AME_Natural_Properties_w_NaN.csv").rename(
    columns={'N': 'Neutrons', 'A': 'Mass_Number', 'Neutrons':'N', 'Mass_Number':'A', 'Flag':'Element_Flag'})
masses.head()
# masses = pd.read_csv(ame_dir + "AME_all_merged.csv")

Unnamed: 0,Neutrons,Z,Mass_Number,EL,O,Mass_Excess,dMass_Excess,Binding_Energy,dBinding_Energy,B_Decay_Energy,dB_Decay_Energy,Atomic_Mass_Micro,dAtomic_Mass_Micro,S(2n),dS(2n),S(2p),dS(2p),Q(a),dQ(a),Q(2B-),dQ(2B-),Q(ep),dQ(ep),Q(B-n),dQ(B-n),S(n),dS(n),S(p),dS(p),Q(4B-),dQ(4B-),"Q(d,a)","dQ(d,a)","Q(p,a)","dQ(p,a)","Q(n,a)","dQ(n,a)","Q(g,p)","Q(g,n)","Q(g,pn)","Q(g,d)","Q(g,t)","Q(g,He3)","Q(g,2p)","Q(g,2n)","Q(g,a)","Q(p,n)","Q(p,2p)","Q(p,pn)","Q(p,d)","Q(p,2n)","Q(p,t)","Q(p,3He)","Q(n,2p)","Q(n,np)","Q(n,d)","Q(n,2n)","Q(n,t)","Q(n,3He)","Q(d,t)","Q(d,3He)","Q(3He,t)","Q(3He,a)","Q(t,a)",N,A,Element_Flag
0,1,0,1,n,Other,8071.31713,0.00046,0.0,0.0,782.347,0.0,1008665.0,0.00049,,,,,,,,,,,,,0.0,0.0,,,,,,,,,,,,0.0,,,,,,,,0.0005,,0.0,2224.566,,,,,,,0.0,,,6257.229,,763.755,20577.6194,,1,1,I
1,6,1,7,H,-nn,49135.0,1004.0,940.0,143.0,23062.0,1004.0,7052749.0,1078.0,-100.0,1000.0,,,,,34228.0,1004.0,,,23472.0,1004.0,812.0,1036.0,,,21459.0,1004.0,,,,,,,,-812.0,,,,,,100.0,,22279.6535,,-812.0,1412.566,22689.6535,8581.7949,,,,,-812.0,,,5445.229,,23043.408,19765.6194,,6,7,I
2,5,1,6,H,-3n,41875.721,254.127,961.639,42.354,24283.626,254.127,6044955.0,272.816,-1111.96,273.09,,,,,27788.84,254.13,,,22573.17,254.91,-911.96,269.41,,,-5444.0,2019.0,,,,,,,,911.96,,,,,,1111.96,,23501.2795,,911.96,3136.526,21790.8235,9593.7549,,,,,911.96,,,7169.189,,24265.034,21489.5794,,5,6,I
3,4,1,5,H,-nn,32892.444,89.443,1336.359,17.889,21661.211,91.652,5035311.0,96.02,-1800.0,89.44,,,,,21213.56,102.47,,,22396.21,89.44,-200.0,134.16,,,,,,,,,,,,200.0,,,,,,1800.0,,20878.8645,,200.0,2424.566,21613.8635,10281.7949,,,,,200.0,,,6457.229,,21642.619,20777.6194,,4,5,I
4,3,1,4,H,-n,24621.127,100.0,1720.449,25.0,22196.211,100.0,4026432.0,107.354,4657.23,100.0,,,,,-702.06,234.52,,,1618.59,100.0,-1600.0,100.0,,,,,,,21413.86,100.0,,,,1600.0,,,1599.9951,,,-4657.23,,21413.8645,,1600.0,3824.566,836.2435,3824.5649,,,,,1600.0,,,7857.229,,22177.619,22177.6194,,3,4,I


In [62]:
# masses["ZAN"] = masses.Z.astype(str) + masses.A.astype(str) + masses.N.astype(str)

# masses["Isotope"] = masses.Mass_Number.astype(str) + masses.Element

# masses = masses[["Isotope", "N", "Z", "A", "ZAN", "Element"]]
# masses = masses.set_index("ZAN")

# element_dict = masses.to_dict('index')

# with open('element_ZAN.pkl', 'wb') as handle:
#     pickle.dump(element_dict, handle, protocol=pickle.HIGHEST_PROTOCOL)

In [63]:
df_workxs = df_workxs.reset_index(drop=True)
masses = masses.reset_index(drop=True)

In [64]:
df = df_workxs.merge(masses, on=['N', 'Z'], how='left')

In [65]:
df = df.drop(columns=["A_x", "A_y", "N", "EL"]).rename(columns={'Neutrons': 'N', 'Mass_Number':'A'})

In [66]:
df = df[~df['N'].isnull()]
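
Rows without an AME match are dropped above by testing for a null `N`. Before dropping them, it can help to see which rows a left merge failed to match. The toy frames below are made up for illustration (the unmatched `Z`/`N` pair is deliberately unphysical); the `indicator` argument is a standard option of `DataFrame.merge`.

```python
import pandas as pd

# Toy cross-section rows; the last one has no counterpart in `masses`.
xs = pd.DataFrame({"Z": [0, 1, 99], "N": [1, 0, 999], "Data": [0.03, 0.5, 1.2]})
masses = pd.DataFrame({"Z": [0, 1], "N": [1, 0], "Mass_Excess": [8071.31713, 7288.97061]})

# indicator=True adds a `_merge` column telling where each row came from.
merged = xs.merge(masses, on=["N", "Z"], how="left", indicator=True)

unmatched = merged[merged["_merge"] == "left_only"]
matched = merged[merged["_merge"] == "both"].drop(columns="_merge")

print(unmatched[["Z", "N"]])
print(len(matched), "rows kept")
```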

In [67]:
df[["N", "A"]] = df[["N", "A"]].astype(int)

In [68]:
csv_name = os.path.join(heavy_path, "EXFOR_" + mode + "_ORIGINAL_w_AME.csv")
logging.info("EXFOR CSV: Saving EXFOR CSV file to {}...".format(csv_name))
df.to_csv(csv_name, index=False)

INFO:root:EXFOR CSV: Saving EXFOR CSV file to ../CSV_Files/EXFOR_neutrons/EXFOR_neutrons_ORIGINAL_w_AME.csv...


## Creating CSV file with AME, no RAW, and no NaN (ONLY MF3)

In [128]:
csv_name = os.path.join(heavy_path, "EXFOR_" + mode + "_ORIGINAL.csv")
df = pd.read_csv(csv_name)



In [129]:
# df = df_original.copy()
df_workxs = df.copy()

In [130]:
df_workxs.columns

Index(['Projectile', 'Target_Metastable_State', 'MF', 'MT',
       'Product_Metastable_State', 'EXFOR_Status', 'Center_of_Mass_Flag',
       'Energy', 'dEnergy', 'Data', 'dData', 'Cos/LO', 'dCos/LO', 'ELV/HL',
       'dELV/HL', 'I78', 'Short_Reference', 'EXFOR_Accession_Number',
       'EXFOR_SubAccession_Number', 'EXFOR_Pointer', 'Z', 'A', 'N',
       'Reaction_Notation', 'Title', 'Year', 'Author', 'Institute', 'Date',
       'Reference', 'Dataset_Number', 'EXFOR_Entry', 'Reference_Code',
       'Projectile_Z', 'Projectile_A', 'Projectile_N', 'Isotope', 'Element'],
      dtype='object')

In [131]:
masses = pd.read_csv(ame_dir + "AME_Natural_Properties_no_NaN.csv").rename(
    columns={'N': 'Neutrons', 'A': 'Mass_Number', 'Neutrons':'N', 'Mass_Number':'A', 'Flag':'Element_Flag'})
masses.head()

Unnamed: 0,Neutrons,Z,Mass_Number,EL,O,Mass_Excess,dMass_Excess,Binding_Energy,dBinding_Energy,B_Decay_Energy,dB_Decay_Energy,Atomic_Mass_Micro,dAtomic_Mass_Micro,S(2n),dS(2n),S(2p),dS(2p),Q(a),dQ(a),Q(2B-),dQ(2B-),Q(ep),dQ(ep),Q(B-n),dQ(B-n),S(n),dS(n),S(p),dS(p),Q(4B-),dQ(4B-),"Q(d,a)","dQ(d,a)","Q(p,a)","dQ(p,a)","Q(n,a)","dQ(n,a)","Q(g,p)","Q(g,n)","Q(g,pn)","Q(g,d)","Q(g,t)","Q(g,He3)","Q(g,2p)","Q(g,2n)","Q(g,a)","Q(p,n)","Q(p,2p)","Q(p,pn)","Q(p,d)","Q(p,2n)","Q(p,t)","Q(p,3He)","Q(n,2p)","Q(n,np)","Q(n,d)","Q(n,2n)","Q(n,t)","Q(n,3He)","Q(d,t)","Q(d,3He)","Q(3He,t)","Q(3He,a)","Q(t,a)",N,A,Element_Flag
0,1,0,1,n,Other,8071.31713,0.00046,0.0,0.0,782.347,0.0,1008665.0,0.00049,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0005,0.0,0.0,2224.566,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,6257.229,0.0,763.755,20577.6194,0.0,1,1,I
1,0,1,1,H,Other,-2437.418042,-308.875124,1124.312708,-43.336286,-5842.787372,-791.107704,1007000.0,-331.637262,13491.797,-794.039005,0.0,0.0,0.0,0.0,-48428.181945,1705.218991,0.0,0.0,-18640.462072,-1220.521001,4590.313134,-517.712285,953.387144,0.0,-166862.0,8109.0,22654.676628,0.0,19493.859878,-20.0,0.0,0.0,-953.38715,-4590.313093,-3416.417233,-1191.85122,-320.0049,0.0,0.0,-13491.797066,0.0,-6625.133969,-953.38715,-4590.313093,-2365.746983,-19422.807928,-5010.002166,4301.623148,0.0,-953.38715,1271.178847,-4590.313093,5065.377773,0.0,1666.91591,4540.08726,-5861.379204,15987.306327,18860.477844,-1,0,N
2,0,1,1,H,Other,7288.97061,9e-05,0.0,0.0,-1025.364293,-574.895004,1007825.0,9e-05,11198.52,-576.730004,0.0,0.0,0.0,0.0,-35990.091956,1507.979993,0.0,0.0,-12066.743059,-932.774001,3814.015402,-345.643999,0.0,0.0,-139959.0,7094.0,21760.786636,0.0,19813.859902,4.263256e-14,0.0,0.0,0.0,-3814.015369,-4310.307233,-2085.741223,-0.0049,0.0,0.0,-11198.52005,0.0,-1807.710875,0.0,-3814.015369,-1589.449282,-12849.089025,-2716.725154,3407.733152,0.0,0.0,2224.566,-3814.015369,4171.487772,0.0,2443.213634,5493.4744,-1043.956164,16763.604053,19813.8649,0,1,I
3,1,1,2,H,Other,13135.72176,0.00011,1112.283,0.0,3792.058787,-358.682304,2014102.0,0.00012,8905.243,-359.421003,0.0,0.0,0.0,0.0,-23552.001967,1310.740995,0.0,0.0,-5493.024046,-645.027001,2224.57,0.0,2224.57,0.0,-113056.0,6079.0,23846.53,0.0,20133.859927,20.0,0.0,0.0,-2224.57,-2224.57,-2224.5639,0.0021,319.9951,0.0,0.0,-8905.243033,0.0,3009.712219,-2224.57,-2224.57,-0.004,-6275.370122,-423.448143,5493.4765,0.0,-2224.57,-0.004,-2224.57,6257.2311,0.0,4032.659,3268.9044,3773.466876,18353.0494,17589.2949,1,2,I
4,2,1,3,H,Other,14949.80993,0.00022,2827.265,0.0,18.592,0.0,3016049.0,0.00023,8481.79,0.0,0.0,0.0,0.0,0.0,-13717.0,2000.0,0.0,0.0,1080.694967,-357.280001,6257.23,0.0,2224.57,0.0,-86153.0,5064.0,17589.3,0.0,19813.86,0.0,0.0,0.0,-2224.57,-6257.23,-8481.7939,-6257.2279,-0.0049,0.0,0.0,-8481.79,0.0,-763.7545,-2224.57,-6257.23,-4032.664,298.348781,0.0049,-763.7535,0.0,-2224.57,-0.004,-6257.23,0.0011,0.0,-0.001,3268.9044,0.0,14320.3894,17589.2949,2,3,I


In [132]:
df_workxs = df_workxs.reset_index(drop=True)
masses = masses.reset_index(drop=True)

In [133]:
df_workxs.shape

(6007126, 38)

In [134]:
df = df_workxs.merge(masses, on=['N', 'Z'], how='left')

In [135]:
df = df.drop(columns=["A_x", "A_y", "N", "EL"]).rename(columns={'Neutrons': 'N', 'Mass_Number':'A'})

In [137]:
df = df[~df['N'].isnull()]

In [141]:
df.shape

(6006239, 99)

In [138]:
df[["N", "A"]] = df[["N", "A"]].astype(int)

In [139]:
# Assign the result instead of filling in place on a filtered DataFrame
df["O"] = df["O"].fillna(value="Other")

In [140]:
# df = df[~df.Neutrons.isna()]

## Neutron Induced Cross Section vs Energy Data 

MF numbers are ENDF labels used to store different types of data:

- MF=1 contains descriptive and miscellaneous data,
- MF=2 contains resonance parameter data,
- MF=3 contains reaction cross sections vs energy,
- MF=4 contains angular distributions,
- MF=5 contains energy distributions,
- MF=6 contains energy-angle distributions,
- MF=7 contains thermal scattering data,
- MF=8 contains radioactivity data,
- MF=9-10 contain nuclide production data,
- MF=12-15 contain photon production data, and
- MF=30-36 contain covariance data.
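
As a small illustration, the labels listed above can be kept in a lookup table so that a filtered MF value can be checked against its meaning. This is only a sketch: `MF_DESCRIPTIONS` and `describe_mf` are hypothetical names introduced here and are not part of NucML.

```python
# Hypothetical lookup built from the list above; not part of NucML.
MF_DESCRIPTIONS = {
    1: "descriptive and miscellaneous data",
    2: "resonance parameter data",
    3: "reaction cross sections vs energy",
    4: "angular distributions",
    5: "energy distributions",
    6: "energy-angle distributions",
    7: "thermal scattering data",
    8: "radioactivity data",
}
# Ranges of MF numbers that share one description.
MF_DESCRIPTIONS.update({mf: "nuclide production data" for mf in range(9, 11)})
MF_DESCRIPTIONS.update({mf: "photon production data" for mf in range(12, 16)})
MF_DESCRIPTIONS.update({mf: "covariance data" for mf in range(30, 37)})


def describe_mf(mf):
    """Return the description for an MF label given as an int or a str."""
    return MF_DESCRIPTIONS.get(int(mf), "not listed in this notebook")


print(describe_mf("3"))  # reaction cross sections vs energy
```

The helper accepts strings as well as integers because the cells below cast the `MF` column to `str` before filtering.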

In [142]:
df.MF = df.MF.astype(str)
df.MT = df.MT.astype(str)

In [143]:
df = df[df["MF"] == "3"]

In [144]:
# df = df[df["MT"] < 999] # Cross Section Ratios

In [145]:
df.shape

(4644791, 99)

In [146]:
df.head()

Unnamed: 0,Projectile,Target_Metastable_State,MF,MT,Product_Metastable_State,EXFOR_Status,Center_of_Mass_Flag,Energy,dEnergy,Data,dData,Cos/LO,dCos/LO,ELV/HL,dELV/HL,I78,Short_Reference,EXFOR_Accession_Number,EXFOR_SubAccession_Number,EXFOR_Pointer,Z,Reaction_Notation,Title,Year,Author,Institute,Date,Reference,Dataset_Number,EXFOR_Entry,Reference_Code,Projectile_Z,Projectile_A,Projectile_N,Isotope,Element,N,A,O,Mass_Excess,dMass_Excess,Binding_Energy,dBinding_Energy,B_Decay_Energy,dB_Decay_Energy,Atomic_Mass_Micro,dAtomic_Mass_Micro,S(2n),dS(2n),S(2p),dS(2p),Q(a),dQ(a),Q(2B-),dQ(2B-),Q(ep),dQ(ep),Q(B-n),dQ(B-n),S(n),dS(n),S(p),dS(p),Q(4B-),dQ(4B-),"Q(d,a)","dQ(d,a)","Q(p,a)","dQ(p,a)","Q(n,a)","dQ(n,a)","Q(g,p)","Q(g,n)","Q(g,pn)","Q(g,d)","Q(g,t)","Q(g,He3)","Q(g,2p)","Q(g,2n)","Q(g,a)","Q(p,n)","Q(p,2p)","Q(p,pn)","Q(p,d)","Q(p,2n)","Q(p,t)","Q(p,3He)","Q(n,2p)","Q(n,np)","Q(n,d)","Q(n,2n)","Q(n,t)","Q(n,3He)","Q(d,t)","Q(d,3He)","Q(3He,t)","Q(3He,a)","Q(t,a)",Element_Flag
0,neutron,All_or_Total,3,1,All_or_Total,Dependent,Lab,88200000.0,882000.0,0.03,0.001523,,,,,Other,"D.F.MEASDAY,ET.AL. (66)",11152,2,No Pointer,0,"0-NN-1(N,TOT),,SIG","NEUTRON TOTAL CROSS SECTIONS FOR NEUTRONS, PRO...",1966,D.F.Measday+,1USAHRV,1980/08/04,"Jour. Nuclear Physics Vol.85, p.142, 1966",11152002,11152,"(J,NP,85,142,6609)",0,1,1,1n,n,1,1,Other,8071.31713,0.00046,0.0,0.0,782.347,0.0,1008665.0,0.00049,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0005,0.0,0.0,2224.566,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,6257.229,0.0,763.755,20577.6194,0.0,I
1,neutron,All_or_Total,3,1,All_or_Total,Dependent,Lab,98100000.0,981000.0,0.0291,0.001516,,,,,Other,"D.F.MEASDAY,ET.AL. (66)",11152,2,No Pointer,0,"0-NN-1(N,TOT),,SIG","NEUTRON TOTAL CROSS SECTIONS FOR NEUTRONS, PRO...",1966,D.F.Measday+,1USAHRV,1980/08/04,"Jour. Nuclear Physics Vol.85, p.142, 1966",11152002,11152,"(J,NP,85,142,6609)",0,1,1,1n,n,1,1,Other,8071.31713,0.00046,0.0,0.0,782.347,0.0,1008665.0,0.00049,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0005,0.0,0.0,2224.566,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,6257.229,0.0,763.755,20577.6194,0.0,I
2,neutron,All_or_Total,3,1,All_or_Total,Dependent,Lab,110000000.0,1100000.0,0.0279,0.001415,,,,,Other,"D.F.MEASDAY,ET.AL. (66)",11152,2,No Pointer,0,"0-NN-1(N,TOT),,SIG","NEUTRON TOTAL CROSS SECTIONS FOR NEUTRONS, PRO...",1966,D.F.Measday+,1USAHRV,1980/08/04,"Jour. Nuclear Physics Vol.85, p.142, 1966",11152002,11152,"(J,NP,85,142,6609)",0,1,1,1n,n,1,1,Other,8071.31713,0.00046,0.0,0.0,782.347,0.0,1008665.0,0.00049,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0005,0.0,0.0,2224.566,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,6257.229,0.0,763.755,20577.6194,0.0,I
3,neutron,All_or_Total,3,1,All_or_Total,Dependent,Lab,119600000.0,1196000.0,0.0264,0.001403,,,,,Other,"D.F.MEASDAY,ET.AL. (66)",11152,2,No Pointer,0,"0-NN-1(N,TOT),,SIG","NEUTRON TOTAL CROSS SECTIONS FOR NEUTRONS, PRO...",1966,D.F.Measday+,1USAHRV,1980/08/04,"Jour. Nuclear Physics Vol.85, p.142, 1966",11152002,11152,"(J,NP,85,142,6609)",0,1,1,1n,n,1,1,Other,8071.31713,0.00046,0.0,0.0,782.347,0.0,1008665.0,0.00049,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0005,0.0,0.0,2224.566,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,6257.229,0.0,763.755,20577.6194,0.0,I
4,neutron,All_or_Total,3,1,All_or_Total,Dependent,Lab,129400000.0,1294000.0,0.0256,0.001397,,,,,Other,"D.F.MEASDAY,ET.AL. (66)",11152,2,No Pointer,0,"0-NN-1(N,TOT),,SIG","NEUTRON TOTAL CROSS SECTIONS FOR NEUTRONS, PRO...",1966,D.F.Measday+,1USAHRV,1980/08/04,"Jour. Nuclear Physics Vol.85, p.142, 1966",11152002,11152,"(J,NP,85,142,6609)",0,1,1,1n,n,1,1,Other,8071.31713,0.00046,0.0,0.0,782.347,0.0,1008665.0,0.00049,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0005,0.0,0.0,2224.566,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,6257.229,0.0,763.755,20577.6194,0.0,I


In [147]:
df.columns

Index(['Projectile', 'Target_Metastable_State', 'MF', 'MT',
       'Product_Metastable_State', 'EXFOR_Status', 'Center_of_Mass_Flag',
       'Energy', 'dEnergy', 'Data', 'dData', 'Cos/LO', 'dCos/LO', 'ELV/HL',
       'dELV/HL', 'I78', 'Short_Reference', 'EXFOR_Accession_Number',
       'EXFOR_SubAccession_Number', 'EXFOR_Pointer', 'Z', 'Reaction_Notation',
       'Title', 'Year', 'Author', 'Institute', 'Date', 'Reference',
       'Dataset_Number', 'EXFOR_Entry', 'Reference_Code', 'Projectile_Z',
       'Projectile_A', 'Projectile_N', 'Isotope', 'Element', 'N', 'A', 'O',
       'Mass_Excess', 'dMass_Excess', 'Binding_Energy', 'dBinding_Energy',
       'B_Decay_Energy', 'dB_Decay_Energy', 'Atomic_Mass_Micro',
       'dAtomic_Mass_Micro', 'S(2n)', 'dS(2n)', 'S(2p)', 'dS(2p)', 'Q(a)',
       'dQ(a)', 'Q(2B-)', 'dQ(2B-)', 'Q(ep)', 'dQ(ep)', 'Q(B-n)', 'dQ(B-n)',
       'S(n)', 'dS(n)', 'S(p)', 'dS(p)', 'Q(4B-)', 'dQ(4B-)', 'Q(d,a)',
       'dQ(d,a)', 'Q(p,a)', 'dQ(p,a)', 'Q(n,a)', 'dQ(n,a)',

In [148]:
columns_drop = ["MF", "Cos/LO", "dCos/LO"]
df = df.drop(columns=columns_drop)

## Exploring Missing Values

In [149]:
# df["Neutrons"] = df["Neutrons"].astype(int)
# df["Mass_Number"] = df["Mass_Number"].astype(int)

In [150]:
# df = df.rename(columns={"Z":"Protons", "EL":"Element", "O":"Origin", "Type":"Reaction_Notation"})

In [151]:
df.columns

Index(['Projectile', 'Target_Metastable_State', 'MT',
       'Product_Metastable_State', 'EXFOR_Status', 'Center_of_Mass_Flag',
       'Energy', 'dEnergy', 'Data', 'dData', 'ELV/HL', 'dELV/HL', 'I78',
       'Short_Reference', 'EXFOR_Accession_Number',
       'EXFOR_SubAccession_Number', 'EXFOR_Pointer', 'Z', 'Reaction_Notation',
       'Title', 'Year', 'Author', 'Institute', 'Date', 'Reference',
       'Dataset_Number', 'EXFOR_Entry', 'Reference_Code', 'Projectile_Z',
       'Projectile_A', 'Projectile_N', 'Isotope', 'Element', 'N', 'A', 'O',
       'Mass_Excess', 'dMass_Excess', 'Binding_Energy', 'dBinding_Energy',
       'B_Decay_Energy', 'dB_Decay_Energy', 'Atomic_Mass_Micro',
       'dAtomic_Mass_Micro', 'S(2n)', 'dS(2n)', 'S(2p)', 'dS(2p)', 'Q(a)',
       'dQ(a)', 'Q(2B-)', 'dQ(2B-)', 'Q(ep)', 'dQ(ep)', 'Q(B-n)', 'dQ(B-n)',
       'S(n)', 'dS(n)', 'S(p)', 'dS(p)', 'Q(4B-)', 'dQ(4B-)', 'Q(d,a)',
       'dQ(d,a)', 'Q(p,a)', 'dQ(p,a)', 'Q(n,a)', 'dQ(n,a)', 'Q(g,p)', 'Q(g,n)',
      

In [152]:
# # Assuming Unknown values are ground state
# df["Product_Meta_State"] = df["Product_Meta_State"].astype(str)
# df["Product_Meta_State"] = df["Product_Meta_State"].replace(to_replace="?", value="G")

In [153]:
# df["Element_w_A"] = df["Mass_Number"].astype(str) + df.Element

## Uncertainty Missing Values

The uncertainty is not reported for every experiment. Values are missing either because they appear only in the corresponding paper and not in the EXFOR entry, or because they were never reported at all. In either case, tracking down uncertainties one by one would be very tedious. Instead, we compute the relative uncertainty of the available values, take its mean, and fill each missing uncertainty with that mean relative uncertainty multiplied by the corresponding value (for example, the energy).

**It would be better to assign the mean uncertainty per facility, per author, or per dataset.**
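
The idea can be sketched on a few rows of toy data. The values and the frame name `toy` are made up for illustration. Only the column names mirror the ones used in this notebook, and the grouping by `MT` is one possible choice of group.

```python
import numpy as np
import pandas as pd

# Toy data: two reaction channels (MT) with some missing energy uncertainties.
toy = pd.DataFrame({
    "MT": ["1", "1", "1", "2", "2"],
    "Energy": [100.0, 200.0, 400.0, 10.0, 20.0],
    "dEnergy": [1.0, 4.0, np.nan, np.nan, 1.0],
})

# Relative (fractional) uncertainty wherever the absolute one is known.
toy["Uncertainty_E"] = toy["dEnergy"] / toy["Energy"]

# Fill missing fractions with the mean fraction of the same MT group.
toy["Uncertainty_E"] = toy.groupby("MT")["Uncertainty_E"].transform(
    lambda s: s.fillna(s.mean())
)

# Convert the filled fractions back into absolute uncertainties.
toy["dEnergy"] = toy["dEnergy"].fillna(toy["Energy"] * toy["Uncertainty_E"])

print(toy)
```

Working with the fraction rather than the absolute value keeps the imputed uncertainty proportional to the measured quantity, so a point at 400 units receives a larger absolute uncertainty than one at 100 units.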

In [154]:
df.columns[df.isna().any()].tolist()

['dEnergy', 'dData', 'ELV/HL', 'dELV/HL']

## Exploring Uncertainty

In [155]:
# missing_uncertanties_institute = df[["Institute","dEnergy"]].drop('Institute', 1).isna().groupby(df.Institute, sort=False).sum().reset_index()
# missing_uncertanties_institute = missing_uncertanties_institute[missing_uncertanties_institute.dEnergy > 0]
# missing_uncertanties_institute = missing_uncertanties_institute.sort_values('dEnergy', ascending=False)

# missing_uncertanties_reference = df[["Institute","dEnergy"]].drop('Institute', 1).isna().groupby(df.Institute, sort=False).sum().reset_index()
# missing_uncertanties_reference = missing_uncertanties_reference[missing_uncertanties_reference.dEnergy > 0]
# missing_uncertanties_reference = missing_uncertanties_reference.sort_values('dEnergy', ascending=False)

# missing_uncertanties_reference.to_csv("./Extracted_Text/missing_unc_ref.csv", index=False)
# missing_uncertanties_institute.to_csv("./Extracted_Text/missing_unc_ins.csv", index=False)

In [156]:
df["Uncertainty_E"] = df["dEnergy"]/df["Energy"]
df["Uncertainty_D"] = df["dData"]/df["Data"]
df["Uncertainty_ELV"] = df["dELV/HL"]/df["ELV/HL"]

In [157]:
df_copy = df.copy()

In [158]:
df = df_copy.copy()

In [159]:
df.shape

(4644791, 99)

In [160]:
df[["Uncertainty_E", "Uncertainty_D", "Uncertainty_ELV"]].isna().sum()

Uncertainty_E      3919001
Uncertainty_D       830521
Uncertainty_ELV    4634484
dtype: int64

### Fill by Reaction Channel

In [161]:
df["Uncertainty_E"] = df[["MT", "Uncertainty_E"]].groupby("MT").transform(lambda x: x.fillna(x.mean()))
df["Uncertainty_D"] = df[["MT", "Uncertainty_D"]].groupby("MT").transform(lambda x: x.fillna(x.mean()))
df["Uncertainty_ELV"] = df[["MT", "Uncertainty_ELV"]].groupby("MT").transform(lambda x: x.fillna(x.mean()))

In [162]:
df[["Uncertainty_E", "Uncertainty_D", "Uncertainty_ELV"]].isna().sum()

Uncertainty_E           19
Uncertainty_D            0
Uncertainty_ELV    3942539
dtype: int64

### Fill by Institute

In [163]:
df["Uncertainty_E"] = df[["Institute", "Uncertainty_E"]].groupby("Institute").transform(lambda x: x.fillna(x.mean()))
df["Uncertainty_D"] = df[["Institute", "Uncertainty_D"]].groupby("Institute").transform(lambda x: x.fillna(x.mean()))
df["Uncertainty_ELV"] = df[["Institute", "Uncertainty_ELV"]].groupby("Institute").transform(lambda x: x.fillna(x.mean()))

In [164]:
df[["Uncertainty_E", "Uncertainty_D", "Uncertainty_ELV"]].isna().sum()

Uncertainty_E           0
Uncertainty_D           0
Uncertainty_ELV    109691
dtype: int64

### Fill by Isotope

In [165]:
df["Uncertainty_E"] = df[["Isotope", "Uncertainty_E"]].groupby("Isotope").transform(lambda x: x.fillna(x.mean()))
df["Uncertainty_D"] = df[["Isotope", "Uncertainty_D"]].groupby("Isotope").transform(lambda x: x.fillna(x.mean()))
df["Uncertainty_ELV"] = df[["Isotope", "Uncertainty_ELV"]].groupby("Isotope").transform(lambda x: x.fillna(x.mean()))

In [166]:
df[["Uncertainty_E", "Uncertainty_D", "Uncertainty_ELV"]].isna().sum()

Uncertainty_E      0
Uncertainty_D      0
Uncertainty_ELV    8
dtype: int64

In [167]:
df["Uncertainty_ELV"] = df[["I78", "Uncertainty_ELV"]].groupby("I78").transform(lambda x: x.fillna(x.mean()))

In [168]:
df[["Uncertainty_E", "Uncertainty_D", "Uncertainty_ELV"]].isna().sum()

Uncertainty_E      0
Uncertainty_D      0
Uncertainty_ELV    0
dtype: int64
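
The successive fills above (by `MT`, then `Institute`, then `Isotope`, and finally `I78` for the ELV column) all follow the same pattern, so they could be condensed into a loop. The sketch below assumes that pattern. `fill_by_groups` is a hypothetical helper and not a NucML function, and the toy values are invented.

```python
import numpy as np
import pandas as pd


def fill_by_groups(frame, column, group_keys):
    """Fill NaNs in `column` with group means, trying each key in order."""
    filled = frame[column]
    for key in group_keys:
        # Mean of the current values within each group (NaNs are skipped).
        group_mean = filled.groupby(frame[key]).transform("mean")
        # Only rows that are still missing receive the group mean.
        filled = filled.fillna(group_mean)
    return filled


toy = pd.DataFrame({
    "MT": ["1", "1", "2", "2"],
    "Institute": ["A", "B", "B", "B"],
    "Uncertainty_E": [0.02, np.nan, np.nan, np.nan],
})

# MT "2" has no known value, so its rows fall through to the Institute mean.
toy["Uncertainty_E"] = fill_by_groups(toy, "Uncertainty_E", ["MT", "Institute"])
print(toy)
```

Each later key only affects rows that every earlier key left empty, which matches the order of the cells above.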

In [169]:
df.shape

(4644791, 99)

### With the Fractional Uncertainties Filled, Fill the Absolute Uncertainties

In [170]:
df[["dEnergy", "dData", "dELV/HL"]].isna().sum()

dEnergy    3919001
dData       830521
dELV/HL    4634415
dtype: int64

In [171]:
df.dEnergy = df.dEnergy.fillna(df.Energy * df.Uncertainty_E)
df.dData = df.dData.fillna(df.Data * df.Uncertainty_D)
df["dELV/HL"] = df["dELV/HL"].fillna(df["ELV/HL"] * df["Uncertainty_ELV"])

In [172]:
df.Uncertainty_D = df.Uncertainty_D.replace(to_replace=np.inf, value=0)

In [173]:
df.dData = df.dData.replace(to_replace=np.nan, value=0)
df["dELV/HL"] = df["dELV/HL"].replace(to_replace=np.nan, value=0)

In [174]:
df[["dEnergy", "dData", "dELV/HL"]].isna().sum()

dEnergy    0
dData      0
dELV/HL    0
dtype: int64

In [175]:
df["ELV/HL"] = df["ELV/HL"].replace(to_replace=np.nan, value=0)

In [176]:
df.fillna(value=0, inplace=True)

In [177]:
df.columns

Index(['Projectile', 'Target_Metastable_State', 'MT',
       'Product_Metastable_State', 'EXFOR_Status', 'Center_of_Mass_Flag',
       'Energy', 'dEnergy', 'Data', 'dData', 'ELV/HL', 'dELV/HL', 'I78',
       'Short_Reference', 'EXFOR_Accession_Number',
       'EXFOR_SubAccession_Number', 'EXFOR_Pointer', 'Z', 'Reaction_Notation',
       'Title', 'Year', 'Author', 'Institute', 'Date', 'Reference',
       'Dataset_Number', 'EXFOR_Entry', 'Reference_Code', 'Projectile_Z',
       'Projectile_A', 'Projectile_N', 'Isotope', 'Element', 'N', 'A', 'O',
       'Mass_Excess', 'dMass_Excess', 'Binding_Energy', 'dBinding_Energy',
       'B_Decay_Energy', 'dB_Decay_Energy', 'Atomic_Mass_Micro',
       'dAtomic_Mass_Micro', 'S(2n)', 'dS(2n)', 'S(2p)', 'dS(2p)', 'Q(a)',
       'dQ(a)', 'Q(2B-)', 'dQ(2B-)', 'Q(ep)', 'dQ(ep)', 'Q(B-n)', 'dQ(B-n)',
       'S(n)', 'dS(n)', 'S(p)', 'dS(p)', 'Q(4B-)', 'dQ(4B-)', 'Q(d,a)',
       'dQ(d,a)', 'Q(p,a)', 'dQ(p,a)', 'Q(n,a)', 'dQ(n,a)', 'Q(g,p)', 'Q(g,n)',
      

In [178]:
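# Empirical nuclear radius in fm: R = r0 * A^(1/3), here with r0 = 1.25 fm.
df["Nucleus_Radius"] = 1.25 * np.power(df["A"], 1/3)
# Ratio of a neutron radius (taken here as 0.8 fm) to the nuclear radius.
df["Neutron_Nucleus_Radius_Ratio"] = 0.8 / df["Nucleus_Radius"]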

In [179]:
df[df.Reaction_Notation.str.contains("RAW")].shape

(311512, 101)

## Ordering and Renaming

In [180]:
df.head(2)

Unnamed: 0,Projectile,Target_Metastable_State,MT,Product_Metastable_State,EXFOR_Status,Center_of_Mass_Flag,Energy,dEnergy,Data,dData,ELV/HL,dELV/HL,I78,Short_Reference,EXFOR_Accession_Number,EXFOR_SubAccession_Number,EXFOR_Pointer,Z,Reaction_Notation,Title,Year,Author,Institute,Date,Reference,Dataset_Number,EXFOR_Entry,Reference_Code,Projectile_Z,Projectile_A,Projectile_N,Isotope,Element,N,A,O,Mass_Excess,dMass_Excess,Binding_Energy,dBinding_Energy,B_Decay_Energy,dB_Decay_Energy,Atomic_Mass_Micro,dAtomic_Mass_Micro,S(2n),dS(2n),S(2p),dS(2p),Q(a),dQ(a),Q(2B-),dQ(2B-),Q(ep),dQ(ep),Q(B-n),dQ(B-n),S(n),dS(n),S(p),dS(p),Q(4B-),dQ(4B-),"Q(d,a)","dQ(d,a)","Q(p,a)","dQ(p,a)","Q(n,a)","dQ(n,a)","Q(g,p)","Q(g,n)","Q(g,pn)","Q(g,d)","Q(g,t)","Q(g,He3)","Q(g,2p)","Q(g,2n)","Q(g,a)","Q(p,n)","Q(p,2p)","Q(p,pn)","Q(p,d)","Q(p,2n)","Q(p,t)","Q(p,3He)","Q(n,2p)","Q(n,np)","Q(n,d)","Q(n,2n)","Q(n,t)","Q(n,3He)","Q(d,t)","Q(d,3He)","Q(3He,t)","Q(3He,a)","Q(t,a)",Element_Flag,Uncertainty_E,Uncertainty_D,Uncertainty_ELV,Nucleus_Radius,Neutron_Nucleus_Radius_Ratio
0,neutron,All_or_Total,1,All_or_Total,Dependent,Lab,88200000.0,882000.0,0.03,0.001523,0.0,0.0,Other,"D.F.MEASDAY,ET.AL. (66)",11152,2,No Pointer,0,"0-NN-1(N,TOT),,SIG","NEUTRON TOTAL CROSS SECTIONS FOR NEUTRONS, PRO...",1966,D.F.Measday+,1USAHRV,1980/08/04,"Jour. Nuclear Physics Vol.85, p.142, 1966",11152002,11152,"(J,NP,85,142,6609)",0,1,1,1n,n,1,1,Other,8071.31713,0.00046,0.0,0.0,782.347,0.0,1008665.0,0.00049,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0005,0.0,0.0,2224.566,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,6257.229,0.0,763.755,20577.6194,0.0,I,0.01,0.050773,0.701151,1.25,0.64
1,neutron,All_or_Total,1,All_or_Total,Dependent,Lab,98100000.0,981000.0,0.0291,0.001516,0.0,0.0,Other,"D.F.MEASDAY,ET.AL. (66)",11152,2,No Pointer,0,"0-NN-1(N,TOT),,SIG","NEUTRON TOTAL CROSS SECTIONS FOR NEUTRONS, PRO...",1966,D.F.Measday+,1USAHRV,1980/08/04,"Jour. Nuclear Physics Vol.85, p.142, 1966",11152002,11152,"(J,NP,85,142,6609)",0,1,1,1n,n,1,1,Other,8071.31713,0.00046,0.0,0.0,782.347,0.0,1008665.0,0.00049,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0005,0.0,0.0,2224.566,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,6257.229,0.0,763.755,20577.6194,0.0,I,0.01,0.052103,0.701151,1.25,0.64


In [181]:
# Use this for ordering
new_order = list(df.columns)[:35]
new_order_2 = list(df.columns)[-6:]
new_order.extend(new_order_2)
nuclear_data_target = list(df.columns)[35:-6]
new_order.extend(nuclear_data_target)

# # use these for renaming
# nuclear_data_target_cols = ["Target_" + s for s in nuclear_data_target]

In [182]:
df = df[new_order]

In [183]:
# df = df.rename(columns={"Protons":"Target_Protons", "Neutrons":"Target_Neutrons", 
#                         "Mass_Number":"Target_Mass_Number", "Element":"Target_Element", 
#                         "Flag": "Target_Flag", "Nuc_Radius_fm":"Target_Radius", 
#                         "Neut_Nuc_Rad_Ratio":"Target_Neut_Rad_Ratio", "Element_w_A":"Target_Element_w_A"})
df = df.drop(columns=["Uncertainty_D", "Uncertainty_E", "Uncertainty_ELV"])

In [184]:
# new_order = list(df.columns)[:28]
# nuclear_data_target = list(df.columns)[28:]
# nuclear_data_target_cols = ["Target_" + s for s in nuclear_data_target]
# new_order.extend(nuclear_data_target_cols)

In [185]:
# df.columns = new_order

In [186]:
logging.info("EXFOR CSV: Dropping RAW experimental datapoints...")
df = df[~df.Reaction_Notation.str.contains("RAW")]

INFO:root:EXFOR CSV: Dropping RAW experimental datapoints...


In [187]:
df.shape

(4333279, 101)

In [188]:
df = df[~(df.Data < 0)]

In [189]:
df.shape

(4255409, 101)

In [126]:
logging.info("EXFOR CSV: Saving MF3 NaN Imputed RAW Free EXFOR CSV...")
df.to_csv(os.path.join(heavy_path, "EXFOR_" + mode + "_MF3_AME_no_RawNaN.csv"), index=False)
logging.info("Finished")

INFO:root:EXFOR CSV: Saving MF3 NaN Imputed RAW Free EXFOR CSV...
INFO:root:Finished


# ------------------ EXTRA ------------------
## Adding Compound Nucleus Info

In [192]:
df["Compound_Neutrons"] = df.Target_Neutrons + 1
df["Compound_Mass_Number"] = df.Target_Mass_Number + 1
df["Compound_Protons"] = df.Target_Protons

In [193]:
df_copy = df.copy()

In [194]:
masses = pd.read_csv(os.path.join(ame_dir, "AME_final_properties_no_NaN.csv"))
masses = masses[masses.Flag == "I"]
masses = masses.drop(columns=["Neutrons", "Mass_Number", "Flag"])
masses = masses.rename(columns={'N': 'Neutrons', 'A': 'Mass_Number', "Z":"Protons", "O":"Origin"})

In [195]:
nuclear_data_compound = list(masses.columns)
nuclear_data_compound_cols = ["Compound_" + s for s in nuclear_data_compound]

In [196]:
masses.columns = nuclear_data_compound_cols

In [197]:
masses.head()

Unnamed: 0,Compound_Neutrons,Compound_Protons,Compound_Mass_Number,Compound_EL,Compound_Origin,Compound_Mass_Excess,Compound_dMass_Excess,Compound_Binding_Energy,Compound_dBinding_Energy,Compound_B_Decay_Energy,Compound_dB_Decay_Energy,Compound_Atomic_Mass_Micro,Compound_dAtomic_Mass_Micro,Compound_S(2n),Compound_dS(2n),Compound_S(2p),Compound_dS(2p),Compound_Q(a),Compound_dQ(a),Compound_Q(2B-),Compound_dQ(2B-),Compound_Q(ep),Compound_dQ(ep),Compound_Q(B-n),Compound_dQ(B-n),Compound_S(n),Compound_dS(n),Compound_S(p),Compound_dS(p),Compound_Q(4B-),Compound_dQ(4B-),"Compound_Q(d,a)","Compound_dQ(d,a)","Compound_Q(p,a)","Compound_dQ(p,a)","Compound_Q(n,a)","Compound_dQ(n,a)","Compound_Q(g,p)","Compound_Q(g,n)","Compound_Q(g,pn)","Compound_Q(g,d)","Compound_Q(g,t)","Compound_Q(g,He3)","Compound_Q(g,2p)","Compound_Q(g,2n)","Compound_Q(g,a)","Compound_Q(p,n)","Compound_Q(p,2p)","Compound_Q(p,pn)","Compound_Q(p,d)","Compound_Q(p,2n)","Compound_Q(p,t)","Compound_Q(p,3He)","Compound_Q(n,2p)","Compound_Q(n,np)","Compound_Q(n,d)","Compound_Q(n,2n)","Compound_Q(n,t)","Compound_Q(n,3He)","Compound_Q(d,t)","Compound_Q(d,3He)","Compound_Q(3He,t)","Compound_Q(3He,a)","Compound_Q(t,a)"
0,1,0,1,n,Other,8071.31713,0.00046,0.0,0.0,782.347,0.0,1008665.0,0.00049,15404.483723,162.252148,13771.880283,158.524008,-1125.3436,142.081942,-232.1475,160.8227,-6859.135662,158.184195,-7754.629285,161.152354,0.0,0.0,6889.086305,162.896488,-343.812798,175.128817,11405.545508,176.651049,5916.436032,171.776354,6730.114706,163.516658,-6889.086305,-0.0,-14665.548392,-12440.982392,-13897.428868,-13847.504694,-13771.880283,-15404.483723,-1125.3436,0.0005,-6889.086305,-0.0,2224.566,-8536.975785,-6922.688823,-6947.507992,-6076.789162,-6889.086305,-4664.520305,-0.0,-6183.753392,-6053.839883,6257.229,-1395.611905,763.755,20577.6194,12924.778595
1,0,1,1,H,Other,7288.97061,9e-05,0.0,0.0,18244.328,289.9558,1007825.0,9e-05,2025.412,292.506,13771.880283,158.524008,-1125.3436,142.081942,13762.268,719.024,-6859.135662,158.184195,17514.9925,362.0875,1096.973333,256.595,0.0,0.0,8007.5,1511.5,20717.915,0.0,20613.86,50.0,6730.114706,163.516658,-0.0,-1096.973333,-5353.1789,-3128.6129,799.9951,-13847.504694,-13771.880283,-2025.412,-1125.3436,17461.9815,-0.0,-1096.973333,1127.592667,16732.646,6456.3829,2364.8615,-6076.789162,-0.0,2224.566,-1096.973333,3128.6161,-6053.839883,5160.255667,5493.4744,18225.736,19480.646067,19813.8649
2,1,1,2,H,Other,13135.72176,0.00011,1112.283,0.0,18244.328,289.9558,2014102.0,0.00012,2025.412,292.506,13771.880283,158.524008,-1125.3436,142.081942,13762.268,719.024,-6859.135662,158.184195,17514.9925,362.0875,2224.57,0.0,2224.57,0.0,8007.5,1511.5,23846.53,0.0,20613.86,50.0,6730.114706,163.516658,-2224.57,-2224.57,-2224.5639,0.0021,799.9951,-13847.504694,-13771.880283,-2025.412,-1125.3436,17461.9815,-2224.57,-2224.57,-0.004,16732.646,6456.3829,5493.4765,-6076.789162,-2224.57,-0.004,-2224.57,6257.2311,-6053.839883,4032.659,3268.9044,18225.736,18353.0494,17589.2949
3,2,1,3,H,Other,14949.80993,0.00022,2827.265,0.0,18.592,0.0,3016049.0,0.00023,8481.79,0.0,13771.880283,158.524008,-1125.3436,142.081942,-13717.0,2000.0,-6859.135662,158.184195,17514.9925,362.0875,6257.23,0.0,1112.285,0.0,8007.5,1511.5,17589.3,0.0,19813.86,0.0,6730.114706,163.516658,-1112.285,-6257.23,-8481.7939,-6257.2279,-0.0049,-13847.504694,-13771.880283,-8481.79,-1125.3436,-763.7545,-1112.285,-6257.23,-4032.664,16732.646,0.0049,-763.7535,-6076.789162,-1112.285,1112.281,-6257.23,0.0011,-6053.839883,-0.001,4381.1894,0.0,14320.3894,18701.5799
4,1,2,3,He,Other,14931.21793,0.00021,2572.68,0.0,-13736.0,2000.0,3016029.0,0.00022,4013.16,30.3,7718.04,0.0,367.5,10.0,12743.205,359.296667,-6859.135662,158.184195,-2571.224286,344.341429,3176.184286,29.417143,5493.47,0.0,14022.91,52.653333,18353.05,0.0,4173.344286,210.882857,20577.62,0.0,-5493.47,-3176.184286,-7718.0439,-5493.4779,-15640.520614,0.0006,-7718.04,-4013.16,367.5,-14518.3465,-5493.47,-3176.184286,-951.618286,-3353.570786,4468.6349,-0.0035,-6076.789162,-5493.47,-3268.904,-3176.184286,763.7511,0.0004,3081.044714,0.0044,-13754.592,17401.435114,14320.3949


In [198]:
df = df.reset_index(drop=True)
masses = masses.reset_index(drop=True)

df = df.merge(masses, on=['Compound_Neutrons', 'Compound_Protons'], how='left')

In [199]:
df[df.isna().any(axis=1)].Target_Element_w_A.unique()

array(['1n'], dtype=object)

In [200]:
df = df.drop(columns=["Compound_Mass_Number_y"])
df = df.rename(columns={'Compound_Mass_Number_x': 'Compound_Mass_Number'})

In [201]:
q_value = [col for col in df.columns if 'Q' in col]
df = df.drop(columns=q_value)

In [202]:
df.shape

(4644791, 66)

In [203]:
df_no_raw = df[~df.Reaction_Notation.str.contains("RAW")]

df_no_raw = df_no_raw[~(df_no_raw.Data < 0)]

df_no_raw.shape

(4255409, 66)

In [204]:
df_no_raw.EXFOR_Status.value_counts()

Other    2181990
A        1691538
C         231849
D         126722
P          20622
O           2513
R            175
Name: EXFOR_Status, dtype: int64

In [205]:
df_no_raw.to_csv(os.path.join(heavy_dir, "EXFOR_neutrons_MF3_AME_no_NaNRaw.csv"), index=False)