# XML Conversion of a Material

This notebook walks through converting an XML material file in the provided format into a protocol buffer `Material` message. It only works on the XML files that we generated for BART-Lite, which are in `/lib/kaist/xml`.

# Step 1: Parse XML File

This requires the `xml.etree.ElementTree` module from the Python standard library, which parses the XML file for us.

In [1]:
import xml.etree.ElementTree as ET  # cElementTree was removed in Python 3.9

Now pick one of the XML files. In this case we will convert `uo2_33.xml`.

In [2]:
filename = "../lib/kaist/xml/uo2_33.xml"

Make a dictionary to hold all the data we are going to extract:

In [3]:
data = {}

Parse the XML in the file by getting the root.

In [4]:
root = ET.parse(filename).getroot()

Get the `<name>` and `<id>` XML fields that are on the top level and store them in `data`. Also get the `<prop>` root that will hold physical properties, and `<grp_structures>` which will have cross-sections.

In [5]:
for el in root.findall('./material/*'):  # every direct child of <material>
    if el.tag == "prop":
        prop_root = el
    elif el.tag == "grp_structures":
        grp_structs = el
    else:
        data.update({el.tag: el.text})

In [6]:
data

{'id': 'uo2_33', 'name': 'UO2 3.3% fuel cell'}
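For reference, the loop above implies a layout roughly like the one in the sketch below. This is only an illustration: the root tag `<materials>`, the trimmed sample values, and the names `sample_data` and `sample_nodes` are assumptions, not copied from the actual file.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed-down document; the real files may differ in detail.
SAMPLE = """<materials>
  <material>
    <name>UO2 3.3% fuel cell</name>
    <id>uo2_33</id>
    <prop></prop>
    <grp_structures>
      <grp_struct>
        <chi>0.5,0.5</chi>
      </grp_struct>
    </grp_structures>
  </material>
</materials>"""

sample_root = ET.fromstring(SAMPLE)
sample_data = {}   # plain text fields such as <name> and <id>
sample_nodes = {}  # subtrees that need further processing

for el in sample_root.findall("./material/*"):
    if el.tag in ("prop", "grp_structures"):
        sample_nodes[el.tag] = el
    else:
        sample_data[el.tag] = el.text

print(sample_data)
```

Keeping the subtrees in a dictionary instead of separate variables is just a design choice for the sketch; it makes it easy to see which tags were set aside.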

Now we will get the values of $\chi$, $\Sigma_t$, $\Sigma_a$, etc. These are all vector quantities, so we will store each one in a list. The parser returns everything as a string, so the values need to be converted to floats. There is only one tag in `grp_structs`, `<grp_struct>`, and it contains `<energy_groups>`, `<chi>`, and `<xsec>`, the last of which holds the cross-sections. We will extract the vector data from these and leave the one matrix quantity, `<sig_s>`, for the next step:

In [7]:
for el in grp_structs[0]:
    if el.tag == "chi":
        data.update({"chi": [float(v) for v in el.text.split(',')]})
    elif el.tag == "energy_groups":
        data.update({"energy_groups": [float(v) for v in el.text.split(',')]})
    elif el.tag == "xsec":
        xsec_root = el
print(data) # Check that we got the right values

{'name': 'UO2 3.3% fuel cell', 'id': 'uo2_33', 'energy_groups': [20000000.0, 1353000.0, 9119.0, 3.928, 0.6251, 0.1457, 0.05692, 0.0], 'chi': [0.59252, 0.40714, 0.00033193, 0.0, 0.0, 0.0, 0.0]}
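The string-to-float conversion used above can be wrapped in a small helper. The sketch below assumes a hypothetical name, `parse_float_list`, and reuses the values printed above. It also checks that `energy_groups` has one more entry than `chi`, which is consistent with it listing group boundaries (an interpretation of the output, not something the file states).

```python
def parse_float_list(text):
    """Convert comma-separated text such as '1.0, 2.0' into a list of floats."""
    # float() ignores surrounding whitespace; empty tokens are skipped.
    return [float(v) for v in text.split(",") if v.strip()]

bounds = parse_float_list(
    "20000000.0, 1353000.0, 9119.0, 3.928, 0.6251, 0.1457, 0.05692, 0.0"
)
chi = parse_float_list("0.59252,0.40714,0.00033193,0.0,0.0,0.0,0.0")

n_groups = len(chi)
# Sanity check: one more boundary than there are groups.
assert len(bounds) == n_groups + 1
print(n_groups)
```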


In [8]:
for el in xsec_root:
    if el.tag == "sig_s":
        sig_s_root = el
    else:
        data.update({el.tag: [float(v) for v in el.text.split(',')]})
print(data)

{'name': 'UO2 3.3% fuel cell', 'id': 'uo2_33', 'energy_groups': [20000000.0, 1353000.0, 9119.0, 3.928, 0.6251, 0.1457, 0.05692, 0.0], 'chi': [0.59252, 0.40714, 0.00033193, 0.0, 0.0, 0.0, 0.0], 'sig_t': [0.11113, 0.28844, 0.45382, 0.46398, 0.68795, 0.98919, 1.6809], 'sig_a': [0.0047825, 0.0020899, 0.02669, 0.018674, 0.060669, 0.09879, 0.18302], 'nu_sig_f': [0.011458, 0.001054, 0.0123, 0.022601, 0.095993, 0.15886, 0.29556], 'kappa_sig_f': [1.3977e-13, 1.3885e-14, 1.6387e-13, 3.011e-13, 1.2789e-12, 2.1164e-12, 3.9375e-12]}


The matrix quantity needs to be flattened into a single list, which takes a few string manipulations.

In [9]:
data.update({"sig_s": [float(v) for v in sig_s_root.text.replace('\n\t','').replace(';',',').split(',')]})
print(data["sig_s"])

[0.12239000000000001, 0.06713, 0.0002876, 0.0, 0.0, 0.0, 0.0, 0.0, 0.42991999999999997, 0.051655999999999994, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.74242, 0.048867, 0.0068715, 0.0012435, 0.00076773, 0.0, 0.0, 0.0, 0.54727, 0.19917, 0.027343, 0.012226, 0.0, 0.0, 0.0, 0.0047185000000000005, 0.6695, 0.20942, 0.065028, 0.0, 0.0, 0.0, 0.0, 0.14190999999999998, 0.80934, 0.25445, 0.0, 0.0, 0.0, 0.0, 0.05942000000000001, 0.41229, 1.3222]
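The chained `replace` calls depend on the exact whitespace in the file. A slightly more tolerant variant is sketched below; it assumes, as the cell above does, that rows are separated by `;` and entries by `,`. The name `flatten_matrix` and the sample text are hypothetical.

```python
import re

def flatten_matrix(text):
    """Flatten 'a,b;c,d' style text (rows separated by ';') into a list of floats."""
    tokens = re.split(r"[;,]", text)
    # Strip-based filtering drops empty tokens left by trailing separators
    # or by newlines and tabs between rows.
    return [float(t) for t in tokens if t.strip()]

sample = "1.0,2.0;\n\t3.0,4.0"
flat = flatten_matrix(sample)
print(flat)
```

For the file used here the flattened list has 49 entries, which matches a 7 × 7 matrix.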


We now have a `data` dictionary with all the data from the XML file:

In [10]:
print(data)

{'name': 'UO2 3.3% fuel cell', 'id': 'uo2_33', 'energy_groups': [20000000.0, 1353000.0, 9119.0, 3.928, 0.6251, 0.1457, 0.05692, 0.0], 'chi': [0.59252, 0.40714, 0.00033193, 0.0, 0.0, 0.0, 0.0], 'sig_t': [0.11113, 0.28844, 0.45382, 0.46398, 0.68795, 0.98919, 1.6809], 'sig_a': [0.0047825, 0.0020899, 0.02669, 0.018674, 0.060669, 0.09879, 0.18302], 'nu_sig_f': [0.011458, 0.001054, 0.0123, 0.022601, 0.095993, 0.15886, 0.29556], 'kappa_sig_f': [1.3977e-13, 1.3885e-14, 1.6387e-13, 3.011e-13, 1.2789e-12, 2.1164e-12, 3.9375e-12], 'sig_s': [0.12239000000000001, 0.06713, 0.0002876, 0.0, 0.0, 0.0, 0.0, 0.0, 0.42991999999999997, 0.051655999999999994, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.74242, 0.048867, 0.0068715, 0.0012435, 0.00076773, 0.0, 0.0, 0.0, 0.54727, 0.19917, 0.027343, 0.012226, 0.0, 0.0, 0.0, 0.0047185000000000005, 0.6695, 0.20942, 0.065028, 0.0, 0.0, 0.0, 0.0, 0.14190999999999998, 0.80934, 0.25445, 0.0, 0.0, 0.0, 0.0, 0.05942000000000001, 0.41229, 1.3222]}


# Step 2: Store in Protocol Buffer

**OPTIONAL**: Recompile protocol buffer

In [4]:
!protoc --python_out=./proto -I=../proto ../proto/material.proto 

Import the created library:

In [5]:
import proto.material_pb2 as mat_proto

Although the top-level message is `Library`, which is meant for holding multiple materials, we can make _just_ a material and store it in its own file; that is what we will do here. Because the library is what holds the group structure data, we need to indicate the group structure in the name of the material so that we have a record of it.

First, make a new material. `Material` is a `message`, so it is an object in Python.

In [13]:
uo2_33 = mat_proto.Material() #Create a new Material object

Store the upper level data in the material.

In [14]:
uo2_33.full_name = data["name"]
uo2_33.id = data["id"]
uo2_33.abbreviation = data["id"]
uo2_33

full_name: "UO2 3.3% fuel cell"
abbreviation: "uo2_33"
id: "uo2_33"

We have no scalar properties (such as density) for this material, but we do have vector quantities. We will store them in `VectorProperty` objects. Each one is identified by a value from an `enum` called `VectorId`. If a quantity has no corresponding entry, one needs to be added to the `.proto` file and the protobuf recompiled. Here, I make a mapping from the keys in `data` to the IDs for easy looping:

In [15]:
key_map = {"energy_groups": mat_proto.Material.ENERGY_GROUPS,
          "sig_t": mat_proto.Material.SIGMA_T,
          "sig_a": mat_proto.Material.SIGMA_A,
          "chi": mat_proto.Material.CHI,
          "nu_sig_f": mat_proto.Material.NU_SIG_F,
          "kappa_sig_f": mat_proto.Material.KAPPA_SIG_F}

In [16]:
vector_properties = []
for key, value in key_map.items():
    vector_prop = mat_proto.Material.VectorProperty()
    vector_prop.id = value
    vector_prop.value.extend(data[key])
    #print(vector_prop) # printed first to make sure it worked
    vector_properties.append(vector_prop)

Add the list of vector properties to `uo2_33`:

In [17]:
uo2_33.vector_property.extend(vector_properties)

Now we need to add the matrix property `sig_s` in a similar way. A `MatrixProperty` has two extra values that give the dimensions of the matrix, because protocol buffers have no matrix type and the entries are stored as a flat list.

In [18]:
sig_s_matrix = mat_proto.Material.MatrixProperty()
sig_s_matrix.id = mat_proto.Material.SIGMA_S
sig_s_matrix.value.extend(data["sig_s"])

In [19]:
uo2_33.matrix_property.extend([sig_s_matrix])
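Because the matrix is stored as a flat list, looking up a single entry requires the matrix width. The sketch below assumes row-major order, meaning the XML rows concatenated in the order they appear, and uses a hypothetical helper, `matrix_entry`, on a made-up 3 × 3 example. Which index corresponds to the source group and which to the destination group is not stated here.

```python
def matrix_entry(flat, n_cols, row, col):
    """Return element (row, col) of a row-major matrix stored as a flat list."""
    return flat[row * n_cols + col]

# Illustrative 3 x 3 matrix, flattened row by row.
flat = [0.1, 0.2, 0.3,
        0.4, 0.5, 0.6,
        0.7, 0.8, 0.9]
n = 3
# The flat list must hold exactly n * n entries.
assert len(flat) == n * n

print(matrix_entry(flat, n, 1, 2))
```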

# Step 3: Save Protocol Buffer

Now we will serialize the material and write the resulting bytes to a file.

In [20]:
# The context manager closes the file even if the write fails
with open('../lib/kaist/uo2_33', 'wb') as f:
    f.write(uo2_33.SerializeToString())

You can see that the protocol buffer is about half the size of the XML file (816 bytes versus 1577 bytes), and it can be read back through the generated classes without writing a custom parser.

In [21]:
!ls -l ../lib/kaist/ | grep uo2_33

-rw-r--r-- 1 ablank ablank  816 Jun 22 16:57 uo2_33


In [22]:
!ls -l ../lib/kaist/xml/ | grep uo2_33

-rw-r--r-- 1 ablank ablank 1577 Jun 22 16:27 uo2_33.xml
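The same comparison can be made from Python instead of the shell. The sketch below assumes a hypothetical helper name, `size_ratio`; the final lines just repeat the arithmetic for the two sizes listed above.

```python
import os

def size_ratio(small_path, large_path):
    """Return the size of small_path as a fraction of the size of large_path."""
    return os.path.getsize(small_path) / os.path.getsize(large_path)

# With the sizes listed above (816 bytes and 1577 bytes):
ratio = 816 / 1577
print(round(ratio, 3))
```

With the files from this notebook, the call would be `size_ratio('../lib/kaist/uo2_33', '../lib/kaist/xml/uo2_33.xml')`.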
