XML model

All models are stored in XML format. The format is simple and self-explanatory.

For example, a model generated by nn-init will look like this:

<transform type="Affine" input-dim="20" output-dim="64" momentum="0.100000" learning-rate="0.100000" >
  <weight></weight>
  <bias></bias>
</transform>
<transform type="Sigmoid" input-dim="64" output-dim="64" />
<transform type="Affine" input-dim="64" output-dim="64" momentum="0.100000" learning-rate="0.100000" >
  <weight></weight>
  <bias></bias>
</transform>
<transform type="Sigmoid" input-dim="64" output-dim="64" />
<transform type="Affine" input-dim="64" output-dim="2" momentum="0.100000" learning-rate="0.100000" >
  <weight></weight>
  <bias></bias>
</transform>
<transform type="Softmax" input-dim="2" output-dim="2" />

Don't worry about indentation (tabs vs. spaces). I use the 3rd-party library rapidxml, which is very robust.
BUT be careful: rapidxml is a lightweight XML parser. It can only parse

<transform type="Sigmoid" input-dim="2" output-dim="2" />

NOT this (which is common in HTML)

<transform type=Sigmoid input-dim=2 output-dim=2 />

Remember to quote attribute values, i.e. write type="Sigmoid", not type=Sigmoid!
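
If you edit a model by hand, it's worth checking that the file is well-formed before handing it to nn-train. Here's a minimal sketch using Python's standard-library XML parser (not part of this project; "model.xml" is a placeholder filename). Since the model file is a flat sequence of <transform> elements with no single root, we wrap it in a dummy root first:

import xml.etree.ElementTree as ET

with open("model.xml") as f:
    content = f.read()

# Wrap the flat list of <transform> elements in a dummy root element.
try:
    root = ET.fromstring("<model>" + content + "</model>")
except ET.ParseError as e:
    raise SystemExit("Malformed XML (did you quote every attribute?): %s" % e)

for t in root.iter("transform"):
    print(t.get("type"), t.get("input-dim"), "->", t.get("output-dim"))

Like rapidxml, this parser rejects unquoted attribute values, so it catches the mistake shown above.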

Also, empty nodes like <weight></weight> and <bias></bias> are okay.
My program nn-train will fill them with random numbers (see normalized uniform distribution).
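
"Normalized uniform" presumably refers to the normalized initialization of Glorot & Bengio (2010), where weights are drawn from U[-sqrt(6/(fan_in+fan_out)), +sqrt(6/(fan_in+fan_out))]. Here's a minimal numpy sketch under that assumption (the exact formula nn-train uses isn't spelled out on this page):

import numpy as np

def normalized_uniform(fan_in, fan_out):
    # Normalized uniform initialization (Glorot & Bengio, 2010):
    # W ~ U[-sqrt(6 / (fan_in + fan_out)), +sqrt(6 / (fan_in + fan_out))]
    limit = np.sqrt(6.0 / (fan_in + fan_out))
    return np.random.uniform(-limit, limit, size=(fan_out, fan_in))

# e.g. the first Affine layer above (input-dim="20", output-dim="64"):
W = normalized_uniform(20, 64)
b = np.zeros(64)  # zero biases are a common default (an assumption here)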

Changing Activation Functions

You can change the activation function by editing the type attribute in <transform ... /> like this:

<transform type="tanh" input-dim="2" output-dim="2" />

The type attribute is case-insensitive: either type="tanh" or type="Tanh" is fine, since the value is converted to lower case when parsed.

Here's the list of activation functions currently available:

  • Sigmoid
  • Tanh
  • ReLU
  • Softplus
  • Softmax (last layer only)
  • Convolution
  • SubSample
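
The element-wise ones follow the standard textbook definitions; here's a numpy sketch under that assumption (Convolution and SubSample operate on whole feature maps rather than element-wise, so they're omitted):

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))   # squashes to (0, 1)

def relu(x):
    return np.maximum(0.0, x)         # rectified linear unit

def softplus(x):
    return np.log1p(np.exp(x))        # smooth approximation of ReLU

def softmax(x):
    e = np.exp(x - np.max(x))         # shift by max for numerical stability
    return e / e.sum()                # normalize to a probability vector

tanh = np.tanh                        # hyperbolic tangent, squashes to (-1, 1)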

Add a New Layer

Take dropout for example. You can add it after the activation functions (e.g. Sigmoid) in the above model by inserting:

<transform type="Dropout" input-dim="64" output-dim="64" dropout-ratio="0.3"/>

The dropout-ratio attribute specifies what fraction of hidden nodes to drop.
In this case, dropout-ratio="0.3" means 30% of the hidden nodes will be randomly turned off during training.
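
A minimal numpy sketch of the idea (the rescaling below is "inverted dropout"; whether nn-train rescales at training time or at test time isn't specified on this page, so treat the convention as an assumption):

import numpy as np

def dropout(h, ratio=0.3):
    # Keep each hidden node with probability 1 - ratio...
    mask = np.random.rand(*h.shape) >= ratio
    # ...and rescale the survivors so the expected activation is unchanged.
    return h * mask / (1.0 - ratio)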

The resulting model will be:

<transform type="Affine" input-dim="20" output-dim="64" momentum="0.100000" learning-rate="0.100000" >
  <weight></weight>
  <bias></bias>
</transform>
<transform type="Sigmoid" input-dim="64" output-dim="64" />
<transform type="Dropout" input-dim="64" output-dim="64" dropout-ratio="0.3"/>
<transform type="Affine" input-dim="64" output-dim="64" momentum="0.100000" learning-rate="0.100000" >
  <weight></weight>
  <bias></bias>
</transform>
<transform type="Sigmoid" input-dim="64" output-dim="64" />
<transform type="Dropout" input-dim="64" output-dim="64" dropout-ratio="0.3"/>
<transform type="Affine" input-dim="64" output-dim="2" momentum="0.100000" learning-rate="0.100000" >
  <weight></weight>
  <bias></bias>
</transform>
<transform type="Softmax" input-dim="2" output-dim="2" />