Developer_Information

Developer Information:

Development workflow

Create a new development branch

[~]> cd projects/raven
[~/projects/raven]> git checkout devel
[~/projects/raven]> git checkout -b username/branch_name

Download an existing development branch

[~]> cd projects/raven
[~/projects/raven]> git fetch
[~/projects/raven]> git checkout username/branch_name

Do development

#edit files
[~/projects/raven]> git add -u
#create a new file
[~/projects/raven]> git add new_file
[~/projects/raven]> git commit -m "a meaningful commit message"

Push

Note that this assumes that:

[~/projects/raven]> git config --global push.default simple

is used.

[~/projects/raven]> git push --set-upstream origin username/branch_name

After the first push, then:

[~/projects/raven]> git push

can be used.

Tagged version

Certain versions are tagged.

The list of tags can be seen with

[~/projects/raven]> git tag -n

The tags can be checked out:

[~/projects/raven]> git checkout tag_number_2

Conversion Scripts

Whenever a pull request is issued that renders existing input formats invalid, a conversion script needs to be created to fix old input files. The script should have the following properties:

  • Resides in the directory raven/scripts/conversionScripts
  • Is a python module that has a method of the following form. Notably, it is called convert, it accepts one parameter tree, and returns one parameter tree.
def convert(tree):
  " Converts input files to be compatible with pull request 1234.
    @In, tree, xml.etree.ElementTree object, the entire contents of an input file
    @Out, tree, xml.etree.ElementTree object, the modified contents of an input file
  "
  # ...script body...
  return tree
  • The if __name__=='__main__' block should be used in the script to accept either a single filename or a list of filenames (such as produced by the command-line wildcard *) and, for each one, parse the input file, apply the convert operation, and write the new file (see the sketch after this list).
  • The script is run from the directory the input file is in. For example, if the script is convscript.py, the deprecated input file is old.xml, and old.xml resides in raven/input/myoldinputs, the command line should appear as
user@machine:~/projects/raven/input $ python ../../scripts/conversionScripts/convscript.py old.xml

and the modified old.xml should appear in the same location as the old one.

  • Before overwriting a file, the script must create a backup of that file by appending a .bak extension (for example, old.xml.bak). If that backup already exists, the script should exit without writing and warn the user that, if they are sure they want to run the script, they should first remove the .bak file (named explicitly in the warning).
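
A minimal sketch of such a driver block follows (it assumes the convert function above is defined in the same module; the exact file handling is an assumption, not a required implementation, but the backup and warning behavior match the rules above):

import os
import shutil
import sys
import xml.etree.ElementTree as ET

if __name__ == '__main__':
  # accept one or more file names (e.g. as expanded from a shell wildcard)
  for fileName in sys.argv[1:]:
    backup = fileName + '.bak'
    if os.path.exists(backup):
      # never overwrite an existing backup; tell the user how to proceed
      print('WARNING: %s already exists; if you are sure you want to convert %s, remove %s first.' % (backup, fileName, backup))
      continue
    tree = ET.parse(fileName)
    tree = convert(tree)
    # back up the original before overwriting it
    shutil.copyfile(fileName, backup)
    tree.write(fileName)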

Python Code Standard

Internal Parallelization for Internal Objects

Currently, RAVEN supports two distinct methods for parallel execution of internal objects/actions (ROMs, External Models, etc.):

  • multi-threading;
  • parallel python.

A function/method can be executed in parallel through the JobHandler class method addInternal:

addInternal(self,Input,functionToRun,identifier,metadata=None, modulesToImport = [], forceUseThreads = False):
    """
     Method to add an internal run (function execution)
     @ In, Input, list, list of Inputs that are going 
                        to be passed to the function to be executed as *args
     @ In, functionToRun,function or method, the function that needs to be executed
     @ In, identifier, string, the job identifier
     @ In, metadata, dict, optional, dictionary of metadata associated to this run
     @ In, modulesToImport, list, optional, list of modules 
                                            that need to be imported for internal parallelization
                                            (parallel python). This list should be generated with
                                            the method returnImportModuleString in utils.py
     @ In, forceUseThreads, bool, optional, flag that, if True, is going to force the usage of
                                            multi-threading even if parallel python is activated
     @ Out, None
    """

In case the developer wants to use the parallel python implementation (the equivalent of MPI for Python), the modulesToImport parameter needs to be specified. All bound methods whose classes are based on the BaseType class automatically have a list of their module dependencies in the variable self.mods. If the method uses functions/methods defined outside the caller class (for example, the ROM class uses the SupervisedLearning module), the list of modules needs to be extended (in the main class, e.g. ROM) using the method returnImportModuleString in utils.py. In addition, Parallel Python relies on the pickle module. If the new class being developed is not fully pickle-able, the developer needs to implement the methods __getstate__ and __setstate__.
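
For the pickle case, a minimal self-contained sketch of __getstate__ and __setstate__ follows (the class and its unpicklable member are hypothetical examples, not RAVEN code):

import pickle

class MyModel(object):
  """ Hypothetical class holding a member that cannot be pickled """
  def __init__(self):
    self.data = [1, 2, 3]
    self.logFile = open('mymodel.log', 'a')  # file handles cannot be pickled

  def __getstate__(self):
    # copy the instance dictionary and drop the unpicklable member
    state = self.__dict__.copy()
    del state['logFile']
    return state

  def __setstate__(self, state):
    # restore the picklable members and rebuild the rest
    self.__dict__.update(state)
    self.logFile = open('mymodel.log', 'a')

restored = pickle.loads(pickle.dumps(MyModel()))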

Testing

When adding new code, it may be necessary to test that the newly constructed objects can be pickled and unpickled for use in parallel, and that no race conditions or other parallel problems arise. As such, two modes of parallelism should be tested when adding new code:

  • qsub/MPI parallelism
  • internal multithreading

The following paragraphs should help guide you in writing a test case for each.

Internal Parallel Testing

The easiest way to test that internal parallelism is working is to replicate one of your existing test cases and add the <internalParallel> node to the <RunInfo> block:

  <RunInfo>
    ...
    <internalParallel>True</internalParallel>
  </RunInfo>

Cluster Testing

Under raven/tests/cluster_tests/, there is a script called test_qsubs.sh which holds the parallel execution test cases that run on the HPC. Again, the simplest way to get a test case running is to replicate an existing test case and modify the <RunInfo> block to contain both the <internalParallel> node and the <mode> node with <runQSUB/>.

  <RunInfo>
    ...
    <internalParallel>True</internalParallel>
    <mode>mpi<runQSUB/></mode>
  </RunInfo>

You can then add your test to test_qsubs.sh by following an existing example in that file, such as the one below:

# Go into the appropriate test directory
cd InternalParallel/

#clean the output directory first
rm -Rf InternalParallelExtModel/*.csv

#Run the parallel RAVEN input
python ../../../framework/Driver.py test_internal_parallel_extModel.xml

#Wait for disk to propagate
sleep 2

# Go into the output directory
cd InternalParallelExtModel/

# Count the number of files
lines=`ls *.csv | wc -l`

# Go back up
cd ..

# Test whether the correct number of files were generated
if test $lines -eq 28; then
    echo PASS paralExtModel
else
    echo FAIL paralExtModel
    num_fails=$(($num_fails+1))
fi

# Restore the working directory for the next test case
cd ..

Currently, these tests only verify that the correct number of files were created. More robust testing can be implemented here.

Errors, Warnings, and Messages

As of !180, there is a standardized method for errors, messages, and warnings within RAVEN. This allows for custom error and message handling in the future. Additionally, the binary debug flag was replaced with the discrete verbosity setting, which has the following levels:

  • silent: only Error messages are displayed,
  • quiet: Error and Warning messages are displayed,
  • all: Error, Warning, and Message messages are displayed (default),
  • debug: Error, Warning, Message, and Debug messages are all displayed.

verbosity can be set locally and globally, wherever the old debug was set. Additionally, errors/warnings/messages/debugs can all have their verbosity level set when called using the optional keyword argument verbosity='all', where 'all' can be any of the four options listed above.
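
For example (a sketch; the message text is arbitrary), a message can be restricted to debug-level output with the verbosity keyword:

# only printed when the effective verbosity level is 'debug'
self.raiseAMessage('detailed solver state: '+str(state), verbosity='debug')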

Errors

Each RAVEN object now inherits from MessageHandler.MessageUser, and should be initialized to know about the Simulation's MessageHandler.MessageHandler instance.

To raise an error, call self.raiseAnError(errortype, msg). errortype is a Python exception type, such as TypeError or IOError. msg is the message you want to show to the user. For instance, both of these error calls are acceptable:

if not found: self.raiseAnError(IOError, 'Keyword not found in input!')

if x>3: self.raiseAnError(ValueError, 'x cannot be greater than 3!')

Nowhere in RAVEN (except utils) should it be necessary to use syntax such as raise IOError(msg).

Assertions

Python assert statements are a way to quickly check something before moving forward. They can be skipped by running python -O <script>.py. RAVEN takes advantage of this by using assertions when developing code that is exclusively internal; that is, the user cannot fail the assertion because of an input error, but the assertion might fail because of internal communication when new development is performed.

For example, when adding a realization to a DataObject, we might assert that the type of the realization is a dict, as

assert(type(rls)==dict)

The user never directly interfaces with this code, so the assertion cannot be failed because of an input error. However, the assertion is useful in the event development is being performed which (intentionally or unintentionally) changes the type of rls.

Because it can be removed using Python's opt mode, this kind of assertion is preferable to the following:

# not like this
if type(rls) != dict:
  self.raiseAnError(TypeError,'Wrong type sent in as a realization!')

which is slower than asserting and can't be optimized out.

By default, when raven_framework is run, the -O option is used, so assertions are removed. To preserve assertions when debugging, run raven_framework -D to enter development mode. If you develop regularly, consider aliasing raven_framework -D to raven, or something similar.

Warnings, Messages, and Debugs

To print a warning, message, or debug statement to the standard output, call self.raiseAWarning(msg), self.raiseAMessage(msg), or self.raiseADebug(msg), respectively. msg is the message to display. For example:

if not found: self.raiseAWarning('order not found in input; defaulting to 2.')

self.raiseAMessage(' *** Starting Simulation *** ')

self.raiseADebug(' Value at SIM:298 -> '+str(value))

Nowhere in RAVEN should it be necessary to implement print() statements.

Checklist for merging

Checklists are now on Development Checklists

Validating against XSD schema

The shell script validate_xml.sh in the developer_tools directory in the RAVEN parent directory validates all the XML files in the tests/framework directory against the raven.xsd schema file. The script uses xmllint as the validation tool.

Currently, not all input features are included in the schema files; however, if you add any new functionality, please update the relevant XSD schema file.

A good reference for XSD is http://www.w3schools.com/schema/default.asp; another is https://www.w3.org/TR/xmlschema-0/.

Merging

Go to https://github.com/idaholab/raven/pulls and click "New Pull Request". Then select the branch you pushed, add a description, and click "Submit pull request".

Then another developer can review and merge it.

Merging REMARKS

The "Pull Request" should represent a complete task but as small as possible. This improves the quality of the review process.

Input file changes

For input file changes, three things need to be done:

  1. Update the manual to describe the new file format.
  2. Email the mailing list about the change.
  3. Create a converter to convert from the old syntax to the new syntax.

The converter should be tested by using it to change all the framework tests. It should probably use ElementTree to parse the old file, make the changes, and then write out the new input file.
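
For instance, a hypothetical convert body that renames <oldNode> elements to <newNode> (the element names are illustrative only) could look like:

def convert(tree):
  """ Hypothetical example: renames <oldNode> elements to <newNode>.
    @In, tree, xml.etree.ElementTree object, the entire contents of an input file
    @Out, tree, xml.etree.ElementTree object, the modified contents of an input file
  """
  root = tree.getroot()
  for node in root.iter('oldNode'):
    node.tag = 'newNode'
  return tree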

os.linesep versus \n

Most of the time, \n should be used for newlines. For print and text-mode files, it is automatically converted to the correct form for the operating system. For binary files, os.linesep must be used. In general, binary mode should not be used for CSV or XML files.

thing           \n      os.linesep
print           works   works
open(...,"w")   works   fails
open(...,"wb")  fails   works

Developing Regression Tests

See Adding New Tests for more information.

Regression tests for the Python RAVEN framework are found in raven/tests/framework. There is a hierarchy of folders in which tests are grouped by what they exercise. In order to add a new test, the following node must be included in the test file within the <Simulation> block:

<Simulation>
  ...
  <TestInfo>
    <name>framework/path/to/test/label</name>
    <author>AuthorGitHubTag</author>
    <created>YYYY-MM-DD</created>
    <classesTested>Module.Class, Module.Class</classesTested>
    <description>
        Paragraph describing workflows, modules, classes, entities, et cetera, how they are tested, and any other notes
    </description>
    <requirements>RequirementsLabel</requirements>
    <analytic>paragraph description of analytic test</analytic>
    ...
  </TestInfo>
  ...
</Simulation>

The <requirements> and <analytic> nodes are optional, for those tests that satisfy an NQA design requirement and/or have an analytic solution documented in the analytic tests document. Other notes on block contents:

  • name: this is the test framework path, as well as the name/label assigned in the tests file block. This is the path and name that show up when running the tests using the MOOSE testing harness (run_tests).
  • author: this is the GitHub tag of the author who constructed this test originally, i.e. alfoa for @alfoa.
  • created: this is the date on which the test was originally created, in year-month-day YYYY-MM-DD XSD date format.
  • classesTested: a list of the classes tested in the python framework, listed as Entity.Class, i.e. Samplers.MonteCarlo.
  • description: general notes about what workflows or other methods are tested.
  • requirements: (optional) lists the NQA requirement that this test satisfies.
  • analytic: (optional) describes the analytic nature of this test and how it is documented in the analytic tests documentation.

An additional node is optionally available to demonstrate significant revisions to a test:

<Simulation>
  ...
  <TestInfo>
    ...
    <revisions>
      <revision author="AuthorGitHubTag" date="YYYY-MM-DD">paragraph description of revision</revision>
    </revisions>
    ...
  </TestInfo>
  ...
</Simulation>

Misc Tips and tricks

Find out what a git push --force will do (add --dry-run):

[~/projects/raven]> git push --force --dry-run

Clean out ignored and uncommitted files in the current directory and subdirectories:

[~/projects/raven]> git clean -f -x .

Edit the last 4 commits:

[~/projects/raven]> git rebase -i HEAD~4

Cache the username and password for https:

[~/projects/raven]> git config --global credential.helper cache

To find all the floating point errors in crow:

CFLAGS="-Werror=float-conversion" python setup.py build_ext build install --install-platlib=`pwd`/install

Or even more errors:

CFLAGS="-Werror -Wno-unused-local-typedefs -Wno-maybe-uninitialized -Wfloat-conversion" python setup.py build_ext build install --install-platlib=`pwd`/install

Updated:

CFLAGS="-Werror -Wno-unused-function -Wno-misleading-indentation -Wno-ignored-attributes -Wno-deprecated-declarations -Wno-unused-local-typedefs -Wno-maybe-uninitialized -Wfloat-conversion" python setup.py build_ext build install --install-platlib=`pwd`/install

There is a script to remove trailing whitespace (pass the directory you want it to process as the parameter):

../raven/developer_tools/delete_trailing_whitespace.sh .

The amsc library can be manually compiled with setup.py

cd ../raven/
python setup.py build_ext build  install --install-platlib=`pwd`/src/contrib

Running run_tests for a pip-installed RAVEN but with tests from a clone (note that plugins need to be installed with pip, or not installed in the clone):

./run_tests --re="user_guide" --tester-command RavenFramework raven_framework --skip-load-env

Directory list:

  • crow - python libraries for RAVEN
  • developer_tools - Extra tools for developers (such as a relap7 to raven input converter tool)
  • doc - Doxygen directories, QA documents, user manuals, and other documentation.
  • framework - The directories for the RAVEN framework, which allows multiple RAVEN runs to be performed, branched, etc.
  • plugin - Plugin directory, where multiple plugins can be stored
  • gui - Code for the graphical interface peacock
  • include - C++ include files
  • inputs - Extra example inputs
  • papers - Papers about RAVEN
  • scripts - This is the location of the test harness and compiling scripts.
  • src - C++ compilation files
  • tests - The tests for the test harness (run by run_tests)
  • work_in_progress - Extra code or examples that might be useful in the future.

Intermittent failures

This is a list of issues for tests that sometimes fail randomly. If a test fails randomly, it should be listed here.

Testing multiple times

Sometimes you need to run some tests multiple times to find an intermittent failure. The following can be used (replace the --re= value with the tests you are looking at):

COUNT=0; while ./run_tests -j11 --re=Sobol; do COUNT=$((COUNT+1)); echo $COUNT; done

Checking for docstrings

pylint --disable=all --enable=missing-docstring framework/

Adding new Python dependencies

Adding new Python dependencies to RAVEN is done by editing the file raven/dependencies.xml. This file uses the XML format, and instructions for how it is used and read are found within the file itself. The list of dependencies can be shown using calls to raven/scripts/library_handler.py; for example,

python library_handler.py conda --action install

Conflict of Dependencies License

  • GPLv3 software cannot be included in Apache projects. The licenses are incompatible in one direction only, and this is a result of ASF's licensing philosophy and the GPLv3 authors' interpretation of copyright law. (More details can be found at https://www.apache.org/licenses/GPL-compatibility.html.)

Custom Diff and Version Control for an Excel Tool

In ~/projects/raven/scripts, there is a file named Excel_diff.py that can compare two different Excel files for version control. To execute this Python file and diff different versions of an Excel file, follow the instructions below after you have installed RAVEN successfully:

Step 1: Install xlwings into your Python environment

[~]> pip install xlwings

Step 2: Put the target Excel file under your local repository

You can put the target Excel file in any folder, for example under ~/projects/Name_of_your_folder:

[~]> cd projects/
[~/projects]> mkdir <Name_of_your_folder>
[~/projects]> cd <Name_of_your_folder>
[~/projects/Name_of_your_folder]> git init

Step 3: Add the following lines to the config file under the .git folder:

[diff "exceldiff"]
   command = python ~/projects/raven/scripts/Excel_diff.py

Step 4: Add a .gitattributes file in the folder containing your Excel file, with the following contents:

*.xla diff=exceldiff
*.xlam diff=exceldiff
*.xls diff=exceldiff
*.xlsb diff=exceldiff
*.xlsm diff=exceldiff
*.xlsx diff=exceldiff
*.xlt diff=exceldiff
*.xltm diff=exceldiff
*.xltx diff=exceldiff

Step 5: Version control of your Excel files

You must close the Excel file before executing git diff.

There are two ways to use version control: the first is to compare Excel files in the same branch [1], and the other is to compare Excel files in different branches [2]. Follow the instructions below to diff two Excel files:

[1] In the same branch

Make changes, then save and close the target Excel file.

Check the git status and make sure there are changes to the file:

[~/projects/Name_of_your_folder]> git status 

Run diff for your Excel file:

[~/projects/Name_of_your_folder]> git diff <Name_of_your_excel_file>

Check the "Diff_Results.txt" in the same folder. The outputs (Diff_Results.txt) show the difference between the two different versions of the excel if you made some changes. You will need to review the changes row by row and see if you agree. If yes, you would need to perform the following actions.

[~/projects/Name_of_your_folder]>git add <Name_of_your_excel_file>
[~/projects/Name_of_your_folder]>git commit -m "adding text over here"

The commit message is saved in the log of the git repository. If you do not agree with the changes, do the following to remove them and return to the original version:

[~/projects/Name_of_your_folder]>git restore <Name_of_your_excel_file>

[2] In different branches

Create a new branch

[~/projects/Name_of_your_folder]> git branch new

Switch to the "new" branch from "master" branch

[~/projects/Name_of_your_folder]>git checkout new

Open the Excel file and make changes, then save and close it. git add and commit the changes:

[~/projects/Name_of_your_folder]>git add <Name_of_your_excel_file>
[~/projects/Name_of_your_folder]>git commit -m "adding text over here"

Diff the Excel files in the different branches:

[~/projects/Name_of_your_folder]> git diff master...HEAD

Check the "Diff_Results.txt" in the current branch. The outputs (Diff_Results.txt) show the difference between the two different versions of the excel if you made some changes. You will need to review the changes row by row and see if you agree. If yes, you would need to perform the following actions to merge the branches

[~/projects/Name_of_your_folder]>git checkout master
[~/projects/Name_of_your_folder]>git merge new

If you do not agree with the changes, you can remove the new branch.

[~/projects/Name_of_your_folder]>git branch --delete new