The IPS portal, hosted on the NERSC Spin service, shows the progress and status of IPS runs on a variety of machines. The simulation configuration file and platform configuration file contain entries that allow the IPS to publish events to the portal.
On the top-level page, you will see information about each run including who ran it, the current status, physics time stamp, wall time, and a descriptive comment. From there you can click on a Run ID to see the details of that run, including calls on components, data movement events, task launches and finishes, and checkpoints.
To use the portal, include
USE_PORTAL = True
PORTAL_URL = http://lb.ipsportal.production.svc.spin.nersc.org
in either your :doc:`Platform Configuration File<platform>` or your :doc:`Simulation Configuration File<config_file>`.
The source code for the portal can be found on GitHub, and issues can be reported using GitHub issues.
Note
New in IPS-Framework 0.6.0
IPS can capture a trace of the workflow for analysis and visualization. The traces are captured in the Zipkin span format and viewed within the IPS portal using Jaeger.
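As a rough illustration of what a span in the Zipkin v2 JSON layout looks like, the sketch below builds a parent span and one child span by hand. The helper ``make_span``, the service name ``ips-example``, and the span names are hypothetical and chosen only for this example; the spans that IPS actually emits may carry different names and extra fields.

```python
import json
import secrets
import time


def make_span(name, trace_id, start_s, end_s, parent_id=None, service="ips-example"):
    """Build a dict that follows the Zipkin v2 span JSON layout."""
    span = {
        "traceId": trace_id,                      # shared by every span in one trace
        "id": secrets.token_hex(8),               # 16 lowercase hex characters
        "name": name,
        "timestamp": round(start_s * 1_000_000),  # start time in epoch microseconds
        "duration": max(1, round((end_s - start_s) * 1_000_000)),  # microseconds
        "localEndpoint": {"serviceName": service},
    }
    if parent_id is not None:
        span["parentId"] = parent_id              # links a child span to its parent
    return span


trace_id = secrets.token_hex(16)
now = time.time()
run = make_span("simulation", trace_id, now, now + 10.0)
step = make_span("driver.step", trace_id, now + 1.0, now + 5.0, parent_id=run["id"])
payload = json.dumps([run, step], indent=2)  # a JSON list of spans
```

The ``parentId`` link is what lets a viewer such as Jaeger arrange spans into the timeline and graph views.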
After selecting a run in the portal, a link to its trace is available.
The default view is the Trace Timeline, but other useful views are Trace Graph and Trace Statistics, which can be selected from the menu in the top-right.
The statistics can be further broken down by operation.
Note
Self time (ST) is the total time spent in a span when it was not waiting on children. For example, a 10ms span with two 4ms non-overlapping children would have self-time = 10ms - 2 * 4ms = 2ms.
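The self-time arithmetic can be sketched in a few lines. The function below is an illustrative assumption, not Jaeger's implementation: it treats spans as ``(start, end)`` tuples in a common time unit and merges overlapping children so that shared time is subtracted only once.

```python
def self_time(span, children):
    """Return the time inside `span` that is not covered by any child span.

    `span` and each entry of `children` are (start, end) tuples.
    """
    start, end = span
    # Clip each child to the parent's interval and process them in start order.
    clipped = sorted((max(s, start), min(e, end)) for s, e in children)
    covered = 0
    cursor = start  # everything before the cursor has already been counted
    for s, e in clipped:
        lo = max(s, cursor)
        if e > lo:
            covered += e - lo
            cursor = e
    return (end - start) - covered


# The example from the note: a 10 ms span with two 4 ms non-overlapping children.
example = self_time((0, 10), [(1, 5), (5, 9)])
```

Here ``example`` evaluates to 2, matching 10ms - 2 * 4ms = 2ms.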
Note
New in IPS-Framework 0.7.0
If you have a workflow where you are running ``ips`` as a task of another IPS simulation, you can create a relation between them that allows them to be viewed together in the IPS portal and produces a single trace for the entire collection.
To set up a hierarchical structure between IPS runs, where one run starts other runs as separate simulations, set the ``PARENT_PORTAL_RUNID`` parameter in the child simulation configuration. This can be done dynamically from the parent simulation, for example:
child_conf['PARENT_PORTAL_RUNID'] = self.services.get_config_param("PORTAL_RUNID")
This is automatically configured when running ``ips_dakota_dynamic.py``.
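For a parent that builds child configurations by hand, the assignment shown above can be wrapped in a small helper. This is a minimal sketch under stated assumptions: ``FakeServices`` is a hypothetical stand-in for the IPS services object so that the example runs on its own, ``link_child_to_parent`` is an illustrative name, and the run ID ``"1234"`` and the child configuration entries are made up.

```python
class FakeServices:
    """Hypothetical stand-in for the IPS services object (illustration only)."""

    def __init__(self, config):
        self._config = config

    def get_config_param(self, name):
        # Mirrors the call used in the snippet above: look up a config value by name.
        return self._config[name]


def link_child_to_parent(services, child_conf):
    """Return a copy of child_conf that records the parent's portal run ID."""
    linked = dict(child_conf)  # leave the caller's dict untouched
    linked["PARENT_PORTAL_RUNID"] = services.get_config_param("PORTAL_RUNID")
    return linked


# Made-up values: a parent run with portal run ID "1234" and one child config.
parent_services = FakeServices({"PORTAL_RUNID": "1234"})
base_conf = {"SIM_NAME": "child_0", "USE_PORTAL": "True"}
child_conf = link_child_to_parent(parent_services, base_conf)
```

Copying the configuration before adding the key keeps a shared template reusable when the parent launches several children.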
The child runs will not appear on the main runs list but will appear on a tab next to the events.
The trace of the primary simulation will contain the traces from all the simulations.