Merge pull request #4122 from pnorbert/user-options-yaml-doc
Updates to campaign management
pnorbert committed Apr 2, 2024
2 parents 56320ec + 95f2794 commit 696096f
Showing 10 changed files with 242 additions and 36 deletions.
187 changes: 187 additions & 0 deletions docs/user_guide/source/advanced/campaign_management.rst
@@ -0,0 +1,187 @@
####################
Campaign Management
####################

Campaign management in ADIOS2 collects basic information and metadata about a set of ADIOS2 output files, from a single application run or from multiple runs. The campaign archive is a single file (.ACA) that can be transferred to other locations. ADIOS2 can open the campaign file and process all of its metadata, including the values of GlobalValue and LocalValue variables, the min/max of each Array at each step, and the decomposition and min/max of each block of an Array at each step. However, Get() operations can read the actual array data only if the data belonging to the campaign is local, or if a mechanism for remote access to the location of the data has been set up in advance.

.. warning::

    In 2.10, Campaign Management is a first prototype included for evaluation purposes only. It will change substantially in the future, and campaign files produced by this version are unlikely to be supported going forward.

The idea
========
Applications produce one or more output files in a single run. Subsequent analysis and visualization runs produce more output files. A campaign is a data organization concept one level above a file. A campaign archive holds information about multiple files, including the values of scalar variables, the min/max of arrays, and the location of the data files (host and directory information). A science project can agree on how to organize its campaigns: how to name them, which files to include in a single campaign archive, how to distribute them, and how to name the hosts where the actual data resides.
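
As a rough illustration of the concept, a campaign archive can be pictured as a small tree of hosts, directories, and datasets. The sketch below is a hypothetical in-memory model, not the archive's actual storage format; the names `CampaignArchive`, `HostEntry`, `DirectoryEntry`, `DatasetEntry`, `AddDataset`, and `CountDatasets` are assumptions made for this example and are not ADIOS2 types.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical model of what a campaign archive records.
// None of these types exist in ADIOS2; they only mirror the description above.
struct DatasetEntry
{
    std::string name;      // e.g. "gs.bp"
    std::string createdOn; // timestamp recorded for the dataset
};

struct DirectoryEntry
{
    std::string path;                   // directory that holds the datasets
    std::vector<DatasetEntry> datasets; // outputs recorded for that directory
};

struct HostEntry
{
    std::string hostname;     // short, project-agreed name
    std::string longHostname; // full name of the machine holding the data
    std::vector<DirectoryEntry> directories;
};

struct CampaignArchive
{
    std::string version;
    std::vector<HostEntry> hosts;
};

// Record one more dataset under a directory.
inline void AddDataset(DirectoryEntry &dir, const std::string &name,
                       const std::string &createdOn)
{
    assert(!name.empty());
    dir.datasets.push_back(DatasetEntry{name, createdOn});
}

// Count every dataset referenced by the archive, across hosts and directories.
inline std::size_t CountDatasets(const CampaignArchive &archive)
{
    std::size_t n = 0;
    for (const auto &host : archive.hosts)
    {
        for (const auto &dir : host.directories)
        {
            n += dir.datasets.size();
        }
    }
    return n;
}
```

The point of the model is that the archive stores names, locations, and metadata only; the array data itself stays wherever the directory entries point.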

Example
-------

The Gray-Scott example, which is included with ADIOS2 in `examples/simulation/gray-scott`, has two programs, Gray-Scott and PDF-Calc. The first produces the main output `gs.bp`, which contains the main 3D variables `U` and `V`, and a checkpoint file `ckpt.bp` with a single step in it. PDF-Calc processes the main output and writes histograms of 2D slices of U and V (for example `U/bins` and `U/pdf`) to `pdf.bp`. A campaign can include all three output files, as they logically belong together.

.. code-block:: bash

    # run application as usual
    $ mpirun -n 4 adios2_simulations_gray-scott settings-files.json
    $ ls -d *.bp
    ckpt.bp gs.bp
    $ adios2_campaign_manager.py create demoproject/frontier_gray-scott_100
    $ mpirun -n 3 adios2_simulations_gray-scott_pdf-calc gs.bp pdf.bp 1000
    $ ls -d *.bp
    ckpt.bp gs.bp pdf.bp
    $ adios2_campaign_manager.py update demoproject/frontier_gray-scott_100
    $ adios2_campaign_manager.py info demoproject/frontier_gray-scott_100
    info archive
    ADIOS Campaign Archive, version 1.0, created on 2024-04-01 10:44:11.644942
    hostname = OLCF longhostname = frontier.olcf.ornl.gov
    dir = /lustre/orion/csc143/proj-shared/demo/gray-scott
    dataset = ckpt.bp created on 2024-04-01 10:38:19
    dataset = gs.bp created on 2024-04-01 10:38:17
    dataset = pdf.bp created on 2024-04-01 10:38:08
    # The campaign archive is small compared to the data it points to
    $ du -sh *bp
    7.9M ckpt.bp
    40M gs.bp
    9.9M pdf.bp
    $ du -sh /lustre/orion/csc143/proj-shared/adios-campaign-store/demoproject/frontier_gray-scott_100.aca
    97K /lustre/orion/csc143/proj-shared/adios-campaign-store/demoproject/frontier_gray-scott_100.aca
    # ADIOS can list the content of the campaign archive
    $ bpls -l demoproject/frontier_gray-scott_100
    double ckpt.bp/U {4, 34, 34, 66} = 0.171103 / 1
    double ckpt.bp/V {4, 34, 34, 66} = 1.71085e-19 / 0.438921
    int32_t ckpt.bp/step scalar = 700
    double gs.bp/U 10*{64, 64, 64} = 0.090778 / 1
    double gs.bp/V 10*{64, 64, 64} = 8.24719e-63 / 0.515145
    int32_t gs.bp/step 10*scalar = 100 / 1000
    double pdf.bp/U/bins 10*{1000} = 0.0908158 / 0.999938
    double pdf.bp/U/pdf 10*{64, 1000} = 0 / 4096
    double pdf.bp/V/bins 10*{1000} = 8.24719e-63 / 0.514267
    double pdf.bp/V/pdf 10*{64, 1000} = 0 / 4096
    int32_t pdf.bp/step 10*scalar = 100 / 1000
    # scalar over steps is available in metadata
    $ bpls -l demoproject/frontier_gray-scott_100 -d pdf.bp/step -n 10
    int32_t pdf.bp/step 10*scalar = 100 / 1000
    (0) 100 200 300 400 500 600 700 800 900 1000
    # Array decomposition including min/max are available in metadata
    $ bpls -l demoproject/frontier_gray-scott_100 -D gs.bp/V
    double gs.bp/V 10*{64, 64, 64} = 8.24719e-63 / 0.515145
    step 0:
    block 0: [ 0:63, 0:31, 0:31] = 8.24719e-63 / 0.410653
    block 1: [ 0:63, 32:63, 0:31] = 8.24719e-63 / 0.410652
    block 2: [ 0:63, 0:31, 32:63] = 8.24719e-63 / 0.410653
    block 3: [ 0:63, 32:63, 32:63] = 8.24719e-63 / 0.410653
    ...
    step 9:
    block 0: [ 0:63, 0:31, 0:31] = 3.99908e-09 / 0.441847
    block 1: [ 0:63, 32:63, 0:31] = 3.99931e-09 / 0.44192
    block 2: [ 0:63, 0:31, 32:63] = 3.99928e-09 / 0.441813
    block 3: [ 0:63, 32:63, 32:63] = 3.99899e-09 / 0.441796
    # Array data is only available if data is local
    $ ./bin/bpls -l demoproject/frontier_gray-scott_100 -d pdf.bp/U/bins
    double pdf.bp/U/bins 10*{1000} = 0.0908158 / 0.999938
    (0, 0) 0.93792 0.937982 0.938044 0.938106 0.938168 0.93823 0.938292 0.938354 0.938416 0.938479
    ...
    (9,990) 0.990306 0.991157 0.992007 0.992858 0.993708 0.994559 0.995409 0.99626 0.99711 0.997961

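
The per-block min/max values in the `-D` listing above can be reduced to a single min/max for a step without reading any array data. A minimal sketch of that reduction, assuming the per-block statistics are combined with a plain min and max; the `BlockStats` type and `Combine` function are hypothetical names for this example, not ADIOS2 API.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Min/max of one block, as kept in metadata (illustrative type, not ADIOS2).
struct BlockStats
{
    double min;
    double max;
};

// Reduce the per-block statistics of one step to the statistics of the
// whole array at that step. Only metadata is touched, never array data.
inline BlockStats Combine(const std::vector<BlockStats> &blocks)
{
    assert(!blocks.empty());
    BlockStats total = blocks.front();
    for (const auto &b : blocks)
    {
        total.min = std::min(total.min, b.min);
        total.max = std::max(total.max, b.max);
    }
    return total;
}
```

Fed with the four blocks listed for step 9 of `gs.bp/V`, `Combine` returns the smallest block minimum and the largest block maximum for that step.
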
Setup
=====

There are three paths/names important in the campaign setup.

- `hostname` is the name detected by the adios2_campaign_manager when creating a campaign archive. It is better, however, to define a specific name that the project agrees upon (e.g. OLCF, NERSC, ALCF) to identify the generic location of the data, and then use that name later to specify the modes of remote data access (not available in this release).

- `campaignstorepath` is the directory where all the campaign archives are stored. It should be a shared directory for project members at a center, and a private one on every member's laptop. It is up to the project to decide which file sharing / synchronization mechanism to use to sync these directories. `Rclone is a great command-line tool <https://rclone.org>`_ to sync the campaign store with many cloud-based file sharing services and cloud instances.

- `cachepath` is the directory where ADIOS can unpack metadata from the campaign archive so that ADIOS engines can read them as if they were entirely local datasets. The cache only contains the metadata for now but in the future data that have already been retrieved by previous read requests will be stored here as well.


Use `~/.config/adios2/adios2.yaml` to specify these options.

.. code-block:: bash

    $ cat ~/.config/adios2/adios2.yaml
    Campaign:
      active: true
      hostname: OLCF
      campaignstorepath: /lustre/orion/csc143/proj-shared/adios-campaign-store
      cachepath: /lustre/orion/csc143/proj-shared/campaign-cache
      verbose: 0
    $ ls -R /lustre/orion/csc143/proj-shared/adios-campaign-store
    /lustre/orion/csc143/proj-shared/adios-campaign-store/demoproject:
    frontier_gray-scott_100.aca
    $ adios2_campaign_manager.py list
    demoproject/frontier_gray-scott_100.aca

Remote access
=============
For now, there is one way to access remote data: SSH port forwarding combined with a remote server program that reads the data on the remote host and sends it back to the local ADIOS program. `adios2_remote_server` is included in the ADIOS installation. You need to use the one built on the remote host.

Launch the server by SSH-ing to the remote machine and specifying port `26200` for forwarding. For example:

.. code-block:: bash

    $ ssh -L 26200:dtn.olcf.ornl.gov:26200 -l <username> dtn.olcf.ornl.gov "<path_to_adios_install>/bin/adios2_remote_server -v "

Assuming the campaign archive was synced to a local machine's campaign store under `csc143/demoproject`, now we can retrieve data:

.. code-block:: bash

    $ adios2_campaign_manager.py list
    csc143/demoproject/frontier_gray-scott_100.aca
    $ bpls -l csc143/demoproject/frontier_gray-scott_100
    double ckpt.bp/U {4, 34, 34, 66} = 0.171103 / 1
    ...
    double pdf.bp/U/bins 10*{1000} = 0.0908158 / 0.999938
    # metadata is extracted to the local cachepath
    $ du -sh /tmp/campaign/OLCF/csc143/demoproject/frontier_gray-scott_100.aca/*
    20K /tmp/campaign/OLCF/csc143/demoproject/frontier_gray-scott_100.aca/ckpt.bp
    40K /tmp/campaign/OLCF/csc143/demoproject/frontier_gray-scott_100.aca/gs.bp
    32K /tmp/campaign/OLCF/csc143/demoproject/frontier_gray-scott_100.aca/pdf.bp
    # data is requested from the remote server
    # read 64 values (4x4x4) from U from last step, from offset 30,30,30
    $ bpls -l csc143/demoproject/frontier_gray-scott_100 -d gs.bp/U -s "-1,30,30,30" -c "1,4,4,4" -n 4
    double gs.bp/U 10*{64, 64, 64}
    slice (9:9, 30:33, 30:33, 30:33)
    (9,30,30,30) 0.89189 0.899854 0.899854 0.891891
    (9,30,31,30) 0.899851 0.908278 0.908278 0.899852
    (9,30,32,30) 0.899849 0.908276 0.908277 0.899851
    (9,30,33,30) 0.891885 0.899848 0.899849 0.891886
    (9,31,30,30) 0.89985 0.908276 0.908276 0.899849
    (9,31,31,30) 0.908274 0.916977 0.916977 0.908274
    (9,31,32,30) 0.908273 0.916976 0.916976 0.908273
    (9,31,33,30) 0.899844 0.908271 0.908271 0.899844
    (9,32,30,30) 0.89985 0.908276 0.908275 0.899848
    (9,32,31,30) 0.908274 0.916976 0.916976 0.908272
    (9,32,32,30) 0.908272 0.916975 0.916974 0.908271
    (9,32,33,30) 0.899844 0.90827 0.90827 0.899842
    (9,33,30,30) 0.89189 0.899851 0.899851 0.891886
    (9,33,31,30) 0.89985 0.908275 0.908275 0.899847
    (9,33,32,30) 0.899848 0.908274 0.908273 0.899845
    (9,33,33,30) 0.891882 0.899845 0.899844 0.89188

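
The `-s` (start) and `-c` (count) arguments in the last command select a hyperslab, and the printed slice shows start `-1` in the step dimension resolving to step 9, the last of the 10 steps. A sketch of that index arithmetic, assuming a negative start counts back from the end of its dimension; the helper names `ResolveSelection` and `NumValues` are hypothetical and not part of bpls or ADIOS2.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Inclusive index range in one dimension, as printed in "slice (9:9, 30:33, ...)".
struct Range
{
    std::size_t first;
    std::size_t last;
};

// Turn start/count into inclusive ranges.
// Assumption: a negative start is an offset from the end (-1 is the last index).
inline std::vector<Range> ResolveSelection(const std::vector<std::size_t> &shape,
                                           const std::vector<std::int64_t> &start,
                                           const std::vector<std::size_t> &count)
{
    assert(shape.size() == start.size() && shape.size() == count.size());
    std::vector<Range> ranges;
    for (std::size_t d = 0; d < shape.size(); ++d)
    {
        const std::int64_t extent = static_cast<std::int64_t>(shape[d]);
        const std::int64_t s = start[d] < 0 ? extent + start[d] : start[d];
        // The selection must lie inside the dimension.
        assert(s >= 0 && count[d] > 0);
        assert(s + static_cast<std::int64_t>(count[d]) <= extent);
        const std::size_t first = static_cast<std::size_t>(s);
        ranges.push_back({first, first + count[d] - 1});
    }
    return ranges;
}

// Number of values in the selection: the product of the counts.
inline std::size_t NumValues(const std::vector<std::size_t> &count)
{
    std::size_t n = 1;
    for (const std::size_t c : count)
    {
        n *= c;
    }
    return n;
}
```

With shape `{10, 64, 64, 64}` (steps first), start `{-1, 30, 30, 30}`, and count `{1, 4, 4, 4}`, the ranges come out as 9:9 and 30:33 in each spatial dimension, and the selection holds 1 x 4 x 4 x 4 = 64 values, matching the 16 printed rows of 4.
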
Requirements
============
The Campaign Manager uses SQLite3 and ZLIB for its operations, and requires Python 3.8 or higher for the `adios2_campaign_manager` tool. Check `bpls -Vv` to see if `CAMPAIGN` is in the list of "Available features".

Limitations
===========

- The Campaign Reader engine only supports ReadRandomAccess mode, not step-by-step reading. Campaign management will need to change in the future to support sorting the steps from different outputs into a coherent order.
- Attributes are not processed by the Campaign Reader yet.
- Updating a campaign after its data has been moved to another location is not supported yet.
1 change: 1 addition & 0 deletions docs/user_guide/source/index.rst
@@ -40,6 +40,7 @@ Funded by the `Exascale Computing Project (ECP) <https://www.exascaleproject.org
advanced/gpu_aware
advanced/query
advanced/plugins
+advanced/campaign_management
advanced/ecp_hardware

.. toctree::
30 changes: 22 additions & 8 deletions source/adios2/core/ADIOS.cpp
@@ -108,8 +108,7 @@ static std::atomic_uint adios_refcount(0); // adios objects at the same time
static std::atomic_uint adios_count(0); // total adios objects during runtime

/** User defined options from ~/.config/adios2/adios2.yaml if it exists */
-static adios2::UserOptions UserOptions;
-const adios2::UserOptions &ADIOS::GetUserOptions() { return UserOptions; };
+const adios2::UserOptions &ADIOS::GetUserOptions() { return m_UserOptions; };

ADIOS::ADIOS(const std::string configFile, helper::Comm comm, const std::string hostLanguage)
: m_HostLanguage(hostLanguage), m_Comm(std::move(comm)), m_ConfigFile(configFile),
@@ -149,10 +148,11 @@ ADIOS::ADIOS(const std::string configFile, helper::Comm comm, const std::string
#ifdef ADIOS2_HAVE_KOKKOS
m_GlobalServices.Init_Kokkos_API();
#endif
-if (UserOptions.campaign.active)
+if (m_UserOptions.campaign.active)
{
-std::string campaignName = "campaign_" + std::to_string(adios_count);
-m_CampaignManager.Open(campaignName, UserOptions);
+std::string campaignName =
+"campaign_" + helper::RandomString(8) + "_" + std::to_string(adios_count);
+m_CampaignManager.Open(campaignName, m_UserOptions);
}
}

@@ -175,12 +175,25 @@ ADIOS::~ADIOS()
{
m_GlobalServices.Finalize();
}
-if (UserOptions.campaign.active)
+if (m_UserOptions.campaign.active)
{
m_CampaignManager.Close();
}
}

+void ADIOS::SetUserOptionDefaults()
+{
+m_UserOptions.general.verbose = 0;
+
+m_UserOptions.campaign.active = true;
+m_UserOptions.campaign.verbose = 0;
+m_UserOptions.campaign.hostname = "";
+m_UserOptions.campaign.campaignstorepath = "";
+m_UserOptions.campaign.cachepath = "/tmp/adios2-cache";
+
+m_UserOptions.sst.verbose = 0;
+}
+
void ADIOS::ProcessUserConfig()
{
// read config parameters from config file
@@ -190,10 +203,11 @@ void ADIOS::ProcessUserConfig()
#else
homePath = getenv("HOME");
#endif
+SetUserOptionDefaults();
const std::string cfgFile = homePath + "/.config/adios2/adios2.yaml";
if (adios2sys::SystemTools::FileExists(cfgFile))
{
-helper::ParseUserOptionsFile(m_Comm, cfgFile, UserOptions, homePath);
+helper::ParseUserOptionsFile(m_Comm, cfgFile, m_UserOptions, homePath);
}
}

@@ -358,7 +372,7 @@ void ADIOS::YAMLInitIO(const std::string &configFileYAML, const std::string &con

void ADIOS::RecordOutputStep(const std::string &name, const size_t step, const double time)
{
-if (UserOptions.campaign.active)
+if (m_UserOptions.campaign.active)
{
m_CampaignManager.Record(name, step, time);
}
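
The hunks above move the user options from a file-level static into a per-object member and apply the defaults before the config file is looked up, so the defaults hold even when `~/.config/adios2/adios2.yaml` does not exist. A self-contained sketch of that pattern follows. It is not the ADIOS2 implementation: the `Context` class, the `Overrides` map standing in for the parsed YAML, and `ApplyOverrides` are assumptions made for illustration, and only the campaign fields visible in the diff are modeled.

```cpp
#include <cassert>
#include <map>
#include <string>

// Illustrative stand-ins for the option structures (campaign section only).
struct CampaignOptions
{
    bool active = false;
    int verbose = 0;
    std::string hostname;
    std::string campaignstorepath;
    std::string cachepath;
};

struct UserOptions
{
    CampaignOptions campaign;
};

// Hypothetical stand-in for the contents of a parsed config file.
using Overrides = std::map<std::string, std::string>;

class Context
{
public:
    // fileContents == nullptr models "the config file does not exist".
    explicit Context(const Overrides *fileContents = nullptr)
    {
        SetUserOptionDefaults(); // always runs first
        if (fileContents != nullptr)
        {
            ApplyOverrides(*fileContents);
        }
    }

    const UserOptions &GetUserOptions() const { return m_UserOptions; }

private:
    // Same default values as the diff shows for the campaign section.
    void SetUserOptionDefaults()
    {
        m_UserOptions.campaign.active = true;
        m_UserOptions.campaign.verbose = 0;
        m_UserOptions.campaign.hostname = "";
        m_UserOptions.campaign.campaignstorepath = "";
        m_UserOptions.campaign.cachepath = "/tmp/adios2-cache";
    }

    // Overwrite only the keys that are present; everything else keeps its default.
    void ApplyOverrides(const Overrides &kv)
    {
        for (const auto &entry : kv)
        {
            CampaignOptions &c = m_UserOptions.campaign;
            if (entry.first == "active")
                c.active = (entry.second == "true");
            else if (entry.first == "verbose")
                c.verbose = std::stoi(entry.second);
            else if (entry.first == "hostname")
                c.hostname = entry.second;
            else if (entry.first == "campaignstorepath")
                c.campaignstorepath = entry.second;
            else if (entry.first == "cachepath")
                c.cachepath = entry.second;
        }
    }

    UserOptions m_UserOptions; // per-object, not a shared static
};
```

Because the options live in the object, two `Context` instances built from different inputs do not affect each other, which a single static instance could not guarantee.
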
4 changes: 3 additions & 1 deletion source/adios2/core/ADIOS.h
@@ -165,7 +165,7 @@ class ADIOS
const double time = UnknownTime);

/** A constant reference to the user options from ~/.config/adios2/adios2.yaml */
-static const adios2::UserOptions &GetUserOptions();
+const adios2::UserOptions &GetUserOptions();

private:
/** Communicator given to parallel constructor. */
@@ -207,6 +207,8 @@ class ADIOS
void YAMLInitIO(const std::string &configFileYAML, const std::string &configFileContents,
core::IO &io);

+adios2::UserOptions m_UserOptions;
+void SetUserOptionDefaults();
void ProcessUserConfig();

private:
4 changes: 2 additions & 2 deletions source/adios2/engine/campaign/CampaignData.cpp
@@ -318,10 +318,10 @@ void SaveToFile(sqlite3 *db, const std::string &path, const CampaignBPFile &bpfi
int iBlobsize = sqlite3_column_bytes(statement, 0);
const void *p = sqlite3_column_blob(statement, 0);

-std::cout << "-- Retrieved from DB data of " << bpfile.name << " size = " << iBlobsize
+/*std::cout << "-- Retrieved from DB data of " << bpfile.name << " size = " << iBlobsize
<< " compressed = " << bpfile.compressed
<< " compressed size = " << bpfile.lengthCompressed
-<< " original size = " << bpfile.lengthOriginal << " blob = " << p << "\n";
+<< " original size = " << bpfile.lengthOriginal << " blob = " << p << "\n";*/

size_t blobsize = static_cast<size_t>(iBlobsize);
std::ofstream f;
10 changes: 0 additions & 10 deletions source/adios2/engine/campaign/CampaignReader.cpp
@@ -33,22 +33,12 @@ CampaignReader::CampaignReader(IO &io, const std::string &name, const Mode mode,
: Engine("CampaignReader", io, name, mode, std::move(comm))
{
m_ReaderRank = m_Comm.Rank();
-if (m_Options.verbose > 1)
-{
-std::cout << "Campaign Reader " << m_ReaderRank << " Open(" << m_Name << ") in constructor."
-<< std::endl;
-}
Init();
m_IsOpen = true;
}

CampaignReader::~CampaignReader()
{
/* CampaignReader destructor does close and finalize */
-if (m_Options.verbose > 1)
-{
-std::cout << "Campaign Reader " << m_ReaderRank << " destructor on " << m_Name << "\n";
-}
if (m_IsOpen)
{
DestructorClose(m_FailVerbose);
19 changes: 19 additions & 0 deletions source/adios2/helper/adiosString.cpp
@@ -16,6 +16,7 @@
/// \cond EXCLUDE_FROM_DOXYGEN
#include <fstream>
#include <ios> //std::ios_base::failure
+#include <random>
#include <sstream>
#include <stdexcept> // std::invalid_argument
/// \endcond
@@ -484,5 +485,23 @@ std::string RemoveTrailingSlash(const std::string &name) noexcept
return name.substr(0, len);
}

+std::string RandomString(const size_t length)
+{
+size_t len = length;
+if (len == 0)
+len = 1;
+if (len > 64)
+len = 64;
+
+std::string str("0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyzA");
+
+std::random_device rd;
+std::mt19937 generator(rd());
+
+std::shuffle(str.begin(), str.end(), generator);
+
+return str.substr(0, len);
+}
+
} // end namespace helper
} // end namespace adios2
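
`RandomString` shuffles a fixed alphabet and returns a prefix of the result, so the output length is bounded by the number of characters in that alphabet. A standalone sketch of the same shuffle-and-truncate approach, written for illustration only: the names `RandomToken` and `MakeCampaignName` are hypothetical, the alphabet here is the 62 distinct alphanumeric characters (so the clamp is to the alphabet size rather than to 64), and `<algorithm>` is included explicitly for `std::shuffle`.

```cpp
#include <algorithm> // std::shuffle, std::min, std::max
#include <cassert>
#include <cstddef>
#include <random>
#include <string>

// Shuffle-and-truncate token generator. Like the helper in the diff, this is
// not suitable for UUIDs or for generating unique strings en masse.
inline std::string RandomToken(const std::size_t length)
{
    std::string alphabet("0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz");

    // Clamp the requested length to [1, alphabet.size()].
    const std::size_t len = std::min(std::max<std::size_t>(length, 1), alphabet.size());
    assert(len >= 1 && len <= alphabet.size());

    std::random_device rd;
    std::mt19937 generator(rd());
    std::shuffle(alphabet.begin(), alphabet.end(), generator);

    // A prefix of a shuffled set of distinct characters has no repeats.
    return alphabet.substr(0, len);
}

// Builds a name in the same shape as the constructor hunk in ADIOS.cpp:
// "campaign_" + 8 random characters + "_" + counter.
inline std::string MakeCampaignName(const unsigned int count)
{
    return "campaign_" + RandomToken(8) + "_" + std::to_string(count);
}
```

The random part makes it unlikely that two processes starting with the same object counter pick the same campaign name, which the counter alone could not prevent.
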
7 changes: 7 additions & 0 deletions source/adios2/helper/adiosString.h
@@ -193,6 +193,13 @@ std::set<std::string> PrefixMatches(const std::string &prefix,
*/
std::string RemoveTrailingSlash(const std::string &name) noexcept;

+/**
+ * Generate a random string of length between 1 and 64
+ * This is a dummy string generator, don't use it for uuid or
+ * for generating truly random unique strings en masse
+ */
+std::string RandomString(const size_t length);
+
} // end namespace helper
} // end namespace adios2

13 changes: 0 additions & 13 deletions source/adios2/helper/adiosYAML.cpp
@@ -266,19 +266,6 @@ void ParseUserOptionsFile(Comm &comm, const std::string &configFileYAML, UserOpt

const std::string configFileContents = comm.BroadcastFile(configFileYAML, hint);

-/*
- * Set defaults first
- */
-options.general.verbose = 0;
-
-options.campaign.active = true;
-options.campaign.verbose = 0;
-options.campaign.hostname = "";
-options.campaign.campaignstorepath = "";
-options.campaign.cachepath = "/tmp/adios2-cache";
-
-options.sst.verbose = 0;
-
const YAML::Node document = YAML::Load(configFileContents);
if (!document)
{