PnetCDF module #456

Merged
merged 52 commits into from
Nov 8, 2022
Conversation

wkliao
Contributor

@wkliao wkliao commented Aug 24, 2021

Add/update the PnetCDF module.

  • the default is changed to --disable-pnetcdf-mod
  • command-line option --with-pnetcdf is added
  • All PnetCDF APIs are included.
  • Internal counters and timestamps are now separated for files and variables respectively

@shanedsnyder
Contributor

I'm reviewing this now, but probably a good starting point here would be to work on rebasing this on top of #726, which increases Darshan's maximum modules from 16 to 64 -- these PnetCDF modules push us beyond our current limits on main. This will require a little bit of work to update the PnetCDF module for changes to Darshan's core API since this PR was submitted.

@shanedsnyder
Contributor

This is now rebased on our main branch and is working fine for me using IOR+PnetCDF as a test. I'll add more detailed feedback as a review.

darshan-runtime/doc/darshan-runtime.txt Outdated Show resolved Hide resolved
include/darshan-pnetcdf-log-format.h Outdated Show resolved Hide resolved
@shanedsnyder
Contributor

I started a small review for more concise things, but a few things stood out to me at a high level that it might be good for us all to discuss:

  • We will need to decide how exactly we want to provide backwards compatibility (if any at all) to our existing PnetCDF module. Notably, there have been some key changes to counter names that might complicate up-conversion of old PnetCDF records to the new format: PNETCDF_INDEP_OPENS and PNETCDF_COLL_OPENS are completely removed in the new format version, which also defines PNETCDF_CREATES and PNETCDF_OPENS now. We could still upconvert as best we can (maybe combining INDEP_OPENS/COLL_OPENS into OPENS and setting CREATES to -1 when upconverting?). Or we could provide no backwards compatibility and avoid parsing PnetCDF data from logs with old format versions.
  • For the new PnetCDF file module, there are a lot of timers that we probably wouldn't typically capture. Right now, this module captures _START_TIMESTAMP, _END_TIMESTAMP, and _TIME counters for each operation it instruments (open, create, close, redef, enddef, sync, coll_wait, indep_wait). To better match our level of capture in other modules, I'd suggest we just capture open TIMESTAMP information (with this accounting for opens and creates), and combine all of the cumulative TIME counters into a single META_TIME counter. Part of the reasoning for doing this is that some of our analysis tools kind of expect this format, and it will complicate PnetCDF analysis if it's being a lot more specific, I think.
  • There are a lot of (560!) wrapped functions. I'd be surprised if they are all supported in all PnetCDF versions, but maybe they are? If not, we probably want to think about how to best handle this (i.e., autoconf checks for functions defined in newer PnetCDF releases, only formally supporting PnetCDF versions that have all of these functions implemented, etc.).

@wkliao
Contributor Author

wkliao commented Aug 18, 2022

  • We will need to decide how exactly we want to provide backwards compatibility (if any at all) to our existing PnetCDF module. Notably, there have been some key changes to counter names that might complicate up-conversion of old PnetCDF records to the new format: PNETCDF_INDEP_OPENS and PNETCDF_COLL_OPENS are completely removed in the new format version, which also defines PNETCDF_CREATES and PNETCDF_OPENS now. We could still upconvert as best we can (maybe combining INDEP_OPENS/COLL_OPENS into OPENS and setting CREATES to -1 when upconverting?). Or we could provide no backwards compatibility and avoid parsing PnetCDF data from logs with old format versions.

Like in MPI-IO, file create and open in PnetCDF are collective only. I think using INDEP_OPENS/COLL_OPENS in Darshan was a mistake (and misleading).

@wkliao
Contributor Author

wkliao commented Aug 18, 2022

  • For the new PnetCDF file module, there are a lot of timers that we probably wouldn't typically capture. Right now, this module captures _START_TIMESTAMP, _END_TIMESTAMP, and _TIME counters for each operation it instruments (open, create, close, redef, enddef, sync, coll_wait, indep_wait). To better match our level of capture in other modules, I'd suggest we just capture open TIMESTAMP information (with this accounting for opens and creates), and combine all of the cumulative TIME counters into a single META_TIME counter. Part of the reasoning for doing this is that some of our analysis tools kind of expect this format, and it will complicate PnetCDF analysis if it's being a lot more specific, I think.

This sounds fine with me.

@wkliao
Contributor Author

wkliao commented Aug 18, 2022

  • There are a lot of (560!) wrapped functions. I'd be surprised if they are all supported in all PnetCDF versions, but maybe they are? If not, we probably want to think about how to best handle this (i.e., autoconf checks for functions defined in newer PnetCDF releases, only formally supporting PnetCDF versions that have all of these functions implemented, etc.).

The last time PnetCDF added new I/O-related APIs was version 1.6.0, released on February 2, 2015. (FYI, the latest release is 1.12.3.) We can make 1.6.0 the minimum requirement in configure.ac. Its release notes contain the full history; just search for the keyword "New APIs".

@carns
Contributor

carns commented Aug 18, 2022

  • We will need to decide how exactly we want to provide backwards compatibility (if any at all) to our existing PnetCDF module. Notably, there have been some key changes to counter names that might complicate up-conversion of old PnetCDF records to the new format: PNETCDF_INDEP_OPENS and PNETCDF_COLL_OPENS are completely removed in the new format version, which also defines PNETCDF_CREATES and PNETCDF_OPENS now. We could still upconvert as best we can (maybe combining INDEP_OPENS/COLL_OPENS into OPENS and setting CREATES to -1 when upconverting?). Or we could provide no backwards compatibility and avoid parsing PnetCDF data from logs with old format versions.

What if we assigned this module a different ID and just treated them as two separate things? That way if the parser encounters a log with the old-style pnetcdf module it will still emit it in case someone needs that. It doesn't seem worth the effort to up-convert since the scope is so different.

If we go that route (new module id, keep ability to print old module id) then I guess we want to make sure the prefix is slightly different.

I'm not sure about this idea yet, just wanted to offer it as an idea for discussion.

@shanedsnyder
Contributor

  • There are a lot of (560!) wrapped functions. I'd be surprised if they are all supported in all PnetCDF versions, but maybe they are? If not, we probably want to think about how to best handle this (i.e., autoconf checks for functions defined in newer PnetCDF releases, only formally supporting PnetCDF versions that have all of these functions implemented, etc.).

The last time PnetCDF added new I/O-related APIs was version 1.6.0, released on February 2, 2015. (FYI, the latest release is 1.12.3.) We can make 1.6.0 the minimum requirement in configure.ac. Its release notes contain the full history; just search for the keyword "New APIs".

Awesome, thanks for the details. I'll add in an autoconf check for this version number and will mention in the darshan-runtime docs about the version dependency (though it sounds like these functions are all sufficiently old, that we're unlikely to hit this problem in practice).

@wkliao
Contributor Author

wkliao commented Aug 18, 2022

Backward compatibility refers to a new version (Darshan parser) being able to read the log files created by an older version. So, when the new parser encounters PNETCDF_INDEP_OPENS and PNETCDF_COLL_OPENS, it should merge them into PNETCDF_OPEN (same for PNETCDF_CREATE).

One question, how does Darshan tell an open/create is independent?

@shanedsnyder
Contributor

  • For the new PnetCDF file module, there are a lot of timers that we probably wouldn't typically capture. Right now, this module captures _START_TIMESTAMP, _END_TIMESTAMP, and _TIME counters for each operation it instruments (open, create, close, redef, enddef, sync, coll_wait, indep_wait). To better match our level of capture in other modules, I'd suggest we just capture open TIMESTAMP information (with this accounting for opens and creates), and combine all of the cumulative TIME counters into a single META_TIME counter. Part of the reasoning for doing this is that some of our analysis tools kind of expect this format, and it will complicate PnetCDF analysis if it's being a lot more specific, I think.

This sounds fine with me.

Okay, I'll take a shot at updating the module along these lines.
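
As a rough illustration of the consolidation agreed on above, the per-operation timers can be folded into one cumulative value. This is a minimal Python sketch, not the module's C code: the operation names come from the list in the review comment, while the dictionary layout and the `fold_meta_time` helper are assumptions made for the example.

```python
# Hypothetical per-operation cumulative times (seconds) for one file record.
# The operation names follow the review comment; the dict layout is assumed.
META_OPS = ("open", "create", "close", "redef", "enddef",
            "sync", "coll_wait", "indep_wait")


def fold_meta_time(op_times):
    """Sum the per-operation cumulative times into one META_TIME value.

    Operations missing from op_times contribute 0.0.
    """
    return sum(op_times.get(op, 0.0) for op in META_OPS)


per_op = {"open": 0.25, "create": 0.5, "close": 0.125, "enddef": 0.125}
meta_time = fold_meta_time(per_op)
print(f"META_TIME = {meta_time}")
```

The trade-off is the one raised in the comment: a single counter loses the per-operation breakdown but matches what the existing analysis tools expect from other modules.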

@shanedsnyder
Contributor

  • We will need to decide how exactly we want to provide backwards compatibility (if any at all) to our existing PnetCDF module. Notably, there have been some key changes to counter names that might complicate up-conversion of old PnetCDF records to the new format: PNETCDF_INDEP_OPENS and PNETCDF_COLL_OPENS are completely removed in the new format version, which also defines PNETCDF_CREATES and PNETCDF_OPENS now. We could still upconvert as best we can (maybe combining INDEP_OPENS/COLL_OPENS into OPENS and setting CREATES to -1 when upconverting?). Or we could provide no backwards compatibility and avoid parsing PnetCDF data from logs with old format versions.

What if we assigned this module a different ID and just treated them as two separate things? That way if the parser encounters a log with the old-style pnetcdf module it will still emit it in case someone needs that. It doesn't seem worth the effort to up-convert since the scope is so different.

If we go that route (new module id, keep ability to print old module id) then I guess we want to make sure the prefix is slightly different.

I'm not sure about this idea yet, just wanted to offer it as an idea for discussion.

I think one key downside to this approach is just that utilities would have to be aware of both formats (old/new) and try to handle gracefully. It probably would just simplify things generally for us to only expose the new format going forward?

Looking closer, the only issue with the up-conversion is just that there are no longer INDEP_OPENS and COLL_OPENS counters, so we have to combine them into a single counter now in the new format. I'm not sure that's really a big deal, especially if we are making the call going forward that there's not really a distinction between INDEP_OPENS and COLL_OPENS. That is, if we think it's important enough to distinguish between these, then they ought to be accounted for in our new counters -- Wei-Keng seems to think the distinction is misleading, so maybe they really should just be cut entirely?

Also, FWIW, we've already taken a similar approach in the HDF5 module, in that it was previously a barebones module that is now being upconverted into the new more detailed format.
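
For readers following along, the up-conversion being discussed amounts to something like the sketch below. It is written in Python for brevity and is not Darshan's actual log-utility code: the counter names are the ones quoted earlier in this thread, the `-1` for CREATES is the option floated in the first review comment, and the `upconvert_file_counters` helper and dict-based record are hypothetical.

```python
def upconvert_file_counters(old):
    """Map old-format open counters onto the new OPENS/CREATES pair.

    `old` is a dict holding the old counter names from the discussion.
    Returns a new dict; the input is left unmodified.
    """
    indep = old.get("PNETCDF_INDEP_OPENS", 0)
    coll = old.get("PNETCDF_COLL_OPENS", 0)
    return {
        # The new format has no independent/collective split, so the two
        # old counters are merged into one.
        "PNETCDF_OPENS": indep + coll,
        # Old logs carry no create information; -1 marks "not available",
        # as floated in the review comment.
        "PNETCDF_CREATES": -1,
    }


old_record = {"PNETCDF_INDEP_OPENS": 2, "PNETCDF_COLL_OPENS": 3}
print(upconvert_file_counters(old_record))
```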

@shanedsnyder
Contributor

One question, how does Darshan tell an open/create is independent?

Looks like in the PnetCDF module, an open is marked as independent if the input communicator size == 1. I'm not sure it's semantically useful to capture that distinction, though. Presumably if the input communicator only contains 1 process, it will be clear from the rank information captured by Darshan how many processes opened it?

@carns
Contributor

carns commented Aug 18, 2022

Looking closer, the only issue with the up-conversion is just that there are no longer INDEP_OPENS and COLL_OPENS counters, so we have to combine them into a single counter now in the new format. I'm not sure that's really a big deal, especially if we are making the call going forward that there's not really a distinction between INDEP_OPENS and COLL_OPENS. That is, if we think it's important enough to distinguish between these, then they ought to be accounted for in our new counters -- Wei-Keng seems to think the distinction is misleading, so maybe they really should just be cut entirely?

Also, FWIW, we've already taken a similar approach in the HDF5 module, in that it was previously a barebones module that is now being upconverted into the new more detailed format.

That makes sense. Upconverting seems reasonable to me.

@carns
Contributor

carns commented Aug 18, 2022

One question, how does Darshan tell an open/create is independent?

Looks like in the PnetCDF module, an open is marked as independent if the input communicator size == 1. I'm not sure it's semantically useful to capture that distinction, though. Presumably if the input communicator only contains 1 process, it will be clear from the rank information captured by Darshan how many processes opened it?

Yeah, that's right. I agree it's not a useful counter distinction for pnetcdf moving forward.

@shanedsnyder
Contributor

  • There are a lot of (560!) wrapped functions. I'd be surprised if they are all supported in all PnetCDF versions, but maybe they are? If not, we probably want to think about how to best handle this (i.e., autoconf checks for functions defined in newer PnetCDF releases, only formally supporting PnetCDF versions that have all of these functions implemented, etc.).

The last time PnetCDF added new I/O-related APIs was version 1.6.0, released on February 2, 2015. (FYI, the latest release is 1.12.3.) We can make 1.6.0 the minimum requirement in configure.ac. Its release notes contain the full history; just search for the keyword "New APIs".

Awesome, thanks for the details. I'll add in an autoconf check for this version number and will mention in the darshan-runtime docs about the version dependency (though it sounds like these functions are all sufficiently old, that we're unlikely to hit this problem in practice).

@wkliao -- is there a reliable way to grab the PnetCDF library version across versions that you recommend? I found pnetcdf_version program which looks like it would work, but wasn't sure if there was another preferred method.

@wkliao
Contributor Author

wkliao commented Aug 18, 2022

In the PnetCDF header file, pnetcdf.h, there are a few C constants defined for version numbers. Below is an example.

#define PNETCDF_VERSION       "1.11.2"
#define PNETCDF_VERSION_MAJOR 1
#define PNETCDF_VERSION_MINOR 11
#define PNETCDF_VERSION_SUB   2
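
A minimum-version check built on these constants could look roughly like the following. This is only a Python sketch of the comparison logic; the check added in this PR is autoconf code and is not reproduced here. The header text is the sample above, and the `parse_pnetcdf_version` and `meets_minimum` helpers are hypothetical names.

```python
import re

MIN_VERSION = (1, 6, 0)  # minimum proposed in this thread

# Sample header text copied from the example above; a real check would read
# the installed pnetcdf.h instead.
HEADER_TEXT = """
#define PNETCDF_VERSION       "1.11.2"
#define PNETCDF_VERSION_MAJOR 1
#define PNETCDF_VERSION_MINOR 11
#define PNETCDF_VERSION_SUB   2
"""


def parse_pnetcdf_version(header_text):
    """Return (major, minor, sub) parsed from the numeric #define lines."""
    parts = []
    for name in ("MAJOR", "MINOR", "SUB"):
        match = re.search(
            rf"^#define\s+PNETCDF_VERSION_{name}\s+(\d+)\s*$",
            header_text,
            flags=re.MULTILINE,
        )
        if match is None:
            raise ValueError(f"PNETCDF_VERSION_{name} not found")
        parts.append(int(match.group(1)))
    return tuple(parts)


def meets_minimum(version, minimum=MIN_VERSION):
    """Compare version tuples element-wise, e.g. (1, 11, 2) >= (1, 6, 0)."""
    return version >= minimum


version = parse_pnetcdf_version(HEADER_TEXT)
print(version, meets_minimum(version))
```

Comparing the numeric constants rather than the `PNETCDF_VERSION` string matters here: as plain strings, "1.11.2" sorts before "1.6.0", which would wrongly reject a newer release.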

@shanedsnyder
Contributor

  • There are a lot of (560!) wrapped functions. I'd be surprised if they are all supported in all PnetCDF versions, but maybe they are? If not, we probably want to think about how to best handle this (i.e., autoconf checks for functions defined in newer PnetCDF releases, only formally supporting PnetCDF versions that have all of these functions implemented, etc.).

The last time PnetCDF added new I/O-related APIs was version 1.6.0, released on February 2, 2015. (FYI, the latest release is 1.12.3.) We can make 1.6.0 the minimum requirement in configure.ac. Its release notes contain the full history; just search for the keyword "New APIs".

Awesome, thanks for the details. I'll add in an autoconf check for this version number and will mention in the darshan-runtime docs about the version dependency (though it sounds like these functions are all sufficiently old, that we're unlikely to hit this problem in practice).

@wkliao -- is there a reliable way to grab the PnetCDF library version across versions that you recommend? I found pnetcdf_version program which looks like it would work, but wasn't sure if there was another preferred method.

To answer my own question from the release notes:

-------------------------------------
Version 1.5.0 (July 8, 2014)
-------------------------------------
...
  o New utility program
    * pnetcdf_version prints the version information of the PnetCDF library and
      command-line arguments used at configure

I'll just use this unless you have any objections.

@shanedsnyder
Contributor

In the PnetCDF header file, pnetcdf.h, there are a few C constants defined for version numbers. Below is an example.

#define PNETCDF_VERSION       "1.11.2"
#define PNETCDF_VERSION_MAJOR 1
#define PNETCDF_VERSION_MINOR 11
#define PNETCDF_VERSION_SUB   2

Ahh, thanks. I'll try this out.

@wkliao
Contributor Author

wkliao commented Aug 18, 2022

@shanedsnyder
Contributor

Autoconf code enforcing library version >= 1.6.0 has been pushed via bd6369a

@shanedsnyder
Contributor

  • For the new PnetCDF file module, there are a lot of timers that we probably wouldn't typically capture. Right now, this module captures _START_TIMESTAMP, _END_TIMESTAMP, and _TIME counters for each operation it instruments (open, create, close, redef, enddef, sync, coll_wait, indep_wait). To better match our level of capture in other modules, I'd suggest we just capture open TIMESTAMP information (with this accounting for opens and creates), and combine all of the cumulative TIME counters into a single META_TIME counter. Part of the reasoning for doing this is that some of our analysis tools kind of expect this format, and it will complicate PnetCDF analysis if it's being a lot more specific, I think.

This should be done via 1792807.

There is one additional simplification we could consider in that we could just get rid of the PNETCDF_FILE_CREATES counter and just track it with PNETCDF_FILE_OPENS. We have similar approaches in POSIX and MPI-IO modules, in that we don't explicitly track create counters since there's no way to tell whether they are actually creating the underlying file -- if the file already exists, these calls behave much like vanilla open calls. I'd prefer to borrow this same approach in the PnetCDF module, unless we think it's really important to know whether users are using open vs create. What do you guys think?

@wkliao
Contributor Author

wkliao commented Aug 23, 2022

Is the same implemented for HDF5, given that H5Fcreate and H5Fopen are two separate APIs?

@shanedsnyder
Contributor

Is the same implemented for HDF5, given that H5Fcreate and H5Fopen are two separate APIs?

The HDF5 module bins both of those calls into the H5F_OPENS counter, so this change would be consistent with that, too.

@wkliao
Contributor Author

wkliao commented Aug 23, 2022

I am wondering whether separating the counters can be useful for
metadata intensive applications, e.g. create/open a lot of files in the
one-file-per-process I/O pattern. In modern parallel file systems, will
the cost of file creation be significantly different from opening? If it
is not, then keeping separate counters makes less sense.

darshan-runtime/lib/darshan-pnetcdf-api.m4 Outdated Show resolved Hide resolved
darshan-runtime/lib/darshan-pnetcdf-api.m4 Outdated Show resolved Hide resolved
darshan-runtime/lib/darshan-pnetcdf-api.m4 Show resolved Hide resolved
darshan-runtime/lib/darshan-pnetcdf-api.m4 Outdated Show resolved Hide resolved
darshan-runtime/lib/darshan-pnetcdf-api.m4 Outdated Show resolved Hide resolved
@shanedsnyder
Contributor

I am wondering whether separating the counters can be useful for metadata intensive applications, e.g. create/open a lot of files in the one-file-per-process I/O pattern. In modern parallel file systems, will the cost of file creation be significantly different from opening? If it is not, then keeping separate counters makes less sense.

I agree it's useful to be able to distinguish opens/creates in the theoretical sense, since they likely have different performance characteristics on different systems. We have maybe just been careful about binning things in a CREATE counter in the past to avoid potential confusion, in that calling create variants of different I/O calls doesn't really ensure the underlying file was created (rather than just opening an existing file) -- this is probably especially true for POSIX open() calls that by habit keep the O_CREAT flag set, but maybe it's not as typical to use create variants of PnetCDF/HDF5 calls like this. It's possible users could see lots of CREATE counters in their job summary report and incorrectly come to the conclusion that their job created lots of files, if they were just using create APIs to open existing files.

That's just some background on why I think we've avoided CREATE counters in other modules. That said, this isn't a really big deal if PnetCDF wants to do something different, so if you think it's useful to capture, we can keep the additional counter. I just wanted to mention it now, since we can't really remove the counter easily once it's added, so better to think about now.

@shanedsnyder
Contributor

I added a more detailed review that captures most of my comments/feedback for the PnetCDF wrappers contained in the m4 code. I'm happy to make the code changes necessary to address these, just wanted to make sure you had a chance to respond to any feedback.

@shanedsnyder
Contributor

shanedsnyder commented Aug 24, 2022

I'll use this comment to keep track of things that aren't currently tracked in a review comment, but that we would like to see addressed before merging this in.

  • add backwards compatibility code to up-convert old versions of PnetCDF module to current version introduced in this PR
  • add CFFI bindings for new PnetCDF record structures in pnetcdf-log-format.h
  • add PnetCDF module data to I/O cost graphs, as well as add a new PnetCDF module summary section to PyDarshan summary reports

- this allows Darshan to capture whether variables are record
  variables or not
@github-actions github-actions bot removed the CI continuous integration label Oct 26, 2022
@@ -12,7 +12,7 @@ def mod_agg_iohist(self, mod, mode='append'):
"""

# sanitation and guards
-supported = ["POSIX", "MPI-IO", "H5D"]
+supported = ["POSIX", "MPI-IO", "H5D", "PNETCDF_VAR", "PNETCDF_FILE"]
Contributor


@tylerjereddy just noticed this when doing a final read-through

I think we can remove PNETCDF_FILE from the supported list, as there are no access histogram counters in the file-level records for PnetCDF (they are stored in the variable-level counters). I removed locally and tests pass, so just wanted to double check there wasn't a reason before removing.

Contributor


In the changes to cli/summary.py you can see there are similar guards for supported modules for I/O histograms that do not include PNETCDF_FILE, which makes total sense, so I just removed it from the supported list here, too. Must have been an oversight.
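
A guard of the kind described in this thread can be sketched as follows. This is an illustrative stand-in rather than the PyDarshan source: the module list reflects the outcome of the discussion (PNETCDF_FILE dropped), and the `check_iohist_module` helper and its error message are assumptions.

```python
# Modules assumed to carry access-size histogram counters; PNETCDF_FILE is
# left out because its histograms live in the variable-level records.
IOHIST_SUPPORTED = ["POSIX", "MPI-IO", "H5D", "PNETCDF_VAR"]


def check_iohist_module(mod, supported=IOHIST_SUPPORTED):
    """Raise ValueError when `mod` has no histogram counters to aggregate."""
    if mod not in supported:
        raise ValueError(
            f"{mod} is not supported for I/O histograms; "
            f"expected one of {supported}"
        )
    return mod


print(check_iohist_module("PNETCDF_VAR"))
```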

@@ -436,15 +436,21 @@ def register_figures(self):
opcounts_mods.append("H5D")
elif "H5F" in self.report.modules:
opcounts_mods.append("H5F")
elif "PNETCDF_VAR" in self.report.modules:
Contributor

@shanedsnyder shanedsnyder Oct 27, 2022


@tylerjereddy Shouldn't this be if rather than elif, otherwise we'd run into issues if a log had both PnetCDF and HDF5 modules active? i.e., use an if/elif block for HDF5 and a separate if/elif block for PnetCDF

Collaborator


Probably, and can't the same be said for the elif above as well with the new config modularity? I suspect we need more test files to catch that stuff.

Contributor

@shanedsnyder shanedsnyder Oct 28, 2022


I fixed this so that PNETCDF_FILE/PNETCDF_VAR use a separate if/elif clause from H5F/H5D. Otherwise, it doesn't look like opcounts plots would be generated for both PnetCDF and HDF5 if a log happened to have both (probably very rare).

I'm not sure if there are other places in this logic that need to be fixed. My understanding from the preceding comment is that you only want one of H5F/H5D or PNETCDF_FILE/PNETCDF_VAR, as the H5D and PNETCDF_VAR modules know how to additionally include the file-level data.

I spot checked a couple of summary reports and confirmed tests succeed with the changes, so think I'm content with the changes for now, but could still be some corner cases to sort out.
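
The separate if/elif blocks described in this comment can be sketched as below. The real `register_figures` logic also handles other modules and is not reproduced here; the `select_opcounts_mods` function is a hypothetical helper that isolates just the HDF5/PnetCDF choice.

```python
def select_opcounts_mods(modules):
    """Pick at most one HDF5 and one PnetCDF module for op-count plots.

    `modules` is any container of module names found in a log.
    """
    selected = []

    # HDF5: prefer the dataset-level module, fall back to file-level.
    if "H5D" in modules:
        selected.append("H5D")
    elif "H5F" in modules:
        selected.append("H5F")

    # PnetCDF: a separate block, so it is evaluated even when HDF5 matched.
    if "PNETCDF_VAR" in modules:
        selected.append("PNETCDF_VAR")
    elif "PNETCDF_FILE" in modules:
        selected.append("PNETCDF_FILE")

    return selected


print(select_opcounts_mods(["H5F", "H5D", "PNETCDF_FILE", "PNETCDF_VAR"]))
```

With a single chained if/elif, a log containing both libraries would stop at the first HDF5 match and never reach the PnetCDF branches, which is the issue raised above.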

@tylerjereddy
Collaborator

If you click through the sub-100 % patch diff coverage in CI, those missing lines may be related to your concern above, assuming it isn't a codecov fusing issue.

shanedsnyder
shanedsnyder previously approved these changes Oct 28, 2022
@shanedsnyder
Contributor

@carns @wkliao I think I'm all done here, if there are any things you guys wanted to double check before merging. I know it's probably not practical to look through everything, but I'll wait for a thumbs up from you guys before moving ahead.

@wkliao
Contributor Author

wkliao commented Oct 28, 2022

It will be great to see this PR merged soon.
I plan to use this feature to check WRF I/O pattern.
@shanedsnyder , @tylerjereddy Thanks for your great effort !

@carns
Contributor

carns commented Oct 29, 2022

@wkliao @shanedsnyder is it possible to silence some of the warnings that pop out with -Wall? I'm seeing a large volume of warnings like the following with gcc 12.2:

./darshan-pnetcdf-api.c:2524:17: warning: unused variable ‘i’ [-Wunused-variable]
 2524 |             int i, j;
      |                 ^
./darshan-pnetcdf-api.c:2507:9: warning: unused variable ‘err’ [-Wunused-variable]
 2507 |     int err, ret;
      |         ^~~
./darshan-pnetcdf-api.c: In function ‘__wrap_ncmpi_get_var_schar_all’:
./darshan-pnetcdf-api.c:2586:20: warning: unused variable ‘j’ [-Wunused-variable]
 2586 |             int i, j;
      |                    ^

I don't believe these are important; I suspect that it's just an artifact of how the m4 generated code is used for various wrappers. It does make the build output noisy, though, so it could obfuscate meaningful warnings.

I don't know what the options are for this situation; is there something simple like a gcc pragma maybe?

carns
carns previously approved these changes Oct 29, 2022
Contributor

@carns carns left a comment


Looks good to me. If there is a way to make the -Wall warnings quieter that would be great, but I approve regardless. Everything runs fine for me. It's neat to see the file:variable notation in the record name fields for the VAR module.

@wkliao wkliao dismissed stale reviews from carns and shanedsnyder via 09fe016 October 29, 2022 17:20
@wkliao
Contributor Author

wkliao commented Oct 29, 2022

Those variables were probably used before, but no longer are.
They have been removed in 09fe016

@carns
Contributor

carns commented Oct 30, 2022

Those variables were probably used before, but no longer are. They have been removed in 09fe016

Great, thanks @wkliao ! I confirmed that everything compiles cleanly for me now. PR looks good to go to me.

@shanedsnyder shanedsnyder merged commit aef584b into darshan-hpc:main Nov 8, 2022