New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add README for Alpaka modules #40690
Conversation
+code-checks Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-40690/34047
|
A new Pull Request was created by @makortel (Matti Kortelainen) for master. It involves the following packages:
@cmsbuild, @makortel, @fwyzard can you please review it and eventually sign? Thanks. cms-bot commands are listed here |
* Any code that `#include`s a header from the framework or from the `HeterogeneousCore/AlpakaCore`(?) must be separated from the Alpaka device code, and have the usual `.cc` suffix. | ||
* A few framework headers are allowed to be used in `.dev.cc` files, e.g. `FWCore/Utilities/interface/Exception.h` or `FWCore/MessageLogger/interface/MessageLogger.h` (?) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the leading order I'd think all code in HeterogeneousCore/AlpakaCore
is about the interaction with the framework, and therefore would not be allowed to be used in .dev.cc
files.
In the next-to-leading order, maybe some individual headers could be useful in .dev.cc
, like an ability to throw cms::Exception
and use MessageLogger, and with limited set of headers the maintenance burden of keeping them usable also by device compilers would be tolerable?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FWCore/Utilities/interface/CMSUnrollLoop.h
is OK as it explicitly supports CUDA and ROCm (we could move or copy these macros into Alpaka itself, but even then it would not be a problem if somebody uses this file);- other macro-only headers are also fine (
FWCore/Utilities/interface/stringize.h
, etc.); FWCore/Utilities/interface/Exception.h
should be OK, as it does not depend on other packages, and as long as the exception are used only in host code, not in device code;FWCore/MessageLogger
might be OK, too; though my preference would be to keep the messages in the.cc
files and have only device code and simple wrappers in.dev.cc
files.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
By the way, EventCache.h
and QueueCache.h
might be fine in .dev.cc
files (even though I don't think we have a use for them in user code); I was thinking to eventually push a reworked version directly into Alpaka.
* There must be a host-only flavor of the data format that is either idependent of Alpaka, or depends only on Alpaka's Serial backend | ||
* The host-only data format must be defined in `Package/SubPackage/interface/` directory | ||
* If the data format is to be serialized (with ROOT), it must be serialized in a way that the on-disk format does not depend on Alpaka, i.e. it can be read without Alpaka | ||
* For Event data products the ROOT dictionary should be defined in `Package/SubPackage/src/classes{.h,_def.xml}` (**NOTE**: currently they are in `.../src/alpaka/classes_serial{.h,_def.xml}` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would be better to define the dictionaries for the host-only data format classes in src/classes_def.xml
instead of src/alpaka/classes_serial_def.xml
, because it would be more consistent with "host platform data formats are usable by non-Alpaka modules". Technically ROOT takes care of loading the proper shared object of the dictionary, so the impact is close to invisible in practice. At least we would be able to avoid building the shared object for the serial backend.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've been thinking about this as well, and it would also simplify supporting different host back-ends like TBB.
I'm mostly waiting to see if Eric's work with dictionaries can simplify things further, but I'm OK with the move.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To be clear: if you prefer to move them already now, I'm fine with that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since the move is not particularly urgent, how about, to avoid further conflicts with #40285, we don't touch the existing classes_serial_def.xml
files now, but move them after #40285, and follow the guideline for any other new code that may be PR'd before that?
(although I would be fine with moving them already now as well)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we will be able to review and merge #40285 this week, as we'll be busy with the tutorial until Friday. If you'd like to move them, do go ahead.
My only request is to base the branch with these changes on top of 3cd90b2, which is itself based on CMSSW_13_0_0_pre4
, in order to try and keep a single branch valid for both CMSSW_13_0_X
and CMSSW_13_1_X
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fair enough. Maybe I'll move them then.
* The `classes_<backend>_def.xml` should declare the dictionaries for the data product type `T`, `edm::DeviceProduct<T>`, and `edm::Wrapper<edm::DeviceProduct<T>>`. All these dictionaries must be declared as transient with `persistent="false"` attribute. | ||
* The list of `<backend>` includes currently: `cuda` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe <platform>
would make more sense for data formats than <backend>
? (even if technically we have only "backend" concept right now)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
<platform>
(well, Platform
) is already an Alpaka concept, and indeed it seems to be a better match than <backend>
here: Alpaka has platforms for "CPU", "CUDA", "HIP" (ROCm), etc., and doesn't differentiate between "CPU with serial backend" and "CPU with TBB parallel backend".
OpenMP 5, OpenACC and SYCL are their own platforms - but, to be fair, they likely require their own memory management.
|
||
Data formats, for both Event and EventSetup, should be placed following their usual rules. The Alpaka-specific conventions are | ||
* There must be a host-only flavor of the data format that is either idependent of Alpaka, or depends only on Alpaka's Serial backend | ||
* The host-only data format must be defined in `Package/SubPackage/interface/` directory |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
* The host-only data format must be defined in `Package/SubPackage/interface/` directory | |
* The host-only data format must be defined in `DataFormats/Package/interface/` directory |
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Technically there are other data format packages than DataFormats
(e.g. SimDataFormats
). Fully transient data products are also allowed to be defined in packages outside of the *DataFormats
(and with looser constraints).
I agree seeing DataFormats
explicitly here could be helpful for e.g. newcomers. A possible compromise would be to use DataFormats
in these placement guidelines, and after the bullet list add a link to https://twiki.cern.ch/twiki/bin/view/CMSPublic/SWGuideCreatingNewProducts for the exact constraints for data format classes. What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Technically there are other data format packages than
DataFormats
(e.g.SimDataFormats
).
True, but I don't see us making changes to the simulation DataFormats any time soon...
Fully transient data products are also allowed to be defined in packages outside of the
*DataFormats
(and with looser constraints)
I admit I'm not familiar with the use cases for transient data formats, other than for testing and CUDADataFormats
(and I think we want to get rid of the latter).
A possible compromise would be ...
Sounds good to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the end I used the DataFormats/SubPackage
only for the cases that referred to Event data products, and kept the more general Package/SubPackage
elsewhere, as only a few EventSetup data format classes are defined in DataFormats
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just typos.
b8ca610
to
a1067a1
Compare
+code-checks Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-40690/34269
|
a1067a1
to
d4b3eb8
Compare
d4b3eb8
to
e72acff
Compare
+code-checks Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-40690/34301
|
+heterogeneous |
This pull request is fully signed and it will be integrated in one of the next master IBs after it passes the integration tests. This pull request will now be reviewed by the release team before it's merged. @perrotta, @dpiparo, @rappoccio (and backports should be raised in the release meeting by the corresponding L2) |
(since this is just a readme, I thought we could skip tests) |
please abort |
+1 |
please abort |
merge |
PR description:
This PR adds a README.md documenting the Alpaka module framework integration, and also some recommended practices.
PR validation:
None