Global driver_version switch for mapping input to drivers #3897
I still would like to run |
I thought the desire was to drop the vmc_batch name. @jtkrogel convinced me that not having the second driver name was better, and it definitely is considering the legacy handling of input. ProjectData is high level and parsed very early. We need to parse qmchamiltonian differently based on whether the driver is batched or not, and this greatly simplifies that.
xxx_batch will be history. This is the intended change. It is much friendlier for users, and we want to block off the old drivers in any case. I think we should try to improve on "driver_epoch". There is a lot more than the driver being switched here, plus this tag will be visible for "eternity" given the transition plan. Ask on Slack for ideas.
Any chance we change Nexus currently uses |
Any arguments to not use these? Consistency is a strong argument for. |
@@ -103,13 +103,14 @@ constexpr std::array<const char*, 2> valid_dmc_input_sections{
constexpr int valid_dmc_input_dmc_index = 0;
constexpr int valid_dmc_input_dmc_batch_index = 1;

/** As far as I can tell these are no longer valid */
I need to update the WFOptDriverInput class, and I'll change these tests then as well.
Technically, determining the behavior by relying fully on a global state is convenient, but it is bad both for code parsing and for human reading. As a user, I request that existing input with "vmc_batch" continue functioning. It is bad to upset users by making breaking changes. The cost of keeping the "vmc_batch" tag working is nothing, and there is no ambiguity. Here is what I consider a sane way. Users new to this complexity may choose to just rely on the global flag. This is also a desired method to help migration. Once the legacy vmc driver is deleted, we can remove "vmc_legacy/batch", and there is no ambiguity.
Will post with a suggested transition plan when I have more time this week. I think we could support _batch but with a warning that this will be deprecated in future, similar to how we need a warning when the global flag is not specified. Once Nexus is updated no one should be invoking xxx_batch in any new inputs.
Not having a high level switch between the legacy architecture and the new architecture makes it quite torturous to reject legacy estimators in the qmchamiltonian section as requested in #3875. I support introducing separate code paths for input parsing, as the legacy input handling is a mess that is tightly entangled with the legacy simulation objects. We can support the current _batch names as they are now, but I don't think we should write spaghetti code that goes back and checks previously parsed and constructed objects like the hamiltonian pool.
We do need a high level switch. We agreed on that. I was asking to keep "_batch" functioning. It is only a key mapping issue.
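To make the "key mapping" idea concrete, here is a minimal sketch of how a method tag and an optional global switch could be resolved into a method name plus an architecture. Every name in it (`DriverVersion`, `ResolvedDriver`, `resolveDriver`) is hypothetical and not taken from the QMCPACK sources. It also assumes that a missing global switch defaults to legacy and that a `_batch` tag combined with an explicit legacy switch is an error; neither rule was settled in this discussion.

```cpp
#include <optional>
#include <stdexcept>
#include <string>

// Hypothetical names throughout; this is not the QMCPACK API.
enum class DriverVersion { LEGACY, BATCH };

struct ResolvedDriver
{
  std::string method;          // method name with any "_batch" suffix removed
  DriverVersion version;       // architecture the section should be read with
  bool used_deprecated_suffix; // caller can emit a deprecation warning when true
};

// global_version is empty when the input does not set the global switch.
inline ResolvedDriver resolveDriver(const std::string& method_tag,
                                    std::optional<DriverVersion> global_version)
{
  const std::string suffix = "_batch";
  const bool has_suffix    = method_tag.size() > suffix.size() &&
      method_tag.compare(method_tag.size() - suffix.size(), suffix.size(), suffix) == 0;

  if (has_suffix)
  {
    // Assumption: an explicit legacy switch plus a "_batch" tag is contradictory.
    if (global_version && *global_version == DriverVersion::LEGACY)
      throw std::runtime_error("'" + method_tag + "' conflicts with a legacy global switch");
    return {method_tag.substr(0, method_tag.size() - suffix.size()), DriverVersion::BATCH, true};
  }

  // Assumption: no suffix and no global switch means the legacy drivers.
  return {method_tag, global_version.value_or(DriverVersion::LEGACY), false};
}
```

With this mapping, `resolveDriver("vmc_batch", std::nullopt)` and `resolveDriver("vmc", DriverVersion::BATCH)` describe the same run, and only the first one sets the deprecation flag.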
src/QMCDrivers/QMCDriverFactory.cpp
if (curName != "qmc")
  qmc_mode = curName;

const int nchars = qmc_mode.size();

// Begin to separate batch input reading from the legacy input parsing
if (project_data_.get_driver_epoch() == "batched")
Could you add a bool argument (force_batch) to readSection and expand the unit tests of readSection?
Also check getEngineName() after the driver is created in the unit tests.
Since you cannot create a QMCDriverFactory without ProjectData, and the global driver epoch is set there, I don't think it's a good idea to also allow it to be forced directly. There is one source of truth for this parameter, and it is the ProjectData the driver factory is constructed with.
The _batch names are supported for the convenience of users and tests currently using them, but they are not valid input when the driver epoch is set. Based on the driver epoch input, the qmcsystem and hamiltonian nodes will be parsed differently, so we don't want to encourage further use of _batch or introduce _legacy as a tag for drivers.
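The "one source of truth" argument can be sketched as follows: the factory keeps a const reference to the project-level object and exposes no way to override the setting per call. `ProjectInfo` and `DriverFactorySketch` are hypothetical stand-ins, not the real ProjectData or QMCDriverFactory classes, and the sketch assumes that an unset switch behaves like legacy, so only a batched switch rejects the `_batch` tags.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical stand-ins; the real ProjectData and QMCDriverFactory differ.
enum class DriverVersion { LEGACY, BATCH };

// Parsed early and owns the global setting.
class ProjectInfo
{
public:
  explicit ProjectInfo(DriverVersion v) : driver_version_(v) {}
  DriverVersion driverVersion() const { return driver_version_; }

private:
  DriverVersion driver_version_;
};

class DriverFactorySketch
{
public:
  // The factory cannot exist without project data, so no force flag is needed.
  explicit DriverFactorySketch(const ProjectInfo& project) : project_(project) {}

  // Returns true when the section should go through the batched input path.
  bool readsAsBatched(const std::string& method_tag) const
  {
    if (project_.driverVersion() == DriverVersion::BATCH)
    {
      // With the global switch set, the suffixed tags are rejected.
      if (endsWithBatch(method_tag))
        throw std::invalid_argument("'" + method_tag + "' is not valid with the batched global switch");
      return true;
    }
    // Assumption: without the switch, the suffix alone still selects batched.
    return endsWithBatch(method_tag);
  }

private:
  static bool endsWithBatch(const std::string& tag)
  {
    const std::string suffix = "_batch";
    return tag.size() > suffix.size() &&
        tag.compare(tag.size() - suffix.size(), suffix.size(), suffix) == 0;
  }

  const ProjectInfo& project_;
};
```

Because the reference is fixed at construction, a unit test covers both cases by building two factories from two project objects rather than by toggling a flag on one factory.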
I forgot that ProjectData is required by QMCDriverFactory. Then there is no need to add a bool argument. I have not read your latest commit, but I guess you have both epoch cases covered in the unit test now. What did you mean by "the qmcsystem and hamiltonian node will be parsed differently"? Did you mean "hamiltonian of qmcsystem"?
We are also going to allow an <estimators> node for global estimator definitions. This will only be valid in the batched epoch.
I see. The global "estimators" will be outside "hamiltonian" but inside "qmcsystem".
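As a rough illustration of that rule, the check below accepts an `estimators` child of `qmcsystem` only when the batched architecture is selected. The function name, the enum, and the representation of the children as a list of node names are all assumptions made for this sketch; the real parser works on XML nodes, and the placement inside "qmcsystem" is only the reading given in this thread.

```cpp
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical names; the real input parser operates on XML nodes.
enum class DriverVersion { LEGACY, BATCH };

// qmcsystem_children holds the element names found directly under <qmcsystem>.
// Returns true if a global <estimators> node is present and allowed.
inline bool hasGlobalEstimators(const std::vector<std::string>& qmcsystem_children,
                                DriverVersion version)
{
  for (const auto& name : qmcsystem_children)
  {
    if (name == "estimators")
    {
      // The global node only makes sense for the batched architecture.
      if (version != DriverVersion::BATCH)
        throw std::invalid_argument("<estimators> is only accepted with the batched drivers");
      return true;
    }
  }
  return false;
}
```

Doing this check at the point where `qmcsystem` is read is what the high level switch makes possible: the parser already knows the architecture and does not have to revisit previously constructed objects.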
Having |
@PDoakORNL could you run clang-format on all the changed source files? Some of the diffs look strange.
The driver factory change looks good.
I have concerns about the changes in ProjectData.
See my comments around ProjectData class change.
Test this please |
What did we end up with? I could not determine from the discussion here.
Test this please |
Test this please |
Test this please |
* In the application context project data can indicate the input be read in the context of
* the batched driver architecture.
* param[in] cur qmc section node
* param[in] force_batch forces input to be evaluated as if project driver type = batched
There doesn't seem to be force_batch parameter.
src/QMCDrivers/QMCDriverFactory.h
@@ -48,12 +48,23 @@ class QMCDriverFactory
QMCRunType new_run_type = QMCRunType::DUMMY;
};

/** Application uses this constructor
* param[in] project_data this is stored as a reference and this state controls later behavior.
* For both the driver factory i.e. driver epoch. And the drivers it creates
Leftover reference to 'driver epoch'
src/Utilities/ProjectData.h
@@ -113,6 +132,9 @@ class ProjectData : public OhmmsElementBase

///max cpu seconds
int max_cpu_secs_;

// The driver epoch of the project
left over mention of epoch
Thanks for catching this, Mark. We should confirm grep -i epoch returns nothing.
My refactor didn't reach into the comments. I did a straight text-based recursive grep and I don't see any "epoch" related to this, only in the time API.
Test this please |
LGTM. @prckent good for you now?
I think we are missing a test for the case where driver_version is not specified but vmc_batch (or one of the other _batch drivers) is called explicitly. This should work currently, but in the future, when the driver_version tag is made a requirement, it will break (and the test will be updated to expect_fail status). A quick clone of one of the deterministic tests would handle this.
The no switch + *_batch call is already tested. (This is what current users of the batched code will have in their inputs).
Note: After discussion and review, the switch was decided to be named driver_version (not epoch)
This removes the
and replaces them with
Proposed changes
ProjectData can now be queried with get_driver_epoch(), so explicitly different input parsing can easily be done for batched drivers vs. legacy.
The unit tests pass.
Most of the short and deterministic batched tests pass. I'm not sure any of the system level batched optimizer tests pass. They seem extremely slow.
In the future this will greatly simplify preventing legacy/new input mismatches. i.e. #3875
What type(s) of changes does this code introduce?
Does this introduce a breaking change?
For legacy no, for batched yes.
Your existing batched driver input will stop working.
You must make the change shown at the top of the PR to have them continue working.
What systems has this change been tested on?
Leconte
Checklist
Update the following with a yes where the items apply. If you're unsure about any of them, don't hesitate to ask. This is
simply a reminder of what we are going to look for before merging your code.