Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

1620 Compare Module and Data Source keyword #1646

Merged
merged 74 commits into from
Sep 20, 2023
Merged

Conversation

rhinoella
Copy link
Contributor

@rhinoella rhinoella commented Aug 23, 2023

Module to compare two datasets and a new keyword to manage data sources

Compare Module

  • Module to replace the old 'DataTest' module with added functionalities
  • Currently this PR does not include the GUI implementation, as I think it is better in a different PR

Data Source Keyword

  • A keyword that creates Data objects (Data1D, Data2D, Data3D, etc.) from user-specified internal or external data
  • Can accept up to two different data sources depending on the needs of the module

Notes

  • Compare module is currently only able to handle Data1D sources

Closes #1620

@rhinoella rhinoella marked this pull request as draft August 23, 2023 13:33
Copy link
Member

@trisyoungs trisyoungs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I may not be fully awake while going through this, but it feels kinda overkill for what we want to do, and I fear is dragging us away from just getting a working 1D comparison up and running correctly, with multiple datasets, which is what we really need.

Perhaps I'm mistaking your intent, but at present this feels like its written to be a catch-all class that can store multiple different dimensionalities of data (which we don't want) rather than something more akin to a templated class taking, for instance, the type and file format (Data1D and Data1DImportFileFormat) and which then has specialisations to handle the difficult parts that don't translate across dimensionalities (e.g. the fact that Data1D derives from a base class, and which complicates the search for internal data of that type).

src/io/import/dataBase.h Outdated Show resolved Hide resolved
src/io/import/dataBase.h Outdated Show resolved Hide resolved
src/keywords/dataSource.cpp Outdated Show resolved Hide resolved
src/math/data1D.h Outdated Show resolved Hide resolved
src/keywords/dataSource.cpp Outdated Show resolved Hide resolved
@rhinoella
Copy link
Contributor Author

I may not be fully awake while going through this, but it feels kinda overkill for what we want to do, and I fear is dragging us away from just getting a working 1D comparison up and running correctly, with multiple datasets, which is what we really need.

Perhaps I'm mistaking your intent, but at present this feels like its written to be a catch-all class that can store multiple different dimensionalities of data (which we don't want) rather than something more akin to a templated class taking, for instance, the type and file format (Data1D and Data1DImportFileFormat) and which then has specialisations to handle the difficult parts that don't translate across dimensionalities (e.g. the fact that Data1D derives from a base class, and which complicates the search for internal data of that type).

@trisyoungs The intent is to have a DataSource template class that can handle Data1D, 2D and 3D objects depending on what the type is specified as:

https://github.com/disorderedmaterials/dissolve/blob/713c3528a866636c2e900480d2dae601d2843f72/src/modules/compare/compare.cpp#L10C5-L18C5

The specialisation (Data1D requiring a different search function) has to be handled in Compare.cpp, as it requires the ModuleContext object which is provided in the Compare::process() function. So the main function of DataSource is to accept a set of internal/external data (1D, 2D or 3D depending on what is specified), construct a vector of the relevant object (Data1D, Data2D or Data3D) and add it to the module's data through the module's addData() function that is passed into its constructor.

@trisyoungs
Copy link
Member

@rhinoella Because of the need to have the processing module data to search? Couldn't you just pass this in as a const reference when creating the keyword?

@rhinoella
Copy link
Contributor Author

@trisyoungs Unfortunately the moduleContext object is only known and constructed during runtime, when it is passed into the Compare::process() function during Dissolve::iterate(), so the keyword is constructed before the moduleContext object is created.

@trisyoungs
Copy link
Member

@rhinoella True true. Alas, module constructors used to get passed the main Dissolve& object, but no longer, and so you could have gotten the GenericList from there. Curses to this design! I guess you'd better just work this up, focussing on the 1D case, and see where you get to.

@rhinoella
Copy link
Contributor Author

Changes

  • Type definition for a Data classes' associated file and format is stored in the class. This improves type-safety (prevents someone from using a Data2D object with a Data1DImportFileFormat) and also means that the template class DataSourceKeyword only needs one template parameter- the data class
  • DataSource class stores the Data object and Format object as an std::variant instead of a std::unique_ptr as this further improves type safety

Note

Due to the removal of unique_ptr, the DataBase class (parent class for Data1DBase, Data2DBase and Data3DBase) can also be removed. I left it in anyways because it reduces code-repetition but please let me know if it should be removed.

Copy link
Contributor

@rprospero rprospero left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Everything looks really good. I've just included one small change to fix the Windows build.

src/classes/dataSource.h Show resolved Hide resolved
src/classes/dataSource.cpp Outdated Show resolved Hide resolved
rhinoella and others added 3 commits September 14, 2023 11:41
Co-authored-by: Adam Washington <adam.washington@stfc.ac.uk>
Copy link
Member

@trisyoungs trisyoungs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking really very nice indeed. I have a bunch of formatting comments, but nothing more than that.

src/classes/dataSource.cpp Show resolved Hide resolved
src/classes/dataSource.cpp Outdated Show resolved Hide resolved
src/classes/dataSource.cpp Show resolved Hide resolved
src/classes/dataSource.cpp Outdated Show resolved Hide resolved
src/classes/dataSource.cpp Outdated Show resolved Hide resolved
src/classes/dataSource.h Outdated Show resolved Hide resolved
src/classes/dataSource.h Outdated Show resolved Hide resolved
src/keywords/dataSource.h Outdated Show resolved Hide resolved
src/keywords/dataSource.h Outdated Show resolved Hide resolved
src/keywords/dataSource.h Outdated Show resolved Hide resolved
@rhinoella rhinoella merged commit 8437c75 into develop Sep 20, 2023
7 checks passed
@rhinoella rhinoella deleted the 1620-compare-module branch September 20, 2023 10:21
rprospero added a commit that referenced this pull request Mar 11, 2024
Co-authored-by: Tristan Youngs <tristan.youngs@stfc.ac.uk>
Co-authored-by: Adam Washington <adam.washington@stfc.ac.uk>
rprospero added a commit that referenced this pull request Apr 8, 2024
Co-authored-by: Tristan Youngs <tristan.youngs@stfc.ac.uk>
Co-authored-by: Adam Washington <adam.washington@stfc.ac.uk>
rprospero added a commit that referenced this pull request Apr 9, 2024
Co-authored-by: Tristan Youngs <tristan.youngs@stfc.ac.uk>
Co-authored-by: Adam Washington <adam.washington@stfc.ac.uk>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

(DataTest) Data Display and Ranges
3 participants