Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Simplify data export/import process #210

Merged
merged 31 commits into from Apr 28, 2021

Conversation

stognini
Copy link
Member

@stognini stognini commented Apr 9, 2021

This PR updates the Geant4 data export/import to a simpler mechanism:

  • All exported/imported data is stored in a single struct named ImportData.
  • Except for ImportData, all other import classes used by geant-exporter were moved to io/detail.
  • geant-exporter stores the populated ImportData into a single TTree entry, vastly simplifying both writing and reading processes.
  • RootImporter::operator()("tree_name", "branch_name") returns the ImportData object stored in the ROOT file. It now takes the names of the tree and branch as input parameters to avoid any hardcoded string name.
  • All Celeritas' host/device classes now have a from_import(ImportData& data) function that returns a constructed shared_ptr.
  • test/gtest/detail/Macros.hh now includes TEST_IF_CELERITAS_USE_ROOT. This allows disabling individual tests accordingly and is currently used to enable/disable all from_import(...) tests added to the host/device classes.

Example code on how the data is now imported into Celeritas:

RootImporter import("/path/to/root_file.root");
const auto   data            = import("tree_name", "branch_name");
const auto   particle_params = ParticleParams::from_import(data);

- Add ImportData as a single struct to be saved to the ROOT file
for easier readability.
- RootImporter not yet updated. At the end, it will
  - Only load ImportData into memory.
  - Param classes might have a from_import(ImportData& data) or
  similar function that will construct the loaded data. Another
  option is to create  helper classes to construct the Params ptrs.
- Clean up RootImporter
- Fix RootImporter.noroot
- Add TEST_IF_CELERITAS_USE_ROOT macro in gtest/detail/Macros.hh
for enabling/disabling from_import() tests in all classes that
have this functionality
- Update RootImporter test to read the new ROOT file
- RootImporter::operator() now needs the tree and branch names to
read the ROOT file. This provides the flexibility to users to
write the file as they please
- Refactor geant-exporter functions/classes to better match their
purposes
- Improve documentation
- Extend, update, or improve doxygen documentation in a set of files
- Rename slabsGeometry.gdml to four-steel-slabs.gdml to avoid confusion
@stognini stognini added core Software engineering infrastructure integration labels Apr 9, 2021
@stognini stognini requested a review from sethrj April 9, 2021 15:27
@sethrj sethrj mentioned this pull request Apr 11, 2021
23 tasks
Copy link
Member

@sethrj sethrj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great! This is on the right track and probably 95% to the goal. The RootImporter operator is a little funky, and some of the from_import methods need to be handed existing "pre-converted" data structures rather than rebuilding them.

src/CMakeLists.txt Outdated Show resolved Hide resolved
src/io/RootImporter.hh Outdated Show resolved Hide resolved
src/io/RootImporter.hh Outdated Show resolved Hide resolved
src/io/detail/ImportMaterial.hh Outdated Show resolved Hide resolved
src/io/ImportData.hh Outdated Show resolved Hide resolved
test/io/GdmlGeometryMap.test.cc Outdated Show resolved Hide resolved
test/physics/base/Particle.test.cc Outdated Show resolved Hide resolved
test/physics/base/Particle.test.cc Outdated Show resolved Hide resolved
test/physics/em/LivermorePE.test.cc Outdated Show resolved Hide resolved
test/physics/material/Material.test.cc Outdated Show resolved Hide resolved
@stognini
Copy link
Member Author

Now that ImportData already includes all the needed information to connect volumes and materials, the GdmlGeometryMap class is not fundamentally needed anymore. Nevertheless, it provides an easy interface to read the material, element, and volume information while constructing new classes. It is now constructed from a populated ImportData object, as seen in GdmlGeometryMap.test, GeoMaterial.test, and LDemoParams.

Two points of discussion are:

  • Do we want to keep it, since it provides a better interface to find materials, elements, and volumes? Or should we just delete it and read the vectors available in ImportData directly when constructing other objects?
  • If we intend to keep it, we might revisit the GdmlGeometryMap name now that it serves as an interface.

@sethrj
Copy link
Member

sethrj commented Apr 20, 2021

Now that ImportData already includes all the needed information to connect volumes and materials, the GdmlGeometryMap class is not fundamentally needed anymore. Nevertheless, it provides an easy interface to read the material, element, and volume information while constructing new classes. It is now constructed from a populated ImportData object, as seen in GdmlGeometryMap.test, GeoMaterial.test, and LDemoParams.

Two points of discussion are:

  • Do we want to keep it, since it provides a better interface to find materials, elements, and volumes? Or should we just delete it and read the vectors available in ImportData directly when constructing other objects?
  • If we intend to keep it, we might revisit the GdmlGeometryMap name now that it serves as an interface.

I'd like the "import" layer to be as simple as possible, so we move the data as quickly as possible into our more feature-rich Params classes, which offers the finding/mapping functionality we need. It could make sense to have an extra mapping layer at the import layer if we were filtering out data at problem setup time, but at this point it looks like our import layer will always be loading all of the data because it's so interconnected. Any extra useful mapping functionality should be added to the Params classes or a helper function/class that uses multiple params classes.

I don't think that removing the GdmlMap would make construction more complicated—the data is stored as vectors and we need it as vectors, so there shouldn't be much mapping involved.

@stognini
Copy link
Member Author

I don't think that removing the GdmlMap would make construction more complicated—the data is stored as vectors and we need it as vectors, so there shouldn't be much mapping involved.

It mostly does not, so I have removed GdmlGeometryMap in its entirety. I have just added 2 small free functions in ImportData to make life a little bit easier. They retrieve full elemental or material data from a given element_id or material_id. Element, material, and volume IDs are defined as unsigned int to simplify this import layer as much as possible and not rely on any particular type, but let me know if you believe that a type-safe index is fundamental.

Copy link
Member

@sethrj sethrj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great, Stefano, thanks for the updates. Regarding the helper functions: are they used anywhere? I don't think we should ever need to "search" through the import data (especially since the search criterion is just the index of an element/material in its vector), and as I commented there I think the new element/material ID accessors are redundant in the context of loading an entire ImportFile.

app/geant-exporter/ImportProcessConverter.cc Outdated Show resolved Hide resolved
app/geant-exporter/ImportProcessConverter.cc Outdated Show resolved Hide resolved
src/io/ImportData.cc Outdated Show resolved Hide resolved
src/io/ImportElement.hh Outdated Show resolved Hide resolved
src/io/ImportMaterial.hh Outdated Show resolved Hide resolved
src/io/ImportData.cc Outdated Show resolved Hide resolved
geant-exporter:
- Each ImportData member has its own store function for easier
  code maintenance.
- Vectors for element, material, and volume are resized according
  to the number of entities and each enttity is stored at position
  EntityID() in the vector. This makes sure that the entity's
  position always matches its id.
geant-exporter-cat:
- Each ImportData member has its own print function for easier
  code maintenance.
@stognini
Copy link
Member Author

Sorry for not waiting on another PR to address some new improvements, but they fit in the current scope. After we moved from a "geometry" idea (GdmlGeometryMap mapped elements, materials, and volumes) to a struct of vectors in ImportData, I refactored the whole code for consistency, better readability, and maintenance. So now both geant-exporter and geant-exporter-cat have individual functions to write/print each member of ImportData:

  • geant-exporter:
// Populate ImportData members
import_data.particles = store_particles();
import_data.elements  = store_elements();
import_data.materials = store_materials();
import_data.processes = store_processes();
import_data.volumes   = store_volumes(world_phys_volume);
  • geant-exporter-cat:
print_particles(*particle_params);
print_elements(data.elements);
print_materials(data.materials, data.elements, *particle_params);
print_processes(data.processes, *particle_params);
print_volumes(data.volumes, data.materials);

Currently geant-exporter-cat print functions still rely on a mixed bag of ImportData and Params. This should be revisited soon.

Copy link
Member

@sethrj sethrj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great aside from one final scaling issue in the geant exporter that I'd like to have addressed... sorry!

app/geant-exporter/geant-exporter.cc Outdated Show resolved Hide resolved
app/geant-exporter/geant-exporter.cc Show resolved Hide resolved
app/geant-exporter/geant-exporter.cc Show resolved Hide resolved
- Add assertion before storing a material in
geant-exporter::store_materials()
- Improve code documentation
- Remove ImportProcessConverter::remove_empty() function in favor
of a much simpler solution: use operator bool to check if
ImportProcess is not empty before adding it to the
vector<ImportProcess>
@stognini
Copy link
Member Author

stognini commented Apr 27, 2021

I've added a much simpler (and kinda obvious) solution regarding the empty ImportProcess objects returned by ImportProcessConverter::operator(). It makes the code cleaner and allows us to remove the ImportProcessConverter::remove_empty() function, which you were not very fond of.

We simply use ImportProcess::operator bool() to check if ImportProcess is empty before adding it to the vector:

if (ImportProcess process
    = process_writer(g4_particle_def, *process_list[j]))
{
    // Not an empty process, so it was not added in a previous loop
    processes.push_back(process);
}

@sethrj
Copy link
Member

sethrj commented Apr 27, 2021

@stognini great idea! One final gotcha there though is the distinction between copy and move. The push_back(process) does a recursive copy because process is an "lvalue". Change to push_back(std::move(process)) for the desired effect.

Copy link
Member

@sethrj sethrj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a bunch, @stognini !

@sethrj sethrj merged commit 57a15f5 into celeritas-project:master Apr 28, 2021
@stognini stognini deleted the simplify-importer branch August 12, 2021 18:24
@sethrj sethrj added the minor Minor internal changes or fixes (including CI updates) label Nov 24, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
core Software engineering infrastructure minor Minor internal changes or fixes (including CI updates)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants