Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Save spots #552

Open
wants to merge 8 commits into
base: master
Choose a base branch
from
Open

Save spots #552

wants to merge 8 commits into from

Conversation

donald-e-boyce
Copy link
Collaborator

This pull request is a work in progress and not intended to merged immediately.
I expect it to evaluated to determine worthwhile improvements and updated accordingly.

The intent is to provide a way to save reflection data for spots for examination.
The current hexrd saves a summary table of all the reflections of each grain in a text file.
The GrainDataWriter_h5 class allows you to save full information for each reflection.
It is enabled in pull_spots but not accessible through the config file interface;
also it was set up to write a single file for all grains, creating a huge file.
This pull request allows you save spot intensities, one file per grain, accessibly through
the fitgrains config file.
In addition, the summary and full detail are both available as well a new option to
save the intensities only, creating a much smaller file.

This also cleans up much code from hedm_instrument.
There were separate classes for writing patches (spots) to text files (summary) and
to hdf5 (detailed), and there was much code handling file names, directories and output
formats within the pull_spots() method, distracting from the real functionality of it.
All of that is now handled in the spots module.

Usage

You set writing options in the fitgrains config file. The output_format is
a list of formats to write the spot outputs. There are three choices, "summary",
"full", and "sparse"; "full" is not recommended for more than a few grains.
Corresponding output files (for grain 1) are "spots00001.out" for the summary,
"spots00001.hdf5" for the full output, and "spots00001-sparse.hdf5" for the sparse output.

fit_grains:
  output_format: ['summary', 'sparse']

The "sparse" is not actually sparse (yet).
Currently only the intensities are saved, and because they are mostly zero, the hdf5
compression makes the file size small, about 2MB for the example I ran, compared with
40MB for the full. The full output also contains the intensities, but it also contains
ij-arrays and xy-arrays, which are dense over the same range as the intensities, making
the file much bigger.

For the purpose of finding overlapping peaks, I am thinking that if the distortion is small,
then then you won't actually need the xy values to mark a reflection as having overlapping peaks.
But if we need to include more information we can.

Changes

The main changes were to create a new spots module for handling reflection data.
There is a Spot class for information about a single reflection.
Existing classes GrainDataWriter_h5 and PatchDataWriter were moved from
hedm_instrument to the new spots module.

Modules Changed

  • hexrd.instrument.hedm_instrument:
    • moved GrainDataWriter_h5 and PatchDataWriter to spots
    • moved centers_of_edge_vec() function to gridutil, keeping a reference to it via import
    • moved functions unwrap_h5_to_dict and unwrap_dict_to_h5 to new module hexrd.utils.hdf5, but kept references to them
  • created new module hexrd.utils.hdf5
  • moved h5py_read_string to hexrd.utils.hdf5, but kept reference to it
  • hexrd.config.fitgrains: removed (unused) skip_on_estimate property and added output_format property
  • hexrd.fitgrains: modified to pass output options to pull_spots
  • hexrd.gridutil: added centers_of_edge_vec() function
  • hexrd.instrument.spots: new file
    • has Spots class containing full patch data and summary values
    • has SpotsWriter class for ouput of spots summary text file, full hdf5 file or a simplified hdf5 file
  • hexrd.utils.compatibility.py: moved h5py_read_string to hexrd.utils.hdf5.py, keeping reference to it
  • hexrd.materials.py: changed import from utils.compatibility to utils.hdf5
  • tests/config/test_fit_grains.py: added unit tests for the new output_format property

Copy link
Collaborator

@psavery psavery left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like some really nice changes so far! If you want to make changes still, let us know when it is ready for review for merging!

dataset = dataset.asstr()

return dataset[()]
from .hdf5 import h5py_read_string
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! After this PR is merged, we can make sure paths like this one are updated in HEXRDGUI to be the new path instead, and then we can at some point remove this file.

@@ -0,0 +1,100 @@
"""HDF5 Tools
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice to have these in a separate file!

else:
d[key] = tmp


Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very nice cleanup!

Comment on lines +2049 to +2053
detector_id, iRefl, peak_id, hkl_id, hkl,
tth_edges, eta_edges, np.radians(ome_eval),
xyc_arr, ijs, frame_indices, patch_data,
ang_centers[i_pt], xy_centers[i_pt],
meas_angs, meas_xy, sum_int, max_int
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When there are many arguments, I think it's usually preferable to use kwargs, since it is less error-prone (if you forget a kwarg, the error is clearer than if you forget an arg, and you don't have to worry about accidentally swapping two arguments if you use kwargs, which can lead to subtle bugs if two args are similar).

But since this is already using args, we can just leave it!

Comment on lines +37 to +43
hkl_id: in

detector_iRefl, peak_id, hkl_id, hkl,
tth_edges, eta_edges, ome_eval,
xyc_arr, ijs, frame_indices, patch_data,
ang_centers, xy_centers, meas_angs, meas_xy

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like this docstring isn't finished yet.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants