-
Notifications
You must be signed in to change notification settings - Fork 7
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Xarray and scaling #69
Conversation
…le and automatic scaling of particles and images. Fixed some empty datablocks bugs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great stuff, will have a proper play later but happy for you to merge when you see fit!
return f'{self.positions.data.shape}' | ||
@property | ||
def n(self): | ||
return self.positions.n |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so n is the name we will use for the 'row' dimension across the codebase? just checking so we're on the same page
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes! I'm not set on it, if you have better ideas. I think it's convenient to have a quick-to-type, easily remembered thing that will work across the codebase, more or less
@@ -16,21 +16,13 @@ class ParticleBlock(OrientedPointBlock): | |||
'class_plot': ClassPlotDepictor, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ClassPlot is for plotting particles in different colours according to their class, right?
In general I think names should be a little more descriptive e.g. ClassificationResultsDepictor
I've also been thinking about removing depictor from the names of these classes, I don't think we need it, what's your feeling on that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, we can change this to clearer names. I would keep Depictor in it though, to be consistent (and since we can't remove Depictor
from ParticleDepictor
without being very ambiguous).
return data | ||
dims = ['x', 'y', 'z'] | ||
return xr.DataArray(data, dims=['n', 'spatial1', 'spatial2'], | ||
coords=(range(len(data)), dims[:data.shape[1]], dims[:data.shape[1]])) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you explain this coords thing? I had a quick look at the xarray docs and it's not immediately obvious to me what's going on here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
coords
is the xarray equivalent of index
in pandas. So while an image has:
dims = ('z', 'y', 'x')
coords = None # because there is no qualitative distinction between pixel 1 and pixel 2
A point block has:
dims = ('n', 'spatial') # where spatial stands for "spatial coordinates"
coords = (range(0, npoints), ['x', 'y', 'z'])
I would have preferred coords
over spatial
, but xarray allows for attribute access to coordinates, and having it named the same as the coords
attribute it would block it off (data.spatial
now works).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just to clarify: here in OrientationBlock
I called them spatial1
and spatial2
for lack of a better term. Ultimately it was just a trick to automatically see if it should use xy or xyz for the unit vector and similar. I don't think it's an elegant solution, but it's a bit less clunky than what we had. Do you have better ideas?
@@ -49,7 +52,7 @@ def extract_data(df, mode='3.1', name_regex=None): | |||
|
|||
coords = df_volume[coord_headings[dim]].to_numpy(dtype=float) | |||
shifts = df_volume.get(shift_headings[dim][mode], pd.Series([0.0])).to_numpy() | |||
px_size = df_volume.get(pixel_size_headings, pd.Series([1.0])).to_numpy() | |||
px_size = df_volume.get(pixel_size_headings[mode], pd.Series([1.0])).to_numpy() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do rln 3.0 files have anything which tells you pixel size in the file? I didn't think they did
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yup, it's just in the table as rlnPixelSize
rather than being in the optics as rlnImagePixelSize
. Not sure why the change. There's also a rlnDetectorPixelSize
or something like that, but I could not find a difference from the simple rlnPixelSize
in any of my files. I guess if things break it will pop up and be clear then!
axes=convention['axes'], | ||
intrinsic=convention['intrinsic'], | ||
right_handed_rotation=convention['right_handed_rotation']) | ||
axes='zyz', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
little bit worries about the lack of separation between star files and the conventions for angles in them but okay to leave it for now as long as everything is working!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm changing this file quite a bit, so we can come back to this when it's fixed. But are there cases in which relion uses different conventions?
A thing to keep in mind: some stuff (ndim functionality, reshaping nonsense such as in TransformBlock) broke due to the use of xarray. I fixed some stuff, but others I will leave as is, since they are "legacy" code and should be either rewritten completely or moved to different places. For now, I will remove the test where necessary (only TransformBlock has this issue). |
Codecov Report
@@ Coverage Diff @@
## develop #69 +/- ##
===========================================
+ Coverage 68.51% 70.00% +1.48%
===========================================
Files 85 99 +14
Lines 1639 1960 +321
===========================================
+ Hits 1123 1372 +249
- Misses 516 588 +72
Continue to review full report at Codecov.
|
Migrated some stuff to xarray, so we now have named dimensions and dimension coordinates!
It's mostly useless (for now) for the user interface, but it makes some code MUCH easier (see coordinate reordering stuff in
PointBlock
).Another change from this PR that is instead very visible from the user side: scaling based on pixel size! Now everything is automagic for mrc and star files.
Smaller things: cleans up some old code and introduces a custom
ParseError
that makes debugging IO operations a bit easier.