Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Local indexing: take into account phi in nearest neighbour analysis #1459

Merged
merged 7 commits into from
Oct 23, 2020

Conversation

rjgildea
Copy link
Contributor

@rjgildea rjgildea commented Oct 21, 2020

In the "local" index assignment method, the nearest neighbour analysis should only identify as neighbouring those reflections that are close in phi. If, for example, there is a rotation misset in a scan that is more than one full rotation, this previously could cause issues for the local index assignment, leading to low numbers of indexed reflections.

Fixes #1458

In the "local" index assignment method, the nearest neighbour
analysis should only identify as neighbouring those reflections
that are close in phi. If, for example, there is a rotation misset
in a scan that is more than one full rotation, this previously
could cause issues for the local index assignment, leading to low
numbers of indexed reflections.

Fixes #1458
@rjgildea
Copy link
Contributor Author

Tested by manually modifying the scan oscillation for the 4rotation dials_regression indexing test dataset to set oscillation to 0.997 degrees:

$ dials.show $(libtbx.find_in_repositories dials_regression)/indexing_test_data/4rotation/experiments.json | grep Scan -A 3
Scan:
    image range:   {1,1440}
    oscillation:   {0,1}
    exposure time: 2

$ dials.show modified.expt | grep Scan -A 3
Scan:
    image range:   {1,1440}
    oscillation:   {0,0.997}
    exposure time: 2

Results with previous behaviour:

$ dials.index modified.expt $(libtbx.find_in_repositories dials_regression)/indexing_test_data/4rotation/strong.pickle index_assignment.method=local
...
RMSDs by experiment:
+-------+--------+----------+----------+------------+
|   Exp |   Nref |   RMSD_X |   RMSD_Y |     RMSD_Z |
|    id |        |     (px) |     (px) |   (images) |
|-------+--------+----------+----------+------------|
|     0 | 103431 |  0.75239 |  0.64086 |     1.0838 |
+-------+--------+----------+----------+------------+

Refined crystal models:
model 1 (107677 reflections):
Crystal:
    Unit cell: 48.1989(15), 48.1579(16), 98.554(3), 75.8452(7), 75.9054(6), 60.0346(6)
    Space group: P 1
    U matrix:  {{ 0.6536, -0.3604, -0.6655},
                {-0.7534, -0.2265, -0.6173},
                { 0.0717,  0.9049, -0.4195}}
    B matrix:  {{ 0.0207,  0.0000,  0.0000},
                {-0.0120,  0.0240,  0.0000},
                {-0.0035, -0.0035,  0.0106}}
    A = UB:    {{ 0.0202, -0.0063, -0.0070},
                {-0.0108, -0.0032, -0.0065},
                {-0.0079,  0.0232, -0.0044}}
+------------+-------------+---------------+-------------+
|   Imageset |   # indexed |   # unindexed | % indexed   |
|------------+-------------+---------------+-------------|
|          0 |      107677 |        198601 | 35.2%       |
+------------+-------------+---------------+-------------+

newplot (11)
newplot (13)

Results with this PR:

$ dials.index modified.expt $(libtbx.find_in_repositories dials_regression)/indexing_test_data/4rotation/strong.pickle index_assignment.method=local
...
RMSDs by experiment:
+-------+--------+----------+----------+------------+
|   Exp |   Nref |   RMSD_X |   RMSD_Y |     RMSD_Z |
|    id |        |     (px) |     (px) |   (images) |
|-------+--------+----------+----------+------------|
|     0 | 143568 |  0.63947 |  0.87009 |     1.1982 |
+-------+--------+----------+----------+------------+

Refined crystal models:
model 1 (299644 reflections):
Crystal:
    Unit cell: 48.2151(14), 48.2297(14), 98.546(3), 90.0396(6), 75.8949(6), 59.9967(6)
    Space group: P 1
    U matrix:  {{-0.6534, -0.3604,  0.6658},
                { 0.7541, -0.2320,  0.6145},
                {-0.0670,  0.9035,  0.4233}}
    B matrix:  {{ 0.0207,  0.0000,  0.0000},
                {-0.0120,  0.0239,  0.0000},
                {-0.0070,  0.0035,  0.0106}}
    A = UB:    {{-0.0139, -0.0063,  0.0070},
                { 0.0141, -0.0034,  0.0065},
                {-0.0152,  0.0231,  0.0045}}
+------------+-------------+---------------+-------------+
|   Imageset |   # indexed |   # unindexed | % indexed   |
|------------+-------------+---------------+-------------|
|          0 |      299644 |          6634 | 97.8%       |
+------------+-------------+---------------+-------------+

newplot (12)
newplot (14)

@rjgildea rjgildea changed the title Take into account phi in nearest neighbour analysis Local indexing: take into account phi in nearest neighbour analysis Oct 21, 2020
@codecov
Copy link

codecov bot commented Oct 21, 2020

Codecov Report

Merging #1459 into master will increase coverage by 0.01%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##           master    #1459      +/-   ##
==========================================
+ Coverage   65.66%   65.68%   +0.01%     
==========================================
  Files         614      614              
  Lines       69490    69525      +35     
  Branches     9513     9515       +2     
==========================================
+ Hits        45630    45665      +35     
  Misses      22022    22022              
  Partials     1838     1838              

Copy link
Member

@dagewa dagewa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a very useful change that will improve indexing success for wide scans where experimental geometry is not static, or even static but bad initial geometry.

@Anthchirp
Copy link
Member

Anthchirp commented Oct 22, 2020

Is the 4rotation data set particularly suitable for this or is there anything in dials_data that would work just as well?

I'm just mindful that 4rotation is only used in one other place and is quite chunky.

@rjgildea
Copy link
Contributor Author

Is the 4rotation data set particularly suitable for this or is there anything in dials_data that would work just as well?

I'm just mindful that 4rotation is only used in one other place and is quite chunky.

Yes - the bug was only highlighted when using data sets with multiple full rotations. dials_data doesn't even contain a data set with a single full rotation as far as I can tell. We could probably generate some predicted reflection centroids for the purpose of this test, however the additional test is already many more lines of code than the bug fix itself!

@dagewa
Copy link
Member

dagewa commented Oct 22, 2020

Generating "observations" is a bit of a strange dance that could be made easier. The way I've done it in the past does prediction twice, to first get rays that intersect the detector and then to get a reflection table that includes flags etc. Then you can set the observations from the predictions and invent some variances (for refinement). There are various messy examples in the refinement tests, like here:

#############################
# Generate some reflections #
#############################
# All indices in a 2.0 Angstrom sphere
resolution = 2.0
index_generator = IndexGenerator(
mycrystal.get_unit_cell(),
space_group(space_group_symbols(1).hall()).type(),
resolution,
)
indices = index_generator.to_array()
sequence_range = myscan.get_oscillation_range(deg=False)
im_width = myscan.get_oscillation(deg=False)[1]
assert sequence_range == (0.0, pi)
assert approx_equal(im_width, 0.1 * pi / 180.0)
# Predict rays within the sequence range
ray_predictor = ScansRayPredictor(experiments, sequence_range)
obs_refs = ray_predictor(indices)
# Take only those rays that intersect the detector
intersects = ray_intersection(mydetector, obs_refs)
obs_refs = obs_refs.select(intersects)
# Make a reflection predictor and re-predict for all these reflections. The
# result is the same, but we gain also the flags and xyzcal.px columns
ref_predictor = ScansExperimentsPredictor(experiments)
obs_refs["id"] = flex.int(len(obs_refs), 0)
obs_refs = ref_predictor(obs_refs)
# Set 'observed' centroids from the predicted ones
obs_refs["xyzobs.mm.value"] = obs_refs["xyzcal.mm"]
# Invent some variances for the centroid positions of the simulated data
im_width = 0.1 * pi / 180.0
px_size = mydetector[0].get_pixel_size()
var_x = flex.double(len(obs_refs), (px_size[0] / 2.0) ** 2)
var_y = flex.double(len(obs_refs), (px_size[1] / 2.0) ** 2)
var_phi = flex.double(len(obs_refs), (im_width / 2.0) ** 2)
obs_refs["xyzobs.mm.variance"] = flex.vec3_double(var_x, var_y, var_phi)
# The total number of observations should be 1128
assert len(obs_refs) == 1128

Modify insulin scan to ensure it covers multiple rotations and
generate predicted centroids for the purposes of the test.
This has the added bonus that we know exactly what are the correct
miller indices to expect.
@rjgildea
Copy link
Contributor Author

Generating "observations" is a bit of a strange dance that could be made easier. The way I've done it in the past does prediction twice, to first get rays that intersect the detector and then to get a reflection table that includes flags etc. Then you can set the observations from the predictions and invent some variances (for refinement).

It turns out it's much easier in this case, as all I need are some observed centroids, so I can use flex.reflection_table.from_predictions(experiment) and copy the calculated centroids to predicted centroids:
d1c9d51

@Anthchirp happy now? 🙂

@Anthchirp
Copy link
Member

@Anthchirp happy now? 🙂

I was already happy with your last response, but - yes, certainly 🙂

Copy link
Contributor

@graeme-winter graeme-winter left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Important set of fixes thank you - wonder if your news fragment should include a comment for > 360° sweeps to aid the end user in understanding the significance of this?

af::init_functor_null<double>());
double* r = rlps_double.begin();
for (std::size_t i = 0; i < reciprocal_space_points.size(); i++) {
for (std::size_t j = 0; j < 3; j++) {
*r++ = reciprocal_space_points[i][j];
}
*r++ = phi[i];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My only comment here is that stuff like *r++ always gets me scratching my head as to whether ++ happens before or after the operation etc. - and we already know that this should be written to rlps_double[4*i+3] - applies to the line above which is obviously unchanged as well.

Appreciate that this is consistent with the "house style" locally and does mean an increment vs. add/multiply but I wonder how much difference that makes to the runtime (i.e. compilers are clever & can unroll the add/multiply version vs. cannot unroll the increment version)

@rjgildea rjgildea merged commit afa8d6a into master Oct 23, 2020
@rjgildea rjgildea deleted the index_local branch October 23, 2020 08:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Improve Local Indexing
4 participants