
Adaptive search #192

Merged: 32 commits into soft-matter:master, Feb 12, 2015
Conversation

@nkeim (Contributor) commented Jan 5, 2015

This PR implements a new feature I call "adaptive search," which is explained and demonstrated in two new tutorials. See soft-matter/trackpy-examples#17.

Adaptive search is my attempt to address an age-old problem when tracking dense packings with Crocker-Grier: how to select a search_range that does not exclude valid links, without creating many large subnets that make linking impractical or impossible. Conventionally, this is done by guessing, followed by trial, error, frustration, and resignation. However, when I was tracking 30k particles over 12k frames, it made no sense to carefully choose a single value for all particles in all frames — a search_range that worked perfectly well for the first 5500 frames or so would cause a SubnetOversizeException in frame 5501, because a corner of the image was momentarily displaced.

With adaptive search enabled, one instead specifies a maximum search_range for the entire movie. If an oversize subnet is encountered, linking becomes re-entrant: the particles of the offending subnet are re-linked with progressively smaller values of search_range. In this way, solvable (or trivial) subnets are broken off and solved one by one, until there are no more particles left to link. When it works, this is a huge simplification for the user: guess a reasonable maximum search_range based on the largest particle displacements, and leave the rest to trackpy.

Implementing this feature required a few big internal changes to the linker. The ones worth mentioning are:

  • Refactoring the linker into a class, making it relatively easy to recurse without passing around lots of parameters and state.
  • Unifying MAX_SUB_NET_SIZE and its associated logic, and providing an alternate value more suited to the adaptive search algorithm.
  • Properly testing the handling of oversize subnets, as well as adaptive search.

This is a big one — it's taken roughly the past year to dream this up, tell @danielballan about it, implement it, use it in my research, rebase, rebase, rebase, and write the documentation. (Although most of the heavy lifting was done by @danielballan and @tacaswell when they wrote the linking tests.) Some serious scrutiny is needed, though if there are no major problems it would be good to target this for the v0.3 milestone. Thanks!

@@ -474,6 +476,12 @@ def link_df(features, search_range, memory=0,
        the next frame.

        For examples of how this works, see the "predict" module.
    adaptive_step : float (optional)
@tacaswell (Member) commented on the diff:

This is picky, but I have been parsing numpydoc formatted strings recently and the convention is

name : type, optional

@tacaswell (Member) commented:

Left some trivial-ish comments from reading the code. My knee-jerk reaction is to be skeptical, but I am having trouble coming up with a good reason why this is worse than hand tuning.

@tacaswell added this to the 0.3 milestone on Jan 5, 2015
@nkeim (Contributor, Author) commented Jan 9, 2015

Thanks, @tacaswell. I've just force-pushed a rebased branch that addresses your line comments. I'm glad you think that the example notebook gives users a sense of when adaptive search is and is not a good idea. But I'm not quite satisfied with that yet because there's presently no diagnostic info related to subnets or adaptive search — we're warning users that they can shoot themselves in the foot, but it's as though they can't even see where the darn thing is pointed. So as I mention in my comment to #184, some debug-level logging would go a long way here, and should be added before we merge.

@nkeim (Contributor, Author) commented Jan 11, 2015

Inserting @tacaswell's #184 (comment) into this discussion:

  • Add an option (or at least a documented trick) to drop subnets entirely. I like this. It could just be a separate, null subnet solver that the user chooses with link_strategy; a sketch follows this list.
  • Let the linker return data on how adaptive search was used for each particle. This would be useful, but it also seems like it would be overly stretching the present API. I'll give it some thought.

@danielballan (Member) commented:

+1 for link_strategy='drop'

@nkeim (Contributor, Author) commented Feb 2, 2015

Lots of new stuff in this rebase/push. Highlights:

  • New drop link_strategy, as requested. I realized this can also be used for a dry run through a difficult linking job, to scan for oversize subnets before attempting the actual computation.
  • With the diag keyword argument enabled, linking code can attach arbitrary bits of diagnostic information to each particle, returned as extra columns in the result DataFrame. A tutorial for this feature is in the works.
    • The only sane way to add the diagnostic info in link_df was to make a copy of the input DataFrame. This could break some poorly-written user code. So I added the copy_features option to turn on copying without diagnostics.
  • Some minor performance enhancements, validated by asv benchmarking. When adaptive search and link diagnostics are turned off, the performance penalty due to this PR is zero, within error.
  • Particle ID numbering in the first level no longer depends on ordering of a set. The problem will still affect the rest of the movie, but in the first level it was causing at least one test to fail intermittently.

Obviously we have greatly expanded the scope of the original PR. But the diagnostics were motivated by adaptive search, and they mess with some of the same pieces of code, so I decided to take the lazy approach and then get feedback.

        neighbor_strategy=neighbor_strategy, link_strategy=link_strategy,
        hash_size=hash_size, box_size=box_size)

    if diagnostics:
        features = strip_diagnostics(features)
@danielballan (Member) commented on the diff:

Could you add a comment noting that strip_diagnostics does not modify the original? I was worried for a moment that it did.

@nkeim (Contributor, Author) commented Feb 3, 2015

Thanks, @danielballan. Good catches. I'll fix those issues; let me know whether you want the new stuff in its own PR.

Also on my to-do list before a (possible) merge:

  • Replace adaptive_limit (number of recursion steps) with adaptive_stop (smallest acceptable search_range). It makes more sense to think in terms of length scales; compare the sketch after this list.
  • Draft a tutorial on diagnostics, so you don't have to squint at the source code so much.

@@ -58,6 +81,13 @@ def test_one_trivial_stepper(self):
        assert_frame_equal(actual, expected)
        actual_iter = self.link_df_iter(f, 5, hash_size=(10, 2))
        assert_frame_equal(actual_iter, expected)
        if self.do_diagnostics:
            assert 'diag_search_range' in self.diag.columns
            print(self.diag.diag_search_range)
@nkeim (Contributor, Author) commented on the diff:

Oops! Will remove this print and check for others.

@danielballan (Member) commented:

Sounds good to me. Good idea with adaptive_stop. Best to stay in the physical world (as opposed to Algorithm World) wherever possible.

@nkeim (Contributor, Author) commented Feb 9, 2015

The promised changes are now up. Also, I wrote a new tutorial on linking diagnostics, and added diagnostics to the tutorial on adaptive search. Both tutorials are in soft-matter/trackpy-examples#17.

@danielballan (Member) commented:

The linking diagnostics tutorial is just so very cool. Wow.

I think it's time to merge this and the examples PR. Any final revisions?

@tacaswell (Member) commented:

I gave these a read-over a while ago and had nothing to say but 'cool!' I can give a more thorough code review if you want, but the time scale for that is iffy.


@danielballan (Member) commented:

OK, I'm going for it. I know first-hand how busy you are. :-D

@danielballan merged commit 41c1f1e into soft-matter:master on Feb 12, 2015. Follow-up commits referencing this pull request were added by @danielballan (Feb 12) and @tacaswell (Feb 13).