Better choice of offset-text. #5785

Merged
merged 5 commits into from May 11, 2016

7 participants
Contributor

anntzer commented Jan 2, 2016

The axis offset text is chosen as follows:

xlims => offsettext
123, 189 => 0
12341, 12349 => 12340
99999.5, 100010.5 => 100000 # (also a test for #5780)
99990.5, 100000.5 => 100000
1233999, 1234001 => 1234000

(and the same for negative limits).

See #5755.

tacaswell added this to the next major release (2.0) milestone Jan 2, 2016

Member

WeatherGod commented Jan 6, 2016

I have some test cases from a while back when I last tried to come up with a better offset text logic. These test cases really helped explore the possible edge cases in that attempt. You might find it useful (the test data, that is, not the code): https://gist.github.com/WeatherGod/272f4022bf7a8ca12ff4

Contributor

anntzer commented Jan 6, 2016

Thanks @WeatherGod, I incorporated them into the test suite and slightly changed the algorithm. Now, the main condition for using the offset is whether every label shares at least two significant digits at the beginning. Negative labels are allowed, as discussed above; I think that's the main questionable choice remaining.
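To make the "two common significant digits" condition concrete, here is a minimal sketch. It is not the code from this PR: the helper names `common_leading_digits` and `would_use_offset` are hypothetical, and it assumes positive integer labels and ignores the case where ticks straddle a round number.

```python
from os.path import commonprefix


def common_leading_digits(labels):
    """Count the leading digits shared by all labels (hypothetical helper).

    Assumes positive, integer-valued labels.
    """
    texts = [str(int(v)) for v in labels]
    if len({len(t) for t in texts}) != 1:
        # Labels of different magnitude share no leading digits.
        return 0
    return len(commonprefix(texts))


def would_use_offset(labels, min_common=2):
    """Illustrative rule: use an offset if at least `min_common` digits are shared."""
    return common_leading_digits(labels) >= min_common


print(would_use_offset([12341, 12345, 12349]))  # True: "1234" is shared
print(would_use_offset([123, 150, 189]))        # False: only "1" is shared
```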

Contributor

anntzer commented Jan 6, 2016

(Hm, the failing test passes for me locally, and it doesn't involve offset texts.)

@tacaswell tacaswell and 1 other commented on an outdated diff Jan 7, 2016

lib/matplotlib/tests/test_ticker.py
@@ -159,6 +159,49 @@ def test_SymmetricalLogLocator_set_params():
nose.tools.assert_equal(sym.numticks, 8)
+def test_ScalarFormatter_offset_value():
@tacaswell

tacaswell Jan 7, 2016

Owner

This needs an @cleanup decorator

@anntzer

anntzer Jan 7, 2016

Contributor

As it is, the @cleanup decorator doesn't support generative tests. I'll write a patch for that first.

@tacaswell

tacaswell Jan 7, 2016

Owner

Awesome. That is something that has driven me crazy a couple of times. I have dealt with it by putting the decorator on the called function and creating all of the figures/axes in that function.
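The workaround described here can be sketched as follows. This is only an illustration: `fake_cleanup` is a hypothetical stand-in, not matplotlib's decorator, and `run_generative` is a toy runner that imitates how a nose-style collector calls each yielded case.

```python
import functools

LOG = []  # records setup/teardown events, for illustration only


def fake_cleanup(func):
    """Hypothetical stand-in for a cleanup decorator (not matplotlib's)."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        LOG.append("setup")
        try:
            return func(*args, **kwargs)
        finally:
            LOG.append("teardown")
    return wrapper


@fake_cleanup
def check_sum(a, b, expected):
    # Per-case state (figures, axes, ...) would be created here,
    # inside the decorated function.
    assert a + b == expected


def generate_sum_cases():
    # The generator itself stays undecorated; it only yields
    # (callable, *args) tuples.
    for a, b, expected in [(1, 2, 3), (2, 2, 4)]:
        yield check_sum, a, b, expected


def run_generative(generator):
    """Toy runner: call every yielded case once."""
    for func, *args in generator():
        func(*args)


run_generative(generate_sum_cases)
print(LOG)  # ['setup', 'teardown', 'setup', 'teardown']
```

Because the decorator wraps the called function rather than the generator, setup and teardown run once per yielded case.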

@anntzer

anntzer Jan 7, 2016

Contributor

See #5809.

Member

WeatherGod commented Jan 27, 2016

Huh, never seen this error before: "RuntimeError: In set_text: could not load glyph". Also, this branch needs rebasing.

Contributor

anntzer commented Jan 27, 2016

Rebased. I cannot reproduce the glyph-loading errors locally.

@QuLogic QuLogic commented on an outdated diff Feb 18, 2016

lib/matplotlib/ticker.py
- if ave_loc < 0:
- self.offset = (math.ceil(np.max(locs) / p10) * p10)
- else:
- self.offset = (math.floor(np.min(locs) / p10) * p10)
- else:
- self.offset = 0
+ if not len(locs):
+ self.offset = 0
+ return
+ lmin, lmax = locs.min(), locs.max()
+ # min, max comparing absolute values (we want division to round towards
+ # zero so we work on absolute values).
+ abs_min, abs_max = sorted([abs(float(lmin)), abs(float(lmax))])
+ # Only use offset if there are at least two ticks and every tick has
+ # the same sign.
+ if lmin == lmax or lmin <= 0 <= lmax:
@QuLogic

QuLogic Feb 18, 2016

Member

Can move this section above the absolute values to short-circuit a little earlier.

@QuLogic QuLogic commented on the diff Feb 18, 2016

lib/matplotlib/tests/test_ticker.py
+ (12592.82, 12591.43, 12590),
+ (9., 12., 0),
+ (900., 1200., 0),
+ (1900., 1200., 0),
+ (0.99, 1.01, 1),
+ (9.99, 10.01, 10),
+ (99.99, 100.01, 100),
+ (5.99, 6.01, 6),
+ (15.99, 16.01, 16),
+ (-0.452, 0.492, 0),
+ (-0.492, 0.492, 0),
+ (12331.4, 12350.5, 12300),
+ (-12335.3, 12335.3, 0)]
+
+ for left, right, offset in test_data:
+ yield check_offset_for, left, right, offset
@QuLogic

QuLogic Feb 18, 2016

Member

I think all but two test cases are left < right; would it also make sense to yield in the reverse order? Also, I don't think there are any tests for left == right (though I don't see why that won't work correctly.)
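A minimal sketch of what yielding in both orders could look like, assuming the expected offset does not depend on the order of the limits. The helper name `both_orders` is hypothetical and not part of the test suite.

```python
def both_orders(cases):
    """Yield each (left, right, expected) case and its mirrored variant.

    Assumes the expected offset is the same for (left, right) and
    (right, left); cases with left == right are yielded only once.
    """
    for left, right, expected in cases:
        yield left, right, expected
        if left != right:
            yield right, left, expected


# Two rows taken from the test data quoted in the diff above.
sample = [(9., 12., 0), (900., 1200., 0)]
for left, right, expected in both_orders(sample):
    print(left, right, expected)
```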

@anntzer

anntzer Feb 18, 2016

Contributor

I don't really think I need to support left == right because I don't see how this can ever happen. Other issues handled in new (rebased) commit.

@QuLogic

QuLogic Feb 18, 2016

Member

Well, your code does check for the left == right case, so it's good to codify what the expected behaviour is in that case. I think the default locators would avoid this situation, but I'm not sure about user-created locators.

@anntzer

anntzer Feb 18, 2016

Contributor

Sure. I'll wait for #6022 to be merged in so that the tests are actually run.

@anntzer

anntzer Feb 21, 2016

Contributor

updated.

@QuLogic QuLogic commented on an outdated diff Feb 18, 2016

lib/matplotlib/ticker.py
+ sign = math.copysign(1, lmin)
+ # What is the smallest power of ten such that abs_min and abs_max are
+ # equal up to that precision?
+ # Note: Internally using oom instead of 10 ** oom avoids some numerical
+ # accuracy issues.
+ oom = math.ceil(math.log10(abs_max))
+ while True:
+ if abs_min // 10 ** oom != abs_max // 10 ** oom:
+ oom += 1
+ break
+ oom -= 1
+ if (abs_max - abs_min) / 10 ** oom <= 1e-2:
+ # Handle the case of straddling a multiple of a large power of ten
+ # (relative to the span).
+ # What is the smallest power of ten such that abs_min and abs_max
+ # at most 1 apart?
@QuLogic

QuLogic Feb 18, 2016

Member

s/at/are at/

@QuLogic QuLogic commented on an outdated diff Feb 18, 2016

lib/matplotlib/ticker.py
+ # What is the smallest power of ten such that abs_min and abs_max are
+ # equal up to that precision?
+ # Note: Internally using oom instead of 10 ** oom avoids some numerical
+ # accuracy issues.
+ oom = math.ceil(math.log10(abs_max))
+ while True:
+ if abs_min // 10 ** oom != abs_max // 10 ** oom:
+ oom += 1
+ break
+ oom -= 1
+ if (abs_max - abs_min) / 10 ** oom <= 1e-2:
+ # Handle the case of straddling a multiple of a large power of ten
+ # (relative to the span).
+ # What is the smallest power of ten such that abs_min and abs_max
+ # at most 1 apart?
+ oom = math.ceil(math.log10(abs_max))
@QuLogic

QuLogic Feb 18, 2016

Member

Instead of re-calculating this (which is maybe expensive?), can you just increase oom until it meets the opposite condition?

@QuLogic

QuLogic Feb 18, 2016

Member

(Assuming you won't get an infinite loop somehow...)

tacaswell closed this Feb 18, 2016

tacaswell reopened this Feb 18, 2016

@tacaswell tacaswell added needs_review and removed needs_review labels Feb 18, 2016

@VincentVandalon VincentVandalon commented on the diff Mar 21, 2016

lib/matplotlib/ticker.py
self._set_orderOfMagnitude(d)
self._set_format(vmin, vmax)
- def _set_offset(self, range):
- # offset of 20,001 is 20,000, for example
+ def _compute_offset(self):
@VincentVandalon

VincentVandalon Mar 21, 2016

I fully agree that renaming this method was really needed; the new name actually reflects the purpose of the function.

@tacaswell

tacaswell Mar 21, 2016

Owner

Contra my reluctance to change the public API at all, changing anything with a leading _ is fair game.

Owner

efiring commented May 2, 2016

This is failing the two tests in which left==right. It's not clear to me whether the problem is in the algorithm, or in the tests.

anntzer added some commits Jan 2, 2016

@anntzer anntzer Better choice of offset-text.
9a4ecfb
@anntzer anntzer Only use offset if it saves >=2 sig. digits.
Test cases courtesy of @WeatherGod.  Slightly improved numerical
accuracy.
f37fd3d
@anntzer anntzer Slightly more efficient impl.; more tests. 0c39a78
Contributor

anntzer commented May 2, 2016

The test failure was probably masked at some point by the misuse of a generative test. Anyways, this comes down to a policy choice for the left == right case. I guess that "by continuity", we should just let the offset be equal to the single value in that case (say the values were x-eps and x+eps, then it would be normal for x (rounded to ~eps) to be the offset, so we should keep that as eps->0).
Thoughts?

Owner

efiring commented May 2, 2016

The present algorithm is giving (1, 1, 1) and (123, 123, 120) as left, right, offset. Nonsingular is expanding the actual left, right ranges to (0.999, 1.001) and (122.877, 123.123). So the test cases with left==right are not really special cases. Either we like what the algorithm does, or we don't. I still don't completely understand what the algorithm's criteria are, but the results at least look reasonable, so I would be inclined to just change the tests to match. Maybe the tests should actually be changed to use the values provided by the present implementation of nonsingular rather than feeding in equal numbers, since otherwise a change to nonsingular would cause this test to fail for reasons not related to the algorithm it is testing.
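The expansion quoted above is consistent with padding each side by 0.1% of the value. A simplified sketch under that assumption follows; `expand_if_singular` is a hypothetical helper, not the library's `nonsingular`, and it only handles equal, nonzero limits.

```python
import math


def expand_if_singular(vmin, vmax, expander=0.001):
    """Hypothetical helper: widen a zero-width range by a relative margin.

    Assumes a relative expansion of `expander` on each side, which
    reproduces the ranges quoted in the comment above.
    """
    if vmin == vmax and vmin != 0:
        pad = expander * abs(vmin)
        return vmin - pad, vmax + pad
    return vmin, vmax


lo, hi = expand_if_singular(123, 123)
print(math.isclose(lo, 122.877), math.isclose(hi, 123.123))  # True True
```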

@anntzer anntzer Fix test values.
9d0acb2
Contributor

anntzer commented May 2, 2016

OK, now I remember what I did: an offset text is used if it "saves" at least two significant digits, i.e. if the length of the (low, high) range is no more than a hundredth of the largest power of ten below high. So yes, the (1, 1, 1) and (123, 123, 120) outputs are expected given the effect of nonsingular. I rewrote the tests (still using equal left and right for now because using different left and right is basically already tested) and rebased on master.
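Read literally, the criterion stated here can be sketched as below. This is an illustration of the stated rule for positive values only, with a hypothetical function name; it is not the implementation in `ticker.py`.

```python
import math


def saves_two_digits(low, high):
    """Hypothetical check of the stated criterion, for positive values.

    True if the span is at most a hundredth of the largest power of
    ten not exceeding the upper end of the range.
    """
    low, high = sorted((low, high))
    power = 10 ** math.floor(math.log10(high))
    return (high - low) <= power / 100


print(saves_two_digits(12341, 12349))    # True: span 8 <= 100
print(saves_two_digits(123, 189))        # False: span 66 > 1
print(saves_two_digits(122.877, 123.123))  # True: span ~0.246 <= 1
```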

Member

WeatherGod commented May 2, 2016

This is looking good to me. Were there any remaining concerns? Do we need to update any documentation about the improved offset text algorithm?

Contributor

anntzer commented May 2, 2016

I can add a snippet like "The default offset-text choice was changed to only use significant digits that are common to all ticks (e.g. 1231..1239 -> 1230, instead of 1231), except when they straddle a relatively large multiple of a power of ten in which case that multiple is chosen (e.g. 1999..2001->2000)." Looks good?
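For readers who want to try the two examples in that snippet, here is a self-contained sketch that follows the fragments quoted in the review above. The function name `sketch_offset`, the exact form of the second search loop, and the final digit threshold are assumptions; the code takes the smallest and largest tick positions rather than axis limits, and it is not the merged implementation.

```python
import math


def sketch_offset(lmin, lmax, min_saved_digits=2):
    """Hypothetical offset choice from the extreme tick positions."""
    # No offset for a single tick or for ticks straddling zero.
    if lmin == lmax or lmin <= 0 <= lmax:
        return 0
    # Work on absolute values so that floor division rounds towards zero.
    abs_min, abs_max = sorted([abs(float(lmin)), abs(float(lmax))])
    sign = math.copysign(1, lmin)
    top = math.ceil(math.log10(abs_max))
    # Smallest power of ten at which both ends share all leading digits.
    oom = top
    while abs_min // 10 ** oom == abs_max // 10 ** oom:
        oom -= 1
    oom += 1
    if (abs_max - abs_min) / 10 ** oom <= 1e-2:
        # Ticks straddle a round number that is large relative to the span:
        # find the smallest power of ten at which the ends are at most
        # one unit apart (assumed form of the second search).
        oom = top
        while abs_max // 10 ** oom - abs_min // 10 ** oom <= 1:
            oom -= 1
        oom += 1
    leading = abs_max // 10 ** oom
    # Assumed threshold: only keep an offset with enough leading digits.
    if leading >= 10 ** (min_saved_digits - 1):
        return sign * leading * 10 ** oom
    return 0


print(sketch_offset(1231, 1239))  # 1230.0
print(sketch_offset(1999, 2001))  # 2000.0
print(sketch_offset(123, 189))    # 0
```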

Owner

efiring commented May 2, 2016

@anntzer Yes, adding that explanation would be good.

@anntzer anntzer Add What's new entry.
782b8a1
Contributor

anntzer commented May 2, 2016

Done.

Contributor

dopplershift commented May 11, 2016

Looks like the only failure on AppVeyor was spurious. Anyone with proper permissions wanna restart the appveyor run so this can go green and be merged?

efiring closed this May 11, 2016

efiring reopened this May 11, 2016

@efiring efiring added needs_review and removed needs_review labels May 11, 2016

Owner

efiring commented May 11, 2016

I don't see a way to restart just one appveyor test, so I triggered the whole set of checks.

Contributor

dopplershift commented May 11, 2016 edited

If you have permissions, you can log into AppVeyor and re-run the build. Unfortunately, it doesn't key off of GitHub permissions; those permissions are managed manually by whoever owns the AppVeyor project (looks like @mdboom).

@efiring efiring merged commit 4e1792b into matplotlib:master May 11, 2016

2 of 3 checks passed

coverage/coveralls: Coverage decreased (-0.01%) to 69.634%
continuous-integration/appveyor/pr: AppVeyor build succeeded
continuous-integration/travis-ci/pr: The Travis CI build passed

efiring removed the needs_review label May 11, 2016

@efiring efiring added a commit that referenced this pull request May 11, 2016

@efiring efiring Merge pull request #5785 from anntzer/better-offsettext-choice
Better choice of offset-text.
beb08c8
Owner

efiring commented May 11, 2016

Backported to v2.x as beb08c8.

anntzer deleted the anntzer:better-offsettext-choice branch May 11, 2016
