abbreviations for timezones, #25 #53

ijlyttle · 2017-09-16T14:43:17Z

This is a first pass to address #25.

Things I still need to sort out:

document the function better
make sure this does something sensible on non-Mac platforms
work out some details on dictionaries
tests

Questions

abbreviate_olson() has an argument to specify the maximum width of what is returned. I am confident it will work for any value 14 or greater. It may not work at less than 13. For now, I throw a warning if width < 14; is this the "right" thing to do?
should abbreviate_olson() be exported? - I am doing so just to motivate documentation, but perhaps it does not need to be.
is there a better file to put these functions in?

I'm sure you will have more suggestions. I will signal when I think it will be worth looking at, but will welcome your feedback in the meantime.

addresses r-lib#25

ijlyttle · 2017-09-18T19:24:40Z

Hi @krlmlr and @hadley, I think this is ready for your input.

I have made some tests that make sure that all Olson abbreviations are in fact 14 characters or fewer, and that all Olson abbreviations are unique. These tests pass on Mac, Windows, and Linux.
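
A minimal sketch of the two properties those tests check (width and uniqueness), in base R. The function abbreviate_stub() is a hypothetical stand-in built on base::abbreviate(), not the PR's abbreviate_olson(), and the fixed sample of names is an assumption made to keep the sketch reproducible; the real tests run over OlsonNames().

```r
# Small fixed sample; the PR's tests use the full OlsonNames() list.
tz_names <- c("Africa/Bissau", "America/New_York",
              "America/Argentina/Buenos_Aires", "UTC")

# Hypothetical stand-in for abbreviate_olson(): abbreviate() treats
# minlength as a target, so substr() enforces the hard cap.
abbreviate_stub <- function(tz, width = 14L) {
  substr(abbreviate(tz, minlength = width, named = FALSE), 1L, width)
}

abbr <- abbreviate_stub(tz_names)

# Property 1: names whose abbreviation is wider than the budget
too_wide <- tz_names[nchar(abbr) > 14L]

# Property 2: names whose abbreviation is shared with another time zone
clashes <- tz_names[duplicated(abbr) | duplicated(abbr, fromLast = TRUE)]
```

Both too_wide and clashes should be empty for the checks to pass; reporting the offending names rather than a bare TRUE/FALSE makes a failure easier to diagnose.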

I also had a look at the abbreviations themselves - they seem OK to me, but you may have different opinions :)

The function takes only one timezone at a time, so to see them all at once I use (unsurprisingly)

purrr::map_chr(OlsonNames(), abbreviate_olson)

Earlier in this thread, I have a few questions. Thanks for having a look!

krlmlr · 2017-10-25T12:18:41Z

Thanks. I've uploaded a copy of the list of abbreviations from my system: https://gist.github.com/krlmlr/cc981acfd19931f9d56061848ff2447d

I wonder if we always should be using the abbreviation for the first component, even if the long form fits, so that Africa/Bissau becomes Afr/Bissau just like Afr/Blantyre. This would give a more consistent display.
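
One way to picture the consistent variant is a plain lookup on the first component. This is only a sketch: the dictionary below is hypothetical ("Afr" and "Amer" appear in this thread, "Eur" is an assumption), and abbreviate_first() is an illustrative name, not a function from the PR.

```r
# Hypothetical dictionary for top-level components.
first_level <- c(Africa = "Afr", America = "Amer", Europe = "Eur")

# Always replace a known first component, even if the long form would fit.
abbreviate_first <- function(tz, dict = first_level) {
  parts <- strsplit(tz, "/", fixed = TRUE)
  vapply(parts, function(p) {
    # Single-component names such as "UTC" are left alone
    if (length(p) > 1L && p[[1L]] %in% names(dict)) {
      p[[1L]] <- dict[[p[[1L]]]]
    }
    paste(p, collapse = "/")
  }, character(1))
}

abbreviate_first(c("Africa/Bissau", "Africa/Blantyre", "UTC"))
# [1] "Afr/Bissau"   "Afr/Blantyre" "UTC"
```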

We shouldn't give a warning, but maybe indicate non-uniqueness with a special symbol such as clisymbols::symbol$ellipsis. The pillar code will do this for you automatically, but maybe we should consider adding an ellipsis if and only if the abbreviation is not unique?
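
A rough sketch of the "ellipsis if and only if not unique" rule, in base R. The helper name mark_non_unique() and the sample abbreviations are made up for illustration, and the sketch uses the literal Unicode ellipsis rather than clisymbols::symbol$ellipsis so that it has no dependencies.

```r
# Append an ellipsis only to abbreviations shared by more than one zone.
mark_non_unique <- function(abbr, ellipsis = "\u2026") {
  shared <- duplicated(abbr) | duplicated(abbr, fromLast = TRUE)
  ifelse(shared, paste0(abbr, ellipsis), abbr)
}

# Illustrative input with one deliberate clash
abbr <- c("Afr/Bissau", "Amer/Indiana", "Amer/Indiana", "UTC")
marked <- mark_non_unique(abbr)
```

Note that appending makes the marked entries one character wider, so a caller with a strict width budget would need to reserve room for the symbol.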

krlmlr · 2017-10-25T12:20:17Z

The filename seems ok, exporting seems fine as well. (This is a fairly low-level package, users are mostly other packages and may have legitimate uses for this functionality.)

ijlyttle · 2017-10-28T13:09:53Z

Thanks! I will remove the warning and see what I can do with clisymbols::symbol$ellipsis.

I spent some time arguing with myself on the abbreviation-consistency question you raised, when I put the function together. Maybe what I can do is make that an option in the function call, e.g. be_consistent = TRUE (I'll find a better name for the argument).

Perhaps pillar could propose a default, and it could be overridden with an option?


ijlyttle · 2017-10-30T01:55:25Z

I think this may be ready for another look.

I changed the abbreviation-function so that its default is to make a consistent abbreviation of the first and second elements, e.g. Africa/Bissau becomes Afr/Bissau

I have removed the warning.

I have made it possible to request a width of as small as 8. A few things to keep in mind as the width gets smaller:

at 14 characters, I can verify consistency and uniqueness
at 10 characters, I can verify uniqueness, but not consistency
below 10 characters, I can verify neither uniqueness nor consistency

To test uniqueness at runtime would require evaluating all the timezones whenever the function is called - I don't know if this would make this function too "heavy".

Could this function be called using width = 14, then have pillar use an ellipsis whenever it requires something shorter (as you suggest, if I understand correctly)?

Here are the current results (Mac) for using width = 14:

https://gist.github.com/ijlyttle/58152793e0d7854961fdfc86d5cf60b9

vectorize abbreviate_olson()

krlmlr · 2017-10-31T14:07:29Z

Thanks. I think it's important to have consistent display across all invocations of printing a tibble (for the same width), so I suggest computing a dictionary for the abbreviations of all OlsonNames(), and then just doing a plain lookup for display. It seems unlikely that the Olson names change while an R session is running, so we could use memoise to avoid repeating the expensive computation.
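
The "compute once, then look up" idea can be sketched without memoise by using a closure as the cache. Everything here is illustrative: make_tz_lookup() and the tiny two-entry dictionary are hypothetical, and the counter exists only to show that the dictionary is built a single time.

```r
# Build the dictionary lazily on first use and keep it for the session.
make_tz_lookup <- function(build_dictionary) {
  cache <- NULL
  function(tz) {
    if (is.null(cache)) {
      cache <<- build_dictionary()  # expensive step, runs once
    }
    unname(cache[tz])               # plain lookup; unknown names give NA
  }
}

n_builds <- 0L
lookup_tz <- make_tz_lookup(function() {
  n_builds <<- n_builds + 1L
  c("Africa/Bissau" = "Afr/Bissau", "America/New_York" = "Amer/New_York")
})

lookup_tz("Africa/Bissau")
lookup_tz(c("America/New_York", "Africa/Bissau"))
```

After both calls n_builds is still 1, which is the behaviour memoise would provide for a zero-argument dictionary builder.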

I think the abbreviations will be better if one-, two- and three-component time zones are processed separately. We need to do a bit of gymnastics with map_int(..., length) and perhaps by() to achieve this. With strict = FALSE the abbreviate() function guarantees uniqueness, but we still may end up with non-unique (or too long) strings, which we then shorten with an ellipsis (which may be three characters wide on some systems). The shortening is applied to the dictionary of time zones, and available automatically for all subsequent displays.
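
A sketch of that grouping in base R, using lengths() and split() in place of map_int(..., length) and by(). The function name abbreviate_group(), the sample of names, and the three-character "..." fallback are assumptions for illustration; the real dictionary would be built from OlsonNames().

```r
# Small fixed sample covering one-, two- and three-component names.
tz_names <- c("Africa/Bissau", "Africa/Blantyre", "America/New_York",
              "America/Argentina/Buenos_Aires", "Europe/Zurich", "UTC")

# Abbreviate one group; anything still too long is cut and given an ellipsis.
abbreviate_group <- function(tz, width = 14L, ellipsis = "...") {
  abbr <- abbreviate(tz, minlength = width, strict = FALSE, named = FALSE)
  too_long <- nchar(abbr) > width
  ifelse(
    too_long,
    paste0(substr(abbr, 1L, width - nchar(ellipsis)), ellipsis),
    abbr
  )
}

# Group by number of components, abbreviate each group separately,
# then flatten into a single named lookup vector.
n_parts <- lengths(strsplit(tz_names, "/", fixed = TRUE))
groups <- split(tz_names, n_parts)
dictionary <- unlist(lapply(unname(groups), function(tz) {
  setNames(abbreviate_group(tz), tz)
}))
```

The result is a named character vector keyed by full Olson name, so display code only needs dictionary[[tz]].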

Would that work?

ijlyttle · 2017-11-05T22:27:36Z

I think I get the idea - let me wrestle with the newly-vectorized function so that it will do what you describe. In the short term, I can get us to a non-memoised version. I need to do some reading on memoise, as I have not yet used it.

ijlyttle · 2017-11-06T04:35:46Z

@krlmlr - Another question for you: your vectorization uses purrr functions, but pillar does not import purrr. Does this present a difficulty?

I have started to see what I can do using only base - I think there may be a way through, but I would sure like to have dplyr::summarise() :) (I think I can get aggregate() to work).

ijlyttle · 2018-01-11T03:56:28Z

Hi @krlmlr,

I am embarrassed to say that I have not used memoise yet, so I am on shaky ground. My idea is to:

add memoise to Suggests
add abbreviate_olson <- memoise::memoise(abbreviate_olson) (if available) to utils-olson.R
modify type_sum.POSIXct() in type_sum.R to call abbreviate_olson() then return the element corresponding to attr(x, "tzone") (prepending "dttm-")

Is this a plausible way?

It is not evident to me how type_sum.POSIXct() will know the available width. Can you point me in the right direction here?

krlmlr · 2018-01-11T08:30:49Z

I totally forgot type_sum() doesn't have a width argument yet. Even if we add it now, not all implementers will support it right away, and we'll need to work around this. We could also show the time zone information in a separate row, but I'm not sure we want this (#73).

Maybe we should stick to a hard-coded width of 12 (width of date minus width of <dttm >), or maybe a width of 14 (width of date minus width of <dt >), and leave dynamic width for later?
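
The arithmetic behind those two numbers, as a short sketch. The helper tz_budget() is hypothetical; it assumes the header has the form <prefix-abbreviation> and must fit within the 19 characters of a formatted date-time.

```r
# Width of a formatted date-time column entry
column_width <- nchar("2018-01-11 15:04:43")  # 19

# Characters left for the abbreviation after "<", the prefix,
# one separator, and ">"
tz_budget <- function(prefix, sep = "-") {
  column_width - nchar(paste0("<", prefix, sep, ">"))
}

tz_budget("dttm")  # 12
tz_budget("dt")    # 14
```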

Your suggested approach looks like the safest way to avoid a build-time dependency on memoise, I keep forgetting about this problem. Would it work to just overwrite abbreviate_olson with memoise(abbreviate_olson)? Also, maybe we just suggest memoise and do this transformation only if the package is installed?

ijlyttle · 2018-01-11T13:55:36Z

I think I get the idea here. Let me see what I can do.

Crazy question: how should we approach the case where the tzone is not set? On my computer (MacOS), it uses the system timezone for the display.

The simplest thing to do, for now, would be to use the default <dttm> to indicate that the timezone is not set, which I will do unless you want to do something else.

krlmlr · 2018-01-11T19:02:14Z

Thanks. Agree to use <dttm> if time zone is unset.
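
A sketch of the agreed fallback. The function type_sum_dttm() is a hypothetical helper, not pillar's type_sum.POSIXct(), and the abbreviation table passed in is an assumption; the sketch treats a missing "tzone" attribute and an empty string the same way.

```r
# Return "dttm" when no time zone is set, "dttm-<abbreviation>" otherwise.
type_sum_dttm <- function(x, abbreviations = character()) {
  tz <- attr(x, "tzone")
  # The attribute may be absent (NULL) or present but empty ("")
  if (is.null(tz) || length(tz) == 0L || !nzchar(tz[[1L]])) {
    return("dttm")
  }
  abbr <- unname(abbreviations[tz[[1L]]])
  # Fall back to the full name if the table has no entry for this zone
  if (is.na(abbr)) abbr <- tz[[1L]]
  paste0("dttm-", abbr)
}

lookup <- c("America/New_York" = "Amr/New_York")

type_sum_dttm(.POSIXct(0))                                   # no tzone attribute
type_sum_dttm(.POSIXct(0, tz = ""))                          # empty tzone
type_sum_dttm(.POSIXct(0, tz = "America/New_York"), lookup)  # tzone set
```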

ijlyttle · 2018-01-11T19:48:23Z

@krlmlr I have everything done - I imagine that the last thing that I did, updating type_sum.POSIXct(), broke a bunch of tests.

I can poke around to amend the tests, but any guidance you can provide will be very welcome.

ijlyttle · 2018-01-11T20:14:41Z

Aside from tests, here's what happens on my computer:

library("lubridate")
library("pillar")
now_tz <- with_tz(now(), "America/New_York")
type_sum(now_tz)

[1] "dtm-Amer/New_York"

library("tibble")
as_tibble(now_tz)

# A tibble: 1 x 1
  value              
  <dttm-Amr/New_York>
1 2018-01-11 15:04:43

Now that things are "working", I thought I might raise some options because 12 characters seems a little confining.

Leave as is.
Use "dtm" rather than "dttm" as the prefix - buys us a character.
Keep "dttm" as header when no timezone, use only abbreviated-timezone when there is a timezone, e.g. <Amer/New_York> - buys us five characters.
Something else.

Thoughts?

ijlyttle · 2018-01-11T20:22:15Z

One last question - can you have a look at the memoise implementation:
https://github.com/ijlyttle/pillar/blob/d7e0ca5c3c0cb7f2d19b1f9cba742b8a6d4b53df/R/utils-olson.R#L70

I could not get this to work in .onLoad (maybe I need to use <<-).

It works for me this way, but I have no idea if this is a good practice or a horrible practice.

Thanks!

krlmlr · 2018-01-11T20:24:15Z

Some tests create files with output, and fail the first time they are run if the output changes. They should work the second time, but you may need to rerun all tests a few times until convergence. (testthat aborts early if there are too many failures.) Can you show the output for the various options you're suggesting?

The current memoise implementation installs the function during build time, it's better to do in .onLoad(). Have you tried to assign() to the package environment?
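
A self-contained sketch of the load-time approach. Since this runs outside a package, pkg_env stands in for the package namespace and on_load() for .onLoad(); both names, and the placeholder body of abbreviate_olson(), are assumptions. Inside a real package the same assign() would be made from .onLoad(libname, pkgname), on the assumption that the namespace is not yet sealed at that point.

```r
# Stand-in for the package namespace.
pkg_env <- new.env()
pkg_env$abbreviate_olson <- function(tz) toupper(tz)  # placeholder body

# Stand-in for .onLoad(): wrap the function at load time, not build time.
on_load <- function(env) {
  # memoise is only suggested, so wrap the function only if it is installed
  if (requireNamespace("memoise", quietly = TRUE)) {
    assign(
      "abbreviate_olson",
      memoise::memoise(get("abbreviate_olson", envir = env)),
      envir = env
    )
  }
  invisible(NULL)
}

on_load(pkg_env)
pkg_env$abbreviate_olson("utc")
```

The call returns the same value whether or not memoise is available; only the caching differs.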

ijlyttle · 2018-01-11T20:34:52Z

I will fiddle with tests, as well as get things working with .onLoad().

Here are the assumed outputs resulting from the options.

library("lubridate")
library("pillar")
library("tibble")

now_tz <- with_tz(now(), "America/New_York")
as_tibble(now_tz)

As is:

# A tibble: 1 x 1
  value              
  <dttm-Amr/New_York>
1 2018-01-11 15:04:43

"dttm" -> "dtm":

# A tibble: 1 x 1
  value              
  <dtm-Amer/New_York>
1 2018-01-11 15:04:43

Discarding "dttm-" if timezone present:

# A tibble: 1 x 1
  value              
  <Amer/New_York>
1 2018-01-11 15:04:43

Same option, but run with timezone absent:

as_tibble(now())

# A tibble: 1 x 1
  value              
  <dttm>
1 2018-01-11 14:04:43

ijlyttle · 2018-01-12T01:51:35Z

Hi @krlmlr,

Good news: I think I have .onLoad() worked out, and it behaves well for me. code

Bad news: I am having trouble getting the tests to do what you describe - I keep getting the same errors again and again, with no convergence.

Error for this test:

> devtools::test(filter = "time")
Loading pillar
Testing pillar
format_time: 1................

Failed ------------------------------------------------------------------------------------------------
1. Error: output test (@test-format_time.R#4) ---------------------------------------------------------
attempt to select less than one element in get1index
1: expect_pillar_output(as.POSIXct("2017-07-28 18:04:35 +0200"), filename = "time.txt") at /Users/ijlyttle/Documents/git/github/public_forked/pillar/tests/testthat/test-format_time.R:4
2: expect_pillar_output_utf8(object_quo, filename, output_width) at /Users/ijlyttle/Documents/git/github/public_forked/pillar/tests/testthat/helper-output.R:21
3: expect_known_display(object = !(!object_quo), file = file.path("out", filename), crayon = TRUE, width = output_width) at /Users/ijlyttle/Documents/git/github/public_forked/pillar/tests/testthat/helper-output.R:27
4: testthat::expect_output_file(print(eval_tidy(object)), file, update = TRUE) at /Users/ijlyttle/Documents/git/github/public_forked/pillar/R/testthat.R:53
5: capture_output_as_vector(object)
6: with_sink(temp, withVisible(code))
7: withVisible(code)
8: print(eval_tidy(object))
9: eval_tidy(object)
10: overscope_eval_next(overscope, expr)
11: get_pillar_output_object(x, ..., xp = xp, xf = xf)
12: pillar(xp, ...) at /Users/ijlyttle/Documents/git/github/public_forked/pillar/tests/testthat/helper-output.R:52
13: pillar_type(x, ...) at /Users/ijlyttle/Documents/git/github/public_forked/pillar/R/pillar.R:38
14: as_character(type_sum(x) %||% "") at /Users/ijlyttle/Documents/git/github/public_forked/pillar/R/type.R:10
15: coerce_type_vec(x, friendly_type("character"), string = , character = set_chr_encoding(set_attrs(x, NULL), 
       encoding))
16: type_of(.x)
17: type_sum(x) %||% ""
18: is_null(x)
19: type_sum(x)
20: type_sum.POSIXct(x) at /Users/ijlyttle/Documents/git/github/public_forked/pillar/R/type-sum.R:11

DONE ==================================================================================================

ijlyttle · 2018-01-14T15:03:08Z

Hi @krlmlr,

I found that the tests were not passing because I had an actual problem in my code; I have taken care of it.

The tests run OK now, but now I am having a problem running devtools::document().

Edit - having updated to latest github version of hadley/devtools, I get:

> devtools::document()
Updating pillar documentation
Loading pillar
Writing expect_known_display.Rd
Error in usethis::use_build_ignore(file_name, base_path = base_path) : 
  unused argument (base_path = base_path)

I see this is addressed in r-lib/pkgapi#21.

Once that is sorted out, it remains to see if you want to change the styling of the header (above).

krlmlr · 2018-05-23T14:47:13Z

Thanks for your patience. I suspect this should live elsewhere (perhaps in lubridate?), tidyverse/tibble#411 (or a variant thereof) will provide the necessary infrastructure.

ijlyttle · 2018-05-23T15:00:41Z

I agree, I will look elsewhere :)

krlmlr · 2018-05-23T15:04:02Z

Thanks again! Let me know when you find a home for this useful functionality.

ijlyttle added 9 commits September 16, 2017 16:14

first pass at functions

10a34b3

remove examples for unexported function

8827cea

actually removing examples

5e76370

activates user-supplied dictionary

580fa6f

addresses r-lib#25

adds tests

708ea59

spell "get" correctly

a59f45b

first pass at documentation

96001f4

adds documentation, dictionary abbreviation for "SystemV" (linux)

71d82e1

updates examples

72fa45e

ijlyttle added 9 commits October 29, 2017 08:12

add rproj

1d324ac

Merge branch 'master' of github.com:r-lib/pillar

e493908

adds arg consistent to abbreviate_olson() so that first and secon…

c74a75a

…d level names are abbreviated consistently. adds tests

teaks test

1aec929

roxygenize with new argument

e1aeccf

adds internal helper function to budget tz width

33851c2

adds budget for each component, tests

c4b2e85

fixes bug in tests

9659720

removes test for 10-character abbreviations

9ad6577

krlmlr and others added 2 commits October 31, 2017 11:46

vectorize abbreviate_olson()

68acce3

Merge pull request #1 from krlmlr/ijlyttle-master

90e23da

vectorize abbreviate_olson()

ijlyttle added 2 commits November 5, 2017 18:19

removes test for non-vector input

6651d38

new helper function to create data frame for tz components

8aa8494

ijlyttle and others added 2 commits January 11, 2018 12:59

memoises abbreviate_olson

f9f8514

Merge branch 'master' into master

675b731

ijlyttle and others added 3 commits January 11, 2018 13:02

Merge branch 'master' into master

6059961

adds timezone-abbreviations into POSIXct headers

f9a272c

Merge branch 'master' of github.com:ijlyttle/pillar

d7e0ca5

move memoisation to .onLoad

f30a6f7

make the tz check more-robust

4ad932c

ijlyttle and others added 5 commits January 14, 2018 11:00

Merge branch 'master' into master

331052f

Merge branch 'master' into master

7560c44

Merge branch 'master' into master

7a896e6

Merge branch 'master' into master

2e9e7cf

sorts out incorrect conflict-resolution

f4faa52

krlmlr added this to To Do in krlmlr Jan 15, 2018

ijlyttle closed this May 23, 2018

krlmlr moved this from To Do to Done in krlmlr May 24, 2018

github-actions bot locked as resolved and limited conversation to collaborators Dec 8, 2020
