Can we make it easier to test UTF-8 handling? #1285

Closed
lionel- opened this issue Dec 15, 2020 · 4 comments · Fixed by #1525
Labels
bug an unexpected problem or unintended behavior tests 📘
lionel- commented Dec 15, 2020

Porting the UTF-8 tests of rlang to testthat3 was a bit tricky. The waldo_compare() function called by all testthat expectations calls reporter$local_user_output() which automatically skips the test when the locale is set to non-utf-8.

I found this workaround:

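# Turn off the reporter's unicode flag for the rest of the calling test, so
# that local_user_output() no longer skips in a non-UTF-8 locale.
# Assumption: caller_env() is rlang's and defer() is withr's.
local_utf8_test <- function(frame = rlang::caller_env()) {
  reporter <- testthat::get_reporter()

  # Restore the previous setting when `frame` exits
  old <- reporter$unicode
  withr::defer(reporter$unicode <- old, envir = frame)

  reporter$unicode <- FALSE
}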

This doesn't seem like a proper long-term solution though.

gaborcsardi commented

Yeah, that should not happen. local_reproducible_output() should probably not skip anything when not called for its primary purpose.

@gaborcsardi gaborcsardi added the bug an unexpected problem or unintended behavior label Dec 15, 2020
krlmlr commented Jul 24, 2021

This affects tests in {utf8}.

Reprex:

library(testthat)

test_that("WAT", {
  withr::local_locale(LC_CTYPE = "C")
  expect_true(TRUE)
})
#> ── Skip (<text>:5:3): WAT ──────────────────────────────────────────────────────
#> Reason: non utf8 locale

Created on 2021-07-24 by the reprex package (v2.0.0.9000)

I have confirmed that this comes from local_reproducible_output(unicode = TRUE), but I'm not sure which component calls it.

krlmlr added a commit to patperry/r-utf8 that referenced this issue Jul 24, 2021
clrpackages pushed a commit to clearlinux-pkgs/R-utf8 that referenced this issue Jul 29, 2021
@hadley hadley added this to the v3.1.2 milestone Dec 21, 2021
hadley commented Jan 4, 2022

I think the behaviour of local_reproducible_output(unicode = TRUE) is ok; if you're deliberately requesting unicode output and you're in a non-UTF-8 locale, there's nothing you can do except for skipping. The problem arises because it's called automatically by the reporters, which have to switch back and forth between "reproducible output" (inside tests) and "user output" (outside testthat).
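
The distinction drawn here (skip when unicode output is requested on purpose, but not when a reporter switches settings automatically) can be sketched in base R. This is only an illustration of the idea: `is_utf8_locale()` and `output_action()` are hypothetical names invented for this sketch, not testthat functions, and the actual fix in #1525 may work differently.

```r
# Hypothetical stand-ins for the behaviour discussed above; not testthat code.

# TRUE when the current session's character encoding is UTF-8.
is_utf8_locale <- function() {
  isTRUE(l10n_info()[["UTF-8"]])
}

# Decide what should happen when output settings are applied.
# `explicit = TRUE` models a test author deliberately asking for unicode output;
# `explicit = FALSE` models a reporter switching settings behind the scenes.
output_action <- function(unicode, explicit, utf8 = is_utf8_locale()) {
  if (!unicode || utf8) {
    # Nothing to worry about: either unicode is off or the locale supports it.
    return("run")
  }
  # Unicode was requested but the locale cannot display it.
  if (explicit) "skip" else "run without unicode"
}

output_action(unicode = TRUE, explicit = TRUE, utf8 = FALSE)
#> [1] "skip"
output_action(unicode = TRUE, explicit = FALSE, utf8 = FALSE)
#> [1] "run without unicode"
```

Passing `utf8` explicitly keeps the example independent of the machine's locale; leaving it out makes the function consult `l10n_info()` for the running session.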
