Problems with output of tables containing multibyte characters #821

Closed
ujtwr opened this issue Feb 15, 2024 · 2 comments · Fixed by #823

Comments

ujtwr commented Feb 15, 2024

Describe the bug

When a table containing multibyte characters is output with the kable() and kable_styling() functions, stray pipe characters appear in the cells and the table contents are corrupted.

To Reproduce

  1. No multibyte characters are included (No problem)
library(tidyverse)
library(knitr)
library(kableExtra)

tibble(
  col1 = c("aaa", "bbb", "ccc"),
  col2 = c(123, 456, 789)
) %>% 
  kable() %>%
  kable_styling()

image

  2. Multibyte characters are included and only the kable function is used (No problem)
tibble(
  col1 = c("あいう", "えお", "かきく"),
  col2 = c(123, 456, 789)
) %>% 
  kable()

image

  3. Multibyte characters are included and both the kable function and the kable_styling function are used (problem arises)
tibble(
  col1 = c("あいう", "えお", "かきく"),
  col2 = c(123, 456, 789)
) %>% 
  kable() %>% 
  kable_styling()

image

A pipe symbol appears at the end of the string in each column, and the first number in the second column is missing.

  4. Multibyte characters are included and both the kable function and the kable_styling function are used, but the format option is passed to the kable function (No problem)
tibble(
  col1 = c("あいう", "えお", "かきく"),
  col2 = c(123, 456, 789)
) %>% 
  kable(
    format = "html"
  ) %>% 
  kable_styling()

image

  5. Format option to the kable function is "latex" (No problem)
tibble(
  col1 = c("あいう", "えお", "かきく"),
  col2 = c(123, 456, 789)
) %>% 
  kable(
    format = "latex"
  ) %>% 
  kable_styling()
\begin{table}
\centering
\begin{tabular}{l|r}
\hline
col1 & col2\\
\hline
あいう & 123\\
\hline
えお & 456\\
\hline
かきく & 789\\
\hline
\end{tabular}
\end{table}

  6. Format option to the kable function is "pipe" (problem arises)
tibble(
  col1 = c("あいう", "えお", "かきく"),
  col2 = c(123, 456, 789)
) %>% 
  kable(
    format = "pipe"
  ) %>% 
  kable_styling()

image

The same problem occurs. Does this happen whenever "pipe" is specified as the format option in the kable function?

  7. Use the kbl() function instead of kable() with kable_styling() (No problem)
tibble(
  col1 = c("あいう", "えお", "かきく"),
  col2 = c(123, 456, 789)
) %>% 
  kbl() %>% 
  kable_styling()

image

dmurdoch (Collaborator) commented

kable() isn't defined in kableExtra, it's just imported from knitr and exported without changes. So really only the cases where kbl() was used belong here, but in fact kbl(format = "pipe") shows the same problem.

What happens is that kable_styling() needs to convert the pipe table to either HTML or LaTeX. It chooses HTML by default, but something is causing kableExtra:::md_table_parser() to leave the pipe symbols in place.

After some debugging, it looks like the issue is a bug in md_table_parser(). It assumes that the pipe symbols separating columns are always at the same character position in the input table, as they normally are with ASCII characters. But the hiragana characters are double width, so kable() puts the pipe characters at varying positions to make the pipe table look better. When kable_styling() processes that input, it gets the column break locations wrong.
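To make the failure mode concrete, here is a minimal sketch in base R. It is not the actual md_table_parser() code: the table lines are written by hand to mimic a pipe table padded by display width, and split_by_position() and split_by_delimiter() are hypothetical helpers invented for this illustration. The first cuts every row at the character positions of the pipes in the header, the second splits each row on the pipe character itself.

```r
# Hand-written stand-in for a pipe table whose cells are padded by display
# width, so the inner "|" is not at the same character index on every line.
# \u escapes keep the example independent of the source file encoding.
header <- "|col1   | col2|"
rows <- c(
  "|\u3042\u3044\u3046 |  123|",  # three hiragana (a, i, u)
  "|\u3048\u304a   |  456|",      # two hiragana (e, o)
  "|aaa    |  789|"               # ASCII row, for comparison
)

# Each hiragana is one character but (typically) two columns wide,
# so the rows differ in character count even though they line up on screen.
nchar(rows, type = "chars")
nchar(rows, type = "width")

# Hypothetical position-based splitter: assumes every row has its pipes
# at the same character positions as the header line.
split_by_position <- function(line, header) {
  pos <- gregexpr("|", header, fixed = TRUE)[[1]]
  starts <- pos[-length(pos)] + 1
  ends <- pos[-1] - 1
  trimws(substring(line, starts, ends))
}

# Hypothetical delimiter-based splitter: splits on the pipes themselves,
# so it does not care where they fall in each row.
split_by_delimiter <- function(line) {
  cells <- strsplit(line, "|", fixed = TRUE)[[1]]
  trimws(cells[-1])  # drop the empty piece before the leading pipe
}

by_position  <- lapply(rows, split_by_position, header = header)
by_delimiter <- lapply(rows, split_by_delimiter)

by_position[[3]]   # ASCII row: both approaches agree
by_position[[1]]   # hiragana row: a pipe leaks into cell 1, cell 2 loses a digit
by_delimiter[[1]]  # hiragana row split cleanly
```

With these hand-made rows, the position-based split leaves a pipe in the first cell and drops the leading digit of the second, which is similar to the symptoms in the report. Whether the real parser misbehaves in exactly this way is an assumption on my part.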

dmurdoch (Collaborator) commented

I think the PR I just submitted will fix this. You can use it by installing the devel version and making sure you get version 1.4.0.3 or higher.
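A quick way to check the version requirement is sketched below. The helper has_multibyte_fix() is a made-up name for this example, and the install line assumes the development version is hosted at haozhu233/kableExtra on GitHub and that the remotes package is available, so it is left commented out.

```r
# Assumption: the development version lives at haozhu233/kableExtra on GitHub.
# Uncomment to install it (needs the 'remotes' package and network access):
# remotes::install_github("haozhu233/kableExtra")

# Hypothetical helper: TRUE when a version string is at or above 1.4.0.3,
# the threshold mentioned above.
has_multibyte_fix <- function(version) {
  package_version(version) >= package_version("1.4.0.3")
}

# With kableExtra installed, check the running version like this:
# has_multibyte_fix(as.character(packageVersion("kableExtra")))

has_multibyte_fix("1.4.0")    # FALSE
has_multibyte_fix("1.4.0.3")  # TRUE
```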

If you can't do that, you could avoid the problem by avoiding pipe format tables. By default kbl() doesn't produce them, so stick to kbl() and you should be fine.
