[Feature request] Add an option to html_table() not to apply utils::type.convert() #311

Closed
yutannihilation opened this issue Feb 5, 2021 · 2 comments · Fixed by #312

Comments

@yutannihilation
Member

There seems to be no option to prevent html_table() from converting integer-like text to integers. For example, consider the case below. I want to parse 001 and 002 as "001" and "002", but they are actually interpreted as the integers 1 and 2. I think html_table() should provide an option to control this behavior.

It would probably be straightforward to implement by wrapping the type.convert() line in an if. If this sounds good, I'm happy to contribute a pull request.

library(rvest)
library(tibble)

d <- tibble(code = c("001", "002", "101", "102"),
            label = c("apple", "banana", "lemon", "orange"))

table_html <- knitr::kable(d, format = "html")
table_html <- as.character(table_html)

cat(table_html)
#> <table>
#>  <thead>
#>   <tr>
#>    <th style="text-align:left;"> code </th>
#>    <th style="text-align:left;"> label </th>
#>   </tr>
#>  </thead>
#> <tbody>
#>   <tr>
#>    <td style="text-align:left;"> 001 </td>
#>    <td style="text-align:left;"> apple </td>
#>   </tr>
#>   <tr>
#>    <td style="text-align:left;"> 002 </td>
#>    <td style="text-align:left;"> banana </td>
#>   </tr>
#>   <tr>
#>    <td style="text-align:left;"> 101 </td>
#>    <td style="text-align:left;"> lemon </td>
#>   </tr>
#>   <tr>
#>    <td style="text-align:left;"> 102 </td>
#>    <td style="text-align:left;"> orange </td>
#>   </tr>
#> </tbody>
#> </table>

table_html %>% 
  read_html() %>% 
  html_table()
#> [[1]]
#> # A tibble: 4 x 2
#>    code label 
#>   <int> <chr> 
#> 1     1 apple 
#> 2     2 banana
#> 3   101 lemon 
#> 4   102 orange

Created on 2021-02-05 by the reprex package (v1.0.0)

rvest/R/table.R

Line 135 in 62cef0d

utils::type.convert(out[, i], as.is = TRUE, dec = dec, na.strings = na.strings)
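
As a rough illustration of the "wrap it in an if" idea, here is a minimal sketch. It is not rvest's actual code: the helper `convert_columns()` and its `convert` argument are hypothetical names, and the input is assumed to be a character matrix of cell text.

```r
# Hypothetical helper, not rvest's implementation: `convert_columns()` and
# its `convert` argument are names invented for this sketch.
convert_columns <- function(out, convert = TRUE, dec = ".", na.strings = "NA") {
  # `out` is assumed to be a character matrix, one column per table column.
  cols <- lapply(seq_len(ncol(out)), function(i) {
    col <- out[, i]
    if (convert) {
      # Same call as the line quoted above; skipped when convert = FALSE.
      col <- utils::type.convert(col, as.is = TRUE, dec = dec, na.strings = na.strings)
    }
    col
  })
  names(cols) <- colnames(out)
  as.data.frame(cols, stringsAsFactors = FALSE)
}

# Cell text matching the example table in this issue.
cells <- matrix(
  c("001", "002", "101", "102", "apple", "banana", "lemon", "orange"),
  ncol = 2,
  dimnames = list(NULL, c("code", "label"))
)

str(convert_columns(cells, convert = FALSE))  # code stays character
str(convert_columns(cells, convert = TRUE))   # code becomes integer
```

With `convert = FALSE` the leading zeros in `code` survive because the column is never passed to `type.convert()`. Defaulting to `TRUE` would keep the current behavior for existing callers.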

@hadley
Member

hadley commented Feb 5, 2021

I actually planned to do that, but forgot about it. A PR would be great 😄

@yutannihilation
Member Author

Oh, I see. Sure, I'll do a PR.
