feat: add knit_print method for DataFrame (experimental) #125

Merged
merged 36 commits into from
Apr 27, 2023
Changes from 5 commits
Commits
838e839
feat: add knit_print method for DataFrame
eitsupi Apr 18, 2023
6842c79
test: add tests for knitr
eitsupi Apr 18, 2023
ef8413f
fix: support gfm output
eitsupi Apr 18, 2023
6e49072
docs: update readme
eitsupi Apr 18, 2023
c9b7752
format: `<-` -> `=`
eitsupi Apr 18, 2023
91a8b9a
format: `<-` -> `=`
eitsupi Apr 18, 2023
afd0550
format: `<-` -> `=`
eitsupi Apr 18, 2023
58ea15e
format: `<-` -> `=`
eitsupi Apr 18, 2023
5d71f65
add knit_print doc + update snapshot
sorhawell Apr 18, 2023
f3e5e6c
test: use a small data frame for snapshot tests
eitsupi Apr 18, 2023
8fcfcec
fix: remove print prefix
eitsupi Apr 19, 2023
7f8d4ce
feat: df_print for polars DataFrame
eitsupi Apr 19, 2023
675df16
Merge branch 'main' into knitr-print
eitsupi Apr 20, 2023
ae6d5e6
feat: add a new function to_html_table
eitsupi Apr 20, 2023
6fd0902
Merge branch 'main' into knitr-print
eitsupi Apr 21, 2023
49c0f62
Merge branch 'main' into knitr-print
eitsupi Apr 22, 2023
0d79e01
fix: support polars DataFrame |> to_html_table
eitsupi Apr 22, 2023
8709555
fix: update imports
eitsupi Apr 22, 2023
7109ec9
test: update snapshot
eitsupi Apr 22, 2023
578f1ee
fix: update docs about html format and fix detect html table format
eitsupi Apr 22, 2023
b69e035
docs: update readme
eitsupi Apr 22, 2023
c68caaa
docs: update news
eitsupi Apr 22, 2023
4921ee8
fix!: remove the format param from as.character.Series
eitsupi Apr 22, 2023
d2decc8
fix: to_html_table also works for POSIXlt class
eitsupi Apr 22, 2023
f93a881
docs: fix as.character document
eitsupi Apr 22, 2023
40719c9
docs: fix missing param
eitsupi Apr 22, 2023
1abc182
Merge branch 'main' into knitr-print
eitsupi Apr 24, 2023
da5d723
fix: should mark as raw html
eitsupi Apr 24, 2023
e8228ad
chore: don't export to_html_table
eitsupi Apr 26, 2023
8d8a82c
Merge branch 'main' into knitr-print
eitsupi Apr 26, 2023
6841bb6
chore: update Rd file
eitsupi Apr 26, 2023
4921884
Merge branch 'main' into knitr-print
eitsupi Apr 27, 2023
396ef66
fix: ensure TRUE or FALSE
eitsupi Apr 27, 2023
f3e67a9
fix: check knitr is installed
eitsupi Apr 27, 2023
811205e
fix: also check pillar and fix error message
eitsupi Apr 27, 2023
0361c29
format: replace `<-` -> `=`
eitsupi Apr 27, 2023
4 changes: 3 additions & 1 deletion DESCRIPTION
@@ -26,7 +26,8 @@ Suggests:
bit64,
knitr,
tibble,
rmarkdown
rmarkdown,
withr
Config/testthat/edition: 3
Collate:
'utils.R'
@@ -59,6 +60,7 @@ Collate:
'namespace.R'
'options.R'
'parquet.R'
'pkg-knitr.R'
'pkg-nanoarrow.R'
'rlang.R'
'rust_result.R'
1 change: 1 addition & 0 deletions NAMESPACE
@@ -130,6 +130,7 @@ export(GroupBy_agg)
export(GroupBy_as_data_frame)
export(LazyFrame_print)
export(csv_reader)
export(knit_print.DataFrame)
export(ncol.DataFrame)
export(ncol.LazyFrame)
export(nrow.DataFrame)
23 changes: 23 additions & 0 deletions R/pkg-knitr.R
@@ -0,0 +1,23 @@
#' @export
knit_print.DataFrame = function(x, ...) {
  .env_formatting = Sys.getenv("POLARS_FMT_TABLE_FORMATTING")
  .env_inline_data_type = tolower(Sys.getenv("POLARS_FMT_TABLE_INLINE_COLUMN_DATA_TYPE"))

  on.exit({
    Sys.setenv(POLARS_FMT_TABLE_FORMATTING = .env_formatting)
    Sys.setenv(POLARS_FMT_TABLE_INLINE_COLUMN_DATA_TYPE = .env_inline_data_type)
  })

  if (.env_formatting %in% c("", "ASCII_MARKDOWN") && .env_inline_data_type %in% c("", "true", "1")) {
    Sys.setenv(POLARS_FMT_TABLE_FORMATTING = "ASCII_MARKDOWN")
    Sys.setenv(POLARS_FMT_TABLE_INLINE_COLUMN_DATA_TYPE = "1")
  }

  out <- capture.output(print(x))

  # Needs to insert a blank line
  out <- c(out[1], "", out[-1]) |>
    paste(collapse = "\n")

  knitr::asis_output(out)
}
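
The `on.exit()` block above restores both variables with `Sys.setenv()`. On most platforms, a variable that was unset before the call is therefore left set to an empty string, and the inline-data-type variable is restored in its lowercased form. A minimal base-R sketch of a stricter save/restore pattern follows. The helper name `with_temp_envvars` is hypothetical and not part of this PR; it only illustrates the idea.

```r
# Hypothetical helper (not part of the PR): evaluate `code` with the
# environment variables in `new` temporarily set, then restore the
# previous state, unsetting variables that were not set before.
with_temp_envvars = function(new, code) {
  # unset = NA lets us tell "unset" apart from "set to an empty string"
  old = Sys.getenv(names(new), unset = NA, names = TRUE)

  on.exit({
    was_set = !is.na(old)
    if (any(was_set)) do.call(Sys.setenv, as.list(old[was_set]))
    if (any(!was_set)) Sys.unsetenv(names(old)[!was_set])
  })

  do.call(Sys.setenv, as.list(new))
  force(code) # `code` is a lazy promise, so it is evaluated only here
}

# Usage: the variable is visible inside the call and gone afterwards.
Sys.unsetenv("POLARS_FMT_TABLE_FORMATTING")
inside = with_temp_envvars(
  c(POLARS_FMT_TABLE_FORMATTING = "ASCII_MARKDOWN"),
  Sys.getenv("POLARS_FMT_TABLE_FORMATTING")
)
after = Sys.getenv("POLARS_FMT_TABLE_FORMATTING", unset = NA)
```

`withr::with_envvar()`, which the tests in this PR already use, covers the same ground; the sketch is only meant to show what the restore step has to handle.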
1 change: 1 addition & 0 deletions R/zzz.R
@@ -137,6 +137,7 @@ pl$mem_address = mem_address
s3_register("nanoarrow::infer_nanoarrow_schema", "DataFrame")
s3_register("arrow::as_record_batch_reader", "DataFrame")
s3_register("arrow::as_arrow_table", "DataFrame")
s3_register("knitr::knit_print", "DataFrame")

pl$numeric_dtypes = pl$dtypes[substr(names(pl$dtypes),1,3) %in% c("Int","Flo")]
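
Because knitr is listed under `Suggests`, the `knit_print` method is attached at load time rather than declared with a static `S3method()` directive. Assuming the package's `s3_register()` helper behaves like `vctrs::s3_register()` (an assumption, not confirmed by this diff), the core of it is a call to base R's `registerS3method()`. The sketch below uses a toy generic, `toy_knit_print`, as a stand-in for `knitr::knit_print` so that it runs without knitr installed.

```r
# Toy generic standing in for knitr::knit_print (illustration only).
toy_knit_print = function(x, ...) UseMethod("toy_knit_print")
toy_knit_print.default = function(x, ...) "default method"

# The method is stored under a name that S3 dispatch would not find by itself.
print_polars_like = function(x, ...) "DataFrame method"

# Register it in the S3 methods table of the environment that owns the generic.
registerS3method(
  "toy_knit_print", "DataFrame", print_polars_like,
  envir = environment(toy_knit_print)
)

obj = structure(list(), class = "DataFrame")
result = toy_knit_print(obj)
fallback = toy_knit_print(1)
```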

70 changes: 32 additions & 38 deletions README.md
@@ -126,24 +126,22 @@ library(polars)

dat = pl$DataFrame(mtcars)
dat
#> polars DataFrame: shape: (32, 11)
#> ┌──────┬─────┬───────┬───────┬─────┬─────┬─────┬──────┬──────┐
#> │ mpg ┆ cyl ┆ disp ┆ hp ┆ ... ┆ vs ┆ am ┆ gear ┆ carb │
#> │ --- ┆ --- ┆ --- ┆ --- ┆ ┆ --- ┆ --- ┆ --- ┆ --- │
#> │ f64 ┆ f64 ┆ f64 ┆ f64 ┆ ┆ f64 ┆ f64 ┆ f64 ┆ f64 │
#> ╞══════╪═════╪═══════╪═══════╪═════╪═════╪═════╪══════╪══════╡
#> │ 21.0 ┆ 6.0 ┆ 160.0 ┆ 110.0 ┆ ... ┆ 0.0 ┆ 1.0 ┆ 4.0 ┆ 4.0 │
#> │ 21.0 ┆ 6.0 ┆ 160.0 ┆ 110.0 ┆ ... ┆ 0.0 ┆ 1.0 ┆ 4.0 ┆ 4.0 │
#> │ 22.8 ┆ 4.0 ┆ 108.0 ┆ 93.0 ┆ ... ┆ 1.0 ┆ 1.0 ┆ 4.0 ┆ 1.0 │
#> │ 21.4 ┆ 6.0 ┆ 258.0 ┆ 110.0 ┆ ... ┆ 1.0 ┆ 0.0 ┆ 3.0 ┆ 1.0 │
#> │ ... ┆ ... ┆ ... ┆ ... ┆ ... ┆ ... ┆ ... ┆ ... ┆ ... │
#> │ 15.8 ┆ 8.0 ┆ 351.0 ┆ 264.0 ┆ ... ┆ 0.0 ┆ 1.0 ┆ 5.0 ┆ 4.0 │
#> │ 19.7 ┆ 6.0 ┆ 145.0 ┆ 175.0 ┆ ... ┆ 0.0 ┆ 1.0 ┆ 5.0 ┆ 6.0 │
#> │ 15.0 ┆ 8.0 ┆ 301.0 ┆ 335.0 ┆ ... ┆ 0.0 ┆ 1.0 ┆ 5.0 ┆ 8.0 │
#> │ 21.4 ┆ 4.0 ┆ 121.0 ┆ 109.0 ┆ ... ┆ 1.0 ┆ 1.0 ┆ 4.0 ┆ 2.0 │
#> └──────┴─────┴───────┴───────┴─────┴─────┴─────┴──────┴──────┘
```

polars DataFrame: shape: (32, 11)

| mpg (f64) | cyl (f64) | disp (f64) | hp (f64) | … | vs (f64) | am (f64) | gear (f64) | carb (f64) |
|-----------|-----------|------------|----------|-----|----------|----------|------------|------------|
| 21.0 | 6.0 | 160.0 | 110.0 | … | 0.0 | 1.0 | 4.0 | 4.0 |
| 21.0 | 6.0 | 160.0 | 110.0 | … | 0.0 | 1.0 | 4.0 | 4.0 |
| 22.8 | 4.0 | 108.0 | 93.0 | … | 1.0 | 1.0 | 4.0 | 1.0 |
| 21.4 | 6.0 | 258.0 | 110.0 | … | 1.0 | 0.0 | 3.0 | 1.0 |
| … | … | … | … | … | … | … | … | … |
| 15.8 | 8.0 | 351.0 | 264.0 | … | 0.0 | 1.0 | 5.0 | 4.0 |
| 19.7 | 6.0 | 145.0 | 175.0 | … | 0.0 | 1.0 | 5.0 | 6.0 |
| 15.0 | 8.0 | 301.0 | 335.0 | … | 0.0 | 1.0 | 5.0 | 8.0 |
| 21.4 | 4.0 | 121.0 | 109.0 | … | 1.0 | 1.0 | 4.0 | 2.0 |

Once our Polars DataFrame has been created, we can chain together a
series of data manipulations as part of the same query. For example:

@@ -156,19 +154,17 @@ dat$filter(
pl$col("mpg")$mean()$alias("mean_mpg"),
pl$col("hp")$median()$alias("med_hp")
)
#> polars DataFrame: shape: (4, 4)
#> ┌─────┬─────┬───────────┬────────┐
#> │ cyl ┆ am ┆ mean_mpg ┆ med_hp │
#> │ --- ┆ --- ┆ --- ┆ --- │
#> │ f64 ┆ f64 ┆ f64 ┆ f64 │
#> ╞═════╪═════╪═══════════╪════════╡
#> │ 6.0 ┆ 1.0 ┆ 20.566667 ┆ 110.0 │
#> │ 6.0 ┆ 0.0 ┆ 19.125 ┆ 116.5 │
#> │ 8.0 ┆ 0.0 ┆ 15.05 ┆ 180.0 │
#> │ 8.0 ┆ 1.0 ┆ 15.4 ┆ 299.5 │
#> └─────┴─────┴───────────┴────────┘
```

polars DataFrame: shape: (4, 4)

| cyl (f64) | am (f64) | mean_mpg (f64) | med_hp (f64) |
|-----------|----------|----------------|--------------|
| 6.0 | 1.0 | 20.566667 | 110.0 |
| 6.0 | 0.0 | 19.125 | 116.5 |
| 8.0 | 0.0 | 15.05 | 180.0 |
| 8.0 | 1.0 | 15.4 | 299.5 |

The above is an example of Polars’ eager execution engine. But for
maximum performance, it is preferable to use Polars’ lazy execution
mode, which allows the package to apply additional query optimizations.
@@ -184,19 +180,17 @@ ldat$filter(
pl$col("mpg")$mean()$alias("mean_mpg"),
pl$col("hp")$median()$alias("med_hp")
)$collect()
#> polars DataFrame: shape: (4, 4)
#> ┌─────┬─────┬───────────┬────────┐
#> │ cyl ┆ am ┆ mean_mpg ┆ med_hp │
#> │ --- ┆ --- ┆ --- ┆ --- │
#> │ f64 ┆ f64 ┆ f64 ┆ f64 │
#> ╞═════╪═════╪═══════════╪════════╡
#> │ 6.0 ┆ 1.0 ┆ 20.566667 ┆ 110.0 │
#> │ 6.0 ┆ 0.0 ┆ 19.125 ┆ 116.5 │
#> │ 8.0 ┆ 0.0 ┆ 15.05 ┆ 180.0 │
#> │ 8.0 ┆ 1.0 ┆ 15.4 ┆ 299.5 │
#> └─────┴─────┴───────────┴────────┘
```

polars DataFrame: shape: (4, 4)

| cyl (f64) | am (f64) | mean_mpg (f64) | med_hp (f64) |
|-----------|----------|----------------|--------------|
| 6.0 | 1.0 | 20.566667 | 110.0 |
| 6.0 | 0.0 | 19.125 | 116.5 |
| 8.0 | 0.0 | 15.05 | 180.0 |
| 8.0 | 1.0 | 15.4 | 299.5 |

## Contribute

Contributions are very welcome!
80 changes: 80 additions & 0 deletions tests/testthat/_snaps/knitr.md
@@ -0,0 +1,80 @@
# Snapshot test of knitr

Code
.knit_file("dataframe.Rmd")
Output

```r
pl$DataFrame(mtcars)
```

polars DataFrame: shape: (32, 11)

| mpg (f64) | cyl (f64) | disp (f64) | hp (f64) | ... | vs (f64) | am (f64) | gear (f64) | carb (f64) |
|-----------|-----------|------------|----------|-----|----------|----------|------------|------------|
| 21.0 | 6.0 | 160.0 | 110.0 | ... | 0.0 | 1.0 | 4.0 | 4.0 |
| 21.0 | 6.0 | 160.0 | 110.0 | ... | 0.0 | 1.0 | 4.0 | 4.0 |
| 22.8 | 4.0 | 108.0 | 93.0 | ... | 1.0 | 1.0 | 4.0 | 1.0 |
| 21.4 | 6.0 | 258.0 | 110.0 | ... | 1.0 | 0.0 | 3.0 | 1.0 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 15.8 | 8.0 | 351.0 | 264.0 | ... | 0.0 | 1.0 | 5.0 | 4.0 |
| 19.7 | 6.0 | 145.0 | 175.0 | ... | 0.0 | 1.0 | 5.0 | 6.0 |
| 15.0 | 8.0 | 301.0 | 335.0 | ... | 0.0 | 1.0 | 5.0 | 8.0 |
| 21.4 | 4.0 | 121.0 | 109.0 | ... | 1.0 | 1.0 | 4.0 | 2.0 |

---

Code
.knit_file("dataframe.Rmd")
Output

```r
pl$DataFrame(mtcars)
```

polars DataFrame: shape: (32, 11)

┌──────┬─────┬───────┬───────┬─────┬─────┬─────┬──────┬──────┐
│ mpg ┆ cyl ┆ disp ┆ hp ┆ ... ┆ vs ┆ am ┆ gear ┆ carb │
│ --- ┆ --- ┆ --- ┆ --- ┆ ┆ --- ┆ --- ┆ --- ┆ --- │
│ f64 ┆ f64 ┆ f64 ┆ f64 ┆ ┆ f64 ┆ f64 ┆ f64 ┆ f64 │
╞══════╪═════╪═══════╪═══════╪═════╪═════╪═════╪══════╪══════╡
│ 21.0 ┆ 6.0 ┆ 160.0 ┆ 110.0 ┆ ... ┆ 0.0 ┆ 1.0 ┆ 4.0 ┆ 4.0 │
│ 21.0 ┆ 6.0 ┆ 160.0 ┆ 110.0 ┆ ... ┆ 0.0 ┆ 1.0 ┆ 4.0 ┆ 4.0 │
│ 22.8 ┆ 4.0 ┆ 108.0 ┆ 93.0 ┆ ... ┆ 1.0 ┆ 1.0 ┆ 4.0 ┆ 1.0 │
│ 21.4 ┆ 6.0 ┆ 258.0 ┆ 110.0 ┆ ... ┆ 1.0 ┆ 0.0 ┆ 3.0 ┆ 1.0 │
│ ... ┆ ... ┆ ... ┆ ... ┆ ... ┆ ... ┆ ... ┆ ... ┆ ... │
│ 15.8 ┆ 8.0 ┆ 351.0 ┆ 264.0 ┆ ... ┆ 0.0 ┆ 1.0 ┆ 5.0 ┆ 4.0 │
│ 19.7 ┆ 6.0 ┆ 145.0 ┆ 175.0 ┆ ... ┆ 0.0 ┆ 1.0 ┆ 5.0 ┆ 6.0 │
│ 15.0 ┆ 8.0 ┆ 301.0 ┆ 335.0 ┆ ... ┆ 0.0 ┆ 1.0 ┆ 5.0 ┆ 8.0 │
│ 21.4 ┆ 4.0 ┆ 121.0 ┆ 109.0 ┆ ... ┆ 1.0 ┆ 1.0 ┆ 4.0 ┆ 2.0 │
└──────┴─────┴───────┴───────┴─────┴─────┴─────┴──────┴──────┘

---

Code
.knit_file("dataframe.Rmd")
Output

```r
pl$DataFrame(mtcars)
```

polars DataFrame: shape: (32, 11)

┌──────┬─────┬───────┬───────┬─────┬─────┬─────┬──────┬──────┐
│ mpg ┆ cyl ┆ disp ┆ hp ┆ ... ┆ vs ┆ am ┆ gear ┆ carb │
│ --- ┆ --- ┆ --- ┆ --- ┆ ┆ --- ┆ --- ┆ --- ┆ --- │
│ f64 ┆ f64 ┆ f64 ┆ f64 ┆ ┆ f64 ┆ f64 ┆ f64 ┆ f64 │
╞══════╪═════╪═══════╪═══════╪═════╪═════╪═════╪══════╪══════╡
│ 21.0 ┆ 6.0 ┆ 160.0 ┆ 110.0 ┆ ... ┆ 0.0 ┆ 1.0 ┆ 4.0 ┆ 4.0 │
│ 21.0 ┆ 6.0 ┆ 160.0 ┆ 110.0 ┆ ... ┆ 0.0 ┆ 1.0 ┆ 4.0 ┆ 4.0 │
│ 22.8 ┆ 4.0 ┆ 108.0 ┆ 93.0 ┆ ... ┆ 1.0 ┆ 1.0 ┆ 4.0 ┆ 1.0 │
│ 21.4 ┆ 6.0 ┆ 258.0 ┆ 110.0 ┆ ... ┆ 1.0 ┆ 0.0 ┆ 3.0 ┆ 1.0 │
│ ... ┆ ... ┆ ... ┆ ... ┆ ... ┆ ... ┆ ... ┆ ... ┆ ... │
│ 15.8 ┆ 8.0 ┆ 351.0 ┆ 264.0 ┆ ... ┆ 0.0 ┆ 1.0 ┆ 5.0 ┆ 4.0 │
│ 19.7 ┆ 6.0 ┆ 145.0 ┆ 175.0 ┆ ... ┆ 0.0 ┆ 1.0 ┆ 5.0 ┆ 6.0 │
│ 15.0 ┆ 8.0 ┆ 301.0 ┆ 335.0 ┆ ... ┆ 0.0 ┆ 1.0 ┆ 5.0 ┆ 8.0 │
│ 21.4 ┆ 4.0 ┆ 121.0 ┆ 109.0 ┆ ... ┆ 1.0 ┆ 1.0 ┆ 4.0 ┆ 2.0 │
└──────┴─────┴───────┴───────┴─────┴─────┴─────┴──────┴──────┘

3 changes: 3 additions & 0 deletions tests/testthat/files/dataframe.Rmd
@@ -0,0 +1,3 @@
```{r}
pl$DataFrame(mtcars)
```
23 changes: 23 additions & 0 deletions tests/testthat/test-knitr.R
@@ -0,0 +1,23 @@
.knit_file <- function(file_name) {
  file <- file.path("files", file_name)
  output <- tempfile(fileext = ".md")
  on.exit(unlink(output))

  suppressWarnings(knitr::knit(file, output, quiet = TRUE, envir = new.env()))

  readLines(output) |>
    paste0(collapse = "\n") |>
    cat()
}

test_that("Snapshot test of knitr", {
  expect_snapshot(.knit_file("dataframe.Rmd"), cran = TRUE)
  withr::with_envvar(
    new = c("POLARS_FMT_TABLE_INLINE_COLUMN_DATA_TYPE" = "false"),
    expect_snapshot(.knit_file("dataframe.Rmd"), cran = TRUE)
  )
  withr::with_envvar(
    new = c("POLARS_FMT_TABLE_FORMATTING" = "DEFAULT"),
    expect_snapshot(.knit_file("dataframe.Rmd"), cran = TRUE)
  )
})
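
The three snapshots correspond to the three ways the condition in `knit_print.DataFrame` can evaluate. The sketch below isolates that condition as a pure function so the mapping is easier to see. The name `uses_markdown_table` is hypothetical and exists only for this illustration; the logic is copied from the `if` statement in `R/pkg-knitr.R`.

```r
# Hypothetical helper mirroring the `if` condition in knit_print.DataFrame.
# Arguments are the values of the two environment variables ("" when unset).
uses_markdown_table = function(formatting = "", inline_data_type = "") {
  formatting %in% c("", "ASCII_MARKDOWN") &&
    tolower(inline_data_type) %in% c("", "true", "1")
}

case_default = uses_markdown_table()                           # TRUE: Markdown table
case_inline_off = uses_markdown_table(inline_data_type = "false") # FALSE: box-drawing table
case_fmt_default = uses_markdown_table(formatting = "DEFAULT")    # FALSE: box-drawing table
```

Only the first case switches polars to `ASCII_MARKDOWN`, which matches the snapshot file: the first entry is a pipe table and the other two keep the box-drawing output.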