Print matrices as tables by default #484

Closed
nalimilan opened this issue Feb 19, 2013 · 36 comments

Comments

@nalimilan

commented Feb 19, 2013

Currently, matrices printed from R code result in text output in the resulting document. For example, matrix(1:10, 2, 5) gives

     [,1] [,2] [,3] [,4] [,5]
[1,]    1    3    5    7    9
[2,]    2    4    6    8   10

This is not optimal since it only looks right with fixed-width fonts, and cannot be copied to e.g. a text editor or a spreadsheet application. In short: it's in an unstructured, useless format.

One of the current solutions is to use print(xtable(...), type="html") in a results="asis" chunk. This works great, but requires additional code, while I think this behavior should be the default. Tables are not required to look incredible by default, but at least they should be tables, with minimally correct formatting (e.g. row/column names in bold, which is not possible in the R console but makes things much easier to read in a real document). There could be an option to disable this behavior, but I think everybody would gain from getting clean tables out of the box.

My use case (but there are many others) is that I'm using RStudio with students, and I would like to tell them to use knitr to create reproducible HTML reports from where they can copy their tables. It would be much more convenient to avoid the xtable trick.

Implementation-wise, I'm not sure how you'd do this, but the R2HTML package already has some support for this kind of thing after HTMLStart() is called. I guess you merely need to replace print.matrix(), but you know better than I do.

Thanks for this very nice package!

@ramnathv

Contributor

commented Feb 19, 2013

You can achieve this easily by setting options(xtable.type = 'html') in your set up chunk. Then you can print your table as a html table by just invoking xtable(mytable).
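A minimal sketch of the option-based approach described above, assuming the xtable package is installed. The matrix `m` is only an illustration; in a knitr document the same code would sit in a chunk with results="asis", where typing xtable(m) alone would print the table.

```r
library(xtable)

# Make HTML the default output type used by print.xtable()
options(xtable.type = "html")

m <- matrix(1:10, 2, 5)

# Capture the printed markup so it can be inspected outside of knitr
html_lines <- capture.output(print(xtable(m)))
cat(html_lines, sep = "\n")
```

Setting the option once in a setup chunk avoids repeating type="html" in every call.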

EDIT. If your objective is reproducibility, I would avoid copy-paste as much as possible. So, I don't think it is a good idea to ask students to copy-paste tables from knitr reports. It is better to save the table directly to a csv file, which they can then open on their computers. It is easier and less error-prone.

@nalimilan

Author

commented Feb 19, 2013

Interesting. This is definitely a useful option to avoid the print(..., type="html") code overhead. Still, I think it makes sense to print matrices as real tables by default.

About copy/pasting, of course it's far from optimal, but it's completely orthogonal to the present issue. I think that for absolute beginners, I will tell them that they can copy/paste when they are OK with their results if they want, because formatting tables directly in a document is hard for students who do not use LaTeX and have never used a programming language. In the long term, I think it's indeed better to produce full reports directly.

@ramnathv

Contributor

commented Feb 19, 2013

The issue is that a chunk with results='asis' can produce more than just a table. Moreover, some of us might want to display the output asis without calling xtable on it. Finally, I don't think it is a good idea to tie things down to xtable, since someone might want to use a different package like hwriter to display their html tables.

Now, I think it might be possible to write a hook to achieve this effect. The idea is to write a hook, which when invoked on a chunk will automatically run the final output through xtable. I am not sure if the current design of knitr permits such a hook, but this concept is well utilized in other programming languages like python where they are called decorators.

@nalimilan

Author

commented Feb 19, 2013

When results="asis", the automated conversion of matrices to tables could be disabled, as you could expect the user wants to handle things manually. Then, people could use their preferred package to format tables. I don't really suggest using xtable() itself by default, rather a more basic function that would not accept options, and only produce a reasonable output: for complex cases, calling xtable() or anything else manually is already possible.
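To illustrate what such a "basic function without options" might look like, here is a rough sketch in base R only. The name matrix_to_markdown is hypothetical and is not part of knitr or any package; it simply emits a Markdown pipe table with bold row names.

```r
# Hypothetical helper (not part of knitr): format a matrix as a Markdown
# pipe table using base R only, with no options to tune.
matrix_to_markdown <- function(m) {
  stopifnot(is.matrix(m))
  cols <- colnames(m)
  if (is.null(cols)) cols <- paste0("V", seq_len(ncol(m)))
  rows <- rownames(m)
  if (is.null(rows)) rows <- as.character(seq_len(nrow(m)))

  # Header row (empty first cell above the row names) and separator row
  header <- paste0("| | ", paste(cols, collapse = " | "), " |")
  rule   <- paste0("|", paste(rep("---", ncol(m) + 1), collapse = "|"), "|")

  # One line per matrix row, row name in bold
  body <- vapply(seq_len(nrow(m)), function(i) {
    cells <- format(m[i, ], trim = TRUE)
    paste0("| **", rows[i], "** | ", paste(cells, collapse = " | "), " |")
  }, character(1))

  c(header, rule, body)
}

md <- matrix_to_markdown(matrix(1:10, 2, 5))
cat(md, sep = "\n")
```

Anything beyond this (alignment, captions, digits) would be left to xtable or a similar package, as suggested above.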

@ramnathv

Contributor

commented Feb 19, 2013

I see your point. Actually knitr runs a function output_asis in the background when the option results = "asis" is invoked. I see the possibility of making this more general by allowing users to provide their own results function. So, for example results = "xtable" could lead to the final output being run through the xtable function. I am going to take a crack at implementing this in my fork of knitr. Would be good to know what @yihui thinks.

@ramnathv

Contributor

commented Feb 19, 2013

There is a problem with this approach. knitr only returns a character string with the evaluated output, and not the object. So we will have to manipulate the character string containing the matrix and not the matrix itself, which makes it more difficult.

@nalimilan

Author

commented Feb 19, 2013

Yeah, it looks like the problem comes directly from package evaluate, which only returns text if I understand correctly. But Yihui will probably know...

@ramnathv

Contributor

commented Feb 19, 2013

I think the overhead of a single function call to xtable is not worth hacking through knitr to make the changes desired.

@yihui

Owner

commented Feb 19, 2013

This is a reasonable request and is possible/easy to implement. There are a few other issues that the evaluate package needs to take care of (not only the text output), and I'll think about this when I have got time. Thanks!

@ramnathv

Contributor

commented Feb 19, 2013

If the evaluate package can return the raw object, that would be terrific and I see the potential to do a lot more than just use xtable.

@nalimilan

Author

commented Feb 19, 2013

Thanks! If you give me a few hints about the solutions you consider, I can try hacking on this.

@yihui

Owner

commented Feb 19, 2013

@nalimilan Take a look at the output_handler argument in evaluate::evaluate() (specifically, the value handler), and it needs to be integrated into knitr in https://github.com/yihui/knitr/blob/master/R/block.R#L117-L118
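For readers unfamiliar with the value handler, the following sketch shows how it gives access to the raw object rather than its printed text. It assumes the evaluate package is installed; the environment `captured` and the variable names are illustrative only, and this is not how knitr itself wires things up.

```r
library(evaluate)

# Environment used to keep the raw objects seen by the handler
captured <- new.env()

handler <- new_output_handler(value = function(x) {
  captured$last <- x   # the original R object, not its printed text
  print(x)             # still print as usual
})

res <- evaluate("matrix(1:4, 2)", output_handler = handler)
str(captured$last)
```

Because the handler receives the object itself, it can decide how to render it before any text is produced.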

@nalimilan

Author

commented Feb 20, 2013

Thanks for the pointers. I've found two solutions:

  1. Create an output_handler and use is.matrix(x) to special case matrices.
  2. Override print.matrix() in the evaluate() environment.

After trying the first way, I eventually retained the second for now. The advantage IMO is that we could potentially override print.matrix() in base for the evaluate() call, so that e.g. summary.lm() prints real tables automatically.
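As a rough illustration of the first option (not the code in the branch), a value handler can special-case matrices with is.matrix() and fall back to print() otherwise. The table formatting below is deliberately crude, and the handler name is made up for this sketch.

```r
library(evaluate)

# Value handler that prints matrices as bare Markdown-style rows and
# falls back to print() for everything else.
matrix_handler <- new_output_handler(value = function(x) {
  if (is.matrix(x)) {
    rows <- apply(x, 1, paste, collapse = " | ")
    cat(paste0("| ", rows, " |"), sep = "\n")
  } else {
    print(x)
  }
})

res <- evaluate(c("matrix(1:4, 2)", "1 + 1"), output_handler = matrix_handler)

# Collect the text output recorded by evaluate()
text_out <- unlist(lapply(res, function(el) if (is.character(el)) el))
cat(text_out)
```

Unlike overriding print.matrix(), this only affects values printed at the top level of a chunk, so matrices printed from inside functions such as summary.lm() would be left unchanged.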

I've created a branch with a very basic proof-of-concept patch with a hacky temporary function formatting matrices to Markdown tables. Other formats can theoretically be added easily this way, using a special output hook called "table". This can easily be tested using RStudio's knitr to HTML feature. One gotcha is that Markdown does not seem to support including tables inside frames, so results="asis" needs to be used if you want tables to be converted to HTML. This limitation probably does not apply to LaTeX and HTML, though I've not looked into it yet.

https://github.com/nalimilan/knitr/tree/table

Let me know what you think of this solution. Of course this needs to be implemented with cleaner functions, this is just to illustrate the idea. One thing to decide is whether table syntax should be created with ad-hoc code in knitr, or whether an already existing package should be used (e.g. xtable for LaTeX/HTML, ascii for Markdown...).

@ramnathv

Contributor

commented Feb 20, 2013

It looks neat, but I won't recommend overriding base functions, since knitr is supposed to faithfully reproduce output as if a command is typed in the console. So, overriding print.matrix within knitr IMHO is dangerous. Note that you can override print.matrix within your document by adding a code chunk with the following function. It uses xtable to print a matrix as an HTML table. I believe this is more transparent, since a reader can quickly understand why matrices are being printed as tables by looking at the source document.

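print.matrix <- function(x, ...) {
  # Emit the matrix as an HTML table via xtable instead of console-style text.
  # The table has to be printed explicitly: a print method that only returns
  # the xtable object produces no output.
  print(xtable::xtable(x), type = 'html')
  invisible(x)
}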

If you want to indeed hack at print.matrix, I would recommend doing it in a way where the default print.matrix behavior is retained, but the alternate function is triggered using a chunk option. This would be cleaner and will not break existing documents.

@nalimilan

Author

commented Feb 21, 2013

As I said, I think this should indeed be an option, though I would argue enabling it by default would make sense, at least when a matrix is printed directly (i.e. not through e.g. summary.lm).

@daroczig

Contributor

commented Mar 24, 2013

For printing tables and matrices with basic formatting, you might give my pander method a try, which can transform a variety of R objects to markdown. Currently, it supports 4 types of table styles for pandoc and PHP Markdown Extra. Please see the examples at the above link.

And why I am commenting here: I had the very same problem with evaluate::evaluate, namely that it did not return the R objects. That's why I came up with my own versions called eval.msgs and evals in the very same package. The goal of these functions was to grab everything of interest during evaluation, like

  • stdout,
  • printed version of the output,
  • all messages, warnings, errors
  • and the original R object.

Please see the pander package; I hope you find this useful. Quick example:

> eval.msgs('matrix(1:4,2)')
$src
[1] "matrix(1:4,2)"

$result
     [,1] [,2]
[1,]    1    3
[2,]    2    4

$output
[1] "     [,1] [,2]" "[1,]    1    3" "[2,]    2    4"

$type
[1] "matrix"

$msg
$msg$messages
NULL

$msg$warnings
NULL

$msg$errors
NULL


$stdout
NULL

And e.g. evals can have hooks (what I use in pander's brew function to automatically run pander on each chunk elements), which would turn all the tables and matrices (among others) to markdown format. Another lame example:

> evals('matrix(1:4,2)', hooks=list(default=pander.return))
[[1]]
$src
[1] "matrix(1:4, 2)"

$result
[1] ""    "---" "1 3" ""    "2 4" "---" ""   

$output
[1] "     [,1] [,2]" "[1,]    1    3" "[2,]    2    4"

$type
[1] "character"

$msg
$msg$messages
NULL

$msg$warnings
NULL

$msg$errors
NULL


$stdout
NULL

attr(,"class")
[1] "evals"

Where:

> cat(evals('matrix(1:4,2)', hooks=list(default=pander.return))[[1]]$result, sep = '\n')

---
1 3

2 4
---

Or without the bells and whistles:

> pander(matrix(1:4, 2))

---
1 3

2 4
---
@yihui

Owner

commented Mar 24, 2013

Thanks. This is definitely something knitr needs to learn from you. As I said above, it is not hard to provide such an option to allow user-defined print() methods to print the objects. I'm busy with other things at the moment.

I guess it is probably not a good idea to copy evaluate and brew into pander and modify them; it is quick to cure the illness but also adds burden to you to maintain them. I'm not sure if you have talked to Hadley or Jeff (maybe you did; I don't remember). At least evaluate allows you to use your own output handlers now. It is similar to what you called hooks; perhaps you can take a look at the output_handler argument and see if that is useful to you.

@daroczig

Contributor

commented Mar 24, 2013

Thank you @yihui for the update about evaluate, unfortunately I stopped following the news of the project about a year ago when I decided to rewrite the evaluation function from scratch to fit my needs instead of tweaking evaluate further - so it's not a fork actually. And now it has too many extra features that I build on (e.g. caching, "unify" images), that there is no turning back. Although I would have been really happy not to battle the problems of evals in the last year, it might have been smarter to wait a bit for evaluate.

But you are definitely right about brew. I did not contact Jeff about this issue, as the forked version of brew depends on my evals function, so it has no chance to be merged back to upstream. And Jeff would probably go blind after checking what I did to his function :)

So I am pretty sure that these parts of the pander package will remain my eccentric project that is needed for our Rapporter webapp, although I also hope that it could be useful for others too. Ah, and I almost forgot: keep up the great work, it's really awesome to see what you have managed to build with knitr!

@malcook


commented Jan 19, 2014

I'm having pretty good success with this attempt to get knitr to render pander-ized results:

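library(knitr)
library(pander)
library(evaluate) # used internally by knitr

# TRUE if pander has an S3 method for any class of x; class(x) can have
# length > 1, while getS3method() expects a single class name
has_pander_method <- function(x) {
  any(vapply(class(x), function(cl) {
    !is.null(getS3method('pander', cl, optional = TRUE))
  }, logical(1)))
}

pander_output_handler <- new_output_handler(value = function(x) {
  if (isS4(x)) {
    show(x)
  } else if (has_pander_method(x)) {
    pander(x)
  } else {
    print(x)
  }
})
# Hack: replaces evaluate's internal default handler for the whole session
assignInNamespace('default_output_handler', pander_output_handler, 'evaluate')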

I think knitr might simply provide some mechanism for providing the output_handler that it passes to evaluate which I would use instead of using assignInNamespace.

Any 'gotchas' I'm missing in going down this road?

Or is this entirely not cricket?

@yihui

Owner

commented Jan 19, 2014

@malcook Yes, that is basically what I plan to do. I need to make a few changes in the evaluate package as well, for which I have not found time yet.

@Thell

Contributor

commented Feb 15, 2014

@yihui I've been working on getting something working with xtable (posted it to the knitr list for feedback) that implements output hooking and opts.label as well as a modified toLatex.xtable function that produces some nice results. I don't know if you could use any of it for something more automagic with overriding 'print' in the eval environment to check object classes and chunk result types... If it helps though... http://rpubs.com/Thell/xtable

@yihui

Owner

commented Mar 1, 2014

@Thell Yes, that is an excellent use case. I will definitely consider it! Thanks!

@kohske

Contributor

commented Mar 3, 2014

@Thell Fantastic!! I love latex-style table!

@yihui yihui closed this in 550e7b5 Mar 26, 2014

@nalimilan

Author

commented Mar 26, 2014

@yihui IIUC this means we can now override for each chunk the render function, but that one still needs to pass a custom function like print(xtable(...), type="html"). Am I right?

@yihui

Owner

commented Mar 26, 2014

Yes, you can pass a custom render function, or extend the knit_print generic function for the xtable class, e.g.

library(knitr)
# accept ... because knitr may pass extra arguments (e.g. options) to the method
knit_print.xtable = function(x, ...) {
  res = paste(capture.output(print(x)), collapse = '\n')
  asis_output(res)
}

Then you can just write xtable(iris) in R code chunks without explicitly printing it.

There is an issue of the table format (LaTeX or HTML), and you can query opts_knit$get('out.format') to decide which format you want.

This is a significant change in knitr and I will have to write more detailed documentation for it.

@nalimilan

Author

commented Mar 26, 2014

OK, thanks. Though my original request was to print matrices as tables by default. Adapting the default print function would be very nice. :-p

@yihui

Owner

commented Mar 27, 2014

Your original request has been made very easy now (although it can be even easier), and I do not think matrices should be printed as tables by default.

@nalimilan

Author

commented Mar 27, 2014

Ah, OK. Why do you think printing matrices as raw text can be useful? I find it quite confusing, and not consistent with the fact that knitr usually makes things look right by default.

@yihui

Owner

commented Mar 27, 2014

There are two reasons:

  1. R users are familiar with how matrices are printed in the R console (I agree this is not a strong reason at all);
  2. It is hard to decide which implementation I should use for the table generation; xtable, Hmisc, tables, sjPlot, ...? Moreover, I have to consider the output format: HTML, LaTeX, Markdown, ... All these issues make it non-trivial to implement a general-purpose table generator that makes everybody happy.

That said, I can definitely implement this request in a separate package (e.g. writing S3 methods knit_print.matrix(), knit_print.data.frame(), and so on). Then you opt-in by loading that package.
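A sketch of what such an opt-in method could look like, assuming a knitr version that provides knit_print(), asis_output() and kable(). This is only an illustration of the idea, not the code of any existing package; the choice of kable() as the table generator and the fallback to Markdown are assumptions made for the example.

```r
library(knitr)

# Sketch of an opt-in S3 method: render matrices as tables in the
# format of the document currently being knitted.
knit_print.matrix <- function(x, ...) {
  fmt <- opts_knit$get("out.format")   # NULL when called outside of knit()
  if (is.null(fmt) || !fmt %in% c("latex", "html")) fmt <- "markdown"
  tab <- kable(x, format = fmt)
  # Blank lines around the table keep it separate from surrounding text
  asis_output(paste(c("", tab, ""), collapse = "\n"))
}

m <- matrix(1:4, 2, dimnames = list(NULL, c("a", "b")))
out <- knit_print.matrix(m)
cat(out)
```

Defining the method in a setup chunk (or loading a package that defines it) is what makes the behavior opt-in; documents that do not define it keep the console-style output.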

@malcook


commented Mar 28, 2014

Great & Hooray!

In addition to being able to extend knit_print for individual classes, is it still possible to override the output handler entirely for a session, as I did above with:

assignInNamespace('default_output_handler',my_output_handler_fun,'evaluate')

Of course that is a hack!

Perhaps a new option could/should (not?) be introduced, such as:

opts_knit$set('default_output_handler',my_output_handler_fun)

or, maybe preferred terminology now would be (?)

opts_knit$set('default_render_function',my_render_fun)

Or is this not in the spirit of things?

@yihui

Owner

commented Mar 28, 2014

@malcook evaluate:::default_output_handler has a few sub elements, and I guess only the value element matters. This value function can be set via the chunk option render, e.g.

opts_chunk$set(render = my_output_handler_fun$value)

If you have to use the dark voodoo assignInNamespace(), either you or me must have done something wrong :)
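For comparison with the assignInNamespace() hack, here is a hedged sketch of a custom render function set through the chunk option. The function name render_tables is hypothetical, and using kable() for the table is just one possible choice.

```r
library(knitr)

# Hypothetical render function: tables for matrices and data frames,
# knitr's regular knit_print() for everything else.
render_tables <- function(x, ...) {
  if (is.matrix(x) || is.data.frame(x)) {
    knit_print(kable(x), ...)
  } else {
    knit_print(x, ...)
  }
}

# Use it for all subsequent chunks of the document
opts_chunk$set(render = render_tables)
```

This keeps the customization inside the document and does not touch evaluate's internals.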

@nalimilan

Author

commented Mar 28, 2014

Regarding your points:

  1. I'd say you cannot compare the default R console output with knitr-processed documents, since in the R console the fixed-width font makes tables look OK, while in documents by default columns are not aligned due to the use of variable width fonts. Anyway, the R console output is not that great, it's just that it's hard to get a better result in a terminal.
  2. Any simple implementation will do, it could be adapted from any of the packages which support basic tables. More complex options, like the ones xtable supports, could be left to specialized packages. The format should of course adapt to what kind of document knitr is currently processing. Printing a LaTeX table into an HTML document does not make sense. ;-)
@yihui

Owner

commented Mar 28, 2014

  1. Agreed.
  2. I'm likely to do this in a separate package and see how it goes. If people like it, I'll make knitr depend on it, so the magic will happen automatically.
@nalimilan

Author

commented Mar 28, 2014

Makes sense!

@daroczig

Contributor

commented Mar 29, 2014

Just a side-note again promoting my package a bit: if you are writing markdown text that can be turned to HTML/docx/pdf/whatever, you might simply use pander from the pander package as a general hook -- which might be the already available S3 method for printing matrix, data.frame etc. mentioned by Yihui above. Docs: http://rapporter.github.io/pander/#methods

@malcook


commented Aug 30, 2014

Indeed @daroczig, this is a nice turn of affairs!

Exporting this .Rmd (in emacs using polymode) to 'github extended markdown'

```{r  render=pander, results='asis'}
mtcars[1:8,1:5]
```

emits markdown which renders here, in github, nicely, as:

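mtcars[1:8,1:5]

|                   | mpg | cyl | disp | hp  | drat |
|-------------------|-----|-----|------|-----|------|
| Mazda RX4         | 21  | 6   | 160  | 110 | 3.9  |
| Mazda RX4 Wag     | 21  | 6   | 160  | 110 | 3.9  |
| Datsun 710        | 23  | 4   | 108  | 93  | 3.9  |
| Hornet 4 Drive    | 21  | 6   | 258  | 110 | 3.1  |
| Hornet Sportabout | 19  | 8   | 360  | 175 | 3.1  |
| Valiant           | 18  | 6   | 225  | 105 | 2.8  |
| Duster 360        | 14  | 8   | 360  | 245 | 3.2  |
| Merc 240D         | 24  | 4   | 147  | 62  | 3.7  |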