This repository has been archived by the owner on May 19, 2021. It is now read-only.

Google Docs/Drive workflow and package ecosystem #9

Open
noamross opened this issue Feb 25, 2016 · 12 comments
Comments

@noamross
Contributor

Google Drive and Docs are common collaboration tools for many organizations and scientists, especially on mixed teams where some members have non-computational foci. Concurrent editing, commenting, and ease of use are some of the main reasons.

An idea would be to extend the current R/Google Drive package ecosystem to better enable collaboration in mixed teams via Google Drive and Docs. Some useful outputs might be:

  • RMarkdown --> Google Docs template/workflow (as a package)
  • A reverse workflow for team members to edit R Markdown documents or apps, in whole or in part, via Google Docs.
  • Nice examples and documentation for these.

So far, the relevant package ecosystem includes:

  • The great googlesheets package.
  • googleAuthR for handling authentication
  • The incomplete driver package for google drive access.
  • The incomplete rchie package, implementing ArchieML, a markup the NYT uses for reporters to provide structured data and text to graphics developers via Google Docs.
@jennybc
Member

jennybc commented Feb 26, 2016

I am of course very interested in this!

There are functions in googlesheets that wrap pure Drive operations and would be great to move into a general Drive client, such as a resurrected driver package, if it will go to CRAN. I'd love to help move that along. Specifically: query/modify file permissions and capabilities, manipulate the directory structure. Something that's more out there is to think about creating a bridge between the Drive revision history and git(hub)? That would be pretty cool for a mixed team.

Another space to watch re: Google authentication is this by @craigcitro

@craigcitro

I'm also interested in hearing what people are up to here; as disclaimer/background, at Google I work on colab, which is basically hosted ipynb files in Drive (for internal use only, sadly).

@noamross
Contributor Author

I have a draft function for importing Google Doc history into a git repository: https://github.com/Ironholds/driver/blob/commit-gdoc-history/inst/commit_gdoc_history.R
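For readers who want the shape of the idea without opening the link, here is a rough sketch. It is written in Python for illustration and is not taken from the linked R script. The `Revision` record and its field names are hypothetical, and it assumes the revisions have already been downloaded from Drive. It only plans the work: one file write plus one backdated commit per revision.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Revision:
    # Hypothetical record; these field names are not from the Drive API.
    modified: str  # ISO 8601 timestamp, all in the same time zone
    author: str    # display name of the editor
    email: str     # editor's email address
    text: str      # exported document text at this revision


def plan_commits(revisions: List[Revision], path: str) -> List[dict]:
    """Return one step per revision, oldest first.

    Each step holds the file content to write and the git commands
    that would record it with the revision's author and date.
    """
    steps = []
    # ISO 8601 strings in one time zone sort chronologically as text.
    for rev in sorted(revisions, key=lambda r: r.modified):
        steps.append({
            "write": (path, rev.text),
            "commands": [
                ["git", "add", path],
                ["git", "commit",
                 f"--author={rev.author} <{rev.email}>",
                 f"--date={rev.modified}",
                 "-m", f"Google Doc revision from {rev.modified}"],
            ],
        })
    return steps
```

Each step could then be executed by writing the file and passing the command lists to `subprocess.run` inside the target repository.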

@noamross noamross closed this as completed Mar 4, 2016
@noamross noamross reopened this Mar 4, 2016
@karthik
Member

karthik commented Mar 4, 2016

> An idea would be to extend the current R/Google Drive package ecosystem to better enable collaboration in mixed teams via Google Drive and Docs. Some useful outputs might be:
>
> RMarkdown --> Google Docs template/workflow (as a package)

I still have to wrangle the results of the big survey, but this has often been the most requested feature from people who answered and from students who come to speak with me during office hours: "I work in Rmd but my collaborators prefer Word and Google Docs. Short of copying stuff back and forth, how do I collaborate?"

I suppose the roundtrip is challenging, because edits made by a non-R collaborator on the Google doc would be hard to merge back. One idea I've had is:
a) Keep track of all the code chunks with identifiers.
b) Keep track of all the blocks of text in between

Never diff on the code ever (leave that in the domain of R), and just diff text between Rmarkdown → Google doc for the roundtrip. Perhaps even hide the code blocks if possible.

Larger output rendered from the code chunks, such as figures and tables, just remains in the doc. If anyone attempts to change it (edit a table, which they shouldn't, or replace a figure, which makes no sense), the change gets overwritten in the next parse of the Rmd.
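A minimal sketch of steps (a) and (b), in Python for illustration rather than as part of any existing R package. It assumes chunks use the usual fenced `{r ...}` header, and the function names are made up for this example. It splits a document into text and code segments, then diffs only the text.

```python
import difflib
import re

FENCE = "`" * 3  # three backticks, built this way to keep the example readable

# A knitr chunk: an opening fence with an {r ...} header, up to the closing fence.
CHUNK_RE = re.compile(
    r"^" + FENCE + r"\{r\b[^}]*\}[ \t]*\n.*?^" + FENCE + r"[ \t]*$",
    re.M | re.S,
)


def split_rmd(source):
    """Split R Markdown into ('text', str) and ('code', str) segments, in order."""
    segments, pos = [], 0
    for m in CHUNK_RE.finditer(source):
        if m.start() > pos:
            segments.append(("text", source[pos:m.start()]))
        segments.append(("code", m.group(0)))
        pos = m.end()
    if pos < len(source):
        segments.append(("text", source[pos:]))
    return segments


def text_diff(old_rmd, new_rmd):
    """Unified diff of the prose only; code chunks are never compared."""
    def prose_lines(src):
        prose = "".join(s for kind, s in split_rmd(src) if kind == "text")
        return prose.splitlines()
    return list(difflib.unified_diff(
        prose_lines(old_rmd), prose_lines(new_rmd), lineterm=""))
```

With this split, a change inside a chunk never appears in the diff, which matches the "never diff on the code" rule.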

> There are functions in googlesheets that wrap pure Drive operations and would be great to move into a general Drive client, such as a resurrected driver package, if it will go to CRAN. I'd love to help move that along. Specifically: query/modify file permissions and capabilities, manipulate the directory structure. Something that's more out there is to think about creating a bridge between the Drive revision history and git(hub)? That would be pretty cool for a mixed team.

This seems like a separate topic to spin off into another issue, since file ops and auth can be handled separately.

@noamross
Contributor Author

noamross commented Mar 5, 2016

Related: There was a paper a while back on "Invertible Reproducible Documents" which described a prototype using .Rhtml that produced HTML that could be edited and then returned back to .Rhtml. It relied on having the R Code stored as comments in the HTML.

I'm not sure this is a great solution for Google Docs, as I'm not sure there's a way to have code hidden in comments that survives the round trip. Perhaps the Google Doc version shouldn't have code blocks stored at all, just placeholders with the chunk identifiers. The text blocks from the Google Doc are then applied to the Rmd, overwriting its text blocks, just as applying the Rmd to the Google Doc overwrites the knitted output sections.
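The placeholder idea could look roughly like this. It is a sketch in Python for illustration, not an existing package. The `[[chunk:label]]` placeholder syntax is invented here, and it assumes every chunk has a label as the first item in its header.

```python
import re

FENCE = "`" * 3  # three backticks

# A labelled knitr chunk; group 1 captures the label after "{r ".
CHUNK_RE = re.compile(
    r"^" + FENCE + r"\{r[ \t]+([\w.-]+)[^}]*\}[ \t]*\n.*?^" + FENCE + r"[ \t]*$",
    re.M | re.S,
)
# Hypothetical placeholder syntax shown to Google Doc editors.
PLACEHOLDER_RE = re.compile(r"\[\[chunk:([\w.-]+)\]\]")


def to_doc(rmd):
    """Replace each chunk with a placeholder; return (doc_text, chunks_by_label)."""
    chunks = {}

    def stash(match):
        chunks[match.group(1)] = match.group(0)
        return f"[[chunk:{match.group(1)}]]"

    return CHUNK_RE.sub(stash, rmd), chunks


def from_doc(doc_text, chunks):
    """Put the stored chunk source back into text edited in the Google Doc."""
    return PLACEHOLDER_RE.sub(lambda m: chunks[m.group(1)], doc_text)
```

An unknown label in the edited text raises `KeyError`, which is one way to notice that a collaborator has altered a placeholder.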

One particularly challenging area: inline r chunks, especially if you are using knitcitations.
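To make the difficulty concrete: the rendered document shows only the value of an inline expression, so the expression has to be located in the source before anything is sent out. A small sketch in Python, for illustration only, assuming inline code follows the usual backtick-`r` form and contains no backticks itself:

```python
import re

# Inline R code: a backtick, the letter r, whitespace, the expression, a backtick.
INLINE_RE = re.compile(r"`r[ \t]+([^`\n]+)`")


def inline_expressions(text):
    """Return the R expressions embedded inline in a block of prose."""
    return [m.group(1).strip() for m in INLINE_RE.finditer(text)]
```

Each expression found this way would need its own placeholder or protected span in the Google Doc, since its rendered value cannot be mapped back to code.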

@jennybc
Member

jennybc commented Mar 5, 2016

@craigcitro I have this sinking feeling that we should be doing things via Google Apps Script and the Apps Script Execution API, rather than using the Drive and Sheets REST APIs. Can you comment on that?

@noamross
Contributor Author

In anticipation of more work on this, I've updated rchie and created another package, juicer. juicer is a CSS inliner, which will help with sending compiled R Markdown files to Google Docs and Gmail via intermediate HTML. Gdocs/Gmail strip external CSS from HTML, so users can keep (some) styles by inlining.

@noamross
Contributor Author

After some experimentation, I've been finding that markdown --> Word --> Google Doc is more reliable in preserving document structure and formatting than markdown --> HTML (CSS or inline styles) --> Google Doc. So juicer may not be so useful here (though it is still useful for R Markdown --> Gmail). A higher-effort approach may be to create Google Docs directly via Google Apps Script, as Jenny suggests above. This would be like making a Google Doc pandoc writer that processes pandoc's JSON document format.
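As a sketch of what the first half of such a writer might do, here it is in Python for illustration. It assumes the JSON layout with top-level `"blocks"` that newer pandoc versions emit. The style labels are placeholders chosen for this example, and only a few node types are handled. It walks the pandoc JSON and reduces it to a flat list of styled paragraphs that a script could then insert into a document.

```python
import json


def inline_text(inlines):
    """Flatten a list of pandoc inline nodes to plain text (formatting is dropped)."""
    out = []
    for node in inlines:
        kind, content = node["t"], node.get("c")
        if kind == "Str":
            out.append(content)
        elif kind in ("Space", "SoftBreak"):
            out.append(" ")
        elif kind in ("Emph", "Strong"):
            out.append(inline_text(content))  # content is a list of inlines
        elif kind == "Code":
            out.append(content[1])            # content is [attr, text]
    return "".join(out)


def doc_paragraphs(pandoc_json):
    """Return (style, text) pairs for the headers and paragraphs in a pandoc JSON document."""
    doc = json.loads(pandoc_json)
    result = []
    for block in doc["blocks"]:
        if block["t"] == "Header":
            level, _attr, inlines = block["c"]
            result.append((f"HEADING_{level}", inline_text(inlines)))
        elif block["t"] in ("Para", "Plain"):
            result.append(("NORMAL_TEXT", inline_text(block["c"])))
    return result
```

The input would come from something like `pandoc -t json report.md`. Lists, tables, and images are ignored here and would need their own handling.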

@jules32

jules32 commented Mar 31, 2016

I would be very interested in this!

@dani-lbnl

Kristoffer created a converter and apps to inspect the structure of gdocs. More at http://www.lorut.no/apps/gdoc-structure-viewer/

@data-steve

These are old resources, but back when I first started the googleformr project, I stumbled across them:

Again, the posts date from 2012, but they give a decent description of attempts to get R and Apps Script to play well together.
