-
-
Notifications
You must be signed in to change notification settings - Fork 5
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support for RMarkdown and Offline Support #11
Conversation
…rmd_support merge :w
Codecov Report
@@ Coverage Diff @@
## master #11 +/- ##
===========================================
- Coverage 72.76% 55.59% -17.18%
===========================================
Files 8 11 +3
Lines 224 295 +71
===========================================
+ Hits 163 164 +1
- Misses 61 131 +70
Continue to review full report at Codecov.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had a quick look and didn't actually test the code (just read it), and I think it all looks good!
I have a suggestion to store the glossary as a plain text YAML file inside the inst/
folder instead of storing it as a binary data package file which should simplify the code.
@@ -1,16 +1,20 @@ | |||
#!/usr/bin/env Rscript | |||
|
|||
# Create .rda version of glossary for R package. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for this, personally, I would only fetch the yml file and put it in a inst/glosario
folder instead of making it a binary data file. Then, when reading the glossary locally, you can directly call yaml::read_yaml
on system.file("glosario/glossary.yml", package = "glosario")
. This would have the advantage of providing people with a plain text version of the file if they wanted to inspect it (also easier to diff if needed), and it would make the code in the package simpler because we could use the same code to parse from the web or from the package directly. But maybe there is a reason you chose this approach?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would also agree with this, especially if otherwise a binary is being updated every day on the github repo, which will make it grow and grow.
One of the things that I would advocate for is using an application directory instead for offline use, which comes in both R and python flavors (see {rappdirs} and appdirs). This way, people would be able to periodically update their dictionary without needing to update the package itself.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we could have both?
Update the content of the inst/
folder daily with GitHub Actions so there is always a fresh copy of the glossary with the package, and create a function "update_local_glossary()` (or something like it) that would update the glossary in the application directory.
This feature seems however non-critical for the first release and could be implemented later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This feature seems however non-critical for the first release and could be implemented later.
Agreed. My thoughts are largely centered around the patterns I see from R users and reproducibility:
- people don't update their packages that often or don't know how to update their packages.
- there is no clear indicator of what version of the glossary people have on their machines, so if it defaults to the one in the package, then definitions that exist in the global glossary may be missing in their local version.
- if {glosario} is released on CRAN, it will be updated every two months (as per CRAN's policy) at most, but the main glosario repository will be constantly updating.
These situations mean that if Belle installs the package on March 4th and Sebastian installs the package on July 17th, they will have two different versions of the glossary on their machines. Let's say they contribute a few new definitions to the main glossary on July 16th, but neither of them see these definitions on their lesson because the package was only pushed to CRAN on July 1st.
I think it would be good to consider these situations before we release this to CRAN and coordinate with the python implementation so that we can reduce the friction that users see.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These are good points @zkamvar! and made me think that we might want to consider including some type versioning inside the YAML (a hash of the glossary content? a Git SHA1? an increasing counter? a datetime stamp?) so people/machines could compare versions?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm going to open a new issue for these discussions
@fmichonneau and @zkamvar I think this is now at a point where we can merge. Agreed or any other changes that you can think of? |
Agreed we can merge this to get it moving since RMD support is key. We need to add tests for this, though but that may depend on the outcome of carpentries/glosario#27 |
R/glosario.R
Outdated
#' | ||
#' `glosario` provides the infrastructure to create and read multilingual | ||
#' glossaries. By default, `glosario` providesaccess to a community-curated | ||
#' glossary hosted by [The | ||
#' Carpentries](https://github.com/carpentries/glosario). | ||
#' | ||
#' @docType package | ||
#' @name glossary | ||
#' @name glosario |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe the name here should be glosario-package
?
I left 2 small comments, but otherwise, this is good to merge! Thanks Ian! |
This PR adds support for three use cases:
It also adds a GitHub action for a daily download of the package data so that it has the latest version.