Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support for RMarkdown and Offline Support #11

Merged
merged 31 commits into from
Aug 13, 2020
Merged

Support for RMarkdown and Offline Support #11

merged 31 commits into from
Aug 13, 2020

Conversation

ian-flores
Copy link
Member

This PR adds support for three use cases:

It also adds a GitHub action for a daily download of the package data so that it has the latest version.

@codecov
Copy link

codecov bot commented Jul 23, 2020

Codecov Report

Merging #11 into master will decrease coverage by 17.17%.
The diff coverage is 6.57%.

Impacted file tree graph

@@             Coverage Diff             @@
##           master      #11       +/-   ##
===========================================
- Coverage   72.76%   55.59%   -17.18%     
===========================================
  Files           8       11        +3     
  Lines         224      295       +71     
===========================================
+ Hits          163      164        +1     
- Misses         61      131       +70     
Impacted Files Coverage Δ
R/define.R 66.66% <0.00%> (ø)
R/list_glosario_terms.R 0.00% <0.00%> (ø)
R/text_definition.R 0.00% <0.00%> (ø)
R/validate_document.R 0.00% <0.00%> (ø)
R/entry.R 58.04% <9.67%> (-13.89%) ⬇️
R/gdef.R 81.81% <100.00%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 44f6acd...527913c. Read the comment docs.

Copy link
Contributor

@fmichonneau fmichonneau left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had a quick look and didn't actually test the code (just read it), and I think it all looks good!

I have a suggestion to store the glossary as a plain text YAML file inside the inst/ folder instead of storing it as a binary data package file which should simplify the code.

@@ -1,16 +1,20 @@
#!/usr/bin/env Rscript

# Create .rda version of glossary for R package.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for this, personally, I would only fetch the yml file and put it in a inst/glosario folder instead of making it a binary data file. Then, when reading the glossary locally, you can directly call yaml::read_yaml on system.file("glosario/glossary.yml", package = "glosario"). This would have the advantage of providing people with a plain text version of the file if they wanted to inspect it (also easier to diff if needed), and it would make the code in the package simpler because we could use the same code to parse from the web or from the package directly. But maybe there is a reason you chose this approach?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would also agree with this, especially if otherwise a binary is being updated every day on the github repo, which will make it grow and grow.

One of the things that I would advocate for is using an application directory instead for offline use, which comes in both R and python flavors (see {rappdirs} and appdirs). This way, people would be able to periodically update their dictionary without needing to update the package itself.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we could have both?
Update the content of the inst/ folder daily with GitHub Actions so there is always a fresh copy of the glossary with the package, and create a function "update_local_glossary()` (or something like it) that would update the glossary in the application directory.
This feature seems however non-critical for the first release and could be implemented later.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This feature seems however non-critical for the first release and could be implemented later.

Agreed. My thoughts are largely centered around the patterns I see from R users and reproducibility:

  1. people don't update their packages that often or don't know how to update their packages.
  2. there is no clear indicator of what version of the glossary people have on their machines, so if it defaults to the one in the package, then definitions that exist in the global glossary may be missing in their local version.
  3. if {glosario} is released on CRAN, it will be updated every two months (as per CRAN's policy) at most, but the main glosario repository will be constantly updating.

These situations mean that if Belle installs the package on March 4th and Sebastian installs the package on July 17th, they will have two different versions of the glossary on their machines. Let's say they contribute a few new definitions to the main glossary on July 16th, but neither of them see these definitions on their lesson because the package was only pushed to CRAN on July 1st.

I think it would be good to consider these situations before we release this to CRAN and coordinate with the python implementation so that we can reduce the friction that users see.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are good points @zkamvar! and made me think that we might want to consider including some type versioning inside the YAML (a hash of the glossary content? a Git SHA1? an increasing counter? a datetime stamp?) so people/machines could compare versions?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm going to open a new issue for these discussions

R/validate_document.R Outdated Show resolved Hide resolved
R/validate_document.R Outdated Show resolved Hide resolved
.github/workflows/build_data.yaml Outdated Show resolved Hide resolved
R/gdef.R Outdated Show resolved Hide resolved
@ian-flores
Copy link
Member Author

@fmichonneau and @zkamvar I think this is now at a point where we can merge. Agreed or any other changes that you can think of?

@zkamvar
Copy link

zkamvar commented Aug 3, 2020

@fmichonneau and @zkamvar I think this is now at a point where we can merge. Agreed or any other changes that you can think of?

Agreed we can merge this to get it moving since RMD support is key. We need to add tests for this, though but that may depend on the outcome of carpentries/glosario#27

R/glosario.R Outdated Show resolved Hide resolved
R/glosario.R Outdated
#'
#' `glosario` provides the infrastructure to create and read multilingual
#' glossaries. By default, `glosario` providesaccess to a community-curated
#' glossary hosted by [The
#' Carpentries](https://github.com/carpentries/glosario).
#'
#' @docType package
#' @name glossary
#' @name glosario
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe the name here should be glosario-package?

@fmichonneau
Copy link
Contributor

I left 2 small comments, but otherwise, this is good to merge! Thanks Ian!

@ian-flores ian-flores merged commit 3039e8c into master Aug 13, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants