Improve show method; add nodelist show functions #108

Draft: wants to merge 25 commits into main

@zkamvar (Member) commented May 8, 2024

This PR improves quality of life for people subsetting documents by letting them see the context of the nodes they create.

I'm not terribly set on the names, but I've got:

  • show_list(), which shows each element in its own paragraph
  • show_block(), which shows each element in its own context, with other contexts stripped away
    • adding mark = TRUE adds markers indicating whether there was context preceding or following the element
  • show_censor(), which shows the nodes in their full context, visually censoring everything else

I'm hoping to include this in tinkr 0.3.0 (which does not include #107)
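
To make the difference between the list and block views concrete, here is a toy sketch in base R. It is not how tinkr implements these functions: the `nodes` data frame, its `parent` column, and the `u1` to `u4` URLs are made up for illustration, and the marked view assumes every node had text on both sides.

```r
# Toy stand-in for four selected link nodes. `parent` records which paragraph
# each node came from. Illustrative data only, not tinkr's internal structure.
nodes <- data.frame(
  parent = c(1, 2, 2, 3),
  md = c("[much more](u1)", "[`dataone` package](u2)",
         "[Figshare](u3)", "[here](u4)"),
  stringsAsFactors = FALSE
)

# "list" view: every node becomes its own paragraph
list_view <- paste(nodes$md, collapse = "\n\n")

# "block" view: nodes that share a parent stay together in one paragraph
by_parent <- tapply(nodes$md, nodes$parent, paste, collapse = "")
block_view <- paste(by_parent, collapse = "\n\n")

# "block" view with markers, assuming context existed before and after each node
marked <- paste("[...]", nodes$md, "[...]")
marked_view <- paste(
  tapply(marked, nodes$parent, paste, collapse = ""),
  collapse = "\n\n"
)

cat(list_view, "\n\n-----\n\n", block_view, "\n\n-----\n\n", marked_view, "\n", sep = "")
```

The list view prints four paragraphs, while the block views print three, because the two nodes from paragraph 2 are kept together.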

library("tinkr")

path <- system.file("extdata", "example1.md", package = "tinkr")
y <- tinkr::yarn$new(path, sourcepos = TRUE)

# we can show a range of the document
y$show(22:32)
#> 
#> In the [second post of the series where we obtained data from
#> eBird](https://ropensci.org/blog/2018/08/21/birds-radolfzell/) we
#> determined what birds were observed in the county of Constance, and we
#> complemented this knowledge with some taxonomic and trait information in
#> [the fourth post of the
#> series](https://ropensci.org/blog/2018/09/04/birds-taxo-traits/). Now,
#> we could be curious about the occurrence of these birds in *scientific
#> work*. In this post, we will query the scientific literature and an open
#> scientific data repository for species names: what have these birds been
#> studied for? Read on if you want to learn how to use R packages allowing

items <- xml2::xml_find_all(y$body, ".//md:item", tinkr::md_ns())
links <- xml2::xml_find_all(y$body, ".//md:link", tinkr::md_ns())
code <- xml2::xml_find_all(y$body, ".//md:code", tinkr::md_ns())
blocks <- xml2::xml_find_all(y$body, ".//md:code_block", tinkr::md_ns())

# convert markdown to a vector, useful when converting markdown blocks that contain other elements (like links)

to_md_vec(items)[1:2]
#> [1] "study the results of such queries (e.g. meta studies of number of,\n  say, versions by datasets)"                                                                                                                   
#> [2] "or find data to integrate to a new study. If you want to *download*\n  data from DataONE, refer to the [download data\n  vignette](https://github.com/DataONEorg/rdataone/blob/master/vignettes/download-data.Rmd)."


# show a list of things
show_list(links[20:31])
#> 
#> 
#> [much
#> more](https://ropensci.org/packages/)
#> 
#> [`dataone`
#> package](https://github.com/DataONEorg/rdataone)
#> 
#> [`rfigshare`](https://github.com/ropensci/rfigshare)
#> 
#> [Figshare](https://figshare.com/)
#> 
#> [`EML` package](https://github.com/ropensci/EML)
#> 
#> [unconf
#> `dataspice` project](https://github.com/ropenscilabs/dataspice)
#> 
#> [here](https://ropensci.org/packages/)
#> 
#> [How to identify spots for birding using open geographical
#> data](https://ropensci.org/blog/2018/08/14/where-to-bird/)
#> 
#> [How to obtain bird occurrence data in
#> R](https://ropensci.org/blog/2018/08/21/birds-radolfzell/)
#> 
#> [How to extract text from old natural history
#> drawings](https://ropensci.org/blog/2018/08/28/birds-ocr/)
#> 
#> [How to complement an occurrence dataset with taxonomy and trait
#> information](https://ropensci.org/blog/2018/09/04/birds-taxo-traits/)
#> 
#> [our friendly discussion
#> forum](https://discuss.ropensci.org/c/usecases)
show_list(code[1:10])
#> 
#> 
#> `glue::glue_collapse(species, sep = ", ", last = " and ")`
#> 
#> `taxize`
#> 
#> `spocc`
#> 
#> `fulltext`
#> 
#> `fulltext`
#> 
#> `tidytext`
#> 
#> `dplyr::bind_rows`
#> 
#> `fulltext`
#> 
#> `cites`
#> 
#> `rcites`
# show them within the structure of the document
show_block(links[20:31])
#> 
#> 
#> [much
#> more](https://ropensci.org/packages/)
#> 
#> [`dataone`
#> package](https://github.com/DataONEorg/rdataone)[`rfigshare`](https://github.com/ropensci/rfigshare)[Figshare](https://figshare.com/)[`EML` package](https://github.com/ropensci/EML)[unconf
#> `dataspice` project](https://github.com/ropenscilabs/dataspice)
#> 
#> [here](https://ropensci.org/packages/)
#> 
#> - [How to identify spots for birding using open geographical
#>   data](https://ropensci.org/blog/2018/08/14/where-to-bird/)
#> 
#> - [How to obtain bird occurrence data in
#>   R](https://ropensci.org/blog/2018/08/21/birds-radolfzell/)
#> 
#> - [How to extract text from old natural history
#>   drawings](https://ropensci.org/blog/2018/08/28/birds-ocr/)
#> 
#> - [How to complement an occurrence dataset with taxonomy and trait
#>   information](https://ropensci.org/blog/2018/09/04/birds-taxo-traits/)
#> 
#> [our friendly discussion
#> forum](https://discuss.ropensci.org/c/usecases)
show_block(code[1:10])
#> 
#> 
#> [`glue::glue_collapse(species, sep = ", ", last = " and ")`](https://twitter.com/LucyStats/status/1031938964796657665?s=19)
#> 
#> [`taxize`](https://github.com/ropensci/taxize)[`spocc`](https://github.com/ropensci/spocc)[`fulltext`](https://github.com/ropensci/fulltext)
#> 
#> `fulltext``tidytext`
#> 
#> `dplyr::bind_rows``fulltext`
#> 
#> [`cites`](https://github.com/ecohealthalliance/cites/)[`rcites`](https://ibartomeus.github.io/rcites/)
# show them within the structure of the document with context markers
show_block(links[20:31], mark = TRUE)
#> 
#> 
#> [...] [much
#> more](https://ropensci.org/packages/) [...]
#> 
#> [...] [`dataone`
#> package](https://github.com/DataONEorg/rdataone) [...][...] [`rfigshare`](https://github.com/ropensci/rfigshare) [...][...] [Figshare](https://figshare.com/) [...][...] [`EML` package](https://github.com/ropensci/EML) [...][...] [unconf
#> `dataspice` project](https://github.com/ropenscilabs/dataspice) [...]
#> 
#> [...] [here](https://ropensci.org/packages/) [...]
#> 
#> - [How to identify spots for birding using open geographical
#>   data](https://ropensci.org/blog/2018/08/14/where-to-bird/) [...]
#> 
#> - [How to obtain bird occurrence data in
#>   R](https://ropensci.org/blog/2018/08/21/birds-radolfzell/) [...]
#> 
#> - [How to extract text from old natural history
#>   drawings](https://ropensci.org/blog/2018/08/28/birds-ocr/) [...]
#> 
#> - [How to complement an occurrence dataset with taxonomy and trait
#>   information](https://ropensci.org/blog/2018/09/04/birds-taxo-traits/) [...]
#> 
#> [...] [our friendly discussion
#> forum](https://discuss.ropensci.org/c/usecases) [...]
show_block(code[1:10], mark = TRUE)
#> 
#> 
#> [[...] `glue::glue_collapse(species, sep = ", ", last = " and ")` [...]](https://twitter.com/LucyStats/status/1031938964796657665?s=19)
#> 
#> [`taxize`](https://github.com/ropensci/taxize)[`spocc`](https://github.com/ropensci/spocc)[`fulltext`](https://github.com/ropensci/fulltext)
#> 
#> [...] `fulltext` [...][...] `tidytext` [...]
#> 
#> [...] `dplyr::bind_rows` [...][...] `fulltext` [...]
#> 
#> [`cites`](https://github.com/ecohealthalliance/cites/)[`rcites`](https://ibartomeus.github.io/rcites/)
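
The node sets passed to the show functions above come from namespaced XPath queries. As a rough sketch of that selection pattern without tinkr, the snippet below builds a small CommonMark XML document directly. It assumes the commonmark and xml2 packages are installed; the sample markdown text is made up, and renaming the default prefix to `md` is my assumption about what `tinkr::md_ns()` provides.

```r
library(xml2)

# Made-up sample document
md <- paste(
  "See [rOpenSci](https://ropensci.org/) and `xml2`.",
  "",
  "- item one",
  "- item two",
  sep = "\n"
)

# commonmark::markdown_xml() returns the document as CommonMark XML text
doc <- read_xml(commonmark::markdown_xml(md))

# The XML uses a default namespace, so XPath queries need a prefix.
# xml2 names the default namespace `d1`; rename it to `md` for readability.
ns <- xml_ns_rename(xml_ns(doc), d1 = "md")

links <- xml_find_all(doc, ".//md:link", ns)
items <- xml_find_all(doc, ".//md:item", ns)
code  <- xml_find_all(doc, ".//md:code", ns)

xml_attr(links, "destination")  # link targets
length(items)                   # number of list items
xml_text(code)                  # inline code content
```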

path <- system.file("extdata", "example1.md", package = "tinkr")
y <- tinkr::yarn$new(path, sourcepos = TRUE)
links <- xml2::xml_find_all(y$body, ".//md:link", tinkr::md_ns())
tinkr::show_censor(links)
    #> 
    #> 
    #> ▇▇ ▇▇▇ [second post of the series where we obtained data from
    #> eBird](https://ropensci.org/blog/2018/08/21/birds-radolfzell/) ▇▇
    #> ▇▇▇▇▇▇▇▇▇▇ ▇▇▇▇ ▇▇▇▇▇ ▇▇▇▇ ▇▇▇▇▇▇▇▇ ▇▇ ▇▇▇ ▇▇▇▇▇▇ ▇▇ ▇▇▇▇▇▇▇▇▇▇ ▇▇▇ ▇▇
    #> ▇▇▇▇▇▇▇▇▇▇▇▇ ▇▇▇▇ ▇▇▇▇▇▇▇▇▇ ▇▇▇▇ ▇▇▇▇ ▇▇▇▇▇▇▇▇▇ ▇▇▇ ▇▇▇▇▇ ▇▇▇▇▇▇▇▇▇▇▇ ▇▇
    #> [the fourth post of the
    #> series](https://ropensci.org/blog/2018/09/04/birds-taxo-traits/)▇ ▇▇▇▇
    #> ▇▇ ▇▇▇▇▇ ▇▇ ▇▇▇▇▇▇▇ ▇▇▇▇▇ ▇▇▇ ▇▇▇▇▇▇▇▇▇▇ ▇▇ ▇▇▇▇▇ ▇▇▇▇▇ ▇▇ *▇▇▇▇▇▇▇▇▇▇
    #> ▇▇▇▇*▇ ▇▇ ▇▇▇▇ ▇▇▇▇▇ ▇▇ ▇▇▇▇ ▇▇▇▇▇ ▇▇▇ ▇▇▇▇▇▇▇▇▇▇ ▇▇▇▇▇▇▇▇▇▇ ▇▇▇ ▇▇ ▇▇▇▇
    #> ▇▇▇▇▇▇▇▇▇▇ ▇▇▇▇ ▇▇▇▇▇▇▇▇▇▇ ▇▇▇ ▇▇▇▇▇▇▇ ▇▇▇▇▇▇ ▇▇▇▇ ▇▇▇▇ ▇▇▇▇▇ ▇▇▇▇▇ ▇▇▇▇
    #> ▇▇▇▇▇▇▇ ▇▇▇▇ ▇▇▇▇ ▇▇ ▇▇ ▇▇▇ ▇▇▇▇ ▇▇ ▇▇▇▇▇ ▇▇▇ ▇▇ ▇▇▇ ▇ ▇▇▇▇▇▇▇▇ ▇▇▇▇▇▇▇▇

<!-- SNIP ----------------------------------------------------------->   

    #> - [How to complement an occurrence dataset with taxonomy and trait
    #>   information](https://ropensci.org/blog/2018/09/04/birds-taxo-traits/)▇
    #>   ▇▇▇▇▇▇▇▇▇ `▇▇▇▇▇▇`▇ ▇▇▇▇▇▇▇▇▇ ▇▇▇▇▇▇▇▇ ▇▇▇ ▇▇ ▇▇▇ `▇▇▇▇▇▇`▇
    #>   ▇▇▇▇▇▇▇▇▇ ▇▇▇▇▇▇ ▇▇ ▇▇▇▇▇▇▇ ▇▇▇▇▇▇ ▇▇▇▇▇
    #> 
    #> - ▇▇▇ ▇▇ ▇▇▇▇▇ ▇▇▇ ▇▇▇▇▇▇▇▇▇▇ ▇▇▇▇▇▇▇▇▇▇ ▇▇▇ ▇▇▇▇▇▇▇▇▇▇ ▇▇▇▇ ▇▇▇▇
    #>   ▇▇▇▇▇▇▇▇▇▇▇▇▇ ▇▇▇▇ ▇▇ ▇▇▇ ▇▇▇▇ ▇▇▇▇▇▇ ▇▇▇▇ ▇▇▇▇▇
    #> 
    #> ▇▇▇▇▇▇ ▇ ▇▇▇▇▇ ▇▇▇ ▇▇▇▇ ▇▇▇▇▇ *▇▇▇* ▇▇▇▇▇▇▇▇ ▇▇ ▇▇▇▇▇▇▇ ▇▇▇ ▇▇▇▇▇▇▇▇
    #> ▇▇▇▇▇ ▇▇▇ ▇▇▇▇ ▇▇▇ ▇▇▇▇▇▇ ▇▇▇ ▇▇ ▇▇▇▇▇ ▇▇▇▇▇ ▇▇▇▇ ▇▇▇ ▇▇▇▇▇ ▇▇ ▇▇▇▇▇▇▇▇
    #> ▇▇▇▇▇▇▇▇ ▇▇ ▇ ▇▇▇▇▇▇ ▇▇ ▇▇▇ ▇▇▇ [our friendly discussion
    #> forum](https://discuss.ropensci.org/c/usecases)▇ ▇▇▇▇▇ ▇▇▇▇▇▇▇▇

Created on 2024-05-08 with reprex v2.1.0
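
The censored output above keeps whitespace and line breaks intact while masking everything else, so the selected nodes stay in their original position. A minimal base R sketch of that masking idea follows. `censor_text()` is a hypothetical helper, not a tinkr function, and it works on plain strings, whereas the PR operates on document nodes.

```r
# Hypothetical helper (not part of tinkr): replace every non-whitespace
# character with a mark, so word lengths and line breaks are preserved.
censor_text <- function(x, mark = "\u2587") {
  gsub("[^[:space:]]", mark, x)
}

txt <- "we determined what birds\nwere observed"
cat(censor_text(txt), "\n")

# An ASCII mark makes the effect easy to inspect
censor_text("ab cd", mark = "#")
```

Because each character maps to exactly one mark, the censored string has the same number of characters as the original, which is what keeps the uncensored nodes aligned with their surroundings.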
