
document the importance of the internet connection to create codemeta.json #270

Closed
maelle opened this issue Dec 4, 2019 · 7 comments

maelle commented Dec 4, 2019

No connection -> less data.
Poor connection -> very slooow (ask me how I know).

@maelle maelle added this to the 0.1.9 milestone Dec 4, 2019
Bisaloo commented Dec 4, 2019

Oooh, so this was the source of my problems? I have a good connection at work but I'm guessing some requests were blocked due to the network configuration. And write_codemeta() just kept hanging for literally hours... Would it be worth adding a timeout to the requests?

In my opinion, it would be useful if said documentation specified which websites/URLs are queried, and what for.

maelle commented Dec 4, 2019

yes and yes, good points!

maelle commented Dec 4, 2019

Keeping notes here.

The package queries (sketched below):

  • utils::available.packages() for CRAN and Bioconductor packages (no timeout parameter?)

  • the GitHub API, if it finds a GitHub repo URL in DESCRIPTION or among the git remotes. The GitHub API is queried to find the preferred README and the repo topics. If you use codemetar for many packages, having a GITHUB_PAT set is better.

  • the R-hub sysreqs API to parse SystemRequirements.
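
A rough sketch of those three kinds of calls, for reference; the exact endpoints and arguments used inside codemetar may differ, and the repo/package names below are only examples:

```r
# CRAN / Bioconductor metadata (no obvious timeout parameter):
pkgs <- utils::available.packages()

# GitHub API via the gh package; gh::gh() picks up a GITHUB_PAT
# environment variable automatically, which raises the rate limit.
readme <- gh::gh("/repos/ropensci/codemetar/readme")
topics <- gh::gh("/repos/ropensci/codemetar/topics")

# R-hub sysreqs API to parse a SystemRequirements field
# (URL assumed from the API name above):
sysreqs <- jsonlite::fromJSON("https://sysreqs.r-hub.io/pkg/git2r")
```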

maelle commented Dec 12, 2019

Regarding the timeout: I don't understand why the requests don't fail, since @jeroen said curl has a default timeout of 10 seconds. I'm having trouble debugging this.

To set the timeout and other useful options, one needs to create a handle (see https://jeroen.cran.dev/curl/articles/intro.html#setting-handle-options). Jeroen mentioned https://curl.haxx.se/libcurl/c/CURLOPT_CONNECTTIMEOUT.html, https://curl.haxx.se/libcurl/c/CURLOPT_LOW_SPEED_LIMIT.html, and https://curl.haxx.se/libcurl/c/CURLOPT_LOW_SPEED_TIME.html.

Currently the call to the GitHub API uses gh::gh() and the call to the sysreqs API uses jsonlite::fromJSON(); I don't think one could set these options without replacing those calls.
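
A minimal sketch of setting those libcurl options through a handle with the curl package, following the vignette linked above; the values are illustrative, not what codemetar would ship:

```r
library(curl)

h <- new_handle(
  connecttimeout  = 10,  # CURLOPT_CONNECTTIMEOUT, seconds
  low_speed_limit = 100, # CURLOPT_LOW_SPEED_LIMIT, bytes/second
  low_speed_time  = 10   # CURLOPT_LOW_SPEED_TIME, seconds
)

# Aborts if the transfer stays below 100 B/s for 10 seconds:
res <- curl_fetch_memory("https://api.github.com", handle = h)
```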

maelle commented Dec 12, 2019

available.packages() uses download.file().
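
For download.file(), and hence available.packages(), the relevant knob is base R's "timeout" option, in seconds:

```r
# download.file() honours options("timeout"); the default is 60 seconds.
getOption("timeout")
options(timeout = 30)
pkgs <- utils::available.packages()
```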

Bisaloo commented Dec 12, 2019

I wonder if this should also be documented via message()s in write_codemeta() 🤔 (possibly controlled by a verbose boolean argument).

It would be useful for users who don't necessarily read the vignette and it would help pinpoint the problematic domains when working with an annoying firewall.

```
Fetching online data from CRAN... ✔️
Fetching online data from sysreqs API...
```
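
A hypothetical sketch of that idea (the helper name and arguments are invented here, not codemetar's actual implementation):

```r
fetch_with_message <- function(what, fetch, verbose = TRUE) {
  if (verbose) message("Fetching online data from ", what, "...")
  result <- fetch()
  if (verbose) message("Fetching online data from ", what, "... done")
  result
}

cran_db <- fetch_with_message("CRAN", function() utils::available.packages())
```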

maelle commented Dec 19, 2019

Excellent idea! I'll work on that.

@maelle maelle closed this as completed Jan 8, 2020
maelle added a commit that referenced this issue Apr 1, 2020
## Deprecation

* The `use_git_hook` argument of `write_codemeta()` has been deprecated. Solutions for keeping DESCRIPTION and codemeta.json in sync are available in the docs.

## Enhancements

* Docs were improved to make a better case for codemetar.

* Changes in the way codeRepository is guessed. codemetar can now recognize a URL from GitHub, GitLab, Bitbucket, or R-Forge among several URLs in DESCRIPTION, and assign it to codeRepository. If no URL in DESCRIPTION is from any of these providers, `guess_github()` is called.

* Adds documentation of internet needs and verbosity to steps downloading information from the web (#270, @Bisaloo)

* New argument `write_minimeta` for `write_codemeta()` indicating whether to also create the file schemaorg.json, which corresponds to the metadata Google would validate, to be inserted into a webpage for SEO (see the sketch after this list). It is saved as "schemaorg.json" alongside `path` (by default, "codemeta.json"). This functionality requires the `jsonld` package (listed under `Suggests`).
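
A minimal usage sketch of that argument, assuming the defaults described in the entry above:

```r
# Writes codemeta.json and, with write_minimeta = TRUE, schemaorg.json
# alongside it (requires the jsonld package):
codemetar::write_codemeta(write_minimeta = TRUE)
```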

## Bug fixes

* Fix for detecting rOpenSci review badge (@sckott, #236)

* Fix extraction of ORCID when composite comment (@billy34, #231)

* Fix bug in crosswalking (#243)

* Bug fix: the codeRepository is updated if there's any URL in DESCRIPTION.

* Bug fix: the README information is now updated by codemeta_readme(). Previously, if e.g. a developmentStatus had already been set, it was never updated.

## Internals

* Code cleaning following the book: Martin, Robert C. *Clean Code: A Handbook of Agile Software Craftsmanship*. Pearson Education, 2009. (@hsonne, #201, #202, #204, #205, #206, #207, #209, #210, #211, #212, #216, #218, #219, #220, #221).

* Use of re-usable Rmd pieces for the README, intro vignette and man pages to reduce copy-pasting.