Use HTTPS for all URLs in R package #7863

Open
exalate-issue-sync bot opened this issue May 11, 2023 · 9 comments

Comments

@exalate-issue-sync

Hi! There's a CRAN Policy (https://cran.r-project.org/web/packages/policies.html) that states:

Downloads of additional software or data as part of package
installation or startup should only use secure download mechanisms
(e.g., ‘https’ or ‘ftps’).

and h2o seems to be in violation of said policy in several functions. For example, when we install the library within R sessions, this function is called using "http" URLs to download additional files: https://github.com/cran/h2o/blob/15f015a62befd8ca7fd0fe7f151d3fcefe3ad0e5/R/connection.R#L789

As I work with extremely security-sensitive people, they will not allow me to install the library until it complies with this policy. I've used and promoted h2o (for R) for about 3 years with my lares library, and find it absolutely amazing; top of the market. It would be a waste to have to pivot to another similar library for this single reason.

Hope you can fix this soon and many thanks!
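
The policy quoted above comes down to a check on the URL scheme before anything is downloaded. A minimal sketch of such a check follows; it is written in Python purely for illustration, the function name and the example URLs are hypothetical, and it is not how h2o itself performs its downloads.

```python
from urllib.parse import urlparse

# Schemes the quoted CRAN policy gives as examples of secure mechanisms.
SECURE_SCHEMES = {"https", "ftps"}

def uses_secure_scheme(url):
    """Return True if the URL's scheme is one of the secure schemes."""
    return urlparse(url).scheme.lower() in SECURE_SCHEMES

# Hypothetical URLs, used only to exercise the check.
print(uses_secure_scheme("http://example.com/h2o.jar"))
print(uses_secure_scheme("https://example.com/h2o.jar"))
```

A check like this only inspects the scheme; it does not confirm that the server actually serves the file over HTTPS.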

@exalate-issue-sync

Erin LeDell commented: Hi [~accountid:5e7a344b47dc780c3c6cc79f], thanks for the report! We were notified by CRAN about two weeks ago that we were in violation, so this is going to be fixed in the next release of H2O (it is being released next week, so you should have the fix very soon!). I have a separate ticket for this (https://0xdata.atlassian.net/browse/PUBDEV-7779). It’s been merged already, so the fix is already available in the nightly releases (you can download the patched R package from http://h2o-release.s3.amazonaws.com/h2o/master/latest.html if you want to give it a try).

Did you see any other non-HTTPS violations, or was it just the h2o.jar download line that you referenced in the description?

@exalate-issue-sync

Bernardo Lares commented: Thanks Erin, so glad this will be solved soon!
I’ve done a quick search of the code and found a few more "http" URLs. They are not the most relevant ones, but here are some:

  • https://github.com/cran/h2o/blob/master/man/h2o.getTypes.Rd#L23
  • CSV: https://github.com/cran/h2o/blob/master/R/frame.R#L335
  • CSV: https://github.com/cran/h2o/blob/master/R/frame.R#L2300
    (there are more, repeated…)
  • Documentation: https://github.com/cran/h2o/blob/master/R/connection.R#L43
  • Documentation: https://github.com/cran/h2o/blob/master/R/connection.R#L198

I’m attaching a screenshot of the regex search that helped me find them.

[Attached screenshot: Captura de Pantalla 2020-09-22 a la(s) 8.48.24 p. m..png]
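
The exact pattern used in that search is not shown in the thread. As a rough sketch of the same idea, assuming the goal is to list every plain "http" URL with its line number, something like the following would do. It is written in Python for illustration; the pattern, the function name, and the sample text are assumptions rather than the search that was actually run.

```python
import re

# Matches plain-http URLs. The URL ends at whitespace, quotes, angle
# brackets, closing brackets, or a backslash (e.g. an escaped "\n" in source).
HTTP_URL = re.compile(r"http://[^\s\"'<>)\]}\\]+")

def find_http_urls(text):
    """Return (line_number, url) pairs for every plain-http URL in text."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for match in HTTP_URL.finditer(line):
            hits.append((lineno, match.group(0)))
    return hits

# Sample lines modelled on the roxygen comments discussed in this issue.
sample = "\n".join([
    "#' f <- \"http://h2o-public-test-data.s3.amazonaws.com/smalldata/iris/iris_train.csv\"",
    "#' See https://cran.r-project.org/web/packages/policies.html",
    "#' @references http://www.cs.ucr.edu/~eamonn/SAX.pdf",
])

for lineno, url in find_http_urls(sample):
    print(lineno, url)
```

URLs that already use https are skipped because the pattern requires the literal prefix `http://`.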

@exalate-issue-sync

Erin LeDell commented: Hi [~accountid:5e7a344b47dc780c3c6cc79f] the CRAN policy only relates to software/data that’s downloaded on install/startup, so I don’t think the examples that load data over http are in violation, but it would be a good thing to fix anyway, so we can use this ticket to fix those! Thanks.

@exalate-issue-sync

Erin LeDell commented: Hi [~accountid:5d1185d4f46aa30c271c7cc6] I just assigned this to you (it does not need to go into the 3.32 release, but if you have time you can put it in). We need to use https in most places where we use http in URLs in the R code/docs. If a URL was duplicated in a file, I just added it to the list below once (there are a lot in the frame.R examples).

Here’s a list of the URLs that need to be updated from http to https (I removed some http lines that are used in h2o.init()). The R file where each one appears is shown to the left of the line:

communication.R:#' f <- "http://h2o-public-test-data.s3.amazonaws.com/smalldata/iris/iris_wheader.csv"
connection.R: "(http://docs.h2o.ai/h2o/latest-stable/h2o-docs/welcome.html#new-user-quick-start)."))
connection.R: Upgrade H2O and R to latest stable version - http://h2o-release.s3.amazonaws.com/h2o/latest_stable.html",
connection.R: Install the matching h2o-R version from - http://h2o-release.s3.amazonaws.com/h2o/%s/%s/index.html",
connection.R: "For more information visit http://docs.h2o.ai\n",
connection.R: "http://www.oracle.com/technetwork/java/javase/downloads/index.html")
coxph.R:#' f <- "http://s3.amazonaws.com/h2o-public-test-data/smalldata/coxph_test/heart.csv"
datasets.R:#' learning databases [http://www.ics.uci.edu/~mlearn/MLRepository.html]. Irvine, CA: University of
frame.R:#' f <- "http://h2o-public-test-data.s3.amazonaws.com/smalldata/iris/iris_train.csv"
frame.R:#' f <- "http://h2o-public-test-data.s3.amazonaws.com/smalldata/iris/iris_wheader.csv"
frame.R:#' f <- "http://s3.amazonaws.com/h2o-public-test-data/smalldata/coxph_test/heart.csv"
frame.R:#' f <- "http://h2o-public-test-data.s3.amazonaws.com/smalldata/iris/iris_train.csv"
frame.R:#' df <- h2o.importFile("http://s3.amazonaws.com/h2o-public-test-data/smalldata/prostate/prostate.csv")
frame.R:#' f <- "http://h2o-public-test-data.s3.amazonaws.com/smalldata/iris/iris_train.csv"
frame.R:#' @references http://www.cs.ucr.edu/~eamonn/iSAX_2.0.pdf
frame.R:#' @references http://www.cs.ucr.edu/~eamonn/SAX.pdf
glrm.R:#' @references M. Udell, C. Horn, R. Zadeh, S. Boyd (2014). {Generalized Low Rank Models}[http://arxiv.org/abs/1410.0342]. Unpublished manuscript, Stanford Electrical Engineering Department.
glrm.R:#' N. Halko, P.G. Martinsson, J.A. Tropp. {Finding structure with randomness: Probabilistic algorithms for constructing approximate matrix decompositions}[http://arxiv.org/abs/0909.4061]. SIAM Rev., Survey and Review section, Vol. 53, num. 2, pp. 217-288, June 2011.
kvstore.R:#' f <- "http://h2o-public-test-data.s3.amazonaws.com/smalldata/iris/iris_train.csv"
models.R:#' f <- "http://s3.amazonaws.com/h2o-public-test-data/smalldata/prostate/prostate_complete.csv.zip"
models.R:#' f <- "http://h2o-public-test-data.s3.amazonaws.com/smalldata/chicago/chicagoCensus.csv"
models.R:#' f <- "http://h2o-public-test-data.s3.amazonaws.com/smalldata/iris/iris_train.csv"
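
An update like the one listed above could be scripted rather than done by hand. The following is a minimal sketch in Python, not the change that was actually merged. The function name and the `skip_hosts` parameter are hypothetical; the skip list stands in for the lines the comment says were deliberately left out, and `localhost` is only an assumed example of such a host.

```python
import re

def upgrade_to_https(line, skip_hosts=()):
    """Rewrite http:// URLs in line to https://, leaving skip_hosts untouched."""
    def swap(match):
        host = match.group(1)
        if host in skip_hosts:
            return match.group(0)  # keep this URL exactly as written
        return "https://" + host
    # Only the scheme and host are matched; the rest of the URL is left alone.
    return re.sub(r"http://([A-Za-z0-9.-]+)", swap, line)

line = "#' f <- \"http://s3.amazonaws.com/h2o-public-test-data/smalldata/coxph_test/heart.csv\""
print(upgrade_to_https(line))

# Assumed example of a URL that should stay on plain http.
print(upgrade_to_https("http://localhost:54321", skip_hosts=("localhost",)))
```

Rewriting the scheme does not guarantee that the host serves the same content over HTTPS, so each rewritten URL would still need to be checked before release.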

@exalate-issue-sync

Erin LeDell commented: [~accountid:5e7a344b47dc780c3c6cc79f] Fixed.

@exalate-issue-sync

Erin LeDell commented: [~accountid:5e7a344b47dc780c3c6cc79f] Unfortunately, h2o was pulled from CRAN today because of this issue. These fixes were scheduled to go in our new 3.32.0.1 release (which was scheduled for tomorrow). We had been emailing with CRAN about it and they knew that we were going to fix it in our next release. Unfortunately it was removed anyway.

Instead, we are doing a 3.30.1.3 release tomorrow (Monday Sept 28) and the 3.32 release will be delayed until later this week. Just an FYI.

@exalate-issue-sync

Bernardo Lares commented: Thanks for the follow-up and quick response, Erin. Looking forward to having the package back on CRAN very soon! 🙏🏼

@exalate-issue-sync

Erin LeDell commented: [~accountid:5e7a344b47dc780c3c6cc79f] We are back on CRAN.

@exalate-issue-sync

Bernardo Lares commented: Thanks Erin. Great news indeed! Noticed it a couple of days back. Congratulations and happy we have h2o back on track. Cheers.
