Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cyphr #114

Closed
10 of 13 tasks
richfitz opened this issue May 8, 2017 · 36 comments
Closed
10 of 13 tasks

cyphr #114

richfitz opened this issue May 8, 2017 · 36 comments

Comments

@richfitz
Copy link
Member

richfitz commented May 8, 2017

Summary

  • What does this package do? (explain in 50 words or less):

cyphr provides a set of user-friendly wrappers around openssl and cyphr to encrypt and decrypt data.

  • Paste the full DESCRIPTION file inside a code block below:
Package: cyphr
Title: High Level Enryption Wrappers
Version: 0.1.0
Author: Rich FitzJohn
Maintainer: Rich FitzJohn <rich.fitzjohn@gmail.com>
Description: Encryption wrappers, using low-level support from sodium and
    openssl.
License: MIT + file LICENSE
LazyData: true
URL: https://github.com/dide-tools/cyphr
BugReports: https://github.com/dide-tools/cyphr/issues
Depends: R (>= 3.2.1)
Imports:
    getPass,
    openssl,
    sodium
Suggests:
    knitr,
    rmarkdown,
    testthat
RoxygenNote: 5.0.1
VignetteBuilder: rmarkdown,
    knitr
  • URL for the package (the development repository, not a stylized html page):

https://github.com/richfitz/cyphr

  • Who is the target audience?

Data analysts who deal with sensitive data, especially those who need to collaborate on it.

  • Are there other R packages that accomplish the same thing? If so, what is different about yours?

The packages openssl and sodium provide the lower level support; this package builds on those to avoid exposing users to raw vectors and to try and make it easy to do the right thing. There is also overlap with gpg (but again that's lower level)

Requirements

Confirm each of the following by checking the box. This package:

  • does not violate the Terms of Service of any service it interacts with.
  • has a CRAN and OSI accepted license.
  • contains a README with instructions for installing the development version.
  • includes documentation with examples for all functions.
  • contains a vignette with examples of its essential functions and uses.
  • has a test suite.
  • has continuous integration with Travis CI and/or another service.

Publication options

  • Do you intend for this package to go on CRAN?
  • Do you wish to automatically submit to the Journal of Open Source Software? If so:
    • The package contains a paper.md with a high-level description in the package root or in inst/.
    • The package is deposited in a long-term repository with the DOI:
    • (Do not submit your package separately to JOSS)

Detail

  • Does R CMD check (or devtools::check()) succeed? Paste and describe any errors or warnings:

  • Does the package conform to rOpenSci packaging guidelines? Please describe any exceptions:

  • If this is a resubmission following rejection, please explain the change in circumstances:

  • If possible, please provide recommendations of reviewers - those with experience with similar packages and/or likely users of your package - and their GitHub user names:

If either are up for it, Ironholds or hrbrmstr (or someone else with similar security focus) (not @-copied directly to avoid adding them to a thread).

@maelle
Copy link
Member

maelle commented May 16, 2017

Thanks for your submission @richfitz! 😸

Editor checks:

  • [x ] Fit: The package meets criteria for fit and overlap
  • [ x] Automated tests: Package has a testing suite and is tested via Travis-CI or another CI service.
  • [ x] License: The package has a CRAN or OSI accepted license
  • [x ] Repository: The repository link resolves correctly
  • Archive (JOSS only, may be post-review): The repository DOI resolves correctly
  • Version (JOSS only, may be post-review): Does the release version given match the GitHub release (v1.0.0)?

Editor comments

Nothing now that R CMD check succeeded on my weird Windows machine! 😉

Since you asked me whether the README was clear, I have two comments:

  • "To generate keys, you really should read the underling documentation in the sodium or openssl packages!"
    are these documentations as user-friendly as cyphr's?

  • Maybe add a link to https://github.com/hrbrmstr/rpwnd and maybe a few sentences to explain why R is not safe in order to make "(you're using R, remember)" more explicit.


Reviewers: @stephlocke @hrbrmstr
Due date: 2017-06-07 for @hrbrmstr but 2017-07-01 for @stephlocke because of travelling.

@maelle
Copy link
Member

maelle commented May 16, 2017

Thanks @stephlocke and @hrbrmstr for agreeing to review this package!

Here are links to our review template and to our reviewing guide.

@hrbrmstr
Copy link

Gracias. Starting this a bit later this afternoon for a change of pace from tracking global cybercrime ;-)

@maelle
Copy link
Member

maelle commented Jun 2, 2017

@hrbrmstr friendly reminder that your review is due on next Wednesday, June the 7th 😉

@maelle
Copy link
Member

maelle commented Jun 7, 2017

@hrbrmstr friendly reminder that your review is due today 😸

@maelle
Copy link
Member

maelle commented Jul 7, 2017

@hrbrmstr @stephlocke friendly reminder about your reviews

@hrbrmstr
Copy link

hrbrmstr commented Jul 24, 2017

cyphr Package Review

Package Naming

cyphr conforms with the rOpenSci requirements.

Function/variable naming & general syntax

cyphr uses snake case and has no function conflicts with other popular packages. It does have similar function names to those in other packages (e.g. openssl) but that should not be an issue as this is meant to be a more user-friendly and centralized wrapper for other encryption functions. Little benefit (if any) would be gained from a cyphr_ prefix for them. Furthermore, the documentation for the package encourages use of the cyphr:: full namespace prefix which should futher obviate any possible confusion or collisions.

README

I have submitted a PR to Rich with one spelling fix and a move of the "Installation" section to the rOpenSci suggested position in README's to conform with the the guidelines.

The README provides an excellent overview the intent, usage & utility of the package.

Code of Conduct

cyphr conforms with the rOpenSci requirements.

Documentation

All exported package functions are fully documented (in roxygen2 format) with examples.

I have submitted a PR to Rich with a bare-bones cyphr package-level documentation snippet. It likely does not need to be more than this since there is extensive documentation for the package in the vignettes that will end up being a "must read" for any new user(s).

cyphr contains two, thorough vignettes that will be extremely helpful for users who are not cybersecurity experts. Rich does a great job explainin "what", "why", and "how".

News

I have submitted a PR to Rich with a basic NEWS file (NEWS.md).

Authorship

cyphr conforms with the rOpenSci requirements.

Testing

cyphr passed checks on my local macOS and Ubuntu systems and I have reviewed & confirmed the same on appveyor and travis.

cyphr uses testthat and has an extensive test harness that fully exercises & covers all the funtionality of the package.

cyphr has 100% code coverage reported on codecov. #nocov is used (appropriately) five (5) times

cyphr passes all goodpractice checks.

The words/identifiers flagged with devtools::spell_check() are all fine.

Versioning

cyphr uses semantic versioning.

Continuous integration

cyphr uses travis and appveyor but there are only Windows and Linux checks. All relevant badges are present.

Examples

Most users will likely need to reference and rely on the vignettes which (as stated previously) are extensive & comprehensive. Not all functions have examples, but many of them would require duplicating substantial content from the vignettes which seems counter-productive. To that end, the PR with the package-level documentation provides links to the vignettes to further encourage users to review the before using the package.

Package dependencies

cyphr conforms with the rOpenSci requirements.

Recommended scaffolding

N/A

Console messages

cyphr conforms with the rOpenSci requirements.

Miscellaneous

cyphr conforms with the rOpenSci recommendations.

General Review

The use of session keys is nothing short brilliant. Rich has also done a great job doing everything possible to help users avoid shooting themselves in the foot (feet, since multiple?). It's hard to stop R from being insecure while running but there's more than sufficient guide-rails to prevent overt damage (i.e. data loss) or improper use.

The "registration" capability will also make it possible for authors of other packages to provide encrypt/decrypt functionality for their import/export functions. It's a very clever "hack".

openssl and sodium are great packages but cyhpr makes cryptography far more accessible to the average user with the high level wrapper functions. The "dropbox" use-case used throughout the vignettes put the package use into a very common context that users should find more than adequate to get up and running if they have sensitive data.

The documentation also provides overall solid security guidance without getting "preachy". I likely would have been a bit more forceful (i.e. said "NEVER use ssh keys without a passphrase!!") but that's likely not going to change behaviour whereas Rich's thoughtful, plain-English description about the threat vector for such a practice should appeal to the vast majority of users.

NOTE: (The following is just for communication and should not prohibit the rest of the onboarding process) I haven't really tried "breaking into" other folks' RStudio Server sessions yet as I really haven't had a use-case (apart from trying to get R environment variables) to do so. Attempting to grift cyphr session keys would seem to be the first, real use-case. I'm not sure how likely it is that someone with sensitive data will be using a shared computing environment like RStudio Server, but if it's possible for anyone to connect to an R session in an RStudio Server environment then one could grab the session key. It might be worth cc'ing Kevin (Ushey) on this part of the discussion to see if it's worth the time to try such an attack. I only bring it up as this package is intended to secure data analyses and if there are still open attack vectors we should inform users who likely aren't thinking about them since they are just trying to Get Stuff Done.

"Safety" Review

(my initial attempt at a set of checks for malicious or user/system-unsafe code which I shld really get into a checklist format for rOpenSci soon. since this package is going to handle sensitive data i thought it would be good to test-drive said checks on it.)

  • No directly compiled code in the package so no hidden malicious code at the binary level
  • No obfuscated malicious R code
  • No network access calls
  • system2() calls are not obfuscated and are appropriate
  • .onLoad() functionality is appropriate and non-harmful to end users

I tried to "hack" rewrite_register() and was able to successfully remove utils::read.csv from the package environment which enabled me to have my stealth-registered malicious::read.csv called instead (with a bare call on the order of cyphr::decrypt(read.csv("/tmp/file.csv"), key)). It should likely be strongly suggested to the user that they always namespace prefix (::) i/o calls in encrypt()/decrypt() and/or those functions should pretty much not execute w/o prefixed calls. This is not a show-stopper, but I'm trying to think like an attacker who really wants data. There are likely easier ways to get said data, too, but this is a "security" package and one could steal stuff from many places if they convinced folks to use their malicious package (or subverted a legit pkg) in conjunction with this one.

Code Review

  • Good+thorough use of function argument assertions
  • The documentation inside the package (internal docs) is good but there are 3 TODOs. None of them appear to be show-stoppers.
  • In light of the one "attack" vector listed above, it might be worth having rewrite_register() being verbose output-wise when it's called and possibly have encrypt and decrypt be verbose about which function call it actually ran in the event folks don't namespace prefix (either that or force namespace prefixing). I would have put the messaging in, but I didn't want to impose my will upon Rich.

@maelle
Copy link
Member

maelle commented Jul 24, 2017

Thanks a lot for your review @hrbrmstr ! 😁 Could you add how many hours you spent doing it (for our data collection)?

@kevinushey
Copy link

I tried to "hack" rewrite_register() and was able to successfully remove utils::read.csv from the package environment which enabled me to have my stealth-registered malicious::read.csv called instead (with a bare call on the order of cyphr::decrypt(read.csv("/tmp/file.csv"), key)). It should likely be strongly suggested to the user that they always namespace prefix (::) i/o calls in encrypt()/decrypt() and/or those functions should pretty much not execute w/o prefixed calls. This is not a show-stopper, but I'm trying to think like an attacker who really wants data. There are likely easier ways to get said data, too, but this is a "security" package and one could steal stuff from many places if they convinced folks to use their malicious package (or subverted a legit pkg) in conjunction with this one.

I might be misunderstanding what you're poking at here, but for what it's worth, anyone with access to your R session (ie: malicious R packages) can easily overwrite any function you have with e.g.

ns <- asNamespace("utils")
unlockBinding("read.csv", ns)
assign("read.csv", function(...) stop("pwned"), envir = ns)
lockBinding("read.csv", ns)

ns <- as.environment("package:utils")
unlockBinding("read.csv", ns)
assign("read.csv", function(...) stop("pwned"), envir = ns)
lockBinding("read.csv", ns)

read.csv("hello.txt")

I think that cyphr then has to operate under the assumption that the rsession it's running in has not been compromised, as I don't think there's much that can be done here (but please correct me if there's something I might be missing).

On the RStudio side, we'd definitely be interested because we certainly don't want to enable new attack vectors in addition to what's already available on the R side :)

@hrbrmstr
Copy link

hrbrmstr commented Jul 24, 2017 via email

@hrbrmstr
Copy link

image

I'm speculating (from that really quick view with wireshark) that I'd need the session cookie to be able to do anything real to the rsession itself. I wish I had more cycles to poke at it w/o bugging you @kevinushey ;-)

so we shld prbly add some guidance around ensuring proper user permissions on the server, not having certain tools installed on said servers and encourage folks to run RStudio Server over TLS if they are indeed working with sensitive data in a shared server environment. that may seem obvious but most folks are just trying to get stuff done vs think about attack scenarios.

@kevinushey
Copy link

Indeed, RStudio uses a secure cookie (established at login) to ensure that RPC requests from the browser are routed to the right place. (This goes along with other various safeguards on the server side, e.g. using unix domain sockets validated against the running user ID to ensure that you and only you can interact with your rsession process).

@hrbrmstr
Copy link

hrbrmstr commented Jul 24, 2017 via email

@maelle
Copy link
Member

maelle commented Jul 25, 2017

Great to see such an animated discussion in this thread!

@richfitz for info you can add a badge for rOpenSci peer-review to cyphr README via

[![](http://badges.ropensci.org/114_status.svg)](https://github.com/ropensci/onboarding/issues/114)

@stephlocke
Copy link

stephlocke commented Jul 26, 2017

Package Review

Please check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide

  • As the reviewer I confirm that there are no conflicts of interest for me to review this work (If you are unsure whether you are in conflict, please speak to your editor before starting your review).

Documentation

The package includes all the following forms of documentation:

  • A statement of need clearly stating problems the software is designed to solve and its target audience in README
  • Installation instructions: for the development version of package and any non-standard dependencies in README
  • Vignette(s) demonstrating major functionality that runs successfully locally
  • Function Documentation: for all exported functions in R help
  • Examples for all exported functions in R Help that run successfully locally
  • Community guidelines including contribution guidelines in the README or CONTRIBUTING, and DESCRIPTION with URL, BugReports and Maintainer (which may be autogenerated via Authors@R).

Functionality

  • Installation: Installation succeeds as documented.
  • Functionality: Any functional claims of the software been confirmed.
  • Performance: Any performance claims of the software been confirmed.
  • Automated tests: Unit tests cover essential functions of the package
    and a reasonable range of inputs and conditions. All tests pass on the local machine.
  • Packaging guidelines: The package conforms to the rOpenSci packaging guidelines

Final approval (post-review)

  • The author has responded to my review and made changes to my satisfaction. I recommend approving this package.

Estimated hours spent reviewing: 2


Review Comments

  • Well documented and consistent package!
  • I created a PR for some typos and to get through the CRAN policy of referring to other packages within apostrophes
  • @hrbrmstr's comments provide an excellent review of the security aspects of this package
  • A potential improvement to the documentation would be some diagrams, however, the vignettes do a great job anyway

@maelle
Copy link
Member

maelle commented Jul 27, 2017

Thanks a lot for your review @stephlocke ! 🐶

@richfitz
Copy link
Member Author

Thanks for the reviews and especially with PRs! I will probably get through these in detail next week

@maelle
Copy link
Member

maelle commented Aug 20, 2017

@richfitz do you have a timeline for the changes?

@richfitz
Copy link
Member Author

richfitz commented Aug 21, 2017

Thanks for the prod @maelle - things refuse to settle down at work!

I've merged the two PRs and added the badge to the README. The other substantive suggestions from the reviewers are:

  • to add some diagrams (@stephlocke) - this is a great suggestion, but realistically is unlikely to happen. My last attempt at this required significant amounts of elaboration in a talk and there's an art to getting this done. I will create an issue on the repo and tag it "help wanted" in case someone with more graphics skills than me would like to help. If @stephlocke or the editors feel strongly at this I can have a go, but it's likely to be a bit slow.
  • There are three TODO's (@hrbrmstr) - I had totally forgotten about these; one of them I think I should look into fixing (authenticated decryption with openssl). This should not take too long to implement - hopefully by the end of the week. The half-pairs one I think no longer makes sense, and the path improving one I also think can be skipped. I will tidy them out of the code though.
  • And @hrbrmstr's comment:

I tried to "hack" rewrite_register() and was able to successfully remove utils::read.csv from the package environment which enabled me to have my stealth-registered malicious::read.csv called instead (with a bare call on the order of cyphr::decrypt(read.csv("/tmp/file.csv"), key)). It should likely be strongly suggested to the user that they always namespace prefix (::) i/o calls in encrypt()/decrypt() and/or those functions should pretty much not execute w/o prefixed calls. This is not a show-stopper, but I'm trying to think like an attacker who really wants data. There are likely easier ways to get said data, too, but this is a "security" package and one could steal stuff from many places if they convinced folks to use their malicious package (or subverted a legit pkg) in conjunction with this one.

I think this falls into the situation where we must rely on all loaded packages not being malicious; there's nothing stopping a package adding a load hook that harvests ssh keys as they are unlocked. This is consistent with the points that Kevin raises above. I could move some of the unlocking into C at the cost of maintainability and clarity, but I think it would still be trivially circumvention by a malicious package.

At the same time, I think it would be interesting to explore what could be done to extend the lockEnvironment() approach that R has to namespacing to some sort of code signing (hey ho notary) to prevent tampering more generally. Then with something like that in place, we could replace the general pattern of a package environment for caching/extension with one that does verification. I'm not sure what bits of the puzzle would be needed, but this would require serious changes to base R I think. Obviously this falls out of scope of the package, but it seems something worth thinking about

@maelle
Copy link
Member

maelle commented Aug 21, 2017

Thanks @richfitz for the update and especially for having carried this out while being busy! I'll wait for you to say you've done that last thing (authenticated decryption with openssl) before asking for the reviewers' approval.

I'm also thinking of how to keep this security discussion going once we'll have closed this issue, maybe via Discuss / a channel in the Slack?

@richfitz
Copy link
Member Author

OK, I have made the last set of changes and pushed to github! Please let me know if there's anything I have missed

@maelle
Copy link
Member

maelle commented Aug 31, 2017

Thanks a lot for the update @richfitz!

@stephlocke @hrbrmstr can you look at Rich's work and answer the last question "The author has responded to my review and made changes to my satisfaction. I recommend approving this package." in your reviews? Thanks a lot in advance.

@stephlocke
Copy link

Done

@maelle
Copy link
Member

maelle commented Sep 1, 2017

Thanks @stephlocke!

@maelle
Copy link
Member

maelle commented Sep 30, 2017

@hrbrmstr could you have a look at @richfitz's answer and tell us whether it addresses all your comments? Thanks!

@sckott
Copy link
Contributor

sckott commented Oct 19, 2017

hey @hrbrmstr - did Rich address all your concerns? yea or nah?

@sckott sckott self-assigned this Oct 24, 2017
@maelle
Copy link
Member

maelle commented Oct 29, 2017

Not really part of the review, my "dummy-friendly" issues

@sckott sckott removed their assignment Nov 10, 2017
@maelle
Copy link
Member

maelle commented Nov 12, 2017

Further reviewer @jeroen, thanks Jeroen for accepting to have a look at the changed package! 😸

@hrbrmstr
Copy link

should be done in a cpl mins

@hrbrmstr
Copy link

Official response: The author has responded to my review and made changes to my satisfaction. I recommend approving this package

Commentary: I agree w/Rich that there are definitely items I mentioned that are out of scope for this package (since they cannot be solved by the package alone). I've Evernoted this issue and added it to the overall "security & safety TODO" for R work.

@maelle
Copy link
Member

maelle commented Nov 13, 2017

Thanks @hrbrmstr! 😸

@jeroen we won't need your review after all, sorry but thanks for your motivation!

@richfitz your package is now approved, thanks a lot to @stephlocke and @hrbrmstr for their reviews!

To-dos:

  • Transfer the repo to the rOpenSci organization under "Settings" in your repo. I have invited you to a team that should allow you to do so. You'll be made admin once you do.
  • Add the rOpenSci footer to the bottom of your README
[![ropensci_footer](https://ropensci.org/public_images/ropensci_footer.png)](https://ropensci.org)
  • Fix any links in badges for CI and coverage to point to the ropensci URL. (We'll turn on the services on our end as needed)

@maelle
Copy link
Member

maelle commented Nov 28, 2017

@richfitz will you soon transfer your repo? 😉

@richfitz
Copy link
Member Author

Sure - I'll do that now. This is going into ropensci or ropenscilabs?

@maelle
Copy link
Member

maelle commented Nov 28, 2017

Cool! ropensci, we no longer have the transitional onboarding at ropenscilabs. :-)

@richfitz
Copy link
Member Author

done! 🎉

@maelle
Copy link
Member

maelle commented Nov 28, 2017

Awesome, thanks again for your submission!

@maelle maelle closed this as completed Nov 28, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

6 participants