Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add magmaR to this repo #64

Closed
wants to merge 2 commits into from
Closed

Add magmaR to this repo #64

wants to merge 2 commits into from

Conversation

dtm2451
Copy link
Contributor

@dtm2451 dtm2451 commented Nov 12, 2020

Add the magma <=> R client, R package magmaR, which I have been working on for a bit now.

@dtm2451
Copy link
Contributor Author

dtm2451 commented Nov 12, 2020

See mountetna/monoetna#139 for previous discussion, but I will comment soon to continue ongoing discussion here.

@dtm2451
Copy link
Contributor Author

dtm2451 commented Nov 12, 2020

Re: mountetna/monoetna#139 (comment), I recognize the value in having a centralized pointer to which Etna environment in most cases, but I think that magmaR is unique in that it will ALSO be distributed outside of Etna itself. So I'd like help in deciding the path forward. @coleshaw @graft

Context:

@graft's comment

Just a note - there are a bunch of places where you reference ucsf.edu [in magmR] - I would generally like to consider this as a bad smell; if the code is free of the string “ucsf.edu” then it means that there must exist a proper configuration path for specifying instance-specific path names.

Term definition: I’m going to use “Etna environment” to refer generically to a (privileged) user’s current production/staging/dev target.

Places with "ucsf.edu":

  • (One location is my email within the DESCRIPTION file and I’d like to leave that as the currently given email, so not worth discussing.)
  • Two other locations that both have the same goal: setting the url.base for the call to either /retrieve or /query, which is precisely what controls whether the call goes to a particular production vs staging vs dev Etna environment. I can abstract these 2 further into a single location by making a .perform_RCurl_request() internal function, but the pointer would still remain so that has little bearing on the goal of this discussion.

Relevant uniqueness of magmaR compared to previously build pieces

There are two major use cases for the R package magmaR which differ in where they will be used, and how they will be installed:

  1. Within the DL calculation service; installed via the Etna build/docker-image (once we built it to do so)
  2. Locally on a user’s personal system; installed by the user to their personal computer’s or cluster’s R environment.

For 1, I agree with the intended goal that it would be useful to embed a way to have a set url.base updated at install based on which Etna environment. That would save us from needing to give url.base = [staging/dev url] in literally every magmaR call.
But for 2, it’s best to keep the user-control offered by my current system; the user R environment is not going to be altered by an Etna environment switch, so such a control point is needed.

Potential Paths Forward

I can think of 2 options here, but there may be others:

  1. I build in some magmaR_options() control which holds the default for url.base (and perhaps verbose or other future options). This could be set to run a single time, at the start of an R-session, and then would remove the need to give url.base in subsequent calls.
  2. We set up this default to be auto-adjusted when installed through Etna for a non-production environment, via the method Cole was referring here Add magmaR to the repo monoetna#139 (comment)

I prefer 2 because it wouldn’t complicate the package for users who will only ever look to production (most users), and also because default_adjustment_functions, like for 1, use non-standard methods that might complicate an eventual CRAN submission. But I would need help setting 2 up.

@coleshaw
Copy link
Collaborator

Thanks for writing this up! I have a couple of thoughts:

  1. Totally unrelated to the domain-name issue, but I'm curious what the reasoning is to create this PR on the etna repository instead of in monoetna? There isn't a technical issue with doing that (they are broken up in separate repositories so we can develop on a single repo or multiple as needed) -- however, it might get confusing depending on how you plan on installing the dev version or distributing it internally (pre-CRAN install). For example, you gave me instructions to install via remotes::install_github("mountetna/monoetna@magmaR", subdir = "magmaR"). Unfortunately, when you edit code in etna, it doesn't automatically get pushed to monoetna (we do have hooks that work the other way around, though...so when you merge code in monoetna, it automatically gets pushed to etna). There is a script that you'd have to manually run to get the etna code pushed back to monoetna. To avoid having to remember that manual step, it might be worth re-pushing this PR to monoetna. If you prefer sticking to a single repository so that you don't have to touch the other stuff (because it's out of scope for what you're doing, anyway), that's definitely okay too! Sorry for the long point, hope that makes sense and happy to clarify or answer any questions!

  2. Just want to caveat this point by saying that I'm not really sure how people use R packages and configure them, especially against web services (since my impression is that most R work is done locally). Would it be worth using the same etna.yml that we're using for the command-line tool, and initially just making the assumption that users have also installed the CLI? The default behavior of magmaR could be to pull the production environment from that file. Future work could then create that file if it doesn't exist by pulling from polyphemus the same way. Is this like an altered version of option 2?

I think this is going to go back to the same discussion with @corps around ease-of-initial-use and trying to minimize configuration for the command-line interface ... however that looks in the R context. One idea we had floated during the CLI discussion was, is it possible to have a single-line prompt for an R package installation (not just an R session, but like one-time forever), that prompts the user for the default domain to use? I guess like a more permanent version of option 1 above.

@coleshaw
Copy link
Collaborator

Oh, sorry, I re-read @graft 's comment and I see why you pushed this PR here :-) Haha, sorry!

I think he was suggesting that you move the magmaR code into a sub-folder in mountetna/monoetna, like monoetna/etna/packages/magmaR/, (which you are doing in this PR!), but to also keep developing in the monoetna repository. Because of our auto-push hooks, the code will make its way to this repository automatically, but it is a one-way push, as I noted above...

@dtm2451 dtm2451 closed this Nov 17, 2020
@dtm2451 dtm2451 deleted the magmaR branch November 17, 2020 19:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants