Allow non-interactive use of load_dataset() #19

Closed
richierocks opened this issue Aug 6, 2019 · 8 comments
Comments

@richierocks

Currently load_dataset() calls printer(), which in turn calls menu(), which throws an error if R is run non-interactively.

See

An example of wanting to use this function non-interactively is including the datasets in a Docker image.

I think that in a non-interactive session you can always assume that the user wants to download the dataset, so a possible reworking of printer() might be something like the following.

printer <- function(name) {
  info_name <- print_info[[name]]
  if(interactive()) {
    cat("Do you want to download:\n",
      "Name:", info_name[["name"]], "\n",
      "URL:", info_name[["url"]], "\n",
      "License:", info_name[["license"]], "\n",
      "Size:", info_name[["size"]], "\n",
      "Download mechanism:", info_name[["download_mech"]], "\n"
    )
    menu(choices = c("Yes", "No"))
  } else {
    cat("Downloading:\n",
      "Name:", info_name[["name"]], "\n",
      "URL:", info_name[["url"]], "\n",
      "License:", info_name[["license"]], "\n",
      "Size:", info_name[["size"]], "\n",
      "Download mechanism:", info_name[["download_mech"]], "\n"
    )
    1
  }
}

That is, the message is changed from "Do you want to download" to "Downloading" and menu() is replaced by always returning 1.

@juliasilge
Contributor

This will require some careful thought, because the lexicon creators who agreed to have their work included did so on the understanding that each user agrees to the license (no commercial use, etc.) at download time. The proposed alternative does not sound like it falls within the parameters we set up with the lexicon creators.

@EmilHvitfeldt
Owner

Hello @richierocks,

textdata was built to ensure that the user is forced to agree to the terms of the datasets. Allowing the functions to download the data non-interactively wouldn't work.

However, it is worth noting that interactivity is only needed the first time the data is accessed. So you could call the functions once to agree to the conditions of the data, and then run subsequent analyses non-interactively.
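
A minimal sketch of that workflow, assuming the dataset was already downloaded once in an interactive session: a script can check that the cache folder is populated before calling the loader. `has_cached_files()` is a hypothetical helper, not part of textdata, and a populated folder is only a rough indicator that no prompt will appear.

```r
# Hypothetical helper (not part of textdata): TRUE if `path` is an existing
# folder that contains at least one file.
has_cached_files <- function(path) {
  dir.exists(path) && length(list.files(path, recursive = TRUE)) > 0
}

if (requireNamespace("textdata", quietly = TRUE)) {
  # return_path = TRUE returns the folder textdata uses instead of the data.
  afinn_dir <- textdata::lexicon_afinn(return_path = TRUE)

  if (has_cached_files(afinn_dir)) {
    # Data appears to be on disk already, so no prompt is expected.
    afinn <- textdata::lexicon_afinn()
  } else {
    message("AFINN is not cached yet; run textdata::lexicon_afinn() once interactively.")
  }
}
```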

Please note that some of the datasets (especially the lexicons) come with a "do not redistribute" condition, meaning that you wouldn't be able to put the data in a Docker image for someone else to use, as they haven't agreed to the conditions of use.

@richierocks
Author

In that case, I have 2 questions:

  1. Would you be happy with adding an accept_license argument to load_dataset(), like

load_dataset <- function(data_name, name, dir, delete, return_path,
                         accept_license = printer(name) == 1) {
  # printer() returns the menu() choice, where 1 means "Yes"
  if(!isTRUE(accept_license)) {
    stop("You need to accept the license before you can use this dataset.")
  }
  # rest of function as before
}

That way the user has to explicitly say they are accepting the license by passing accept_license = TRUE.

  2. Failing that, is it possible to make lexicon_nrc() work with the commercial version of the lexicon? That is, if I buy a copy, how do I make lexicon_nrc() (and therefore tidytext::get_sentiments()) make use of it?
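
To make the first question concrete, here is a standalone sketch of the explicit opt-in pattern. It is purely illustrative: `check_license()` and its `ask` argument are hypothetical names, and nothing here is textdata's actual API.

```r
# Hypothetical sketch of the proposal; not textdata's actual API.
# `ask` stands in for the interactive prompt and must return TRUE or FALSE.
check_license <- function(accept_license = NULL,
                          ask = function() utils::menu(c("Yes", "No")) == 1) {
  if (isTRUE(accept_license)) {
    return(invisible(TRUE))  # explicit opt-in, no prompt needed
  }
  if (isFALSE(accept_license) || !interactive()) {
    # Explicit refusal, or no way to ask the user.
    stop("You need to accept the license before you can use this dataset.",
         call. = FALSE)
  }
  if (!ask()) {
    stop("You need to accept the license before you can use this dataset.",
         call. = FALSE)
  }
  invisible(TRUE)
}
```

With this shape, a script would call `check_license(accept_license = TRUE)`, while an interactive user who passes nothing would still see the prompt.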

@EmilHvitfeldt
Owner

I'm still not comfortable with adding an option to allow downloads without the prompt.

If you buy a copy of the data, you can place it in textdata's search path, and the preprocessing and delivery will then happen without the prompt. You can do this in one of two ways.

  1. Place the data in the default path. This depends on your operating system, but can be found by running textdata::lexicon_afinn(return_path = TRUE).

  2. Place the data in a folder of your choosing and direct textdata to use that directory by specifying the dir argument: textdata::lexicon_afinn(dir = "my-data-folder")

@umasenthil

umasenthil commented Nov 4, 2019

I am completely new to R. I downloaded the AFINN dataset and tried to change the directory for the AFINN data. It still prompts me to download AFINN. Is this expected? Thanks!

` textdata::lexicon_afinn(dir = "/Users//Downloads/AFINN")
Do you want to download:
Name: AFINN-111
URL: http://www2.imm.dtu.dk/pubdb/views/publication_details.php?id=6010
License: Open Database License (ODbL) v1.0
Size: 78 KB (cleaned 59 KB)
Download mechanism: https

1: Yes
2: No
`

@EmilHvitfeldt
Owner

Hello @umasenthil,
Yes, this is the correct behavior. Changing the dir argument doesn't move the dataset; it only tells the function where to look.
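
A small sketch for checking where the function will look for a given dir value. `inspect_folder()` is a hypothetical helper and "my-data-folder" is a placeholder path; the combination of dir and return_path is assumed to report the lookup folder without prompting.

```r
# Hypothetical helper: summarise what currently sits at a lookup folder.
inspect_folder <- function(path) {
  list(path = path, exists = dir.exists(path), files = list.files(path))
}

if (requireNamespace("textdata", quietly = TRUE)) {
  # Ask textdata which folder it would use for this dir, then inspect it.
  lookup_dir <- textdata::lexicon_afinn(dir = "my-data-folder", return_path = TRUE)
  str(inspect_folder(lookup_dir))
}
```

If the folder does not exist or does not hold the files textdata expects, the download prompt will still appear.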

@umasenthil

@EmilHvitfeldt Thank you for the clarification!

@umasenthil

@richierocks I tried your solution of overriding the printer() method. The afinn installation is still in an interactive mode:
Rscript --vanilla install_afinn2.R
Do you want to download:
Name: AFINN-111
URL: http://www2.imm.dtu.dk/pubdb/views/publication_details.php?id=6010
License: Open Database License (ODbL) v1.0
Size: 78 KB (cleaned 59 KB)
Download mechanism: https
Error in menu(choices = c("Yes", "No"), title = title) :
  menu() cannot be used non-interactively
Calls: get_sentiments -> <Anonymous> -> load_dataset -> printer -> menu
Execution halted

I am trying to automate R scripts.
Is there a way to make the 'afinn' dataset installation non-interactive?
Or is there a way to pass the user input as a parameter to Rscript?
Thank you
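
One way to make such a script fail with a clearer instruction, given that the prompt has to be answered interactively the first time, is to wrap the call and translate the menu() error. This is only a sketch: `load_or_explain()` is a hypothetical wrapper, and matching on the English error text shown above is an assumption that may not hold in other locales.

```r
# Hypothetical wrapper: run `loader` and turn the menu() failure into a
# clearer instruction. Other errors are passed through unchanged.
load_or_explain <- function(loader) {
  tryCatch(
    loader(),
    error = function(e) {
      if (grepl("cannot be used non-interactively", conditionMessage(e),
                fixed = TRUE)) {
        stop("Dataset not downloaded yet. Run the loader once in an interactive ",
             "R session to review and accept the license, then rerun this script.",
             call. = FALSE)
      }
      stop(e)
    }
  )
}

# Usage in a script (requires tidytext and textdata to be installed):
# afinn <- load_or_explain(function() tidytext::get_sentiments("afinn"))
```

This does not remove the prompt; it only makes the failure under Rscript easier to act on.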

jtr13 added a commit to jtr13/cc21 that referenced this issue Mar 26, 2021
set eval=FALSE in code chunks that must be run interactively and add note explaining why
EmilHvitfeldt/textdata#19
4 participants