Testing against a live Dataverse #40

Closed · wibeasley opened this issue Jan 2, 2020 · 14 comments

@wibeasley (Contributor) commented Jan 2, 2020

There needs to be a different approach to initiating the test suite. Right now, tests that should fail still pass, because testthat::test_check() won't run at all if the API key isn't found as an environment variable.
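
For illustration, here's a minimal sketch of the kind of guard I have in mind (the DATAVERSE_KEY variable name and helper are placeholders, not the package's current code):

# tests/testthat/helper-credentials.R (sketch; names are placeholders)
skip_if_no_api_key <- function() {
  if (identical(Sys.getenv("DATAVERSE_KEY"), "")) {
    testthat::skip("DATAVERSE_KEY is not set; skipping live-server tests.")
  }
}

# Each live test then opts in explicitly:
test_that("dataset metadata can be retrieved", {
  skip_if_no_api_key()
  # ...calls against the live server go here...
})

That way a missing key shows up as a skipped test instead of a silent pass.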

I'm open to ideas as always. Currently I'm thinking:

  1. Test only against demo.dataverse.org. (A few weeks ago @pdurbin advocated this in a phone call for several reasons, including that Dataverse's retrieval stats won't be misleading; right now one article gets hundreds of hits a month just from automated tests.)

  2. Create a (demo) Dataverse account dedicated to testing. At this point, I don't think its API key needs to be kept secret; it could even be set directly in tests/testthat.R.

    @pdurbin, will you please check my claim, especially from a security standpoint?

  3. If the above is safe, the API key might be kept in a YAML file in the inst/ directory (see the sketch after this list).

  4. If the API key to the demo server needs to be protected,

    1. we could save it as Travis environment variables (ref 1 & ref 2), but

    2. that would prevent other people from testing the package on their own machines, so we'd get fewer quality contributions from others.
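
Here's the sketch mentioned in item 3; the file name, field, and fallback are hypothetical:

# R/utils-key.R (sketch only; secrets.yaml and demo_api_key are hypothetical)
get_demo_api_key <- function() {
  path <- system.file("secrets.yaml", package = "dataverse")
  if (nzchar(path)) {
    yaml::read_yaml(path)$demo_api_key
  } else {
    Sys.getenv("DATAVERSE_KEY")  # fall back to an environment variable
  }
}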


@skasberger, @rliebz, @tainguyenbui, and any others, I'd appreciate any advice from your experience with pyDataverse, dataverse-client-python, and dataverse-client-javascript. I'm not experienced with your languages, but it looks like pyDataverse doesn't pass an API key, while dataverse-client-python posts its API key to the demo server.


(This is different from #4 & #29, which involve the battery of tests/comparisons, not the management of API keys or how testthat is initiated.)

@wibeasley self-assigned this Jan 2, 2020

wibeasley added three commits that referenced this issue Jan 2, 2020. One of the commit messages notes:

This interferes with all the assignments that happen within the function parameter defaults, and also the tests.

ref #40

@pdurbin (Member) commented Jan 2, 2020

@wibeasley my first thought is that you could create a one-off user for every run on the demo site like this:

curl -d @user-add.json -H "Content-type:application/json" "$SERVER_URL/api/builtin-users?password=$NEWUSER_PASSWORD&key=$BUILTIN_USERS_KEY"

That's from http://guides.dataverse.org/en/4.18.1/api/native-api.html#create-a-builtin-user

I just tested it on the demo server and it worked with these environment variables:

export SERVER_URL=https://demo.dataverse.org
export NEWUSER_PASSWORD=password1
export BUILTIN_USERS_KEY=burrito

You'd have to vary the JSON you send each time to avoid errors about non-unique usernames or email addresses. In the link above this is what we provide as an example:

{
  "firstName": "Lisa",
  "lastName": "Simpson",
  "userName": "lsimpson",
  "affiliation": "Springfield",
  "position": "Student",
  "email": "lsimpson@mailinator.com"
}

@tainguyenbui commented:

Is there any chance that the real environments don't have to be hit? You could create mocks and just make sure that the HTTP request is being made with the right parameters.
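
For instance, a rough sketch with R's httptest package (the function call and URL below are only illustrative, not the package's actual tests):

library(testthat)
library(httptest)

without_internet({
  test_that("get_file() issues the expected request", {
    expect_GET(
      dataverse::get_file(1234, server = "demo.dataverse.org"),
      "https://demo.dataverse.org/api/access/datafile/1234"
    )
  })
})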

@pdurbin (Member) commented Jan 2, 2020

Oh, and to be clear, the JSON response you get back should include the API token, which you'd use for subsequent operations. Using jq, you could grab it like this:

jq '.data.apiToken'

But I assume you'd want to implement all of this "create user and assign the API token to a variable" stuff in R.
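
If it helps, here's a rough sketch of that flow in R with httr; the user fields are placeholders, and the environment variables are the same ones as in the curl example above:

library(httr)

server <- Sys.getenv("SERVER_URL", "https://demo.dataverse.org")

# Randomize the username and email so repeated runs don't collide.
suffix <- paste(sample(letters, 8, replace = TRUE), collapse = "")
new_user <- list(
  firstName   = "Test",
  lastName    = "User",
  userName    = paste0("testuser_", suffix),
  affiliation = "dataverse-client-r",
  position    = "Automation",
  email       = paste0("testuser_", suffix, "@mailinator.com")
)

resp <- POST(
  url    = paste0(server, "/api/builtin-users"),
  query  = list(
    password = Sys.getenv("NEWUSER_PASSWORD"),
    key      = Sys.getenv("BUILTIN_USERS_KEY")
  ),
  body   = new_user,
  encode = "json"
)

api_token <- content(resp)$data$apiToken  # equivalent of jq '.data.apiToken'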

@wibeasley (Contributor, Author) commented Jan 4, 2020

During a meeting Friday, @pdurbin and I tentatively planned that:

  • the initial test dataverses will be static on the demo server (e.g., https://demo.dataverse.org/dataverse/dataverse-client-r), as described in the initial post. As long as we're using only the demo server, we don't think it's necessary to keep the token secret, so we don't have to deploy environment variables to Travis (sketch below).
  • accumulate tests until (a) we have good code coverage and decent corner-case coverage, and (b) the design is stable
  • add some dynamic features that follow @pdurbin's preference for ephemeral resources, such as temporary users and dataverses. This will help the package be deployed to things like Jenkins.
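
Concretely, the first bullet could look something like this in tests/testthat.R (a sketch; the token below is a placeholder, and I'm assuming the client reads DATAVERSE_SERVER and DATAVERSE_KEY):

# tests/testthat.R (sketch; the token is a placeholder, not a real key)
library(testthat)
library(dataverse)

Sys.setenv(
  DATAVERSE_SERVER = "demo.dataverse.org",
  DATAVERSE_KEY    = "00000000-0000-0000-0000-000000000000"
)

test_check("dataverse")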

@tainguyenbui, right now I think the mocks might be overkill for the current goals. I do appreciate that they could isolate problems in the client from problems in the server software. But the server seems pretty stable, and might require less maintenance than whatever mock I develop. Tell me if you think I'm overlooking something important.

@skasberger commented:

Some thoughts from me (pyDataverse).

I think it would be great to have a test instance somewhere with the latest Dataverse version, so the clients can be tested there before (or after) a release. Whether there should be one user for all clients or one per client, I don't know.

pyDataverse passes an API key if one is supplied when the Api() object is initialized.

@pdurbin (Member) commented Jan 10, 2020

Related is the idea of setting up a beta server that runs "develop" - IQSS/dataverse.harvard.edu#20

Also, a new instance of Dataverse is spun up after every pull request is merged at https://jenkins.dataverse.org/job/IQSS-dataverse-develop/ but then it gets terminated a few hours later.

Finally, https://demo.dataverse.org always runs the latest release. It's the officially blessed server for testing releases: http://guides.dataverse.org/en/4.18.1/api/getting-started.html#servers-you-can-test-with

@adam3smith (Contributor) commented:

If I understand this correctly and the decision is made, could you publish the dataverse on demo? I don't like the idea of writing tests that will have to be rewritten.

@pdurbin (Member) commented Jan 10, 2020

@adam3smith I assume your question is for @wibeasley

You both have my blessing to publish whatever you want on https://demo.dataverse.org , especially if you're testing dataverse-client-r! 😄 🎉

@adam3smith (Contributor) commented:

Yes, this is referring to https://demo.dataverse.org/dataverse/dataverse-client-r, which is unpublished; sorry for the confusion.

@kuriwaki (Member) commented:

Are there (or can we make) datasets on demo.dataverse.org that are permanent and can be used for testing the data download functions?

The current get_file tests read from a DOI that no longer exists.

For testing, it would be good to have Stata (.dta), SPSS (.sav), and .csv files, as well as some non-tabular data (like an R script and PDFs).
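
For example, something like this is what I'd want the download tests to look like (the DOI and file names are placeholders until a permanent dataset exists):

test_that("tabular and non-tabular files can be downloaded", {
  skip_if(Sys.getenv("DATAVERSE_KEY") == "", "No API key set; skipping live download test.")

  doi <- "doi:10.5072/FK2/EXAMPLE"  # placeholder persistent identifier

  dta <- get_file("nlsw88.dta",   dataset = doi, server = "demo.dataverse.org")
  pdf <- get_file("codebook.pdf", dataset = doi, server = "demo.dataverse.org")

  expect_true(is.raw(dta))
  expect_true(is.raw(pdf))
})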

@kuriwaki (Member) commented:

Scratch that, I created the dataset in the dataverse-client-r dataverse mentioned in #65. We can put all test data in https://demo.dataverse.org/dataverse/dataverse-client-r, with a separate "dataset" for each topic. For example, I put my Stata test data in "National Longitudinal Study of Young Women - Example Dataset" within dataverse-client-r.

@skasberger commented:

Some general thoughts from me on how to move forward:

  1. Work out a test strategy: what functionality is critical, how to test it, etc. Again, this should be the same for each client.
  2. Define requirements: a) Create a gold standard for metadata and files. The idea is to have 3-4 different datasets with files attached that are representative and can be used for different testing purposes (unit, integrity); these resources can then be used by all clients. b) Find out which Dataverse versions have which endpoints available, and test this. c) Determine which API responses can be expected (status codes and data), for exception handling.
  3. Implement the tests (differently for each client).
  4. Set up Dataverse instances where the needed test data is available and can be created/manipulated; develop and latest would be nice.

I have done some work for 1 + 2 (develop branch of pyDataverse), and will do a total overhaul of all the points mentioned above in the next 2 months for a major release. Maybe a call to discuss the different things mentioned would be a great starting point, so the resources created (e.g. metadata JSON) are useful for all and can be shared. My strategy is to work with a Docker instance locally for development (with which I can switch easily from one Dataverse version to another) and work out full and minimal metadata plus representative test data. I have done some parts of this already, but the AUSSDA test data is so far not public (GDPR checks missing). We will also set up a Dataverse instance for pyDataverse testing, but still, testing against different Dataverse versions is not easy to maintain. Then again, maybe this does not need to be done regularly.

So, this is quite tricky and complex, and I have been thinking about it for a long time. From my point of view, having a call to talk about this together would be the most efficient way forward, as more brains reduce errors and the amount of work. :)
What do you think about that?

@kuriwaki changed the title from "tests & API keys" to "Testing against a live Dataverse" on Dec 21, 2021

@kuriwaki (Member) commented:

From my point of view, having a call to talk about this together would be the most efficient way forward, as more brains reduce errors and the amount of work. :)

@wibeasley, @skasberger, and I met in Jan 2021 to discuss this. The setup we have landed on for the current CRAN submissions seems stable to me: use a demo dataverse and run daily checks on it through GitHub Actions instead of CRAN (#96). We would still need to get better test coverage (#4) and perhaps consider Jenkins (#22).
