GPII-3138: Move functionality of gpii-dataloader repo into universal #692
Conversation
Instead of building and running a docker container, move the relevant code into universal
|
CI job passed: https://ci.gpii.net/job/universal-tests/1206/ |
Even though it passed, I discovered that
I guess CI doesn't include it? |
…rsal docker image
Added check for the required number of command line arguments.
GPII-3138: Merged Dataloader code import from Stepan
|
Thanks @klown for merging in my work. I'll just paste my original PR comment below as it adds a bit of context (I can't update this PR's description): This PR moves dataloader from original Main reasons behind this:
Most of the code was in universal already, and This is a continuation of gpii-ops/gpii-dataloader#6 PR. Main changes:
|
No problem @stepanstipl . Well, I can edit this PR's description, so I copied/promoted the gist of it. |
Fixed minor problems from previous merge of Stepan's pull request.
|
@stepanstipl I cleaned up the merge issues -- not too many in the end. But a second pair of eyes wouldn't hurt. |
|
CI job passed: https://ci.gpii.net/job/universal-tests/1242/ |
|
CI job failed: https://ci.gpii.net/job/universal-tests/1243/ |
|
CI job passed: https://ci.gpii.net/job/universal-tests/1244/ |
documentation/DataLoader.md
Outdated
|
|
||
| - Converts the preferences in universal into `snapset` Prefs Safes and GPII Keys, | ||
| - Optionally deletes existing database, | ||
| - Creates a CouchDB database if none exits, |
Typo exists
and it's "exists" :-)
Fixed.
Never mind -- merging in Stepan's pull
documentation/DataLoader.md
Outdated
|
|
||
| ## Environment Variables | ||
|
|
||
| - `COUCHDB_URL`: URL of the CouchDB database. (required) |
Can we add prefixes to these (probably GPII_) to reduce chances of conflicts with other uses of the environment (and also as a hint/courtesy to anyone looking in the environment wondering how they got there)
Yes, that sounds like a good idea (although the chance of conflict is pretty much non-existent - data loader runs in its own container and there's nothing else expected to be running).
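For illustration, a minimal sketch of how prefixed variables could be read and validated at startup. The name `GPII_COUCHDB_URL` is an assumption (the documented `COUCHDB_URL` plus the suggested prefix), the helper `readDataLoaderConfig` is hypothetical, and treating only the string `"true"` as enabling `GPII_CLEAR_INDEX` is likewise an assumption rather than the behaviour of the actual scripts.

```javascript
"use strict";

// Hypothetical helper: reads data loader settings from an environment object.
// GPII_COUCHDB_URL is an assumed name, not one confirmed by this thread.
function readDataLoaderConfig(env) {
    var couchDbUrl = env.GPII_COUCHDB_URL;
    if (!couchDbUrl) {
        // Fail early so a missing required variable is obvious.
        throw new Error("GPII_COUCHDB_URL is required");
    }
    return {
        couchDbUrl: couchDbUrl,
        // Assumption: anything other than the literal string "true" means "do not clear".
        clearIndex: env.GPII_CLEAR_INDEX === "true"
    };
}
```

In a script this would be called as `readDataLoaderConfig(process.env)`; passing the environment in as an argument keeps the function easy to test.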
documentation/DataLoader.md
Outdated
| It does following: | ||
|
|
||
| - Converts the preferences in universal into `snapset` Prefs Safes and GPII Keys, | ||
| - Optionally deletes existing database, |
Puzzled that the option here is to delete the entire database rather than merely all of the documents of type "snapset"
There is a production test (test:vagrantProduction) that executes vagrantCloudBasedContainers.sh without its --no-rebuild flag. When first run locally, or when run by CI, there is no database to delete, so deleting the database is irrelevant at that point.
But, if a developer runs the test a second time, they can at their option start from scratch with an empty database, or use the --no-rebuild flag and modify the database in situ, which is closer to a production environment.
The question is whether the first option -- start from scratch -- is useful.
OK, I think this is just a documentation issue, since if I understand correctly, the action of steps 5 and 6 is actually the one of https://github.com/GPII/universal/blob/master/scripts/deleteAndLoadSnapsets.js - so we should just clarify the wording in the readme here to explain that steps 5 and 6 don't simply drop "snapsets and keys" but in particular only those keys which are associated with snapsets. I think it would be helpful for the comment here to explicitly link to or reproduce the comment at the head of the script https://github.com/GPII/universal/blob/master/scripts/deleteAndLoadSnapsets.js#L11 so that, for example, anyone invoking this script will do so in confidence that it will not delete user data (unless they enable GPII_CLEAR_INDEX, which should be supplied with a clear warning)
OK, I've added to the README -- take a look.
Ok great, I'm not doing anything about that one :). |
|
CI job passed: https://ci.gpii.net/job/universal-tests/1257/ |
- Called out that only views, and snapset PrefsSafes and their associated GPII keys, are deleted/updated (steps 4, 5, and 6),
- Explained usage of environment variables,
- Added a warning about GPII_CLEAR_INDEX,
- Explained usage difference between development vs. staging/production environments.
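To make the "snapsets and only their associated keys" distinction concrete, here is a rough sketch of the selection logic. It is not the code in `deleteAndLoadSnapsets.js`; the function name `selectSnapsetDocs` and the document fields (`type`, `prefsSafeType`, `prefsSafeId`) are assumptions made for the example.

```javascript
"use strict";

// Hypothetical document shapes (assumed for illustration):
//   Prefs Safe: { _id, type: "prefsSafe", prefsSafeType: "snapset" | "user" }
//   GPII Key:   { _id, type: "gpiiKey", prefsSafeId }
function selectSnapsetDocs(docs) {
    var snapsetIds = new Set();
    var selected = [];
    // First pass: collect the snapset Prefs Safes.
    docs.forEach(function (doc) {
        if (doc.type === "prefsSafe" && doc.prefsSafeType === "snapset") {
            snapsetIds.add(doc._id);
            selected.push(doc);
        }
    });
    // Second pass: collect only the GPII Keys that point at a snapset Prefs Safe.
    docs.forEach(function (doc) {
        if (doc.type === "gpiiKey" && snapsetIds.has(doc.prefsSafeId)) {
            selected.push(doc);
        }
    });
    return selected;
}
```

Under these assumptions, user Prefs Safes and their keys never enter the selection, which is the property the README wording is meant to reassure readers about.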
|
CI job failed: https://ci.gpii.net/job/universal-tests/1262/ |
It can't even create the VM: https://ci.gpii.net/job/universal-create-vm/782/console Trying again... |
|
ok to test |
|
CI job failed: https://ci.gpii.net/job/universal-tests/1263/ |
|
it's the very same It seems to be quite a common reason for failure (just a quick search - #638, #632, #534 and those are only the fails that actually copy-pasted the fail message). I've started looking into whether I can reproduce this failing test locally and do something about it, but I'm not very familiar with the code and it seems quite complex at first look. |
|
Thanks for investigating, @stepanstipl - actually I think that @the-t-in-rtf is inclining to believe that, as a result of failures like this one, we should abandon the use of PouchDB for our integration tests since it seems prone to erratic failures like this one. @the-t-in-rtf - does any reasonably straightforward means strike you for reducing the probability of these failures in the meantime? |
|
@amb26, my first suspicion is that we would see this less often if we weren't reusing a single data directory between runs, as configured here: My suggestion would be to update that to include unique information like |
|
Thanks @amb26 and @the-t-in-rtf I completely failed to reproduce the issue locally - even re-running the pouchManager tests 1000 times in a loop, and running several of those in parallel, I wasn't able to get that error. Even messing with the files in Adding a variable part to the dir name sounds like a good idea - I've done it in klown#6 against this branch - but as I wasn't able to reproduce the error locally, I can't say if it really helps :D. |
GPII-3138: Add unique ID to pouchManager temp test dir
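As a rough illustration of the idea (not the actual change in klown#6), a per-run directory name can be built from the process ID plus some random bytes. The helper name `uniqueTestDir` and the `pouchManager-test` prefix are made up for this sketch.

```javascript
"use strict";

var os = require("os");
var path = require("path");
var crypto = require("crypto");

// Hypothetical helper: returns a temp directory path that differs on every call,
// so leftover files from an earlier run cannot be picked up by a later one.
function uniqueTestDir(prefix) {
    var suffix = process.pid + "-" + crypto.randomBytes(8).toString("hex");
    return path.join(os.tmpdir(), prefix + "-" + suffix);
}
```

A test fixture would call something like `uniqueTestDir("pouchManager-test")` once at setup and reuse the result for that run only. The function only builds the path; creating and removing the directory is left to the caller.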
|
CI job passed: https://ci.gpii.net/job/universal-tests/1264/ |
|
ok to test |
|
@stepanstipl I've pulled in the variable directory name change, ran it locally, and all tests passed. They passed in CI as well. But, as you noted, I don't know if this fixes the issue.
|
|
@klown thanks for that, I just noticed it passed the CI. I was just trying to trigger re-run to try to get a better idea if it's just a random success or if it might have helped :D But I don't think my comment triggered anything. |
|
@stepanstipl Good idea to run the test a couple of times 👍 On a related note, I'm not sure you have the authority to trigger CI -- nothing seems to be happening. @amatas can we add @stepanstipl to the group, if he isn't a member? |
|
In the meantime, ok to test. |
|
CI job passed: https://ci.gpii.net/job/universal-tests/1266/ |
|
Third time's the charm, ok to test. |
|
CI job passed: https://ci.gpii.net/job/universal-tests/1268/ |
|
@klown ok, that looks optimistic 🤞... is there anything else to be done before we can merge this? |
|
Thanks @amb26 and congrats @stepanstipl |
|
So, just as an afternote, in viewing all the rimraf cleanup errors in the logs and remembering past problems, my theory is that rimraf doesn't consistently complete its cleanup on particular platforms, including the one that's used to test in CI, and that this results in dirty data directories when the same location is used over and over again in different test runs. Even though I am thinking of replacing express-pouchdb, it could still be relevant when working directly with PouchDB instances outside of express, as the same "cleanup promise with timeout" is used there. If it happens before we have an alternative approach, another option when working with express-pouchdb is to increase this timeout. |
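For readers unfamiliar with the "cleanup promise with timeout" pattern mentioned above, here is a generic sketch of it. This is not the project's implementation; `withTimeout` is a hypothetical name, and the point is only to show why a cleanup that is slow on some platform ends up reported as a failure once the timeout elapses.

```javascript
"use strict";

// Generic pattern: settle with the wrapped promise if it finishes in time,
// otherwise reject once `ms` milliseconds have elapsed.
function withTimeout(promise, ms) {
    var timer;
    var timeout = new Promise(function (resolve, reject) {
        timer = setTimeout(function () {
            reject(new Error("cleanup timed out after " + ms + "ms"));
        }, ms);
    });
    // Whichever promise settles first wins; always clear the timer afterwards.
    return Promise.race([promise, timeout]).finally(function () {
        clearTimeout(timer);
    });
}
```

Raising `ms` gives a slow cleanup more room to finish, at the cost of longer waits when the cleanup is genuinely stuck.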
|
Can @amb26 or someone else with permission cut a dev release of universal that includes this work? |
|
0.3.0-dev.20181026T103556Z.112fa057 |
|
Cheers! |
@stepanstipl @cindyli @mrtyler, Here is a first pass at moving the gist of what gpii-dataloader pull 6 does into universal. I'm making a separate pull request from pull 626 to avoid confusion. If that is the correct place to put it, I can move it easily.
It's not complete, but good enough for others to start commenting. In particular, the relevant parts of the README need to come over, and I haven't started that yet.
For more background, see this discussion on gpii-dataloader pull 6.
In addition, @stepanstipl created a similar pull request (#696) which has since been merged into this one. Here is a copy of his PR description:
This PR moves dataloader from original gpii-ops/gpii-dataloader repo into universal.
Main reasons behind this:
Most of the code was in universal already, and gpii-ops/gpii-dataloader contained basically only shell wrapper and the docker image. Work on new dataloader (GPII-3138 and #626) requires changes to the existing dataloader wrapper and therefore presented a good opportunity for the move.
This is a continuation of gpii-ops/gpii-dataloader#6 PR.
Main changes: