
GPII-3138: Move functionality of gpii-dataloader repo into universal #692

Merged
merged 27 commits on Oct 25, 2018

Conversation

6 participants
@klown (Contributor) commented Oct 18, 2018

@stepanstipl @cindyli @mrtyler, here is a first pass at moving the gist of what gpii-dataloader pull 6 does into universal. I'm making a separate pull request from pull 626 to avoid confusion. If that is the correct place to put it, I can move it easily.

It's not complete, but it's good enough for others to start commenting on. In particular, the relevant parts of the README need to come over, and I haven't started that yet.

For more background, see this discussion on gpii-dataloader pull 6.

In addition, @stepanstipl created a similar pull request (#696) which has since been merged into this one. Here is a copy of his PR description:

This PR moves dataloader from original gpii-ops/gpii-dataloader repo into universal.

Main reasons behind this:

  • Get rid of the duplication of the Dockerfile and the necessity to maintain an extra image and pipeline
  • Get rid of cloning the repo and installing npm modules when the container is executed (this would be slow, is prone to errors, and would raise questions of versioning)
  • Data loader will always have the new universal code (as it will be running in the universal image)
  • Data loader will get triggered every time there's a change to universal (as the universal image will get updated)

Most of the code was in universal already, and gpii-ops/gpii-dataloader contained basically only a shell wrapper and the docker image. Work on the new dataloader (GPII-3138 and #626) requires changes to the existing dataloader wrapper and therefore presented a good opportunity for the move.

This is a continuation of gpii-ops/gpii-dataloader#6 PR.

Main changes:

  • Import deleteAndLoadSnapsets.sh and DataLoader.md
  • Add dataLoader dependencies to docker image (jq and curl)
  • Update data loader script to reflect new location in universal
  • Minor formatting changes
  • Add DB connectivity check to dataloader
  • Update vagrantCloudBasedContainers.sh to work with new dl location
  • Update DataLoader docs to reflect current state
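The "DB connectivity check" in the list above isn't reproduced in this thread. A minimal sketch of the idea, assuming a `COUCHDB_URL` variable and `curl` (the function name, retry count, and messages are illustrative, not the actual deleteAndLoadSnapsets.sh code):

```shell
# Sketch of a startup connectivity check: poll CouchDB's root endpoint
# until it responds, giving up after MAX_TRIES attempts.
COUCHDB_URL="${COUCHDB_URL:-http://localhost:5984}"
MAX_TRIES=5

check_db() {
    tries=0
    while [ "$tries" -lt "$MAX_TRIES" ]; do
        # -f: treat HTTP errors as failures; -s: no progress output
        if curl -fs "$COUCHDB_URL/" > /dev/null 2>&1; then
            echo "CouchDB reachable at $COUCHDB_URL"
            return 0
        fi
        tries=$((tries + 1))
        sleep 1
    done
    echo "Could not reach CouchDB at $COUCHDB_URL after $MAX_TRIES attempts" >&2
    return 1
}
```

Polling like this avoids racing a CouchDB container that is still starting up.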
GPII-3138: Moved parts of gpii-dataloader code into universal
Instead of building and running a docker container, move the relevant
code into universal
@gpii-bot (Collaborator) commented Oct 18, 2018

@klown (Contributor, author) commented Oct 18, 2018

CI job passed: https://ci.gpii.net/job/universal-tests/1206/

Even though it passed, I discovered that jq was not found. It's needed for the warm_indices() function. Buried in the console log for the production tests:

/home/vagrant/sync/universal/scripts/deleteAndLoadSnapsets.sh: line 14: jq: command not found

I guess CI doesn't include it?
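A cheap way to catch this class of problem earlier is to verify required tools before doing any work. A sketch (the `require_tool` helper is hypothetical, not part of the actual script):

```shell
# Sketch: fail early with a clear message when a required command-line
# tool (such as jq or curl) is missing, instead of failing mid-run with
# "command not found".
require_tool() {
    if ! command -v "$1" > /dev/null 2>&1; then
        echo "Error: required tool '$1' is not installed" >&2
        return 1
    fi
}

# Example: require_tool jq && require_tool curl || exit 1
```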

@stepanstipl referenced this pull request Oct 22, 2018: Dataloader import #3 (merged)

GPII-3138: Improved check for proper usage of convertPrefs.js
Added check for the required number of command line arguments.
Merge pull request #3 from stepanstipl/dataloader-import-klown
GPII-3138: Merged Dataloader code import from Stepan
@stepanstipl (Contributor) commented Oct 22, 2018

Thanks, @klown, for merging in my work. I'll just paste my original PR comment below, as it adds a bit of context (I can't update this PR's description):

This PR moves dataloader from original gpii-ops/gpii-dataloader repo into universal.

Main reasons behind this:

  • Get rid of the duplication of the Dockerfile and the necessity to maintain an extra image and pipeline
  • Get rid of cloning the repo and installing npm modules when the container is executed (this would be slow, is prone to errors, and would raise questions of versioning)
  • Data loader will always have the new universal code (as it will be running in the universal image)
  • Data loader will get triggered every time there's a change to universal (as the universal image will get updated)

Most of the code was in universal already, and gpii-ops/gpii-dataloader contained basically only a shell wrapper and the docker image. Work on the new dataloader (GPII-3138 and #626) requires changes to the existing dataloader wrapper and therefore presented a good opportunity for the move.

This is a continuation of gpii-ops/gpii-dataloader#6 PR.

Main changes:

  • Import deleteAndLoadSnapsets.sh and DataLoader.md
  • Add dataLoader dependencies to docker image (jq and curl)
  • Update data loader script to reflect new location in universal
  • Minor formatting changes
  • Add DB connectivity check to dataloader
  • Update vagrantCloudBasedContainers.sh to work with new dl location
  • Update DataLoader docs to reflect current state
@klown (Contributor, author) commented Oct 22, 2018

> Thank @klown for merging in my work. I'll just paste my original PR comment below as it adds a bit of context (can't update this PR's description) ...

No problem @stepanstipl . Well, I can edit this PR's description, so I copied/promoted the gist of it.

GPII-3138: Fixed merge issues.
Fixed minor problems from previous merge of Stepan's pull request.
@klown (Contributor, author) commented Oct 22, 2018

@stepanstipl I cleaned up the merge issues -- not too many in the end. But a second pair of eyes wouldn't hurt.

@gpii-bot (Collaborator) commented Oct 22, 2018

@gpii-bot (Collaborator) commented Oct 22, 2018

@gpii-bot (Collaborator) commented Oct 22, 2018


- Converts the preferences in universal into `snapset` Prefs Safes and GPII Keys,
- Optionally deletes existing database,
- Creates a CouchDB database if none exits,

@amb26 (Member) commented Oct 22, 2018

Typo exists

@klown (Contributor, author) commented Oct 23, 2018

and it's "exists" :-)
Fixed.

@klown (Contributor, author) commented Oct 23, 2018

Never mind -- merging in Stepan's pull


## Environment Variables

- `COUCHDB_URL`: URL of the CouchDB database. (required)

@amb26 (Member) commented Oct 22, 2018

Can we add prefixes to these (probably GPII_) to reduce the chances of conflicts with other uses of the environment (and also as a hint/courtesy to anyone looking in the environment wondering how they got there)?

@stepanstipl (Contributor) commented Oct 23, 2018

Yes, that sounds like a good idea (although the chance of conflict is pretty much non-existent - data loader runs in its own container and there's nothing else expected to be running).
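A minimal sketch of what the prefixing might look like, with a fallback to the legacy unprefixed name (`GPII_COUCHDB_URL` and `resolve_couchdb_url` are assumptions for illustration, not the final names):

```shell
# Sketch of the suggested GPII_ prefix with a fallback to the legacy
# unprefixed variable, so older invocations keep working.
resolve_couchdb_url() {
    if [ -n "${GPII_COUCHDB_URL:-}" ]; then
        echo "$GPII_COUCHDB_URL"
    elif [ -n "${COUCHDB_URL:-}" ]; then
        echo "$COUCHDB_URL"
    else
        echo "Error: GPII_COUCHDB_URL (or COUCHDB_URL) must be set" >&2
        return 1
    fi
}
```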

It does the following:

- Converts the preferences in universal into `snapset` Prefs Safes and GPII Keys,
- Optionally deletes existing database,

@amb26 (Member) commented Oct 22, 2018

Puzzled that the option here is to delete the entire database rather than merely all of the documents of type "snapset".

@klown (Contributor, author) commented Oct 23, 2018

There is a production test (test:vagrantProduction) that executes vagrantCloudBasedContainers.sh without its --no-rebuild flag. When first run locally, or when run by CI, there is no database to delete, so deleting the database is irrelevant at that point.

But, if a developer runs the test a second time, they can, at their option, start from scratch with an empty database, or use the --no-rebuild flag and modify the database in situ, which is closer to a production environment.

The question is whether the first option -- start from scratch -- is useful.

@amb26 (Member) commented Oct 24, 2018

OK, I think this is just a documentation issue. If I understand correctly, the action of steps 5 and 6 is actually that of https://github.com/GPII/universal/blob/master/scripts/deleteAndLoadSnapsets.js - so we should clarify the wording in the README to explain that steps 5 and 6 don't simply drop "snapsets and keys", but in particular only those keys which are associated with snapsets. I think it would be helpful for the comment here to explicitly link to, or reproduce, the comment at the head of the script (https://github.com/GPII/universal/blob/master/scripts/deleteAndLoadSnapsets.js#L11), so that anyone invoking this script can be confident that it will not delete user data (unless they enable GPII_CLEAR_INDEX, which should be supplied with a clear warning).
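To illustrate the distinction being discussed, deleting only documents of a given type can be done with a CouchDB `_find` selector instead of dropping the whole database. A rough sketch under assumed names (`snapset_query` is hypothetical, and the actual deleteAndLoadSnapsets.js may use views instead):

```shell
# Sketch: build a CouchDB /_find selector that matches only documents
# whose "type" field has the given value, so a delete pass can target
# snapset documents without touching user data.
snapset_query() {
    printf '{"selector":{"type":"%s"},"fields":["_id","_rev"]}' "$1"
}

# Usage (not run here; database name "gpii" is an assumption):
# curl -X POST "$COUCHDB_URL/gpii/_find" \
#     -H 'Content-Type: application/json' \
#     -d "$(snapset_query snapset)"
```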

@klown (Contributor, author) commented Oct 24, 2018

OK, I've added to the README -- take a look.

@stepanstipl (Contributor) commented Oct 24, 2018

> BTW, I'm also updating the dataloader's README as @amb26 suggested. I hope I'm not stepping on your toes here.

Ok great, I'm not doing anything about that one :).

@gpii-bot (Collaborator) commented Oct 24, 2018

GPII-3138: Improved dataloader README
- Called out that only views, and snapset PrefsSafes and their
  associated GPII keys are deleted/updated (steps 4, 5, and 6),
- Explained usage of environment variables,
- Added a warning about GPII_CLEAR_INDEX,
- Explained usage difference between development vs. staging/production
  environments.
@gpii-bot (Collaborator) commented Oct 24, 2018

@klown (Contributor, author) commented Oct 24, 2018

@klown (Contributor, author) commented Oct 24, 2018

ok to test

@gpii-bot (Collaborator) commented Oct 24, 2018

@stepanstipl (Contributor) commented Oct 25, 2018

It's the very same failure - `Test concluded - Module "Testing Pouch Manager" Test name "Testing the load and the reset processes": 21/22 passed - FAIL` - causing this issue.

It seems to be quite a common reason for failure (just from a quick search: #638, #632, #534, and those are only the failures where the message was actually copy-pasted).

I've started looking into whether I can reproduce this failing test locally and do something about it, but I'm not very familiar with the code, and it seems quite complex at first look.

@amb26 (Member) commented Oct 25, 2018

Thanks for investigating, @stepanstipl - actually, I think @the-t-in-rtf is inclined to believe that, as a result of failures like this one, we should abandon the use of PouchDB for our integration tests, since it seems prone to erratic failures. @the-t-in-rtf - does any reasonably straightforward means strike you for reducing the probability of these failures in the meantime?

@the-t-in-rtf (Member) commented Oct 25, 2018

@amb26, my first suspicion is that we would see this less often if we weren't reusing a single data directory between runs, as configured here:

https://github.com/GPII/universal/blob/master/gpii/node_modules/pouchManager/test/pouchManagerTests.js#L257

My suggestion would be to update that to include unique information like {that}.id, so that there is no chance of cleanup failures in earlier tests polluting the next run.
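The same idea can be sketched in shell terms - a uniquely named data directory per run, so stale files from a failed cleanup can never leak into the next run (the actual change belongs in the JS test config; `make_run_dir` is purely illustrative):

```shell
# Sketch: create a fresh, uniquely named data directory for each test
# run, so leftover LOCK or data files from an earlier run cannot
# pollute this one.
make_run_dir() {
    # mktemp -d creates the directory and prints its unique path
    mktemp -d "${TMPDIR:-/tmp}/pouchManagerTests.XXXXXX" || return 1
}
```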

@stepanstipl (Contributor) commented Oct 25, 2018

Thanks @amb26 and @the-t-in-rtf

I completely failed to reproduce the issue locally. Even re-running the pouchManager tests 1000 times in a loop, and running several of those loops in parallel, I wasn't able to get that error. Even messing with the files in the /tmp/pouchManagerTests directory during the test run (creating extra files, or deleting the LOCK and other files), I still wasn't able to get the same error, so I don't really know what's going on.

Adding a variable part to the dir name sounds like a good idea - I've done it in klown#6 against this branch - but as I wasn't able to reproduce the error locally, I can't say if it really helps :D.

Merge pull request #6 from stepanstipl/dataloader-import3
GPII-3138: Add unique ID to pouchManager temp test dir
@gpii-bot (Collaborator) commented Oct 25, 2018

@stepanstipl (Contributor) commented Oct 25, 2018

ok to test

@klown (Contributor, author) commented Oct 25, 2018

@stepanstipl I've pulled in the variable directory name change, ran it locally, and all tests passed. They passed in CI as well.

But, as you noted, I don't know if this fixes the issue.

> Adding a variable part to the dir name sounds like a good idea - I've done it in klown#6 against this branch - but as I wasn't able to reproduce the error locally, can't say if it really helps :D.

@stepanstipl (Contributor) commented Oct 25, 2018

@klown thanks for that, I just noticed it passed CI. I was trying to trigger a re-run to get a better idea of whether it's just a random success or whether the change might have helped :D

But I don't think my comment triggered anything.

@klown (Contributor, author) commented Oct 25, 2018

@stepanstipl Good idea to run the test a couple of times 👍

On a related note, I'm not sure you have the authority to trigger CI -- nothing seems to be happening. @amatas, can we add @stepanstipl to the group, if he isn't a member?

@the-t-in-rtf (Member) commented Oct 25, 2018

In the meantime, ok to test.

@gpii-bot (Collaborator) commented Oct 25, 2018

@klown (Contributor, author) commented Oct 25, 2018

Third time's the charm, ok to test.

@gpii-bot (Collaborator) commented Oct 25, 2018

@stepanstipl (Contributor) commented Oct 25, 2018

@klown ok, that looks optimistic 🤞... is there anything else to be done before we can merge this?

@amb26 merged commit 8847cd3 into GPII:master on Oct 25, 2018

1 check passed: default - Build finished.
@klown (Contributor, author) commented Oct 25, 2018

Thanks @amb26 and congrats @stepanstipl

@the-t-in-rtf (Member) commented Oct 26, 2018

So, just as an afternote: in viewing all the rimraf cleanup errors in the logs, and remembering past problems, my theory is that rimraf doesn't consistently complete its cleanup on particular platforms, including the one used to test in CI, and that this results in dirty data directories when the same location is reused across different test runs. Even though I am thinking of replacing express-pouchdb, this could still be relevant when working directly with PouchDB instances outside of express, as the same "cleanup promise with timeout" is used there.

If it happens before we have an alternative approach, another option when working with express-pouchdb is to increase this timeout.

@the-t-in-rtf (Member) commented Oct 26, 2018

Can @amb26 or someone else with permission cut a dev release of universal that includes this work?

@amb26 (Member) commented Oct 26, 2018

0.3.0-dev.20181026T103556Z.112fa057

@the-t-in-rtf (Member) commented Oct 26, 2018

Cheers!
