
Get the db-reset working with Docker #76

Closed
Pamplemousse opened this issue Aug 22, 2016 · 9 comments


@Pamplemousse
Contributor

Following discussions from #70, it seems that the db-reset process needs some update to work well (and elegantly) with all the installation options.

What do you think about:

  • using seeds?
  • using a database.json containing an initial version of the database?
@binarymist
Collaborator

binarymist commented Sep 14, 2016

Can you explain what you mean by seeds and database.json? I'm not sure of how these could be applied on container creation rather than container start.

The only way I can see this being done is for the mongo Dockerfile to know all about the domain; obviously we would need a mongo Dockerfile in the first place (not that that's an issue: https://github.com/binarymist/NodeGoat/blob/DockerNonRootUser-mongo-experiment/app/data/Dockerfile). The two approaches I can see:

  1. The new mongo Dockerfile would need the database creation and population smarts from db-reset.js. If we leverage that as it is, then the mongo Dockerfile now needs node as well in order to run the script, which is obviously polluting the mongo container with node, and that is not a good thing. We would also have to copy the existing config and create a package.json in the new mongo Dockerfile context.
  2. Another way would be to separate out the datastore creation and population code from db-reset.js into something that docker can use (a shell script), so that both grunt (JavaScript context) and the Dockerfile (docker context) have access to the datastore creation and population code (obviously significantly affecting the current architecture). Otherwise we end up maintaining more than one datastore creation and population script, which violates the DRY principle. Accessing the data script from JS and docker would mean crossing process boundaries for JavaScript when running without docker. This option is probably the least offensive of the two, but it does mean re-writing the existing datastore creation and population script as a shell script and creating a mongo Dockerfile.
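To make option 2 concrete, a separated-out shell version of the creation/population code might look roughly like the following. This is only a sketch: the `db-reset.sh` name, the `DB_NAME` default, the `RUN_SEED` guard, and the sample `users` document are illustrative assumptions, not names taken from the existing db-reset.js.

```shell
#!/bin/sh
# Hypothetical db-reset.sh: datastore creation/population in one place,
# callable both from grunt (e.g. via child_process) and from a mongo
# Dockerfile, so the seeding logic is not duplicated.
# DB_NAME and the sample document are assumptions, not from db-reset.js.
DB_NAME="${DB_NAME:-nodegoat}"

seed_script() {
  # Emit the JavaScript that the mongo shell would evaluate.
  cat <<'EOF'
db.users.drop();
db.users.insert({ userName: "admin", isAdmin: true });
EOF
}

# Only talk to mongo when explicitly asked and when the client exists,
# so the script is safe to source/inspect without a running database.
if [ "${RUN_SEED:-0}" = "1" ] && command -v mongo >/dev/null 2>&1; then
  seed_script | mongo "$DB_NAME"
fi
```

Grunt could then shell out to this script, and the mongo Dockerfile could invoke the same file, keeping a single source of truth for the seed data.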

Thoughts and other ideas please @ckarande @Pamplemousse ?

@ckarande
Member

@Pamplemousse it would be helpful if you could elaborate more about database.json and seeds.
@binarymist, out of the two options you listed, I lean more toward the 2nd, but I would like to understand the option proposed by @Pamplemousse first.
Thank you both for your inputs.

@binarymist
Collaborator

The other option is... Don't do anything. How many people does this affect? It's also increasing the complexity and LoC, which is not something that should be taken lightly.

@ckarande
Member

Good point. As of now, nodegoat doesn't demonstrate any issues that are dependent on mongodb. So we could provide an option to use an in-memory db (using https://github.com/typicode/json-server or similar) as part of the config. For docker deployment, users could choose this option to avoid all the hassle around the mongo container without affecting what NodeGoat has to offer.

@binarymist
Collaborator

binarymist commented Sep 14, 2016

This is a bit of a tangent though, isn't it? In-memory would just replace what we have now with the same functionality: the user loses data on container start. It doesn't actually solve the problem highlighted by @rrequero and @Pamplemousse, it just makes things less complex than they are currently, right? Don't get me wrong, I like the idea, but it doesn't address this issue.

NodeGoat isn't supposed to be a production app, as far as I'm aware, so maybe it is the way to go.

@ckarande
Member

I need to give it some more thought as well, but I was imagining inserting the seed data (which is part of db-reset currently) into an in-memory version of mongo db (such as https://www.npmjs.com/package/mongo-mock). So each time the container starts, the data gets added to the in-memory db first. Most of the in-memory dbs for mongo are just npm modules, mainly designed to aid testing. It would run as part of the application code. However, I am also not completely convinced by this approach, as it would pollute the app code with concerns related to creating seed data in an in-memory db. Plus, I am not sure whether any mongo in-memory db module supports all the required features to run seamlessly.

@binarymist
Collaborator

binarymist commented Sep 15, 2016

The whole issue @Pamplemousse had was that the seed data should be inserted at container creation time, not run-time, so each time the container is started, the same data exists that existed when it was stopped. So, the container (Dockerfile) needs to handle this.
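Seeding at container creation time would mean baking the data into the image at build time. A sketch of what that mongo Dockerfile could look like is below; this is only illustrative, not the project's actual setup. The base image tag, the seed/database.json path, and the nodegoat/users names are assumptions, and because the official mongo image declares VOLUME /data/db (writes there during build are discarded), the sketch uses a separate, non-volume dbpath:

```dockerfile
# Hypothetical sketch: bake the seed data into the image at build time,
# so every container created from the image starts with the same data.
FROM mongo:3.2

COPY seed/database.json /tmp/database.json

# Start mongod temporarily during the build, import the seed data into a
# non-volume dbpath (writes to the image's VOLUME /data/db would be
# discarded), then shut mongod down so the data files land in a layer.
RUN mkdir -p /data/seed \
    && mongod --fork --dbpath /data/seed --logpath /var/log/mongod-build.log \
    && mongoimport --db nodegoat --collection users --drop --file /tmp/database.json \
    && mongod --dbpath /data/seed --shutdown

# Serve from the baked-in dbpath at run time.
CMD ["mongod", "--dbpath", "/data/seed"]
```

With an image like this, stopping and restarting a container preserves its data for the container's lifetime, and recreating the container returns to the seeded state — i.e. the create-time (not start-time) seeding described above.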

If we don't do anything, then the behavior you are suggesting still occurs, because when the container starts currently, the data-store is re-seeded.

@ckarande
Member

Ah, thank you for clarifying it. If that is the case, I agree we shouldn't do anything about this issue. As you mentioned, nodegoat is not a production-level app, so I wouldn't worry about the minor overhead of inserting data at container run-time. It is not a huge amount of data, so it is not worth engineering a solution for, knowing the complexities involved.

Based on these inputs I am closing this issue. @Pamplemousse if you do not agree or have any other suggestions, please feel free to reply with a comment.

@Pamplemousse
Contributor Author

As you said, this is maybe a bit of over-engineering considering the usage of the nodegoat project.

Maybe our future selves will change our minds, so I will clarify what I meant, for the record.


Seeding

Seeds are data stored in a readable format in files (e.g. .yml) that are read by a tool to populate the database.

However, as @binarymist noted in his first answer:

then the mongo Dockerfile now needs node as well in order to run the script, this is obviously polluting the mongo container with node, which is not a good thing

Even if database seeding is a widespread practice, it does not really fit this project: it increases complexity and gives containers too many responsibilities.

This is kind of what db-reset.js is doing already, but that is unclear, and it is difficult for newcomers to see what kind of operations the script performs.

Database.json

Which leads me to the second solution: database.json.

The idea is to version an initial state of the db (containing basic data: make an export right after the current db-reset.js has been run) in the repo.

From there, use mongoimport when the db needs to be reset. This could be done in a script (db-reset.js) or via the Dockerfile.

I think this would not break the current ways of doing things.
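The database.json workflow above could be sketched roughly as follows. This is a minimal illustration only: the file path, the db name, the collection, and the `RUN_RESET` guard are assumptions, not names from the repo.

```shell
#!/bin/sh
# Hypothetical sketch of the database.json approach.
# DB_NAME, COLLECTION, and SEED_FILE are illustrative assumptions.
DB_NAME="${DB_NAME:-nodegoat}"
COLLECTION="${COLLECTION:-users}"
SEED_FILE="${SEED_FILE:-artifacts/database.json}"

export_cmd() {
  # Run once, right after the current db-reset.js, to capture the
  # initial state of the db into the versioned seed file.
  echo "mongoexport --db $DB_NAME --collection $COLLECTION --out $SEED_FILE"
}

import_cmd() {
  # Run whenever the db needs resetting; --drop replaces existing
  # collections, so the reset is idempotent.
  echo "mongoimport --db $DB_NAME --collection $COLLECTION --drop --file $SEED_FILE"
}

# Only run the import when explicitly asked and the tool is installed,
# so the script is safe to inspect without a running database.
if [ "${RUN_RESET:-0}" = "1" ] && command -v mongoimport >/dev/null 2>&1; then
  eval "$(import_cmd)"
fi
```

Either the reset script or a Dockerfile RUN step could then invoke the import, without changing how the seed data itself is maintained.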
