Add a way to add data into multiple databases (specifically admin & app db) #73

Closed
hongkongkiwi opened this issue Jun 18, 2019 · 8 comments
Labels
area/core Refers to Mongo Seeding library 🚀 enhancement New feature or request

@hongkongkiwi

Let me explain the use case for this.

Right now, I have a Docker container spinning up a MongoDB database. Assuming the database is totally empty, I want to seed it using the "pkosiec/mongo-seeding" Docker image.

There's just one catch: I also want to seed MongoDB users. These reside in the "admin" database, while my data is in an "app" database.

I worked around it by using a shell script to seed the db users when the MongoDB container spins up, then running the seeding container.

It's really not ideal though. I'd like to use this image to seed both the db users in the admin database and my application data.

Any ideas on this one?

@pkosiec
Owner

pkosiec commented Jun 18, 2019

Hello @hongkongkiwi!

Theoretically I could change the input data directory structure for Mongo Seeding and put another directory level on top of the existing one. The directory name would point to the database name. But, similarly to your case, most users want to provide different credentials to access different databases. That would make the whole mongo-seeding configuration quite complicated.

As you are using the mongo-seeding Docker image, I would suggest running another instance in a second container, side by side with the first one. The first one would seed users, and the second one - app data.
I think that would be the best solution to what you want to achieve. If you use docker-compose, or even if not, that won't be too complicated.

Tell me what you think about this suggestion 🙂 If it's not satisfying, we'll try to figure it out. I'm open to any ideas.

@pkosiec pkosiec added area/core Refers to Mongo Seeding library discussion labels Jun 18, 2019
@hongkongkiwi
Author

hongkongkiwi commented Jun 19, 2019

Hi @pkosiec , thanks for the reply.

I think adding an option for an additional parent directory is a good idea. It could be used only when an extra ENV var is set (that way we don't break backwards compatibility), for example SUPPORT_MULTIPLE_DATABASES=true. It would always be treated as false if the db name is already set via an ENV var.

Then the structure could be /db/collection/*.js
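
To make the proposed layout concrete, here is a minimal sketch that groups seed files by database, assuming the first path segment names the database and the second names the collection. The helper name `groupByDatabase` and the sample paths are hypothetical; this is not how Mongo Seeding reads its input directory today.

```javascript
// Hypothetical helper: groups relative seed file paths of the form
// "<db>/<collection>/<file>.js" by database name, then by collection name.
function groupByDatabase(relativePaths) {
  const result = {};
  for (const p of relativePaths) {
    const parts = p.split('/').filter(Boolean);
    if (parts.length !== 3) {
      throw new Error(`Expected <db>/<collection>/<file>, got "${p}"`);
    }
    const [db, collection, file] = parts;
    result[db] = result[db] || {};
    result[db][collection] = result[db][collection] || [];
    result[db][collection].push(file);
  }
  return result;
}
```

With such a grouping, each top-level key could be imported as its own database. A path that lacks the extra directory level is rejected rather than guessed at, which is one way to keep the old and new layouts from being mixed silently.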

On the other hand, I have another idea to solve this. How about allowing two different formats for the data being inserted?
Method 1 (legacy):

module.exports = {
    bar: "foo"
}

Method 2 (new method):

module.exports = {
    options: {
        db: 'mydbname',
        collection: 'mycollectionname',
        db_username: 'myusername',
        db_password: 'mypassword'
    },
    data: {
        bar: "foo"
    }
}

This way, if no special options are specified, it works just as it does at present; if options are specified, it uses those to connect. All of the options are optional and override existing values.
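
The override behaviour can be sketched as a small normalizer that accepts either export shape. This is only an illustration of the proposal: the function name `normalizeSeedExport` and the `defaults` argument are hypothetical, and nothing like this exists in Mongo Seeding.

```javascript
// Hypothetical normalizer for the two export shapes proposed above.
// "defaults" stands in for whatever the seeder was already configured with.
function normalizeSeedExport(exported, defaults) {
  const isNewFormat =
    exported !== null &&
    typeof exported === 'object' &&
    !Array.isArray(exported) &&
    typeof exported.options === 'object' &&
    exported.options !== null &&
    'data' in exported;

  if (!isNewFormat) {
    // Legacy: the export itself is the document (or list of documents).
    return { options: { ...defaults }, data: exported };
  }
  // New: per-file options override the defaults key by key.
  return { options: { ...defaults, ...exported.options }, data: exported.data };
}
```

One design caveat: a legacy document that happens to contain both `options` and `data` fields would be misread as the new format, so a real implementation would probably need an explicit marker instead of shape detection.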

What do you think?

Currently I am using docker-compose and I do use depends_on between the mongodb and the mongo seeding image.

The problem is that depends_on does not actually wait for the service to finish or become ready; it simply waits for the container to start. That means if we have lots of data in multiple databases, it's going to try to insert them almost all at once if the container starts quickly, which causes a kind of race condition. If the app data gets there first, the users won't be ready.

The great thing about mongo-seeding currently is that we can control the order of insertion if we need to. With Docker you can't do that: although it should hopefully follow the depends_on chain, you can't be sure.

@pkosiec
Owner

pkosiec commented Jun 19, 2019

The first idea is my favorite one - so I would pick extending data import directory structure and toggling multiple database support with env variable. I will try to implement the feature in the coming weeks.

In the meantime I would use depends_on with a trick - consider the following docker-compose config:

version: '3'
services:
  minio:
    image: minio/minio:latest
    environment:
      - MINIO_ACCESS_KEY=EXAMPLE_ACCESS_KEY
      - MINIO_SECRET_KEY=EXAMPLE_SECRET_KEY
    command: server /tmp/minio
  app_being_tested:
    build:
      context: ../
      dockerfile: ./Dockerfile
    depends_on:
    - minio
    command: sh -c 'while ! nc -z minio 9000; do sleep 1; done; /app/main'
    links:
    - minio
    volumes:
      - ./testdata/:/app/data/
    environment: &appConfig
      - APP_UPLOAD_ENDPOINT=minio
      - APP_UPLOAD_ACCESS_KEY=EXAMPLE_ACCESS_KEY
      - APP_UPLOAD_SECRET_KEY=EXAMPLE_SECRET_KEY
      - APP_UPLOAD_PORT=9000
      - APP_UPLOAD_SECURE=false

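Here I start the minio container and, once the MinIO app is ready, the tested app launches. The `while ! nc -z minio 9000` loop in the command keeps the tested app from starting until MinIO accepts connections on port 9000.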
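
For images that don't ship `nc`, the same wait can be done from Node.js. The following is a minimal sketch using only the built-in `net` module; the function names `isPortOpen` and `waitForPort` and the default intervals are assumptions for illustration.

```javascript
const net = require('node:net');

// Resolves true if a TCP connection to host:port succeeds, false otherwise.
function isPortOpen(host, port, connectTimeoutMs = 1000) {
  return new Promise((resolve) => {
    const socket = net.connect({ host, port });
    const finish = (open) => {
      socket.destroy();
      resolve(open);
    };
    socket.setTimeout(connectTimeoutMs, () => finish(false));
    socket.once('connect', () => finish(true));
    socket.once('error', () => finish(false));
  });
}

// Polls until the port is open (resolves true) or timeoutMs has elapsed (resolves false).
async function waitForPort(host, port, { intervalMs = 1000, timeoutMs = 30000 } = {}) {
  const deadline = Date.now() + timeoutMs;
  while (Date.now() < deadline) {
    if (await isPortOpen(host, port)) return true;
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  return false;
}
```

Note that an open port only shows the server is accepting TCP connections. It says nothing about whether a given user already exists, so it covers the MinIO case above but not the user-seeding dependency by itself.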

So, in your case, I would modify the command to try to connect to the MongoDB database as a user that should be inserted by the first mongo-seeding container, the one which populates the users.

Anyway, thanks again for submitting the idea for the new feature. It is planned for 3.3.0 release.
Cheers!

@pkosiec pkosiec added 🚀 enhancement New feature or request and removed discussion labels Jun 19, 2019
@pkosiec pkosiec added this to the 3.3.0 milestone Jun 19, 2019
@hongkongkiwi
Author

That's a great example, thanks. Normally I would build my own image on top of minio and add one of the wait-for scripts (I linked them in the other issue), but I didn't think of adding it directly into the command in the compose file. That also makes sense as a simple example.

@pkosiec
Owner

pkosiec commented Jun 20, 2019

BTW I've just come up with another idea: actually, you can run two Mongo Seeding containers. If the second one uses a user account that is populated by the first container, running the two containers at once will work, because Mongo Seeding will try to reconnect every 0.5s until it reaches the RECONNECT_TIMEOUT value. The second container will try to connect with a non-existing user and then it will do some retries. So actually you don't have to use any tricks to achieve what you want to do. Correct?
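
The retry behaviour described above can be illustrated with a small helper. This is a sketch of the general pattern only, not Mongo Seeding's actual implementation: `connectWithRetry` and its option names are hypothetical, with the 500 ms default mirroring the 0.5s interval mentioned in this comment.

```javascript
// Hypothetical retry helper: calls connect() every intervalMs until it
// succeeds, or throws once another wait would exceed timeoutMs.
async function connectWithRetry(connect, { intervalMs = 500, timeoutMs = 10000 } = {}) {
  const startedAt = Date.now();
  let attempts = 0;
  for (;;) {
    attempts += 1;
    try {
      return { client: await connect(), attempts };
    } catch (err) {
      if (Date.now() - startedAt + intervalMs > timeoutMs) {
        throw new Error(`Gave up after ${attempts} attempts: ${err.message}`);
      }
      await new Promise((resolve) => setTimeout(resolve, intervalMs));
    }
  }
}
```

In the two-container setup, `connect` would be the authenticated connection attempt made by the second container: it keeps failing until the first container has inserted the user, then succeeds on the next attempt.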

@hongkongkiwi
Author

Good idea. I think that would be a fix with the current code without restructuring.

You'd have to carefully construct the users so that the one required for the data is inserted last (otherwise, if they are inserted in random order, it could kick off the data import earlier).

The only downside is that it will fill the logs with failures, so it's hard to monitor whether it really failed, but that's ok.

@pkosiec
Owner

pkosiec commented Jun 20, 2019

otherwise if in random order it could kick it off earlier

Why? The only dependency between two containers is a single user used to seed another database, is that correct? If so, there won't be any problem, you can populate users in random order.

The only downside is it will fill the logs with failures, so hard to monitor if it really failed but that's ok.

Right, at the beginning the logs from the second container will look like this:

  mongo-seeding Starting... +0ms
  mongo-seeding Connecting to mongodb://127.0.0.1:27017/testing... +2ms
  mongo-seeding failed to connect to server [127.0.0.1:27017] on first connect [MongoNetworkError: connect ECONNREFUSED 127.0.0.1:27017]
  mongo-seeding Retrying... +16ms
  mongo-seeding failed to connect to server [127.0.0.1:27017] on first connect [MongoNetworkError: connect ECONNREFUSED 127.0.0.1:27017]
  mongo-seeding Retrying... +504ms
  mongo-seeding failed to connect to server [127.0.0.1:27017] on first connect [MongoNetworkError: connect ECONNREFUSED 127.0.0.1:27017]
  mongo-seeding Retrying... +502ms
  mongo-seeding failed to connect to server [127.0.0.1:27017] on first connect [MongoNetworkError: connect ECONNREFUSED 127.0.0.1:27017]
  mongo-seeding Retrying... +504ms
  mongo-seeding failed to connect to server [127.0.0.1:27017] on first connect [MongoNetworkError: connect ECONNREFUSED 127.0.0.1:27017]
  mongo-seeding Retrying... +502ms
  mongo-seeding failed to connect to server [127.0.0.1:27017] on first connect [MongoNetworkError: connect ECONNREFUSED 127.0.0.1:27017]
  mongo-seeding Retrying... +504ms

Anyway, this is just a temporary workaround, as I like the feature of allowing seeding multiple databases.

@pkosiec pkosiec modified the milestones: 3.3.0, 3.4.0 Jun 20, 2019
@pkosiec pkosiec added this to To Do in Mongo Seeding via automation Jul 13, 2019
@pkosiec pkosiec removed this from the 3.4.0 milestone Aug 20, 2019
@pkosiec pkosiec added this to the 4.0.0 milestone Dec 7, 2019
@pkosiec pkosiec moved this from To Do to Backlog in Mongo Seeding Nov 16, 2020
@pkosiec pkosiec moved this from Backlog to To Do in Mongo Seeding May 20, 2021
@pkosiec pkosiec moved this from To Do to Backlog in Mongo Seeding May 20, 2021
@pkosiec pkosiec moved this from Backlog to To Do in Mongo Seeding May 20, 2021
@pkosiec pkosiec moved this from To Do to Backlog in Mongo Seeding May 20, 2021
@pkosiec pkosiec modified the milestones: 4.0.0, Future Sep 28, 2022
@pkosiec
Owner

pkosiec commented Dec 9, 2023

Hello again,
Here's a quick update from 2023:
Currently, the preferred way to seed multiple databases is to use the import method multiple times with a different partial config: https://github.com/pkosiec/mongo-seeding/tree/main/core#importcollections-partialconfig

The configuration merging has been improved so this should totally work:

const { Seeder } = require('mongo-seeding');

const seeder = new Seeder(optionalPartialConfig); // you can pass some general connection options like host, port, username etc; you can also pass connection URI

const collections1 = seeder.readCollectionsFromPath(path);
await seeder.import(collections1, {database: { name: "foo" }});

const collections2 = seeder.readCollectionsFromPath(path2);
await seeder.import(collections2, {database: { name: "bar" }});
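
For more than two databases, the same pattern can be wrapped in a loop. The sketch below is a hypothetical helper, not part of Mongo Seeding: `seedDatabases` and the `targets` shape are made up for illustration, and `seeder` is assumed to expose only the two methods used in the snippet above.

```javascript
// Hypothetical wrapper around the pattern above. "seeder" is any object with
// readCollectionsFromPath(path) and import(collections, partialConfig).
async function seedDatabases(seeder, targets) {
  const seeded = [];
  // Sequential on purpose: earlier targets (e.g. users) finish before later ones start.
  for (const { path, database } of targets) {
    const collections = seeder.readCollectionsFromPath(path);
    await seeder.import(collections, { database: { name: database } });
    seeded.push(database);
  }
  return seeded;
}
```

Called with something like `[{ path: './admin', database: 'admin' }, { path: './app', database: 'app' }]` (paths assumed), this keeps the insertion order explicit, which was the original concern with running separate containers.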

I'm closing this issue as resolved. We can totally revisit the idea of building in some multi-database support, but for now there wasn't enough community interest. Feel free to comment on this issue if you'd like to see something easier than the snippet above!

@pkosiec pkosiec closed this as completed Dec 9, 2023
@pkosiec pkosiec modified the milestones: Future, 4.0.0 Dec 9, 2023