Feature: implement database subsetting for MongoDB #70

Open
evoxmusic opened this issue Apr 29, 2022 · 3 comments
Labels
feature New feature request

Comments

@evoxmusic
Contributor

Implement database subsetting for MongoDB, as we did for PostgreSQL.

However, MongoDB is not a relational database, so we need to support "virtual foreign keys". That is, as a user I want to declare that column collection_a.post_id is linked to column collection_b.id, so that the subset stays consistent across the collections.

evoxmusic added the feature (New feature request) label on Apr 29, 2022
@benny-n
Contributor

benny-n commented Apr 30, 2022

> Meaning, as a user I want to indicate that column collection_a.post_id is linked to column collection_b.id

How would you indicate that as a user?
If there isn't a known standard in MongoDB for such behavior, I'm not really sure how we can add support for this concept.

@evoxmusic
Contributor Author

evoxmusic commented Apr 30, 2022

In my experience, developers using MongoDB (or any NoSQL database) end up managing relations between collections (MongoDB's equivalent of tables) in their application code. We could add a way to declare virtual relations between collections in the YAML file, e.g.:

source:
  connection_uri: postgres://root:password@localhost:5432/root
  database_subset:
    database: public
    table: orders
    strategy_name: random
    strategy_options:
      percent: 50
    passthrough_tables:
      - us_states
    virtual_relations:
      - from_table: collection_a
        from_column: post_id
        to_table: collection_b
        to_column: id

WDYT?
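
To make the proposal concrete, here is a minimal sketch of how one `virtual_relations` entry might be represented and checked. Everything in it is an assumption for illustration: the `VirtualRelation` struct, the `dangling_references` helper, and the string-only `Document` type are hypothetical and are not part of the existing codebase (real MongoDB documents are BSON).

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical in-code form of one `virtual_relations` entry.
#[derive(Debug, Clone, PartialEq)]
pub struct VirtualRelation {
    pub from_table: String,
    pub from_column: String,
    pub to_table: String,
    pub to_column: String,
}

/// Simplified document: field name -> value as a string (an assumption;
/// real MongoDB documents are BSON).
pub type Document = HashMap<String, String>;
/// Simplified database: collection name -> documents.
pub type Database = HashMap<String, Vec<Document>>;

/// Returns the sorted, de-duplicated `from_column` values that have no
/// matching `to_column` value, i.e. the references that are dangling.
pub fn dangling_references(db: &Database, rel: &VirtualRelation) -> Vec<String> {
    let empty: Vec<Document> = Vec::new();

    // All values present on the "to" side of the relation.
    let targets: HashSet<&String> = db
        .get(&rel.to_table)
        .unwrap_or(&empty)
        .iter()
        .filter_map(|doc| doc.get(&rel.to_column))
        .collect();

    // Values on the "from" side that point at nothing.
    let mut missing: Vec<String> = db
        .get(&rel.from_table)
        .unwrap_or(&empty)
        .iter()
        .filter_map(|doc| doc.get(&rel.from_column))
        .filter(|id| !targets.contains(id))
        .cloned()
        .collect();
    missing.sort();
    missing.dedup();
    missing
}
```

A check like this could run after subsetting: an empty result would mean every `collection_a.post_id` still resolves to a `collection_b.id`.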

@benny-n
Contributor

benny-n commented May 1, 2022

This could work, but I think it'd be best to limit this to IDs only, i.e., you can only reference from a field of type bson::oid::ObjectId to another field of type bson::oid::ObjectId.
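
The ObjectId-only restriction could be enforced with a type check along these lines. This is a sketch under stated assumptions: the `Value` enum is a std-only stand-in for a BSON value (its `ObjectId` variant holds the 12 raw bytes of an ObjectId), and `check_relation_types` and `RelationError` are hypothetical names, not existing APIs.

```rust
/// Stand-in for a BSON value (assumption; only the variants needed here).
/// `ObjectId` holds the 12 raw bytes of a MongoDB ObjectId.
#[derive(Debug, Clone, PartialEq)]
pub enum Value {
    ObjectId([u8; 12]),
    String(String),
    Int64(i64),
}

/// Why a virtual relation was rejected (hypothetical error type).
#[derive(Debug, PartialEq)]
pub enum RelationError {
    FromNotObjectId,
    ToNotObjectId,
}

/// Accepts a relation only when both ends are ObjectId values.
pub fn check_relation_types(from: &Value, to: &Value) -> Result<(), RelationError> {
    if !matches!(from, Value::ObjectId(_)) {
        return Err(RelationError::FromNotObjectId);
    }
    if !matches!(to, Value::ObjectId(_)) {
        return Err(RelationError::ToNotObjectId);
    }
    Ok(())
}
```

Rejecting mismatched types up front would keep the matching logic to a plain byte comparison between the two fields.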

Another thing to keep in mind is that the MongoDB dump parser works differently from the PostgreSQL one: the PostgreSQL parser uses query strings to build its DB, while the MongoDB parser builds the whole DB in memory from the archive dump. This will probably affect how the subsetting strategy should be implemented for MongoDB.
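
Because the whole database is already in memory, one possible shape for a MongoDB strategy is a plain filtering pass over the loaded collections. The sketch below is an assumption, not the project's implementation. `subset_in_memory` is a hypothetical function, documents are simplified to string maps, and a deterministic "keep every `step`-th document" rule stands in for the `random` strategy so the behavior is reproducible.

```rust
use std::collections::{HashMap, HashSet};

/// Simplified document and database types (assumption; real data is BSON).
pub type Document = HashMap<String, String>;
pub type Database = HashMap<String, Vec<Document>>;

/// Keeps every `step`-th document of `from_table` (starting with the first),
/// then drops documents of `to_table` that no kept document references.
pub fn subset_in_memory(
    db: &mut Database,
    from_table: &str,
    from_column: &str,
    to_table: &str,
    to_column: &str,
    step: usize,
) {
    let step = step.max(1); // a step of 0 is treated as "keep everything"
    let mut kept_ids: HashSet<String> = HashSet::new();

    if let Some(docs) = db.get_mut(from_table) {
        // `retain` visits elements once, in order, so a counter is enough.
        let mut index = 0usize;
        docs.retain(|_| {
            let keep = index % step == 0;
            index += 1;
            keep
        });
        // Record which target ids the surviving documents point at.
        for doc in docs.iter() {
            if let Some(id) = doc.get(from_column) {
                kept_ids.insert(id.clone());
            }
        }
    }

    if let Some(docs) = db.get_mut(to_table) {
        docs.retain(|doc| doc.get(to_column).map_or(false, |id| kept_ids.contains(id)));
    }
}
```

Filtering in place avoids a second copy of the data, which matters if the full dump already has to fit in memory.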
