Relations and referential integrity in NoSQL databases #2127

Open
bajtos opened this Issue Dec 6, 2018 · 4 comments

bajtos commented Dec 6, 2018

Our current implementation of model relations (has-many, has-one, belongs-to) is deeply rooted in SQL and based on the assumption that the database takes care of referential integrity for us.

Example 1: "Customer has many Order instances" and "Order belongs to Customer". When creating a new Order instance, we expect the database to verify that Order.customerId matches the id value of an existing Customer record. We don't have any reliable and atomic way to do this check on the LoopBack side.

Example 2: "Customer has one Credentials instance". When creating a new Credentials instance, we expect the database to verify that no other Credentials instance already exists for that customer. We don't have any reliable and atomic way to do this check on the LoopBack side.
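
To illustrate why an application-level check is not enough, here is a minimal sketch of the race in Example 2. It uses a hypothetical in-memory `CredentialsStore` (an assumption for illustration, not a LoopBack API) that stands in for a datastore without UNIQUE constraints. Two requests both run the "check" step before either one runs the "create" step, so both succeed.

```typescript
// Hypothetical in-memory stand-in for a datastore without UNIQUE constraints.
interface Credentials {
  id: number;
  customerId: string;
}

class CredentialsStore {
  private rows: Credentials[] = [];
  private nextId = 1;

  async findByCustomer(customerId: string): Promise<Credentials[]> {
    return this.rows.filter(r => r.customerId === customerId);
  }

  async create(customerId: string): Promise<Credentials> {
    const row: Credentials = {id: this.nextId++, customerId};
    this.rows.push(row);
    return row;
  }
}

// Naive application-level "has one" check: read first, then write.
// Nothing ties the two steps together, so they are not atomic.
async function createCredentials(
  store: CredentialsStore,
  customerId: string,
): Promise<Credentials> {
  const existing = await store.findByCustomer(customerId);
  if (existing.length > 0) {
    throw new Error('Credentials already exist for ' + customerId);
  }
  return store.create(customerId);
}

// Runs two concurrent requests for the same customer and reports
// how many Credentials rows exist afterwards.
async function demoRace(): Promise<number> {
  const store = new CredentialsStore();
  await Promise.all([
    createCredentials(store, 'c1'),
    createCredentials(store, 'c1'),
  ]);
  return (await store.findByCustomer('c1')).length;
}
```

In this sketch `demoRace()` resolves to 2: neither call throws, and the customer ends up with two Credentials records even though the relation is meant to be "has one".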

SQL databases provide FOREIGN KEY and UNIQUE constraints that work great for this flavor of relations.

The situation becomes trickier when we try to map this approach to NoSQL databases. Many NoSQL databases do not provide FOREIGN KEY and UNIQUE constraints; this is often a limitation imposed by the CAP theorem.

For example, it's not possible to enforce a UNIQUE constraint when the model data is stored on multiple physical machines and a network partition occurs: a node performing a write operation cannot reach the other nodes because of networking problems, and thus it cannot verify that the new value does not violate the uniqueness constraint for records stored on those nodes.

I think we should rethink how we model relations and offer different flavors optimized for different backends.

For example, instead of storing a foreign key in the target model, we can store the id(s) of the related model(s) in the source model and use an optimistic locking scheme to enforce the constraints:

class Customer {
  // hasMany relation
  orderIds: string[];

  // hasOne relation
  credentialsId: string;
}

// Algorithm for creating a new Order

// 1. Check that the customer exists
const customer = await customerRepo.findById(customerId);

// 2. Create the order
const order = await orderRepo.create(orderData);
try {
  // 3. Add the new order to the customer
  customer.orderIds.push(order.id);
  await customerRepo.replace(customer);
} catch (err) {
  if (isConflictError(err)) { // hypothetical helper: true when a conflict occurred, e.g. somebody deleted the Customer
    // 4. Roll back on conflict
    await orderRepo.deleteById(order.id);
  }
  throw err;
}
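
The algorithm above depends on the "replace" step failing when somebody else changed or deleted the Customer in the meantime. One common way to get that behaviour is a version counter checked on every write. The following is only a sketch under that assumption: the `version` field, the in-memory `CustomerStore`, and `addOrderId` are hypothetical names for illustration, not LoopBack APIs.

```typescript
interface Customer {
  id: string;
  version: number; // assumed field, bumped on every successful write
  orderIds: string[];
}

// In-memory stand-in for a document store that supports compare-and-set.
class CustomerStore {
  private docs = new Map<string, Customer>();

  insert(id: string): void {
    this.docs.set(id, {id, version: 1, orderIds: []});
  }

  // Returns a copy so callers cannot mutate the stored document directly.
  findById(id: string): Customer | undefined {
    const doc = this.docs.get(id);
    return doc ? {...doc, orderIds: [...doc.orderIds]} : undefined;
  }

  // Applies the write only if the stored version is the one the caller read.
  // Returns false when the document was modified or deleted in the meantime.
  replace(updated: Customer): boolean {
    const current = this.docs.get(updated.id);
    if (!current || current.version !== updated.version) return false;
    this.docs.set(updated.id, {
      ...updated,
      orderIds: [...updated.orderIds],
      version: updated.version + 1,
    });
    return true;
  }

  deleteById(id: string): void {
    this.docs.delete(id);
  }
}

// Adds an order id to the customer, re-reading and retrying on conflict.
// Returns false if the customer is gone or the retries are exhausted,
// in which case the caller should roll back the newly created order.
function addOrderId(
  store: CustomerStore,
  customerId: string,
  orderId: string,
  maxAttempts = 3,
): boolean {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const customer = store.findById(customerId);
    if (!customer) return false;
    customer.orderIds.push(orderId);
    if (store.replace(customer)) return true;
  }
  return false;
}
```

With this shape, two writers that read the same version cannot both succeed: the second `replace` returns false and has to re-read the Customer before trying again.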

We can even store the related models as embedded documents; this should work great for document databases.

class Customer {
  // hasMany relation
  orders: Order[];

  // hasOne relation
  credentials: Credentials;
}
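
As a rough sketch of what the embedded flavor buys us: creating an Order becomes an update of a single Customer document, so there is no separate record that could point at a missing Customer, and "has one" reduces to a single optional field. The helper functions and the `total` and `passwordHash` fields below are hypothetical, chosen only for illustration.

```typescript
interface Order {
  id: string;
  total: number; // hypothetical field
}

interface Credentials {
  passwordHash: string; // hypothetical field
}

interface Customer {
  id: string;
  orders: Order[]; // hasMany, embedded
  credentials?: Credentials; // hasOne, embedded
}

// Returns a new Customer document with the order appended.
// The order cannot exist without its parent document.
function addOrder(customer: Customer, order: Order): Customer {
  return {...customer, orders: [...customer.orders, order]};
}

// Returns a new Customer document with credentials set,
// refusing to overwrite credentials that already exist.
function setCredentials(customer: Customer, credentials: Credentials): Customer {
  if (customer.credentials) {
    throw new Error('Customer ' + customer.id + ' already has credentials');
  }
  return {...customer, credentials};
}
```

Persisting the returned document would still need a conflict check (for example a version comparison) so that concurrent writers do not overwrite each other's changes.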

Related issues & discussions:

  • lb4 relations referential integrity for hasMany/belongsTo #1718

@strongloop/loopback-next @strongloop/loopback-maintainers thoughts?

b-admike commented Dec 7, 2018

@bajtos thank you for starting the discussion around NoSQL backend support for relations; I get the gist of your proposal and since we're not using the LB3 relation engine, it's logical to introduce different flavours of relations for NoSQL DBs.

b-admike commented Dec 7, 2018

I'd like to add, though, that the bulk of the issues that arose with MongoDB and relations in LB4 have to do with the strictObjectIdCoercion flag and how the connector treats ID values. It may be worth testing with Cloudant to see the behaviour there as well.

raymondfeng commented Dec 7, 2018

> For example, instead of storing a foreign key in the target model, we can store the id(s) of the related model(s) in the source model and use an optimistic locking scheme to enforce the constraints

The referencesMany and referencesOne relations store FKs with the source model. For example:

Customer
  - publicProfileId (a customer has a public profile)

Customer
  - emailIds (a customer has multiple emails)

bajtos commented Dec 13, 2018

See my #1718 (comment). I think we need to look at the bigger picture first.
