[ADR] Core APIs database technology #39

Open
StevenMalaihollo opened this issue Mar 31, 2022 · 10 comments
@StevenMalaihollo
Contributor

StevenMalaihollo commented Mar 31, 2022

Two types of databases are considered:

  1. Relational
  2. Document
    • MongoDB
    • Firestore
    • CosmosDB
    • ArangoDB

Considerations

  1. Costs
  2. Security
  3. Governance (offboarding of team members)
  4. Scaling
  5. Community
  6. Learning curve
  7. Redundancy
  8. Development speed (migrations, containerized)
  9. Data integrity
  10. Performance
  11. Querying features

Database Comparison

Relational

Pros

  • Guaranteed data integrity
  • Easy to learn and widely used
  • Large community
  • Full sorting and filter capabilities
  • Likely to be more performant for join operations

Cons

  • Azure SQL Database is the only mainstream serverless solution
  • Time consuming migrations

Document

Pros

  • Existing experience with MongoDB
  • High development speed
  • Most support serverless scaling
  • Learning curve is not steep
  • Likely to be more performant for simple read/write operations

Cons

  • Easily lose data integrity
  • Tooling is not standardized (querying, emulating)
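
To make the data integrity point from the two lists above concrete, here is a minimal sketch. It uses Python's built-in `sqlite3` purely as a stand-in for a relational database (this ADR does not propose SQLite), and the `members` and `subscriptions` tables, their columns, and the `try_insert` helper are hypothetical names made up for illustration. The idea is that the relational schema itself rejects bad rows, while a schemaless store, imitated here by a plain list of dicts, accepts whatever it is given unless validation is added separately.

```python
import sqlite3

# In-memory SQLite is only a stand-in for a relational database here.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
    CREATE TABLE members (
        id    INTEGER PRIMARY KEY,
        email TEXT NOT NULL UNIQUE
    );
    CREATE TABLE subscriptions (
        id        INTEGER PRIMARY KEY,
        member_id INTEGER NOT NULL REFERENCES members(id),
        plan      TEXT NOT NULL CHECK (plan IN ('basic', 'premium'))
    );
""")
conn.execute("INSERT INTO members (id, email) VALUES (1, 'a@example.com')")


def try_insert(sql, params):
    """Run an insert and report whether the database accepted or rejected it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(sql, params)
        return "accepted"
    except sqlite3.IntegrityError as exc:
        return f"rejected: {exc}"


insert_subscription = "INSERT INTO subscriptions (member_id, plan) VALUES (?, ?)"
results = [
    try_insert(insert_subscription, (1, "basic")),    # valid row
    try_insert(insert_subscription, (99, "basic")),   # member 99 does not exist
    try_insert(insert_subscription, (1, "gold")),     # plan not allowed by CHECK
    try_insert("INSERT INTO members (email) VALUES (?)", ("a@example.com",)),  # duplicate email
]
for result in results:
    print(result)

# A schemaless store (imitated by a list of dicts) accepts the same bad record.
documents = []
documents.append({"member_id": 99, "plan": "gold"})
print(len(documents), "document stored without any check")
```

Only the first insert should be accepted; the other three are rejected by the foreign key, CHECK, and UNIQUE constraints respectively. Document databases can add validation on top, but that is an extra step rather than the default.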

Summary

After lining up the considerations for relational and document databases, we arrive at our final candidates:

  1. Azure SQL Database serverless
  2. MongoDB (Serverless is in preview and not governed)
  3. CosmosDB with Mongo API (Not full support of mongo API, weak data integrity)
  4. Firestore (Weak data integrity, limited querying)
  5. ArangoDB (Not serverless, small community)
  6. CockroachDB, PlanetScale (Not mainstream, lacks governance)
  7. Google Cloud SQL

Decision

  1. For the source of truth APIs we're going to use Google Cloud SQL
    1. Mainstream solution
    2. Forces good data integrity
  2. For the query API, we suggest going for a document database with strong search and querying capabilities
    1. MongoDB looks like a promising option, but we need more research to be sure
    2. Other options include ArangoDB and Elasticsearch

Consequences

  • Slower development, especially when migration is needed
  • We need to test scaling performance
  • Team needs to get familiar with SQL
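
As a rough illustration of what "migration" means in the first consequence above, here is a minimal sketch of versioned schema migrations. It again uses Python's built-in `sqlite3` as a stand-in (not the database chosen in this ADR); the `MIGRATIONS` list, the `members` table, and the `migrate` helper are hypothetical. Real projects would more likely use a dedicated migration tool, so treat this only as a picture of the extra step that schema changes introduce.

```python
import sqlite3

# Hypothetical ordered migrations; each entry moves the schema one version forward.
MIGRATIONS = [
    "CREATE TABLE members (id INTEGER PRIMARY KEY, email TEXT NOT NULL)",
    "ALTER TABLE members ADD COLUMN display_name TEXT NOT NULL DEFAULT ''",
]


def migrate(conn):
    """Apply every migration newer than the version stored in the database."""
    current = conn.execute("PRAGMA user_version").fetchone()[0]
    for version, statement in enumerate(MIGRATIONS[current:], start=current + 1):
        conn.execute(statement)
        # SQLite keeps an integer schema version we can use as a bookmark.
        conn.execute(f"PRAGMA user_version = {version}")
    conn.commit()
    return conn.execute("PRAGMA user_version").fetchone()[0]


conn = sqlite3.connect(":memory:")
schema_version = migrate(conn)
schema_version_after_rerun = migrate(conn)  # nothing left to apply on a second run
columns = [row[1] for row in conn.execute("PRAGMA table_info(members)")]
print(schema_version, schema_version_after_rerun, columns)
```

Every change to the data model needs a new entry like the second one, which is the development overhead the consequence refers to.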

Alternatives

Document databases were not chosen for the source of truth APIs because:

  • they risk losing data integrity
  • they often lack serverless capabilities
  • they sometimes lack a large community

CockroachDB / PlanetScale

  • Are not mainstream providers
  • No big-name governance system
@github-actions

Remember that ADRs are publicly available hence do not include any confidential information in the issue description!
To read more about ADR please refer to documentation.

@piotrczyz
Member

I'm quite surprised about CosmosDB with the Mongo API https://docs.microsoft.com/en-us/azure/cosmos-db/mongodb/mongodb-introduction. Maybe you can consider that for your APIs.

Why wouldn't you use a document database as a source of truth? @StevenMalaihollo

@JakubC-projects
Contributor

JakubC-projects commented Mar 31, 2022

CosmosDB wasn't chosen because it doesn't fully support the Mongo API; notably, it lacks the ability to define a collection schema. We feel that, for these basic APIs, data integrity is a top priority, so missing that is a big negative. However, if it turns out that serverless SQL doesn't scale well enough, we are ready to switch to a document database.

@andreasgangso
Member

Use Postgres on Azure

@andreasgangso
Member

Why not Google SQL?

@piotrczyz
Member

Can we do the analysis here? #7

@andreasgangso
Member

"Likely to be more performant for simple read/write operations"

I think this is false

https://www.enterprisedb.com/news/new-benchmarks-show-postgres-dominating-mongodb-varied-workloads

PostgreSQL 11 was found to be faster than MongoDB 4.0 in almost every benchmark.
Throughput was higher, ranging from dozens of percent points up to one and even two
orders of magnitude on some benchmarks. Latency, when measured by the benchmark,
was also lower on PostgreSQL.

https://info.enterprisedb.com/rs/069-ALB-339/images/PostgreSQL_MongoDB_Benchmark-WhitepaperFinal.pdf

@StevenMalaihollo
Contributor Author

Interesting. Our source for this was the following whitepaper, which compares MySQL rather than PostgreSQL with MongoDB:

link

@JakubC-projects
Contributor

"Why not google sql?"

After looking at the Azure SQL serverless pricing model, we found that it isn't really that good, and the SQL flavor is Microsoft-specific.
We would like to use a mainstream SQL flavor (like Postgres or MySQL).
Therefore, we're going to try GCP's SQL offering (probably the Postgres flavor).

@u12206050
Member

u12206050 commented Apr 3, 2022 via email

@StevenMalaihollo StevenMalaihollo moved this from Te be Tested (Technical) to To be Archived in Membership Backlog Apr 5, 2022