Skip to content

devasthali-os/nosql-driver

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

8 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

NoSQL

Why is NoSQL faster than SQL?

Relation DB NoSQL

CAP theorem, AXP 2015, JWN 2016

C

  • Consistency in CAP actually means linearizability, which is a very specific (and very strong) notion of consistency. In particular it has got nothing to do with the C in ACID, even though that C also stands for "consistency".

  • The meaning of linearizability below.

If operation B started after operation A successfully completed, then operation B must see the the system in the same state as it was on completion of operation A, or a newer state.

Strong consistency models

eg. MongoDB Write/Read consistency, HBase

MongoDB is strongly consistent by default - if you do a write and then do a read, assuming the write 
was successful you will always be able to read the result of the write you just read. 
This is because MongoDB is a single-master system and all reads go to the primary by default. 

If you optionally enable reading from the secondaries then MongoDB becomes eventually consistent 
where it's possible to read out-of-date results. (MWA, 2016)

MongoDB also gets high-availability through automatic failover in replica sets: 
http://www.mongodb.org/display/DOCS/Replica+Sets

Write Concern for Replica Sets

w: 1 is the default write concern for MongoDB.

cfg = rs.conf()
cfg.settings = {}
cfg.settings.getLastErrorDefaults = { w: "majority", wtimeout: 5000 }
rs.reconfig(cfg)

// Requests acknowledgment that write operations have propagated to the majority of voting nodes, 
including the primary, and have been written to the on-disk journal for these nodes.

Redis - CP

https://aphyr.com/posts/283-jepsen-redis

A

Availability in CAP is defined as “every request received by a non-failing [database] node in the 
system must result in a [non-error] response”. 
It’s not sufficient for some node to be able to handle the request: any non-failing node needs to 
be able to handle it. 

Many so-called “highly available” (i.e. low downtime) systems actually do not meet this definition 
of availability.

CAP

eg. Cassandra - A but also provides C, based on tuning

Cassandra provides consistency when RRC + WRC > N(RF) 
(read replica count + write replica count > replication factor).

Cassandra Parameters for Dummies- http://www.ecyrd.com/cassandracalculator/

P

Partition Tolerance (terribly mis-named) basically means that you’re communicating over an 
asynchronous network that may delay or drop messages. The internet and all our datacenters have this 
property, so you don’t really have any choice in this matter.

CAP Theorem: Revisited, Robert Greiner

Brewer's CAP Theorem, The kool aid Amazon and Ebay have been drinking

2 nodes in a network, N1 and N2. They both share a piece of data V

https://codahale.com/you-cant-sacrifice-partition-tolerance/

http://blog.nahurst.com/visual-guide-to-nosql-systems

https://aphyr.com/posts/322-jepsen-mongodb-stale-reads

https://www.infoq.com/news/2014/04/bitcoin-banking-mongodb

http://www.sarahmei.com/blog/2013/11/11/why-you-should-never-use-mongodb/

http://hackingdistributed.com/2014/04/06/another-one-bites-the-dust-flexcoin/

mongodb on distributed consistency

Please stop calling databases CP or AP

CosmosDB