https://en.wikipedia.org/wiki/ACID#Characteristics
Property | Desc |
---|---|
A | all or nothing // on power failures, errors, and crashes trxn is either COMMITED or ROLLED BACK |
C | ensures that any transaction will bring the database from one valid state to another. |
I | concurrent execution of transactions results in a system state that would be obtained if transactions were executed serially, i.e., one after the other (JDBC default isolation level: TRANSACTION_READ_COMMITTED, https://docs.oracle.com/cd/E19830-01/819-4721/beamv/index.html), https://en.wikipedia.org/wiki/Isolation_(database_systems) |
D | once a transaction has been committed, it will remain so, even in the event of power loss, crashes, or errors |
ACID and database transactions?
DB, LMU 2013
Distributed algos and protocols
- Commit request phase, voting phase
- Commit phase, completion phase
Master Node/Co-ordinator Secondary Node/ Cohort
QUERY TO COMMIT
-------------------------------->
Execute The Transaction
(adds to UNDO_LOG & REDO_LOG)
& prepare*/abort*
VOTE YES/NO
<-------------------------------
commit*/abort* COMMIT/ROLLBACK
-------------------------------->
ACKNOWLEDGMENT commit*/abort*
<--------------------------------
end
**2PC disadvantage: **
The greatest disadvantage of the 2PC protocol is that it is a blocking protocol.
If the master(coordinator) fails permanently, some secondaries(cohorts) will never resolve their
transactions:
After a secondary has sent an agreement message to the master, it will block until a commit or
rollback is received.
https://en.wikipedia.org/wiki/Partition_(database)
When and why are relational database joins expensive?
Performance Considerations for Join Queries
Relational Algebra/ JOINs
What is Cartesian product in relational algebra?
weights = {
[packageA1, 100g], [packageA2, 200g]
}
shippingDates = {
[packageB1, 07-2016], [packageB2, 08-2016], [packageB3, 09-2016]
}
weights * shippingDates = { {[packageA1, 100g], [packageB1, 07-2016]},
{[packageA1, 100g], [packageB2, 08-2016]},
{[packageA1, 100g], [packageB3, 09-2016]},
{[packageA2, 200g], [packageB1, 07-2016]},
{[packageA2, 200g], [packageB2, 08-2016]},
{[packageA2, 200g], [packageB3, 09-2016]}
What is the difference between LEFT JOIN
and LEFT OUTER JOIN
?, JWN, 07-2016
Intersection: Set1 INNER JOIN
Set2
(A-B) U (B-A): Set1 OUTER JOIN
Set2
The result of a left outer join (or simply left join) for tables A and B always contains all rows
of the "left" table (A), even if the join-condition does not find any matching row in the "right" table (B).
LEFT
OUTER JOIN
operation (INTV, HUM 06-08-2016)
It preserves the unmatched rows from the first (left) table,
joining them with a NULL row in the shape of the second (right) table.
INNER JOIN vs. Natural JOIN (JOIN ON), speed-wise?
Set difference (A-B)
in SQL – M SQL Server
SELECT DISTINCT A.*
FROM (A LEFT OUTER JOIN B on A.ID=B.ID)
WHERE B.ID IS NULL
- https://devcenter.heroku.com/articles/postgresql-indexes
- https://www.postgresql.org/docs/9.1/sql-set-constraints.html
- https://stackoverflow.com/questions/5300307/not-deferrable-versus-deferrable-initially-immediate
- https://www.postgresql.org/docs/10/ddl-partitioning.html
- https://blog.timescale.com/blog/scaling-partitioning-data-postgresql-10-explained-cd48a712a9a1/
-
https://www.percona.com/blog/2019/07/10/wal-retention-and-clean-up-pg_archivecleanup/
-
https://www.postgresql.org/docs/9.3/continuous-archiving.html
-
https://docs.microsoft.com/en-us/azure/backup/backup-azure-database-postgresql
Database connection pooling is a method used to keep database TCP connections open so they can
be reused by others.
https://github.com/brettwooldridge/HikariCP/wiki/About-Pool-Sizing
max_connections = ((os_core_count * 2) + effective_spindle_count)
If your little 4-Core i7 server with one hard disk:
connection pool = ((4 * 2) + 1) = 9
databases connection transport protocols
- TCP
- Socket (Unix socket file connection to local server)
https://www.datastax.com/dev/blog/binary-protocol
-
Database stores all records inside a fixed-size disk unit which is commonly called a "page" (Some database engine sometimes calls it a "block" instead).
-
the internal basic structure to organize the data in the database files.
-
https://www.postgresql.org/docs/8.0/storage-page-layout.html
-
https://dev.mysql.com/doc/internals/en/innodb-page-structure.html
- https://www.postgresql.org/docs/9.3/high-availability.html
- https://wiki.postgresql.org/wiki/Streaming_Replication