Types of Databases
Type	Description	Example Systems
Relational Databases (RDBMS)	Store data in tables with rows and columns, use SQL for queries, enforce schemas and relationships	MySQL, PostgreSQL, Oracle DB, SQL Server, SQLite
NoSQL Databases	Flexible schema, store data as documents, key-values, graphs, or wide-columns	MongoDB (document), Redis (key-value), Cassandra (wide-column), Neo4j (graph)
In-Memory Databases	Data stored in RAM for very fast access, used for caching or real-time workloads	Redis, Memcached
Time-Series Databases	Optimized for time-stamped data (metrics, logs)	InfluxDB, TimescaleDB
Graph Databases	Store data as nodes and edges for complex relationships	Neo4j, Amazon Neptune
Why relational databases are reliable but hard to scale out quickly:
	• Hard to Scale Horizontally: Relational databases store tightly linked data, so splitting it across servers requires sharding logic and makes cross-server joins and transactions expensive.
	• Highly Reliable: They enforce ACID rules and strict schemas, so every committed transaction leaves the data consistent and recoverable.
	• Built for Stability: Proven tools for backup, failover, and monitoring make them well suited to critical systems like banking and inventory.
	

Most Used Relational Databases:
• MySQL (Oracle): Popular, open-source, widely used in web apps.
• Oracle Database (PL/SQL): Enterprise-grade, rich features, costly.
• PostgreSQL (open source, not owned by a single company): Advanced features, highly extensible, strong support for complex queries.
• SQLite (open source, not owned by a single company): Lightweight, embedded DB for local/small apps.
• SQL Server (Microsoft): Microsoft's enterprise DB, well integrated with the Windows ecosystem. Its SQL dialect is T-SQL.




Category	MySQL (InnoDB)	PostgreSQL	SQL Server
Performance & Scalability	Optimized for simple primary-key lookups and read caching; can hit transaction contention on very write-heavy workloads without careful tuning	MVCC lets readers and writers proceed without blocking each other; parallel query execution across cores gives strong mixed-load performance	Cost-based optimizer with mature parallelism handles complex analytical queries well, especially on Windows-tuned hardware
Feature Set & Standards Compliance	Core SQL support; CTEs and window functions arrived only in MySQL 8.0; widely supported by ORMs with few dialect quirks	One of the most standards-compliant RDBMSes; full support for window functions, recursive CTEs, custom types, and multiple procedural languages (PL/pgSQL, PL/Python)	Proprietary T-SQL extensions (TRY...CATCH, MERGE, CLR integration); rich procedural capabilities but less portability
Extensibility & Ecosystem	Pluggable storage engines (InnoDB, MyISAM, MyRocks); large plugin market but limited user-defined function support	Highly extensible: custom operators, index types (GIN, GiST), domain types; first-class extensions like PostGIS and TimescaleDB	CLR-based extensions and deep Azure integration (Data Factory, Synapse); ties you closely to the Microsoft ecosystem
Tooling & Developer Experience	Lightweight CLI tools (mysql, mysqldump); ubiquitous hosting support (cPanel, AWS RDS); no single unified GUI for profiling	Mature CLI (psql, pg_dump); robust backup/recovery (pgBackRest, WAL archiving); vibrant GUIs (pgAdmin, DBeaver)	Industry-standard SSMS/Visual Studio integration; built-in profiler, Activity Monitor, and Tuning Advisor; heavier install, Windows-centric tooling
Licensing & Total Cost of Ownership	Open-source GPL or commercial Oracle editions; low cost, but potential vendor lock-in with Oracle's commercial editions	PostgreSQL License (permissive, BSD-style), free for any use; many cloud-hosted managed options with no licensing overhead	Commercial Core/CAL licensing can be expensive; free Express edition is limited by CPU, RAM, and database size
• CTEs (Common Table Expressions): Named, temporary result sets defined with the WITH keyword that you can reference within a single query, breaking complex queries into readable, reusable parts (e.g., recursive or multi-step data transformations).
• SQL Dialects: Vendor-specific variations of SQL syntax and features (like T-SQL for SQL Server or PL/pgSQL for PostgreSQL) that extend or modify standard SQL, causing slight "quirks" in how queries must be written or how functions behave.
• ORMs (Object-Relational Mappers): Tools that let you interact with databases using code (like Python or Java classes) instead of raw SQL, converting objects into database rows automatically.
	• Examples: SQLAlchemy (Python), Hibernate (Java), and Django ORM simplify data handling, migrations, and queries while reducing SQL boilerplate.
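
To make the CTE definition concrete, here is a minimal sketch of a recursive CTE run through Python's built-in sqlite3 module. The `staff` table, its rows, and the function names are made up for illustration; SQLite is used only because it needs no server, and other engines may differ slightly in syntax (SQL Server, for instance, omits the RECURSIVE keyword).

```python
import sqlite3


def demo_connection():
    """Build a throwaway in-memory database with a tiny reporting hierarchy."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE staff (id INTEGER PRIMARY KEY, name TEXT, manager_id INTEGER)"
    )
    conn.executemany(
        "INSERT INTO staff VALUES (?, ?, ?)",
        [(1, "Asha", None), (2, "Ben", 1), (3, "Chen", 2)],
    )
    return conn


def reporting_chain(conn, employee_id):
    """Return names from the given employee up to the top-level manager."""
    rows = conn.execute(
        """
        WITH RECURSIVE chain(id, name, manager_id, depth) AS (
            -- Anchor member: the employee we start from
            SELECT id, name, manager_id, 0 FROM staff WHERE id = ?
            UNION ALL
            -- Recursive member: step up to that row's manager
            SELECT s.id, s.name, s.manager_id, c.depth + 1
            FROM staff AS s
            JOIN chain AS c ON s.id = c.manager_id
        )
        SELECT name FROM chain ORDER BY depth
        """,
        (employee_id,),
    ).fetchall()
    return [name for (name,) in rows]
```

Calling `reporting_chain(demo_connection(), 3)` walks from Chen up through Ben to Asha. The recursion stops on its own once a row has no manager.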



Term	Definition
Schema	A logical container that groups related tables, views, and other objects within a database.
Table (Relation)	A two-dimensional dataset consisting of rows (tuples) and columns (attributes) that stores data records.
Column (Attribute)	A named field in a table defining a specific data attribute and its data type.
Row (Tuple)	A single record in a table, representing one instance of the data with values for each column.
Primary Key	A column or set of columns whose combined values uniquely identify each row within a table.
Foreign Key	A column or set of columns that enforces a link by referencing the primary key of another table.
Index	A data structure that allows for faster retrieval of records based on one or more columns.
View	A virtual table defined by a stored query, presenting data without storing it physically.
Trigger	A procedural construct that runs automatically in response to table events such as INSERT, UPDATE, or DELETE. It differs from a stored procedure, which is user-defined code (written in the database's procedural dialect, such as PL/SQL or T-SQL) that runs only when called explicitly.
Transaction	A sequence of one or more SQL operations treated as a single unit, ensuring ACID properties.
Constraint	A rule applied to table columns to enforce data integrity, such as NOT NULL, UNIQUE, or CHECK constraints.
Domain	A user-defined data type that specifies allowable values and constraints for a column.
ER Diagram	A graphical model that depicts entities (tables) and the relationships between them.
Normalization	The process of organizing tables to reduce redundancy and dependency through a series of normal forms.
Join	An operation that combines rows from two or more tables based on related column values.
Stored Procedure	A named block of SQL code stored in the database that a user or application invokes explicitly to perform operations.
Function	A database routine that returns a value and can be used within SQL expressions.
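
The difference between a trigger and an explicitly called routine is easiest to see in action. The sketch below uses Python's built-in sqlite3 module; the `accounts` and `audit_log` tables and the trigger name are hypothetical, and trigger syntax varies between engines (this is SQLite's form, not T-SQL).

```python
import sqlite3


def build_db():
    """Create an accounts table plus a trigger that audits balance changes."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance REAL NOT NULL);
        CREATE TABLE audit_log (account_id INTEGER, old_balance REAL, new_balance REAL);

        -- Fires automatically after every UPDATE that touches balance
        CREATE TRIGGER log_balance_change
        AFTER UPDATE OF balance ON accounts
        BEGIN
            INSERT INTO audit_log VALUES (OLD.id, OLD.balance, NEW.balance);
        END;

        INSERT INTO accounts VALUES (1, 500.0);
        """
    )
    return conn


def audit_rows(conn):
    """Return everything the trigger has written so far."""
    return conn.execute(
        "SELECT account_id, old_balance, new_balance FROM audit_log"
    ).fetchall()
```

Nothing in the application code calls the trigger. Running `UPDATE accounts SET balance = 400.0 WHERE id = 1` is enough for a row to appear in `audit_log`.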


Databases Used by Each App
App	Primary Database(s)
Google	Cloud Spanner for global, strongly consistent OLTP (e.g. Ads, Gmail) (Google Cloud, Wikipedia); Bigtable for massive low-latency wide-column data (e.g. Search indexes, Analytics) (Wikipedia)
Google Maps	Bigtable to store and serve map tile metadata at low latency and massive scale (Wikipedia)
Facebook	MySQL (InnoDB/MyRocks) for core social graph and user-profile data (Scaleyourapp); Cassandra for messaging and analytics workloads (DataStax Documentation)
Temu	MySQL for transactional order and inventory data (Lifewire); Redis/Memcached for caching hot product data†
Duolingo	PostgreSQL for user progress and course-state storage (Lifewire); Redis for session caching and rate-limiting†
ChatGPT	Custom vector-search engine (e.g. FAISS) + PostgreSQL/Redis for metadata and session state (Backlinko)
TikTok	MySQL (sharded via TiKV/Vitess) for user and video metadata (Backlinko); HBase for time-series and feed logs†
WhatsApp	SQLite on device for message history (Wikipedia); Erlang Mnesia for ephemeral server-side state†
Instagram	PostgreSQL for primary relational data (Backlinko); Cassandra for feed and notification streams†
YouTube	Bigtable for video-view statistics and analytics (DCLessons); Spanner for comment and metadata consistency†
Google Chrome	SQLite for local browsing history/bookmarks (embedded) (Wikipedia)
CapCut	SQLite (mobile) for local project caches (Medium); MongoDB for backend user-video metadata†
Max	MySQL for user subscriptions (Lifewire); Cassandra for streaming metrics†
Spotify	PostgreSQL for user playlists and metadata (Lifewire); Cassandra for play-history events†
Monopoly Go	DynamoDB for game-state and leaderboards (AWS) (Lifewire)
Audible	Oracle DB for catalog and transaction data (Lifewire); ElastiCache (Redis) for caching†
Candy Crush Saga	SQLite (mobile) for level data and progress (Medium); Amazon RDS (MySQL) for cross-device sync†
Amazon Shopping	DynamoDB for product/catalog lookups (Lifewire); Aurora (MySQL-compatible) for orders and user data†
Netflix	Cassandra for viewing history and recommendations (Lifewire); EVCache (Memcached) for edge caching†
Gmail	Cloud Spanner for mailbox OLTP (Wikipedia); Bigtable for search indexes and analytics†

Notes:
	• Most web-scale apps use a combination of a relational engine (MySQL, PostgreSQL, Spanner) for core transactional data and a NoSQL or wide-column store (Bigtable, Cassandra, DynamoDB, HBase) for high-throughput time-series, analytics, or feed data.
	• Client-side apps (Chrome, Candy Crush, CapCut) rely on embedded SQLite locally.
	• Exact internal architectures can vary by region and over time; the table above reflects the most cited public information.



AI-era databases for chatbots:

• Vector Databases: Specialized stores for high-dimensional embeddings (numerical representations of text, images, etc.) that support fast nearest-neighbor similarity searches, often approximate, which suits semantic search, recommendations, and anomaly detection.
• Pinecone: A fully managed, serverless vector-database service that auto-scales, provides real-time indexing/queries, and integrates metadata filtering, enabling production AI systems (RAG agents, personalization engines) without the overhead of managing custom similarity-search infrastructure.
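
The core operation of a vector database, nearest-neighbor search over embeddings, can be sketched in a few lines of plain Python. This is a brute-force illustration only: the three-dimensional vectors and their labels are invented, and it does not use Pinecone's API. Real systems store vectors with hundreds or thousands of dimensions and use approximate indexes instead of comparing against every item.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest(query, items, k=1):
    """Return the k labels whose vectors are most similar to the query."""
    ranked = sorted(
        items,
        key=lambda label: cosine_similarity(query, items[label]),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical embeddings; a real model would produce these from text.
EMBEDDINGS = {
    "reset password": [0.9, 0.1, 0.0],
    "refund policy": [0.1, 0.9, 0.1],
    "opening hours": [0.0, 0.2, 0.9],
}
```

A query vector such as `[0.8, 0.2, 0.1]` points in nearly the same direction as the "reset password" embedding, so that label ranks first.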

	
	
	
RELATIONAL DATABASE MANAGEMENT SYSTEM:
	
	Bank Branch Database Schema: Quick Overview
		init.sql:
		```sql
		-- Step 1: Create the branch-specific database if it doesn't exist
		IF DB_ID('Bank_Branch_Mumbai') IS NULL
		BEGIN
		    CREATE DATABASE Bank_Branch_Mumbai;
		END
		GO

		-- Step 2: Switch to that database
		USE Bank_Branch_Mumbai;
		GO

		-- Step 3: Create Customers table
		IF OBJECT_ID('dbo.Customers', 'U') IS NULL
		BEGIN
		    CREATE TABLE Customers (
		        CustomerID INT IDENTITY(1,1) PRIMARY KEY,
		        FullName VARCHAR(100),
		        Email VARCHAR(100) UNIQUE,
		        Phone VARCHAR(15),
		        Address TEXT,
		        CreatedAt DATETIME DEFAULT GETDATE()
		    );
		END
		GO

		-- Step 4: Create Accounts table
		IF OBJECT_ID('dbo.Accounts', 'U') IS NULL
		BEGIN
		    CREATE TABLE Accounts (
		        AccountID INT IDENTITY(1,1) PRIMARY KEY,
		        CustomerID INT FOREIGN KEY REFERENCES Customers(CustomerID),
		        AccountType VARCHAR(50),  -- e.g. Savings, Current, FD
		        Balance DECIMAL(15,2),
		        OpenedAt DATETIME DEFAULT GETDATE()
		    );
		END
		GO

		-- Step 5: Create Transactions table
		IF OBJECT_ID('dbo.Transactions', 'U') IS NULL
		BEGIN
		    CREATE TABLE Transactions (
		        TransactionID INT IDENTITY(1,1) PRIMARY KEY,
		        AccountID INT FOREIGN KEY REFERENCES Accounts(AccountID),
		        Amount DECIMAL(15,2),
		        TransactionType VARCHAR(50), -- e.g. Deposit, Withdrawal
		        Timestamp DATETIME DEFAULT GETDATE(),
		        Remarks TEXT
		    );
		END
		GO

		-- Step 6: Create Employees table
		IF OBJECT_ID('dbo.Employees', 'U') IS NULL
		BEGIN
		    CREATE TABLE Employees (
		        EmployeeID INT IDENTITY(1,1) PRIMARY KEY,
		        FullName VARCHAR(100),
		        Role VARCHAR(50),
		        Email VARCHAR(100) UNIQUE,
		        BranchCode VARCHAR(20),
		        HiredAt DATETIME DEFAULT GETDATE()
		    );
		END
		GO
		```
		
		
			▪ Creates a branch-specific database if it does not exist.
			▪ Creates tables: Customers, Accounts, Transactions, Employees, but only if they don't exist.
			▪ Uses IF OBJECT_ID(...) IS NULL checks to avoid recreating existing tables.
			▪ Tables store customer info, accounts, transactions, and branch employees.
			
			
		What Is a Schema?
			▪ A schema defines the structure of data in a database: tables, columns, relationships, constraints, indexes, etc.
			▪ Types:
				▪ Physical Schema: how data is stored on disk.
				▪ Logical Schema: tables, keys, constraints.
				▪ External Schema: user views and permissions.
			▪ 🧱 Rules of a Database Schema (the core rules in any relational database system):
		
			0. Relationships:
				Many to one
				One to many
				Many to many
				One to one
			1. Uniqueness of Table and Column Names
				• Each table name must be unique within a schema.
				• Column names in a table must also be unique.
			2. Defined Data Types
				• Every column must have a clearly defined data type (e.g., INT, VARCHAR(100), DATE).
				• This ensures data is stored and processed correctly.
			3. Primary Key Requirement
				• Every table should have a Primary Key to uniquely identify each row.
				• Primary key values cannot be NULL or duplicated.
					○ Ex: CustomerID INT PRIMARY KEY
			4. Foreign Key Integrity
				• Foreign key values must match an existing primary (or unique) key value in the referenced table.
				• This enforces referential integrity between related tables.
					○ Ex: FOREIGN KEY (CustomerID) REFERENCES Customers(CustomerID)
			5. Not Null Constraints
				• Columns that must always have a value should be marked as NOT NULL.
					○ Ex: Email VARCHAR(100) NOT NULL
			6. Unique Constraints
				• Prevent duplicate values in a column (like email or account number).
					○ Ex: Email VARCHAR(100) UNIQUE
			7. Check Constraints
				• Enforce logical rules on data values.
					○ Ex: CHECK (Balance >= 0)
			8. Default Values
				• Set a default value when none is provided.
					○ Ex: CreatedAt DATETIME DEFAULT GETDATE()
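
Rules 3 through 8 can be checked end to end with a small, hedged sketch. It uses Python's built-in sqlite3 module as a stand-in for SQL Server, so the types and the default expression differ from the T-SQL fragments above (CURRENT_TIMESTAMP in place of GETDATE()). The lowercase table names and the `is_rejected` helper are illustrative assumptions.

```python
import sqlite3


def build_db():
    """Create two tables that exercise PK, FK, NOT NULL, UNIQUE, CHECK and DEFAULT."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks FKs only when enabled
    conn.executescript(
        """
        CREATE TABLE customers (
            customer_id INTEGER PRIMARY KEY,
            email TEXT NOT NULL UNIQUE,
            created_at TEXT DEFAULT CURRENT_TIMESTAMP
        );
        CREATE TABLE accounts (
            account_id INTEGER PRIMARY KEY,
            customer_id INTEGER NOT NULL REFERENCES customers(customer_id),
            balance REAL NOT NULL CHECK (balance >= 0)
        );
        INSERT INTO customers (customer_id, email) VALUES (1, 'a@example.com');
        """
    )
    return conn


def is_rejected(conn, sql):
    """Return True if the database refuses the statement on integrity grounds."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True
```

With this setup, a duplicate email, a NULL email, an account pointing at a missing customer, and a negative balance are each refused by the database itself, so the application does not have to re-implement those checks.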
 
			9: Normalization Example
				❌ Bad Schema (Not Normalized)
				CREATE TABLE Orders (
				    OrderID INT PRIMARY KEY,
				    CustomerName VARCHAR(100),
				    CustomerEmail VARCHAR(100),
				    ProductName VARCHAR(100),
				    ProductPrice DECIMAL(10,2),
				    OrderDate DATE
				);

				🔍 Problem:
					○ Repeats customer and product info in every order.
					○ Duplicated data is hard to update (e.g., if a customer changes email, every one of their orders must change).
				
				
				✅ Normalized Schema
				Step 1: Split into multiple tables
				CREATE TABLE Customers (
				    CustomerID INT PRIMARY KEY,
				    Name VARCHAR(100),
				    Email VARCHAR(100)
				);

				CREATE TABLE Products (
				    ProductID INT PRIMARY KEY,
				    Name VARCHAR(100),
				    Price DECIMAL(10,2)
				);

				CREATE TABLE Orders (
				    OrderID INT PRIMARY KEY,
				    CustomerID INT,
				    ProductID INT,
				    OrderDate DATE,
				    FOREIGN KEY (CustomerID) REFERENCES Customers(CustomerID),
				    FOREIGN KEY (ProductID) REFERENCES Products(ProductID)
				);

				💡 Benefits:
				• No duplication
				• Easier updates
				• Scalable and clean design
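
The "easier updates" benefit can be demonstrated with a short sketch using Python's built-in sqlite3 module. The table layout follows the normalized schema above, with SQLite column types swapped in; the sample rows and function names are invented for illustration.

```python
import sqlite3


def build_shop():
    """Create the normalized tables and a customer with two orders."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE Customers (CustomerID INTEGER PRIMARY KEY, Name TEXT, Email TEXT);
        CREATE TABLE Products (ProductID INTEGER PRIMARY KEY, Name TEXT, Price REAL);
        CREATE TABLE Orders (
            OrderID INTEGER PRIMARY KEY,
            CustomerID INTEGER REFERENCES Customers(CustomerID),
            ProductID INTEGER REFERENCES Products(ProductID),
            OrderDate TEXT
        );
        INSERT INTO Customers VALUES (1, 'Asha', 'asha@old.example');
        INSERT INTO Products VALUES (10, 'Keyboard', 49.5);
        INSERT INTO Orders VALUES (100, 1, 10, '2024-01-05'), (101, 1, 10, '2024-02-09');
        """
    )
    return conn


def order_emails(conn):
    """Join each order to its customer's current email address."""
    return conn.execute(
        """
        SELECT o.OrderID, c.Email
        FROM Orders AS o
        JOIN Customers AS c ON c.CustomerID = o.CustomerID
        ORDER BY o.OrderID
        """
    ).fetchall()
```

Updating the single row in Customers changes the email seen by every order, because orders store only the CustomerID and pick up the rest through the join.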
				

			10: Indexing Example
				🧾 Sample Table
				CREATE TABLE Transactions (
				    TransactionID INT PRIMARY KEY,
				    AccountID INT,
				    Amount DECIMAL(10,2),
				    TransactionDate DATE
				);

				❌ Problem:
				If you often query a specific account's transactions:
				SELECT * FROM Transactions WHERE AccountID = 12345;

				Without an index on AccountID the database must scan every row, which gets slow on large tables.


				✅ Solution: Add an Index
				CREATE INDEX idx_account ON Transactions(AccountID);

				💡 Benefits:
				• Makes WHERE AccountID = ... queries much faster.
				• Does not change the stored data, but it uses extra storage and adds a small cost to every INSERT, UPDATE, and DELETE on the table.
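
One way to see the effect of the index is to ask the database for its query plan before and after creating it. The sketch below does this with Python's built-in sqlite3 module and SQLite's EXPLAIN QUERY PLAN; the row counts are arbitrary, and other engines expose plans differently (for example, execution plans in SQL Server).

```python
import sqlite3

QUERY = "SELECT * FROM Transactions WHERE AccountID = 7"


def build_db():
    """Create the sample table and fill it with 1000 synthetic rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE Transactions (
            TransactionID INTEGER PRIMARY KEY,
            AccountID INTEGER,
            Amount REAL,
            TransactionDate TEXT
        )
        """
    )
    conn.executemany(
        "INSERT INTO Transactions (AccountID, Amount, TransactionDate) VALUES (?, ?, ?)",
        [(i % 50, 10.0, "2024-01-01") for i in range(1000)],
    )
    return conn


def query_plan(conn, sql):
    """Return the human-readable detail column of SQLite's query plan."""
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]
```

Before the index exists the plan reports a full scan of the table; after `CREATE INDEX idx_account ON Transactions(AccountID)` it reports a search that uses `idx_account`.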


			
			▪ ✅ Optional but Good Practice:
			▪ Use a consistent naming convention (snake_case or PascalCase)
			▪ Add comments to explain table/column purposes
			▪ Use schemas (like dbo, hr, finance) to organize objects
			
			
		How Schemas Are Used
			▪ For designing, standardizing, controlling access, and ensuring data integrity.
			▪ Managed via SQL scripts, and evolved carefully through migrations rather than ad hoc changes.
			
			
		Can You Re-run init.sql to Change Schema?
			Scenario	Data Loss?	Explanation
			Re-run with existence checks (IF OBJECT_ID(...) IS NULL)	No	Existing tables stay; no data lost
			Add new tables	No	New tables created
			Add columns in CREATE TABLE	No	Existing tables are skipped, so the new columns are not added
			Use DROP TABLE + CREATE	Yes	Data lost due to table recreation
			Use ALTER TABLE for changes	Usually no	Adding columns keeps existing rows; dropping columns or narrowing types can still lose data
			
			
		Best Practice for Schema Changes
			▪ Do NOT re-run full create scripts for schema changes.
			▪ Use migration scripts with ALTER TABLE to add or modify columns safely.
			▪ Example that adds a column:
			ALTER TABLE Customers ADD DateOfBirth DATE NULL;
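
The sketch below shows both halves of this advice using Python's built-in sqlite3 module: re-running a guarded create script leaves existing rows alone, and a small migration adds the new column without touching data. SQLite's `CREATE TABLE IF NOT EXISTS` stands in for the `IF OBJECT_ID(...) IS NULL` check, and the helper names are hypothetical.

```python
import sqlite3

# Stand-in for init.sql: safe to run repeatedly because of IF NOT EXISTS.
INIT_SQL = """
CREATE TABLE IF NOT EXISTS Customers (
    CustomerID INTEGER PRIMARY KEY,
    FullName TEXT
);
"""


def fresh_db():
    """Run the init script once against an empty in-memory database."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(INIT_SQL)
    return conn


def column_names(conn, table):
    """List a table's column names in declaration order."""
    return [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]


def migrate_add_date_of_birth(conn):
    """Add DateOfBirth only if it is missing, so the migration can be re-run."""
    if "DateOfBirth" not in column_names(conn, "Customers"):
        conn.execute("ALTER TABLE Customers ADD COLUMN DateOfBirth TEXT")
```

Existing rows survive both steps and simply get NULL in the new column. Migration tools typically track which scripts have already run; the column check here is a simplified substitute for that bookkeeping.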
 
			
		Summary
			▪ init.sql sets up the schema for a new branch, creating only the missing tables.
			▪ Changing the schema after deployment requires migrations (ALTER TABLE), not full reinitialization.
			▪ Avoid destructive operations like dropping tables unless you have backed up the data.
	
	🧱 Instance
		a. An instance is the actual data held in the database at a given moment.
		b. It changes whenever an INSERT, UPDATE, or DELETE happens.
		c. It's stored physically on disk inside database tables.
		d. Each time you query, you see a snapshot of the current instance.
		e. Used in: Every real-time DB operation; essential in transactions and backups.
		
		
	🔍 View
		a. A view is a virtual table defined by a SELECT query.
		b. It shows a filtered or joined version of existing data.
		c. Views don't store data; they pull from the real tables dynamically.
		d. Changes in the underlying tables are reflected immediately in the view.
		e. Used in: Reporting, simplifying complex queries, enforcing data access control.
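
A short sketch makes the "virtual table" idea tangible. It uses Python's built-in sqlite3 module; the `SavingsAccounts` view and the sample rows are made up, loosely following the Accounts table from the bank schema.

```python
import sqlite3


def build_db():
    """Create an Accounts table and a view that exposes only savings accounts."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE Accounts (AccountID INTEGER PRIMARY KEY, AccountType TEXT, Balance REAL);
        INSERT INTO Accounts VALUES (1, 'Savings', 1200.0), (2, 'Current', 300.0);

        -- The view stores only this query, not a copy of the rows
        CREATE VIEW SavingsAccounts AS
            SELECT AccountID, Balance FROM Accounts WHERE AccountType = 'Savings';
        """
    )
    return conn


def savings(conn):
    """Read through the view exactly as if it were a table."""
    return conn.execute(
        "SELECT AccountID, Balance FROM SavingsAccounts ORDER BY AccountID"
    ).fetchall()
```

Inserting a new savings account into Accounts makes it show up in `savings(conn)` straight away, with no refresh step, because the view re-runs its query each time it is read.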
	Transaction:
		BEGIN TRANSACTION;

		-- Step 1: Withdraw 100 from Account 1
		UPDATE Accounts
		SET Balance = Balance - 100
		WHERE AccountID = 1;

		-- Step 2: Deposit 100 to Account 2
		UPDATE Accounts
		SET Balance = Balance + 100
		WHERE AccountID = 2;

		-- Step 3: Insert transaction records for audit
		INSERT INTO Transactions (AccountID, Amount, TransactionType, Remarks)
		VALUES (1, -100, 'Withdrawal', 'Transfer to Account 2');

		INSERT INTO Transactions (AccountID, Amount, TransactionType, Remarks)
		VALUES (2, 100, 'Deposit', 'Transfer from Account 1');

		-- If all of the above succeed, commit the changes
		COMMIT TRANSACTION;

		What this does:
			▪ Uses a transaction block to group multiple operations.
			▪ Ensures atomicity: either all updates happen, or none do if an error occurs.
			▪ Keeps the database consistent and isolated during the transfer.
			▪ If anything fails, you can ROLLBACK to undo partial changes.
			This kind of transaction happens after the schema and data exist, during normal database usage.
		Transactions ensure multiple related database changes succeed or fail together, maintaining data integrity.
		They're used in both manual and automated processes (e.g., triggers, application code) after schema setup to keep data consistent.
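
The rollback path is the part the script above does not show, so here is a hedged sketch of it using Python's built-in sqlite3 module. The CHECK constraint on Balance, the starting balances, and the `transfer` helper are assumptions added for the demonstration. The deposit deliberately runs first, so that a failed withdrawal leaves a partial change for the rollback to undo.

```python
import sqlite3


def build_bank():
    """Create two accounts; the CHECK constraint forbids overdrafts."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE Accounts (
            AccountID INTEGER PRIMARY KEY,
            Balance REAL NOT NULL CHECK (Balance >= 0)
        );
        INSERT INTO Accounts VALUES (1, 150.0), (2, 20.0);
        """
    )
    return conn


def transfer(conn, source, target, amount):
    """Move amount between accounts; True on commit, False on rollback."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE Accounts SET Balance = Balance + ? WHERE AccountID = ?",
                (amount, target),
            )
            conn.execute(
                "UPDATE Accounts SET Balance = Balance - ? WHERE AccountID = ?",
                (amount, source),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances(conn):
    """Return (AccountID, Balance) pairs in ID order."""
    return conn.execute(
        "SELECT AccountID, Balance FROM Accounts ORDER BY AccountID"
    ).fetchall()
```

A first transfer of 100 from account 1 to account 2 commits. A second one would overdraw account 1, so the withdrawal violates the CHECK constraint and the already-applied deposit is rolled back with it.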
		










1. Global Infrastructure
2. Cloud Architecture
3. Management and Developers Tools
4. Shared Responsibility Model
5. Compute:
	
	
6. Storage:
7. 


8. Databases:
	1. SQL
	2. NoSQL databases, like MongoDB, Cassandra, and Redis, use a variety of data models, including key-value, document, column-family, and graph.
	3. Key-value database
	4. Data warehouse vs data lake vs data mart:
		A data lake is a dumping ground for all types of data, structured and unstructured.
		A database holds structured data, mostly in the form of tables.
		A data warehouse holds well-cleaned, transformed data for business analytics.
		A data mart is a domain-specific data warehouse.
	5. OLAP (Online Analytical Processing) handles complex data analysis driven by user queries, while OLTP (Online Transaction Processing) handles large volumes of transactions as the backend system of record. They are two different types of database systems that serve distinct purposes in data science and database management.
	6.
		
	
	
	
	

	

9. Networking:
	Cloud native networking services
	Enterprise/hybrid networking
	Virtual private cloud and subnets
	Security groups vs NACLs.
	
	

10. EC2 (Elastic Compute Cloud)
11. EC2 Pricing Models
12. Identity
13. Application Integration
14. Containers
15. Governance
16. Provisioning

17. Serverless:
Serverless Computing Definition: Serverless computing is a cloud computing model where developers build and run applications without managing underlying infrastructure.

Key Characteristics:
1. Event-Driven
2. Automatic Scaling
3. Pay-Per-Use Billing
4. Stateless Functions
5. Short-Lived
6. Abstraction of Infrastructure
7. Multi-Tenancy
8. Easy Deployment
	
18. Windows on AWS
19. Logging
20. ML (Machine Learning):

21. AI (Artificial Intelligence):

22. Big Data:
	

	

	
23. AWS Well Architected Framework
24. TCO (Total Cost of Ownership) and Migration
25. Billing and Pricing
26. Security:

# Database & Cloud Computing Reference Guide

## 1. Database Types & Categories

### Relational Databases (RDBMS)
- **MySQL** - Popular open-source, web applications
- **PostgreSQL** - Advanced features, highly extensible
- **Oracle DB** - Enterprise-grade, PL/SQL
- **SQL Server** - Microsoft ecosystem, T-SQL
- **SQLite** - Lightweight, embedded, local apps

### NoSQL Databases
- **Document**: MongoDB, CouchDB
- **Key-Value**: Redis, DynamoDB, Memcached
- **Wide-Column**: Cassandra, HBase
- **Graph**: Neo4j, Amazon Neptune

### Specialized Databases
- **Time-Series**: InfluxDB, TimescaleDB
- **In-Memory**: Redis, Memcached
- **Vector**: Pinecone (managed service), FAISS (similarity-search library) (for AI/ML)
- **Search**: Elasticsearch, Solr

## 2. Database Terminology

### Core Concepts
- **Schema** - Logical structure defining tables, columns, relationships
- **Instance** - Actual data in database at a specific moment
- **View** - Virtual table created from SELECT query
- **Index** - Data structure for faster record retrieval
- **Constraint** - Rules enforcing data integrity
- **Domain** - User-defined data type with allowable values

### Table Structure
- **Table (Relation)** - Two-dimensional dataset with rows/columns
- **Row (Tuple)** - Single record instance
- **Column (Attribute)** - Named field with specific data type
- **Primary Key** - Unique identifier for each row
- **Foreign Key** - Reference to primary key in another table

### Advanced Features
- **Trigger** - Automatic procedure on table events
- **Stored Procedure** - User-defined precompiled code block
- **Function** - Database routine that returns a value and can be used in SQL expressions
- **Transaction** - Sequence of operations as single unit (ACID)
- **Join** - Combine rows from multiple tables
- **CTE** - Common Table Expression for complex queries
- **Normalization** - Organizing data to reduce redundancy

## 3. Database Design Principles

### ACID Properties
- **Atomicity** - All or nothing execution
- **Consistency** - Data integrity maintained
- **Isolation** - Concurrent transactions don't interfere
- **Durability** - Committed changes persist

### Relationships
- **One-to-One** - Single record relates to single record
- **One-to-Many** - Single record relates to multiple records
- **Many-to-Many** - Multiple records relate to multiple records
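
A many-to-many relationship cannot be stored with a single foreign key; the usual approach is a junction table holding one row per pair. The sketch below illustrates this with Python's built-in `sqlite3` module. The students and courses domain, the table names, and the helper functions are hypothetical.

```python
import sqlite3


def build_school():
    """Create students, courses, and the junction table linking them."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE students (student_id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE courses (course_id INTEGER PRIMARY KEY, title TEXT);
        -- Junction table: one row per (student, course) pair
        CREATE TABLE enrollments (
            student_id INTEGER REFERENCES students(student_id),
            course_id INTEGER REFERENCES courses(course_id),
            PRIMARY KEY (student_id, course_id)
        );
        INSERT INTO students VALUES (1, 'Asha'), (2, 'Ben');
        INSERT INTO courses VALUES (10, 'SQL'), (20, 'Networks');
        INSERT INTO enrollments VALUES (1, 10), (1, 20), (2, 10);
        """
    )
    return conn


def courses_for(conn, student_name):
    """Titles of every course a student is enrolled in."""
    rows = conn.execute(
        """
        SELECT c.title FROM courses AS c
        JOIN enrollments AS e ON e.course_id = c.course_id
        JOIN students AS s ON s.student_id = e.student_id
        WHERE s.name = ? ORDER BY c.title
        """,
        (student_name,),
    ).fetchall()
    return [title for (title,) in rows]


def students_in(conn, course_title):
    """Names of every student enrolled in a course."""
    rows = conn.execute(
        """
        SELECT s.name FROM students AS s
        JOIN enrollments AS e ON e.student_id = s.student_id
        JOIN courses AS c ON c.course_id = e.course_id
        WHERE c.title = ? ORDER BY s.name
        """,
        (course_title,),
    ).fetchall()
    return [name for (name,) in rows]
```

The same junction table answers the question in both directions. Its composite primary key also prevents enrolling a student in the same course twice.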

### Schema Rules
1. Unique table/column names
2. Defined data types for all columns
3. Primary key requirement
4. Foreign key integrity
5. NOT NULL constraints
6. UNIQUE constraints
7. CHECK constraints
8. Default values
9. Normalization
10. Proper indexing

## 4. Data Processing Types

### OLTP vs OLAP
- **OLTP** (Online Transaction Processing) - Real-time transactions, INSERT/UPDATE/DELETE
- **OLAP** (Online Analytical Processing) - Complex analysis, reporting, data mining
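
The two workload shapes can be contrasted on a single toy table. This is only an illustration of query patterns using Python's built-in `sqlite3` module; the `sales` table and function names are invented, and a real OLAP system would normally run on a separate, analytics-oriented store.

```python
import sqlite3


def build_sales():
    """Create an empty sales table in an in-memory database."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE sales (sale_id INTEGER PRIMARY KEY, region TEXT, amount REAL)"
    )
    return conn


def record_sale(conn, region, amount):
    """OLTP-style operation: a short transaction that writes one row."""
    with conn:
        conn.execute(
            "INSERT INTO sales (region, amount) VALUES (?, ?)", (region, amount)
        )


def revenue_by_region(conn):
    """OLAP-style operation: a read that scans and aggregates many rows."""
    return conn.execute(
        "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
    ).fetchall()
```

`record_sale` is called many times with tiny payloads, while `revenue_by_region` runs rarely but touches the whole table, which is why the two workloads are often served by differently tuned systems.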

### Data Storage Architectures
- **Database** - Structured data in tables
- **Data Lake** - Raw structured/unstructured data repository
- **Data Warehouse** - Cleaned, transformed data for analytics
- **Data Mart** - Domain-specific subset of data warehouse

## 5. Popular App Database Choices

### Web Scale Apps
- **Google**: Cloud Spanner, Bigtable
- **Facebook**: MySQL (InnoDB/MyRocks), Cassandra
- **Netflix**: Cassandra, EVCache (Memcached)
- **Amazon**: DynamoDB, Aurora (MySQL-compatible)
- **Instagram**: PostgreSQL, Cassandra

### Mobile Apps
- **WhatsApp**: SQLite (device), Erlang Mnesia (server)
- **Chrome**: SQLite (local storage)
- **Candy Crush**: SQLite (mobile), Amazon RDS (sync)

## 6. SQL Variants & Tools

### SQL Dialects
- **T-SQL** - SQL Server (Microsoft)
- **PL/SQL** - Oracle Database
- **PL/pgSQL** - PostgreSQL
- **Standard SQL** - ANSI/ISO standard

### ORM Tools
- **SQLAlchemy** (Python)
- **Hibernate** (Java)
- **Django ORM** (Python)
- **Entity Framework** (.NET)

## 7. Cloud Computing Categories

### Compute Services
- **Virtual Machines** - EC2, Azure VMs, Google Compute Engine
- **Containers** - Docker, Kubernetes, AWS ECS/EKS
- **Serverless** - AWS Lambda, Azure Functions, Google Cloud Functions

### Storage Services
- **Object Storage** - S3, Azure Blob, Google Cloud Storage
- **Block Storage** - EBS, Azure Disk, Persistent Disk
- **File Storage** - EFS, Azure Files, Cloud Filestore

### Database Services
- **Managed SQL** - RDS, Azure SQL, Cloud SQL
- **NoSQL** - DynamoDB, Cosmos DB, Firestore
- **Data Warehouse** - Redshift, Synapse, BigQuery

### Networking
- **VPC** - Virtual Private Cloud
- **Subnets** - Network segmentation
- **Security Groups** - Instance-level firewall
- **NACLs** - Network Access Control Lists (subnet-level)
- **Load Balancers** - Traffic distribution

## 8. Serverless Computing

### Key Characteristics
- **Event-Driven** - Triggered by events
- **Auto-Scaling** - Scales automatically with demand
- **Pay-Per-Use** - Billed only for execution time
- **Stateless** - No persistent server state
- **Short-Lived** - Functions run for limited time
- **Infrastructure Abstraction** - No server management
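
As a rough sketch of what "event-driven" and "stateless" mean in code, the function below takes everything it needs from its input and keeps nothing between calls. The `(event, context)` signature follows the convention used by AWS Lambda's Python handlers; the shape of the event and the response dictionary are assumptions for this example.

```python
import json


def handler(event, context):
    """Build a greeting from the incoming event; no state survives the call."""
    name = event.get("name", "world")
    return {
        "statusCode": 200,
        "body": json.dumps({"message": f"Hello, {name}!"}),
    }
```

Because the function holds no state, the platform can run any number of copies in parallel or none at all. Anything that must persist belongs in an external store such as a database or object storage.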

### Use Cases
- API backends
- Data processing
- Event handling
- Microservices
- Real-time file processing

## 9. AI/ML Database Technologies

### Vector Databases
- **Purpose** - Store high-dimensional embeddings
- **Use Cases** - Semantic search, recommendations, RAG systems
- **Examples** - Pinecone, Weaviate, Chroma, FAISS

### ML Data Pipeline
- **Feature Store** - Centralized repository for ML features
- **Model Registry** - Version control for ML models
- **Data Versioning** - Track dataset changes
- **MLOps** - DevOps practices for ML workflows

## 10. Performance & Scaling Concepts

### Database Performance
- **Indexing** - B-tree, Hash, Bitmap indexes
- **Query Optimization** - Cost-based optimizers
- **Caching** - Redis, Memcached
- **Connection Pooling** - Manage database connections

### Scaling Strategies
- **Vertical Scaling** - Increase server resources
- **Horizontal Scaling** - Add more servers
- **Sharding** - Distribute data across databases
- **Replication** - Primary-replica (master-slave) or multi-primary (master-master) copies of the data
- **Load Balancing** - Distribute traffic
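
Sharding needs a rule that sends each key to exactly one database. A minimal sketch of hash-based routing follows; the key format, shard counts, and function names are hypothetical. Simple modulo hashing like this reshuffles most keys whenever the shard count changes, which is why production systems often use consistent hashing or a lookup directory instead.

```python
import hashlib


def shard_for(key, shard_count):
    """Map a key to a shard index using a stable hash of the key."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


def group_by_shard(keys, shard_count):
    """Bucket keys by the shard they route to."""
    shards = {index: [] for index in range(shard_count)}
    for key in keys:
        shards[shard_for(key, shard_count)].append(key)
    return shards
```

A cryptographic hash is used here instead of Python's built-in `hash()` because the latter is randomized per process for strings, and a routing rule has to give the same answer on every server.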

## 11. Data Security & Governance

### Security Measures
- **Encryption** - At rest and in transit
- **Access Control** - Role-based permissions
- **Audit Logging** - Track database activities
- **Backup & Recovery** - Data protection strategies

### Compliance
- **GDPR** - European data protection
- **HIPAA** - Healthcare data privacy
- **SOX** - Financial data integrity
- **Data Lineage** - Track data flow and transformations