# Database & Cloud Computing Reference Guide

## 1. Database Types & Categories

### Relational Databases (RDBMS)
- **MySQL** - Popular open-source, web applications
- **PostgreSQL** - Advanced features, highly extensible
- **Oracle DB** - Enterprise-grade, PL/SQL
- **SQL Server** - Microsoft ecosystem, T-SQL
- **SQLite** - Lightweight, embedded, local apps

### NoSQL Databases
- **Document**: MongoDB, CouchDB
- **Key-Value**: Redis, DynamoDB, Memcached
- **Wide-Column**: Cassandra, HBase
- **Graph**: Neo4j, Amazon Neptune

### Specialized Databases
- **Time-Series**: InfluxDB, TimescaleDB
- **In-Memory**: Redis, Memcached
- **Vector**: Pinecone (managed service), FAISS (similarity-search library), both used for AI/ML embeddings
- **Search**: Elasticsearch, Solr

## 2. Database Terminology

### Core Concepts
- **Schema** - Logical structure defining tables, columns, relationships
- **Instance** - Actual data in database at a specific moment
- **View** - Virtual table created from SELECT query
- **Index** - Data structure for faster record retrieval
- **Constraint** - Rules enforcing data integrity
- **Domain** - Set of allowable values for a column; some systems let users define one as a named data type
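
A minimal sketch of a view, an index, and a constraint, using Python's built-in `sqlite3` module. The `employees` table, the `engineering` view, and the sample rows are hypothetical names chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (
        id     INTEGER PRIMARY KEY,
        name   TEXT NOT NULL,
        dept   TEXT NOT NULL,
        salary REAL CHECK (salary >= 0)                    -- constraint
    );
    CREATE INDEX idx_employees_dept ON employees (dept);   -- index
    CREATE VIEW engineering AS                             -- view
        SELECT id, name FROM employees WHERE dept = 'eng';
""")
conn.executemany(
    "INSERT INTO employees (name, dept, salary) VALUES (?, ?, ?)",
    [("Ada", "eng", 120.0), ("Bob", "ops", 90.0)],
)

# The view is queried like a table but stores no rows of its own.
view_rows = conn.execute("SELECT name FROM engineering").fetchall()

# The CHECK constraint rejects a negative salary.
try:
    conn.execute(
        "INSERT INTO employees (name, dept, salary) VALUES ('Eve', 'eng', -1)"
    )
    constraint_rejected = False
except sqlite3.IntegrityError:
    constraint_rejected = True
```

Here `view_rows` contains only the engineering employee, and the final insert fails because it violates the `CHECK` rule.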

### Table Structure
- **Table (Relation)** - Two-dimensional dataset with rows/columns
- **Row (Tuple)** - Single record instance
- **Column (Attribute)** - Named field with specific data type
- **Primary Key** - Column (or set of columns) that uniquely identifies each row; cannot be NULL
- **Foreign Key** - Column (or set of columns) referencing a primary or unique key, usually in another table
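
The key terms above can be sketched with `sqlite3`. The `authors` and `books` tables are hypothetical. Note that SQLite only enforces foreign keys after `PRAGMA foreign_keys = ON`; other engines typically enforce them by default.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK enforcement off by default

conn.execute("CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute("""
    CREATE TABLE books (
        id        INTEGER PRIMARY KEY,
        title     TEXT NOT NULL,
        author_id INTEGER NOT NULL REFERENCES authors (id)
    )
""")
conn.execute("INSERT INTO authors (id, name) VALUES (1, 'Codd')")
conn.execute("INSERT INTO books (title, author_id) VALUES ('Relational Model', 1)")

# A row pointing at a non-existent author violates the foreign key.
try:
    conn.execute("INSERT INTO books (title, author_id) VALUES ('Orphan', 99)")
    fk_enforced = False
except sqlite3.IntegrityError:
    fk_enforced = True

book_count = conn.execute("SELECT COUNT(*) FROM books").fetchone()[0]
```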

### Advanced Features
- **Trigger** - Automatic procedure on table events
- **Stored Procedure** - Named block of procedural code stored and executed inside the database
- **Function** - Database routine that returns a value (a scalar, or a table in some systems)
- **Transaction** - Sequence of operations as single unit (ACID)
- **Join** - Combine rows from multiple tables
- **CTE** - Common Table Expression: a named temporary result set defined with `WITH` and scoped to a single query
- **Normalization** - Organizing data to reduce redundancy
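
A short sketch combining a trigger, a CTE, and a join in SQLite. The table names (`customers`, `orders`, `order_log`) and the trigger name are assumptions made for the example; trigger syntax differs between database engines.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL);
    CREATE TABLE order_log (order_id INTEGER, note TEXT);

    -- Trigger: runs automatically for each row inserted into orders.
    CREATE TRIGGER log_order AFTER INSERT ON orders
    BEGIN
        INSERT INTO order_log (order_id, note) VALUES (NEW.id, 'created');
    END;

    INSERT INTO customers VALUES (1, 'Ada'), (2, 'Bob');
    INSERT INTO orders (customer_id, total) VALUES (1, 30.0), (1, 20.0), (2, 5.0);
""")

# The CTE (WITH ...) computes per-customer spend; the JOIN attaches names.
totals = conn.execute("""
    WITH spend AS (
        SELECT customer_id, SUM(total) AS spent
        FROM orders
        GROUP BY customer_id
    )
    SELECT c.name, s.spent
    FROM customers AS c
    JOIN spend AS s ON s.customer_id = c.id
    ORDER BY s.spent DESC
""").fetchall()

log_count = conn.execute("SELECT COUNT(*) FROM order_log").fetchone()[0]
```

The trigger writes one `order_log` row per inserted order without any application code calling it.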

## 3. Database Design Principles

### ACID Properties
- **Atomicity** - All or nothing execution
- **Consistency** - Each transaction moves the database from one valid state to another (all constraints hold)
- **Isolation** - Concurrent transactions don't interfere
- **Durability** - Committed changes persist
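
Atomicity can be illustrated with a funds transfer that either applies both updates or neither. This is a sketch with a hypothetical `accounts` table and `transfer` helper; it relies on `sqlite3`'s connection context manager, which commits on success and rolls back when an exception escapes the block.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 0)])
conn.commit()

def transfer(conn, src, dst, amount):
    """Move funds in one transaction; return True if it committed."""
    try:
        with conn:  # commits on success, rolls back on exception
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?", (amount, dst)
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?", (amount, src)
            )
        return True
    except sqlite3.IntegrityError:
        return False

def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM accounts"))

ok = transfer(conn, "alice", "bob", 30)
failed = transfer(conn, "alice", "bob", 500)  # would overdraw alice
final = balances(conn)
```

In the second call the credit to `bob` succeeds first, but the debit violates the `CHECK` constraint, so the whole transaction is rolled back and the credit disappears with it.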

### Relationships
- **One-to-One** - Single record relates to single record
- **One-to-Many** - Single record relates to multiple records
- **Many-to-Many** - Multiple records relate to multiple records
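
A many-to-many relationship is commonly modeled with a junction table holding one row per pair. The sketch below assumes hypothetical `students`, `courses`, and `enrollments` tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE students (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE courses  (id INTEGER PRIMARY KEY, title TEXT NOT NULL);

    -- Junction table: one row per (student, course) pair.
    CREATE TABLE enrollments (
        student_id INTEGER NOT NULL REFERENCES students (id),
        course_id  INTEGER NOT NULL REFERENCES courses (id),
        PRIMARY KEY (student_id, course_id)
    );

    INSERT INTO students VALUES (1, 'Ada'), (2, 'Bob');
    INSERT INTO courses  VALUES (10, 'Databases'), (20, 'Networks');
    INSERT INTO enrollments VALUES (1, 10), (1, 20), (2, 10);
""")

def courses_for(student_name):
    """Return the course titles a student is enrolled in, sorted by title."""
    rows = conn.execute("""
        SELECT c.title
        FROM students AS s
        JOIN enrollments AS e ON e.student_id = s.id
        JOIN courses AS c ON c.id = e.course_id
        WHERE s.name = ?
        ORDER BY c.title
    """, (student_name,))
    return [title for (title,) in rows]

ada_courses = courses_for("Ada")
bob_courses = courses_for("Bob")
```

The composite primary key on `enrollments` also prevents the same student from being enrolled in the same course twice.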

### Schema Rules
1. Unique table/column names
2. Defined data types for all columns
3. Primary key requirement
4. Foreign key integrity
5. NOT NULL constraints
6. UNIQUE constraints
7. CHECK constraints
8. Default values
9. Normalization
10. Proper indexing
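
Several of these rules map directly onto column definitions. The following sketch uses a hypothetical `users` table and `try_insert` helper to show `NOT NULL`, `UNIQUE`, `CHECK`, and `DEFAULT` in SQLite; the exact error messages vary by engine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE users (
        id     INTEGER PRIMARY KEY,
        email  TEXT NOT NULL UNIQUE,
        age    INTEGER CHECK (age >= 0),
        status TEXT NOT NULL DEFAULT 'active'
    )
""")

def try_insert(email, age):
    """Attempt an insert and report whether the schema accepted it."""
    try:
        with conn:
            conn.execute("INSERT INTO users (email, age) VALUES (?, ?)", (email, age))
        return "ok"
    except sqlite3.IntegrityError as exc:
        return f"rejected: {exc}"

results = [
    try_insert("a@example.com", 30),  # valid row
    try_insert("a@example.com", 41),  # duplicate email violates UNIQUE
    try_insert(None, 20),             # missing email violates NOT NULL
    try_insert("b@example.com", -5),  # negative age violates CHECK
]

# status was never supplied, so the DEFAULT applies.
default_status = conn.execute(
    "SELECT status FROM users WHERE email = 'a@example.com'"
).fetchone()[0]
```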

## 4. Data Processing Types

### OLTP vs OLAP
- **OLTP** (Online Transaction Processing) - Real-time transactions, INSERT/UPDATE/DELETE
- **OLAP** (Online Analytical Processing) - Complex analysis, reporting, data mining
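
The difference in access pattern can be sketched on a single toy table. Real OLAP systems usually run on separate, column-oriented stores; the `sales` table here is hypothetical and only contrasts the two query shapes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (id INTEGER PRIMARY KEY, region TEXT, amount REAL)")

# OLTP-style: many small transactions, each touching a single row.
for region, amount in [("eu", 10.0), ("us", 25.0), ("eu", 15.0), ("us", 5.0)]:
    with conn:
        conn.execute(
            "INSERT INTO sales (region, amount) VALUES (?, ?)", (region, amount)
        )

# OLAP-style: one read-only query that scans and aggregates many rows.
report = conn.execute("""
    SELECT region, COUNT(*) AS orders, SUM(amount) AS revenue
    FROM sales
    GROUP BY region
    ORDER BY region
""").fetchall()
```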

### Data Storage Architectures
- **Database** - Structured data in tables
- **Data Lake** - Raw structured/unstructured data repository
- **Data Warehouse** - Cleaned, transformed data for analytics
- **Data Mart** - Domain-specific subset of data warehouse

## 5. Popular App Database Choices

### Web Scale Apps
- **Google**: Cloud Spanner, Bigtable
- **Facebook**: MySQL (InnoDB/MyRocks), Cassandra
- **Netflix**: Cassandra, EVCache (Memcached)
- **Amazon**: DynamoDB, Aurora (MySQL-compatible)
- **Instagram**: PostgreSQL, Cassandra

### Mobile Apps
- **WhatsApp**: SQLite (device), Erlang Mnesia (server)
- **Chrome**: SQLite (local profile data such as history and cookies)
- **Candy Crush**: SQLite (mobile), Amazon RDS (sync)

## 6. SQL Variants & Tools

### SQL Dialects
- **T-SQL** - SQL Server (Microsoft)
- **PL/SQL** - Oracle Database
- **PL/pgSQL** - PostgreSQL
- **Standard SQL** - ANSI/ISO standard

### ORM Tools
- **SQLAlchemy** (Python)
- **Hibernate** (Java)
- **Django ORM** (Python)
- **Entity Framework** (.NET)

## 7. Cloud Computing Categories

### Compute Services
- **Virtual Machines** - EC2, Azure VMs, Google Compute Engine
- **Containers** - Docker, Kubernetes, AWS ECS/EKS
- **Serverless** - AWS Lambda, Azure Functions, Google Cloud Functions

### Storage Services
- **Object Storage** - S3, Azure Blob, Google Cloud Storage
- **Block Storage** - EBS, Azure Disk, Persistent Disk
- **File Storage** - EFS, Azure Files, Cloud Filestore

### Database Services
- **Managed SQL** - RDS, Azure SQL, Cloud SQL
- **NoSQL** - DynamoDB, Cosmos DB, Firestore
- **Data Warehouse** - Redshift, Synapse, BigQuery

### Networking
- **VPC** - Virtual Private Cloud
- **Subnets** - Network segmentation
- **Security Groups** - Instance-level firewall (stateful in AWS)
- **NACLs** - Network Access Control Lists (subnet-level, stateless in AWS)
- **Load Balancers** - Traffic distribution
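
How a VPC address range is carved into subnets can be sketched with Python's standard `ipaddress` module. The CIDR block `10.0.0.0/16` and the host address are assumptions chosen for illustration; no cloud API is involved.

```python
import ipaddress

# Hypothetical VPC address range.
vpc = ipaddress.ip_network("10.0.0.0/16")

# Split the VPC into /24 subnets and keep the first three.
subnets = list(vpc.subnets(new_prefix=24))[:3]

# Find which subnet a given host address belongs to.
host = ipaddress.ip_address("10.0.1.25")
owner = next(subnet for subnet in subnets if host in subnet)

subnet_names = [str(subnet) for subnet in subnets]
```

A `/16` block split into `/24` subnets yields 256 subnets of 256 addresses each; cloud providers typically reserve a few addresses per subnet, so the usable count is slightly lower.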

## 8. Serverless Computing

### Key Characteristics
- **Event-Driven** - Triggered by events
- **Auto-Scaling** - Scales automatically with demand
- **Pay-Per-Use** - Billed for invocations and execution time, not idle capacity
- **Stateless** - No state kept between invocations; persistent data lives in external storage
- **Short-Lived** - Each execution is capped by a platform timeout
- **Infrastructure Abstraction** - No server management
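
A minimal sketch of an event-driven, stateless function. The `handler(event, context)` shape follows the convention used by AWS Lambda's Python runtime; the event field `name` and the response layout are assumptions for illustration, and other platforms use different signatures.

```python
import json

def handler(event, context):
    """Stateless function: everything it needs arrives in `event`."""
    name = event.get("name", "world")
    return {
        "statusCode": 200,
        "body": json.dumps({"message": f"Hello, {name}!"}),
    }

# Local invocation, standing in for the platform delivering an event.
response = handler({"name": "Ada"}, None)
```

Because the function keeps nothing between calls, the platform is free to run any number of copies in parallel or to stop them all when no events arrive.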

### Use Cases
- API backends
- Data processing
- Event handling
- Microservices
- Real-time file processing

## 9. AI/ML Database Technologies

### Vector Databases
- **Purpose** - Store high-dimensional embeddings
- **Use Cases** - Semantic search, recommendations, RAG systems
- **Examples** - Pinecone, Weaviate, Chroma; FAISS (a similarity-search library rather than a full database)
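
The core operation behind semantic search is nearest-neighbor lookup over embeddings. The sketch below does a brute-force cosine-similarity search over three made-up 3-dimensional vectors; real systems use learned embeddings with hundreds of dimensions and approximate indexes instead of a full scan.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy "embeddings"; the values are invented for illustration.
index = {
    "cat":    [0.9, 0.1, 0.0],
    "kitten": [0.8, 0.2, 0.1],
    "car":    [0.0, 0.1, 0.9],
}

def search(query, k=2):
    """Return the k stored items most similar to the query vector."""
    scored = sorted(
        index.items(),
        key=lambda item: cosine_similarity(query, item[1]),
        reverse=True,
    )
    return [name for name, _ in scored[:k]]

nearest = search([1.0, 0.0, 0.0])
```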

### ML Data Pipeline
- **Feature Store** - Centralized repository for ML features
- **Model Registry** - Version control for ML models
- **Data Versioning** - Track dataset changes
- **MLOps** - DevOps practices for ML workflows

## 10. Performance & Scaling Concepts

### Database Performance
- **Indexing** - B-tree, Hash, Bitmap indexes
- **Query Optimization** - Cost-based optimizers
- **Caching** - Redis, Memcached
- **Connection Pooling** - Manage database connections

### Scaling Strategies
- **Vertical Scaling** - Increase server resources
- **Horizontal Scaling** - Add more servers
- **Sharding** - Distribute data across databases
- **Replication** - Primary-replica (master-slave), multi-primary (master-master)
- **Load Balancing** - Distribute traffic
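
One simple sharding scheme routes each key to a shard by hashing it. The shard names and the `shard_for` helper below are hypothetical; this is a sketch of hash-modulo routing, not a recommendation over range-based or directory-based schemes.

```python
import hashlib

SHARDS = ["db-0", "db-1", "db-2", "db-3"]  # hypothetical shard names

def shard_for(key: str) -> str:
    """Map a key to a shard using a stable hash of the key."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return SHARDS[int.from_bytes(digest[:8], "big") % len(SHARDS)]

placement = {user: shard_for(user) for user in ["alice", "bob", "carol"]}
```

A cryptographic hash is used instead of Python's built-in `hash()` because the latter is randomized per process for strings, so it would route the same key differently after a restart. The drawback of plain modulo routing is that changing the number of shards remaps most keys; consistent hashing is a common way to reduce that movement.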

## 11. Data Security & Governance

### Security Measures
- **Encryption** - At rest and in transit
- **Access Control** - Role-based permissions
- **Audit Logging** - Track database activities
- **Backup & Recovery** - Data protection strategies
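
Role-based access control and audit logging can be sketched together in a few lines. The roles, the permission names, and the `is_allowed` helper are all hypothetical; production databases implement this with their own `GRANT`-style mechanisms and log facilities.

```python
# Hypothetical mapping of roles to the actions they may perform.
ROLE_PERMISSIONS = {
    "analyst": {"select"},
    "app":     {"select", "insert", "update"},
    "admin":   {"select", "insert", "update", "delete", "grant"},
}

audit_log = []  # every access decision is recorded here

def is_allowed(role, action):
    """Check a role's permission and record the decision for auditing."""
    allowed = action in ROLE_PERMISSIONS.get(role, set())
    audit_log.append((role, action, allowed))
    return allowed

analyst_can_read = is_allowed("analyst", "select")
analyst_can_delete = is_allowed("analyst", "delete")
unknown_role = is_allowed("guest", "select")  # unknown roles get no permissions
```

Denied attempts are logged as well as granted ones, since failed access is often the more interesting signal in an audit.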

### Compliance
- **GDPR** - EU regulation on personal data protection
- **HIPAA** - US healthcare data privacy
- **SOX** - US financial reporting controls and record integrity
- **Data Lineage** - Track data flow and transformations