# Chapter 4: Core Cloud Services - The Universal Toolkit

Now that you understand the landscape of cloud providers, it is time to master the fundamental building blocks that comprise every cloud architecture. Regardless of whether you choose AWS, Azure, or GCP, the core services—Compute, Storage, Databases, Networking, and Identity Management—follow similar architectural patterns with analogous implementations across platforms.

This chapter provides a comprehensive, hands-on exploration of these universal services. We will progress from provisioning your first virtual machine to designing complex networking topologies and securing them with granular access controls. By the end of this chapter, you will have the practical skills to deploy a multi-tier application architecture in the cloud.

## 4.1 Compute Services: The Engine of the Cloud

Compute is the processing power that runs your applications. Cloud providers offer three primary compute paradigms, each representing a different level of abstraction and management responsibility.

### 4.1.1 Virtual Machines (IaaS)
Virtual Machines (VMs) emulate physical computers, providing maximum control over the operating system and runtime environment.

**Key Concepts:**
*   **Instance Types:** Providers categorize VMs by resource ratios:
    *   *General Purpose:* Balanced CPU and memory (AWS T3/M6, Azure Bs/Dsv3, GCP E2/N2)
    *   *Compute Optimized:* High CPU-to-memory ratio for batch processing (AWS C6i, Azure Fsv2, GCP C2)
    *   *Memory Optimized:* High memory for databases and caches (AWS R6i, Azure Esv3, GCP M2)
    *   *GPU/Accelerated:* NVIDIA GPUs for ML and graphics (AWS P4, Azure NC, GCP A2)
*   **Metadata & User Data:** Cloud VMs can access metadata (IP addresses, instance IDs) via special endpoints. User Data allows you to run scripts at boot time for automated configuration.
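
As a rough illustration of the metadata endpoint mentioned above, the sketch below builds the two HTTP requests used by AWS's token-based metadata flow (IMDSv2) with Python's standard library. The helper names (`token_request`, `metadata_request`, `fetch_metadata`) are hypothetical, and the link-local address only answers from inside an EC2 instance; Azure and GCP expose comparable endpoints with different paths and headers.

```python
import urllib.request

# Link-local address, reachable only from inside the instance
IMDS_BASE = "http://169.254.169.254/latest"


def token_request(ttl_seconds: int = 21600) -> urllib.request.Request:
    """Build the PUT request that asks IMDSv2 for a session token."""
    return urllib.request.Request(
        f"{IMDS_BASE}/api/token",
        method="PUT",
        headers={"X-aws-ec2-metadata-token-ttl-seconds": str(ttl_seconds)},
    )


def metadata_request(path: str, token: str) -> urllib.request.Request:
    """Build the GET request for one metadata item, e.g. 'instance-id'."""
    return urllib.request.Request(
        f"{IMDS_BASE}/meta-data/{path}",
        headers={"X-aws-ec2-metadata-token": token},
    )


def fetch_metadata(path: str, timeout: float = 2.0) -> str:
    """Fetch one metadata value; only works when run on an EC2 instance."""
    with urllib.request.urlopen(token_request(), timeout=timeout) as resp:
        token = resp.read().decode()
    with urllib.request.urlopen(metadata_request(path, token), timeout=timeout) as resp:
        return resp.read().decode()
```

On an instance, `fetch_metadata("instance-id")` would return that instance's ID; anywhere else the call simply times out.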

**Code Snippet: Multi-Cloud VM Provisioning**
Below are equivalent CLI commands to launch a basic web server across the three major platforms:

```bash
# AWS EC2 - Launch with user data script
aws ec2 run-instances \
    --image-id ami-0c02fb55956c7d316 \
    --instance-type t3.micro \
    --key-name my-key \
    --security-group-ids sg-123456 \
    --user-data '#!/bin/bash
        yum update -y
        yum install -y httpd
        systemctl start httpd
        systemctl enable httpd
        echo "<h1>Hello from AWS EC2</h1>" > /var/www/html/index.html'

# Azure VM - Using Azure CLI
az vm create \
    --resource-group myRG \
    --name myVM \
    --image Ubuntu2204 \
    --size Standard_B1s \
    --admin-username azureuser \
    --generate-ssh-keys \
    --custom-data '#cloud-config
        package_update: true
        packages:
          - nginx
        runcmd:
          - systemctl start nginx'

# GCP Compute Engine
gcloud compute instances create my-vm \
    --zone=us-central1-a \
    --machine-type=e2-micro \
    --image-family=debian-11 \
    --image-project=debian-cloud \
    --metadata-from-file startup-script=startup.sh
```

**Scaling Mechanisms:**
*   **Vertical Scaling (Scale Up):** Increasing the CPU/RAM of a single instance. Resizing a VM usually requires stopping and restarting it, although some platforms support live resizing with limitations.
*   **Horizontal Scaling (Scale Out):** Adding more instances behind a load balancer (preferred for high availability).
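
To make horizontal scaling concrete, here is a minimal sketch of a proportional scaling rule. It has the same shape as the formula documented for the Kubernetes Horizontal Pod Autoscaler; cloud auto scaling groups layer their own policies and cooldowns on top. The function name and the default bounds are assumptions made for illustration.

```python
import math


def desired_instance_count(current: int, observed: float, target: float,
                           minimum: int = 1, maximum: int = 10) -> int:
    """Scale the fleet in proportion to how far the metric is from its target."""
    if current < 1 or target <= 0:
        raise ValueError("current must be >= 1 and target must be > 0")
    proposed = math.ceil(current * observed / target)
    # Clamp to the configured fleet bounds
    return max(minimum, min(maximum, proposed))


# Four instances at 90% CPU with a 60% target: scale out
print(desired_instance_count(current=4, observed=90.0, target=60.0))  # 6
# Four instances at 30% CPU with a 60% target: scale in
print(desired_instance_count(current=4, observed=30.0, target=60.0))  # 2
```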

### 4.1.2 Container Services
Containers package applications with their dependencies, ensuring consistency across development, testing, and production environments.

**Container Orchestration Options:**
1.  **Managed Kubernetes (EKS/AKS/GKE):** Full control over container orchestration, with the provider operating the Kubernetes control plane for you.
2.  **Serverless Containers (Fargate/Container Instances/Cloud Run):** Run containers without managing the underlying VM cluster.

**Code Snippet: Docker Basics for Cloud**
Before deploying to managed services, you must containerize your application:

```dockerfile
# Dockerfile - Multi-stage build for production optimization
FROM node:18-alpine AS builder
WORKDIR /app
COPY package*.json ./
RUN npm ci --omit=dev

FROM node:18-alpine
WORKDIR /app
COPY --from=builder /app/node_modules ./node_modules
COPY . .
EXPOSE 3000
USER node
CMD ["node", "server.js"]
```

**Build and Push to Cloud Registry:**
```bash
# AWS ECR
aws ecr get-login-password | docker login --username AWS --password-stdin aws_account_id.dkr.ecr.region.amazonaws.com
docker build -t my-app .
docker tag my-app:latest aws_account_id.dkr.ecr.region.amazonaws.com/my-app:latest
docker push aws_account_id.dkr.ecr.region.amazonaws.com/my-app:latest

# Azure ACR
az acr login --name myregistry
docker build -t myregistry.azurecr.io/my-app:v1 .
docker push myregistry.azurecr.io/my-app:v1

# GCP Artifact Registry
gcloud auth configure-docker us-central1-docker.pkg.dev
docker build -t us-central1-docker.pkg.dev/my-project/my-repo/my-app:latest .
docker push us-central1-docker.pkg.dev/my-project/my-repo/my-app:latest
```

### 4.1.3 Serverless Functions (FaaS)
Function as a Service (FaaS) abstracts all infrastructure management. You upload code, define triggers, and the cloud provider handles execution, scaling, and availability.

**Characteristics:**
*   **Event-Driven:** Triggered by HTTP requests, file uploads, database changes, or message queue events.
*   **Stateless:** Each invocation is independent; no local persistence between executions.
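*   **Ephemeral:** Short-lived, with a platform-enforced maximum execution time (15 minutes on AWS Lambda; limits on Azure Functions and Google Cloud Functions vary by hosting plan and generation).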

**Code Snippet: Serverless Function Example**
Here is a Python Lambda function (AWS) that processes an image upload to S3:

```python
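import json
from urllib.parse import unquote_plus

import boto3

# Create the client once per execution environment so warm invocations reuse it
s3 = boto3.client('s3')

def lambda_handler(event, context):
    """
    Triggered by an S3 PutObject event.
    The event structure contains the bucket name and object key.
    """
    # Extract file info from the first record of the event
    record = event['Records'][0]['s3']
    bucket = record['bucket']['name']
    # Object keys are URL-encoded in S3 event notifications
    key = unquote_plus(record['object']['key'])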
    
    # Generate thumbnail logic here
    print(f"Processing {key} from {bucket}")
    
    return {
        'statusCode': 200,
        'body': json.dumps(f'Successfully processed {key}')
    }
```
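
Because a handler is just a function that receives a dictionary, its event-parsing logic can be exercised locally before deployment. The sketch below is one way to do that without any AWS dependency: the helper `extract_objects` and the sample bucket and key names are hypothetical, and the hand-built event contains only the fields the handler reads.

```python
from urllib.parse import unquote_plus


def extract_objects(event: dict) -> list[tuple[str, str]]:
    """Return (bucket, key) pairs for every record in an S3 event."""
    objects = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Keys arrive URL-encoded in S3 notifications (spaces become '+')
        key = unquote_plus(record["s3"]["object"]["key"])
        objects.append((bucket, key))
    return objects


# Hand-built event containing only the fields read above (hypothetical names)
sample_event = {
    "Records": [
        {"s3": {"bucket": {"name": "my-bucket"},
                "object": {"key": "uploads/summer+photo.jpg"}}},
        {"s3": {"bucket": {"name": "my-bucket"},
                "object": {"key": "uploads/caf%C3%A9.png"}}},
    ]
}

print(extract_objects(sample_event))
```

Looping over every record, rather than reading only the first, keeps the logic correct if a notification ever carries more than one object.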

**Azure Functions Example (Python, HTTP trigger):**
```python
import azure.functions as func

def main(req: func.HttpRequest) -> func.HttpResponse:
    name = req.params.get('name')
    if not name:
        try:
            req_body = req.get_json()
        except ValueError:
            pass
        else:
            name = req_body.get('name')
    
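    if not name:
        # Avoid greeting "None" when no name was supplied
        return func.HttpResponse(
            "Please pass a name in the query string or request body.",
            status_code=400
        )
    return func.HttpResponse(f"Hello {name}!")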
```

## 4.2 Storage Services: Persistence in the Cloud

Cloud storage is categorized by how data is accessed and the performance characteristics required.

### 4.2.1 Object Storage
Object storage manages data as objects (file contents + metadata + a unique ID) in a flat namespace rather than a file hierarchy. It scales to virtually unlimited capacity and is ideal for unstructured data.

**Key Attributes:**
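*   **Durability:** Typically designed for 99.999999999% (11 nines) annual durability, so the chance of losing any given object in a year is extremely small.
*   **Consistency Models:** The major providers now offer strong read-after-write consistency for new objects, overwrites, and deletes (Amazon S3 has done so since December 2020). Older documentation and some S3-compatible systems still describe eventual consistency for overwrites and deletes.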
*   **Storage Classes/Tiers:**
    *   *Standard:* Frequently accessed data (highest cost, immediate access).
    *   *Infrequent Access:* Backups, disaster recovery (lower cost, retrieval fee).
    *   *Archive:* Long-term retention (lowest cost, minutes to hours retrieval).
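
To put the 11-nines durability figure above in perspective, the following back-of-the-envelope calculation treats it as an independent 1-in-10^11 annual loss probability per object. That independence is a simplifying assumption, and the function name is hypothetical.

```python
from fractions import Fraction

# 11 nines of durability = a 1-in-10^11 chance of losing a given object per year
annual_loss_probability = Fraction(1, 10**11)


def expected_years_per_lost_object(object_count: int) -> Fraction:
    """Average number of years between single-object losses for a bucket."""
    expected_losses_per_year = object_count * annual_loss_probability
    return 1 / expected_losses_per_year


# With 10 million objects, expect to lose one object roughly every 10,000 years
print(expected_years_per_lost_object(10_000_000))  # 10000
```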

**Code Snippet: Object Storage Operations**
```python
# AWS S3 with Boto3
import boto3

s3 = boto3.client('s3')

# Upload file with server-side encryption
s3.upload_file(
    'local_file.pdf', 
    'my-bucket', 
    'documents/report.pdf',
    ExtraArgs={'ServerSideEncryption': 'AES256'}
)

# Generate pre-signed URL for temporary access (expires in 1 hour)
url = s3.generate_presigned_url(
    'get_object',
    Params={'Bucket': 'my-bucket', 'Key': 'documents/report.pdf'},
    ExpiresIn=3600
)

# Lifecycle policy (transition to Glacier after 90 days)
s3.put_bucket_lifecycle_configuration(
    Bucket='my-bucket',
    LifecycleConfiguration={
        'Rules': [{
            'ID': 'MoveToGlacier',
            'Filter': {'Prefix': ''},  # Required; an empty prefix applies the rule to every object
            'Status': 'Enabled',
            'Transitions': [{
                'Days': 90,
                'StorageClass': 'GLACIER'
            }]
        }]
    }
)
```

### 4.2.2 Block Storage
Block storage provides raw storage volumes attached to compute instances, functioning like virtual hard drives.

**Use Cases:** Boot volumes for VMs, database storage requiring high IOPS, applications needing file systems.

**Performance Tiers:**
*   **SSD-backed:** High IOPS (Input/Output Operations Per Second), low latency for transactional workloads (AWS io2, Azure Premium SSD, GCP Hyperdisk).
*   **HDD-backed:** High throughput for sequential workloads like big data (AWS st1, Azure Standard HDD); colder, lower-cost options such as AWS sc1 suit infrequently accessed data.

**Important Concept:** Block storage is typically **zonal**: a volume resides in a specific Availability Zone and can only attach to instances in that same zone. For multi-AZ availability, you must use replication, a regional or zone-redundant disk option where the provider offers one, or shared file systems.

### 4.2.3 File Storage
File storage provides Network Attached Storage (NAS) accessible via standard protocols (NFS, SMB).

**Comparison Table:**

| Feature | Object Storage | Block Storage | File Storage |
|---------|---------------|---------------|--------------|
| **Access Method** | REST API (HTTP) | Mounted as disk | NFS/SMB protocols |
| **Scalability** | Virtually unlimited | Limited by per-volume size | Petabyte scale |
| **Latency** | Tens of milliseconds (typical) | Sub-millisecond to single-digit milliseconds | Low milliseconds |
| **Best For** | Backups, media, logs | Databases, boot disks | Shared content, home directories |
| **Example** | Amazon S3, Azure Blob Storage, Google Cloud Storage | Amazon EBS, Azure Managed Disks, GCP Persistent Disk | Amazon EFS, Azure Files, GCP Filestore |

## 4.3 Database Services: Managed Data Persistence

### 4.3.1 Managed Relational Databases (RDS/Azure SQL/Cloud SQL)
These services automate provisioning, patching, backup, and failover for traditional SQL databases.

**Key Features:**
*   **Multi-AZ Deployment:** A synchronous standby replica in a different AZ enables automatic failover (99.95%+ availability).
*   **Read Replicas:** Asynchronous replicas for read scaling, often cross-region for disaster recovery.
*   **Automated Backups:** Point-in-time recovery capabilities (typically 7-35 days retention).
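
Availability percentages are easier to reason about as a downtime budget. The short calculation below converts a target such as the 99.95% quoted above into hours of permitted downtime per year; the function name is an assumption, and a 365-day year is used for simplicity.

```python
HOURS_PER_YEAR = 365 * 24  # 8760, ignoring leap years


def allowed_downtime_hours(availability_percent: float) -> float:
    """Hours per year a service may be down while still meeting the target."""
    return HOURS_PER_YEAR * (1 - availability_percent / 100)


for target in (99.0, 99.9, 99.95, 99.99):
    print(f"{target}% -> {allowed_downtime_hours(target):.2f} hours/year")
```

For the 99.95% figure, this works out to roughly 4.4 hours of downtime per year.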

**Connection Best Practices:**
Never hardcode database credentials in applications. Use:
1.  **Environment Variables:** Injected at runtime.
2.  **Secrets Managers:** AWS Secrets Manager, Azure Key Vault, GCP Secret Manager.
3.  **IAM Authentication:** Database access via IAM roles rather than passwords (supported by, for example, Amazon RDS and Aurora, and Cloud SQL).
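
As a minimal sketch of the first option, the code below reads connection settings from environment variables and fails fast when one is missing. The variable names (`DB_HOST`, `DB_USER`, `DB_PASSWORD`, `DB_PORT`), the default port, and the `DatabaseSettings` class are all hypothetical; in practice a secrets manager would typically populate these values at deploy time.

```python
import os
from dataclasses import dataclass, field


@dataclass(frozen=True)
class DatabaseSettings:
    host: str
    user: str
    password: str = field(repr=False)  # keep the secret out of logs and tracebacks
    port: int = 5432


def load_database_settings(env=None) -> DatabaseSettings:
    """Read connection settings from the environment; fail fast if any are missing."""
    env = os.environ if env is None else env
    required = ("DB_HOST", "DB_USER", "DB_PASSWORD")  # hypothetical variable names
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError(f"Missing required settings: {', '.join(missing)}")
    return DatabaseSettings(
        host=env["DB_HOST"],
        user=env["DB_USER"],
        password=env["DB_PASSWORD"],
        port=int(env.get("DB_PORT", "5432")),
    )
```

Calling `load_database_settings()` at startup surfaces a misconfigured deployment immediately instead of at the first query.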

### 4.3.2 NoSQL Databases
**DynamoDB (AWS), Cosmos DB (Azure), Firestore (GCP):**

These handle semi-structured data with flexible schemas and massive scale.

**Data Modeling for NoSQL:**
Unlike SQL (where you normalize data), NoSQL data modeling typically favors denormalization and access-pattern-driven design. The following DynamoDB single-table design example stores a User item with its Orders embedded (denormalized):

```json
{
  "PK": "USER#123",
  "SK": "PROFILE#123",
  "name": "Alice Smith",
  "email": "alice@example.com",
  "orders": [
    {"orderId": "ORD#001", "total": 150.00},
    {"orderId": "ORD#002", "total": 75.50}
  ]
}
```

**Critical Concepts:**
*   **Partition Keys:** Determine data distribution. Hot partitions occur when one key receives disproportionate traffic.
*   **Global Secondary Indexes (GSI):** Allow querying on non-primary key attributes.
*   **Capacity Modes:**
    *   *Provisioned:* You specify read/write capacity units (cheaper for predictable workloads).
    *   *On-Demand:* Pay per request (better for unpredictable/spiky workloads).
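
The hot-partition problem can be illustrated with a toy model that hashes partition keys onto a fixed number of partitions. The partition count, the SHA-256-based mapping, and the key names are assumptions chosen for illustration; real services use their own internal hashing and manage partitions automatically.

```python
import hashlib
from collections import Counter

PARTITIONS = 4  # hypothetical; real services manage partition counts internally


def partition_for(key: str) -> int:
    """Map a partition key to a partition by hashing it (illustrative scheme)."""
    digest = hashlib.sha256(key.encode()).digest()
    return int.from_bytes(digest[:8], "big") % PARTITIONS


def requests_per_partition(keys) -> Counter:
    """Count how many requests land on each partition."""
    return Counter(partition_for(k) for k in keys)


# Well-distributed keys: one request for each of 1,000 users
even = requests_per_partition(f"USER#{i}" for i in range(1000))

# Skewed traffic: a single tenant key receives 900 of 1,000 requests
skewed_keys = ["TENANT#big"] * 900 + [f"USER#{i}" for i in range(100)]
skewed = requests_per_partition(skewed_keys)

print("even:  ", sorted(even.items()))
print("skewed:", sorted(skewed.items()))
```

In the skewed case, every request for the busy key lands on the same partition regardless of how many partitions exist, which is why high-cardinality partition keys matter.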

### 4.3.3 Data Warehouses
For analytics workloads (OLAP rather than OLTP), cloud data warehouses offer columnar storage and massive parallel processing.

*   **Amazon Redshift:** Based on PostgreSQL; provisioned clusters require sizing and management, though a serverless option is available.
*   **Azure Synapse Analytics:** Formerly Azure SQL Data Warehouse; integrates with Apache Spark.
*   **Google BigQuery:** Serverless, with no cluster management. Storage is billed separately from compute, and on-demand queries are billed by the amount of data they scan.

## 4.4 Networking Fundamentals in the Cloud

Networking in the cloud is software-defined, allowing you to create complex topologies programmatically.

### 4.4.1 Virtual Networks (VPC/VNet)
The Virtual Private Cloud is your isolated network segment in the public cloud.

**Architecture Components:**
*   **CIDR Blocks:** Define the IP address range for your VPC (e.g., 10.0.0.0/16 provides 65,536 IPs).
*   **Subnets:** Subdivisions of the VPC mapped to specific Availability Zones:
    *   *Public Subnets:* Have route tables pointing to an Internet Gateway (IGW).
    *   *Private Subnets:* Route tables point to NAT Gateways (for outbound internet) or remain isolated.
*   **Route Tables:** Define traffic routing rules (local traffic stays internal, 0.0.0.0/0 goes to IGW or NAT).
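
The CIDR arithmetic above can be checked with Python's standard `ipaddress` module. This sketch uses the same example ranges as this section; note that providers reserve a few addresses in every subnet (AWS reserves five), so the usable count is slightly lower than the raw figure.

```python
import ipaddress

vpc = ipaddress.ip_network("10.0.0.0/16")
print(vpc.num_addresses)  # 65536

# Carve the VPC range into /24 subnets (256 addresses each)
subnets = list(vpc.subnets(new_prefix=24))
print(len(subnets))            # 256
print(subnets[1], subnets[2])  # 10.0.1.0/24 10.0.2.0/24

# Validate a subnet plan: every subnet inside the VPC, none overlapping
public = ipaddress.ip_network("10.0.1.0/24")
private = ipaddress.ip_network("10.0.2.0/24")
assert public.subnet_of(vpc) and private.subnet_of(vpc)
assert not public.overlaps(private)
```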

**Code Snippet: Terraform for VPC Architecture**
```hcl
# AWS VPC with Public and Private Subnets
resource "aws_vpc" "main" {
  cidr_block           = "10.0.0.0/16"
  enable_dns_hostnames = true
  enable_dns_support   = true
  
  tags = {
    Name = "production-vpc"
  }
}

resource "aws_subnet" "public" {
  vpc_id                  = aws_vpc.main.id
  cidr_block              = "10.0.1.0/24"
  availability_zone       = "us-east-1a"
  map_public_ip_on_launch = true  # Auto-assign public IPs
  
  tags = {
    Name = "public-subnet-1a"
    Type = "Public"
  }
}

resource "aws_subnet" "private" {
  vpc_id            = aws_vpc.main.id
  cidr_block        = "10.0.2.0/24"
  availability_zone = "us-east-1a"
  
  tags = {
    Name = "private-subnet-1a"
    Type = "Private"
  }
}

# Internet Gateway for public subnet access
resource "aws_internet_gateway" "gw" {
  vpc_id = aws_vpc.main.id
}

# NAT Gateway for private subnet outbound access (requires Elastic IP)
resource "aws_eip" "nat" {
  domain = "vpc"
}

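resource "aws_nat_gateway" "main" {
  allocation_id = aws_eip.nat.id
  subnet_id     = aws_subnet.public.id

  # The NAT Gateway needs the Internet Gateway to exist first
  depends_on = [aws_internet_gateway.gw]
}

# Route tables are what make each subnet public or private
resource "aws_route_table" "public" {
  vpc_id = aws_vpc.main.id

  route {
    cidr_block = "0.0.0.0/0"
    gateway_id = aws_internet_gateway.gw.id
  }
}

resource "aws_route_table_association" "public" {
  subnet_id      = aws_subnet.public.id
  route_table_id = aws_route_table.public.id
}

resource "aws_route_table" "private" {
  vpc_id = aws_vpc.main.id

  route {
    cidr_block     = "0.0.0.0/0"
    nat_gateway_id = aws_nat_gateway.main.id
  }
}

resource "aws_route_table_association" "private" {
  subnet_id      = aws_subnet.private.id
  route_table_id = aws_route_table.private.id
}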
```

### 4.4.2 Network Security
*   **Security Groups:** Stateful virtual firewalls at the instance level (control inbound/outbound traffic). Changes apply immediately.
    *   *Best Practice:* Principle of least privilege—only open required ports (e.g., 443 for HTTPS, 22 for SSH from specific IPs).
*   **Network ACLs (NACLs):** Stateless firewalls at the subnet level (supplement to Security Groups). They evaluate rules in numbered order.
*   **PrivateLink (AWS) / Private Link (Azure) / Private Service Connect (GCP):** Private endpoints that let you reach cloud services (like S3 or DynamoDB) without traversing the public internet.
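
The "numbered order" evaluation of network ACLs can be sketched as a first-match-wins loop. This is a deliberately simplified model: it ignores protocols, port ranges, and the separate inbound and outbound rule sets that a stateless ACL needs, and the `AclRule` class and rule values are hypothetical.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class AclRule:
    number: int  # lower numbers are evaluated first
    cidr: str    # source range the rule applies to
    port: int    # single destination port, for simplicity
    allow: bool


def evaluate(rules, source_ip: str, port: int) -> bool:
    """Return True if traffic is allowed; the first matching rule decides."""
    source = ipaddress.ip_address(source_ip)
    for rule in sorted(rules, key=lambda r: r.number):
        if port == rule.port and source in ipaddress.ip_network(rule.cidr):
            return rule.allow
    return False  # nothing matched: implicit deny


rules = [
    AclRule(200, "0.0.0.0/0", 22, allow=False),      # block SSH from everywhere...
    AclRule(100, "203.0.113.0/24", 22, allow=True),  # ...except this range (evaluated first)
    AclRule(300, "0.0.0.0/0", 443, allow=True),
]

print(evaluate(rules, "203.0.113.10", 22))   # True
print(evaluate(rules, "198.51.100.7", 22))   # False
print(evaluate(rules, "198.51.100.7", 443))  # True
print(evaluate(rules, "198.51.100.7", 80))   # False
```

Because evaluation stops at the first match, rule numbering is part of the policy: swapping rules 100 and 200 would block SSH for everyone.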

### 4.4.3 Load Balancing and DNS
*   **Application Load Balancer (Layer 7):** Routes based on HTTP headers, paths, or hostnames. Supports SSL termination.
*   **Network Load Balancer (Layer 4):** Ultra-low latency TCP/UDP load balancing, preserving source IP addresses.
*   **DNS (Route 53/Azure DNS/Cloud DNS):**
    *   *A Records:* Map domain to IP.
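    *   *Alias Records:* Provider-specific records (for example, in Route 53 and Azure DNS) that map a domain directly to a cloud resource such as a load balancer and follow its IP changes automatically.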
    *   *Health Checks:* Automatic failover to healthy endpoints.

## 4.5 Identity and Access Management (IAM)

IAM is the security foundation of cloud computing. It answers: *Who can access what resources under what conditions?*

### 4.5.1 Core IAM Components

**Users:** Represent individual people or applications requiring long-term credentials. **Avoid using root account credentials for daily operations.**

**Groups:** Collections of users with shared permissions (e.g., "Developers", "Database-Admins").

**Roles:** Identities that are assumed by services, applications, or federated users to obtain temporary security credentials. **Preferred over users for service-to-service communication.**

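**Policies:** JSON documents defining permissions. Each statement sets an `Effect` (`Allow` or `Deny`) for a list of actions on a resource, optionally restricted by conditions. Structure follows:
```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",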
      "Action": [
        "s3:GetObject",
        "s3:PutObject"
      ],
      "Resource": "arn:aws:s3:::my-bucket/*",
      "Condition": {
        "IpAddress": {
          "aws:SourceIp": "203.0.113.0/24"
        }
      }
    }
  ]
}
```
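
To show how statements like this combine, here is a heavily simplified sketch of policy evaluation: access is denied by default, an `Allow` grants it, and any matching `Deny` overrides. It is not the provider's actual evaluation engine. It ignores `Condition` blocks, principals, and organization-level guardrails, it matches wildcards with Python's case-sensitive `fnmatchcase`, and the second statement in the sample policy is a hypothetical addition.

```python
from fnmatch import fnmatchcase


def _as_list(value):
    """Policy fields may hold either a single string or a list of strings."""
    return value if isinstance(value, list) else [value]


def _matches(patterns, value: str) -> bool:
    return any(fnmatchcase(value, p) for p in _as_list(patterns))


def is_allowed(policy: dict, action: str, resource: str) -> bool:
    """Simplified evaluation: explicit Deny wins, then Allow, otherwise deny."""
    allowed = False
    for stmt in policy["Statement"]:
        if _matches(stmt["Action"], action) and _matches(stmt["Resource"], resource):
            if stmt["Effect"] == "Deny":
                return False  # an explicit deny always overrides
            allowed = True
    return allowed  # stays False when no statement matched (implicit deny)


policy = {
    "Version": "2012-10-17",
    "Statement": [
        {"Effect": "Allow",
         "Action": ["s3:GetObject", "s3:PutObject"],
         "Resource": "arn:aws:s3:::my-bucket/*"},
        {"Effect": "Deny",  # hypothetical extra statement for illustration
         "Action": "s3:*",
         "Resource": "arn:aws:s3:::my-bucket/private/*"},
    ],
}

print(is_allowed(policy, "s3:GetObject", "arn:aws:s3:::my-bucket/docs/a.pdf"))     # True
print(is_allowed(policy, "s3:GetObject", "arn:aws:s3:::my-bucket/private/k.pem"))  # False
print(is_allowed(policy, "s3:DeleteObject", "arn:aws:s3:::my-bucket/docs/a.pdf"))  # False
```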

### 4.5.2 IAM Best Practices (Industry Standards)
1.  **Least Privilege:** Start with zero permissions; add only what is necessary.
2.  **Use Roles for EC2/Compute:** Never store access keys on instances. Use instance profiles/attached roles.
3.  **Multi-Factor Authentication (MFA):** Enforce for root accounts and privileged users.
4.  **Regular Access Reviews:** Rotate credentials, remove unused users/permissions.
5.  **Service Control Policies (SCP):** In multi-account environments, guardrails that limit maximum available permissions.

**Code Snippet: IAM Role Trust Policy**
The following trust policy allows EC2 to assume the role:
```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Service": "ec2.amazonaws.com"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}
```

## 4.6 Monitoring and Logging: Observability

You cannot manage what you cannot measure. Cloud observability rests on three pillars: **Metrics**, **Logs**, and **Traces**.

### 4.6.1 Metrics (CloudWatch/Azure Monitor/Cloud Monitoring)
Time-series data points measuring resource utilization and application performance.

**Key Metrics to Monitor:**
*   **Compute:** CPU Utilization, Memory Usage, Disk I/O, Network In/Out.
*   **Storage:** Bucket size, Number of objects, API request rates.
*   **Databases:** Connection count, Read/Write latency, Cache hit ratio.

**Alarms/Alerts:** Set thresholds for proactive notification (e.g., "Notify when CPU > 80% for 5 minutes").
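
A threshold alarm such as "CPU > 80% for 5 minutes" boils down to checking that a run of consecutive datapoints all breach the threshold. The sketch below assumes one datapoint per minute; the function name is hypothetical, and the state names follow the ones CloudWatch uses.

```python
def alarm_state(datapoints, threshold: float, periods: int) -> str:
    """Return 'ALARM' if the last `periods` datapoints all exceed the threshold."""
    if len(datapoints) < periods:
        return "INSUFFICIENT_DATA"
    recent = datapoints[-periods:]
    return "ALARM" if all(value > threshold for value in recent) else "OK"


# One CPU reading per minute; the alarm needs 5 consecutive breaches
cpu = [42, 85, 91, 88, 95, 83]
print(alarm_state(cpu, threshold=80, periods=5))       # ALARM
print(alarm_state(cpu[:-1], threshold=80, periods=5))  # OK (42 is still in the window)
print(alarm_state(cpu[:3], threshold=80, periods=5))   # INSUFFICIENT_DATA
```

Requiring several consecutive breaches, rather than a single one, keeps short spikes from paging anyone.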

### 4.6.2 Logging
Centralized collection of application and infrastructure logs.

*   **Structured Logging:** Use JSON format for machine parsing.
*   **Retention Policies:** Compliance often requires 1-7 years of log retention (move to cheaper storage tiers).
*   **Log Aggregation:** Services like AWS CloudWatch Logs, Azure Monitor Logs, or Google Cloud Logging centralize logs from multiple sources.
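
Structured logging needs no extra dependencies in Python. The sketch below is one possible JSON formatter for the standard `logging` module; the `JsonFormatter` class, the field names, and the `context` attribute used to carry extra fields are all assumptions, and exception details are omitted for brevity.

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "timestamp": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # Extra fields are looked up under a hypothetical "context" attribute
        context = getattr(record, "context", None)
        if context:
            payload["context"] = context
        return json.dumps(payload)


handler = logging.StreamHandler()
handler.setFormatter(JsonFormatter())
logger = logging.getLogger("orders")
logger.addHandler(handler)
logger.setLevel(logging.INFO)

# Emits one JSON line that a log aggregator can index by field
logger.info("order created", extra={"context": {"order_id": "ORD#001", "total": 150.0}})
```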

### 4.6.3 Distributed Tracing
For microservices architectures, trace requests as they traverse multiple services to identify bottlenecks.

**OpenTelemetry:** A vendor-neutral, widely adopted open standard for instrumenting code to generate traces, metrics, and logs across cloud providers.
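
Tracing works because each service forwards a small piece of context with every outgoing request. The sketch below hand-rolls the W3C Trace Context `traceparent` header (version, trace ID, span ID, flags) to show the idea. It is an illustration only and does not use the OpenTelemetry SDK, which handles this propagation for you; the function names are hypothetical.

```python
import re
import secrets

# Format: version "00", 32-hex trace ID, 16-hex span ID, 2-hex flags
TRACEPARENT = re.compile(r"^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$")


def new_traceparent(sampled: bool = True) -> str:
    """Start a new trace: random 16-byte trace ID and 8-byte span ID."""
    flags = "01" if sampled else "00"
    return f"00-{secrets.token_hex(16)}-{secrets.token_hex(8)}-{flags}"


def child_traceparent(incoming: str) -> str:
    """Continue a trace: keep the trace ID and flags, mint a new span ID."""
    match = TRACEPARENT.match(incoming)
    if match is None:
        return new_traceparent()  # malformed header: start a fresh trace
    trace_id, _parent_span_id, flags = match.groups()
    return f"00-{trace_id}-{secrets.token_hex(8)}-{flags}"


# Service A starts the trace; service B continues it when calling service C
header_a = new_traceparent()
header_b = child_traceparent(header_a)
print(header_a)
print(header_b)
```

Both headers share the same trace ID, which is what lets a tracing backend stitch the spans from each service into a single request timeline.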

---

### Summary

In this chapter, we built your operational toolkit for cloud computing. You learned to provision **Compute** resources across the spectrum from VMs to serverless functions, understanding when to choose each paradigm. You mastered the three **Storage** types—Object for unstructured data, Block for high-performance databases, and File for shared access—recognizing their distinct use cases. We explored **Database** options from managed relational services to NoSQL and data warehouses, emphasizing that data modeling must match your access patterns. We architected secure **Networks** with VPCs, subnets, and layered security groups, implementing the principle of defense in depth. You learned **IAM** fundamentals, understanding that security in the cloud begins with identity and least-privilege access policies. Finally, we established the foundation of **Observability** through metrics, logs, and traces.

These core services are the atoms of cloud architecture. Every solution you design—whether a simple website or a complex machine learning pipeline—will combine these elements in unique configurations. With these tools, you are ready to move from understanding individual services to architecting complete, production-ready systems.

**Next Up: Chapter 5 - Cloud-Native Application Design**
Now that you can provision infrastructure, we must learn how to architect applications specifically for the cloud environment. In the next chapter, we will explore the Twelve-Factor App methodology, microservices architecture, and design patterns for resilience and scalability. You will learn to think like a Cloud Architect, designing stateless, decoupled systems that leverage the cloud's elastic nature rather than simply lifting and shifting legacy applications.

