# Terminology

- Components:
    - Code (e.g. Python code for Lambda)
    - Configuration (e.g. template file for CloudFormation)
    - AWS resources (e.g. RDS database)
    - Executes certain tasks to satisfy a specific requirement
    - Independent unit, decoupled from other components
- Workload:
    - Set of components
    - Deliver business value
    - Ex: Python code on Lambda function and RDS database, both deployed through a CloudFormation template file, to process clients' orders faster
    - Technology and business leaders communicate at this level of detail

# Well-Architected Framework


- Implement designs that maximize the value of the cloud and scale over time
- Structured around 6 pillars:
    - Operational excellence
    - Security
    - Reliability
    - Performance efficiency
    - Cost optimization
    - Sustainability
- Benefits of the Well-Architected Framework
    - Accelerate the build and deploy process
    - Better risk management:
        - Mitigate risks before problems arise
        - Quickly fix issues when they happen
    - Cloud-native applications:
        - On the cloud since development
        - Faster deployment and scaling
    - Properly assess technology evolution impact
    - Constantly evolving framework:
        - Continuous improvement mindset

# Operational excellence

- Run workloads effectively
    - Gain insights into their operations
    - Continuously improve processes to deliver business value
- Focus areas:
    - Understand your Organization
    - Prepare
    - Operate
    - Evolve
- Understand your Organization
    - Clear objectives and priorities based on:
        - Internal and external customers' needs
        - Leadership team's requirements
        - Compliance requirements
        - Threat landscape
    - Shared understanding of business goals between teams
    - Good understanding of each team's role
    - Encourage teams to experiment with new approaches, take risks and escalate concerns
- Prepare
    - Design your workloads to emit information about their internal state for monitoring
    - Facilitate changes into production:
        - Version control
        - Test and validation automation
    - Plan for recovery after unsuccessful change
    - Processes to assess when a workload is production-ready
- Operate
    - Define clear KPIs and metrics related to:
        - Workload health (e.g. error rate, response time)
        - Operations health (e.g. successful vs. failed deployments)
    - Regularly collect and analyze them
    - Set alerts when KPIs are at risk or when anomalies arise
    - Have a process for each alert
    - Automate responses to events
- Evolve
    - Constantly analyze operations
    - Learn from failures:
        - Post-incident analysis
    - Document and share learning experiences
        - Feedback loops
    - Proactively allocate time to continuously improve and adapt your processes
- Design principles for operational excellence
    1. Perform operations as code
        - Use same methodology as software development
    2. Make small, frequent, reversible changes
    3. Refine operations procedures frequently
    4. Anticipate failure
    5. Learn from all operational failures
<center><img src="images/04.041.jpg"  style="width: 400px; height: 300px;"/></center>
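
The "Operate" bullets above (define KPIs, collect them, alert when they are at risk) can be illustrated with a minimal sketch. It assumes the metrics have already been collected; `KpiThreshold`, `error_rate`, `evaluate` and the threshold values are hypothetical names and numbers chosen for illustration, not part of any AWS API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KpiThreshold:
    """Hypothetical alert rule: alert when the KPI rises above `maximum`."""
    name: str
    maximum: float


def error_rate(failed: int, total: int) -> float:
    """Workload-health KPI: fraction of requests that failed."""
    if total == 0:
        return 0.0  # no traffic, so nothing has failed
    return failed / total


def evaluate(metrics: dict[str, float], rules: list[KpiThreshold]) -> list[str]:
    """Return the names of all KPIs whose current value breaches its rule."""
    return [
        rule.name
        for rule in rules
        if rule.name in metrics and metrics[rule.name] > rule.maximum
    ]


if __name__ == "__main__":
    # Illustrative thresholds only; real values depend on the workload.
    rules = [
        KpiThreshold("error_rate", 0.01),               # workload health
        KpiThreshold("failed_deployment_ratio", 0.10),  # operations health
    ]
    metrics = {
        "error_rate": error_rate(failed=30, total=1000),
        "failed_deployment_ratio": 1 / 20,
    }
    for name in evaluate(metrics, rules):
        print(f"ALERT: {name} is above its threshold")
```

In practice each alert name would map to a documented process or an automated response, as the list above recommends.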


# Security

- Leverage cloud technologies to protect:
    - Data
    - Systems
    - Assets
- Improve companies' security with best practices
- Focus areas:
    - Foundations
        - AWS secures the cloud, you secure your application
        - Keep contact information accurate: AWS can reach out to you quickly in case of abuse or compromise
        - Stay up to date on security matters
        - Automate security processes
        - Separate accounts for workloads but centralize environment management using AWS Organizations
    - Identity and access management
        - Centralized identity management
        - Carefully grant permissions to human and machine identities in a fine-grained fashion
        - Strong sign-in for identities with high privileges
    - Malicious activity detection
        - Continuous logging and analysis to detect malicious activity
        - Actionable security events
        - Automated response to events
    - Infrastructure protection
        - Several network layers
            - Ex: using subnets with no direct access to the Internet
        - Frequent scans and patches for code vulnerabilities, software integrity validations
        - Use managed services
    - Data protection
        - Data categorization based on criticality and sensitivity
        - For both data at rest and in transit:
            - Encryption
            - Key management
            - Certificate management
            - Access control enforcement
    - Incident response
        - Clear objectives and documented plans for security incident responses
        - Pre-deployed and ready to use investigation tools
        - Incident simulation through game days
        - Containment and recovery automation
- Design principles
    1. Implement a strong identity foundation
    2. Enable traceability
    3. Apply security at all layers
    4. Automate security best practices
    5. Protect data in transit and at rest
    6. Keep people away from data
    7. Prepare for security events
<center><img src="images/04.071.jpg"  style="width: 400px; height: 300px;"/></center>
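
To make "grant permissions in a fine-grained fashion" more concrete, here is a minimal sketch that builds a least-privilege policy document and flags overly broad ones. The document layout follows the standard IAM JSON policy structure; the bucket name, prefix, `Sid` and both helper functions are hypothetical, and the wildcard check is a deliberately simple illustration rather than a complete policy analyzer.

```python
import json


def read_only_prefix_policy(bucket: str, prefix: str) -> dict:
    """Allow reading objects under a single prefix of a single bucket, nothing else."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "ReadOnePrefixOnly",  # hypothetical statement id
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                "Resource": [f"arn:aws:s3:::{bucket}/{prefix}/*"],
            }
        ],
    }


def has_wildcard_action(policy: dict) -> bool:
    """Return True if any statement grants '*' or 'service:*' actions.

    Assumes "Statement" is a list, which is how this sketch builds policies.
    """
    for statement in policy["Statement"]:
        actions = statement["Action"]
        if isinstance(actions, str):
            actions = [actions]
        if any(action == "*" or action.endswith(":*") for action in actions):
            return True
    return False


if __name__ == "__main__":
    policy = read_only_prefix_policy("example-bucket", "reports")
    print(json.dumps(policy, indent=2))
    print("Too broad?", has_wildcard_action(policy))
```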


# Reliability

- Ability of workloads to correctly perform tasks when intended
- Administrators' and developers' ability, throughout the whole workload lifecycle, to:
    - Operate workloads
    - Perform functional testing
- Focus areas:
    - Foundations
        - Knowing and monitoring service quotas and limitations
        - Network topology planning:
            - High connectivity for public endpoints
            - Redundant connectivity between private networks
    - Workload architecture
        - Distributed systems to prevent failures:
            - Microservices (single function building blocks)
            - API communication
        - Mitigating failures:
            - Request throttling (limiting the rate at which requests are accepted)
            - Request timeouts
    - Change management
        - Automatic adaptation to change in demand:
            - Automatic provisioning and scaling
        - Processes for change in implementation:
            - Automatic testing
            - Automatic deployment
    - Failure management
        - Automatic data backup
        - Multiple location deployment
        - Automatic failure detection and failover
        - Testing, simulation and game days to evaluate:
            - Performance requirements
            - Resiliency
        - Plan for disaster recovery
- Design principles
    1. Automatically recover from failure
    2. Test recovery procedures
    3. Scale horizontally to increase aggregate workload availability
    4. Stop guessing capacity
    5. Automate change management
<center><img src="images/04.072.jpg"  style="width: 400px; height: 300px;"/></center>
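
Request throttling, listed above under mitigating failures, is often implemented with a token bucket. The following is a minimal single-threaded sketch under that assumption; the class name, parameters and injectable clock are illustrative choices, and managed services such as API gateways provide their own throttling.

```python
import time
from typing import Callable


class TokenBucket:
    """Allow bursts of up to `capacity` requests, refilled at `rate` tokens per second."""

    def __init__(self, rate: float, capacity: int,
                 clock: Callable[[], float] = time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.tokens = float(capacity)  # start with a full bucket
        self.clock = clock             # injectable so the behavior can be tested
        self.last = clock()

    def allow(self) -> bool:
        """Consume one token if available; otherwise reject the request."""
        now = self.clock()
        elapsed = now - self.last
        self.last = now
        # Refill proportionally to elapsed time, never above capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


if __name__ == "__main__":
    fake_now = [0.0]  # simulated clock, so the demo does not need to sleep
    bucket = TokenBucket(rate=1.0, capacity=2, clock=lambda: fake_now[0])
    print([bucket.allow() for _ in range(3)])  # burst of 3 against capacity 2
    fake_now[0] = 1.0                          # one simulated second later
    print(bucket.allow())
```

Rejected requests would typically receive an error the client can retry after a delay, which pairs naturally with the request timeouts mentioned in the same list.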


# Performance efficiency


- Best practices for the efficient use of resources to meet system requirements
- Maintaining efficiency as:
    - Demand changes
    - Technology evolves
- Focus areas:
    - Selection
        - Understand the available resources
        - Evaluate the options and possible configurations, and their impact on performance
        - Make decisions based on performance metrics
        - Use guidance from AWS or its partners
    - Review
        - Have a consistent performance review process based on well defined metrics:
            - Ex: Deming's PDCA (Plan Do Check Act)
        - Stay up to date and keep track of new resources:
            - Use the ones that improve performance
    - Monitoring
        - Collect performance data and analyze it
        - Use real-time processing and alarming
        - Establish Key Performance Indicators (KPIs) and review them regularly
    - Trade-offs
        - Understand the areas where performance is most critical
        - Know which properties can be traded off to maximize performance in those areas:
            - Consistency
            - Durability
            - Space
            - Latency
        - Measure the impact of performance improvements
- Design principles
    - Democratize advanced technologies
    - Leverage AWS global infrastructure to go global in minutes
    - Use serverless architectures
    - Experiment more often
    - Consider mechanical sympathy (use the technologies and tools best aligned with your goals)
<center><img src="images/04.101.jpg"  style="width: 400px; height: 300px;"/></center>
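
As an illustration of the monitoring bullets above (collect performance data, establish KPIs, review them), here is a minimal sketch of a tail-latency KPI. It assumes latency samples in milliseconds have already been collected; `percentile` and `meets_target` are hypothetical helpers, and the nearest-rank method used here is only one of several common percentile definitions.

```python
import math


def percentile(samples: list[float], p: float) -> float:
    """Nearest-rank percentile: the smallest sample such that at least
    p percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("at least one sample is required")
    if not 0 < p <= 100:
        raise ValueError("p must be in the range (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(p * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


def meets_target(samples: list[float], p: float, target_ms: float) -> bool:
    """KPI check: is the p-th percentile latency within the target?"""
    return percentile(samples, p) <= target_ms


if __name__ == "__main__":
    # Made-up latencies in milliseconds, with one slow outlier.
    latencies = [120, 80, 95, 300, 110, 90, 105, 100, 85, 115]
    print("p50:", percentile(latencies, 50))
    print("p95:", percentile(latencies, 95))
    print("p95 within 250 ms:", meets_target(latencies, 95, 250))
```

Tracking a high percentile rather than the average keeps occasional slow requests visible, which matters when latency is one of the properties being traded off.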


# Cost optimization


- Cost-aware workloads
- Continuous ROI improvement:
    - Cost minimization while achieving business objectives
- Focus areas:
    - Practice cloud financial management
        - Bridge the gap between finance and technology teams
        - Build a cost-aware culture and processes
        - Establish cloud budget and forecasts
        - Quantify business value generated by cost optimization
    - Expenditure and usage awareness
        - Define goals and set usage quotas
        - Establish costs and usage reports
        - Decommission unused resources
    - Cost-effective resources
        - Evaluate the cost of services
        - Select the optimal configuration:
            - Type of resource
            - Size of resource
            - Number of resources
            - Pricing model
    - Manage demand and supply resources
        - Manage demand:
            - Throttling requests (limiting)
            - Buffer-based (queues)
        - Dynamic supply:
            - Demand-based (auto-scaling)
            - Time-based (provisioning on a schedule when demand is predictable)
    - Optimize over time
        - Implement consistent cost review processes
        - Stay up to date
        - Implement new services
- Design principles
    1. Implement cloud financial management
    2. Adopt a consumption model according to business requirements
    3. Measure overall efficiency
    4. Stop spending money on undifferentiated heavy lifting (effort that does not contribute directly to the company's mission)
    5. Analyze and attribute expenditure according to workloads
<center><img src="images/04.102.jpg"  style="width: 400px; height: 300px;"/></center>
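
The two forms of dynamic supply listed above, demand-based and time-based, can be sketched as simple sizing rules. This is an illustration only: the function names, the requests-per-instance figure, the instance limits and the business-hours window are all assumptions, and a real deployment would delegate this logic to an auto scaling service.

```python
import math


def demand_based_instances(demand_rps: float, rps_per_instance: float,
                           minimum: int = 1, maximum: int = 10) -> int:
    """Size the fleet to current demand, clamped between minimum and maximum."""
    if rps_per_instance <= 0:
        raise ValueError("rps_per_instance must be positive")
    needed = math.ceil(demand_rps / rps_per_instance)
    return max(minimum, min(maximum, needed))


def time_based_instances(hour: int, busy_hours: range = range(8, 18),
                         peak: int = 4, off_peak: int = 1) -> int:
    """Size the fleet from a fixed schedule when demand is predictable by time of day."""
    return peak if hour in busy_hours else off_peak


if __name__ == "__main__":
    for demand in (0, 450, 5000):
        print(f"{demand} req/s ->", demand_based_instances(demand, rps_per_instance=100))
    for hour in (9, 22):
        print(f"{hour}:00 ->", time_based_instances(hour))
```

The upper limit doubles as a cost guardrail: demand beyond it has to be absorbed by throttling or buffering, the demand-management techniques from the same list.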


# Sustainability

- Understanding and quantifying the environmental impact of workloads
    - Energy consumption
    - Carbon emissions
- Implementing design principles and best practices to reduce this impact
- Focus areas:
    - Region selection
        - Include sustainability factors in your choice of regions:
            - Some regions are near renewable energy projects
            - Some regions have a lower published carbon intensity than others
    - User behavior patterns
        - Minimize unused infrastructure by adapting to user load
        - Incorporate sustainability goals in your Service-Level Agreements (SLAs)
        - Reduce the distance network traffic must travel by adapting geographic placement to user locations
    - Software and architecture patterns
        - Code optimization to lower time and resource usage
        - Remove or refactor idle or low-usage components and workloads
        - Use technologies and software patterns that minimize data processing and storage requirements
    - Data patterns
        - Limit redundant data and delete unnecessary data
        - Minimize data traffic across networks
        - Back up only when difficult/impossible to recreate data
    - Hardware patterns
        - Use the most energy-efficient instances
        - Keep up to date on improvements in new instance types
        - Use GPUs only for as long as they are needed
        - Use managed services:
            - Shifts sustainability optimization responsibility to AWS
            - Distributes sustainability impact across all hardware users
    - Development and deployment process
        - Evaluate the sustainability impact before performing new deployments
        - Provision build environment resources only when needed
        - Test new features using managed device farms
- Design principles for sustainability
    - Understand your impact
    - Establish sustainability goals
    - Maximize utilization
    - Anticipate and adopt new, more efficient hardware and software offerings
    - Use managed services
    - Reduce the downstream impact of your cloud workloads

<center><img src="images/04.042.jpg"  style="width: 400px; height: 300px;"/></center>
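
The recommendation to remove or refactor idle or low-usage components presupposes a way to find them. A minimal sketch, assuming utilization samples (as fractions between 0 and 1) are already available per component; the component names, the 10% threshold and the `low_usage` helper are hypothetical.

```python
from statistics import mean


def low_usage(utilization: dict[str, list[float]], threshold: float = 0.10) -> list[str]:
    """Return the components whose average utilization is below the threshold."""
    return sorted(
        name
        for name, samples in utilization.items()
        if samples and mean(samples) < threshold  # skip components with no data
    )


if __name__ == "__main__":
    # Made-up utilization samples per component.
    observed = {
        "api": [0.55, 0.60, 0.48],
        "legacy-report": [0.02, 0.01, 0.03],
        "batch": [0.0, 0.35, 0.0],
    }
    for name in low_usage(observed):
        print(f"Candidate for removal or refactoring: {name}")
```

An average hides bursty usage patterns, so a flagged component is a candidate for review rather than something to delete automatically.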
