
#Classes from Coursera "Google IT Automation with Python Specialization" 
#Course 5 - "Configuration Management and the Cloud".

##Basics

- Manual - unmanaged configuration
- Managed - Configuration Management System
- Infrastructure as Code IaC:
 - when all configuration to deploy/manage a node in the infrastructure is stored in version control.

 Puppet:
 - Client: Puppet Agent
 - Server: Puppet Master
 - Providers: the Puppet agent decides which provider to use & passes the attributes on to it.
 - Package management systems
  - Windows: need to add an attribute saying where the installer file is located.
- Resources - basic unit for modeling configuration we want to manage.


In [None]:
#apply sudo - rule example
class sudo {
    package { 'sudo':
             ensure => present,
             }
}

In [None]:
#defining a file resource - example

class sysctl {
    #Make sure directory exists
    file {'/etc/sysctl.d':
          ensure => directory,
          }
}

In [None]:
#making sure timezone file has 3 attributes of: being a file, contents UTC zone, file contents will be Replaced.
class timezone {
    file {'/etc/timezone':
          ensure => file,
          content => "UTC\n",
          replace => true,
          }
}

Puppet Classes:

In [None]:
#package, file & service example - Related to Network Time Protocol NTP

class ntp {
    package {'ntp':
             ensure => latest,
             }
    file {'/etc/ntp.conf':
          source => 'puppet:///modules/ntp/ntp.conf',
          replace => true, 
          }
    service { 'ntp':
             enable => true,
             ensure => running,
             }
}

Domain Specific Languages DSL (in Puppet):
- Limited in scope
- Easier to learn
- Puppet: variables, conditional statements, functions
- Puppet Facts: variables that represent characteristics of the system.


In [None]:
#checking if smartmontools package should be installed: only on physical machines

if $facts['is_virtual'] {
    package {'smartmontools':
             ensure => purged,
             }
} else {
    package { 'smartmontools':
             ensure => installed,
             }
}

Driving Principles of Configuration Management:
- Python - procedural language
- Puppet - declarative language: we state the goal, not the steps.
- Idempotent action: can be performed over & over without changing the system after the first time it's applied.
- exec - might not be idempotent; be careful, as it might break things on later runs.
 - onlyif attribute: run the command only if a test succeeds.
- Test & Repair - actions taken only when necessary
- Stateless - each Puppet run is independent of the previous one.


In [None]:
file { '/etc/issue':
      mode => '0644',
      content => 'Internal system \l \n',
      }

In [None]:
#change to idempotent with ONLYIF

exec { 'move example file':
      command => 'mv /home/user/example.txt /home/user/Desktop',
      onlyif => 'test -e /home/user/example.txt',
      }
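
The same test & repair idea can be sketched in Python, the course's main language. This is only an illustration, not part of the course material: `move_if_present` is a hypothetical helper that checks the current state first (like `onlyif`) and acts only when needed, so running it twice is harmless.

```python
import shutil
from pathlib import Path


def move_if_present(src: Path, dest_dir: Path) -> bool:
    """Move src into dest_dir only when src exists; return True if a move happened."""
    if not src.exists():
        # Test step failed: nothing to do, so repeated runs change nothing.
        return False
    # Repair step: create the destination if needed, then move the file.
    dest_dir.mkdir(parents=True, exist_ok=True)
    shutil.move(str(src), str(dest_dir / src.name))
    return True
```

The first call moves the file and returns True; every later call finds nothing to move and returns False.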

Qwiklabs

In [None]:
echo $PATH

ls /

export PATH=/bin:/usr/bin

cd /etc/puppet/code/environments/production/modules/profile/manifests
cat init.pp

It's common to use numbers to represent the permissions: 4 for read, 2 for write and 1 for execute. The sum of the permissions given to each of the groups is then a part of the final number. For example, a permission of 6 means read and write, a permission of 5 means read and execute, and a permission of 7 means read, write and execute.
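
A small Python sketch of that arithmetic, for checking a mode by hand. The function names `describe_digit` and `describe_mode` are made up for this example; the 4/2/1 values are the ones described above.

```python
def describe_digit(digit: int) -> str:
    """Translate one permission digit (0-7) into the rights it grants."""
    if not 0 <= digit <= 7:
        raise ValueError(f"permission digit must be 0-7, got {digit}")
    names = [(4, "read"), (2, "write"), (1, "execute")]
    granted = [name for bit, name in names if digit & bit]
    return ", ".join(granted) if granted else "none"


def describe_mode(mode: str) -> dict:
    """Split a mode string such as '0644' into owner/group/others rights."""
    digits = [int(ch) for ch in mode[-3:]]
    return dict(zip(("owner", "group", "others"), map(describe_digit, digits)))
```

For the two modes used in this lab, `describe_mode("0646")` shows that others can write to the file, while `describe_mode("0644")` leaves others with read only.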

In [None]:
class profile {
        file { '/etc/profile.d/append-path.sh':
                owner   => 'root',
                group   => 'root',
                mode    => '0646',
                content => "PATH=/java/bin\n",
        }
}

In [None]:
#fixed version:
class profile {
        file { '/etc/profile.d/append-path.sh':
                owner   => 'root',
                group   => 'root',
                mode    => '0644',
                content => "PATH=\$PATH:/java/bin\n",
        }
}

In [None]:
sudo puppet agent -v --test

##Deploying Puppet

###Deploying Puppet Locally

Applying Rules Locally:
- Manifests - create file to store rules - .pp
- Catalog - list of rules that are generated for 1 specific computer once server evaluated all variables, conditionals & functions.


In [None]:
sudo apt install puppet-master

In [None]:
#create new file
vim tools.pp

In [None]:
package { 'htop':
         ensure => present,
         }

In [None]:
sudo puppet apply -v tools.pp

Managing Resource Relationships:
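- use 'require' (and 'notify') to declare dependencies between resources.
- Resource type - written in lower case when declaring a resource, like: package, file, service.
- Resource reference - written with a capital first letter when referring to a resource from another resource's attribute, like: File, Package, Service.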


In [None]:
#ntp.pp

class ntp {
    package { 'ntp':
          ensure => latest,
          }
    file { '/etc/ntp.conf':
          source => '/home/user/ntp.conf',
          replace => true, 
          require => Package['ntp'],
          notify => Service['ntp'],
          }
    service {'ntp':
             enable => true,
             ensure => running,
             require => File['/etc/ntp.conf'],
             }
}

include ntp

In [None]:
sudo puppet apply -v ntp.pp

In [None]:
#configuration file:
vim ntp.conf

#update servers

Organizing Puppet Modules:
- Module: a collection of manifests & associated data.
- Use Topics for separation.
- modules: ntp: has files (ntp.conf) + manifests (init.pp)



In [None]:
#to see structure
tree modules/

In [None]:
sudo apt install puppet-module-puppetlabs-apache

###Deploying Puppet to Clients

Puppet Nodes:
- Use separate Node Definitions - to apply dif rules on dif machines.
- A Node: any system where we can run a Puppet Agent.
- Define a base Class - that can re-use later for common cases.
- Node Definitions: stored in site.pp



In [None]:
#Node example, using 2 classes.

node default {
    class { 'sudo': }
    class { 'ntp':
           servers => ['ntp1.example.com', 'ntp2.example.com']
           }
}

In [None]:
#Specific node example

node webserver.example.com {
    class { 'sudo': }
    class { 'ntp':
           servers => ['ntp1.example.com', 'ntp2.example.com']
           }
    class { 'apache': }
}

Puppet's Certificate Infrastructure:
- Puppet Agent sends Facts to Puppet Master, which processes the Manifests with them & sends the resulting Catalog back to Puppet Agent.
- Puppet uses Public Key Infrastructure PKI
- Secure Socket Layer SSL - used to check server & client's identity.
- Each machine has a pair of keys: Private & Public.
- Certificate Authority CA - verifies a machine's identity & signs a certificate for its public key.
- Always Authenticate Machines!
- To automate signing: make a script that includes extra data in the certificate request, which the CA can recognise.


Setting up Puppet Clients & Servers:

In [None]:
sudo puppet config --section master set autosign true

#connect to client
ssh webserver

#install puppet client on machine
sudo apt install puppet

#configure to talk to server
sudo puppet config set server ubuntu.example.com

#test connection to puppet master
sudo puppet agent -v --test

#go back to puppet master & create node definitions
vim /etc/puppet/code/environments/production/manifests/site.pp

In [None]:
node webserver.example.com {
    class { 'apache':}
}

node default {}

In [None]:
#set to run automatically, after above testing worked.
sudo systemctl enable puppet

sudo systemctl start puppet

sudo systemctl status puppet

###Updating Deployments

Modifying & Testing Manifests:
- Puppet Parser Validate command: checks syntax of manifests.
- --noop parameter: simulates what puppet would do, without doing it.
- Test machines: for testing out changes.
- rspec tests: to check manifests automatically.
- Apply catalog, then use scripts to check that it was applied correctly.

In [None]:
#rspec test:
describe 'gksu', :type => :class do
  let (:facts) { { 'is_virtual' => 'false' }}
  it { should contain_package('gksu').with_ensure('latest') }
end

Safely Rolling out Changes & Validating them:
- Production: part of infrastructure where service is executed/served to users.
- Test environment
- Canaries: early adopters. Push changes here, test, then deploy if fine.
- Keep changes smaller, easier to test/fix.


Qwiklabs

In [None]:
cd /etc/puppet/code/environments/production/modules/packages

cat manifests/init.pp

In [None]:
#init
class packages {
    package { 'python-requests':
        ensure => installed,
    }
}

In [None]:
sudo chmod 646 manifests/init.pp

In [None]:
#init.pp
class packages {
   package { 'python-requests':
       ensure => installed,
   }
   if $facts[os][family] == "Debian" {
     package { 'golang':
       ensure => installed,
     }
  }
   if $facts[os][family] == "RedHat" {
     package { 'nodejs':
       ensure => installed,
     }
  }
}

In [None]:
gcloud compute instances describe linux-instance --zone=us-central1-a --format='get(networkInterfaces[0].accessConfigs[0].natIP)'

In [None]:
sudo puppet agent -v --test

In [None]:
apt policy golang

Fetching Machine info

In [None]:
cd /etc/puppet/code/environments/production/modules/machine_info
cat manifests/init.pp

In [None]:
#init.pp
class machine_info {
   file { '/tmp/machine_info.txt':
       content => template('machine_info/info.erb'),
   }
   if $facts[kernel] == "windows" {
       $info_path = "C:\Windows\Temp\Machine_Info.txt"
   } else {
       $info_path = "/tmp/machine_info.txt"
   }
}

In [None]:
#changed:
class machine_info {
  if $facts[kernel] == "windows" {
       $info_path = "C:\Windows\Temp\Machine_Info.txt"
   } else {
       $info_path = "/tmp/machine_info.txt"
   }
 file { 'machine_info':
       path => $info_path,
       content => template('machine_info/info.erb'),
   }
}


Puppet Templates:
- Embedded Puppet (EPP) uses Puppet expressions in special tags. It's easy for any Puppet user to read, but only works with newer Puppet versions. (≥ 4.0, or late 3.x versions with future parser enabled.)
- Embedded Ruby (ERB) uses Ruby code in tags. You need to know a small bit of Ruby to read it, but it works with all Puppet versions.


In [None]:
cat templates/info.erb
sudo chmod 646 templates/info.erb

In [None]:
#info.erb
Machine Information
-------------------
Disks: <%= @disks %>
Memory: <%= @memory %>
Processors: <%= @processors %>
Network Interfaces: <%= @interfaces %>
}

In [None]:
sudo puppet agent -v --test
cat /tmp/machine_info.txt

New Module - Reboot

In [None]:
sudo mkdir -p /etc/puppet/code/environments/production/modules/reboot/manifests
cd /etc/puppet/code/environments/production/modules/reboot/manifests
sudo touch init.pp

In [None]:
#init.pp
class reboot {
  if $facts[kernel] == "windows" {
    $cmd = "shutdown /r"
  } elsif $facts[kernel] == "Darwin" {
    $cmd = "shutdown -r now"
  } else {
    $cmd = "reboot"
  }
  if $facts[uptime_days] > 30 {
    exec { 'reboot':
      command => $cmd,
     }
   }
}


- shutdown /r on windows
- shutdown -r now on Darwin (macOS)
- reboot on Linux.


In [None]:
sudo nano /etc/puppet/code/environments/production/manifests/site.pp 

In [None]:
node default {
   class { 'packages': }
   class { 'machine_info': }
   class { 'reboot': }
}

##Cloud

###Cloud Computing

Cloud Services Overview:
- SaaS Software as a Service - Cloud provider delivers an entire application to the customer.
- PaaS Platform as a Service - when a cloud provider offers a preconfigured platform to the customer.
- IaaS Infrastructure as a Service - when a cloud provider supplies only the bare-bones computing experience.

Scaling in the Cloud:
- Capacity - how much the service can deliver (tied to number of servers).
- QPS - queries per second.
- Scaling - capacity change.
- Upscaling - increasing capacity.
- Downscaling - decreasing capacity.
- Ways to scale:
 - Horizontally - add more nodes into the pool of service.
 - Vertically - making nodes bigger (resources assigned: memory, CPU).
 - Automatic scaling - Cloud provider will use metrics to automatically increase/decrease capacity.
   - Need Quotas for scaling system, to not overpay.
 - Manual Scaling
   - Need a lot of monitoring.
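
A rough Python sketch of the horizontal autoscaling decision with a quota cap. It is an illustration only: `desired_nodes` and its parameters are invented here, and real cloud autoscalers use richer metrics than QPS alone.

```python
import math


def desired_nodes(current_qps: float, qps_per_node: float,
                  min_nodes: int, max_nodes: int) -> int:
    """Return how many nodes current_qps needs, clamped to [min_nodes, max_nodes]."""
    if qps_per_node <= 0:
        raise ValueError("qps_per_node must be positive")
    # Capacity is tied to the number of nodes, so round up to whole nodes.
    needed = math.ceil(current_qps / qps_per_node)
    # max_nodes plays the role of the quota that keeps us from overpaying.
    return max(min_nodes, min(needed, max_nodes))
```

With nodes that each handle 100 QPS and a quota of 10, a load of 450 QPS gives 5 nodes, while 5000 QPS is capped at 10.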



Evaluating the Cloud:
- Less control
- Certifications & security measures? ISO, etc
  - Multifactor authentication
  - Encrypted file systems
  - Public-key cryptography
- What kind of support?


Migrating to the Cloud:
- Lift & shift strategy - moving from physical on-site servers to VMs in the cloud, like moving equipment to a new, bigger location.
  - same server config.
  - still need to install apps/os.
  - test that working well.
- PaaS
  - Managed web applications - don't manage the platform, just the application code: (Amazon Elastic Beanstalk, Microsoft App Service, Google App Engine)
- Containers - applications that are packaged together with their config & dependencies.
- Types:
  - Public Clouds - service provided by third party.
  - Private clouds - your company owns the services/infrastructure.
  - Hybrid clouds - mix of public/private.
  - Multi-clouds - a mix of public/private clouds across several vendors.


###Managing Instances in the Cloud

Spinning up VMs in the Cloud:
- Set the parameters:
  - Name
  - Region/Zone - closer to users better
  - Machine Type - how many CPUs & how much memory, without overpaying?
  - Boot disk - os + disk space.
- Reference images - store contents of a machine in a reusable format.
- Templating - process of capturing all of system configs to create VMs in a repeatable way.
  - Disk image - a snapshot of a virtual machine's disk at a given point in time.

Creating a New VM with GCP Web UI:
- console.cloud.google.com
- New Project -> Create
- Home -> Open Project
- Compute Engine -> VM Instances -> Create
  - Firewall: Allow HTTP traffic.
  - Click 'command line' - can copy-paste code to create exact same VMs.
- Connect -> SSH

#### Customizing VMs in GCP:
- Reference base image - to deploy repeatedly.
- Connect -> SSH

In [None]:
git clone [link of rep]
./hello_cloud.py
#Connections on port 8000

- run the app with admin privileges on a different port number (80)

In [None]:
sudo ./hello_cloud.py 80

- GCP: click the External IP link to open the app.
- open the Service file, then copy the hello_cloud.py file to the location it gives: /usr/local/bin
- then enable the service to run automatically.
- can test by rebooting the system

In [None]:
cat hello_cloud.service

sudo cp hello_cloud.py /usr/local/bin
sudo cp hello_cloud.service /etc/systemd/system/
sudo systemctl enable hello_cloud

sudo reboot

- after reboot, check if app is running

In [None]:
ps ax | grep hello

In [None]:
sudo apt install puppet
./hello/setup_puppet.sh

####Templating a Customized VM
- in GCP click ... next to SSH -> Stop
- click into machine name
- scroll down, click into Boot disk Name.
- click CREATE SNAPSHOT or CREATE IMAGE
- Create Image
- Instance templates -> Create instance template
  - Image: Custom Images: select the one created
  - Firewall: Allow HTTP traffic
  - Done

- VM instances - + ->Create Instance -> New VM instance from template
  - Continue
  - leave settings as is - should use template.

For Faster Batch Transactions:


In [None]:
gcloud init
#go through config - they will be used as defaults later.

#create 5 new VMs based on template
gcloud compute instances create --source-instance-template webserver-template ws1 ws2 ws3 ws4 ws5
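
For larger batches, the instance names can be generated instead of typed. A minimal Python sketch, assuming the same template name as above; `build_create_command` is a hypothetical helper that only builds and prints the command, it does not run it.

```python
import shlex


def build_create_command(template: str, names: list[str]) -> list[str]:
    """Build the argument list for creating several VMs from one instance template."""
    if not names:
        raise ValueError("at least one instance name is required")
    return [
        "gcloud", "compute", "instances", "create",
        "--source-instance-template", template,
        *names,
    ]


# Generate ws1..ws5 rather than writing each name by hand.
names = [f"ws{i}" for i in range(1, 6)]
command = build_create_command("webserver-template", names)
print(shlex.join(command))
```

To actually create the VMs, the list could be passed to `subprocess.run(command, check=True)` on a machine where `gcloud init` has been completed.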

###Automating Cloud Deployments

####Cloud Scale Deployments:
- Load balancer - ensures each node receives a balanced number of requests.
- Round robin - give each node 1 request in turn.
- Autoscaling - allow the service to change capacity as needed, while service owner only pays for the cost of machines that are in use at any given time.
- Select an ENTRY POINT. Could have layers:
  - Load Balancer
  - Web Cache
    - Varnish tool
    - Nginx tool
  - Web caching as a service
    - Cloudflare
    - Fastly
- For caching application/database data:
  - Memcached
  - Redis
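
Round robin can be sketched in a few lines of Python. This is a toy model, not how any particular load balancer is implemented; `round_robin` and the backend names are made up for the example.

```python
from itertools import cycle


def round_robin(backends: list[str], request_count: int) -> list[str]:
    """Assign request_count requests to backends, one each in turn."""
    if not backends:
        raise ValueError("need at least one backend")
    pool = cycle(backends)  # endlessly repeats the backends in order
    return [next(pool) for _ in range(request_count)]
```

Five requests over three backends go to ws1, ws2, ws3, then wrap around to ws1 and ws2.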



#### Orchestration:
- Automated configuration & coordination of complex IT systems & services.
- Application programming interface API
- Monitoring & Alerting

####Cloud Infrastructure as Code:
- IaC - using machine-readable files to automate configuration.
- Tools:
  - CloudFormation
  - Cloud Deployment Manager
  - Azure Resource Manager
  - Heat Orchestration Templates
  - Terraform - can interact with dif cloud providers. Uses APIs for each provider.
- Nodes can be:
  - Long-lived
  - Short-lived


####Qwiklabs - Create VM template & Automate Deployment

In [None]:
gcloud compute instances create --zone us-west1-b --source-instance-template vm1-template vm2 vm3 vm4 vm5 vm6 vm7 vm8

In [None]:
gcloud compute instances list

###Building Software for the Cloud

####Storing Data in the Cloud:
- Block storage - traditional. Local disks like with GCP instances.
  - Persistent storage: long-lived & need to keep data across reboots & updates.
  - Ephemeral storage: for temporary instances & only need local data while they're running. Good for Containers.
  - Shared file system solutions. With PaaS
  - Good when just need to get files, BUT NOT launching apps.
- Object storage - newer. AKA Blob storage.
  - Lets you place & retrieve objects in a storage bucket.
  - Blobs - binary large objects.
  - Stored in locations called Buckets.
- Databases as a service:
  - SQL - relational
  - NoSQL
- Storage class:
  - Throughput - the amount of data that you can read/write in a given amount of time.
  - Input/Output Operations per Second IOPS - measures how many reads/writes can be done in 1 second, no matter how much data you're accessing.
  - Latency - time it takes to complete a read/write operation.
    - Time to first byte
- Data type:
  - Hot data - accessed frequently.
  - Cold Data - accessed infrequently.
- Hot Storage - usually using Solid State Disks SSD to be faster.
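
Throughput and IOPS are linked by the size of each operation. A small Python sketch of that arithmetic, under the simplifying assumption that every operation moves the same amount of data; the function name and the example numbers are illustrative only.

```python
def throughput_mb_per_s(iops: float, io_size_kb: float) -> float:
    """Throughput in MB/s when every operation moves io_size_kb kilobytes (1 MB = 1000 KB)."""
    return iops * io_size_kb / 1000
```

A disk doing 5000 small 4 KB operations per second moves 20 MB/s, while 500 large 1000 KB operations per second move 500 MB/s. High IOPS does not by itself mean high throughput.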





#### Load Balancing:
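- Round-Robin DNS: each server gets 1 request in turn.
  - The DNS record holds several IP addresses for the same name; the client picks one & tries the next if it doesn't work.
  - Can't control which address clients pick, & they keep reaching out to a server even if it's broken.
  - To remove a server, need to change the DNS record & wait for cached records to expire.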
- Set up a machine as a Dedicated Load Balancer:
  - Sticky sessions - all requests from same client always go to the same backend server. Only use if Really needed.
  - Performing health checks of backend servers.
  - GeoDNS or GeoIP to make sure to connect clients to closest server.
  - Content Delivery Networks CDNs - make up a network of physical hosts that are geographically located as close to end user as possible.
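
A toy Python sketch combining two of the ideas above: sticky sessions (same client, same backend) and health checks (skip broken backends). Hashing the client IP is one possible way to get stickiness and is an assumption of this example, as are the name `pick_backend` and its parameters.

```python
import hashlib


def pick_backend(client_ip: str, backends: list[str], healthy: set[str]) -> str:
    """Pick a backend for client_ip: same client -> same server, skipping unhealthy ones."""
    # Health check result: only backends currently marked healthy are candidates.
    candidates = [b for b in backends if b in healthy]
    if not candidates:
        raise RuntimeError("no healthy backends available")
    # A stable hash of the client address keeps the choice the same between requests.
    digest = hashlib.sha256(client_ip.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(candidates)
    return candidates[index]
```

Note that when the set of healthy backends changes, some clients get remapped to another server, which is one reason to use sticky sessions only if really needed.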

####Change Management:
- Unit Tests
- Integration Tests
- Continuous integration CI - build/test code every time there is a change.
  - Travis CI - works with GitHub projects
- Continuous Deployment CD - to auto deploy results or Build artifacts.
  - controlling deployment with rules.
- Test vs Production Environment
- A/B Testing - testing A vs B configs.
  - run different instance groups for A/B.


####Understanding Limitations:
- How will app be deployed?
- Quotas/Limits?
- Rate limits? Prevent 1 service from overloading the system.
- Utilization limits - cap total amount of a certain resource that you can provision.
- Quota increase?
- Service costs?
- Maintenance & upgrades?
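
To get a feel for how a rate limit behaves, here is a minimal sliding-window limiter in Python. It is a sketch of the general idea, not any cloud provider's actual mechanism; `RateLimiter` and its parameters are invented for this example, and the time is passed in explicitly so the behavior is easy to follow.

```python
from collections import deque


class RateLimiter:
    """Allow at most max_calls calls within any window_seconds-long window."""

    def __init__(self, max_calls: int, window_seconds: float) -> None:
        self.max_calls = max_calls
        self.window_seconds = window_seconds
        self._timestamps = deque()  # times of the calls still inside the window

    def allow(self, now: float) -> bool:
        """Return True and record the call if it fits the limit at time `now`."""
        # Forget calls that have left the window.
        while self._timestamps and now - self._timestamps[0] >= self.window_seconds:
            self._timestamps.popleft()
        if len(self._timestamps) >= self.max_calls:
            return False  # over the limit: the caller should back off and retry later
        self._timestamps.append(now)
        return True
```

In a real client, `now` would typically come from `time.monotonic()`.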



##Monitoring & Alerting

### Getting Started with Monitoring:
- Look at Metrics: what do you want to monitor?
- Response Code: for example, 404 page not found.
  - Range of 500 - something happened on server's end.
  - Range of 400 - client side problem.
- Tools by Cloud Providers:
  - AWS CloudWatch
  - Google Stackdriver
  - Azure Metrics
- Tools used across vendors:
  - Prometheus
  - DataDog
  - Nagios
- Pull Model: monitoring system periodically queries metrics.
- Push Model: our service periodically connects to monitoring system to send metrics.
- Only store metrics that you care about, otherwise costly.
- Whitebox Monitoring - checks behavior of the system from the inside.
- Blackbox monitoring - check behaviors from outside.



###Getting Alerts:
- Linux - cron
- Pages - urgent alerts
- All Alerts should be Actionable, otherwise it's Noise & should be removed.

###Service-Level Objectives:
- SLOs - preestablished performance goals for a specific service. Soft targets.
- Need to be:
  - measurable - like operating availability, uptime.
- SLAs Service level agreements: a commitment between a provider & a client.
- four-nines SLO: error budget 0.01% or SLO 99.99%
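
The error budget can be turned into allowed downtime with simple arithmetic. A short Python sketch; the function name and the 30-day default period are assumptions made for this example.

```python
def error_budget_minutes(slo_percent: float, period_days: float = 30) -> float:
    """Minutes of allowed downtime in period_days for an availability SLO in percent."""
    if not 0 <= slo_percent <= 100:
        raise ValueError("slo_percent must be between 0 and 100")
    budget_fraction = (100 - slo_percent) / 100  # e.g. 99.99% -> 0.0001
    return budget_fraction * period_days * 24 * 60
```

A four-nines SLO leaves roughly 4.3 minutes of downtime in 30 days, while three nines (99.9%) allows about 43 minutes over the same period.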


###Basic Monitoring in GCP:
- Tool: Stackdriver
  - CPU usage
  - Disk I/O
  - Network Traffic
- Create New Alerting Policy

###Troubleshooting & Debugging

####Where is the Failure coming from?
- Check for geo regions: switch locations.
- Try to run in a dif machine-type
- If recent change, do a Rollback
- Run Container locally to see if issue persists.


####Recovering from Failure:
- Automatic Back-ups & check that they are working periodically.
- Secondary instances of services if 1st one goes down.
- Have servers running on dif datacenters, so if 1 goes down, the other takes over.
- If running from office, have 2 separate internet connections in case 1 goes down.
- Use 2 dif cloud vendors.
- Have documented procedures for when system goes down - disaster recovery plan.

####Qwiklabs

- Informational responses (100–199)
- Successful responses (200–299)
- Redirects (300–399)
- Client errors (400–499)
- Server errors (500–599)
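
When reading logs or writing monitoring scripts, the ranges above can be checked in Python. A minimal sketch; `classify_status` is a made-up helper name.

```python
def classify_status(code: int) -> str:
    """Map an HTTP status code to its response class."""
    classes = {
        1: "informational",
        2: "successful",
        3: "redirect",
        4: "client error",
        5: "server error",
    }
    if not 100 <= code <= 599:
        raise ValueError(f"not a valid HTTP status code: {code}")
    # The hundreds digit decides the class: 404 -> 4 -> client error.
    return classes[code // 100]
```

For example, `classify_status(404)` gives "client error" and `classify_status(500)` gives "server error".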


In [None]:
sudo systemctl status apache2

In [None]:
sudo systemctl restart apache2

In [None]:
sudo netstat -nlp

In [None]:
ps -ax | grep python3

In [None]:
cat /usr/local/bin/jimmytest.py
sudo kill [process-id-2711]

In [None]:
sudo systemctl --type=service | grep jimmy

In [None]:
sudo systemctl stop jimmytest && sudo systemctl disable jimmytest