# Google Cloud Platform Fundamentals - Core Infrastructure

Google Cloud Platform offers 4 main types of services

* COMPUTE
* STORAGE
* BIG DATA
* MACHINE LEARNING

The cloud is a great place for your applications and data, because using it frees you from a lot of chores. Google Cloud provides reasonably priced access to the same planet-scale infrastructure that Google runs on.

## What is Cloud Computing

* On Demand Self Service - No Human Intervention Needed to get Resources.
* Broad Network Access - Access from Anywhere.
* Resource Pooling - The provider shares resources from a large pool among its customers.
* Rapid Elasticity - Get more resources quickly as needed.
* Measured Service - Pay only for what you use.

## Need for Cloud Computing

* Colocation - A colocation data center is a facility in which businesses rent space for servers and other computing hardware. The facility provides the building, cooling, power, bandwidth and physical security, while the customer provides the servers and storage. Space is often leased by the rack, cabinet, cage or room, and some facilities also provide managed services.

* Virtualization - Virtualization lets us use resources efficiently. A virtual data center has the same requirements as a physical one (servers, storage and so on), but these are virtual devices that can be managed separately from the underlying hardware. With virtualization you still buy, house and maintain the infrastructure, so you still need to guess how much hardware you will need and by when, and then set it up and keep it running.

* Serverless - The infrastructure is fully managed and hosted by a third party, and we just pay for the resources we use.

![title](Quiz1.PNG)


## Google Compute Architecture

Virtualized data centers gave us IAAS - Infrastructure As A Service and PAAS - Platform As A Service.

* IAAS - Provides raw compute, storage and network, and you pay for what you allocate.
* PAAS - Binds application code to libraries that give access to the infrastructure your application needs, so that you can focus on application logic. You pay for what you use.

According to some publicly available estimates, Google's network carries as much as 40% of the world's internet traffic every day. Google's network is the largest of its kind on earth.

## GCP Regions and Zones

![title](GCP_Regions.PNG)

* A zone is a deployment area for GCP resources.
* Regions are zones grouped together. Regions are independent geographic areas, and you can choose which GCP region your resources will be in.
* All zones within the same region have very fast network connectivity. Locations within the same region have round-trip network latency of under 5 ms.
* Think of a zone as a single failure domain within a region. As part of building fault-tolerant applications, you can spread resources across multiple zones to protect against unexpected failures.
* You can also run resources in different regions. Many customers do this to bring their applications closer to users around the world and to protect against natural disasters.
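
As a rough illustration of spreading resources across zones, the sketch below assigns instances to zones round-robin so that a single zone failure takes out only a share of them. The instance names and the helper `spread_across_zones` are made up for this example, and the zone names are only illustrative.

```python
from collections import Counter
from itertools import cycle


def spread_across_zones(instance_names, zones):
    """Assign each instance to a zone in round-robin order."""
    zone_cycle = cycle(zones)
    return {name: next(zone_cycle) for name in instance_names}


# Hypothetical instances placed in three illustrative zones of one region.
placement = spread_across_zones(
    ["web-1", "web-2", "web-3", "web-4", "web-5"],
    ["us-central1-a", "us-central1-b", "us-central1-c"],
)

# Count how many instances landed in each zone (each failure domain).
per_zone = Counter(placement.values())
print(per_zone)
```

With this placement, losing any one zone leaves at least three of the five instances running.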

The virtual world is built on top of physical servers.

Google was the first to bill by the second, instead of rounding up to bigger units of time. GCP also provides sustained use discounts for running a resource for a significant portion of the billing month.
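
To see why per-second billing matters, here is a small arithmetic sketch comparing it with rounding up to whole hours. The hourly rate is a made-up number, and the sketch ignores any minimum charge or discount, so treat it only as an illustration of the rounding effect.

```python
import math


def cost_per_second_billing(runtime_seconds, hourly_rate):
    """Charge for exactly the seconds used."""
    return runtime_seconds * hourly_rate / 3600


def cost_hourly_rounding(runtime_seconds, hourly_rate):
    """Charge for every started hour, as with coarser billing units."""
    return math.ceil(runtime_seconds / 3600) * hourly_rate


runtime = 61 * 60   # a job that runs for 61 minutes
rate = 0.10         # hypothetical price per hour

print(cost_per_second_billing(runtime, rate))
print(cost_hourly_rounding(runtime, rate))
```

A job that runs one minute past the hour is billed for two full hours under hourly rounding, but only for 61 minutes under per-second billing.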

## Why Choose Google Cloud Platform

GCP enables developers to build, test and deploy applications on Google's highly secure, reliable and scalable infrastructure.

![title](Quiz2.PNG)

![title](Quiz3a.PNG)
![title](Quiz3b.PNG)
![title](Quiz3c.PNG)

# Getting Started with Google Cloud Platform.

* When you run your workload on GCP you use projects to organize them.
* And you use Google Cloud Identity and Access Management - IAM to control who can do what.
* Projects are the main way of organizing your resources in GCP. You can use projects to group together related resources that share a common business objective.
* The principle of least privilege says that each user should have only the privileges needed to do their job. GCP customers use IAM to implement the principle of least privilege.
* There are 4 ways for interacting with GCP Management Layer.
    * Web Based Console
    * SDK and Command Line Tool
    * API
    * Mobile Application

## GCP Resource Hierarchy

![title](Resource_Hierarchy.PNG)

* Resource Hierarchies define trust boundaries
* Group your resources according to your organization hierarchy.
* Level of Hierarchy provides trust boundaries and resource isolation.
* Folder is the only level of hierarchy which is allowed to have Sub Folders.
* You can define policies at the organization node, folder (including subfolder) and project levels.
* Some GCP resources also let you apply policies to individual resources.
* Policies are inherited downward in the hierarchy.
* All GCP resources belong to a project. Projects are the basis for enabling and using GCP services, such as managing APIs, enabling billing, and adding and removing collaborators.
* Projects can have different owners and users, and they are billed and managed separately.

* Each GCP Project has 3 Identifiers
    * Project ID - A permanent, unchangeable identifier that must be unique across GCP.
    * Project Name - A user-friendly name that you assign.
    * Project Number - GCP also assigns a unique number to each project.

* Folders allow for flexible management and they are used to group projects. We can also assign policies at the folder level.
* To use folders you need to have an organization node at the top of the hierarchy. 
* Resources inherit the policies of the parent resource
* You get an organization node automatically if your company is a G Suite customer; otherwise you can create one using Google Cloud Identity.
* Policies implemented at a higher level cannot take away access granted at a lower level.
* You can always move projects into folders.
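
The inheritance rule above can be sketched as a union: a member's effective roles on a resource are the roles granted there plus the roles granted on every ancestor. The hierarchy, member names and role names below are hypothetical and only model the idea; this is not how IAM is implemented.

```python
# Hypothetical hierarchy: each resource maps to its parent (None at the top).
PARENT = {
    "org": None,
    "folder-eng": "org",
    "project-web": "folder-eng",
}

# Hypothetical roles granted directly at each level, per member.
GRANTS = {
    "org": {"alice": {"viewer"}},
    "folder-eng": {},
    "project-web": {"alice": {"editor"}, "bob": {"viewer"}},
}


def effective_roles(member, resource):
    """Union of the roles granted on the resource and on all its ancestors."""
    roles = set()
    node = resource
    while node is not None:
        roles |= GRANTS.get(node, {}).get(member, set())
        node = PARENT[node]
    return roles


print(effective_roles("alice", "project-web"))
print(effective_roles("bob", "folder-eng"))
```

Because the result is a union, a restrictive grant higher up can never remove access that was granted lower down.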


## Identity & Access Management

* IAM lets administrators authorize who can take action on specific resources.
* An IAM policy has 3 parts
    * WHO - Names the user, which can be a Google account, a Google group or a service account.
    * CAN DO WHAT - Defined by an IAM role. An IAM role is a collection of permissions.
    * ON WHICH RESOURCE - Identifies the specific GCP resource.
* There are 3 types of IAM roles
    * Primitive - These are broad. If you apply one to a project, it affects all resources within the project.
        * Owner - Can do everything an Editor can do, plus manage roles and permissions and set up billing. Companies that want someone to manage billing without having the right to change project resources can grant that person the Billing Administrator role.
        * Editor - Can do everything a Viewer can do and can also change the state of resources, for example start and stop them.
        * Viewer - Read-only access.
    * Predefined - These are tailored roles, usually specific to a particular GCP service.
    * Custom - These give granular control over what can be accessed. Custom roles can only be used at the project or organization level, not at the folder level.
* What if you want to give privileges to a Compute Engine virtual machine rather than to a person? In that case you use a service account. For example, if an application running on a VM needs to write to Cloud Storage, you create a service account to authenticate the VM to Cloud Storage. Service accounts use cryptographic keys instead of passwords.
* In addition to being an identity, a service account is also a resource, so it can have its own IAM policies attached to it. For instance, Alice can have the Editor role on a service account and Bob can have the Viewer role.
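
The WHO / CAN DO WHAT / ON WHICH RESOURCE split can be sketched with a policy shaped like the JSON that IAM uses, where each binding ties a role to a list of members. The member addresses and the role-to-permission table below are simplified assumptions for illustration; real roles expand to many fine-grained permissions.

```python
# A policy attached to one resource, in the bindings shape used by IAM policies.
# The members and the project name are hypothetical.
policy = {
    "bindings": [
        {"role": "roles/viewer", "members": ["user:bob@example.com"]},
        {
            "role": "roles/editor",
            "members": [
                "user:alice@example.com",
                "serviceAccount:app-writer@example-project.iam.gserviceaccount.com",
            ],
        },
    ]
}

# Simplified stand-in for what each role allows (an assumption for this sketch).
ROLE_PERMISSIONS = {
    "roles/viewer": {"read"},
    "roles/editor": {"read", "write"},
}


def is_allowed(policy, member, permission):
    """Return True if any binding gives the member a role with the permission."""
    for binding in policy["bindings"]:
        allowed = ROLE_PERMISSIONS.get(binding["role"], set())
        if member in binding["members"] and permission in allowed:
            return True
    return False


print(is_allowed(policy, "user:bob@example.com", "write"))
print(is_allowed(policy, "user:alice@example.com", "write"))
```

Note that the service account appears as a member exactly like a person does, which is what lets a VM act with its own identity.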

![title](Quiz4.PNG)


## Interacting with Google Cloud Platform.

There are 4 ways of interacting with GCP
* Cloud Platform Console
* SDK and Cloud Shell
* REST Based API
* Mobile App

* The GCP Console is a web-based administrative user interface that lets you view and manage all your resources and projects. It also lets you explore, enable and disable the APIs of GCP services, and it gives you access to Cloud Shell. From Cloud Shell you can use the tools provided by the Google Cloud SDK without having to install them anywhere first.
* The Google Cloud SDK is a set of tools for managing resources and applications in GCP, including the gcloud tool, which is the main command line interface for GCP.
    * Ways to get the SDK
        * Use the Cloud Shell button in the GCP Console, which brings up a command line in the web browser on a virtual machine with the commands already installed.
        * Install the SDK on your local system.
        * Use the SDK Docker image.
* The RESTful APIs provide programmatic access to GCP products and services. Applications pass information to the APIs in JSON. The GCP Console lets you turn APIs on and off. For security, most APIs are off by default and many have restrictions. The GCP Console also includes the APIs Explorer, which helps you explore the various APIs interactively.
* The mobile app helps you manage GCP resources from your mobile device.

## Cloud Marketplace

Cloud Marketplace is a tool for quickly deploying functional software packages on Google Cloud Platform. There is no need to manually configure the software, virtual machines, storage or network settings, although you can modify many of them before you launch if you like.

![title](Test1a.PNG)
![title](Test1b.PNG)
![title](Test1c.PNG)
![title](Test1d.PNG)

# Virtual Private Cloud (VPC) Network 

What is a Virtual Private Cloud network?

A virtual private cloud (VPC) is an on-demand configurable pool of shared computing resources allocated within a public cloud environment, providing a certain level of isolation between the different organizations using the resources.

One way of getting started with GCP is to define your own Virtual Private Cloud inside your first GCP project, or you can simply use the default VPC. A VPC network connects your Google Cloud Platform resources to each other and to the internet. You can segment your networks, use firewall rules to restrict access to instances, and create static routes to forward traffic to specific destinations.
 
* Each VPC Network is contained in a GCP Project.
* You can provision Cloud Platform Resources, connect them to each other and isolate them from one another.

* Google Cloud VPC networks are global, and their subnets are regional.
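
One way to picture "global network, regional subnets" is a single network whose address space is carved into one range per region. The sketch below models that with the standard `ipaddress` module. The region names and IP ranges are assumptions chosen for the example.

```python
import ipaddress

# One hypothetical VPC network with a subnet in each of two regions.
subnets = {
    "us-east1": ipaddress.ip_network("10.0.0.0/24"),
    "europe-west1": ipaddress.ip_network("10.0.1.0/24"),
}


def region_for(ip):
    """Return the region whose subnet contains the address, or None."""
    address = ipaddress.ip_address(ip)
    for region, network in subnets.items():
        if address in network:
            return region
    return None


print(region_for("10.0.1.7"))
print(region_for("192.168.0.1"))
```

Both subnets belong to the same network, so instances in them can reach each other on internal addresses even though they sit in different regions.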

![title](Quiz5.PNG)

# Compute Engine

Compute Engine lets you create and run virtual machines on Google's infrastructure.
* There are no up-front investments.
* You can run thousands of virtual machines on a system that is designed to handle such requests and provide fast, consistent performance.

* You can create VMs on GCP using the GCP Console or the gcloud command line tool.
* VMs can run Linux or Windows images, customized versions of those images, or images imported from many of your physical servers.

* You pick a machine type, which determines how much memory and how many virtual CPUs the VM has. The predefined types range from very small to very large. If you don't find a predefined type that meets your needs, you can customize a VM to your requirements.
* If you have workloads such as machine learning and data processing that can take advantage of GPUs, many GCP zones have GPUs available.

* Just like physical computers, VMs need disks. You can choose from two types of persistent storage
    * Standard
    * SSD
* If your application needs high-performance scratch space, you can attach a local SSD, but be sure to store permanent data somewhere else, because the contents of a local SSD do not last past the termination of the VM.
* You also choose a boot image. GCP offers many versions of Linux and Windows that are ready to go, and you can also import images from any of your physical servers.

* Many customers want their VMs to come up with a certain configuration, such as having software packages installed on first boot. You can pass GCP startup scripts that do exactly that, and you can pass other kinds of metadata too.

* Once your VM is up and running, you can take snapshots of its disks for backup purposes or use them as a migration tool.


# Preemptible VM instances

* A preemptible VM is an instance that you can create and run at a much lower price than normal instances. However, Compute Engine might terminate (preempt) these instances if it requires access to those resources for other tasks. Preemptible instances are excess Compute Engine capacity, so their availability varies with usage.

* Suppose you have a workload that does not require any human intervention, such as a batch job analyzing a large data set. You can save money by running the job on preemptible VMs. A preemptible VM differs from an ordinary Compute Engine VM in only one respect: you have given Compute Engine permission to terminate it if its resources are needed elsewhere. When you run jobs on preemptible VMs, make sure they are written so that they can be stopped and restarted.
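
A job that "can be stopped and restarted" usually saves its progress somewhere durable. Here is a minimal sketch of that idea, assuming a local JSON file stands in for durable storage and that `stop_after` simulates a preemption. The function and file names are invented for the example.

```python
import json
import os
import tempfile


def run_batch(items, checkpoint_path, stop_after=None):
    """Sum the items, saving progress after each one so a restart can resume.

    Returns the total when finished, or None if interrupted early.
    """
    state = {"next_index": 0, "total": 0}
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as f:
            state = json.load(f)  # resume from the last saved position

    processed = 0
    while state["next_index"] < len(items):
        if stop_after is not None and processed >= stop_after:
            return None  # simulated preemption
        state["total"] += items[state["next_index"]]
        state["next_index"] += 1
        processed += 1
        with open(checkpoint_path, "w") as f:
            json.dump(state, f)  # persist progress after every item
    return state["total"]


checkpoint = os.path.join(tempfile.mkdtemp(), "checkpoint.json")
data = [1, 2, 3, 4, 5]
first_attempt = run_batch(data, checkpoint, stop_after=2)  # interrupted after two items
second_attempt = run_batch(data, checkpoint)               # picks up at the third item
print(first_attempt, second_attempt)
```

On a real preemptible VM the checkpoint would need to live outside the VM, since anything on a local disk may be gone after termination.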

You can make very large VMs in Compute Engine. At the time of writing, the maximum number of virtual CPUs in a VM was 96 and the maximum memory size was 624 GB. These huge VMs are useful for in-memory databases and CPU-intensive analytics, but the norm is to start by scaling out rather than scaling up.

Compute Engine has a feature called autoscaling, which lets you add VMs to and remove VMs from your application based on load metrics. The other part of making that work is balancing incoming traffic across all the VMs, and Google VPC supports several kinds of load balancing.
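
To make "based on load metrics" concrete, here is a generic target-utilization rule: size the group so that average CPU lands near a target. This is only a sketch of the general idea, not Compute Engine's actual autoscaling algorithm, and the function name, limits and percentages are assumptions.

```python
import math


def desired_instances(current_instances, cpu_percent, target_percent,
                      min_instances=1, max_instances=10):
    """Scale the group so that average CPU moves toward the target."""
    wanted = math.ceil(current_instances * cpu_percent / target_percent)
    # Clamp the result to the configured group size limits.
    return max(min_instances, min(max_instances, wanted))


print(desired_instances(4, 90, 60))  # overloaded group grows
print(desired_instances(4, 30, 60))  # idle group shrinks
```

Four VMs at 90% CPU with a 60% target grow to six, while four VMs at 30% shrink to two.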

![title](Quiz6.PNG)


# Important VPC Capabilities

* Much like physical networks, VPCs have routing tables. These forward traffic from one instance to another within the same network, even across subnetworks and between GCP zones. VPC routing tables are built in, so you don't have to provision or manage a router.

* You also do not need to provision or manage a firewall instance. VPC gives you a global distributed firewall that you can control to restrict access to instances, for both incoming and outgoing traffic.

* You can define firewall rules in terms of metadata tags on Compute Engine instances. For example, you can tag all your web servers with the tag web and write a firewall rule saying that traffic on port 80 is allowed into all VMs with the web tag, no matter what their IP addresses happen to be.

* VPCs belong to GCP projects. When VPCs in several GCP projects need to talk to each other, there are two options depending on the requirement. If you just want to connect VPCs so that they can exchange traffic, use VPC Peering. If you want to use the full power of IAM to control who and what in one project can interact with a VPC in another, use Shared VPC.
    * Shared VPC - Shared VPC allows an organization to connect resources from multiple projects to a common VPC network, so that they can communicate with each other securely and efficiently using internal IPs from that network. When you use Shared VPC, you designate a project as a host project and attach one or more other service projects to it. The VPC networks in the host project are called Shared VPC networks. Eligible resources from service projects can use subnets in the Shared VPC network.
    * VPC Network Peering - Google Cloud Platform (GCP) Virtual Private Cloud (VPC) Network Peering allows private RFC 1918 connectivity across two VPC networks regardless of whether or not they belong to the same project or the same organization.

* Virtual machines can autoscale to respond to changing load. But how does traffic get distributed across all those VMs? That is the job of Cloud Load Balancing, a fully distributed, software-defined managed service for all your traffic. Because the load balancers don't run in VMs that you have to manage, you don't have to worry about scaling or managing them.
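
The tag-based firewall rule described in the list above can be modelled in a few lines: a rule names a target tag, and it applies to every instance carrying that tag regardless of IP address. The class names, instance names and addresses are made up, and the sketch assumes that traffic not matched by any allow rule is denied.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FirewallRule:
    target_tag: str
    protocol: str
    port: int


@dataclass(frozen=True)
class Instance:
    name: str
    ip: str
    tags: frozenset


def ingress_allowed(instance, protocol, port, rules):
    """Allow traffic only if some rule targets one of the instance's tags."""
    return any(
        rule.target_tag in instance.tags
        and rule.protocol == protocol
        and rule.port == port
        for rule in rules
    )


# One rule: allow TCP port 80 into anything tagged "web".
rules = [FirewallRule(target_tag="web", protocol="tcp", port=80)]
web_vm = Instance("frontend-1", "10.0.0.5", frozenset({"web"}))
db_vm = Instance("database-1", "10.0.0.9", frozenset({"db"}))

print(ingress_allowed(web_vm, "tcp", 80, rules))
print(ingress_allowed(db_vm, "tcp", 80, rules))
```

The IP address never appears in the check, which is why the rule keeps working when web servers are added, removed or re-addressed.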

![title](LoadBalancing.PNG)

* Google's Cloud DNS provides a public Domain Name Service. DNS translates internet host names to addresses, and Google has a highly developed DNS infrastructure. For the host names of applications that you build in GCP, Cloud DNS helps the world find them.

# Cloud CDN (Content Delivery Network)

* Cloud CDN (Content Delivery Network) uses Google's globally distributed edge points of presence to cache HTTP(S) load balanced content close to your users. Caching content at the edges of Google's network provides faster delivery of content to your users while reducing serving costs.

* Google has a global system of edge caches. You can use it to accelerate content delivery in your application with Google Cloud CDN (Content Delivery Network). Your customers experience lower latency, the origins of your content experience reduced load, and you can save money. If you already have a CDN that you would like to keep using, Google's CDN Interconnect lets you do that.

* You can interconnect other networks, such as on-premises networks or networks in other clouds, with your Google VPCs. There are several ways to do this
    * Cloud Interconnect - Cloud Interconnect extends your on-premises network to Google's network through a highly available, low latency connection. You can use Google Cloud Interconnect - Dedicated (Dedicated Interconnect) to connect directly to Google or use Google Cloud Interconnect - Partner (Partner Interconnect) to connect to Google through a supported service provider.
    * Cloud VPN - You can set up a Virtual Private Network over the internet using the IPsec protocol, then use Cloud Router to exchange route information over the VPN using the Border Gateway Protocol (BGP). Traffic is encrypted and travels between the two networks over the public internet. Cloud VPN is useful for low-volume data connections.
    * Peering 
        * Direct Peering - Direct Peering allows you to establish a direct peering connection between your business network and Google's edge network and exchange high-throughput cloud traffic. When established, Direct Peering provides a direct path from your on-premises network to Google services, including the full suite of Google Cloud Platform products. Traffic from Google's network to your on-premises network also takes that direct path, including traffic from VPC networks in your projects. 
        * Carrier Peering - Carrier Peering enables you to access Google applications, such as G Suite, by using a service provider to obtain enterprise-grade network services that connect your infrastructure to Google. When connecting to Google through a service provider, you can get connections with higher availability and lower latency, using one or more links. Work with your service provider to get the connection you need.

![title](Test2a.PNG)
![title](Test2b.PNG)
![title](Test2c.PNG)

# Introduction to Google Cloud Platform Storage Options.

Every application needs to store data. Different application workloads require different storage and database solutions.

## Google Cloud Storage - Object Storage

* File Storage - Here you manage your data as hierarchy of folders.
* Block Storage - Here the operating system manages data as chunks of disk.
* Object Storage - Object storage (also known as object-based storage) is a computer data storage architecture that manages data as objects, as opposed to other storage architectures like file systems, which manage data as a file hierarchy, and block storage, which manages data as blocks within sectors and tracks. Each object typically includes the data itself, a variable amount of metadata, and a globally unique identifier. Often these unique keys are in the form of URLs, which means that object storage interacts well with web technologies.

Cloud Storage is a fully managed, scalable service, which means that you don't need to provision capacity ahead of time. You just create objects and the service stores them with high durability and high availability.

You can use Cloud Storage for lots of things, such as serving website content, storing data for archival and disaster recovery, or distributing data to your end users via direct download.

Since each object has a unique URL, each object feels like a file. Cloud Storage consists of buckets, which are used to hold and manage your objects.

These storage objects are immutable, which means you cannot edit them in place. Instead, you create new versions.

Cloud Storage always encrypts your data on the server side before it is written to disk, and by default data in transit is encrypted using HTTPS.

Cloud Storage objects are organized into buckets. When you create a bucket, you give it a globally unique name and specify a geographic location where the bucket and its contents are stored. Pick a location that minimizes latency for your users.

There are several ways to control access to your objects and buckets. For most purposes IAM is sufficient, but if you need finer control you can use Access Control Lists.

Access Control List [ACL] defines who has access to your buckets and objects as well as what level of access they have.

Each ACL contains two pieces of information
* Scope - Defines who can perform the specified action, such as a user or a group of users.
* Permission - Defines what actions can be performed, such as read or write.

You can turn on object versioning for your buckets if you want. If it is enabled, Cloud Storage keeps a history of modifications, that is, overwrites and deletes, for all the objects in the bucket. You can list the archived versions of an object, restore an object to an older state, or permanently delete a version as needed. If you do not turn on object versioning, new always overwrites old.

Cloud Storage also offers lifecycle management policies. For example, you can tell Cloud Storage to delete objects older than 365 days, or to keep only the 3 most recent versions of each object in a bucket.
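
The two example policies can be sketched as a small function that decides which versions of one object would be removed. This is a simplified model that applies both rules together, not the Cloud Storage lifecycle configuration format, and the version ids and dates are invented.

```python
from datetime import date


def versions_to_delete(versions, today, max_age_days=365, keep_newest=3):
    """Return the ids of versions removed by either rule.

    versions: list of (version_id, created_date) tuples for one object.
    """
    newest_first = sorted(versions, key=lambda v: v[1], reverse=True)
    doomed = set()
    for rank, (version_id, created) in enumerate(newest_first):
        too_old = (today - created).days > max_age_days
        too_many = rank >= keep_newest  # beyond the N most recent versions
        if too_old or too_many:
            doomed.add(version_id)
    return doomed


history = [
    ("v1", date(2022, 6, 1)),
    ("v2", date(2023, 3, 1)),
    ("v3", date(2023, 9, 1)),
    ("v4", date(2023, 11, 1)),
    ("v5", date(2023, 12, 15)),
]
print(versions_to_delete(history, today=date(2024, 1, 1)))
```

Here v1 is removed by both rules, while v2 is still under a year old but falls outside the three newest versions.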


## Cloud Storage Classes.

Cloud Storage lets you choose among 4 different storage classes.
* Regional - High-performance object storage that lets you store your data in a specific GCP region. It is cheaper than Multi-Regional storage but offers less redundancy. It is best for storing data close to the Compute Engine virtual machines or Kubernetes Engine clusters that use it, which gives better performance for data-intensive applications.
* Multi-Regional - High-performance object storage that costs more but is geo-redundant. You pick a broad geographical location such as the United States, Europe or Asia, and Cloud Storage stores the data in at least 2 geographic locations separated by at least 160 km. This is great for frequently accessed data such as website content.
* Nearline - Backup and archival storage. A low-cost, highly durable service for storing infrequently accessed data. It is great for data accessed about once a month. It incurs an access fee for every GB of data read.
* Coldline - Backup and archival storage. A highly durable service for data archiving, online backup and disaster recovery. It is best for data accessed about once a year. It incurs a higher access fee for every GB of data read.

All storage classes are accessible through the Cloud Storage API, and all offer millisecond access times.

![title](StorageClasses.PNG)

There are many ways of bringing data into cloud storage.

![title](DataTransfer.PNG)

Cloud Storage is often the ingestion point for data being moved into the cloud and is also the long term storage location for data.

Online Transfer - Many customers use gsutil, the Cloud Storage command from the Cloud SDK. You can also move data in using drag and drop in the GCP Console if you use the Google Chrome browser.

Storage Transfer Service - If you have to upload terabytes or petabytes of data, you can use the online Storage Transfer Service or the offline Transfer Appliance. Storage Transfer Service lets you schedule and manage batch transfers to Cloud Storage from another cloud provider, from a different Cloud Storage region or from an HTTPS endpoint.

Transfer Appliance - This is a rackable, high-capacity storage server that you lease from Google Cloud. You connect it to your network, load it with data and ship it to an upload facility, where the data is uploaded to Cloud Storage. This service lets you securely transfer up to a petabyte of data on a single appliance.
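
A quick back-of-the-envelope calculation shows why shipping an appliance can beat an online transfer for very large data sets. The link speed below is an assumed example, and the sketch ignores protocol overhead unless you pass an `efficiency` below 1.0.

```python
def transfer_days(data_bytes, link_bits_per_second, efficiency=1.0):
    """Days needed to push the data through a network link."""
    seconds = data_bytes * 8 / (link_bits_per_second * efficiency)
    return seconds / 86_400  # seconds per day


PETABYTE = 10**15  # decimal petabyte, in bytes
GIGABIT_PER_SECOND = 10**9

days = transfer_days(PETABYTE, GIGABIT_PER_SECOND)
print(round(days, 1))
```

Under these assumptions a petabyte needs about three months of a fully used 1 Gbps link, which is the kind of case the Transfer Appliance is meant for.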

There are also many other ways to import data into Cloud Storage, because it is tightly integrated with many GCP products and services. For example, you can import and export data between Cloud Storage and BigQuery or Cloud SQL.

![title](CloudStorage.PNG)

![title](Cloud_Storage_Quiz.PNG)


## Cloud Bigtable - NoSQL Big Data Database Service

* Cloud Bigtable is Google's NoSQL big data database service. Unlike relational databases, where a fixed table schema is enforced by the database, NoSQL databases have no such restriction. Not every row needs to have the same columns, and the database can take advantage of this by sparsely populating the rows.
* Databases in Bigtable are sparsely populated tables that can scale to billions of rows and thousands of columns, allowing you to store petabytes of data. GCP fully manages the service, so you don't have to configure and tune it.
* Cloud Bigtable is great for storing large amounts of data with low latency. It supports high throughput for both reads and writes, so it is a great choice for operational and analytical applications.
* Cloud Bigtable is offered through the same open source API as HBase, which is the native database for the Apache Hadoop project.
* Reasons to choose Cloud Bigtable
    * Scalability - If you manage your own HBase installation, scaling past a certain rate of queries per second is going to be tough, but with Bigtable you can just increase your machine count, which does not even require downtime. Cloud Bigtable also handles administrative tasks such as upgrades and restarts transparently.
    * All data in Cloud Bigtable is encrypted in flight and at rest.
    * You can use IAM to control who has access to data in Bigtable.
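
"Sparsely populated" means that only the cells that actually hold a value take up space, and rows are free to use different columns. The toy class below models that with nested dictionaries. It is an in-memory illustration with invented row keys and column names, not the Bigtable client API.

```python
from collections import defaultdict


class SparseTable:
    """Toy wide table that stores only the cells that have values."""

    def __init__(self):
        self._rows = defaultdict(dict)  # row key -> {column: value}

    def put(self, row_key, column, value):
        self._rows[row_key][column] = value

    def get(self, row_key, column, default=None):
        # Reading a missing cell does not create it.
        return self._rows.get(row_key, {}).get(column, default)

    def stored_cells(self):
        return sum(len(columns) for columns in self._rows.values())


table = SparseTable()
table.put("user#1", "profile:name", "Ada")
table.put("user#1", "stats:logins", 42)
table.put("user#2", "profile:email", "bob@example.com")

print(table.get("user#2", "stats:logins"))  # a cell that was never written
print(table.stored_cells())
```

Two rows with three possible columns would be six cells in a fixed schema, but only the three written cells are stored here.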
    
![title](AccessPattern.PNG)

![title](CLoud_BigTable.PNG)


## Google Cloud SQL - Relational Data Base Service

Relational database services use a database schema to help your application keep its data consistent and correct. Cloud SQL offers a choice of MySQL or PostgreSQL database engines as a fully managed service.

You can also run your own relational database engine on a Compute Engine instance, but Cloud SQL has several advantages
* Cloud SQL provides several replica services, such as read, failover and external replicas. This means that if an outage occurs, Cloud SQL can replicate data between multiple zones with automatic failover.
* It helps you back up your data with either on-demand or scheduled backups.
* It can scale vertically by changing the machine type, or horizontally via read replicas.
* Cloud SQL instances include network firewalls, and customer data is encrypted when on Google's networks and when stored in database tables, temporary files and backups.
* Cloud SQL instances are accessible by other GCP services and by external services.

![title](CloudSQL.PNG)

If Cloud SQL does not meet your requirements for horizontal scalability, consider Cloud Spanner. It offers transactional consistency at a global scale, schemas, SQL and automatic synchronous replication for high availability, while providing petabytes of capacity.

Cloud Spanner is worth considering if you have outgrown your relational database, are sharding your databases for throughput and high performance, need transactional consistency, or just want to consolidate your databases. Typical use cases include financial applications and inventory management applications.

![title](Cloud_SQL_Spanner.PNG)


## Google Cloud Datastore 

Cloud Datastore is a highly scalable NoSQL document database built for automatic scaling, high performance, and ease of application development. One of its main use cases is storing structured data from App Engine apps. You can also build solutions that span App Engine and Compute Engine with Cloud Datastore as the integration point.
Cloud Datastore automatically handles sharding and replication, providing a highly available and durable database that scales automatically to handle load. It also offers transactions that affect multiple database rows and lets you do SQL-like queries. To get started, it has a free daily quota that provides storage, reads, writes, deletes and small operations at no charge.

![title](Cloud_DataStore_Quiz.PNG)

## Comparing Storage Options

![title](StorageOptions.PNG)
![title](StorageOptions1.PNG)

![title](GCP_Storage_Quiz1.PNG)
![title](GCP_Storage_Quiz2.PNG)
![title](GCP_Storage_Quiz3.PNG)




# Containers, Kubernetes, and Kubernetes Engine

Compute Engine is GCP's Infrastructure as a Service offering, with access to servers, file systems and networking. App Engine is GCP's Platform as a Service offering. Containers and Kubernetes Engine are a hybrid that conceptually sits between the two, with benefits from both.

Infrastructure as a Service allows you to share compute resources with other developers by virtualizing the hardware using virtual machines. Each developer can deploy their own operating system, access the hardware and build their applications in a self-contained environment with access to RAM, file systems, networking interfaces and so on. This flexibility comes at a cost. The smallest unit of compute is an app with its VM, and the guest OS may be gigabytes in size and take minutes to boot. In return, you have the tools of your choice on a configurable system: you can install your favorite runtime, web server, database or middleware, configure underlying system resources such as disk space, disk I/O or networking, and build as you like. However, as demand for your application increases, you have to copy an entire VM and boot the guest OS for each instance of your app, which can be slow and costly.

In App Engine you get access to programming services, so all you do is write your code in self-contained workloads that use these services and include any dependent libraries. As demand for your application increases, the platform scales your application seamlessly and independently by workload and infrastructure. This scales rapidly, but you will not be able to fine-tune the underlying architecture to save cost.

That is where containers come in. The idea is to give you the independent scalability of workloads and an abstraction layer over the OS and hardware. What you get is an invisible box around your code and its dependencies, with limited access to your own partition of the file system and hardware. A container requires only a few system calls to create, and it starts as quickly as a process. All you need on each host is an OS kernel that supports containers and a container runtime.

In essence you are virtualizing the OS. It scales like PAAS and gives you nearly the same flexibility as IAAS. With this abstraction your code is ultra portable, and you can treat the OS and hardware as a black box, so you can go from development to staging to production, or from your laptop to the cloud, without changing or rebuilding anything.

If you want to scale a web server, for example, you can do so in seconds and deploy dozens or hundreds of them, depending on the size of your workload, on a single host. That is a simple example of scaling one container running a whole application on a single host. More likely you will want to build your application from lots of containers, each performing its own function, like microservices. If you build them this way and connect them with network connections, you can make the application modular, easy to deploy and independently scalable across a group of hosts. The hosts can scale up and down, and start and stop containers, as demand for your application changes or as hosts fail. A tool that helps you do this well is Kubernetes.

Kubernetes makes it easy to orchestrate many containers on many hosts, scale them as microservices and deploy rollouts and rollbacks. 

Containers can be built using an open source tool called Docker, which defines a format for bundling your application, its dependencies and machine-specific settings into a container. You could also use a different tool like Google Container Builder.

Kubernetes is an open source orchestrator that abstracts containers at a higher level so you can better manage and scale your applications. At the highest level, Kubernetes is a set of APIs that you can use to deploy containers on a set of nodes called a cluster. The system is divided into a set of master components that run as the control plane and a set of nodes that run containers. In Kubernetes a node represents a computing instance like a machine. In Google Cloud, nodes are virtual machines running in Compute Engine. You describe a set of applications and how they should interact with each other, and Kubernetes figures out how to make that happen.

Now that you have built your container, you will want to deploy it into a cluster. Kubernetes can be configured with many options and add-ons, but it can be time consuming to bootstrap from the ground up. Instead you can bootstrap Kubernetes using Kubernetes Engine (GKE). GKE is Kubernetes hosted by Google. GKE clusters can be customized and support different machine types, numbers of nodes and network settings.

A Pod is the smallest unit in Kubernetes that you create or deploy. A Pod represents a running process on your cluster, either as a component of your application or as an entire app. Generally you have only one container per Pod, but if you have multiple containers with a hard dependency you can package them into a single Pod so that they share networking and storage. The Pod provides a unique network IP and a set of ports for your containers, along with options that govern how the containers should run. Containers inside a Pod can communicate with each other using localhost and ports that remain fixed as they are started and stopped on different nodes.
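
To make the Pod description concrete, here is a minimal sketch that builds a two-container Pod manifest as a plain Python dictionary and prints it as JSON. The field names (`apiVersion`, `kind`, `metadata`, `spec.containers`, `containerPort`) follow the Kubernetes Pod schema. The Pod name, container names, images and ports are hypothetical values chosen only for illustration.

```python
import json


def build_pod_manifest(name, containers):
    """Return a Pod manifest as a plain dict.

    `containers` is a list of (container_name, image, port) tuples.
    All containers listed here land in the same Pod, so they share
    one network IP and can reach each other over localhost.
    """
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "containers": [
                {
                    "name": container_name,
                    "image": image,
                    "ports": [{"containerPort": port}],
                }
                for container_name, image, port in containers
            ]
        },
    }


# Hypothetical example: a web server plus a tightly coupled helper container.
manifest = build_pod_manifest(
    "web-with-helper",
    [
        ("web", "nginx:1.25", 80),
        ("metrics", "example/metrics-agent:1.0", 9100),
    ],
)
print(json.dumps(manifest, indent=2))
```

In most cases the list would hold a single container. The second entry is there only to show how containers with a hard dependency end up in one Pod.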

![title](Containers_Quiz1.PNG)
![title](Kubernetes_Quiz1.PNG)
![title](Kubernetes_Quiz2.PNG)

![title](Containers_Kuberneter_Test1.PNG)
![title](Containers_Kuberneter_Test2.PNG)


# Introduction to App Engine

We have discussed two GCP products that provide the compute infrastructure for applications: Compute Engine and Kubernetes Engine. With both you choose the infrastructure in which your application runs, based on virtual machines for Compute Engine and containers for Kubernetes Engine. If you do not want to focus on infrastructure at all and want to focus only on the code, then App Engine is the solution.

The App Engine platform manages the hardware and networking infrastructure required to run your code. App Engine is therefore a PAAS for building scalable applications. To deploy an application on App Engine, you just hand App Engine your code and the App Engine service takes care of the rest. App Engine provides built-in services that many web applications need: NoSQL databases, in-memory caching, load balancing, health checks, logging and a way to authenticate users. You code your application to take advantage of these services and App Engine provides them.

App Engine will scale your application automatically in response to the amount of traffic it receives, so you only pay for the resources you use. There are no servers for you to provision or maintain. That is why App Engine is especially suited for applications where the workload is highly variable or unpredictable, like web applications and mobile backends.

App Engine offers two environments: Standard and Flexible.

![title](App_Engine_Quiz1.PNG)

## App Engine Standard Environment

The Standard Environment is the simpler of the two. It offers a simpler deployment experience than the Flexible Environment and fine-grained autoscaling. It also offers a free daily usage quota for some services. What is distinctive about the Standard Environment is that low-utilization applications might be able to run at no charge.

Google provides App Engine SDKs in several languages so you can test your application locally before you upload it to the App Engine service. The SDKs also provide simple commands for deployment.

The App Engine Standard Environment provides runtimes for specific versions of Java, Python, PHP and Go. These runtimes also include libraries that support the App Engine APIs, and for many applications the Standard Environment runtimes and libraries may be all you need. If you want to code in another language, the Standard Environment is not the one for you and you will need to consider the Flexible Environment.

The Standard Environment also enforces restrictions on your code by making it run in a sandbox, a software construct that is independent of the hardware, OS or physical location of the server it runs on. The sandbox is one of the reasons why the Standard Environment can scale and manage your application in a very fine-grained way. Like all sandboxes, it imposes some constraints. For example, your application cannot write to the local file system, so it has to write to a database service instead if it needs to make data persistent. All requests your application receives have a 60 second timeout, and you cannot install arbitrary third-party software.

![title](AppEngineStandard.PNG)

## App Engine Flexible Environment

The App Engine Flexible Environment lets you specify the container your application runs in. Your application runs inside Docker containers on Google Compute Engine virtual machines. App Engine manages these Compute Engine VMs for you: they are health checked and healed as necessary, you get to choose which geographic region they run in, and critical backward-compatible updates to their operating systems are applied automatically. All this is so that you can just focus on your code.

![title](AppComaprision.PNG)

![title](AppKubernetes.PNG)

![title](App_Engine_Quiz2.PNG)


## Google Cloud Endpoints and Apigee Edge

The implementation of a software service can be complex and changeable. If other pieces of software had to know the internal details of how a service worked in order to use it, that would be difficult. Instead, application developers structure the software they write so that it presents a clean, well-defined interface that abstracts away needless detail, and then they document that interface. That is an API. The underlying implementation can change as long as the interface does not, and other pieces of software that use the API do not have to know or care. Sometimes you have to change an API to add or deprecate a feature. To make this kind of change cleanly, developers version their APIs.
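
As a rough illustration of API versioning, the sketch below keeps two versions of the same operation side by side so that existing callers of `v1` keep working after `v2` changes the response shape. The function names, the version labels and the `call_api` dispatcher are all hypothetical. This is not how Cloud Endpoints or Apigee Edge implement versioning.

```python
def greet_v1(name):
    """Original interface: returns a plain string."""
    return f"Hello, {name}"


def greet_v2(name, greeting="Hello"):
    """Newer interface: adds an option and returns a structured response."""
    return {"message": f"{greeting}, {name}"}


# Each published version maps to the implementation that backs it.
API_VERSIONS = {"v1": greet_v1, "v2": greet_v2}


def call_api(version, *args, **kwargs):
    """Dispatch a call to the requested API version."""
    try:
        handler = API_VERSIONS[version]
    except KeyError:
        raise ValueError(f"unsupported API version: {version}") from None
    return handler(*args, **kwargs)


print(call_api("v1", "Ada"))
print(call_api("v2", "Ada", greeting="Hi"))
```

Callers choose a version explicitly, so the implementation behind `v2` can evolve without breaking anyone who still depends on `v1`.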

![title](API.PNG)

Supporting an API is a very important task, and GCP provides two API management tools: Cloud Endpoints and Apigee Edge.

Suppose you are developing a software service on one of GCP's backends. You would like to make it easy to expose its API, make sure it is only consumed by developers you trust, have an easy way to monitor and log its use, and give the API a single coherent way to know which end user is making a call. That is when you use Cloud Endpoints. It implements these capabilities and more using an easy-to-deploy proxy in front of your software service, and it provides an API console that wraps up those capabilities in an easy-to-manage interface.

![title](CloudEndPoints.PNG)

Apigee Edge is also a platform for developing and managing API proxies. It has a different orientation, with a focus on business problems like rate limiting, quotas and analytics. Many users of Apigee Edge provide software services to other companies, and those features come in handy.

![title](App_Engine_Test1.PNG)
![title](App_Engine_Test2.PNG)

# Development in Cloud

## Cloud Source Repositories

Lots of GCP customers use Git to store and manage their source code trees. That means either running their own Git instance or using a hosted Git provider. Running your own gives you great control, while using a hosted provider is less work.
What if you want to keep code private to a GCP project and use IAM permissions to protect it, but not have to maintain the Git instance yourself? That is what Cloud Source Repositories is for. It provides Git version control to support your team's development of any application or service, including those that run on App Engine, Compute Engine or Kubernetes Engine. With Cloud Source Repositories you can have any number of private Git repositories, which lets you organize the code associated with your cloud project in whatever way works best for you.

Cloud Source Repositories also includes a source viewer so you can browse and view repository files from within the GCP console.

## Cloud Functions

Many applications contain event-driven parts. Maybe you have an application that lets users upload images. Whenever that happens you may need to process the image in various ways, such as resizing it, converting it to a different format and storing it. You can always integrate this function into your application, but then you have to worry about providing compute resources for it, no matter whether it happens once a day or once a millisecond. What if you could write a single-purpose function that does the necessary work and arrange for it to run automatically whenever a new image is uploaded? That is what Cloud Functions lets you do. You do not need to worry about servers or runtime binaries. You just write your code in JavaScript for a Node.js environment that GCP provides and configure when it should fire. You pay whenever your functions run, in 100 ms increments. Cloud Functions can trigger on events in Cloud Storage, Cloud Pub/Sub or an HTTP call. Some applications, especially those with a microservices architecture, can be implemented entirely in Cloud Functions. You can also use Cloud Functions to enhance existing applications without having to worry about scaling.
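
The event-driven idea and the 100 ms billing granularity can be sketched in a few lines. This is a plain Python illustration, not the Cloud Functions API, and it is written in Python only to stay consistent with the other sketches in these notes. The `on_event` decorator, the `emit` helper, the `image.uploaded` event name and the `billed_ms` function are all hypothetical.

```python
HANDLERS = {}  # event type -> list of functions to run when it occurs


def on_event(event_type):
    """Decorator that registers a function to fire for one event type."""
    def register(func):
        HANDLERS.setdefault(event_type, []).append(func)
        return func
    return register


@on_event("image.uploaded")
def make_thumbnail(event):
    # A real function would resize the image; this one only names the output.
    return f"thumb-{event['name']}"


def emit(event_type, event):
    """Run every function registered for this event and collect the results."""
    return [handler(event) for handler in HANDLERS.get(event_type, [])]


def billed_ms(run_ms, increment_ms=100):
    """Round a run time in whole milliseconds up to the next billing increment."""
    return -(-run_ms // increment_ms) * increment_ms  # ceiling division


print(emit("image.uploaded", {"name": "cat.png"}))
print(billed_ms(230))
```

Nothing runs, and under this model nothing is billed, until an event arrives. A 230 ms run rounds up to 300 ms of billed time.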

![title](Could_Source_Repository.PNG)


# Deployment: Infrastructure as Code

Setting up an environment in GCP can entail many steps: setting up compute, network and storage resources and keeping track of their configurations. When you have to make changes, running all the commands again and maintaining all the configurations is time consuming. It is more efficient to use a template, which is a specification of what the environment should look like. It is declarative rather than imperative. GCP provides Deployment Manager to let you do just that. It is an infrastructure management service that automates the creation and management of your GCP resources. To use it, you create a template file, using either the YAML markup language or Python, that describes what you want the components of your environment to look like. Then you give the template to Deployment Manager, which figures out and performs the actions needed to create the environment your template describes. If you need to change your environment, you edit your template and then tell Deployment Manager to update the environment to match the change. You can store and version control your Deployment Manager templates in Cloud Source Repositories.
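
The difference between declarative and imperative can be sketched as a small planning function: you state the desired environment, and a tool works out which actions are needed to get there from the current one. This is an illustration of the idea only, not how Deployment Manager is implemented. The resource names and properties below are hypothetical.

```python
def plan_changes(current, desired):
    """Compare two {resource_name: properties} dicts and list the actions needed."""
    actions = []
    # Resources that are wanted but do not exist yet.
    for name in sorted(desired.keys() - current.keys()):
        actions.append(("create", name))
    # Resources that exist in both but whose properties differ.
    for name in sorted(current.keys() & desired.keys()):
        if current[name] != desired[name]:
            actions.append(("update", name))
    # Resources that exist but are no longer described.
    for name in sorted(current.keys() - desired.keys()):
        actions.append(("delete", name))
    return actions


current_state = {
    "web-vm": {"machineType": "n1-standard-1"},
    "old-disk": {"sizeGb": 10},
}
desired_state = {
    "web-vm": {"machineType": "n1-standard-2"},
    "app-network": {"autoCreateSubnetworks": True},
}

for action, resource in plan_changes(current_state, desired_state):
    print(action, resource)
```

Editing `desired_state` and planning again is the declarative workflow: the description changes, and the tool derives the commands.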

![title](Could_Functions.PNG)

# Monitoring: Proactive Instrumentation
## Stackdriver

You cannot run an application stably without monitoring. Monitoring helps you figure out whether the changes you made were good or bad. It lets you respond with information rather than panic when an end user complains that your application is down. Stackdriver is GCP's tool for monitoring, logging and diagnostics. Stackdriver gives you access to many different kinds of signals from your infrastructure platforms, virtual machines, containers, middleware and application tier: logs, metrics and traces.

It gives insights into your application's health, performance and availability, so if issues occur you can fix them faster.

Below are the core components of Stackdriver.
* Monitoring - Checks the endpoints of web applications and other internet-accessible services running in your cloud environment. You can configure uptime checks associated with URLs, groups or resources such as instances and load balancers. You can set up alerts on interesting criteria, such as when health check results or uptimes fall to levels that need action, and you can create dashboards to visualize the state of your application.
* Logging - Lets you view logs from your applications and filter and search them. It also lets you define metrics based on log contents, which can be incorporated into dashboards and alerts. You can export logs to BigQuery, Cloud Storage and Cloud Pub/Sub.
* Trace - Samples the latency of App Engine applications and reports per-URL statistics.
* Error Reporting - Tracks and groups the errors in your application and notifies you when new errors are detected.
* Debugging - The traditional way of debugging is to go back to the code and add lots of logging and debugging statements. Debugger connects your application's production data to your source code, so you can inspect the state of your application at any code location in production. That means you can view application state without adding logging statements. Debugger works best when your application's source code is available.
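
The idea of a metric based on log contents feeding an alert can be sketched as follows. This is plain Python over an in-memory list, not the Stackdriver API. The log lines, the `ERROR` filter and the threshold are hypothetical.

```python
def count_matching(log_lines, needle):
    """A log-based metric in miniature: how many lines contain `needle`."""
    return sum(1 for line in log_lines if needle in line)


def should_alert(metric_value, threshold):
    """Fire an alert once the metric reaches the threshold."""
    return metric_value >= threshold


logs = [
    "INFO request served in 12ms",
    "ERROR database timeout",
    "INFO request served in 9ms",
    "ERROR database timeout",
]

error_count = count_matching(logs, "ERROR")
print(error_count, should_alert(error_count, threshold=2))
```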

![title](DeploymentQuiz1.PNG)
![title](DeploymentQuiz2.PNG)

# Introduction to Big Data and Machine Learning

## Google Cloud Big Data Platform

## Cloud Dataproc

Google Cloud Big Data solutions are designed to help you transform your business and user experiences with meaningful data insights. They are serverless, which means you do not have to provision compute instances to run your jobs. The services are fully managed and you only pay for the resources you consume. The platform is integrated so that GCP data services work together to help you create custom solutions.

Apache Hadoop is an open source framework for big data. It is based on the MapReduce programming model, which Google invented and published. In MapReduce, one function, called the map function, runs in parallel across a massive dataset to produce intermediate results. Another function, called the reduce function, builds a final result set based on all those intermediate results.
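
A word count is the usual small example of the MapReduce model. The sketch below runs on a single machine with plain Python and is not Hadoop code. It only shows the roles of the map step, the grouping of intermediate results by key, and the reduce step. The function names are chosen for illustration.

```python
from collections import defaultdict


def map_words(line):
    """Map step: emit an intermediate (word, 1) pair for every word in a line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group intermediate values by key so each key is reduced once."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_counts(key, values):
    """Reduce step: combine all intermediate values for one key."""
    return key, sum(values)


def word_count(lines):
    # In a real cluster the map calls would run in parallel on many machines.
    intermediate = [pair for line in lines for pair in map_words(line)]
    return dict(reduce_counts(k, v) for k, v in shuffle(intermediate).items())


print(word_count(["the cat", "The dog the end"]))
```

Each call to `map_words` depends only on its own line, which is what allows the map step to be spread across many machines.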

The term Hadoop is often used informally to encompass Apache Hadoop itself and related projects like Spark, Pig and Hive. Cloud Dataproc is a fast, easy, managed way to run Hadoop, Spark, Hive and Pig on Google Cloud Platform. All you have to do is request a Hadoop cluster. It will be built for you in 90 seconds or less on top of Compute Engine virtual machines whose number and type you control. If you need more or less processing power while your cluster is running, you can scale it up or down. You can use the default configuration for the Hadoop software in your cluster or you can customize it, and you can monitor your cluster using Stackdriver.

Running an on-premises Hadoop cluster requires capital investment. Running Hadoop clusters in Cloud Dataproc lets you pay only for the hardware resources used during the life of the cluster you create. All Cloud Dataproc clusters are billed in one-second clock-time increments, subject to a one-minute minimum. When you are done with your cluster you can delete it, and the billing stops. This is a much more agile use of resources than on-premises hardware.
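
The billing rule described above (one-second increments with a one-minute minimum) works out as in this small sketch. It computes billed time only. No prices are included because the text does not give any, and the function name is hypothetical.

```python
import math

MINIMUM_BILLED_SECONDS = 60  # the one-minute minimum from the text


def billed_seconds(runtime_seconds):
    """Round cluster run time up to whole seconds and apply the minimum."""
    if runtime_seconds <= 0:
        return 0
    return max(math.ceil(runtime_seconds), MINIMUM_BILLED_SECONDS)


for runtime in (12.2, 90.5, 3600):
    print(runtime, "->", billed_seconds(runtime))
```

A cluster that lives for 12.2 seconds is billed for 60 seconds, while one that lives for 90.5 seconds is billed for 91.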

You can also save money by telling Cloud Dataproc to use preemptible Compute Engine instances for your batch processing. You have to make sure your jobs can be restarted cleanly if they are terminated, and in return you get a significant break in the cost of the instances.

Once your data is in a cluster, you can use Spark and Spark SQL to do data mining, and you can use MLlib, Apache Spark's machine learning library, to discover patterns through machine learning.

## Cloud Dataflow

Cloud Dataproc is great when you know the size of your data set or when you want to manage your cluster size yourself. But what if your data shows up in real time, or is of unpredictable size or rate? That is where Cloud Dataflow is a good choice. It is both a unified programming model and a managed service, and it lets you develop and execute a big range of data processing patterns: extract, transform and load (ETL), batch computation and continuous computation. You use Dataflow to build data pipelines, and the same pipelines work for both batch and streaming data. There is no need to spin up a cluster or size instances. Cloud Dataflow automates the management of whatever processing resources are required, and it frees you from operational tasks like resource management and performance optimization. Dataflow is a general purpose ETL tool.
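
The claim that one pipeline can serve both batch and streaming data can be illustrated with Python generators. This sketch is not the Dataflow programming model or its SDK. It only shows that pipeline steps written against "a sequence of records" do not care whether the source is a finite list or records that arrive one at a time. The record format and step names are hypothetical.

```python
def parse(records):
    """Transform step: turn 'user,amount' text into (user, amount) tuples."""
    for line in records:
        user, amount = line.split(",")
        yield user, float(amount)


def only_large(rows, minimum):
    """Filter step: keep rows whose amount is at least `minimum`."""
    for user, amount in rows:
        if amount >= minimum:
            yield user, amount


def pipeline(source, minimum=10.0):
    """The same steps are applied whatever kind of iterable `source` is."""
    return only_large(parse(source), minimum)


# Batch: a bounded data set that is fully available up front.
batch_source = ["ann,5", "bob,25", "cy,40"]


# Streaming stand-in: records produced one at a time as they "arrive".
def stream_source():
    yield "dee,12"
    yield "eli,3"


print(list(pipeline(batch_source)))
print(list(pipeline(stream_source())))
```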

## BigQuery

Suppose that, instead of a dynamic pipeline, what you need is a way of exploring a vast sea of data. You want to run ad hoc SQL queries on a massive data set. That is what BigQuery is for. It is Google's fully managed, petabyte-scale, low-cost analytics data warehouse. Because there is no infrastructure to manage, you can focus on analyzing data to find meaningful insights, using familiar SQL and taking advantage of a pay-as-you-go model.

It is easy to get data into BigQuery. You can load it from Cloud Storage or Cloud Datastore, or stream it into BigQuery at up to 100,000 rows per second. Once it is in there, you can run very fast SQL queries against multiple terabytes of data in seconds, using the processing power of Google's infrastructure. In addition to SQL queries, you can easily read and write data in BigQuery using Cloud Dataflow, Hadoop and Spark.

BigQuery is used by all types of organizations. Google's infrastructure is global and so is BigQuery. BigQuery lets you specify the region where your data will be kept. Because BigQuery separates storage and computation, you pay for your data storage separately from your queries. That means you pay for queries only when they are actually running.

You have full control over who has access to the data stored in BigQuery, including sharing data sets with people in different projects. You also get automatic discounts for long-term storage: Google automatically drops the price of storage once data has been in BigQuery for 90 days.

## Cloud Pub/Sub - Publishers/Subscribers

Whenever you are working with events in real time, it helps to have a messaging service. That is what Cloud Pub/Sub is. It is meant to serve as a simple, reliable, scalable foundation for stream analytics. You can use it to let independent applications send and receive messages. That way they are decoupled, so they scale independently.

The Pub in Pub/Sub is short for publishers and the Sub is short for subscribers. Applications publish messages in Pub/Sub and one or more subscribers receive them. Receiving messages does not have to be synchronous, which is what makes Pub/Sub great for decoupling systems.

Cloud Pub/Sub offers on-demand scalability to one million messages per second and beyond. You just choose the quota you want. It is an important building block for applications where data arrives at high and unpredictable rates, like IoT systems. If you are analyzing streaming data, Cloud Dataflow is a natural pairing with Cloud Pub/Sub. Pub/Sub also works well with applications built on GCP's compute platforms. You can configure subscribers to receive messages on a push or a pull basis. In other words, subscribers can be sent new messages as they arrive, or they can check for new messages themselves.
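
The push and pull delivery styles can be sketched with a tiny in-memory topic. This is a single-process illustration of the pattern, not the Cloud Pub/Sub client library. The `Topic` class and its method names are hypothetical.

```python
from collections import deque


class Topic:
    """A minimal in-memory topic with push and pull subscriptions."""

    def __init__(self):
        self._pull_queues = {}     # subscription name -> pending messages
        self._push_callbacks = []  # functions called on every publish

    def subscribe_pull(self, name):
        """Create a subscription whose messages wait until they are pulled."""
        self._pull_queues[name] = deque()

    def subscribe_push(self, callback):
        """Register a callback that is sent each new message as it arrives."""
        self._push_callbacks.append(callback)

    def publish(self, message):
        # The publisher does not know or care who the subscribers are.
        for queue in self._pull_queues.values():
            queue.append(message)
        for callback in self._push_callbacks:
            callback(message)

    def pull(self, name, max_messages=10):
        """Let a subscriber check for messages whenever it is ready."""
        queue = self._pull_queues[name]
        batch = []
        while queue and len(batch) < max_messages:
            batch.append(queue.popleft())
        return batch


topic = Topic()
topic.subscribe_pull("analytics")
pushed = []
topic.subscribe_push(pushed.append)

topic.publish("sensor-1: 20.5")
topic.publish("sensor-2: 19.0")

print(pushed)                   # push subscriber already has both messages
print(topic.pull("analytics"))  # pull subscriber collects them on request
```

The pull subscriber receives the messages later than they were published, which is the asynchronous decoupling described above.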

## Project Jupyter - Cloud Datalab

Project Jupyter lets you create and maintain web-based notebooks containing Python code, and you can run that code interactively and view the results. Cloud Datalab takes the management work out of this technique. It runs in a Compute Engine virtual machine. To get started, you specify the virtual machine type you want and the GCP region it should run in. When it launches, it presents an interactive Python environment that is ready to use, and it orchestrates multiple GCP services automatically so you can focus on exploring your data. You only pay for the resources you use. It is integrated with BigQuery, Compute Engine and Cloud Storage, so accessing your data does not run into authentication hassles. When you are up and running, you can visualize your data with Google Charts or matplotlib.

## Google Cloud Machine Learning Platform

Machine learning is one branch of artificial intelligence. It is a way of solving problems without explicitly coding the solution. Instead, programmers build systems that improve over time through repeated exposure to sample data, which is called training data. Major Google applications use machine learning, including YouTube, Photos, the Google mobile app and Google Translate.
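
As a small illustration of a system that improves through repeated exposure to training data, the sketch below fits a straight line to four sample points using gradient descent. It is plain Python rather than any Google service or library. The data, the learning rate and the number of passes are hypothetical choices.

```python
def train(samples, epochs=2000, learning_rate=0.05):
    """Fit y = w * x + b by gradient descent on the mean squared error."""
    w, b = 0.0, 0.0
    n = len(samples)
    for _ in range(epochs):
        # Gradients of the mean squared error with respect to w and b.
        grad_w = sum((w * x + b - y) * x for x, y in samples) * 2 / n
        grad_b = sum((w * x + b - y) for x, y in samples) * 2 / n
        # Each pass over the training data nudges the parameters a little.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


# Training data generated from the rule y = 2x + 1, which is never coded explicitly.
training_data = [(0, 1), (1, 3), (2, 5), (3, 7)]

w, b = train(training_data)
print(round(w, 3), round(b, 3))
```

The rule relating `x` to `y` appears nowhere in `train`. The parameters are recovered from the examples alone.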

The Google machine learning platform is now available as a cloud service so that you can add innovative capabilities to your own applications. The Cloud Machine Learning platform provides modern machine learning services, with pre-trained models and a platform for generating your own tailored models. As with other GCP products, there is a range of services that stretches from the highly general to the pre-customized.

TensorFlow is an open source software library that is very well suited to machine learning applications like neural networks. It was developed by Google for internal use and then open sourced so that the world could benefit. You can run TensorFlow wherever you like, but GCP is an ideal place for it because machine learning models need lots of on-demand compute resources and lots of training data. TensorFlow can also take advantage of Tensor Processing Units, which are hardware devices designed to accelerate machine learning workloads with TensorFlow. GCP makes them available in the cloud with Compute Engine virtual machines.

Suppose you want a more managed service. Google Cloud Machine Learning Engine lets you easily build machine learning models that work on any type of data, of any size. It can take any TensorFlow model and perform large-scale training on a managed cluster.

Finally, suppose you want to add machine learning capabilities to your application without having to worry about the details of how they are provided. Google Cloud also offers a range of machine learning APIs suited to specific purposes.

## Machine Learning API

The Cloud Vision API enables developers to understand the content of an image. It quickly classifies images into thousands of categories, detects individual objects within an image, and finds and reads printed words contained within an image. It encapsulates powerful machine learning models behind an easy-to-use API. You can use it to build metadata on your image catalog, moderate offensive content or even do image sentiment analysis.

The Cloud Speech API enables developers to convert audio to text. Because you have an increasingly global user base, the API recognizes over 80 languages and variants. You can transcribe the text of users dictating to an application's microphone, enable command and control through voice, or transcribe audio files.

The Cloud Natural Language API provides a variety of natural language understanding technologies to developers. It can do syntax analysis, breaking down sentences supplied by your users into tokens, identifying the nouns, verbs, adjectives and other parts of speech, and figuring out the relationships among the words. It can do entity recognition. In other words, it can parse text and flag mentions of people, organizations, locations, events, products and media. It can understand the overall sentiment expressed in a block of text. It has these capabilities in multiple languages.

The Cloud Translation API provides a simple programmatic interface for translating an arbitrary string into a supported language. When you do not know the source language, the API can detect it.

The Cloud Video Intelligence API lets you annotate videos in a variety of formats. It helps you identify key entities, that is nouns, within your video and when they occur. You can use it to make your video content searchable and discoverable.

![title](Big_Data_and_ML_Quiz1.PNG)
![title](Big_Data_and_ML_Quiz2.PNG)
![title](Big_Data_and_ML_Quiz3.PNG)


# Summary

![title](Summary1.PNG)
![title](Summary2.PNG)
![title](Summary3.PNG)
![title](Summary4.PNG)
![title](Summary5.PNG)

![title](Summary_Quiz1.PNG)
![title](Summary_Quiz2.PNG)
![title](Summary_Quiz3.PNG)
