# Data Engineering, Big Data, and Machine Learning on Google Cloud Platform

As the number of internet-connected devices continues to grow, the volume of data generated worldwide is growing with it. Because of this proliferation of information, it is more important than ever to be able to build applications that automatically derive insights from vast quantities of data.

The Google Cloud Platform (GCP) is built on four core infrastructure components, with a product layer on top: underlying all applications is **security**; above this base are **compute power**, **storage**, and **networking**; and atop these sit high-level **big data and machine learning products** that abstract away difficult implementation work.

#### Compute Power

Google Compute Engine is an Infrastructure as a Service (IaaS) offering that lets users run virtual machines on Google's infrastructure.

Google trains machine learning models on a vast network of data centers. Smaller, trained versions of these models are then deployed onto consumer hardware. You can access the results of Google's AI research through pre-trained models that can be used out of the box.

As Moore's Law has slowed and improvements in compute performance have plateaued, one solution has been to build Application-Specific Integrated Circuits (ASICs): chips tailored to a single class of workload so that they run it more efficiently. Google has created Tensor Processing Units (TPUs), ASICs with more memory and faster processors that are specifically optimized for machine learning workloads. TPUs in the cloud enable businesses to tackle large, challenging problems that would otherwise be impractical.

`Google Cloud Platform > Compute > Compute Engine > VM instances`

An example process:

* Spin up a VM
* Perform processing
* Copy output into cloud storage
* Stop the VM
* Serve files to end users
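
The VM-side steps of a process like this can be sketched as a sequence of `gcloud` and `gsutil` command lines. The Python below only builds the commands and prints them; it does not run anything. The instance name, zone, bucket, processing command, and output path are hypothetical placeholders, and the sketch assumes the VM's service account is allowed to write to the bucket. The copy is issued before the stop because it runs on the VM itself. The final step, serving files to end users, is left out because it depends on how access to the bucket is configured.

```python
import shlex


def build_workflow_commands(instance, zone, bucket, job, output_path):
    """Return the workflow as a list of argument lists, one per step.

    Every argument is a caller-supplied placeholder; nothing is executed here.
    """
    # Prefix shared by the steps that run a command on the VM over SSH.
    ssh = ["gcloud", "compute", "ssh", instance, f"--zone={zone}"]
    return [
        # 1. Spin up a VM.
        ["gcloud", "compute", "instances", "create", instance, f"--zone={zone}"],
        # 2. Perform the processing on the VM.
        ssh + [f"--command={job}"],
        # 3. Copy the output into a Cloud Storage bucket while the VM is running.
        ssh + [f"--command=gsutil cp {shlex.quote(output_path)} gs://{bucket}/"],
        # 4. Stop the VM once the output is safely in the bucket.
        ["gcloud", "compute", "instances", "stop", instance, f"--zone={zone}"],
    ]


# Hypothetical values, used only to show the shape of the commands.
steps = build_workflow_commands(
    instance="demo-vm",
    zone="us-central1-a",
    bucket="demo-bucket",
    job="python3 process.py",
    output_path="/tmp/output.csv",
)
for argv in steps:
    print(shlex.join(argv))
```

Keeping each step as an argument list means it could later be passed to `subprocess.run` unchanged, once the placeholder values are replaced with real ones.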

#### Storage

One major way that cloud computing differs from typical desktop computing is that compute and storage are independent. The size of the disks attached to a compute instance does not limit the amount of data that can be processed and stored. Rather, data is transferred via pipelines into a cloud storage service, for example an elastic storage bucket. The `gsutil` command-line tool, part of the Google Cloud SDK, provides a Unix-like syntax for copying files into buckets.
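
To illustrate the Unix-like syntax, here is a minimal sketch that builds `gsutil` command lines for a few actions that mirror their Unix counterparts (`cp`, `ls`, `mv`, `rm`). The helper names `gs_url` and `gsutil_command`, the bucket name, and the file names are assumptions made for this example; the code only constructs the commands and does not contact Cloud Storage.

```python
# Subset of gsutil actions that share a name with a Unix command.
UNIX_LIKE_ACTIONS = {"cp", "ls", "mv", "rm"}


def gs_url(bucket, object_name=""):
    """Build a gs:// URL for a bucket, or for an object inside it."""
    return f"gs://{bucket}/{object_name}" if object_name else f"gs://{bucket}"


def gsutil_command(action, *args):
    """Build the argument list for a Unix-like gsutil action."""
    if action not in UNIX_LIKE_ACTIONS:
        raise ValueError(f"unsupported action in this sketch: {action}")
    return ["gsutil", action, *args]


# Hypothetical bucket and file names.
upload = gsutil_command("cp", "results.csv", gs_url("demo-bucket", "data/results.csv"))
listing = gsutil_command("ls", gs_url("demo-bucket"))

print(" ".join(upload))
print(" ".join(listing))
```

The only difference from copying between local directories is that the destination is a `gs://` URL rather than a file system path.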

#### Networking

Google’s private network is the largest in the world, comprising thousands of miles of fiber optic cable that provide petabit-scale bisection bandwidth. This network interconnects with the public Internet at many edge points of presence worldwide. When a user accesses a Google resource, the request is directed to the location that will provide the lowest-latency response.
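
The idea of choosing the lowest-latency location can be shown with a toy sketch. This is not how Google's network makes the decision internally; it only illustrates the selection rule, and the location names and latency figures are made up for the example.

```python
def pick_lowest_latency(latencies_ms):
    """Return the location whose measured latency is smallest.

    latencies_ms maps a location name to a latency in milliseconds.
    """
    if not latencies_ms:
        raise ValueError("no candidate locations")
    return min(latencies_ms, key=latencies_ms.get)


# Hypothetical measurements from one user's point of view.
measured = {"edge-a": 42.0, "edge-b": 17.5, "edge-c": 88.1}
print(pick_lowest_latency(measured))
```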

#### Security

Google provides security for lower-level systems, such as physical hardware and data encryption, that would otherwise be difficult for many businesses to manage on their own. Similarly, customizable user access controls in BigQuery enable fine-grained control over who can access data and encryption keys.