![image.png](attachment:image.png)

Not long ago, cutting-edge computer vision algorithms couldn’t differentiate between images of cats and dogs. Today, a skilled data scientist equipped with nothing more than a laptop can classify tens of thousands of objects with greater accuracy than the human eye. In this course, you will use TensorFlow 2.6 to develop, train, and make predictions with the models that have powered major advances in recommendation systems, image classification, and FinTech. You will learn both high-level APIs, which will enable you to design and train deep learning models in 15 lines of code, and low-level APIs, which will allow you to move beyond off-the-shelf routines. You will also learn to predict housing prices and credit card borrower defaults, and to classify images of sign language gestures.

# Introduction to TensorFlow
Learning Objectives:
- define constants and variables
- perform tensor addition, multiplication, & compute derivatives
- Knowledge of linear algebra is helpful, but not necessary

### Constants & Variables

These are my notes from a course on the fundamentals of the TensorFlow API in Python. We will briefly introduce TensorFlow and then discuss its two basic objects of computation: constants and variables.

**What is TensorFlow?** TensorFlow is an open-source library for graph-based numerical computation. It was developed by the Google Brain Team. It has both low- and high-level APIs. You can use TensorFlow to perform addition, multiplication, and differentiation. You can also use it to design and train machine learning models. TensorFlow 2.0 brought with it substantial changes. Eager execution is now enabled by default, which allows users to write simpler and more intuitive code. Additionally, model building is now centered around the Keras and Estimators high-level APIs.

- an open-source library for graph-based numerical computation
    - developed by the Google Brain Team
- it has both low- and high-level APIs
    - perform addition, multiplication, and differentiation
    - design and train machine learning models
- TensorFlow 2.0 brought substantial changes
    - Eager execution is now enabled by default, which allows users to write simpler and more intuitive code 
    - model building is now centered around the Keras and Estimators high-level APIs.

**What is a tensor?** The TensorFlow documentation describes a tensor as **"a generalization of vectors and matrices to potentially higher dimensions."** Now, if you are not familiar with linear algebra, you can simply think of a tensor as **a collection of numbers, which is arranged into a particular shape.**
- "a generalization of vectors and matrices to potentially higher dimensions"
- a collection of numbers, which is arranged into a particular shape

**Example.** As an example, let's say you have a slice of bread and you cut it into 9 pieces. One of those 9 pieces is a 0-dimensional tensor: this corresponds to a single number. A collection of 3 pieces that form a row or column is a 1-dimensional tensor. All 9 pieces together are a 2-dimensional tensor. And the whole loaf, which contains many slices, is a 3-dimensional tensor.
![image-2.png](attachment:image-2.png)

Now that you know what a tensor is, let's define a few. We will start by importing `tensorflow` as `tf`. We will then define 0-, 1-, 2-, and 3-dimensional tensors. Note that each object will be a `tf.Tensor` object.
![image-3.png](attachment:image-3.png)

If we want to print the array contained in that object, we can call its `.numpy()` method and pass the result to the `print` function.
![image-4.png](attachment:image-4.png)
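A minimal sketch of defining tensors of each rank and reading back their contents. The variable names and the 3-element sizes are my own choices to mirror the bread example; they are not taken from the course code.

```python
import tensorflow as tf

# Tensors of increasing rank, each filled with ones
scalar = tf.constant(1.0)      # 0-dimensional: a single number
vector = tf.ones((3,))         # 1-dimensional: a row of 3 pieces
matrix = tf.ones((3, 3))       # 2-dimensional: a full slice of 9 pieces
loaf = tf.ones((3, 3, 3))      # 3-dimensional: a stack of 3 slices

# Each object is a tf.Tensor; shape.rank gives its number of dimensions
ranks = [t.shape.rank for t in (scalar, vector, matrix, loaf)]
print(ranks)

# .numpy() returns the underlying NumPy array, which prints cleanly
matrix_array = matrix.numpy()
print(matrix_array)
```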

We next move on to constants, which are the simplest category of tensor in TensorFlow. A constant does not change and cannot be trained. It can, however, have any dimension. In the code block, we've defined two constants. The constant a is a 2x3 tensor of 3s. The constant b is a 2x2 tensor, which is constructed from the 1-dimensional tensor: 1, 2, 3, 4.
![image-5.png](attachment:image-5.png)
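The two constants described above can be sketched as follows, assuming the `shape` argument of `tf.constant` is used both to fill `a` and to rearrange the values of `b`.

```python
import tensorflow as tf

# A 2x3 constant in which every element is 3
a = tf.constant(3, shape=[2, 3])

# A 2x2 constant built from the 1-dimensional values 1, 2, 3, 4
b = tf.constant([1, 2, 3, 4], shape=[2, 2])

print(a.numpy())
print(b.numpy())
```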

**Convenience Functions** In the previous example, we worked exclusively with the `constant` operation. However, in some cases, there are more convenient options for defining certain types of special tensors. You can use the `zeros` or `ones` operations to generate a tensor of arbitrary dimension that is populated entirely with zeros or ones. You can use the `zeros_like` or `ones_like` operations to populate tensors with zeros and ones, copying the shape of some input tensor. Finally, you can use the `fill` operation to populate a tensor of arbitrary dimension with the same scalar value in each element.
![image-6.png](attachment:image-6.png)
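A short sketch of the convenience operations listed above. The shapes, the `template` tensor, and the fill value 7 are arbitrary choices for illustration.

```python
import tensorflow as tf

# Tensors populated entirely with zeros or ones
zeros = tf.zeros([2, 2])
ones = tf.ones([2, 3])

# Copy the shape (and dtype) of an existing tensor
template = tf.constant([[1, 2], [3, 4]])
zeros_like = tf.zeros_like(template)
ones_like = tf.ones_like(template)

# Fill a 3x3 tensor with the same scalar value
sevens = tf.fill([3, 3], 7)

print(zeros_like.numpy())
print(sevens.numpy())
```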

**Defining and initializing variables** Unlike a constant, a variable's value can change during computation. The value of a variable is shared, persistent, and modifiable. However, its data type and shape are fixed. Let's take a look at how variables are constructed and used in TensorFlow. In the code, we first define a variable, `a0`, which is a 1-dimensional tensor with 6 elements. We can set its datatype to a 32-bit float or something else, such as a 16-bit int, as we have for `a1`. We then define a constant, `b`, and define `c0` as the product of `a0` and `b`. Note that certain TensorFlow operations, such as `tf.multiply`, are overloaded, which allows us to use the simpler `a0*b` expression instead.
![image-7.png](attachment:image-7.png)
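A sketch of the variable example described above. The specific values are assumptions, and the final `assign` call is my own addition to show that a variable's value can change while its dtype and shape stay fixed.

```python
import tensorflow as tf

# Two variables with the same values but different data types
a0 = tf.Variable([1, 2, 3, 4, 5, 6], dtype=tf.float32)
a1 = tf.Variable([1, 2, 3, 4, 5, 6], dtype=tf.int16)

# A constant to multiply by
b = tf.constant(2, tf.float32)

# tf.multiply and the overloaded * operator give the same result
c0 = tf.multiply(a0, b)
c1 = a0 * b

# Unlike a constant, a variable can be updated in place
a0.assign(a0 + 1.0)

print(c0.numpy())
print(a0.numpy())
```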

### Basic Operations

**What is a TensorFlow operation?** TensorFlow has a model of computation that revolves around the use of graphs. A TensorFlow graph contains edges and nodes, where the edges are tensors and the nodes are operations.
![image-8.png](attachment:image-8.png)

In the graph shown, which was drawn using TensorFlow, the const operations define 2 by 2 constant tensors. Two tensors are summed using the add operation.
![image-9.png](attachment:image-9.png)

Another two tensors are then summed using the add operation.
![image-10.png](attachment:image-10.png)

Finally, the resulting matrices are multiplied together with the matmul operation.
![image-11.png](attachment:image-11.png)

**Applying the addition operator** Let's start with the addition operator. We will first import the constant and add operations. We may now use constant to define 0-dimensional, 1-dimensional, and 2-dimensional tensors.
![image-12.png](attachment:image-12.png)

Finally, let's add them together using the operation for tensor addition. Note that we can perform scalar addition with A0 and B0, vector addition with A1 and B1, and matrix addition with A2 and B2.
![image-13.png](attachment:image-13.png)

**Performing tensor addition** The add operation performs element-wise addition with two tensors. Each pair of tensors added must have the same shape. Element-wise addition of the scalars 1 and 2 yields the scalar 3. Element-wise addition of the vectors 1,2 and 3,4 yields the vector 4,6. Element-wise addition of the matrices 1,2,3,4 and 5,6,7,8 yields the matrix 6,8,10,12. Furthermore, the add operator is overloaded, which means that we can also perform addition using the plus symbol.
![image-14.png](attachment:image-14.png)
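The additions above can be reproduced with a few lines. This sketch uses true scalars for `A0` and `B0`, which may differ slightly from how the course code defines them.

```python
from tensorflow import constant, add

# 0-, 1-, and 2-dimensional tensors
A0, B0 = constant(1), constant(2)
A1, B1 = constant([1, 2]), constant([3, 4])
A2, B2 = constant([[1, 2], [3, 4]]), constant([[5, 6], [7, 8]])

# Element-wise addition with the add operation
C0 = add(A0, B0)
C1 = add(A1, B1)
C2 = add(A2, B2)

# The overloaded + operator gives the same result
C2_plus = A2 + B2

print(C0.numpy(), C1.numpy())
print(C2.numpy())
```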

**Perform multiplication in TensorFlow** We will consider both element-wise and matrix multiplication. For element-wise multiplication, which is performed with the multiply operation, the tensors involved must have the same shape. For instance, you may want to multiply the vector 1,2,3 by 3,4,5 or 1,2 by 3,4. For matrix multiplication, you use the matmul operator. Note that performing matmul(A,B) requires that the number of columns of A equal the number of rows of B.
![image-15.png](attachment:image-15.png)

**Application of multiplication operators** Let's look at some examples of multiplication in TensorFlow. We'll import the `ones` operation, along with the two types of multiplication we will use. We will also define a scalar, `A0`, a 3 by 1 vector of ones, a 3 by 4 matrix of ones, and a 4 by 3 matrix of ones. What operations can be performed using these tensors of ones? We can perform element-wise multiplication of any tensor by itself, such as `A0` by `A0`, `A31` by `A31`, or `A34` by `A34`. We can also perform matrix multiplication of `A43` by `A34`, but not `A43` by `A43`.
![image-16.png](attachment:image-16.png)
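A sketch of these multiplications, assuming the tensors are defined with `tf.ones`. The `try`/`except` block is my own addition to show that the incompatible product is rejected.

```python
import tensorflow as tf

A0 = tf.ones([])        # scalar
A31 = tf.ones([3, 1])   # 3x1 vector
A34 = tf.ones([3, 4])   # 3x4 matrix
A43 = tf.ones([4, 3])   # 4x3 matrix

# Element-wise multiplication: shapes match because each tensor meets itself
B0 = tf.multiply(A0, A0)
B31 = tf.multiply(A31, A31)
B34 = tf.multiply(A34, A34)

# Matrix multiplication: (4x3) times (3x4) gives 4x4, each entry summing 3 ones
C44 = tf.matmul(A43, A34)

# (4x3) times (4x3) fails: 3 columns do not match 4 rows
try:
    tf.matmul(A43, A43)
    incompatible = False
except (tf.errors.InvalidArgumentError, ValueError):
    incompatible = True

print(C44.numpy())
print(incompatible)
```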

**Summing over tensor dimensions** Finally, we end this lesson by discussing summation over tensors, which is performed using the `reduce_sum` operation. This can be used to sum over all dimensions of a tensor or just one. Let's see how this works in practice. We will import `ones` and `reduce_sum` from `tensorflow`. We will then define a 2 by 3 by 4 tensor that consists of ones.
![image-17.png](attachment:image-17.png)

If we sum over all elements of A, we get 24, since the tensor contains 24 elements, all of which are 1. If we sum over dimension 0, we get a 3 by 4 matrix of 2s. If we sum over 1, we get a 2 by 4 matrix of 3s. And if we sum over 2, we get a 2 by 3 matrix of 4s. In each case, we reduce the size of the tensor by summing over one of its dimensions.
![image-18.png](attachment:image-18.png)
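The four sums described above, as a minimal sketch (the names `B`, `B0`, `B1`, and `B2` are assumptions):

```python
from tensorflow import ones, reduce_sum

# A 2x3x4 tensor of ones
A = ones([2, 3, 4])

# Sum over every element
B = reduce_sum(A)

# Sum over a single dimension; that dimension disappears from the shape
B0 = reduce_sum(A, 0)   # 3x4 matrix of 2s
B1 = reduce_sum(A, 1)   # 2x4 matrix of 3s
B2 = reduce_sum(A, 2)   # 2x3 matrix of 4s

print(B.numpy())
print(B0.shape, B1.shape, B2.shape)
```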

### Advanced Operations

In this section, we will cover a selection of advanced operations. Some will be used frequently in later chapters. Others will help you to gain intuition about complex machine learning routines.

**Overview of advanced operations** We have already covered basic operations in TensorFlow, including `add`, `multiply`, `matmul`, and `reduce_sum`. In this lesson, we will move on to more advanced operations, including `gradient`, `reshape`, and `random`.
![image-19.png](attachment:image-19.png)

The gradient operation, which we'll use in conjunction with gradient tape, computes the slope of a function at a point. The reshape operation changes the shape of a tensor. And the random module generates a tensor out of randomly-drawn values.
![image-20.png](attachment:image-20.png)

**Finding the optimum** In many machine learning problems, you will need to find an optimum, that is, a minimum or maximum. You may, for instance, want to find the model parameters that minimize the loss function or maximize the objective function. Fortunately, we can do this by using the gradient operation, which tells us the slope of a function at a point. We start this process by passing points to the gradient operation until we find one where the gradient is zero. Next, we check if the gradient is increasing or decreasing at that point. If it is increasing, we have a minimum. Otherwise, we have a maximum.
![image-21.png](attachment:image-21.png)

**What is a gradient?** A gradient is a differential operator applied to a scalar-valued function of several variables; it yields a vector whose components are the partial derivatives of the function with respect to each of its variables. The symbol for the gradient is ∇. In simpler terms, it is the change in the value of a quantity (such as temperature, pressure, or concentration) with a change in a given variable, especially per unit on a linear scale. On a two-dimensional plot, the gradient is the slope at a given point. The term is more typically used in three-dimensional space or higher.

**Calculating the gradient** The plot shows the function y equals x. Notice that **the gradient, that is, the slope at a given point, is constant**. If we increase x by 1 unit, y also increases by 1 unit.
![image-22.png](attachment:image-22.png)

This is not true if we instead consider the function y equals x squared. When x is less than 0, y decreases when x increases. When x is greater than 0, y increases when x increases. Thus, the gradient is initially negative, but becomes positive for x larger than 0. This means that x equals 0 minimizes y.
![image-23.png](attachment:image-23.png)

**Gradients in TensorFlow** Let's use TensorFlow to compute the gradient. We will start by defining a variable, `x`, which we initialize to -1.0. We will then define `y` to be `x` squared within an instance of `GradientTape`. Note that we call the `watch` method on the tape instance and pass it the variable `x`. This will allow us to compute the rate of change of `y` with respect to `x`. Next, we compute the gradient of `y` with respect to `x` using the `gradient` method of the tape instance. Note that `y` is the first argument and `x` is the second. As written, the operation computes the slope of `y` at a point. Running the code and printing, we find that the slope is -2 at x equals -1, which means that y is initially decreasing in x, as we saw in the plot of y equals x squared. Much of the differentiation you do in deep learning models will be handled by high-level APIs; however, `GradientTape` remains an invaluable tool for building advanced and custom models.
![image-24.png](attachment:image-24.png)
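A sketch of the gradient computation described above. Wrapping it in a helper function (`slope_of_square`, a name of my own) and evaluating it at 0 and 1 as well as -1 is an addition to show the sign change around the minimum of y equals x squared.

```python
import tensorflow as tf

def slope_of_square(x0):
    """Return dy/dx for y = x^2, evaluated at x0."""
    x = tf.Variable(x0)
    with tf.GradientTape() as tape:
        tape.watch(x)
        y = tf.multiply(x, x)
    # y is the first argument, x is the second
    return float(tape.gradient(y, x).numpy())

# Negative slope left of 0, zero at 0, positive slope right of 0
slopes = [slope_of_square(v) for v in (-1.0, 0.0, 1.0)]
print(slopes)
```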

**Images as tensors** We'll next consider an operation that is particularly useful for image classification problems: reshaping. The grayscale image shown has a natural representation as a matrix with values between 0 and 255. While some algorithms exploit this shape, others require you to reshape matrices into vectors before using them as inputs, as shown in the diagram.
![image-25.png](attachment:image-25.png)

**How to reshape a grayscale image** Now that you've seen how images can be represented as tensors, let's generate some input images and reshape them. We will create a random grayscale image by drawing numbers from the set of integers between 0 and 255. We will use these to populate a 2 by 2 matrix. We can then reshape this into a 4 by 1 vector, as shown in the diagram.
![image-26.png](attachment:image-26.png)

**How to reshape a color image** For color images, we will generate 3 such matrices to form a 2 by 2 by 3 tensor. We could then reshape the image into a 4 by 3 tensor, as shown in the diagram.
![image-27.png](attachment:image-27.png)
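Both reshapes can be sketched with `tf.random.uniform` and `tf.reshape`. Because `maxval` is exclusive for integer draws, this sketch assumes `maxval=256` so that 255 itself can occur; the course code may use a different bound.

```python
import tensorflow as tf

# Random 2x2 grayscale image with integer values from 0 to 255
gray = tf.random.uniform([2, 2], maxval=256, dtype=tf.int32)

# Flatten the matrix into a 4x1 vector
gray_vector = tf.reshape(gray, [2 * 2, 1])

# Random 2x2 color image with 3 channels
color = tf.random.uniform([2, 2, 3], maxval=256, dtype=tf.int32)

# Flatten the two spatial dimensions, keeping the 3 color channels
color_matrix = tf.reshape(color, [2 * 2, 3])

print(gray_vector.shape, color_matrix.shape)
```

Reshaping only rearranges the existing values; no pixel is added, dropped, or modified.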

# Linear Models

Learning Objectives:
- how to build, solve, and make predictions with models in TensorFlow 2
- linear regression model – predict housing prices
- load and manipulate data, construct loss functions, perform minimization, make predictions, & reduce resource use with batch training

### Input Data

In the previous section, we learned how to perform core TensorFlow operations. In this section, we will work towards training a linear model with TensorFlow.

**Using data in TensorFlow** So far, we've only generated data using functions like `ones` and `random.uniform`; however, when we train a machine learning model, we will want to import data from an external source. This may include numeric, image, or text data. Beyond simply importing the data, numeric data will need to be assigned a type, and text and image data will need to be converted to a usable format.
![image.png](attachment:image.png)

External datasets can be imported using TensorFlow. While this is useful for complex data pipelines, it will be unnecessarily complicated for what we do in this chapter. For that reason, we will use simpler options to import data. We will then convert the data into a NumPy array, which we can use without further modification in TensorFlow.

Let's start by importing `numpy` under the alias `np` and `pandas` under the alias `pd`. We will then read housing transaction data from `kc_housing.csv` using the pandas function `read_csv` and assign it to a DataFrame called `housing`. When you are ready to train a model, you will want to convert the data into a NumPy array by passing the pandas DataFrame, `housing`, to `np.array`. We will focus on loading data from CSV files in this chapter, but you can also use pandas to load data from other formats, such as JSON, HTML, and Excel.
![image-2.png](attachment:image-2.png)
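A minimal sketch of the loading step. Since `kc_housing.csv` is not bundled with these notes, the example reads a small in-memory stand-in; its three rows and two columns are made up for illustration only.

```python
import io

import numpy as np
import pandas as pd

# Stand-in for kc_housing.csv (hypothetical rows, illustration only)
csv_text = "price,waterfront\n221900.0,0\n538000.0,0\n1225000.0,1\n"

# With the real file this would be pd.read_csv('kc_housing.csv')
housing = pd.read_csv(io.StringIO(csv_text))

# Convert the DataFrame to a NumPy array for use in TensorFlow
housing_array = np.array(housing)

print(housing_array.shape)
```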

**Parameters of read_csv()** Let's take a closer look at the `read_csv` function of pandas, since you will use it frequently to import data. In the code block, we filled in the only required parameter, which was `filepath_or_buffer`. Note that you could have instead supplied a URL, rather than a filepath, to load your data. Another important parameter is `sep`, which is the delimiter that separates columns in your dataset. By default, this will be a comma; however, other common choices are semicolons and tabs. Note that if you do use whitespace as a delimiter, you will need to set the `delim_whitespace` parameter to `True`. Finally, if you are working with datasets that contain non-ASCII characters, you can specify the appropriate choice of `encoding`, so that your characters are correctly parsed.
![image-3.png](attachment:image-3.png)
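As a small illustration of the `sep` parameter, the sketch below reads semicolon-delimited text. The data is an in-memory, made-up sample rather than a file from the course.

```python
import io

import pandas as pd

# Semicolon-delimited sample (hypothetical rows, illustration only)
semicolon_text = "price;waterfront\n221900.0;0\n1225000.0;1\n"

# sep tells read_csv which character separates the columns
housing = pd.read_csv(io.StringIO(semicolon_text), sep=";")

columns = list(housing.columns)
print(columns)
```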

**Using mixed type datasets** Finally, we will end this lesson by talking about how to transform imported data for use in TensorFlow. We will use housing data from King County, Washington as an example. Notice how the dataset contains columns with different types. One column contains data on house prices in a floating point format. Another column is a boolean variable, which can either be true, 1, or false, 0. In this case, a 1 indicates that a property is located on the waterfront.
![image-4.png](attachment:image-4.png)

**Setting the data type** Let's say we want to perform TensorFlow operations that require `price` to be a 32-bit floating point number and `waterfront` to be a boolean. We can do this in two ways. The first approach uses the `array` function from NumPy. We select the relevant column in the DataFrame, provide it as the first argument to `array`, and then provide the datatype as the second argument.
![image-5.png](attachment:image-5.png)

The second approach uses the `cast` operation from TensorFlow. Again, we supply the data first and the data type second. While either `tf.cast` or `np.array` will work, `waterfront` will be a `tf.Tensor` under the former option and a NumPy array under the latter.
![image-6.png](attachment:image-6.png)
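Both approaches side by side, as a sketch. The small DataFrame is a made-up stand-in for the housing data, and calling `.to_numpy()` before `tf.cast` is a conservative choice of mine rather than something the course requires.

```python
import numpy as np
import pandas as pd
import tensorflow as tf

# Stand-in for the housing DataFrame (hypothetical rows)
housing = pd.DataFrame({
    "price": [221900.0, 538000.0, 1225000.0],
    "waterfront": [0, 0, 1],
})

# Approach 1: NumPy, data first and dtype second
price = np.array(housing["price"], np.float32)
waterfront = np.array(housing["waterfront"], np.bool_)

# Approach 2: TensorFlow, again data first and dtype second
price_tf = tf.cast(housing["price"].to_numpy(), tf.float32)
waterfront_tf = tf.cast(housing["waterfront"].to_numpy(), tf.bool)

print(type(waterfront), waterfront.dtype)
print(type(waterfront_tf), waterfront_tf.dtype)
```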

### Loss Function

We now know how to import datasets and perform TensorFlow operations on them, but how can we use this knowledge to train models? In this section, we'll move closer to that goal by taking a look at loss functions.

**Introduction to loss functions** Loss functions play a fundamental role in machine learning. We need loss functions to train models because they tell us how well our model explains the data. Without this feedback, it is unclear how to adjust model parameters during the training process. A high loss value indicates that the model fit is poor. Typically, we train the model by selecting parameter values that minimize the loss function. In some cases, we may want to maximize a function instead. Fortunately, we can always place a minus sign before the function we want to maximize and instead minimize it. For this reason, we will always talk about loss functions and minimization.

**Common loss functions in TensorFlow** While it is possible to define a custom loss function, this is typically not necessary, since many common options are available in TensorFlow. Typical choices for training linear models include the mean squared error loss, the mean absolute error loss, and the Huber loss. All of these are accessible from `tf.keras.losses`.

**Why do we care about loss functions?** Consider the MSE, MAE, and Huber loss for error values between -2 and 2. Note that the MSE strongly penalizes outliers and has high sensitivity near the minimum. The MAE scales linearly with the size of the error and has low sensitivity near the minimum. And the Huber loss is similar to the MSE near zero and similar to the MAE away from zero. For greater sensitivity near the minimum, you will want to use the MSE or Huber loss. To minimize the impact of outliers, you will want to use the MAE or Huber loss.
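To make the comparison concrete, this sketch evaluates the three losses for one small error (0.1) and one large error (2.0). It assumes the class-based losses in `tf.keras.losses` with the default Huber threshold `delta=1.0`; the helper `losses_for_error` is a hypothetical name.

```python
import tensorflow as tf

mse = tf.keras.losses.MeanSquaredError()
mae = tf.keras.losses.MeanAbsoluteError()
huber = tf.keras.losses.Huber(delta=1.0)

def losses_for_error(error):
    """Evaluate each loss for a single prediction that misses by `error`."""
    target = tf.constant([[0.0]])
    prediction = tf.constant([[error]])
    return {
        "mse": float(mse(target, prediction)),
        "mae": float(mae(target, prediction)),
        "huber": float(huber(target, prediction)),
    }

small = losses_for_error(0.1)   # near the minimum
large = losses_for_error(2.0)   # an outlier-sized error

print(small)
print(large)
```

For the large error, the MSE (4) grows fastest, while the Huber loss (1.5) grows linearly like the MAE (2). For the small error, the Huber loss is quadratic like the MSE, at half its value.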

**Defining a loss function** Let's say we decide to use the MSE loss. We'll need two tensors to compute it: the actual values or "targets" tensor and the predicted values or "predictions." Passing them to the MSE operation will return a single number: the average of the squared differences between the actual and predicted values.

**Defining a loss function** In many cases, the training process will require us to supply a function that accepts our model's variables and data and returns a loss. Here, we'll first define a model, `linear_regression`, which takes the intercept, slope, and features as arguments and returns the model's predictions. We'll next define a loss function called `loss_function` that accepts the slope and intercept of a linear model (the variables) and the input data (the targets and the features). It then makes a prediction and computes and returns the associated MSE loss. Note that we've defined both functions to use default argument values for features and targets. We will do this whenever we train on the full sample to simplify the code.

**Defining the loss function** Notice that we've nested TensorFlow's MSE loss function within a function that first uses the model to make predictions and then uses those predictions as an input to the MSE loss function. We can then evaluate this function for a given set of parameter values and input data. Here, we've evaluated the loss function using a test dataset and it returned a loss value of 10.77. If we had omitted the data arguments, `test_targets` and `test_features`, the loss function would have instead used the default targets and features arguments we set to evaluate model performance.
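A sketch of the two functions described above. The tiny `features`, `targets`, `test_features`, and `test_targets` tensors are made-up stand-ins for the housing data, so the loss values here will not match the course's figure.

```python
import tensorflow as tf

# Hypothetical full-sample data, used as default arguments
features = tf.constant([1.0, 2.0, 3.0])
targets = tf.constant([2.0, 4.0, 6.0])

def linear_regression(intercept, slope, features=features):
    """Return the linear model's predictions."""
    return intercept + features * slope

def loss_function(intercept, slope, targets=targets, features=features):
    """Predict with the model, then return the MSE of those predictions."""
    predictions = linear_regression(intercept, slope, features)
    return tf.keras.losses.mean_squared_error(targets, predictions)

# Evaluate on the default (full-sample) data
perfect_fit = float(loss_function(0.0, 2.0))
poor_fit = float(loss_function(0.0, 1.0))

# Evaluate on a separate, hypothetical test set
test_features = tf.constant([4.0])
test_targets = tf.constant([8.0])
test_loss = float(loss_function(0.0, 1.0, test_targets, test_features))

print(perfect_fit, poor_fit, test_loss)
```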

### Linear Regression
### Batch Training

# Neural Networks

- build, train, and make predictions with neural networks
- define dense layers, apply activation functions, select an optimizer, and apply regularization to reduce overfitting
- take advantage of TensorFlow's flexibility by using both low-level linear algebra and high-level Keras API operations to define and train models

### Dense Layer
### Activation Functions
### Optimizers
### Training a Network in TensorFlow

# High Level APIs

- use high-level APIs in TensorFlow 2 to train a sign language letter classifier
- use both the sequential and functional Keras APIs to train, validate, make predictions with, and evaluate models
- use the Estimators API to streamline the model definition and training process, and to avoid errors

### Defining Neural Networks with Keras
### Training & Validating with Keras
### Training Models with the Estimators API

# Review

In section 1, you learned low-level, basic, and advanced operations in TensorFlow. You learned how to define and manipulate variables and constants. You also learned the graph-based computational model that underlies TensorFlow and how it can be used to compute gradients and solve arbitrary optimization problems. 

In section 2, you learned how to load and transform data for use in your TensorFlow projects. You also saw how to use predefined and custom loss functions. We ended with a discussion of how to train models, and when and how to divide the training into batches.

In section 3, we moved on to training neural networks. You learned how to define neural network architecture in TensorFlow, both using low-level linear algebra operations and high-level Keras API operations. We talked about how to select activation functions and optimizers, and, ultimately, how to train models. 

In section 4, you learned how to make full use of the Keras API to train models in TensorFlow. We discussed the training and validation process and also introduced the high-level Estimators API, which can be used to streamline the production process.

**Extensions.** In addition to what we covered, there are also two important TensorFlow extensions that did not fit into the course, but may be worthwhile to explore on your own. The first is **TensorFlow Hub**, which allows users to import pretrained models that can then be used to perform transfer learning. This will be particularly useful when you want to train an image classifier with a small number of images, but want to make use of a feature-extractor trained on a much larger set of different images. **TensorFlow Probability** is another exciting extension, which is also currently available as a standalone module. One benefit of using TensorFlow Probability is that it provides additional statistical distributions that can be used for random number generation. It also enables you to incorporate trainable statistical distributions into your models. Finally, TensorFlow Probability provides an extended set of optimizers that are commonly used in statistical research. This gives you additional tools beyond what the core TensorFlow module provides.

**TensorFlow 2.0.** Finally, I will say a few words about the difference between TensorFlow 2 and TensorFlow 1. If you primarily develop in TensorFlow 1, you may have noticed that we did not need to define static graphs or enable eager execution. This is done automatically in TensorFlow 2. Furthermore, TensorFlow 2 has substantially tighter integration with Keras. In fact, the core functionality of the TensorFlow 1 `train` module is handled by `tf.keras` operations in TensorFlow 2. In addition to the centrality of Keras, the Estimators API also plays a more important role in TensorFlow 2. Finally, TensorFlow 2 still allows you to use static graphs, but they are available through the `tf.function` operation.

## Sources

Hull, Isaiah (2022). Introduction to TensorFlow in Python, DataCamp. Available from https://app.datacamp.com/learn/courses/natural-language-generation-in-python

![image.png](attachment:image.png)