# TensorFlow Fundamentals: Tensors and Basic Operations

## Learning Objectives
By the end of this notebook, you will understand:
- What TensorFlow is and why it's used
- How to create and manipulate tensors
- Basic tensor operations and broadcasting
- Data types and shapes in TensorFlow
- Converting between TensorFlow tensors and NumPy arrays

## What is TensorFlow?
TensorFlow is an open-source machine learning framework developed by Google. It's designed to make it easy to build and deploy machine learning models. The name "TensorFlow" comes from the way the library handles data - as tensors (multi-dimensional arrays) that flow through a computational graph.

In [None]:
# Import TensorFlow - the main library we'll be using throughout this course
import tensorflow as tf
import numpy as np  # NumPy for numerical operations and comparison

# Check TensorFlow version - important for compatibility
print(f"TensorFlow version: {tf.__version__}")

# List any GPUs TensorFlow can see - a GPU can speed up training significantly
print(f"GPUs detected: {tf.config.list_physical_devices('GPU')}")
print(f"Built with CUDA: {tf.test.is_built_with_cuda()}")

## Understanding Tensors

A **tensor** is a generalization of vectors and matrices to potentially higher dimensions. Think of it as:
- **Scalar (0-D tensor)**: A single number
- **Vector (1-D tensor)**: An array of numbers  
- **Matrix (2-D tensor)**: A 2D array of numbers
- **3-D tensor and higher**: Multi-dimensional arrays

In deep learning:
- A single image is typically a 3-D tensor (height, width, channels)
- A video is typically a 4-D tensor (time, height, width, channels)
- A batch of images is typically a 4-D tensor (batch_size, height, width, channels)

In [None]:
# Creating tensors - different ways to initialize tensor data

# 1. Scalar (0-D tensor) - just a single number
scalar = tf.constant(7)
print(f"Scalar: {scalar}")
print(f"Scalar shape: {scalar.shape}")
print(f"Scalar dimensions: {scalar.ndim}")
print()

# 2. Vector (1-D tensor) - a list of numbers
vector = tf.constant([10, 10])
print(f"Vector: {vector}")
print(f"Vector shape: {vector.shape}")  # shape tells us the dimensions
print(f"Vector dimensions: {vector.ndim}")  # ndim tells us how many dimensions
print()

# 3. Matrix (2-D tensor) - a 2D array of numbers
matrix = tf.constant([[10, 7],
                      [3, 2],
                      [8, 9]])
print(f"Matrix: {matrix}")
print(f"Matrix shape: {matrix.shape}")  # (3, 2) means 3 rows, 2 columns
print(f"Matrix dimensions: {matrix.ndim}")
print()

# 4. 3-D tensor - could represent an RGB image (height, width, channels)
tensor_3d = tf.constant([[[1, 2, 3],
                          [4, 5, 6]],
                         [[7, 8, 9],
                          [10, 11, 12]]])
print(f"3-D Tensor: {tensor_3d}")
print(f"3-D Tensor shape: {tensor_3d.shape}")  # (2, 2, 3)
print(f"3-D Tensor dimensions: {tensor_3d.ndim}")

## Creating Tensors with Different Methods

TensorFlow provides many ways to create tensors. Each method is useful in different scenarios:
- `tf.constant()`: Creates immutable tensors with fixed values
- `tf.Variable()`: Creates mutable tensors that can be updated (used for model parameters)
- `tf.zeros()`, `tf.ones()`: Create tensors filled with zeros or ones
- `tf.random.normal()`, `tf.random.uniform()`: Create tensors with random values drawn from a normal or uniform distribution

In [None]:
# Different ways to create tensors

# 1. tf.constant - creates immutable (unchangeable) tensors
constant_tensor = tf.constant([1, 2, 3])
print(f"Constant tensor: {constant_tensor}")

# 2. tf.Variable - creates mutable (changeable) tensors
# These are typically used for model parameters (weights, biases)
variable_tensor = tf.Variable([1, 2, 3])
print(f"Variable tensor: {variable_tensor}")
print(f"Variable is trainable: {variable_tensor.trainable}")  # Can this be updated during training?

# 3. Create tensors filled with zeros
zeros_tensor = tf.zeros(shape=(3, 4))  # 3 rows, 4 columns, all zeros
print(f"Zeros tensor:\n{zeros_tensor}")

# 4. Create tensors filled with ones
ones_tensor = tf.ones(shape=(2, 3, 4))  # 3D tensor filled with ones
print(f"Ones tensor shape: {ones_tensor.shape}")
print(f"Ones tensor:\n{ones_tensor}")

# 5. Create tensors with random values
# Random normal distribution (mean=0, std=1)
random_tensor = tf.random.normal(shape=(3, 2))
print(f"Random normal tensor:\n{random_tensor}")

# Random uniform distribution (values in [0, 1) by default)
random_uniform = tf.random.uniform(shape=(2, 3))
print(f"Random uniform tensor:\n{random_uniform}")
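
To make the difference between `tf.constant()` and `tf.Variable()` more concrete, here is a minimal sketch of in-place updates. The name `weights` and the values are illustrative only and are not used elsewhere in this notebook.

```python
import tensorflow as tf

# A Variable holds state that can be updated in place.
weights = tf.Variable([1.0, 2.0, 3.0])
weights.assign([10.0, 20.0, 30.0])    # overwrite every element
weights.assign_add([1.0, 1.0, 1.0])   # add to the current values in place
weights[0].assign(0.0)                # update a single element
print(f"Updated variable: {weights.numpy()}")

# A constant tensor has no assign method; operations return a new tensor instead.
constant = tf.constant([1.0, 2.0, 3.0])
print(f"Constant has assign(): {hasattr(constant, 'assign')}")
updated = constant + 1.0              # new tensor, `constant` itself is unchanged
print(f"New tensor from constant: {updated.numpy()}")
```

This is why model parameters are stored as variables: an optimizer needs to modify them repeatedly during training.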

## Tensor Data Types (dtypes)

The **data type** (dtype) determines what kind of values a tensor can store and how much memory each element uses:
- `float32`: 32-bit floating point numbers (most common for deep learning)
- `float64`: 64-bit floating point numbers (double precision)
- `int32`: 32-bit integers
- `bool`: Boolean values (True/False)

**Why dtype matters:**
- Neural networks typically use `float32` for a good balance of precision and memory
- Using the wrong dtype can cause errors or poor performance
- Operations between tensors of different dtypes generally raise an error, so convert explicitly with `tf.cast()` first

In [None]:
# Working with different data types

# tf.constant stores Python floats as float32 by default
float_tensor = tf.constant([1.7, 2.4, 3.6])
print(f"Float tensor dtype: {float_tensor.dtype}")

# Explicitly specify data type
float16_tensor = tf.constant([1.7, 2.4], dtype=tf.float16)  # Uses less memory
print(f"Float16 tensor dtype: {float16_tensor.dtype}")

# Integer tensor
int_tensor = tf.constant([1, 2, 3], dtype=tf.int32)
print(f"Integer tensor dtype: {int_tensor.dtype}")

# Boolean tensor
bool_tensor = tf.constant([True, False, True], dtype=tf.bool)
print(f"Boolean tensor: {bool_tensor}")
print(f"Boolean tensor dtype: {bool_tensor.dtype}")

# Converting between data types
float_tensor_converted = tf.cast(int_tensor, dtype=tf.float32)
print(f"Converted to float32: {float_tensor_converted}")
print(f"New dtype: {float_tensor_converted.dtype}")
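
The sketch below shows what happens when dtypes are mixed, and how much memory each float type uses per element. It assumes TensorFlow 2.x defaults (eager execution, with NumPy-style type promotion not enabled); the variable names are illustrative.

```python
import tensorflow as tf

ints = tf.constant([1, 2, 3])           # int32
floats = tf.constant([0.5, 1.5, 2.5])   # float32

# Under default settings, adding int32 and float32 tensors is rejected.
try:
    mixed = ints + floats
    print(f"Mixed dtypes were accepted: {mixed.dtype}")
except (tf.errors.InvalidArgumentError, TypeError) as err:
    print(f"Mixing dtypes failed with {type(err).__name__}")

# Casting one operand makes the dtypes match.
total = tf.cast(ints, tf.float32) + floats
print(f"After casting: {total.numpy()} ({total.dtype.name})")

# DType.size reports the number of bytes used per element.
bytes_per_element = {d.name: d.size for d in (tf.float16, tf.float32, tf.float64)}
print(f"Bytes per element: {bytes_per_element}")
```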

## Tensor Attributes and Information

Every tensor has important attributes that tell us about its structure:
- **Shape**: The dimensions of the tensor (e.g., (3, 4) for 3 rows, 4 columns)
- **Rank/Dimensions**: How many dimensions the tensor has
- **Dtype**: The data type of the elements
- **Size**: Total number of elements in the tensor

In [None]:
# Exploring tensor attributes
example_tensor = tf.random.normal(shape=(2, 3, 4, 5))

print(f"Shape: {example_tensor.shape}")  # Dimensions of the tensor
print(f"Rank (number of dimensions): {example_tensor.ndim}")
print(f"Data type: {example_tensor.dtype}")
print(f"Size (total elements): {tf.size(example_tensor)}")
print(f"Size as Python int: {tf.size(example_tensor).numpy()}")

# Get specific dimension sizes
print(f"First dimension size: {example_tensor.shape[0]}")
print(f"All dimension sizes: {example_tensor.shape.as_list()}")

# Shape can also be accessed as a tensor
shape_tensor = tf.shape(example_tensor)
print(f"Shape as tensor: {shape_tensor}")
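
The attributes are related by simple arithmetic: the rank is the length of the shape, and the size is the product of its entries. A small worked example in plain Python, reusing the shape `(2, 3, 4, 5)` from the cell above and assuming 4 bytes per `float32` element:

```python
import math

shape = (2, 3, 4, 5)              # same shape as example_tensor
rank = len(shape)                 # number of dimensions
size = math.prod(shape)           # 2 * 3 * 4 * 5 elements
memory_bytes = size * 4           # float32 uses 4 bytes per element

print(f"Rank: {rank}")
print(f"Size: {size}")
print(f"Approximate memory for float32 data: {memory_bytes} bytes")
```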

## Basic Tensor Operations

TensorFlow supports all the basic mathematical operations you'd expect:
- **Element-wise operations**: Applied to each element individually
- **Reduction operations**: Combine elements (like sum, mean)
- **Linear algebra operations**: Matrix multiplication, etc.

These operations are **vectorized**, meaning they're applied to entire tensors at once, which is much faster than using Python loops.

In [None]:
# Basic tensor operations

# Create some sample tensors
a = tf.constant([1, 2, 3, 4])
b = tf.constant([10, 20, 30, 40])

print(f"Tensor a: {a}")
print(f"Tensor b: {b}")
print()

# Element-wise operations (applied to each element)
print("Element-wise operations:")
print(f"a + b = {a + b}")  # Addition
print(f"a - b = {a - b}")  # Subtraction  
print(f"a * b = {a * b}")  # Multiplication
print(f"b / a = {b / a}")  # Division (true division, so the result is floating point)
print(f"a ** 2 = {a ** 2}")  # Power
print()

# You can also use TensorFlow functions explicitly
print("Using TensorFlow functions:")
print(f"tf.add(a, b) = {tf.add(a, b)}")
print(f"tf.multiply(a, b) = {tf.multiply(a, b)}")
print(f"tf.square(a) = {tf.square(a)}")
print(f"tf.sqrt(tf.cast(a, tf.float32)) = {tf.sqrt(tf.cast(a, tf.float32))}")
print()

# Reduction operations (combine all elements)
print("Reduction operations:")
print(f"Sum of a: {tf.reduce_sum(a)}")
print(f"Mean of a: {tf.reduce_mean(tf.cast(a, tf.float32))}")
print(f"Max of a: {tf.reduce_max(a)}")
print(f"Min of a: {tf.reduce_min(a)}")
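
Reductions can also run along a single axis instead of over the whole tensor. The sketch below uses an illustrative 2x3 tensor named `grid`, and also shows why the mean above was computed on a `float32` cast: `tf.reduce_mean` keeps the input dtype, so an integer input yields an integer result.

```python
import tensorflow as tf

grid = tf.constant([[1, 2, 3],
                    [4, 5, 6]])  # shape (2, 3)

column_sums = tf.reduce_sum(grid, axis=0)             # collapses rows -> shape (3,)
row_sums = tf.reduce_sum(grid, axis=1)                # collapses columns -> shape (2,)
row_max = tf.reduce_max(grid, axis=1, keepdims=True)  # keeps the reduced axis -> shape (2, 1)

print(f"Column sums: {column_sums.numpy()}")
print(f"Row sums: {row_sums.numpy()}")
print(f"Row max with keepdims:\n{row_max.numpy()}")

# Integer input gives an integer mean; cast first to get the fractional value.
values = tf.constant([1, 2, 3, 4])
int_mean = tf.reduce_mean(values)
float_mean = tf.reduce_mean(tf.cast(values, tf.float32))
print(f"Integer mean: {int_mean.numpy()}, float mean: {float_mean.numpy()}")
```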

## Matrix Operations

**Matrix multiplication** is fundamental to deep learning - it's how data flows through neural network layers.

Important distinction:
- `*` or `tf.multiply()`: Element-wise multiplication
- `@` or `tf.matmul()`: Matrix multiplication

For matrix multiplication, the inner dimensions must match: (a, b) × (b, c) = (a, c)

In [None]:
# Matrix operations

# Create 2D tensors (matrices)
matrix_a = tf.constant([[1, 2],
                        [3, 4],
                        [5, 6]])

matrix_b = tf.constant([[7, 8, 9],
                        [10, 11, 12]])

print(f"Matrix A shape: {matrix_a.shape}")  # (3, 2)
print(f"Matrix A:\n{matrix_a}")
print()
print(f"Matrix B shape: {matrix_b.shape}")  # (2, 3)
print(f"Matrix B:\n{matrix_b}")
print()

# Matrix multiplication: (3,2) × (2,3) = (3,3)
matrix_mult_result = tf.matmul(matrix_a, matrix_b)
print(f"Matrix multiplication result shape: {matrix_mult_result.shape}")
print(f"A @ B =\n{matrix_mult_result}")

# Alternative syntax using @ operator
matrix_mult_result2 = matrix_a @ matrix_b
print(f"Same result using @ operator:\n{matrix_mult_result2}")
print()

# Element-wise multiplication (different from matrix multiplication!)
# For this, shapes must be exactly the same or broadcastable
matrix_c = tf.constant([[1, 2],
                        [3, 4],
                        [5, 6]])

matrix_d = tf.constant([[2, 2],
                        [3, 3],
                        [4, 4]])

element_wise_mult = matrix_c * matrix_d
print(f"Element-wise multiplication:\n{element_wise_mult}")

## Tensor Indexing and Slicing

Tensors support NumPy-style basic indexing and slicing, so you can access specific elements or slices:
- Use square brackets `[]` to index
- Use colons `:` for slicing ranges
- Negative indices count from the end
- Multiple dimensions are separated by commas

In [None]:
# Tensor indexing and slicing

# Create a sample tensor
sample_tensor = tf.constant([[1, 2, 3, 4],
                            [5, 6, 7, 8],
                            [9, 10, 11, 12]])

print(f"Original tensor:\n{sample_tensor}")
print(f"Shape: {sample_tensor.shape}")  # (3, 4)
print()

# Indexing specific elements
print("Indexing specific elements:")
print(f"Element at [0, 0]: {sample_tensor[0, 0]}")  # First row, first column
print(f"Element at [1, 2]: {sample_tensor[1, 2]}")  # Second row, third column
print(f"Element at [-1, -1]: {sample_tensor[-1, -1]}")  # Last row, last column
print()

# Slicing rows and columns
print("Slicing:")
print(f"First row: {sample_tensor[0, :]}")  # All columns of first row
print(f"First column: {sample_tensor[:, 0]}")  # All rows of first column
print(f"Last row: {sample_tensor[-1, :]}")  # All columns of last row
print(f"Middle columns: {sample_tensor[:, 1:3]}")  # All rows, columns 1-2
print()

# More complex slicing
print("Complex slicing:")
print(f"Top-left 2x2:\n{sample_tensor[:2, :2]}")
print(f"Every other element in first row: {sample_tensor[0, ::2]}")
print(f"Reverse the tensor:\n{sample_tensor[::-1, ::-1]}")
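
Slices cover contiguous or evenly strided ranges. For picking arbitrary rows or columns, `tf.gather` is one option, and `tf.newaxis` adds a dimension of size 1. The sketch below rebuilds the same 3x4 tensor under the illustrative name `sample`.

```python
import tensorflow as tf

sample = tf.constant([[1, 2, 3, 4],
                      [5, 6, 7, 8],
                      [9, 10, 11, 12]])

# Pick rows 0 and 2 (non-contiguous) along the first axis.
picked_rows = tf.gather(sample, [0, 2])
print(f"Rows 0 and 2:\n{picked_rows.numpy()}")

# Pick columns 3 and 0, in that order, along the second axis.
picked_cols = tf.gather(sample, [3, 0], axis=1)
print(f"Columns 3 and 0:\n{picked_cols.numpy()}")

# Add a trailing dimension of size 1, e.g. a single channel.
with_channel = sample[..., tf.newaxis]
print(f"Shape after adding an axis: {with_channel.shape}")
```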

## Broadcasting

**Broadcasting** allows TensorFlow to perform operations on tensors with different shapes by automatically expanding the smaller tensor to match the larger one.

Rules for broadcasting:
1. Start from the trailing (rightmost) dimensions
2. Dimensions are compatible if they're equal, or one of them is 1
3. Missing dimensions are assumed to be 1

This is very useful in deep learning for operations like adding bias terms to layers.

In [None]:
# Broadcasting examples

# Example 1: Adding a scalar to a tensor
tensor_2d = tf.constant([[1, 2, 3],
                        [4, 5, 6]])
scalar = tf.constant(10)

print(f"2D tensor shape: {tensor_2d.shape}")  # (2, 3)
print(f"Scalar shape: {scalar.shape}")  # () - empty shape means scalar
print(f"Broadcasting result:\n{tensor_2d + scalar}")
print()

# Example 2: Adding tensors with compatible shapes
tensor_a = tf.constant([[1, 2, 3],
                       [4, 5, 6]])  # Shape: (2, 3)

tensor_b = tf.constant([10, 20, 30])  # Shape: (3,) - this will broadcast to (1, 3), then (2, 3)

print(f"Tensor A shape: {tensor_a.shape}")
print(f"Tensor B shape: {tensor_b.shape}")
print(f"Broadcasting result:\n{tensor_a + tensor_b}")
print()

# Example 3: Broadcasting with different dimensions
tensor_c = tf.constant([[[1, 2],
                        [3, 4]]])  # Shape: (1, 2, 2)

tensor_d = tf.constant([[10],
                       [20]])     # Shape: (2, 1)

print(f"Tensor C shape: {tensor_c.shape}")
print(f"Tensor D shape: {tensor_d.shape}")
print(f"Broadcasting result shape: {(tensor_c + tensor_d).shape}")  # Results in (1, 2, 2)
print(f"Broadcasting result:\n{tensor_c + tensor_d}")

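# Example 4: Shapes (1, 3) and (2, 1) are compatible and broadcast to (2, 3)
tensor_e = tf.constant([[1, 2, 3]])  # Shape: (1, 3)
tensor_f = tf.constant([[1], [2]])   # Shape: (2, 1)
print(f"Broadcasting result shape: {(tensor_e + tensor_f).shape}")  # (2, 3)

# Example of incompatible shapes (uncommenting the last line raises an error)
# tensor_g = tf.constant([1, 2])  # Shape: (2,)
# result = tensor_a + tensor_g    # (2, 3) vs (2,): trailing dimensions 3 and 2 differ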

print("\nBroadcasting makes many operations much more convenient!")
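
The three broadcasting rules can be written out as a short shape calculation. `broadcast_shape` is a hypothetical helper for illustration, not a TensorFlow API; it works on plain tuples so the rules can be tried without creating tensors.

```python
from itertools import zip_longest

def broadcast_shape(shape_a, shape_b):
    """Return the broadcast result shape, or raise ValueError if incompatible."""
    result = []
    # Rule 1: compare from the trailing dimension.
    # Rule 3: a missing dimension is treated as 1 (fillvalue=1).
    pairs = zip_longest(reversed(shape_a), reversed(shape_b), fillvalue=1)
    for dim_a, dim_b in pairs:
        # Rule 2: dimensions must be equal, or one of them must be 1.
        if dim_a == dim_b or dim_b == 1:
            result.append(dim_a)
        elif dim_a == 1:
            result.append(dim_b)
        else:
            raise ValueError(f"incompatible shapes: {shape_a} and {shape_b}")
    return tuple(reversed(result))

print(broadcast_shape((2, 3), ()))         # tensor + scalar
print(broadcast_shape((2, 3), (3,)))       # matrix + row vector
print(broadcast_shape((1, 2, 2), (2, 1)))  # different ranks
print(broadcast_shape((1, 3), (2, 1)))     # both operands are expanded

try:
    broadcast_shape((2, 3), (2,))          # trailing dimensions 3 and 2 differ
except ValueError as err:
    print(f"Error: {err}")
```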

## Converting Between TensorFlow and NumPy

TensorFlow tensors and NumPy arrays are closely related:
- Use `.numpy()` method to convert TensorFlow tensor to NumPy array
- Use `tf.constant()` or `tf.convert_to_tensor()` to convert NumPy array to TensorFlow tensor
- TensorFlow operations accept NumPy arrays directly and convert them to tensors automatically

In [None]:
# Converting between TensorFlow tensors and NumPy arrays

# Create a TensorFlow tensor
tf_tensor = tf.constant([1, 2, 3, 4, 5])
print(f"TensorFlow tensor: {tf_tensor}")
print(f"Type: {type(tf_tensor)}")
print()

# Convert TensorFlow tensor to NumPy array
numpy_array = tf_tensor.numpy()
print(f"NumPy array: {numpy_array}")
print(f"Type: {type(numpy_array)}")
print()

# Create a NumPy array
numpy_original = np.array([10, 20, 30, 40, 50])
print(f"Original NumPy array: {numpy_original}")
print(f"Type: {type(numpy_original)}")
print()

# Convert NumPy array to TensorFlow tensor
tf_from_numpy = tf.constant(numpy_original)
print(f"TensorFlow from NumPy: {tf_from_numpy}")
print(f"Type: {type(tf_from_numpy)}")

# Alternative conversion method
tf_converted = tf.convert_to_tensor(numpy_original)
print(f"TensorFlow converted: {tf_converted}")
print()

# TensorFlow operations work with NumPy arrays too!
numpy_array1 = np.array([1, 2, 3])
numpy_array2 = np.array([4, 5, 6])

# TensorFlow automatically converts NumPy arrays
tf_result = tf.add(numpy_array1, numpy_array2)
print(f"TensorFlow operation on NumPy arrays: {tf_result}")
print(f"Result type: {type(tf_result)}")

# Variables can be converted with .numpy() as well
tf_variable = tf.Variable([1, 2, 3])
variable_as_numpy = tf_variable.numpy()
print(f"Variable as NumPy: {variable_as_numpy}")
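
One detail to watch when converting is the dtype. NumPy creates `float64` arrays from Python floats by default, while `tf.constant` creates `float32` tensors, and conversion keeps the NumPy dtype. A minimal sketch with illustrative names:

```python
import numpy as np
import tensorflow as tf

np_floats = np.array([1.5, 2.5, 3.5])         # NumPy default: float64
from_numpy = tf.convert_to_tensor(np_floats)  # keeps float64
from_list = tf.constant([1.5, 2.5, 3.5])      # TensorFlow default: float32

print(f"From NumPy array: {from_numpy.dtype.name}")
print(f"From Python list: {from_list.dtype.name}")

# Cast explicitly so both tensors can be combined in one operation.
as_float32 = tf.cast(from_numpy, tf.float32)
combined = as_float32 + from_list
print(f"Combined: {combined.numpy()} ({combined.dtype.name})")
```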

## Practical Exercise: Image Data Simulation

Let's apply what we've learned to a practical example. We'll simulate working with image data, which is very common in deep learning.

In [None]:
# Practical exercise: Working with image-like data

# Simulate a batch of RGB images
# Shape: (batch_size, height, width, channels)
# Let's say we have 32 images, each 28x28 pixels with 3 color channels (RGB)
batch_size = 32
height = 28
width = 28
channels = 3

# Create random image data (pixel values between 0 and 255)
image_batch = tf.random.uniform(
    shape=(batch_size, height, width, channels),
    minval=0,
    maxval=255,
    dtype=tf.float32
)

print(f"Image batch shape: {image_batch.shape}")
print(f"Data type: {image_batch.dtype}")
print(f"Min value: {tf.reduce_min(image_batch)}")
print(f"Max value: {tf.reduce_max(image_batch)}")
print()

# Common preprocessing: Normalize pixel values to [0, 1] range
normalized_images = image_batch / 255.0
print("After normalization:")
print(f"Min value: {tf.reduce_min(normalized_images)}")
print(f"Max value: {tf.reduce_max(normalized_images)}")
print()

# Get a single image from the batch
single_image = image_batch[0]  # First image in the batch
print(f"Single image shape: {single_image.shape}")  # (28, 28, 3)

# Get all red channels from all images
red_channels = image_batch[:, :, :, 0]  # All images, all pixels, red channel only
print(f"Red channels shape: {red_channels.shape}")  # (32, 28, 28)

# Calculate mean pixel value across the batch
mean_pixel_value = tf.reduce_mean(normalized_images)
print(f"Mean pixel value across batch: {mean_pixel_value}")

# Calculate mean for each channel
mean_per_channel = tf.reduce_mean(normalized_images, axis=[0, 1, 2])  # Average over batch, height, width
print(f"Mean per channel (R, G, B): {mean_per_channel}")
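
The per-channel mean combines naturally with broadcasting. As a hedged extension of the exercise, the sketch below standardizes each channel to roughly zero mean and unit standard deviation. It generates its own random batch (named `images` here), so the printed values differ from run to run.

```python
import tensorflow as tf

# Random stand-in for a normalized image batch: (batch, height, width, channels)
images = tf.random.uniform(shape=(32, 28, 28, 3), minval=0.0, maxval=1.0)

# Statistics over batch, height, and width leave one value per channel: shape (3,)
channel_mean = tf.reduce_mean(images, axis=[0, 1, 2])
channel_std = tf.math.reduce_std(images, axis=[0, 1, 2])

# Shapes (32, 28, 28, 3) and (3,) broadcast along the trailing channel dimension.
standardized = (images - channel_mean) / channel_std

print(f"Standardized shape: {standardized.shape}")
print(f"Mean per channel: {tf.reduce_mean(standardized, axis=[0, 1, 2]).numpy()}")
print(f"Std per channel: {tf.math.reduce_std(standardized, axis=[0, 1, 2]).numpy()}")
```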

## Summary and Key Takeaways

In this notebook, you learned:

1. **What tensors are**: Multi-dimensional arrays that are the fundamental data structure in TensorFlow
2. **Creating tensors**: Using `tf.constant()`, `tf.Variable()`, `tf.zeros()`, `tf.ones()`, and random functions
3. **Tensor attributes**: Shape, dtype, rank, and size
4. **Basic operations**: Element-wise operations, matrix multiplication, and reductions
5. **Indexing and slicing**: Accessing specific elements and ranges of tensors
6. **Broadcasting**: How TensorFlow handles operations between different shaped tensors
7. **NumPy integration**: Converting between TensorFlow tensors and NumPy arrays

## Next Steps

Now that you understand tensors and basic operations, you're ready to learn about:
- Automatic differentiation (AutoDiff) with GradientTape
- Building neural network layers
- The Keras API for model building

Practice these concepts by experimenting with different tensor shapes and operations!