# PyTorch Tensor Mastery Practice

This notebook contains 100 progressive questions to master PyTorch tensors. Work through them in order, as they build on each other.

## How to Use This Practice Guide

1. Read each question carefully
2. Implement the solution in the code cell below each question
3. Test your implementation with different inputs
4. Check edge cases and error handling
5. Compare your approach with PyTorch's built-in functions when applicable

## Learning Objectives

By completing these exercises, you will master:

- Tensor creation and initialization
- Shape manipulation and broadcasting
- Mathematical operations and reductions
- Device management (CPU/GPU)
- Memory management and efficiency
- Advanced indexing and slicing
- Real-world deep learning tensor patterns


In [None]:
import torch
import numpy as np

In [None]:
# ========================================
# SECTION 1: BASIC TENSOR CREATION (1-15)
# ========================================

# 1. Create a 2x3 tensor filled with zeros using torch.zeros()
# ✅ CORRECT: Good implementation with explicit dtype and requires_grad
# Note: requires_grad=False is default, so it's optional
x = torch.zeros(2, 3, dtype=torch.int16, requires_grad=False)
print(f"Q1 - Shape: {x.shape}, Dtype: {x.dtype}")
print(x)

# 2. Create a 4x4 identity matrix using torch.eye()
# ✅ CORRECT: Perfect implementation
identity = torch.eye(4)
print(f"\nQ2 - Identity matrix shape: {identity.shape}")
print(identity)

# 3. Create a tensor from the Python list [[1, 2, 3], [4, 5, 6]] and print its shape
# ✅ CORRECT: Perfect implementation
x = torch.tensor([[1, 2, 3], [4, 5, 6]])
print(f"\nQ3 - Shape: {x.shape}")
print(x)

# 4. Create a random tensor of shape (3, 5) with values between 0 and 1
# ✅ CORRECT: torch.rand() creates values between 0 and 1
random_tensor = torch.rand(3, 5)
print(
    f"\nQ4 - Shape: {random_tensor.shape}, Min: {random_tensor.min():.3f}, Max: {random_tensor.max():.3f}"
)

# 5. Create a tensor of shape (2, 4) filled with the value 7.5
# ✅ CORRECT: Good approach using torch.ones() * 7.5
# Alternative: torch.full((2, 4), 7.5)
filled_tensor = torch.ones(2, 4) * 7.5
print(f"\nQ5 - Shape: {filled_tensor.shape}, Unique values: {filled_tensor.unique()}")
# Alternative approach:
filled_alt = torch.full((2, 4), 7.5)
print(f"Alternative approach: {filled_alt.unique()}")

# 6. Convert the NumPy array np.array([1.1, 2.2, 3.3]) to a PyTorch tensor
# ❌ INCOMPLETE: You didn't complete this question separately
np_array = np.array([1.1, 2.2, 3.3])
torch_from_numpy = torch.from_numpy(np_array)
print(f"\nQ6 - Original NumPy: {np_array}")
print(f"Q6 - Converted tensor: {torch_from_numpy}")
print(
    f"Q6 - Shares memory: {torch_from_numpy.data_ptr() == np_array.__array_interface__['data'][0]}"
)

# 7. Create a tensor like an existing tensor but filled with ones (use torch.ones_like)
# ✅ CORRECT: Good implementation, but you combined it with Q6
# Let's separate them for clarity
reference_tensor = torch.from_numpy(np_array)
ones_like_tensor = torch.ones_like(reference_tensor)
print(
    f"\nQ7 - Reference shape: {reference_tensor.shape}, dtype: {reference_tensor.dtype}"
)
print(f"Q7 - Ones like: {ones_like_tensor}")

# 8. Create a linearly spaced tensor from 0 to 10 with 50 points (use torch.linspace)
# ✅ CORRECT: Perfect implementation
linspace_tensor = torch.linspace(0, 10, 50)
print(f"\nQ8 - Shape: {linspace_tensor.shape}")
print(f"Q8 - First 5 values: {linspace_tensor[:5]}")
print(f"Q8 - Last 5 values: {linspace_tensor[-5:]}")

# 9. Create a tensor with random integers between 1 and 100, shape (3, 3)
# ⚠️ ISSUE: torch.rand(3,3)*100 gives floats between 0-100, not integers 1-100
# Your approach:
float_random = torch.rand(3, 3) * 100
print(f"\nQ9 - Your approach (floats 0-100): {float_random.dtype}")
print(f"Sample values: {float_random[0, :3]}")

# Correct approaches for integers 1-100:
int_random_v1 = torch.randint(1, 101, (3, 3))  # Method 1: direct randint
int_random_v2 = (torch.rand(3, 3) * 100).int() + 1  # Method 2: scale to [0, 100), truncate to 0-99, shift to 1-100
print(f"\nQ9 - Correct approach 1: {int_random_v1.dtype}")
print(int_random_v1)
print(f"Q9 - Correct approach 2: {int_random_v2.dtype}")
print(int_random_v2)

# 10. Create a tensor using torch.arange from 0 to 20 with step size 2
# ✅ CORRECT: Perfect implementation
arange_tensor = torch.arange(0, 20, 2)
print(f"\nQ10 - Values: {arange_tensor}")
print(f"Q10 - Shape: {arange_tensor.shape}")

# 11. Create a 3D tensor of shape (2, 3, 4) filled with random normal distribution
# ✅ CORRECT: Perfect implementation
normal_tensor = torch.randn(2, 3, 4)
print(f"\nQ11 - Shape: {normal_tensor.shape}")
print(f"Q11 - Mean: {normal_tensor.mean():.3f}, Std: {normal_tensor.std():.3f}")

# 12. Create a tensor and explicitly set its dtype to torch.float64
# ✅ CORRECT: Perfect implementation
float64_tensor = torch.randn(3, 3, dtype=torch.float64)
print(f"\nQ12 - Shape: {float64_tensor.shape}, Dtype: {float64_tensor.dtype}")

# 13. Create a boolean tensor of shape (3, 3) with random True/False values
# ❌ INCORRECT: torch.rand(3, 3, dtype=torch.bool) raises an error
# torch.rand() samples from a uniform distribution, which is only defined for floating-point dtypes
try:
    bool_wrong = torch.rand(3, 3, dtype=torch.bool)
    print(f"\nQ13 - Your approach result: {bool_wrong}")
except Exception as e:
    print(f"\nQ13 - Your approach fails: {e}")

# Correct approaches:
bool_correct_v1 = torch.randint(0, 2, (3, 3), dtype=torch.bool)  # Method 1
bool_correct_v2 = torch.rand(3, 3) > 0.5  # Method 2
print(f"Q13 - Correct approach 1: {bool_correct_v1.dtype}")
print(bool_correct_v1)
print(f"Q13 - Correct approach 2: {bool_correct_v2.dtype}")
print(bool_correct_v2)

# 14. Create a tensor from a nested list with mixed data types and observe the result
# ❌ INCOMPLETE: You didn't implement this
mixed_list = [[1, 2.5, 3], [4.0, 5, 6.7]]  # Mix of int and float
mixed_tensor = torch.tensor(mixed_list)
print(f"\nQ14 - Mixed data types list: {mixed_list}")
print(f"Q14 - Resulting tensor: {mixed_tensor}")
print(f"Q14 - Tensor dtype: {mixed_tensor.dtype} (PyTorch promotes to common type)")

# Example with incompatible types:
try:
    incompatible = [[1, 2], ["a", "b"]]  # int and string
    bad_tensor = torch.tensor(incompatible)
except Exception as e:
    print(f"Q14 - Error with incompatible types: {e}")

# 15. Create a tensor and clone it (ensuring no memory sharing)
# ✅ CORRECT: Perfect implementation
x = torch.rand(2, 3)
y = x.clone()
print(f"\nQ15 - Original tensor: {x}")
print(f"Q15 - Cloned tensor: {y}")
print(f"Q15 - Same data: {torch.equal(x, y)}")
print(f"Q15 - Share memory: {x.data_ptr() == y.data_ptr()}")

# Verify no memory sharing:
x[0, 0] = 999
print(f"Q15 - After modifying original: x[0,0]={x[0,0]}, y[0,0]={y[0,0]}")
print(
    f"Q15 - Memory sharing test: {'❌ SHARED' if x[0,0] == y[0,0] else '✅ INDEPENDENT'}"
)

print("\n" + "=" * 60)
print("SECTION 1 SUMMARY:")
print("✅ Correct: Questions 1, 2, 3, 4, 5, 7, 8, 10, 11, 12, 15")
print("⚠️  Issues: Question 9 (should use randint), Question 13 (bool generation)")
print("❌ Incomplete: Questions 6 (merged with 7), 14 (not implemented)")
print("=" * 60)



In [None]:
# ==========================================
# SECTION 2: SHAPE MANIPULATION (16-30)
# ==========================================

# 16. Reshape a tensor of shape (12,) to shape (3, 4) using .view()
x = torch.rand(12)
x = x.view(3, 4)

# 17. Reshape the same tensor to (2, 6) using .reshape()
x = x.reshape(2, 6)

# 18. Explain the difference between .view() and .reshape() with an example
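# --- Ans: .view() never copies data. It returns a new view over the same storage and raises a RuntimeError when the tensor's memory layout cannot express the requested shape (for example, after a transpose). .reshape() returns a view when possible and otherwise copies the data into a new contiguous tensor.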
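
Question 18 asks for an example. The following is a minimal sketch, assuming a small integer tensor; the variable names are illustrative and not part of the original answer. It shows `.view()` succeeding on a contiguous tensor, failing after a transpose, and `.reshape()` falling back to a copy.

```python
import torch

base = torch.arange(12)

# On a contiguous tensor, .view() returns a new tensor over the same storage.
viewed = base.view(3, 4)
same_storage = viewed.data_ptr() == base.data_ptr()

# A transpose only swaps strides, so the result is no longer contiguous.
transposed = viewed.t()

# .view() cannot express the flattened shape with these strides and raises.
try:
    transposed.view(12)
    view_failed = False
except RuntimeError:
    view_failed = True

# .reshape() falls back to copying the data into a fresh contiguous tensor.
reshaped = transposed.reshape(12)
copied = reshaped.data_ptr() != transposed.data_ptr()

print(same_storage, transposed.is_contiguous(), view_failed, copied)
```

Calling `.contiguous()` before `.view()` would also work on the transposed tensor, at the cost of the same copy that `.reshape()` performs.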

# 19. Use .squeeze() to remove all dimensions of size 1 from a tensor
x = torch.rand(1, 2, 4, 5)
x.squeeze()

# 20. Use .unsqueeze() to add a new dimension at position 1
x.unsqueeze(dim=1)

# 21. Transpose a 2D tensor (swap rows and columns)
x = torch.rand(4, 5)
x.transpose(0, 1)

# 22. Use .permute() to rearrange dimensions of a 4D tensor (B,C,H,W) to (B,H,W,C)
x = torch.rand(1, 2, 4, 5)
x.permute(0, 2, 3, 1)

# 23. Flatten a 3D tensor to 1D using .flatten()
x = torch.rand(2, 4, 5)
x.flatten()

# 24. Flatten only the last two dimensions of a 4D tensor
x = torch.rand(2, 4, 5, 3)
x.flatten(start_dim=2)

# 25. Concatenate two tensors along dimension 0 using torch.cat()
x = torch.rand(2, 4)
y = torch.rand(2, 4)
torch.cat([x, y], dim=0)

# 26. Stack two 2D tensors to create a 3D tensor using torch.stack()
x = torch.rand(2, 4)
y = torch.rand(2, 4)
torch.stack([x, y], dim=0)

# 27. Split a tensor into 3 equal parts along dimension 1
x = torch.rand(5, 6)
torch.chunk(x, 3, dim=1)

# 28. Use torch.chunk() to divide a tensor into 4 chunks
x = torch.rand(5, 8)
torch.chunk(x, 4, dim=1)

# 29. Repeat a tensor 3 times along dimension 0 using .repeat()
x = torch.rand(5, 8)
x.repeat(3, 1)  # shape (15, 8): three copies of x stacked along dimension 0

# 30. Expand a tensor from shape (1, 3) to shape (5, 3) without copying data

x = torch.rand(1, 3)
x.expand(5, 3)  # returns a view with stride 0 along dimension 0, so no data is copied

In [None]:
# ==========================================
# SECTION 3: MATHEMATICAL OPERATIONS (31-45)
# ==========================================

# 31. Perform element-wise multiplication of two tensors
x = torch.rand(2, 4)
y = torch.rand(2, 4)
x * y

# 32. Compute matrix multiplication of two 2D tensors using torch.matmul()

x = torch.rand(2, 4)
y = torch.rand(4, 2)
x.matmul(y)

# 33. Use torch.bmm() for batch matrix multiplication (3D tensors)
x = torch.rand(5, 2, 4)
y = torch.rand(5, 4, 2)
torch.bmm(x, y)

# 34. Calculate the mean of a tensor along dimension 1
x.mean(dim=1)

# 35. Find the maximum value and its index using torch.max()
value, index = torch.max(x, dim=1)  # torch.max with dim returns (values, indices), in that order

# 36. Compute the standard deviation of a tensor along the last dimension
x.std(dim=-1)

# 37. Apply torch.clamp() to restrict tensor values between -1 and 1
torch.clamp(x, -1, 1)

# 38. Use torch.where() to replace negative values with zeros
torch.where(x < 0, 0, x)

# 39. Compute the L2 norm of a tensor
torch.norm(x)

# 40. Calculate the dot product of two 1D tensors
# ❌ FIXED: Use 1D tensors for torch.dot
x1 = torch.rand(5)
y1 = torch.randn(5)
torch.dot(x1, y1)

# 41. Implement ReLU activation function using tensor operations
torch.where(x > 0, x, 0)

# 42. Calculate softmax probabilities for a batch of logits
# ❌ FIXED: Use torch.rand for logits and correct dim
logits = torch.rand(10, 1, 5)
torch.softmax(logits, dim=-1)

# 43. Compute the cross-entropy loss between predictions and targets
# ❌ FIXED: Use CrossEntropyLoss for classification
pred = torch.randn(5, 4)  # logits
labels = torch.randint(0, 4, (5,))  # class indices
loss = torch.nn.CrossEntropyLoss()
loss(pred, labels)


# 44. Implement a function to normalize a tensor (zero mean, unit variance)
# ⚠️ SIMPLIFIED: Use input_tensor.std(dim=0, unbiased=False)
def normalize_tensor(input_tensor: torch.Tensor) -> torch.Tensor:
    mean = input_tensor.mean(dim=0)
    std = input_tensor.std(dim=0, unbiased=False)
    return (input_tensor - mean) / std


# 45. Calculate cosine similarity between two vectors
# ❌ FIXED: Use 1D tensors for cosine similarity
x2 = torch.rand(10)
y2 = torch.rand(10)
torch.dot(x2, y2) / (torch.norm(x2) * torch.norm(y2))


# ==========================================
# SECTION 4: INDEXING & SLICING (46-55)
# ==========================================

# 46. Extract the first row and last column from a 2D tensor
# ✅ Correct: x[0, :] gives first row, x[:, -1] gives last column
x = torch.rand(5, 5)
first_row = x[0, :]
last_col = x[:, -1]

# 47. Use boolean indexing to select elements greater than 0.5
# ✅ Correct: x[x > 0.5] selects elements > 0.5
selected = x[x > 0.5]

# 48. Extract every other element from a 1D tensor using slicing
# ⚠️ Improved: Use slicing for clarity
x = torch.rand(10)
every_other = x[::2]

# 49. Use advanced indexing to gather specific elements from multiple dimensions
# ❌ FIXED: Example using paired index tensors (integer array indexing)
x = torch.arange(16).reshape(4, 4)
rows = torch.tensor([0, 1, 2])
cols = torch.tensor([1, 2, 3])
advanced = x[rows, cols]  # gathers (0,1), (1,2), (2,3)

# 50. Implement fancy indexing to reorder rows of a matrix
# ❌ FIXED: Example of reordering rows
x = torch.arange(12).reshape(4, 3)
order = torch.tensor([2, 0, 3, 1])
reordered = x[order]

# 51. Use torch.gather() to select elements along a dimension
# ❌ FIXED: Example for dim=1
x = torch.tensor([[10, 20, 30], [40, 50, 60]])
indices = torch.tensor([[2, 1, 0], [0, 2, 1]])
gathered = torch.gather(x, 1, indices)

# 52. Mask out certain elements of a tensor and replace with a value
# ❌ FIXED: Use torch.where to mask
x = torch.arange(6).float()
mask = x > 2
masked = torch.where(mask, torch.tensor(-1.0), x)

# 53. Extract a diagonal from a 2D tensor
# ❌ FIXED: Use torch.diag or torch.diagonal
x = torch.arange(9).reshape(3, 3)
diag = torch.diagonal(x)

# 54. Use torch.nonzero() to find indices of non-zero elements
# ✅ Correct: torch.nonzero(x)
x = torch.tensor([0, 1, 0, 2])
nonzero_indices = torch.nonzero(x)

# 55. Implement tensor indexing that mimics NumPy's ix_() function
# ❌ FIXED: Use torch.meshgrid for ix_ functionality
row_idx = torch.tensor([0, 2])
col_idx = torch.tensor([1, 3])
ix_rows, ix_cols = torch.meshgrid(row_idx, col_idx, indexing="ij")
# Use for advanced indexing on a 2D tensor m with at least 4 columns: m[ix_rows, ix_cols]

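
For question 55, a broadcasting form is a common alternative to `torch.meshgrid`. The sketch below assumes a 4x4 example matrix named `m` (not part of the original cell) and compares both forms against NumPy's `np.ix_`.

```python
import numpy as np
import torch

m = torch.arange(16).reshape(4, 4)
row_idx = torch.tensor([0, 2])
col_idx = torch.tensor([1, 3])

# A column of row indices broadcast against a row of column indices
# selects the full 2x2 cross product, as np.ix_ does in NumPy.
block = m[row_idx[:, None], col_idx[None, :]]

# The meshgrid form builds the same index grids explicitly.
grid_rows, grid_cols = torch.meshgrid(row_idx, col_idx, indexing="ij")
block_meshgrid = m[grid_rows, grid_cols]

# NumPy reference for comparison.
expected = m.numpy()[np.ix_(row_idx.numpy(), col_idx.numpy())]

print(block)
```

Both forms pick rows 0 and 2 at columns 1 and 3. The broadcasting version avoids materializing the index grids.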

In [None]:
# ==========================================
# SECTION 5: DEVICE MANAGEMENT (56-65)
# ==========================================

# 56. Check if CUDA is available and print GPU information
# ✅ Correct: Checks CUDA and prints device info
if torch.cuda.is_available():
    info = torch.cuda.get_device_properties(0)
    print(info)
else:
    print("CUDA not available.")

# 57. Move a tensor to GPU (if available) and back to CPU
# ⚠️ Improved: Use .to() and .cpu() for round-trip
x = torch.rand(3, 3)
if torch.cuda.is_available():
    x_gpu = x.to("cuda")
    x_cpu = x_gpu.to("cpu")

# 58. Create a tensor directly on GPU device
# ⚠️ Improved: Only create on GPU if available
if torch.cuda.is_available():
    y = torch.rand(3, 3, device="cuda")
else:
    y = torch.rand(3, 3)

# 59. Perform operations between tensors on the same device
# ✅ Correct: x and y must be on the same device
if x.device == y.device:
    result = x + y
else:
    result = x.to(y.device) + y

# 60. Handle device mismatch errors gracefully
# ❌ FIXED: Example with try/except
try:
    z = x + y
except RuntimeError as e:
    print(f"Device mismatch error: {e}")

# 61. Compare performance of CPU vs GPU tensor operations
# ❌ FIXED: Use torch.cuda.synchronize and time module
import time

x_cpu = torch.rand(10000, 10000)
start = time.time()
res_cpu = x_cpu @ x_cpu
cpu_time = time.time() - start
if torch.cuda.is_available():
    x_gpu = x_cpu.to("cuda")
    torch.cuda.synchronize()
    start = time.time()
    res_gpu = x_gpu @ x_gpu
    torch.cuda.synchronize()
    gpu_time = time.time() - start
    print(f"CPU time: {cpu_time:.4f}s, GPU time: {gpu_time:.4f}s")
else:
    print(f"CPU time: {cpu_time:.4f}s, GPU not available.")


# 62. Implement a function that ensures tensors are on the correct device
# ❌ FIXED: Function to move tensor to target device
def ensure_device(tensor, device):
    return tensor.to(device)


# 63. Use torch.cuda.synchronize() to measure accurate GPU timing
# ❌ FIXED: Example usage for timing
if torch.cuda.is_available():
    torch.cuda.synchronize()
    start = time.time()
    _ = x_gpu @ x_gpu
    torch.cuda.synchronize()
    elapsed = time.time() - start
    print(f"Elapsed (GPU, synchronized): {elapsed:.4f}s")

# 64. Monitor GPU memory usage during tensor operations
# ❌ FIXED: Use torch.cuda.memory_allocated()
if torch.cuda.is_available():
    mem = torch.cuda.memory_allocated()
    print(f"GPU memory allocated: {mem} bytes")

# 65. Implement device-agnostic code that works on both CPU and GPU
# ⚠️ Improved: Use device variable and move tensors
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"
x = torch.rand(3, 3, device=device)
y = torch.rand(3, 3, device=device)
result = x + y

_CudaDeviceProperties(name='NVIDIA GeForce RTX 2050', major=8, minor=6, total_memory=4095MB, multi_processor_count=16, uuid=8b95a0f1-0127-85af-367b-bf481d8c9078, L2_cache_size=1MB)


In [None]:
# ==========================================
# SECTION 6: MEMORY & PERFORMANCE (66-75)
# ==========================================

# 66. Demonstrate the difference between .clone() and direct assignment

# 67. Use in-place operations (.add_(), .mul_()) and understand their implications

# 68. Implement a function that checks if two tensors share memory

# 69. Create a tensor that shares storage with another tensor using .view()
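
Questions 66 to 69 all concern which tensors share storage. The following is one possible sketch, not a reference solution. `shares_memory` is a hypothetical helper, and it assumes PyTorch 2.0 or later for `untyped_storage()`. It compares the start of the underlying storage, so two non-overlapping slices of the same storage would also count as sharing.

```python
import torch


def shares_memory(a: torch.Tensor, b: torch.Tensor) -> bool:
    """Return True when both tensors are backed by the same storage (hypothetical helper)."""
    return a.untyped_storage().data_ptr() == b.untyped_storage().data_ptr()


original = torch.zeros(2, 3)

alias = original          # direct assignment: a second name for the same tensor
view = original.view(6)   # .view(): new shape, same storage
copy = original.clone()   # .clone(): independent storage

# An in-place op mutates the shared storage, so alias and view see the change.
original.add_(1)

print(shares_memory(original, alias), shares_memory(original, view), shares_memory(original, copy))
print(view.sum().item(), copy.sum().item())
```

The clone keeps its zeros because `add_` only touched the original storage, which is the main implication of in-place operations to keep in mind.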

# 70. Use torch.no_grad() context manager and understand when to use it
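
For question 70, a small sketch with made-up values can show what `torch.no_grad()` changes: results computed inside the context are not attached to the autograd graph.

```python
import torch

w = torch.ones(3, requires_grad=True)
x = torch.tensor([1.0, 2.0, 3.0])

# Normal forward pass: the result is attached to the autograd graph.
tracked = (w * x).sum()

# Inside no_grad, no graph is recorded, which saves memory during
# evaluation or when updating parameters by hand.
with torch.no_grad():
    untracked = (w * x).sum()

tracked.backward()
print(tracked.requires_grad, untracked.requires_grad, w.grad)
```

Typical uses are inference loops and manual parameter updates, where gradients are not needed.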

# 71. Implement memory-efficient tensor concatenation for large tensors

# 72. Compare memory usage of different tensor creation methods

# 73. Use torch.utils.benchmark to measure operation performance

# 74. Implement a memory profiler for tensor operations

# 75. Demonstrate broadcasting rules with tensors of different shapes
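
For question 75, the sketch below walks through the broadcasting rules on small, made-up shapes: dimensions are aligned from the right, size-1 dimensions stretch, and missing leading dimensions are treated as size 1.

```python
import torch

col = torch.arange(3).reshape(3, 1)  # shape (3, 1)
row = torch.arange(4).reshape(1, 4)  # shape (1, 4)

# Size-1 dimensions stretch to match the other operand.
table = col * 10 + row               # shape (3, 4)

# A missing leading dimension is treated as size 1.
bias = torch.tensor([100, 200, 300, 400])  # shape (4,)
shifted = table + bias                     # shape (3, 4)

# torch.broadcast_shapes reports the result shape without allocating data.
result_shape = torch.broadcast_shapes((5, 1, 4), (3, 1))

# Incompatible trailing dimensions (3 vs 4) raise an error.
try:
    torch.ones(2, 3) + torch.ones(2, 4)
    mismatch_raised = False
except RuntimeError:
    mismatch_raised = True

print(table.shape, shifted.shape, result_shape, mismatch_raised)
```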

In [None]:
# ===========================================
# SECTION 7: REAL-WORLD APPLICATIONS (76-85)
# ===========================================

# 76. Implement one-hot encoding for class labels
# Hint: Use torch.nn.functional.one_hot or scatter_ for one-hot encoding. Input: class indices, Output: one-hot tensor.
# Steps:
# 1. Create a tensor of class indices (e.g., torch.tensor([0, 2, 1])).
# 2. Use torch.nn.functional.one_hot(indices, num_classes) to get one-hot encoding.
class_indices = torch.randint(low=0, high=10, size=(20,))  # high is exclusive, so indices fall in 0-9
torch.nn.functional.one_hot(class_indices, 10)

# 77. Create a function to pad sequences to the same length
# Hint: Use torch.nn.utils.rnn.pad_sequence or manual padding with torch.zeros.
# Steps:
# 1. Given a list of 1D tensors of varying lengths.
# 2. Find the max length.
# 3. Pad each tensor with zeros (or a specified value) to max length.
# 4. Stack into a 2D tensor.
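
The steps for question 77 can be sketched as follows. `pad_to_max` is a hypothetical helper written for illustration, and the sample sequences are made up. The built-in `pad_sequence` is shown next to it for comparison.

```python
import torch
from torch.nn.utils.rnn import pad_sequence

sequences = [torch.tensor([1, 2, 3]), torch.tensor([4]), torch.tensor([5, 6])]


def pad_to_max(seqs, pad_value=0):
    """Manually pad 1D tensors to the longest length (hypothetical helper)."""
    max_len = max(s.size(0) for s in seqs)
    out = torch.full((len(seqs), max_len), pad_value, dtype=seqs[0].dtype)
    for i, s in enumerate(seqs):
        out[i, : s.size(0)] = s  # copy the real values, leave the rest padded
    return out


manual = pad_to_max(sequences)
builtin = pad_sequence(sequences, batch_first=True, padding_value=0)
print(manual)
```

`batch_first=True` is needed to get shape (batch, max_len); the default layout of `pad_sequence` is (max_len, batch).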

# 78. Implement image normalization (ImageNet statistics)
# Hint: Subtract mean and divide by std for each channel.
# Steps:
# 1. Given an image tensor of shape (C, H, W) or (N, C, H, W).
# 2. Use mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225].
# 3. Normalize: (img - mean) / std (broadcasting over channels).
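
A minimal sketch of the steps for question 78, assuming float images already scaled to [0, 1]. `normalize_image` is an illustrative name, not a library function.

```python
import torch

IMAGENET_MEAN = torch.tensor([0.485, 0.456, 0.406])
IMAGENET_STD = torch.tensor([0.229, 0.224, 0.225])


def normalize_image(img: torch.Tensor) -> torch.Tensor:
    """Normalize a (C, H, W) or (N, C, H, W) float image per channel."""
    # Reshape the stats to (C, 1, 1) so they broadcast over height and width,
    # and over the batch dimension when one is present.
    mean = IMAGENET_MEAN.to(device=img.device, dtype=img.dtype).view(-1, 1, 1)
    std = IMAGENET_STD.to(device=img.device, dtype=img.dtype).view(-1, 1, 1)
    return (img - mean) / std


batch = torch.rand(2, 3, 4, 4)
normalized = normalize_image(batch)
print(normalized.shape)
```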

# 79. Create a sliding window operation for time series data
# Hint: Use unfold or manual slicing to create overlapping windows.
# Steps:
# 1. Given a 1D tensor and window size (and optional stride).
# 2. Use x.unfold(0, window_size, stride) to get windows.
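
For question 79, the `unfold` call can be tried on a short, made-up series. The window size of 4 and stride of 2 are arbitrary choices for illustration.

```python
import torch

series = torch.arange(10.0)

# unfold(dimension, size, step) returns a view of overlapping windows.
windows = series.unfold(0, 4, 2)      # shape (4, 4)

# Per-window statistics follow from a reduction over the last dimension.
window_means = windows.mean(dim=1)

print(windows)
print(window_means)
```

The number of windows is `(len - size) // step + 1`, here `(10 - 4) // 2 + 1 = 4`.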

# 80. Implement k-means clustering initialization with tensors
# Hint: Randomly select k points as initial centroids from data.
# Steps:
# 1. Given a data tensor of shape (N, D) and k.
# 2. Randomly sample k indices without replacement.
# 3. Use data[indices] as initial centroids.

# 81. Create a function to compute pairwise distances between points
# Hint: Use broadcasting or torch.cdist for pairwise Euclidean distances.
# Steps:
# 1. Given two tensors X (N, D) and Y (M, D).
# 2. Use torch.cdist(X, Y) or implement manually with broadcasting.
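
Both routes named in question 81 can be compared on a few hand-picked points. This is a sketch with made-up data, not a tuned implementation.

```python
import torch

X = torch.tensor([[0.0, 0.0], [3.0, 4.0]])              # (N=2, D=2)
Y = torch.tensor([[0.0, 0.0], [6.0, 8.0], [3.0, 0.0]])  # (M=3, D=2)

# (N, 1, D) - (1, M, D) broadcasts to (N, M, D); reduce over D.
manual = (X[:, None, :] - Y[None, :, :]).pow(2).sum(dim=-1).sqrt()

# Built-in pairwise Euclidean distance.
builtin = torch.cdist(X, Y)

print(manual)
```

The broadcasting version materializes an (N, M, D) intermediate, so `torch.cdist` is usually the better choice for large inputs.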

# 82. Implement a simple convolution operation using tensor operations
# Hint: Use unfold for im2col, then matmul with kernel, or use F.conv2d.
# Steps:
# 1. Given input tensor (N, C, H, W) and kernel (out_C, in_C, kH, kW).
# 2. Use torch.nn.functional.unfold to extract patches.
# 3. Multiply patches by kernel and sum.
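
The unfold-and-matmul steps for question 82 can be sketched as below, assuming no padding, stride 1, and arbitrary example shapes. The result is checked against `F.conv2d`.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
x = torch.randn(2, 3, 8, 8)       # (N, C, H, W)
kernel = torch.randn(4, 3, 3, 3)  # (out_C, in_C, kH, kW)

# 1. Extract patches: (N, C*kH*kW, L), where L is the number of output positions.
patches = F.unfold(x, kernel_size=3)  # (2, 27, 36)

# 2. Flatten each filter to a row and multiply: (out_C, 27) @ (N, 27, L) -> (N, out_C, L).
out = kernel.view(4, -1) @ patches    # (2, 4, 36)

# 3. Reshape the L positions back into a spatial grid (8 - 3 + 1 = 6 per side).
out = out.view(2, 4, 6, 6)

reference = F.conv2d(x, kernel)
print(torch.allclose(out, reference, atol=1e-4))
```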

# 83. Create a function to generate random augmentation parameters
# Hint: Use torch.rand or torch.randint for random crop, flip, rotation, etc.
# Steps:
# 1. Decide which augmentations (e.g., crop, flip, rotate).
# 2. Generate random values for each parameter.
# 3. Return as a dict or tuple.

# 84. Implement attention mechanism computations using tensors
# Hint: Use Q, K, V matrices and softmax(QK^T/sqrt(d_k))V.
# Steps:
# 1. Given Q, K, V tensors (batch, seq, d_k).
# 2. Compute attention scores: Q @ K.transpose(-2, -1) / sqrt(d_k).
# 3. Apply softmax to scores.
# 4. Multiply by V to get output.
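
The four steps for question 84 translate almost line for line into code. The function name `attention` and the example shapes are illustrative; masking and dropout are left out of this sketch.

```python
import math

import torch


def attention(q: torch.Tensor, k: torch.Tensor, v: torch.Tensor):
    """Scaled dot-product attention over (batch, seq, d) tensors."""
    d_k = q.size(-1)
    # Similarity of every query with every key, scaled by sqrt(d_k).
    scores = q @ k.transpose(-2, -1) / math.sqrt(d_k)  # (batch, seq_q, seq_k)
    # Each query's weights over the keys sum to 1.
    weights = torch.softmax(scores, dim=-1)
    # Weighted sum of the values.
    return weights @ v, weights


torch.manual_seed(0)
q = torch.randn(2, 5, 8)
k = torch.randn(2, 7, 8)
v = torch.randn(2, 7, 16)
out, weights = attention(q, k, v)
print(out.shape, weights.shape)
```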

# 85. Create a function to compute moving averages efficiently
# Hint: Use torch.cumsum and slicing for moving average.
# Steps:
# 1. Given a 1D tensor and window size.
# 2. Compute cumulative sum.
# 3. Subtract shifted cumsum to get window sums.
# 4. Divide by window size for average.
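
One way to write the cumulative-sum approach from question 85 is shown below. `moving_average` is an illustrative name, and the sketch assumes a 1D float tensor.

```python
import torch


def moving_average(x: torch.Tensor, window: int) -> torch.Tensor:
    """Moving average over full windows of a 1D tensor, via cumulative sums."""
    if not 1 <= window <= x.numel():
        raise ValueError("window must be between 1 and the length of x")
    # Prepend a zero so csum[i] holds the sum of the first i elements.
    csum = torch.cat([x.new_zeros(1), x.cumsum(dim=0)])
    # Sum of x[i : i + window] equals csum[i + window] - csum[i].
    return (csum[window:] - csum[:-window]) / window


values = torch.tensor([1.0, 2.0, 3.0, 4.0, 5.0])
print(moving_average(values, 3))
```

For very long float32 series, the running sum can lose precision; computing the cumulative sum in float64 is a common workaround.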

In [None]:
# ============================================
# SECTION 8: DEBUGGING & TROUBLESHOOTING (86-95)
# ============================================

# 86. Write a function to debug tensor shapes in a complex operation
# Hint: Print shapes at key steps in your computation pipeline.
# Steps:
# 1. Define a function that takes tensors as input.
# 2. Print tensor.shape at each step.

# 87. Implement error handling for common tensor operation failures
# Hint: Use try/except blocks to catch RuntimeError or ValueError.
# Steps:
# 1. Wrap tensor operations in try/except.
# 2. Print or log the error message for debugging.

# 88. Create a tensor validator that checks for NaN and infinite values
# Hint: Use torch.isnan and torch.isinf to check for invalid values.
# Steps:
# 1. Define a function that takes a tensor.
# 2. Use torch.isnan(tensor).any() and torch.isinf(tensor).any().
# 3. Return or print a warning if found.
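
A minimal validator for question 88 might look like this. `check_finite` and its `name` parameter are hypothetical and only serve the illustration.

```python
import torch


def check_finite(t: torch.Tensor, name: str = "tensor") -> bool:
    """Return True if the tensor has no NaN or Inf values; print a warning otherwise."""
    n_nan = int(torch.isnan(t).sum())
    n_inf = int(torch.isinf(t).sum())
    if n_nan or n_inf:
        print(f"Warning: {name} contains {n_nan} NaN and {n_inf} Inf values")
        return False
    return True


clean = torch.tensor([1.0, 2.0, 3.0])
broken = torch.tensor([1.0, float("nan"), float("inf")])
print(check_finite(clean, "clean"), check_finite(broken, "broken"))
```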

# 89. Implement a function to visualize tensor statistics
# Hint: Print or plot mean, std, min, max, and optionally histogram.
# Steps:
# 1. Compute tensor.mean(), tensor.std(), tensor.min(), tensor.max().
# 2. Optionally use matplotlib to plot a histogram.

# 90. Create a debugging tool that prints tensor info (shape, dtype, device, etc.)
# Hint: Print tensor.shape, tensor.dtype, tensor.device, tensor.requires_grad.
# Steps:
# 1. Define a function that takes a tensor.
# 2. Print all relevant info.

# 91. Implement gradient checking for custom operations
# Hint: Use torch.autograd.gradcheck for numerical gradient checking.
# Steps:
# 1. Define a custom autograd Function.
# 2. Use gradcheck with double precision inputs.
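
For question 91, the sketch below checks a toy custom function. `Square` and `BadSquare` are made-up examples; the second one has a deliberately wrong backward pass to show what a failed check looks like.

```python
import torch
from torch.autograd import Function, gradcheck


class Square(Function):
    @staticmethod
    def forward(ctx, x):
        ctx.save_for_backward(x)
        return x * x

    @staticmethod
    def backward(ctx, grad_output):
        (x,) = ctx.saved_tensors
        return 2 * x * grad_output  # d(x^2)/dx = 2x


class BadSquare(Square):
    @staticmethod
    def backward(ctx, grad_output):
        (x,) = ctx.saved_tensors
        return x * grad_output  # wrong on purpose: missing the factor 2


# gradcheck compares analytical and numerical gradients; double precision
# keeps the finite-difference error small.
inputs = torch.tensor([0.5, -1.0, 2.0, 3.0], dtype=torch.double, requires_grad=True)
ok = gradcheck(Square.apply, (inputs,), eps=1e-6, atol=1e-4)
bad = gradcheck(BadSquare.apply, (inputs,), eps=1e-6, atol=1e-4, raise_exception=False)
print(ok, bad)
```

With the default `raise_exception=True`, the failing check would raise an error describing the mismatch instead of returning `False`.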

# 92. Create a function to detect memory leaks in tensor operations
# Hint: Use Python's gc module and torch.cuda.memory_allocated().
# Steps:
# 1. Monitor memory usage before and after operations.
# 2. Use gc.collect() to force garbage collection.

# 93. Implement a tensor comparison function with tolerance
# Hint: Use torch.allclose for elementwise comparison with tolerance.
# Steps:
# 1. Define a function that takes two tensors and atol/rtol.
# 2. Return torch.allclose(tensor1, tensor2, rtol=rtol, atol=atol). Pass the tolerances by keyword, because the positional order is rtol, then atol.

# 94. Create a profiling decorator for tensor operations
# Hint: Use time.time() or torch.profiler to measure execution time.
# Steps:
# 1. Define a decorator that times a function.
# 2. Print or log the elapsed time.
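
A timing decorator for question 94 could be sketched as follows. `timed` and the `last_elapsed` attribute are hypothetical names. The decorator synchronizes CUDA before reading the clock when a GPU is available, since GPU kernels run asynchronously.

```python
import functools
import time

import torch


def timed(fn):
    """Print and record the wall-clock time of each call (hypothetical decorator)."""

    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        if torch.cuda.is_available():
            torch.cuda.synchronize()  # finish pending GPU work first
        start = time.perf_counter()
        result = fn(*args, **kwargs)
        if torch.cuda.is_available():
            torch.cuda.synchronize()  # wait for the kernels launched by fn
        wrapper.last_elapsed = time.perf_counter() - start
        print(f"{fn.__name__}: {wrapper.last_elapsed:.6f}s")
        return result

    wrapper.last_elapsed = None
    return wrapper


@timed
def matmul(a, b):
    return a @ b


out = matmul(torch.rand(64, 64), torch.rand(64, 64))
```

A single call is a noisy measurement; for careful comparisons, `torch.utils.benchmark` (question 73) handles warm-up and repetition.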

# 95. Implement a function to optimize tensor layouts for performance
# Hint: Use .contiguous(), .pin_memory(), or .to(memory_format=...).
# Steps:
# 1. Check if tensor is contiguous; if not, call .contiguous().
# 2. For DataLoader, use .pin_memory() for faster host-to-GPU transfer.

In [None]:
# ========================================
# BONUS CHALLENGES (96-100)
# ========================================

# 96. Implement a custom tensor class that extends PyTorch tensors

# 97. Create a tensor caching mechanism for repeated operations

# 98. Implement automatic mixed precision for tensor operations

# 99. Create a distributed tensor operation using multiple devices

# 100. Implement a complete mini neural network using only tensor operations

# ========================================
# CONGRATULATIONS!
# ========================================
# If you've completed all these exercises, you've mastered PyTorch tensors!
# You're now ready to dive deep into neural network architectures,
# custom layers, and advanced deep learning techniques.