
## NumPy: The Foundation of Numerical Computing in Python
****
**Summary**

NumPy provides a powerful, efficient, and user-friendly toolkit for working with numerical data in Python. Its widespread adoption and integration with other scientific libraries make it an essential tool for anyone involved in data analysis, scientific computing, or machine learning.

**Introduction to NumPy**

NumPy (Numerical Python) is a powerful open-source library that forms the backbone of scientific and numerical computing within the Python ecosystem. It provides:

**1. The ndarray (N-dimensional array):**

* **Efficient Data Structure:** NumPy's core strength lies in its `ndarray` object (usually created with `np.array()`), a multidimensional array designed for fast numerical operations. By default it stores data in a contiguous block of memory, which enables highly optimized computations.
* **Homogeneous Data:** Typically, all elements within an ndarray are of the same data type (e.g., integers, floats), which further enhances computational performance.

**2. A Vast Collection of Functions:**

* **Mathematical Operations:**  NumPy offers a rich set of mathematical functions for performing element-wise operations, linear algebra, Fourier transforms, random number generation, and much more, directly on arrays.
* **Broadcasting:**  NumPy's broadcasting rules allow you to perform operations on arrays of different shapes seamlessly, eliminating the need for manual loops and making your code concise and efficient.
* **Indexing and Slicing:** Powerful indexing and slicing mechanisms make it easy to access and manipulate specific elements or sub-arrays within the `ndarray`.
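
The broadcasting and indexing behavior described above can be illustrated with a small sketch. The array values and variable names here are illustrative only.

```python
import numpy as np

# A 3x4 matrix and a length-4 row vector (illustrative values).
matrix = np.arange(12).reshape(3, 4)   # rows: [0..3], [4..7], [8..11]
row = np.array([10, 20, 30, 40])

# Broadcasting: the row is virtually repeated for each of the 3 rows,
# so no explicit loop is needed.
shifted = matrix + row

# Slicing: first two rows, last two columns.
block = matrix[:2, 2:]

# Boolean indexing: all elements greater than 8.
large = matrix[matrix > 8]

print(shifted.shape)  # (3, 4)
print(block)
print(large)          # [ 9 10 11]
```

Broadcasting works here because the trailing dimension of `matrix` (4) matches the length of `row`; shapes that cannot be aligned this way raise a `ValueError`.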

**3. Interoperability:**

* **Foundation for Other Libraries:**  NumPy acts as the foundational building block for many other scientific and data analysis libraries in Python. Libraries like SciPy, Pandas, and Matplotlib are built on NumPy arrays, and machine learning frameworks like TensorFlow and PyTorch interoperate closely with them.

**Key Advantages:**

* **Performance:** NumPy's core is implemented in C, and its linear algebra routines call optimized BLAS/LAPACK libraries. This delivers significant performance gains compared to working with standard Python lists, especially for large datasets.
* **Ease of Use:** NumPy's intuitive and expressive syntax makes it relatively easy to perform complex numerical computations.
* **Versatility:**  NumPy is used in a wide range of scientific and engineering domains, including data analysis, machine learning, image processing, signal processing, and more.


In [None]:
import numpy as np

# Create a 2D array
arr = np.array([[1, 2, 3], [4, 5, 6]])

# Perform element-wise multiplication
result = arr * 2
print(result)  # Output: [[ 2  4  6]
              #          [ 8 10 12]]

# Calculate the mean of all elements
mean_value = np.mean(arr)
print(mean_value)  # Output: 3.5

[[ 2  4  6]
 [ 8 10 12]]
3.5


## Types in NumPy

****

NumPy handles types in a few key ways that make it a powerful tool for numerical computation:

**1. Homogeneous Arrays:**

* **Fundamental Principle:** NumPy arrays are designed to be *homogeneous*, meaning all elements within an array have the same data type (e.g., `int32`, `float64`, `bool`).
* **Efficiency:** This homogeneity allows NumPy to optimize storage and computation significantly.
* **Data Type Specification:** When creating an array, you can specify the data type using the `dtype` argument. If not provided, NumPy attempts to infer the appropriate type based on the input data.
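
As a minimal sketch of type inference versus an explicit `dtype` (the variable names are illustrative), the following compares what NumPy picks on its own with what you request:

```python
import numpy as np

inferred_int = np.array([1, 2, 3])        # integer dtype (int64 on most platforms)
inferred_float = np.array([1, 2.5, 3])    # one float makes the whole array float64
explicit = np.array([1, 2, 3], dtype=np.float32)

print(inferred_int.dtype)
print(inferred_float.dtype)   # float64
print(explicit.dtype)         # float32
print(explicit.itemsize)      # 4 (bytes per element)
```

Choosing a smaller type such as `float32` halves the memory per element compared with `float64`, at the cost of precision.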

**2. Type Casting and Promotion:**

* **Automatic Type Conversion:** NumPy performs *type promotion* when an operation combines arrays with different data types.
    * **Casting:**  Converting values from one data type to another. NumPy does this implicitly only when the conversion is considered safe (e.g., `int32` to `int64`).
    * **Promotion:** Choosing a common result type that can hold both operands. For example, combining `int64` with `float64` gives `float64`.

**3. Data Type Objects (`dtype`)**

* **Describing Scalar Types:** NumPy has a rich set of built-in scalar data types to describe various precisions of integers, floating-point numbers, booleans, and more.
* **Structured Data Types:** You can create structured data types where fields contain other data types, providing a way to organize complex data within an array.
* **Checking and Comparing:** Use the `dtype` attribute to access and compare the data type of an array. Remember to use `==` for comparison, not `is`.
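
A short sketch of a structured data type and of dtype comparison follows. The record layout (`name`, `age`, `weight`) is a hypothetical example, not something prescribed by NumPy.

```python
import numpy as np

# Hypothetical record layout: a name, an age, and a weight.
person_dtype = np.dtype([("name", "U10"), ("age", np.int32), ("weight", np.float64)])
people = np.array([("Ada", 36, 58.5), ("Linus", 54, 80.0)], dtype=person_dtype)

print(people["age"])       # [36 54]
print(people[0]["name"])   # Ada

# Compare dtypes with ==, not is.
ages = people["age"]
print(ages.dtype == np.int32)  # True
print(ages.dtype is np.int32)  # False: a dtype instance is not the scalar type class
```

The `is` check fails because `ages.dtype` is a `np.dtype` instance while `np.int32` is a scalar type class; `==` compares what the two describe.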

**4. Flexibility and Control:**

* **Explicit Type Conversion:** You can use functions like `astype()` to explicitly convert an array to a different data type.
* **Type Safety:** Be aware of potential issues when mixing types, like unintended integer division or overflow.
* **Advanced Typing:**  NumPy 1.20 and later ship type annotations (including the `numpy.typing` module) for static type checking with tools like `mypy`.
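
The pitfalls mentioned above (truncation in `astype()`, integer division, and overflow) can be seen in a small sketch with illustrative values:

```python
import numpy as np

# astype() from float to int drops the fractional part (truncates toward zero).
floats = np.array([-1.7, 0.2, 1.7])
truncated = floats.astype(np.int32)

# // keeps integers, while / always produces floats.
ints = np.array([7, 8, 9])
floor_div = ints // 2
true_div = ints / 2

# Fixed-width integers wrap around silently in array arithmetic.
small = np.array([100, 120], dtype=np.int8)   # int8 holds -128..127
wrapped = small + small

print(truncated)  # [-1  0  1]
print(floor_div)  # [3 4 4]
print(true_div)   # [3.5 4.  4.5]
print(wrapped)    # [-56 -16]
```

If wrapped values are a concern, converting to a wider type first (for example `small.astype(np.int32)`) avoids the overflow.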



**Key Takeaway:** NumPy's focus on homogeneous arrays and its smart type handling mechanisms allow for efficient computation while still offering the flexibility to work with various data types.


In [None]:

import numpy as np

arr1 = np.array([1, 2, 3])  # inferred as int64 on most platforms
arr2 = np.array([1.5, 2.5, 3.5])  # inferred as float64

# Type Promotion: Result is float64 to accommodate all values
result = arr1 + arr2
print(result)  # Output: [2.5 4.5 6.5]
print(result.dtype) # Output: float64

# Explicit Type Conversion
int_result = result.astype(np.int32)  # Convert to int32 (fractional parts are truncated)
print(int_result)  # Output: [2 4 6]

[2.5 4.5 6.5]
float64
[2 4 6]


## PyTorch Tensors

PyTorch tensors are the fundamental building blocks for representing and manipulating data in PyTorch. They provide the core functionalities for defining models, performing computations, and training neural networks within the PyTorch framework.


**What are PyTorch Tensors?**

At their core, PyTorch tensors are multidimensional arrays, similar to NumPy arrays, but with a few key distinctions:

* **GPU Acceleration:**  PyTorch tensors can run on GPUs (Graphics Processing Units), dramatically speeding up numerical computations, especially for deep learning tasks.
* **Automatic Differentiation:** PyTorch tensors are built for automatic differentiation (autograd), a crucial feature for training neural networks where gradients are calculated automatically, facilitating efficient optimization.

**Core Characteristics:**

* **Multidimensional:** Tensors can represent scalars (0-dimensional), vectors (1-dimensional), matrices (2-dimensional), and higher-dimensional arrays.
* **Homogeneous Data:** Typically, all elements within a tensor are of the same data type (e.g., float32, int64).
* **Dynamic Computation Graph:** PyTorch constructs a dynamic computational graph as your code executes, keeping track of operations on tensors for backpropagation.
* **Seamless NumPy Integration:** Tensors can be effortlessly converted to and from NumPy arrays, enabling easy interaction with the wider scientific computing ecosystem in Python.

**Key Operations:**

* **Creation:**  You can create tensors using various methods, including:
    * `torch.tensor()` (from existing data)
    * `torch.rand()`, `torch.randn()`, `torch.zeros()`, `torch.ones()` (filled with random values, zeros, or ones)
    * `torch.from_numpy()` (convert from NumPy arrays)
* **Indexing and Slicing:**  Similar to NumPy, tensors support powerful indexing and slicing to access specific elements or sub-tensors.
* **Mathematical Operations:** PyTorch offers a vast collection of mathematical functions operating directly on tensors (element-wise, linear algebra, etc.).
* **Broadcasting:**  Tensors support broadcasting, allowing operations on tensors of different but compatible shapes.
* **Shape Manipulation:**  You can reshape, transpose, and concatenate tensors to fit your needs.
* **GPU Operations:** If a GPU is available, you can move tensors to the GPU using `.to('cuda')` for accelerated computation.
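
The key operations listed above can be sketched in a few lines. The shapes and variable names are illustrative, and the device selection simply falls back to the CPU when no GPU is present.

```python
import torch

# Creation
zeros = torch.zeros(2, 3)             # 2x3 tensor of zeros (float32 by default)
ones = torch.ones(3)                  # length-3 vector of ones

# Broadcasting: shapes (2, 3) and (3,) combine to (2, 3)
summed = zeros + ones

# Shape manipulation
grid = torch.arange(6).reshape(2, 3)       # [[0, 1, 2], [3, 4, 5]]
transposed = grid.t()                      # shape (3, 2)
stacked = torch.cat([grid, grid], dim=0)   # shape (4, 3)

# Indexing and slicing: first column of every row
first_column = grid[:, 0]

# Pick the GPU when one is available, otherwise stay on the CPU.
device = "cuda" if torch.cuda.is_available() else "cpu"
on_device = summed.to(device)

print(summed.shape, transposed.shape, stacked.shape)
print(first_column)
```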

**Advantages of PyTorch Tensors:**

* **Intuitive:** Pythonic syntax for easy learning and usage.
* **Flexible:** Dynamic graph construction for greater control and experimentation.
* **Powerful:**  GPU acceleration and automatic differentiation for efficient deep learning.
* **Widely Adopted:** PyTorch is a popular framework in the machine learning community.





In [None]:

import torch

# Create a tensor
x = torch.tensor([[1, 2], [3, 4]])

# Perform element-wise addition
y = x + 1
print(y)

# Matrix multiplication
z = torch.matmul(x, x.t())  # t() for transpose
print(z)

# Move to GPU (if available)
if torch.cuda.is_available():
    x = x.to('cuda')
    print(x)


tensor([[2, 3],
        [4, 5]])
tensor([[ 5, 11],
        [11, 25]])


## Core Differences between NumPy and PyTorch Tensors
****

1. **GPU Acceleration:**

   * **PyTorch Tensors:** A major advantage of PyTorch tensors is their seamless integration with GPUs. This enables you to perform massively parallel computations, drastically speeding up deep learning and other computationally intensive tasks.
   * **NumPy Arrays:** NumPy arrays operate on the CPU. NumPy itself has no built-in GPU support, although separate libraries such as CuPy offer a NumPy-like interface on GPUs.

2. **Automatic Differentiation:**

   * **PyTorch Tensors:** Tensors are designed with built-in support for automatic differentiation (autograd), a crucial feature for training neural networks. PyTorch automatically tracks the operations performed on tensors, allowing it to compute gradients efficiently during backpropagation.
   * **NumPy Arrays:** NumPy doesn't have built-in automatic differentiation. While there are libraries like Autograd that can add this capability, it's not a core feature of NumPy.

3. **Dynamic vs. Static Computation Graphs:**

   * **PyTorch Tensors:** PyTorch employs a dynamic computational graph. The graph is built on the fly as your code executes, offering greater flexibility for control flow, debugging, and experimenting with models.
   * **NumPy Arrays:** NumPy focuses on array operations without explicitly building or managing a computational graph.

4. **Deep Learning Integration:**

   * **PyTorch Tensors:**  PyTorch tensors are the cornerstone of the PyTorch deep learning framework. They seamlessly integrate with neural network layers, loss functions, optimizers, and other core components.
   * **NumPy Arrays:** While NumPy provides the essential numerical foundation for many deep learning libraries, it doesn't directly offer the higher-level abstractions for building and training neural networks.
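
To make the automatic differentiation point (item 2) concrete, here is a minimal sketch using a simple function whose gradient is easy to verify by hand. The function and values are illustrative.

```python
import torch

# requires_grad=True asks PyTorch to track operations on x.
x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)

# y = x1^2 + x2^2 + x3^2; the graph is recorded as this line runs.
y = (x ** 2).sum()

# Backpropagation fills x.grad with dy/dx = 2 * x.
y.backward()

print(y.item())  # 14.0
print(x.grad)    # tensor([2., 4., 6.])
```

NumPy has no equivalent of `requires_grad` or `backward()`; the same gradient would have to be derived and coded by hand or computed with an additional library.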

**When to Choose Which**

* **PyTorch Tensors:** Ideal for deep learning tasks due to their GPU acceleration, automatic differentiation, and dynamic graph capabilities. They offer a natural fit within the PyTorch ecosystem.
* **NumPy Arrays:** Excellent for general scientific computing, numerical analysis, and data manipulation tasks. They are widely supported across the scientific Python stack.

**Interoperability**

* **Conversion:** You can easily convert between NumPy arrays and PyTorch tensors using `torch.from_numpy()` and `.numpy()`. This allows you to leverage the strengths of both libraries when needed.
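
A short sketch of the conversion in both directions follows. One detail worth knowing is that `torch.from_numpy()` and `.numpy()` share memory with the original data for CPU tensors, whereas `torch.tensor()` makes a copy. The variable names are illustrative.

```python
import numpy as np
import torch

arr = np.array([1.0, 2.0, 3.0])

shared = torch.from_numpy(arr)   # shares memory with arr, keeps float64
copied = torch.tensor(arr)       # independent copy

# Modifying the array is visible through the shared tensor only.
arr[0] = 10.0
print(shared[0].item())  # 10.0
print(copied[0].item())  # 1.0

# Back to NumPy; this also shares memory for CPU tensors.
back = shared.numpy()
print(np.shares_memory(arr, back))  # True
```

Tensors that live on a GPU must be moved back with `.cpu()` before calling `.numpy()`.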

**In Summary:**

Both NumPy arrays and PyTorch tensors are vital tools for scientific computing in Python.  

* PyTorch tensors excel in deep learning tasks due to their GPU capabilities and automatic differentiation.
* NumPy arrays are more general-purpose and perfect for numerical operations and data analysis.

Choose the one that best suits your specific task and leverage their interoperability when required.


**How Types Are Handled in NumPy Arrays and PyTorch Tensors**
****

**NumPy Arrays**

* **Homogeneity:** NumPy arrays are fundamentally designed to be homogeneous, where all elements within an array share the same data type (e.g., `int32`, `float64`). This facilitates optimized memory layout and computation.
* **Type Inference:** When creating an array, NumPy infers the data type from the provided data. If you give it a mix of integers and floats, it chooses `float64` so that every value can be represented.
* **Explicit `dtype`:** You have the control to specify the data type during array creation using the `dtype` argument.
* **Type Casting/Promotion:** NumPy handles operations between arrays with differing types through *promotion*: both operands are converted to a common type that can hold either of them (e.g., `int64` with `float64` gives `float64`). Conversions that could lose information require explicit *casting* with `astype()`.

**PyTorch Tensors**

* **Similar Homogeneity:** Like NumPy arrays, PyTorch tensors are homogeneous: all elements of a tensor share the same data type.
* **Default `float32`:** Floating-point data defaults to `float32`, which suits many machine learning tasks. Integer data defaults to `int64`.
* **`dtype` Control:**  You can explicitly specify the data type using the `dtype` argument when creating a tensor.
* **Type Promotion:**  PyTorch promotes operands of different types to a common type based on their category (boolean, integer, floating point, complex). An integer tensor combined with a `float32` tensor yields `float32`, so very large integers can lose precision.
* **Automatic Type Conversion:** This promotion happens automatically in most element-wise operations. For example, if you add an integer tensor to a float tensor, the integer values are converted to float before the addition.

**Key Distinctions:**

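* **Strictness:** Neither library is uniformly stricter. NumPy silently coerces mixed inputs at creation (for example, `np.array([1, 'a'])` becomes a string array), whereas some PyTorch operations, such as `torch.matmul`, require both operands to share a data type and raise an error otherwise.
* **Flexibility:** Both libraries promote types automatically in element-wise arithmetic, but the results can differ: adding an `int64` array to a `float32` array yields `float64` in NumPy and `float32` in PyTorch. This is convenient but requires you to be mindful of potential precision loss or unintended behavior.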
* **GPU Compatibility:** A defining feature of PyTorch tensors is their ability to reside and perform computations on GPUs. NumPy arrays, while powerful for CPU-based tasks, lack this native GPU support.


**Remember:**

* Both NumPy and PyTorch allow you to check and change data types using `dtype` and functions like `astype()` (NumPy) or `to()` (PyTorch).
* Always be aware of potential type-related issues (like unintended integer division or overflow) when performing operations on arrays/tensors with mixed data types.
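
One way to check how each library treats mixed types is to inspect the resulting `dtype` directly. The following is a minimal sketch with illustrative values, not an exhaustive comparison of the promotion rules.

```python
import numpy as np
import torch

# Default floating-point types differ between the two libraries.
print(np.array([1.5]).dtype)      # float64
print(torch.tensor([1.5]).dtype)  # torch.float32

# Same operands, different promotion results.
np_result = np.array([1, 2], dtype=np.int64) + np.array([0.5, 0.5], dtype=np.float32)
pt_result = torch.tensor([1, 2], dtype=torch.int64) + torch.tensor([0.5, 0.5], dtype=torch.float32)
print(np_result.dtype)  # float64
print(pt_result.dtype)  # torch.float32

# Explicit conversion: astype() in NumPy, to() in PyTorch.
np_as_f32 = np_result.astype(np.float32)
pt_as_f64 = pt_result.to(torch.float64)
print(np_as_f32.dtype, pt_as_f64.dtype)
```

Checking dtypes this way before passing data between the two libraries helps avoid surprises, for example a `float64` NumPy array feeding a model that expects `float32`.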




In [None]:
# Numpy

import numpy as np

arr1 = np.array([1, 2, 3])  # int64
arr2 = np.array([1.5, 2.5])  # float64

# Type Promotion: Result is float64 to accommodate all values
result = arr1[:2] + arr2  # [2.5 4.5]

# Torch

import torch

tensor1 = torch.tensor([1, 2, 3])  # int64
tensor2 = torch.tensor([1.5, 2.5])  # float32

# Type promotion: int64 + float32 gives float32 in PyTorch
result = tensor1[:2] + tensor2  # tensor([2.5000, 4.5000])