# Vectors in NumPy

A **vector space** is a tuple $(E,\mathbb{K}, \oplus, \odot)$, where $\mathbb{K}$ is a field of scalars and $E$ is a non-empty set ($E \neq \emptyset$), equipped with two operations: $\oplus$, an internal addition on $E$, and $\odot$, an external scalar multiplication $\mathbb{K} \times E \to E$. The operations must satisfy the following axioms for all $x,y,z \in E$ and all $\alpha,\beta \in \mathbb{K}$:

1. *Commutative addition*: $x \oplus y = y \oplus x$
2. *Associative addition*: $(x \oplus y) \oplus z  = x \oplus (y \oplus z)$
3. *Zero element*: There exists a unique element in $E$, denoted by $0_{E}$ and called the zero of the space, such that: $x \oplus 0_{E} = x$
4. *Opposite element*: For every element $x \in E$ there exists a unique element $(-x)$ called the opposite of $x$ which satisfies: $x \oplus (-x) = 0_{E}$
5. *Distributivity of scalar multiplication over vector addition*: $\alpha \odot (x \oplus y) = \alpha \odot x \oplus \alpha \odot y$
6. *Distributivity of scalar multiplication over scalar addition*: $(\alpha + \beta) \odot x = \alpha \odot x \oplus \beta \odot x$
7. *Associativity of scalar multiplication*: $(\alpha \beta) \odot x = \alpha \odot (\beta \odot x)$
8. *Scalar identity*: $1 \odot x = x$, where $1 \in \mathbb{K}$
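
The axioms above can be spot-checked numerically for arrays of real numbers. The following is a minimal sketch, not a proof: it tests one random sample in $\mathbb{R}^4$, and the names `x`, `y`, `z`, `alpha`, `beta` and `axiom_checks` are illustrative choices. `np.allclose` is used instead of exact equality because floating-point arithmetic introduces rounding error.

```python
import numpy as np

rng = np.random.default_rng(0)  # seeded generator so the run is reproducible
x, y, z = rng.random((3, 4))    # three sample vectors in R^4 (rows of a 3x4 array)
alpha, beta = 2.0, -0.5         # two sample real scalars
zero = np.zeros(4)              # candidate zero element of the space

# Each entry pairs an axiom with a numerical check on the sample above.
axiom_checks = {
    "commutative addition": np.allclose(x + y, y + x),
    "associative addition": np.allclose((x + y) + z, x + (y + z)),
    "zero element": np.allclose(x + zero, x),
    "opposite element": np.allclose(x + (-x), zero),
    "distributivity over vector addition": np.allclose(alpha * (x + y), alpha * x + alpha * y),
    "distributivity over scalar addition": np.allclose((alpha + beta) * x, alpha * x + beta * x),
    "associativity of scalar multiplication": np.allclose((alpha * beta) * x, alpha * (beta * x)),
    "scalar identity": np.allclose(1.0 * x, x),
}

for name, holds in axiom_checks.items():
    print(f"{name}: {holds}")
```

Here `+` plays the role of $\oplus$ and `*` between a scalar and an array plays the role of $\odot$.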

**NumPy** is a Python library for efficient handling of multidimensional arrays and linear algebra operations. Arrays are instances of the `numpy.ndarray` class and are usually created with the `numpy.array` function. With them it is possible to represent vectors and perform basic operations (such as addition, scalar multiplication, and the dot product) in a way that closely follows their definition in linear algebra.

In [None]:
import time 
import numpy as np 
import pandas as pd

## Vectors of One Dimension and Two Dimensions in Vector Spaces

A one-dimensional array, such as `np.array([1.0, 2.0, 3.0])`, represents an element of the 3-dimensional real vector space $\mathbb{R}^3$. Here "one-dimensional" refers to the array having a single axis, not to the dimension of the space. Each component corresponds to a coordinate in that space. These vectors can be added and multiplied by scalars following the rules of linear algebra.

A two-dimensional array, or matrix, such as `np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])`, can be interpreted either as a collection of three vectors from $\mathbb{R}^2$ (one per row) or as a single element of the space of $3 \times 2$ real matrices. Under component-wise addition and scalar multiplication, both structures satisfy the properties of a vector space.

## Explanation of NumPy Functions

- `np.ones(X)`: Generates a vector (or array) of length `X` filled with the number 1. It is useful for initialization and for representing the constant vector of ones.
- `np.zeros(X)`: Generates a vector (or array) of length `X` filled with zeros. The zero vector is the neutral element of vector addition in any vector space.
- `np.arange(a,b,c)`: Creates a vector with numbers starting at `a`, ending before `b`, and advancing in steps of `c`. It is useful for generating arithmetic sequences.
- `np.random.rand(X)`: Generates a vector of length `X` with random numbers drawn uniformly from the half-open interval $[0, 1)$, useful for simulating experiments or picking arbitrary vectors.


In [None]:
vector_list = np.array([1.0, 2.0, 3.0]) 
vector_matrix = np.array( 
  [
    [1.0, 2.0], 
    [3.0, 4.0], 
    [5.0, 6.0]
  ]
)
vector_ones = np.ones(4) 
vector_zeros = np.zeros(4)
vector_range = np.arange(2, 11, 2)
vector_random = np.random.rand(4) 

print("1D Vector:", vector_list)
print("2D Vector (Matrix):\n", vector_matrix)
print("Vector of Ones:", vector_ones)
print("Vector of Zeros:", vector_zeros)
print("Vector with Range:", vector_range)
print("Random Vector:", vector_random)

## Types of Vectors According to the Vector Space in NumPy

Vectors in NumPy can be defined over various numeric sets, which correspond to different possible vector spaces depending on the nature of their elements. The type of a vector's components is given by its `dtype` (data type). Common data types include:

- **Integers (`int`)**: These represent vectors whose elements are integers of varying bit-widths, such as `int8`, `int16`, `int32`, and `int64`.
- **Unsigned Integers (`uint`)**: Non-negative integers like `uint8`, `uint16`, `uint32`, and `uint64`.
- **Floating-point numbers (`float`)**: Finite-precision approximations of real numbers, commonly `float16`, `float32`, and `float64`.
- **Complex numbers (`complex`)**: Numbers with real and imaginary parts, typically `complex64` and `complex128`.
- **Boolean (`bool`)**: Vectors whose elements are boolean values (`True` or `False`).

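Each `dtype` fixes the kind of element stored in the vector, and therefore the set of scalars the vector models: `float` types approximate $\mathbb{R}$ and `complex` types approximate $\mathbb{C}$, both of which are fields. Integer and boolean types do not form fields under ordinary arithmetic (most integers have no multiplicative inverse within the integers), so vectors with those types satisfy the vector space axioms only in a restricted sense. NumPy lets you choose any of these types depending on the application.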
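
Because every `dtype` has a fixed width, arithmetic on arrays can differ from arithmetic on the mathematical set the type is named after. The sketch below illustrates three such effects with arbitrary example values: integer overflow, type promotion when dtypes are mixed, and floating-point rounding. The variable names are illustrative only.

```python
import numpy as np

# Fixed-width integers wrap around on overflow in array arithmetic.
small = np.array([127], dtype=np.int8)           # 127 is the largest int8 value
wrapped = small + np.array([1], dtype=np.int8)   # wraps to -128 instead of 128
print(wrapped, wrapped.dtype)

# Mixing dtypes promotes the result to a type able to hold both operands.
ints = np.array([1, 2, 3], dtype=np.int32)
floats = np.array([0.5, 0.5, 0.5], dtype=np.float64)
mixed = ints + floats
print(mixed, mixed.dtype)

# Floating-point values only approximate real numbers, so compare with a tolerance.
exact_equal = (0.1 + 0.2 == 0.3)
close_enough = bool(np.isclose(0.1 + 0.2, 0.3))
print(exact_equal, close_enough)
```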

In [None]:
vector_int = np.array([1, 2, 3], dtype=np.int32)
vector_float64 = np.array([1.0, 2.0, 3.0], dtype=np.float64)
vector_complex = np.array([1 + 2j, 3 - 1j, 5 + 0j], dtype=complex)
vector_bool = np.array([True, False, True], dtype=np.bool_)

print("Type Integer Vector:", vector_int)
print("Type Float Vector:", vector_float64)
print("Type Complex Vector:", vector_complex)
print("Type Boolean Vector:", vector_bool)

## Properties of a Vector in NumPy

A vector in NumPy offers several important properties including:

- *Size* (`size`): The total number of elements contained in the vector.
- *Shape* (`shape`): A tuple indicating the dimensions of the vector, such as the length for a 1D vector, or rows and columns for a matrix.
- *Number of Dimensions* (`ndim`): The number of axes of the array (`ndim` is 1 for a vector and 2 for a matrix).
- *Data Type* (`dtype`): The data type of the elements contained in the vector.
- *Number of Bytes* (`nbytes`): The total memory used by the vector in bytes.

These properties allow you to inspect the structure and type characteristics of vectors (and arrays in general) in NumPy. By using these same properties, it is also possible to create new vectors and matrices with desired characteristics.
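
For comparison with the one-dimensional case, the following sketch inspects the same properties on a two-dimensional array and reuses them to build new arrays. The $3 \times 2$ matrix is an arbitrary example, and `same_shape_zeros` and `same_shape_ones` are illustrative names.

```python
import numpy as np

matrix = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])

# Three rows and two columns of 8-byte floats: 6 elements, 48 bytes in total.
print(matrix.shape, matrix.ndim, matrix.size, matrix.dtype, matrix.nbytes)

# Build new arrays that reuse the shape and dtype of an existing one.
same_shape_zeros = np.zeros(matrix.shape, dtype=matrix.dtype)
same_shape_ones = np.ones_like(matrix)  # copies both shape and dtype from matrix

print(same_shape_zeros.shape, same_shape_ones.dtype)
```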

In [None]:
vector = np.array([1, 2, 3, 4, 5])
vector_properties = pd.DataFrame(
  {
    "Properties": [
      "Vector",
      "Size",
      "Shape",
      "Dimensions",
      "Data Type",
      "Bytes"
    ],
    "Values": [
      vector,
      vector.size,
      vector.shape,
      vector.ndim,
      str(vector.dtype),
      vector.nbytes
    ]
  }
)

print(vector_properties)

# using vector properties to create zeros vector, ones vector, and identity matrix
dimension = vector.size
vector_zeros = np.zeros(dimension)
vector_ones = np.ones(dimension)
vector_identity = np.eye(dimension)

print("Zeros Vector:", vector_zeros)
print("Ones Vector:", vector_ones)
print("Identity Matrix:\n", vector_identity)


## Basis of a Vector Space. Canonical Basis

**Basis**: A system of vectors of a vector space $E$ over $\mathbb{K}$ that is both a generating system and linearly independent.

The canonical basis is the set formed by the unit vectors whose only non-zero component is a 1 in a specific position and zeros in the others. In notation, in $\mathbb{R}^n$:
$$B = \{ e_1 = (1, 0, \dots, 0),\ e_2 = (0, 1, 0, \dots, 0),\ \dots,\ e_n = (0, 0, \dots, 1) \}$$

The canonical basis is linearly independent and generates the entire vector space, so every vector in the space can be uniquely expressed as a linear combination of its elements. It is also called the **standard basis** or **usual basis**. Each $e_i$ has a 1 in the $i$-th position and 0 elsewhere, which makes the basis orthonormal under the usual dot product in $\mathbb{R}^n$.

For example, in $\mathbb{R}^3$, the canonical basis is:
$$\{ e_1 = (1,0,0), e_2 = (0,1,0), e_3 = (0,0,1) \}$$
and any vector $v = (x,y,z)$ can be uniquely decomposed as:
$$v = x e_1 + y e_2 + z e_3$$
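
The decomposition above can be sketched in NumPy by taking the rows of the identity matrix as the canonical basis. The vector `v` is an arbitrary example, and `basis`, `reconstructed` and `gram` are illustrative names.

```python
import numpy as np

v = np.array([2.0, -1.0, 4.0])   # arbitrary example vector (x, y, z) in R^3
basis = np.eye(3)                # row i of the identity matrix is e_{i+1}

# Rebuild v as the linear combination x*e_1 + y*e_2 + z*e_3.
reconstructed = sum(coef * e for coef, e in zip(v, basis))

# Orthonormality: the matrix of pairwise dot products of the basis is the identity.
gram = basis @ basis.T

print(reconstructed)
print(np.array_equal(reconstructed, v), np.array_equal(gram, np.eye(3)))
```

In the canonical basis the coefficients of the combination are simply the components of `v`, which is why no system of equations has to be solved.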


In [None]:
# basis of a vector space: canonical basis
for i in range(dimension):
  e_i = np.zeros(dimension)
  e_i[i] = 1
  print(f"e_{i+1} = {e_i}")

## Vector Operations

The basic operations on vectors in a vector space are: addition, scalar multiplication, and dot product. Addition and scalar multiplication satisfy the property that the result is a vector belonging to the same vector space.

- *Vector Addition*: Given two vectors $v, w$ in a vector space $V$, their sum $v + w$ is also in $V$. Addition is performed component-wise.
- *Scalar Multiplication*: Given a scalar $\alpha$ and a vector $v$, the vector $\alpha v$ belongs to $V$. Each component of the vector is multiplied by $\alpha$.
- *Dot Product*: This is an inner product that results in a scalar, not a vector. However, it is a fundamental operation for measuring angles and projections in a vector space.
- *Linear Combination*: Expressed as $\alpha_1 v_1 + \alpha_2 v_2 + \cdots + \alpha_n v_n$, it is the sum of scaled vectors resulting in another vector in $V$.
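
As a small illustration of how the dot product measures angles and projections, the sketch below uses the standard formulas $\cos\theta = \frac{v \cdot w}{\lVert v \rVert \lVert w \rVert}$ and $\mathrm{proj}_w(v) = \frac{v \cdot w}{w \cdot w}\, w$. The vectors `v` and `w` are arbitrary examples chosen so that the result is easy to verify by hand.

```python
import numpy as np

v = np.array([3.0, 4.0])
w = np.array([1.0, 0.0])

# cos(theta) = (v . w) / (|v| |w|); here 3 / (5 * 1) = 0.6
cos_theta = np.dot(v, w) / (np.linalg.norm(v) * np.linalg.norm(w))
# Clip guards against values slightly outside [-1, 1] caused by rounding.
angle_deg = np.degrees(np.arccos(np.clip(cos_theta, -1.0, 1.0)))

# Orthogonal projection of v onto w: ((v . w) / (w . w)) * w
projection = (np.dot(v, w) / np.dot(w, w)) * w
# What remains after removing the projection is orthogonal to w.
residual = v - projection

print(cos_theta, angle_deg)
print(projection, np.dot(residual, w))
```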

In [None]:
v1 = np.random.rand(5)
v2 = np.random.rand(5)
scalar = np.random.rand(1)[0]

print(f"""Information 
Vector v1: {v1}
Vector v2: {v2}
Scalar: {scalar}
""")

vector_sum = v1 + v2
vector_scalar_mult = scalar * v1
vector_dot_product = np.dot(v1, v2)
vector_linear_combination = 2.0 * v1 + 3.0 * v2

print(f"""Results
Sum: {vector_sum}
Scalar Multiplication: {vector_scalar_mult}
Dot Product: {vector_dot_product}
Linear Combination: {vector_linear_combination}
  where alpha_1 = 2.0, alpha_2 = 3.0
""")

# check that the results have the expected shapes (vectors stay in the same space, dot product is a scalar)
assert vector_sum.shape == v1.shape, "Sum vector shape mismatch"
assert vector_scalar_mult.shape == v1.shape, "Scalar multiplication vector shape mismatch"
assert vector_dot_product.shape == (), "Dot product result should be a scalar"
assert vector_linear_combination.shape == v1.shape, "Linear combination vector shape mismatch"

## Rank of a Matrix Formed by a Set of Vectors

The rank of a matrix (also called the rank of a set of vectors) is the dimension of the vector space generated by the rows (or columns) of the matrix. In other words, the rank indicates the maximum number of linearly independent vectors within the given set.

## Linear Independence vs. Linear Dependence

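Let $E$ be a vector space over $\mathbb{K}$, and let $S = \{ v_{1},\dots,v_{n} \}$ be a system of vectors in $E$. The system $S$ is said to be **linearly dependent** if at least one of its vectors is a linear combination of the others:
$$\exists\, j \in \{1,\dots,n\},\ \exists\, \alpha_{i} \in \mathbb{K}\ (i \neq j) : \quad v_{j} = \sum_{i \neq j} \alpha_{i} v_{i}$$
Otherwise, the system is said to be **linearly independent**.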
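
As a counterpart to an independent system, here is a minimal sketch of a linearly dependent one. The matrix `dependent` is an arbitrary example whose second row is twice the first, so only two of its three rows are linearly independent.

```python
import numpy as np

dependent = np.array(
    [
        [1, 2, 3],
        [2, 4, 6],   # equals 2 * first row, so it adds no new direction
        [0, 1, 1],
    ]
)

dependent_rank = np.linalg.matrix_rank(dependent)
is_dependent = bool(dependent_rank < dependent.shape[0])

print("Rank:", dependent_rank)
print("Linearly dependent:", is_dependent)
```

The rank is smaller than the number of vectors, which is the numerical signature of linear dependence.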


In [None]:
vectors = np.array(
  [
    [1, 0, 1],
    [0, 1, 1],
    [1, 1, 0]
  ]
)
print(vectors)

rank = np.linalg.matrix_rank(vectors)
print("Rank of the vector matrix:", rank)
print(f"N vectors: {vectors.shape[0]}")
print(f"N dimensions: {vectors.shape[1]}")

# linear independence: the rows are independent exactly when the rank equals the number of rows
if rank == vectors.shape[0]:
  print("The vectors are linearly independent.")
else:
  print("The vectors are linearly dependent.")

## Comparison of Execution Time of Vector Operations Using NumPy and Native Python (Lists)

In [None]:
lambda_np_sum = lambda v1, v2: v1 + v2
lambda_np_scalar_mult = lambda scalar, v: scalar * v

lambda_py_sum = lambda v1, v2: [x + y for x, y in zip(v1, v2)]
lambda_py_scalar_mult = lambda scalar, v: [scalar * x for x in v]

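def measure_time(func, *args, **kwargs):
  # perf_counter is a monotonic, high-resolution clock intended for timing short intervals
  t_start = time.perf_counter()
  result = func(*args, **kwargs)
  t_end = time.perf_counter()
  elapsed_time = t_end - t_start
  return result, elapsed_time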

dimensions = [100, 1000, 10000, 50000, 100000]
benchmark_results = {
  'dimensions': dimensions,
  'np_sum': [], 'py_sum': [],
  'np_scalar_mult': [], 'py_scalar_mult': []
}

n_samples = 100
scalar = np.random.rand(1)[0]

for dim in dimensions:
  np_v1 = np.random.rand(dim)
  np_v2 = np.random.rand(dim)
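  # tolist() yields native Python floats, so the list benchmark does not pay for NumPy scalar objects
  py_v1 = np_v1.tolist()
  py_v2 = np_v2.tolist()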
  
  time_np_sum = sum(measure_time(lambda_np_sum, np_v1, np_v2)[1] for _ in range(n_samples)) / n_samples
  time_py_sum = sum(measure_time(lambda_py_sum, py_v1, py_v2)[1] for _ in range(n_samples)) / n_samples
  time_np_scalar_mult = sum(measure_time(lambda_np_scalar_mult, scalar, np_v1)[1] for _ in range(n_samples)) / n_samples
  time_py_scalar_mult = sum(measure_time(lambda_py_scalar_mult, scalar, py_v1)[1] for _ in range(n_samples)) / n_samples
  
  # save results
  benchmark_results['np_sum'].append(time_np_sum)
  benchmark_results['py_sum'].append(time_py_sum)
  benchmark_results['np_scalar_mult'].append(time_np_scalar_mult)
  benchmark_results['py_scalar_mult'].append(time_py_scalar_mult)

df = pd.DataFrame(benchmark_results)
df.set_index('dimensions', inplace=True)
print(df)
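
As an alternative way to take such measurements, the standard library's `timeit` module repeats a call many times and reports the total time per run. The sketch below is one possible setup, not a definitive benchmark: the vector length `dim` and the `number` and `repeat` values are arbitrary assumptions, and absolute timings depend on the machine.

```python
import timeit
import numpy as np

dim = 10_000
np_v1, np_v2 = np.random.rand(dim), np.random.rand(dim)
py_v1, py_v2 = np_v1.tolist(), np_v2.tolist()   # native Python floats

number = 100   # calls per timing run
repeat = 5     # independent timing runs; the minimum is usually the least noisy

# Each entry of the returned list is the total time of `number` calls.
np_runs = timeit.repeat(lambda: np_v1 + np_v2, number=number, repeat=repeat)
py_runs = timeit.repeat(
    lambda: [x + y for x, y in zip(py_v1, py_v2)], number=number, repeat=repeat
)

np_best = min(np_runs) / number   # best average time per call, in seconds
py_best = min(py_runs) / number

print(f"NumPy sum:  {np_best:.2e} s per call")
print(f"Python sum: {py_best:.2e} s per call")
```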