Information on data in Python that may be useful:
Users of Python are often drawn in by its ease of use, one piece of which is dynamic typing. While a statically typed language like C or Java requires the type of each variable to be explicitly declared, a dynamically typed language like Python skips this specification.

The standard Python implementation is written in C. This means that every Python object is simply a cleverly-disguised C structure, which contains not only its value, but other information as well.

Because of Python's dynamic typing, we can even create heterogeneous lists, but this flexibility comes at a cost: to allow these flexible types, each item in the list must contain its own type info, reference count, and other information–that is, each item is a complete Python object. 
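To make the cost described above concrete, here is a minimal sketch comparing a heterogeneous list with a fixed-type NumPy array. The variable names (`mixed`, `arr`) are made up for illustration, and the exact byte count reported by `sys.getsizeof` depends on the Python build, so the code only checks that it exceeds the raw 8-byte value.

```python
import sys
import numpy as np

# A heterogeneous list: every element is a full Python object with its own type.
mixed = [True, "2", 3.0, 4]
element_types = [type(item).__name__ for item in mixed]
print(element_types)  # ['bool', 'str', 'float', 'int']

# A NumPy array stores a single dtype for all of its elements instead.
arr = np.array([1, 2, 3, 4], dtype=np.int64)
print(arr.dtype, arr.itemsize)  # int64 8

# A Python int carries object overhead beyond the 8 bytes of its value
# (the exact figure depends on the interpreter build).
has_overhead = sys.getsizeof(1) > arr.itemsize
print(has_overhead)
```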

In [28]:
import numpy as np 

#integer array
np.array([1,4,2,5,3])

#set data type of array
np.array([1,2,3,4,5], dtype='float32')

#Unlike Python lists, NumPy arrays can be explicitly multi-dimensional
#Initialize a multi-dimensional array from nested sequences: each number in the list [2,5,6] is the start of a three-element row
np.array([range(i, i+3) for i in [2,5,6]])


array([[2, 3, 4],
       [5, 6, 7],
       [6, 7, 8]])

In [29]:
#More efficient to create arrays from scratch using NumPy functions

#create 3x5 floating-point array filled with ones
np.ones((3,5), dtype=float)

#create a length-10 integer array filled with zeros
np.zeros(10, dtype=int)

#create 3x5 array filled with 3.14
np.full((3,5), 3.14)

# Create an array filled with a linear sequence
# Starting at 0, ending at 20, stepping by 2
# (this is similar to the built-in range() function)
np.arange(0, 20, 2)

# Create an array of five values evenly spaced between 0 and 1
np.linspace(0, 1, 5)

# Create a 3x3 array of uniformly distributed
# random values between 0 and 1
np.random.random((3, 3))

# Create a 3x3 array of normally distributed random values
# with mean 0 and standard deviation 1
np.random.normal(0, 1, (3, 3))

# Create a 3x3 array of random integers in the interval [0, 10)
np.random.randint(0, 10, (3, 3))

array([[9, 9, 8],
       [9, 3, 9],
       [1, 5, 9]])

Array Attributes
Attributes of arrays: Determining the size, shape, memory consumption, and data types of arrays
Indexing of arrays: Getting and setting the value of individual array elements
Slicing of arrays: Getting and setting smaller subarrays within a larger array
Reshaping of arrays: Changing the shape of a given array
Joining and splitting of arrays: Combining multiple arrays into one, and splitting one array into many
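As a small sketch of the first item in the list above, the following reads the basic attributes of an array. The array `arr` is made up for illustration, and its dtype is set explicitly so that the byte counts do not depend on the platform.

```python
import numpy as np

# A 3x4 array of 64-bit integers (hypothetical example data).
arr = np.arange(12, dtype=np.int64).reshape(3, 4)

print(arr.ndim)      # number of dimensions: 2
print(arr.shape)     # size of each dimension: (3, 4)
print(arr.size)      # total number of elements: 12
print(arr.dtype)     # data type of the elements: int64
print(arr.itemsize)  # bytes per element: 8
print(arr.nbytes)    # total bytes, itemsize * size: 96
```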

random.seed() is a function in the Python random module that is used to initialize the random number generator with a specific seed value. The seed value determines the initial state of the random number generator, which in turn influences the sequence of random numbers generated.

Here's how random.seed() works:

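Setting the Seed: By calling random.seed(), you provide a seed value as an argument. The seed is typically an integer; the standard library also accepts a string, bytes, or a float. If no seed is given, the generator is seeded from the current time or an operating-system randomness source. Note that np.random.seed() seeds NumPy's own, separate generator and expects an integer (or an array of integers).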

Deterministic Behavior: Setting the seed value ensures that the sequence of random numbers generated by the random number generator is reproducible. If you run the same code with the same seed value, you'll get the same sequence of random numbers every time.

Statistical randomness: Setting the seed makes the sequence reproducible, while the numbers still look statistically random. The generator is deterministic, however: anyone who knows the seed and the algorithm can regenerate the exact sequence, so a seeded generator should not be used where unpredictability matters (for example, for security purposes).
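The reproducibility described above can be checked with a short sketch. The seed value 42 and the variable names are arbitrary choices for illustration. The standard-library generator and NumPy's generator are seeded separately.

```python
import random
import numpy as np

# Standard library: the same seed gives the same sequence.
random.seed(42)
first = [random.randint(0, 9) for _ in range(5)]
random.seed(42)
second = [random.randint(0, 9) for _ in range(5)]
print(first == second)  # True

# NumPy keeps its own generator state, seeded with np.random.seed().
np.random.seed(42)
a = np.random.randint(0, 10, 5)
np.random.seed(42)
b = np.random.randint(0, 10, 5)
print(np.array_equal(a, b))  # True
```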

In [None]:
#Define three random arrays: one-dimensional, two-dimensional, three-dimensional

import numpy as np 

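np.random.seed(0) #seed the generator for reproducibility; the value 0 is an arbitrary choice

x1 = np.random.randint(0, 10, 10) #1D, ten elements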
x2 = np.random.randint(0, 10, (3, 4)) #2D
x3 = np.random.randint(0, 10, (3, 4, 5)) #3D

#each array has attributes
x2.shape #size of each dimension
x2.ndim #number of dimensions 
x2.size #total size of array 

#itemsize gives the size (in bytes) of each array element, and nbytes gives the total size (in bytes) of the array
#array indexing: x1[i] returns the element at index i, counting from zero
#negative indexing can be used to index from the end of the array
x1[-2]

#array slicing: 
#x[start:stop:step], defaulting to the values start=0, stop=size of dimension, step=1
x1[:3] #first three elements
x1[3:] #elements from index 3 onward
x1[1:4] #middle subarray
x1[::2] #every second element
x1[1::2] #every second element, starting at index 1
x1[::-1] #all elements, reversed
x1[4::-2] #every second element from index 4, reversed

#multidimensional subarrays
x2[:2, :3] #first two rows & three columns
x2[:3, ::2] #three rows, every second column
x2[::-1, ::-1] #all rows and columns, reversed 

#accessing single rows or columns of an array via combining indexing and slicing 
x2[:, 0] #first column of x2
#x2[, 0] is a SyntaxError: the slice can be omitted only for trailing dimensions, so the shorthand works for rows but not for columns
x2[0] # shorthand for x2[0, :]

#NumPy arrays support the concept of views, which are alternative array representations that share the same data buffer with the original array. These views do not create a new copy of the data but provide a different way to access and manipulate the underlying data.

#If we modify subarrays created from a view, the original data will change

In [55]: x2
Out[55]:
array([[1, 7, 7, 0],
       [0, 1, 0, 9],
       [6, 8, 4, 3]])

In [56]: x2_sub = x2[:2, :2]

In [57]: x2_sub
Out[57]:
array([[1, 7],
       [0, 1]])

In [58]: x2_sub[0, 0] = 0

In [59]: x2
Out[59]:
array([[0, 7, 7, 0],
       [0, 1, 0, 9],
       [6, 8, 4, 3]])


#On the other hand, copying an array in NumPy creates a new array object with its own separate data buffer. Any modifications made to the copied array do not affect the original array, and vice versa. 
x2_sub_copy = x2[:2, :2].copy()





(3, 4)
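The difference between a view and a copy can be verified directly. The sketch below rebuilds the x2 array shown above by hand and uses np.shares_memory to check whether two arrays share a data buffer; the names `view` and `copy` and the values 42 and 99 are illustrative.

```python
import numpy as np

x2 = np.array([[1, 7, 7, 0],
               [0, 1, 0, 9],
               [6, 8, 4, 3]])

view = x2[:2, :2]          # a slice shares memory with x2
copy = x2[:2, :2].copy()   # an explicit copy owns its own data

copy[0, 0] = 42            # changes only the copy
after_copy_write = x2[0, 0]
print(after_copy_write)    # 1

view[0, 0] = 99            # writes through to x2
after_view_write = x2[0, 0]
print(after_view_write)    # 99

print(np.shares_memory(x2, view), np.shares_memory(x2, copy))  # True False
```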

np.reshape() is a function in NumPy used to change the shape of an array without changing its data. It allows you to transform arrays into different shapes, provided that the total number of elements remains the same.

Here's how np.reshape() works:

Syntax: The syntax for np.reshape() is np.reshape(a, shape, order='C'), where:

a: The array to be reshaped.
shape: The new shape for the array, given as a tuple of integers, or as a single integer for a one-dimensional result. One dimension may be given as -1, in which case its size is inferred from the remaining dimensions. (Older NumPy releases name this parameter newshape; passing it positionally works in either case.)
order (optional): Specifies the order in which the elements of the array are read and placed into the new shape. It can be 'C' (row-major order, default), 'F' (column-major order), or 'A' (column-major if the array is Fortran-contiguous in memory, row-major otherwise).

Reshaping: np.reshape() rearranges the elements of the input array according to the new shape specified. The total number of elements in the reshaped array must match the total number of elements in the original array.

Data Sharing: np.reshape() returns a view of the original data whenever possible, without copying it. In that case the reshaped array shares its data buffer with the original, so modifying elements of one also changes the other. When the requested shape cannot be expressed as a view (for example, for some non-contiguous arrays), NumPy returns a copy instead.
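A short sketch of the data-sharing behaviour: reshaping a contiguous array gives a view, while reshaping a transposed array here forces a copy. The variable names are illustrative, and np.shares_memory is used to tell the two cases apart.

```python
import numpy as np

a = np.arange(6)
b = np.reshape(a, (2, 3))      # contiguous input: the result is a view
print(np.shares_memory(a, b))  # True
b[0, 0] = 100                  # writing through the view changes a as well
print(a[0])                    # 100

# A transposed array is not contiguous in row-major order, so flattening
# it cannot be expressed as a view and NumPy returns a copy.
c = np.arange(6).reshape(2, 3).T
d = c.reshape(6)
print(np.shares_memory(c, d))  # False

# order='F' fills the new shape column by column.
f = np.reshape(np.arange(6), (2, 3), order='F')
print(f)
# [[0 2 4]
#  [1 3 5]]
```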

In [None]:
#Reshaping of arrays 
In [69]: grid = np.arange(0, 10)

In [70]: grid
Out[70]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [72]: grid = grid.reshape(2,5)

In [73]: grid
Out[73]:
array([[0, 1, 2, 3, 4],
       [5, 6, 7, 8, 9]])


np.newaxis is a constant in NumPy that serves as a tool for array manipulation. It is not a function but rather a convenient way to increase the dimensionality of an array by one. When used in array indexing or slicing operations, np.newaxis inserts a new axis into the array at the specified position.

Here's what np.newaxis does:

Inserting New Axes: It inserts a new axis into the array at the specified position, effectively increasing the dimensionality of the array by one.

Reshaping Arrays: It facilitates reshaping operations by allowing you to explicitly control the dimensions of arrays.

Broadcasting: It plays a crucial role in broadcasting operations, where NumPy automatically aligns the shapes of arrays to perform element-wise operations efficiently.

For example, if you have a 1D array and you want to reshape it into a column vector (2D array), you can use np.newaxis to insert a new axis along the second dimension, effectively converting the 1D array into a column vector.

In [None]:
#can use np.newaxis to create a new column/row axis during slicing
In [74]: x1
Out[74]: array([2, 3, 2, 1, 1, 0, 4, 7, 6, 7])

In [77]: gridB = x1[:, np.newaxis]

In [78]: gridB
Out[78]:
array([[2],
       [3],
       [2],
       [1],
       [1],
       [0],
       [4],
       [7],
       [6],
       [7]])
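As a complement to the column-vector example, this sketch shows both placements of np.newaxis and how the resulting shapes broadcast against each other. The array `x` is made up for illustration.

```python
import numpy as np

x = np.array([1, 2, 3])

row = x[np.newaxis, :]   # new axis in front: shape (1, 3)
col = x[:, np.newaxis]   # new axis at the end: shape (3, 1)
print(row.shape, col.shape)  # (1, 3) (3, 1)

# Broadcasting a (3, 1) column against a (1, 3) row gives a 3x3 table of sums.
table = col + row
print(table)
# [[2 3 4]
#  [3 4 5]
#  [4 5 6]]

# np.newaxis is simply an alias for None.
print(np.newaxis is None)  # True
```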

np.concatenate() is a NumPy function used to concatenate arrays along a specified axis. It allows you to combine arrays into a single array along the specified axis.

Here's how np.concatenate() works:

Syntax: np.concatenate((array1, array2, ...), axis=0)
array1, array2, ...: A sequence of arrays to be concatenated. All arrays must have the same shape, except in the dimension corresponding to axis.
axis: (Optional) The axis along which the arrays will be concatenated. If not provided, the default value is 0, which means the arrays will be concatenated along the first axis.

Key points:

Dimensionality: The arrays must have the same number of dimensions, and their shapes must match in every dimension except the one being concatenated.

Axis: The axis parameter specifies the axis along which the concatenation will be performed. Concatenating along axis 0 means stacking arrays vertically (along rows), while concatenating along axis 1 means stacking arrays horizontally (along columns).

Resulting Shape: The result has the same shape as the inputs in every dimension except the concatenation axis, whose size is the sum of the input sizes along that axis. For example, concatenating two 2x5 arrays gives a 4x5 array along axis 0 and a 2x10 array along axis 1.

Data Types: If all arrays share a data type, np.concatenate() preserves it. If the arrays have different data types, the resulting array will have a data type that can accommodate all of them (upcasting).
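The shape and data-type rules can be checked with a few small arrays. This is a sketch with made-up arrays `a`, `b`, and `c`; dtypes are set explicitly so the result does not depend on the platform.

```python
import numpy as np

a = np.zeros((2, 3), dtype=np.int64)
b = np.ones((4, 3), dtype=np.int64)

# Along axis 0 the row counts add up; the column count must match.
rows_joined = np.concatenate([a, b], axis=0)
print(rows_joined.shape)  # (6, 3)

# Along axis 1 the column counts add up; int64 and float64 upcast to float64.
c = np.ones((2, 5), dtype=np.float64)
cols_joined = np.concatenate([a, c], axis=1)
print(cols_joined.shape, cols_joined.dtype)  # (2, 8) float64

# A mismatch in a dimension other than the concatenation axis is an error.
mismatch_raised = False
try:
    np.concatenate([a, b], axis=1)  # 2 rows vs 4 rows
except ValueError:
    mismatch_raised = True
print(mismatch_raised)  # True
```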

In [None]:
#can combine multiple arrays into one, and split a single array into multiple arrays 
#routines np.concatenate, np.vstack, np.hstack

#Can concatenate one-dimensional arrays
In [126]: x1a
Out[126]: array([ 0,  2,  4,  6,  8, 11, 13, 15, 17, 20])

In [127]: x1
Out[127]: array([2, 3, 2, 1, 1, 0, 4, 7, 6, 7])

In [134]: np.concatenate([x1, x1a])
Out[134]:
array([ 2,  3,  2,  1,  1,  0,  4,  7,  6,  7,  0,  2,  4,  6,  8, 11, 13,
       15, 17, 20])

#Can concatenate two-dimensional arrays 
In [135]: grid
Out[135]:
array([[0, 1, 2, 3, 4],
       [5, 6, 7, 8, 9]])

In [137]: gridB
Out[137]:
array([[2],
       [3],
       [2],
       [1],
       [1],
       [0],
       [4],
       [7],
       [6],
       [7]])

In [138]: gridB = gridB.reshape(2, 5)

In [139]: gridB
Out[139]:
array([[2, 3, 2, 1, 1],
       [0, 4, 7, 6, 7]])

In [146]: np.concatenate([grid, gridB], axis=0) #concatenate along first axis 
Out[146]:
array([[0, 1, 2, 3, 4],
       [5, 6, 7, 8, 9],
       [2, 3, 2, 1, 1],
       [0, 4, 7, 6, 7]])

In [147]: np.concatenate([grid, gridB], axis=1) #concatenate along second axis (zero-indexed)
Out[147]:
array([[0, 1, 2, 3, 4, 2, 3, 2, 1, 1],
       [5, 6, 7, 8, 9, 0, 4, 7, 6, 7]])


Horizontal stacking refers to the process of joining arrays along their horizontal axis. In two-dimensional arrays, the horizontal axis is typically the axis representing columns. When you horizontally stack arrays, you are essentially appending one array to the right of another array.

For example, consider two arrays A and B:

A = [[1, 2],
     [3, 4]]

B = [[5, 6],
     [7, 8]]
If you horizontally stack A and B, you'll get:

[[1, 2, 5, 6],
 [3, 4, 7, 8]]
Here, the elements of B are appended to the right of the elements of A, resulting in a new array where each row contains the elements from both arrays.

In NumPy, np.hstack() is used to perform horizontal stacking. It takes a sequence of arrays and joins them along the horizontal axis to create a new array. The arrays being stacked must have compatible shapes along all dimensions other than the second one.

In [None]:
#hstack example 
In [168]: grid
Out[168]:
array([[0, 1, 2, 3, 4],
       [5, 6, 7, 8, 9]])

In [169]: x1b
Out[169]: array([7, 6, 7, 4, 0])

In [170]: x1b[:2]
Out[170]: array([7, 6])

In [171]: np.hstack([x1b[:2].reshape(2,1), grid])
Out[171]:
array([[7, 0, 1, 2, 3, 4],
       [6, 5, 6, 7, 8, 9]])
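For two-dimensional inputs, np.hstack gives the same result as np.concatenate with axis=1, while one-dimensional inputs are simply joined end to end. The sketch below reuses the A and B arrays from the text; the other names are illustrative.

```python
import numpy as np

A = np.array([[1, 2],
              [3, 4]])
B = np.array([[5, 6],
              [7, 8]])

h = np.hstack([A, B])
print(h)
# [[1 2 5 6]
#  [3 4 7 8]]

# For 2D arrays this matches concatenation along the second axis.
same_as_concat = np.array_equal(h, np.concatenate([A, B], axis=1))
print(same_as_concat)  # True

# 1D arrays have no second axis, so they are joined end to end.
flat = np.hstack([np.array([1, 2]), np.array([3])])
print(flat)  # [1 2 3]
```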

Vertical stacking refers to the process of joining arrays along their vertical axis. In two-dimensional arrays, the vertical axis is typically the axis representing rows. When you vertically stack arrays, you are essentially appending one array below another array.

For example, consider two arrays A and B:

A = [[1, 2],
     [3, 4]]

B = [[5, 6],
     [7, 8]]
If you vertically stack A and B, you'll get:

[[1, 2],
 [3, 4],
 [5, 6],
 [7, 8]]
 
Here, the elements of B are appended below the elements of A, resulting in a new array where each column contains the elements from both arrays.

In NumPy, np.vstack() is used to perform vertical stacking. It takes a sequence of arrays and joins them along the vertical axis to create a new array. The arrays being stacked must have compatible shapes along all dimensions other than the first one.

In [None]:
#vstack example
In [161]: x1b
Out[161]: array([7, 6, 7, 4, 0])

In [162]: grid
Out[162]:
array([[0, 1, 2, 3, 4],
       [5, 6, 7, 8, 9]])

In [163]: np.vstack([x1b, grid])
Out[163]:
array([[7, 6, 7, 4, 0],
       [0, 1, 2, 3, 4],
       [5, 6, 7, 8, 9]])
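The example above works because np.vstack treats a one-dimensional array as a single row. A minimal sketch, reusing A and B from the text with an illustrative `row` array, also shows that for 2D inputs the result matches np.concatenate with axis=0.

```python
import numpy as np

A = np.array([[1, 2],
              [3, 4]])
B = np.array([[5, 6],
              [7, 8]])

v = np.vstack([A, B])
print(v.shape)  # (4, 2)

# For 2D arrays this matches concatenation along the first axis.
same_as_concat = np.array_equal(v, np.concatenate([A, B], axis=0))
print(same_as_concat)  # True

# A 1D array of length 2 is treated as a 1x2 row before stacking.
row = np.array([9, 9])
with_row = np.vstack([row, A])
print(with_row)
# [[9 9]
#  [1 2]
#  [3 4]]
```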

The opposite of concatenating is splitting, implemented by np.split, np.hsplit, and np.vsplit.

In [None]:
In [220]: x1
Out[220]: array([2, 3, 2, 1, 1, 0, 4, 7, 6, 7])

In [221]: a1, a2, a3 = np.split(x1, [4, 8], axis = 0 ) #[4, 8] lists the indices at which to split; N split points give N+1 subarrays. axis=0 is the first (and here only) axis

In [223]: print(a1, a2, a3)
[2 3 2 1] [1 0 4 7] [6 7]

In [232]: x3Da
Out[232]:
array([[95, 96, 69, 81,  8, 30, 74, 18,  5, 97],
       [ 1, 53, 33,  4,  1, 69, 67, 43, 95, 92],
       [25, 73, 49, 84, 60, 14, 72, 33, 36, 55],
       [61, 54, 30, 24, 37, 98, 66, 10, 37, 50],
       [36, 13, 45, 43, 25, 21, 64, 28, 90, 27]])

#horizontal split
In [244]: left, right = np.hsplit(x3Da, [5])
In [246]: print(left)
[[95 96 69 81  8]
 [ 1 53 33  4  1]
 [25 73 49 84 60]
 [61 54 30 24 37]
 [36 13 45 43 25]]

#vertical split 
In [240]: upper, lower = np.vsplit(x3Da, [3])
In [241]: print(upper)
[[95 96 69 81  8 30 74 18  5 97]
 [ 1 53 33  4  1 69 67 43 95 92]
 [25 73 49 84 60 14 72 33 36 55]]

#np.vsplit is equivalent to np.split with axis=0 (the default); it is a convenience that makes the direction of the split explicit
In [250]: upper, lower = np.split(x3Da, [3], axis=0)
In [251]: upper
Out[251]:
array([[95, 96, 69, 81,  8, 30, 74, 18,  5, 97],
       [ 1, 53, 33,  4,  1, 69, 67, 43, 95, 92],
       [25, 73, 49, 84, 60, 14, 72, 33, 36, 55]])
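The relationship between the three splitting routines can be summarised in a short sketch. The arrays here are made up with np.arange so the results are easy to read, and all variable names are illustrative.

```python
import numpy as np

# Two split points produce three pieces.
x = np.arange(10)
parts = np.split(x, [4, 8])
print([p.tolist() for p in parts])  # [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9]]

# An integer instead of a list asks for that many equal-sized pieces.
halves = np.split(x, 2)
print([h.tolist() for h in halves])  # [[0, 1, 2, 3, 4], [5, 6, 7, 8, 9]]

grid = np.arange(16).reshape(4, 4)

# vsplit splits along axis 0 (rows), hsplit along axis 1 (columns).
upper, lower = np.vsplit(grid, [2])
upper2, lower2 = np.split(grid, [2], axis=0)
left, right = np.hsplit(grid, [2])
left2, right2 = np.split(grid, [2], axis=1)

print(np.array_equal(upper, upper2), np.array_equal(left, left2))  # True True
print(upper.shape, left.shape)  # (2, 4) (4, 2)
```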