# Data Representation

## Tensors
- A tensor is a container for numerical data, i.e. a container for numbers
- e.g. Matrices --> 2D tensors
- Tensors are a generalization of matrices to an arbitrary number of dimensions
- In tensor terminology, a dimension is often called an axis


### Scalars
- A tensor that contains only one number --> scalar (i.e. 0-dimensional or 0D tensor)
- In NumPy, a float32 or float64 number is a scalar tensor (or scalar array)
- The ndim attribute shows the number of axes
- A scalar tensor has 0 axes (ndim = 0)
- Rank --> the number of axes of a tensor


In [None]:
#import numpy
import numpy as np

In [None]:
x = np.array(104)

In [None]:
x

In [None]:
x.ndim

### Vectors
- An array of numbers --> called a vector, or 1D tensor
- A 1D tensor has exactly one axis

In [None]:
x = np.array([10, 34, 68, 904, 5])

In [7]:
x

array([ 10,  34,  68, 904,   5])

In [8]:
x.ndim

1

#### About Vectors
- The vector above has five entries (--> 5-dimensional vector)
- 5D Vector vs 5D Tensor
 - a 5D vector has only one axis and has five dimensions along its axis 
 - a 5D tensor has five axes (and may have any number of dimensions along each axis). 
 
 ##### Confusing !
  - Dimensionality can denote either the number of entries along a specific axis (as in the case of a 5D vector)  
  OR  
  - the number of axes in a tensor (such as a 5D tensor)
  
  ##### Use of Rank
   - Rank --> the number of axes of a tensor
   - So prefer --> "tensor of rank 5" (although the ambiguous term "5D tensor" is often used loosely)
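
The distinction between the two meanings of "dimension" can be checked directly in NumPy. This is a minimal sketch; the shape chosen for the rank-5 tensor is arbitrary and only for illustration.

```python
import numpy as np

# A "5D vector": one axis with five entries along it.
vector_5d = np.array([10, 34, 68, 904, 5])

# A tensor of rank 5: five axes (the sizes here are arbitrary).
tensor_rank5 = np.zeros((2, 3, 4, 5, 6))

print(vector_5d.ndim, vector_5d.shape)        # 1 (5,)
print(tensor_rank5.ndim, tensor_rank5.shape)  # 5 (2, 3, 4, 5, 6)
```

Here `ndim` reports the rank, while each entry of `shape` reports the dimensionality along one axis.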

### Matrices
- An array of vectors is a matrix, or 2D tensor. 
- A matrix has two axes (rows and columns)

In [9]:
x = np.array([[51, 70, 2, 32, 3],
        [69, 7, 3, 35, 10],
        [1, 8, 4, 6, 9]])

In [10]:
x

array([[51, 70,  2, 32,  3],
       [69,  7,  3, 35, 10],
       [ 1,  8,  4,  6,  9]])

In [11]:
x.ndim

2

#### About Matrices
- the entries from the first axis are called the rows, 
- the entries from the second axis are called the columns

### 3D tensors and above
- 3D tensor --> pack matrices into a new array
- So it can be pictured as a cube of numbers

In [13]:
x = np.array([
        [[51, 70, 2, 32, 3],
        [69, 7, 3, 35, 10],
        [1, 8, 4, 6, 9]],
    
         [[51, 70, 2, 32, 3],
        [69, 7, 3, 35, 10],
        [1, 8, 4, 6, 9]],
    
        [[51, 70, 2, 32, 3],
        [69, 7, 3, 35, 10],
        [1, 8, 4, 6, 9]]
    ])

In [14]:
x

array([[[51, 70,  2, 32,  3],
        [69,  7,  3, 35, 10],
        [ 1,  8,  4,  6,  9]],

       [[51, 70,  2, 32,  3],
        [69,  7,  3, 35, 10],
        [ 1,  8,  4,  6,  9]],

       [[51, 70,  2, 32,  3],
        [69,  7,  3, 35, 10],
        [ 1,  8,  4,  6,  9]]])

In [15]:
x.ndim

3

#### About 3D tensors & above
- 4D tensor --> obtained by packing 3D tensors into an array
- And so on for higher ranks
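
One way to sketch this packing in NumPy is with `np.stack`, which joins arrays along a new first axis. The variable names below are illustrative and not part of the original notebook.

```python
import numpy as np

matrix = np.array([[51, 70, 2, 32, 3],
                   [69, 7, 3, 35, 10],
                   [1, 8, 4, 6, 9]])

# Pack three copies of the matrix into a rank-3 tensor.
tensor_3d = np.stack([matrix, matrix, matrix])

# Pack two rank-3 tensors into a rank-4 tensor.
tensor_4d = np.stack([tensor_3d, tensor_3d])

print(tensor_3d.shape)  # (3, 3, 5)
print(tensor_4d.shape)  # (2, 3, 3, 5)
```

Each packing step adds one axis, so the rank grows by one each time.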

### Summary of Tensors
- Number of axes (Rank)
    - ndim in Python libraries (e.g. NumPy)
- Shape
    - describes how many dimensions the tensor has along each axis
    - e.g. the matrix above has shape (3, 5) --> a tuple of integers
- Data type
 - type of the data contained in the tensor (e.g. float32, int32, float64, etc). Rarely char; strings are generally not used as tensor data
 - referred to as dtype (e.g. in the NumPy library)
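
The three attributes can be inspected together on a single array. This is a small sketch that reuses the matrix from the Matrices section; the explicit `dtype=np.float32` is an assumption added here for illustration.

```python
import numpy as np

x = np.array([[51, 70, 2, 32, 3],
              [69, 7, 3, 35, 10],
              [1, 8, 4, 6, 9]], dtype=np.float32)

print(x.ndim)   # 2
print(x.shape)  # (3, 5)
print(x.dtype)  # float32
```

Passing `dtype` explicitly makes the result independent of NumPy's default, which for integers can differ between platforms and versions.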

#### A little bit about data type

In [24]:
a = np.array([1,4,5]) 

In [25]:
print("type is: ",type(a)) 
print("dtype is: ",a.dtype) 

type is:  <class 'numpy.ndarray'>
dtype is:  int32


### ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

## Tensor Operations
 - Element-wise operations (e.g. add)
 - Addition of different shaped tensors
 - dot product
 - reshaping

#### Element-wise operations 
- Addition, subtraction, etc.

In [1]:
#import numpy
import numpy as np

In [61]:
def my_add(x, y):
    assert len(x.shape) == 2
    assert x.shape == y.shape      #x , y --> 2D numpy tensors
    
    x = x.copy()                  #avoids overwriting the input tensor

    for i in range(x.shape[0]):
        for j in range(x.shape[1]):
            x[i, j] = x[i, j] + y[i, j]
    return x

In [62]:
x = np.array([[5.2, 78, 2, 34, 0],
[6, 79, 3, 35, 1],
[7, 80, 4, 36, 2]])

In [44]:
x.ndim

2

In [45]:
x.shape[0]

3

In [31]:
x.shape[1]

5

In [32]:
r = x.shape

In [34]:
type(r)

tuple

In [33]:
print(r[0], r[1])

3 5


In [63]:
y = np.array([[4, 2, 1, 3, 0],
[6, 1, 3, 5, 2],
[7, 8, 4, 6, 2]])

In [53]:
y.ndim

2

In [64]:
z = my_add(x,y)

In [65]:
z

array([[ 9.2, 80. ,  3. , 37. ,  0. ],
       [12. , 80. ,  6. , 40. ,  3. ],
       [14. , 88. ,  8. , 42. ,  4. ]])

### Using numpy array

In [66]:
#Add operation using numpy

z = x + y

In [67]:
z

array([[ 9.2, 80. ,  3. , 37. ,  0. ],
       [12. , 80. ,  6. , 40. ,  3. ],
       [14. , 88. ,  8. , 42. ,  4. ]])

In [68]:
#Element-wise maximum: elements smaller than 5 are replaced by 5
z = np.maximum(y, 5.)

In [69]:
z

array([[5., 5., 5., 5., 5.],
       [6., 5., 5., 5., 5.],
       [7., 8., 5., 6., 5.]])

### Adding dissimilar tensors

Consider X with shape (32, 10) and y with shape (10,). 

First, we add an empty first axis to y, whose shape becomes (1, 10). 

Then, we repeat y 32 times along this new axis, so that we end up with a tensor Y with shape
(32, 10), where Y[i, :] == y for i in range(0, 32). 


(Note: In terms of implementation, no new 2D tensor is created, because that would be
terribly inefficient. The repetition operation is entirely virtual: it happens at the algorithmic
level rather than at the memory level.)
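
The two steps described above are what NumPy calls broadcasting. They can be made explicit with `np.expand_dims` and `np.broadcast_to`. This is a minimal sketch using random data; the names `y_expanded` and `Y` are illustrative.

```python
import numpy as np

X = np.random.random((32, 10))
y = np.random.random((10,))

# Step 1: add an empty first axis to y -> shape (1, 10).
y_expanded = np.expand_dims(y, axis=0)

# Step 2: repeat y 32 times along the new axis -> shape (32, 10).
# broadcast_to returns a read-only view, so no data is copied.
Y = np.broadcast_to(y_expanded, X.shape)

# NumPy performs both steps implicitly when the shapes are compatible.
print(np.array_equal(X + Y, X + y))  # True
print(Y.shape)                       # (32, 10)
print(Y.strides[0])                  # 0: every row reuses the same memory
```

The zero stride along the first axis shows that the repetition is virtual: all 32 rows of `Y` point at the same 10 numbers.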

In [70]:
def my_add_matrix_and_vector(x, y):
    assert   len(x.shape) == 2
    assert   len(y.shape) == 1
    assert x.shape[1] == y.shape[0]
    
    x = x.copy()
    for i in range(x.shape[0]):
        for j in range(x.shape[1]):
            x[i, j] = x[i, j] + y[j]
    return x

In [71]:
x = np.array([[5.2, 78, 2, 34, 0],
[6, 79, 3, 35, 1],
[7, 80, 4, 36, 2]])

In [72]:
x.ndim   #len(x.shape)

2

In [73]:
y = np.array([1,2,3,1,2])

In [74]:
y.ndim      #len(y.shape)

1

In [75]:
z  = my_add_matrix_and_vector(x,y)

In [76]:
z

array([[ 6.2, 80. ,  5. , 35. ,  2. ],
       [ 7. , 81. ,  6. , 36. ,  3. ],
       [ 8. , 82. ,  7. , 37. ,  4. ]])

In [77]:
a = np.random.random((1, 2, 3, 10))


In [78]:
a.ndim

4

In [79]:
a

array([[[[0.6879327 , 0.8127683 , 0.71610136, 0.30892117, 0.4262383 ,
          0.29657382, 0.43176568, 0.91820337, 0.15635607, 0.10944801],
         [0.52958294, 0.49380278, 0.89113152, 0.52429444, 0.80312166,
          0.12741533, 0.1017671 , 0.14724887, 0.59295321, 0.75999495],
         [0.80040736, 0.45786883, 0.35533269, 0.00293199, 0.67123464,
          0.58153465, 0.4775209 , 0.00836018, 0.34592725, 0.01266709]],

        [[0.38973986, 0.9640342 , 0.39487593, 0.21431064, 0.03562499,
          0.14028533, 0.83507075, 0.67410949, 0.51092346, 0.7310739 ],
         [0.83448165, 0.40471807, 0.56328339, 0.53169478, 0.73503138,
          0.96149606, 0.28121189, 0.65703504, 0.88548414, 0.16971476],
         [0.323954  , 0.95807326, 0.7135679 , 0.16298144, 0.80195579,
          0.79407793, 0.9792558 , 0.18881872, 0.04458517, 0.99573057]]]])

In [81]:
b = np.random.random((3, 10))


In [82]:
b

array([[0.11170256, 0.77074418, 0.50610967, 0.74404778, 0.09220491,
        0.53332396, 0.99522618, 0.00804618, 0.8776228 , 0.82411369],
       [0.24273142, 0.10596559, 0.53092636, 0.27794606, 0.86283134,
        0.42517119, 0.32468711, 0.10946031, 0.29859451, 0.44893551],
       [0.18935417, 0.52903433, 0.77830448, 0.73392248, 0.75456839,
        0.73160941, 0.75748902, 0.79040901, 0.13253271, 0.91367937]])

In [83]:
c = np.maximum(a, b)

In [84]:
c

array([[[[0.6879327 , 0.8127683 , 0.71610136, 0.74404778, 0.4262383 ,
          0.53332396, 0.99522618, 0.91820337, 0.8776228 , 0.82411369],
         [0.52958294, 0.49380278, 0.89113152, 0.52429444, 0.86283134,
          0.42517119, 0.32468711, 0.14724887, 0.59295321, 0.75999495],
         [0.80040736, 0.52903433, 0.77830448, 0.73392248, 0.75456839,
          0.73160941, 0.75748902, 0.79040901, 0.34592725, 0.91367937]],

        [[0.38973986, 0.9640342 , 0.50610967, 0.74404778, 0.09220491,
          0.53332396, 0.99522618, 0.67410949, 0.8776228 , 0.82411369],
         [0.83448165, 0.40471807, 0.56328339, 0.53169478, 0.86283134,
          0.96149606, 0.32468711, 0.65703504, 0.88548414, 0.44893551],
         [0.323954  , 0.95807326, 0.77830448, 0.73392248, 0.80195579,
          0.79407793, 0.9792558 , 0.79040901, 0.13253271, 0.99573057]]]])

### Dot Product 

In [85]:
def my_vector_dot(x, y):
    assert len(x.shape) == 1
    assert len(y.shape) == 1
    assert x.shape[0] == y.shape[0]

    z = 0.
    for i in range(x.shape[0]):
        z = z + x[i] * y[i]
    return z

In [86]:
a = np.array([5,2,3,4,6])

In [87]:
b = np.array([3,2,4,1,5])

In [88]:
c = my_vector_dot(a,b)

In [89]:
c

65.0

In [90]:
c = np.dot(a, b)

In [91]:
c

65

In [92]:
def my_matrix_vector_dot(x, y):
    assert len(x.shape) == 2
    assert len(y.shape) == 1
    assert x.shape[1] == y.shape[0]

    z = np.zeros(x.shape[0])
    for i in range(x.shape[0]):
        for j in range(x.shape[1]):
            z[i] = z[i]  + x[i, j] * y[j]
    return z

In [93]:
a = np.array([[5, 78, 2, 34, 0],
[6, 79, 3, 35, 1],
[7, 80, 4, 36, 2]])

In [94]:
b = np.array([3,2,4,1,5])

In [95]:
c = my_matrix_vector_dot(a,b)

In [97]:
c

array([213., 228., 243.])

In [100]:
def my_matrix_dot(x, y):
    assert len(x.shape) == 2
    assert len(y.shape) == 2
    assert x.shape[1] == y.shape[0]

    z = np.zeros((x.shape[0], y.shape[1]))
    for i in range(x.shape[0]):
        for j in range(y.shape[1]):
            row_x = x[i, :]
            column_y = y[:, j]
            z[i, j] = my_vector_dot(row_x, column_y)
    return z

In [101]:
a = np.array([[5, 7, 2, 34, 0],
[6, 7, 3, 35, 1],
[7, 8, 4, 36, 2]])

In [102]:
b = np.array([
    [4, 2, 1],
    [6, 1, 3],
    [7, 8, 4],
    [3, 4, 5],
    [2, 2, 2]
    ])

In [103]:
c = my_matrix_dot(a,b)

In [104]:
c

array([[178., 169., 204.],
       [194., 185., 216.],
       [216., 202., 231.]])
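
As a sanity check, the same product can be computed with NumPy's built-in `np.dot` (or the `@` operator). The sketch below repeats the two input matrices so that it runs on its own.

```python
import numpy as np

a = np.array([[5, 7, 2, 34, 0],
              [6, 7, 3, 35, 1],
              [7, 8, 4, 36, 2]])
b = np.array([[4, 2, 1],
              [6, 1, 3],
              [7, 8, 4],
              [3, 4, 5],
              [2, 2, 2]])

# (3, 5) dot (5, 3) -> (3, 3); the inner sizes must match.
c = np.dot(a, b)   # equivalent to a @ b for 2D arrays
print(c.shape)     # (3, 3)
```

The entries match the loop-based result above, except that they stay integers because both inputs are integer arrays.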

### Reshaping

In [105]:
x = np.array([[0.2, 1.1],
        [2.4, 3.5],
        [4.1, 5.3]])

In [106]:
x

array([[0.2, 1.1],
       [2.4, 3.5],
       [4.1, 5.3]])

In [107]:
print(x.shape)

(3, 2)


In [108]:
#reshape

x = x.reshape((6,1))

In [109]:
x.shape

(6, 1)

In [110]:
x

array([[0.2],
       [1.1],
       [2.4],
       [3.5],
       [4.1],
       [5.3]])

In [111]:
x = x.reshape((2, 3))

In [112]:
x.shape

(2, 3)

In [113]:
x

array([[0.2, 1.1, 2.4],
       [3.5, 4.1, 5.3]])

In [114]:
#Transpose

x = np.zeros((300, 20))  #create all zeroes matrix 

In [115]:
x.shape


(300, 20)

In [116]:
#now transpose 

y = np.transpose(x)

In [117]:
y.shape

(20, 300)

### ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

### RELU

In [2]:
#relu(x) is max(x, 0)

def my_relu(x):
    assert len(x.shape) == 2

    x = x.copy()
    for i in range(x.shape[0]):
        for j in range(x.shape[1]):
            x[i, j] = max(x[i, j], 0)
    return x

In [3]:
y = np.array([[-3.2, 78, 2, 34, 0],
[6, 79, 3, -35, 1],
[7, 80, 4, 36, -2.8]])

In [4]:
z = my_relu(y)

In [5]:
z 

array([[ 0., 78.,  2., 34.,  0.],
       [ 6., 79.,  3.,  0.,  1.],
       [ 7., 80.,  4., 36.,  0.]])
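
The same element-wise operation can be written without explicit loops using `np.maximum`. This is a sketch of the vectorized form, using the same input so that it runs on its own.

```python
import numpy as np

y = np.array([[-3.2, 78, 2, 34, 0],
              [6, 79, 3, -35, 1],
              [7, 80, 4, 36, -2.8]])

# Element-wise max(x, 0): negative entries become 0, others are unchanged.
z = np.maximum(y, 0.)
print(z)
```

Unlike the loop version, this form works for tensors of any rank, not only 2D ones.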

In [None]:
names = {"Steve", "Rick", "Negan"}
names2 = names

# names2 is only another name for the same set, so this also changes names
names2.add("Glenn")

# likewise, removing an element through names also changes names2
names.remove("Negan")


print("Old Set is:", names)
print("New Set is:", names2)

In [None]:
# A Set of names
names = {"Steve", "Rick", "Negan"}

# copying using the copy() method
names2 = names.copy()

# adding "Glenn" to the new set
names2.add("Glenn")

# removing "Negan" from the old set
names.remove("Negan")

# displaying both the sets
print("Old Set is:", names)
print("New Set is:", names2)
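
The same distinction between an alias and a copy applies to NumPy arrays, which is why the functions above call `x.copy()` before modifying their input. A minimal sketch, with illustrative variable names:

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0])

alias = x               # same array object, no data copied
independent = x.copy()  # separate array with its own data

x[0] = 99.0

print(alias[0])        # 99.0
print(independent[0])  # 1.0
```

Only the copy is unaffected by the later change to `x`.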