Names: Rowan, Joey, Jyothi

---

**IMPORTANT**: You need to answer the questions in this notebook during class time.

When the lesson is over, these notebooks will be collected and graded. Once the notebooks are collected, you cannot modify them.

---

# Data Manipulation

> In order to get anything done,  we need some way to store and manipulate data.

Generally, there are two important things 
we need to do with data: 

* acquire them; 
* and process them once they are inside the computer. 

There is no point in acquiring data  without some way to store it, so to start, let's get our hands dirty with $n$-dimensional arrays,  which we also call *tensors*.

> If you already know the `NumPy` scientific computing package,  this will be a breeze.




In all modern deep learning frameworks, the *tensor class*
(`Tensor` in `PyTorch`)
resembles NumPy's `ndarray`,
with a few killer features added.

First, the tensor class supports *automatic differentiation*. Second, it leverages GPUs to accelerate numerical computation, whereas `NumPy` only runs on CPUs.

These properties make neural networks both easy to code and fast to run.



## Getting Started

(**To start, we import the PyTorch library.
Note that the package name is `torch`.**)


In [4]:
import torch

> **A tensor represents a (possibly multidimensional) array of numerical values.**

In the one-dimensional case, i.e., when only one axis is needed for the data,
a tensor is called a *vector*.

With two axes, a tensor is called a *matrix*.

With $k > 2$ axes, we drop the specialized names
and just refer to the object as a $k^\textrm{th}$-*order tensor*.

![Example of Tensors](tensors.png)

`PyTorch` provides a variety of functions 
for creating new tensors 
prepopulated with values. 

#### Method  `arange`

For example, by invoking `arange(n)`,
we can create a vector of evenly spaced values,
starting at 0 (included) 
and ending at `n` (not included).
By default, the interval size is $1$.
Unless otherwise specified, 
new tensors are stored in main memory 
and designated for CPU-based computation.


In [5]:
x = torch.arange(12, dtype=torch.float32)
x

tensor([ 0.,  1.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10., 11.])

Each of these values is called
an *element* of the tensor.
The tensor `x` contains 12 elements.
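As a small sketch of the other options of `arange`, the call also accepts an explicit start and a step, in the form `arange(start, end, step)`. The variable names `evens` and `halves` below are only illustrative, and the dtypes shown assume PyTorch's default settings have not been changed.

```python
import torch

# arange(start, end, step): start is included, end is excluded.
evens = torch.arange(0, 10, 2)      # integer arguments give an integer tensor
halves = torch.arange(0, 2, 0.5)    # a float step gives a floating-point tensor

print(evens)
print(halves)
print(evens.dtype, halves.dtype)
```

Passing `dtype=torch.float32`, as in the cell above, forces a floating-point result even when all arguments are integers.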




####  Method `numel`

We can inspect the total number of elements 
in a tensor via its `numel` method.

In [6]:
x.numel()

12

#### Attribute `shape`


(**We can access a tensor's *shape***) 
(the length along each axis)
by inspecting its `shape` attribute.
Because we are dealing with a vector here,
the `shape` contains just a single element
and is identical to the size.


In [7]:
x.shape

torch.Size([12])

#### Method `reshape`

We can [**change the shape of a tensor
without altering its size or values**],
by invoking `reshape`.

For example, we can transform 
our vector `x` whose shape is (12,) 
to a matrix `X` with shape (4, 3).
This new tensor retains all elements
but reconfigures them into a matrix.
Notice that the elements of our vector
are laid out one row at a time and thus
`x[3] == X[1, 0]`.


In [8]:
X = x.reshape(4, -1)
print("x=",x)
print("X=",X)
print(x[3])
print(X[1,0])
X.shape

x= tensor([ 0.,  1.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10., 11.])
X= tensor([[ 0.,  1.,  2.],
        [ 3.,  4.,  5.],
        [ 6.,  7.,  8.],
        [ 9., 10., 11.]])
tensor(3.)
tensor(3.)


torch.Size([4, 3])

> Note that specifying every shape component to `reshape` is redundant.

Because we already know our tensor's size, we can work out one component of the shape given the rest.



---
**EXAMPLE**
Given a tensor of size $n$ and target shape ($h$, $w$), we know that $w = n/h$.

---

To automatically infer one component of the shape,
we can place a `-1` for the shape component
that should be inferred automatically.
In our case, instead of calling `x.reshape(4, 3)`,
we could have equivalently called `x.reshape(-1, 3)` or `x.reshape(4, -1)`.
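The following sketch checks this equivalence for a 12-element vector and a 4 x 3 target, and also shows what happens when the inferred component would not be a whole number. The names `A`, `B`, `C`, and `failed` are illustrative only.

```python
import torch

x = torch.arange(12, dtype=torch.float32)

# All three calls produce the same 4 x 3 matrix.
A = x.reshape(4, 3)
B = x.reshape(-1, 3)   # rows inferred: 12 / 3 = 4
C = x.reshape(4, -1)   # columns inferred: 12 / 4 = 3
print(A.shape, B.shape, C.shape)

# The inferred size must be a whole number: 12 elements cannot fill 5 rows.
failed = False
try:
    x.reshape(5, -1)
except RuntimeError as err:
    failed = True
    print("reshape failed:", err)
```

Only one component may be `-1`, since two unknown components could not be determined from the size alone.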

---

**Question 1**
Please write code below that first defines the following tensor:
$$v=[2.0, 4.0, 6.0, 8.0, 10.0, 12.0, 14.0, 16.0]$$
and then reshapes it, using the `-1` option, into the following matrix:
$$
w=\begin{bmatrix} 2.0 & 4.0 & 6.0 & 8.0\\
10.0 & 12.0 & 14.0 & 16.0
\end{bmatrix}
$$
Your code should print $v$ and $w$. Explore the options of `arange` with an internet search and use it in the code below.

In [9]:
#please write code for question 1 here.
v = torch.arange(2, 17, 2, dtype=torch.float32)
w = v.reshape(2, -1)  # the -1 lets PyTorch infer the 4 columns
print(v)
print(w)

tensor([ 2.,  4.,  6.,  8., 10., 12., 14., 16.])
tensor([[ 2.,  4.,  6.,  8.],
        [10., 12., 14., 16.]])


#### Methods `zeros` and `ones`

Practitioners often need to work with tensors
initialized to contain all 0s or 1s.
[**We can construct a tensor with all elements set to 0**] 
and a shape of (2, 3, 4) via the `zeros` function.


In [10]:
torch.zeros((2, 3, 4))

tensor([[[0., 0., 0., 0.],
         [0., 0., 0., 0.],
         [0., 0., 0., 0.]],

        [[0., 0., 0., 0.],
         [0., 0., 0., 0.],
         [0., 0., 0., 0.]]])

Similarly, we can create a tensor 
with all 1s by invoking `ones`.


In [11]:
q=torch.ones((2, 3, 4))
print(q.numel())
print(q.shape)

24
torch.Size([2, 3, 4])


#### Method `randn`

We often wish to 
[**sample each element randomly (and independently)**] 
from a given probability distribution.
For example, the parameters of neural networks
are often initialized randomly.
The following snippet creates a tensor 
with elements drawn from 
a standard Gaussian (normal) distribution
with mean 0 and standard deviation 1.

![Normal Distribution](normal.png)

In [12]:
torch.randn(3, 4)

tensor([[-0.9378,  0.2723, -1.3325,  0.1529],
        [-0.4410, -0.7499, -0.0120, -0.5835],
        [-0.7309, -0.0594,  0.7880,  0.3028]])
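Because the values are random, rerunning the cell above gives different numbers each time. As a minimal sketch (not part of the original worksheet), `torch.manual_seed` can fix the generator state so that a draw is repeatable, and a large sample can be used to check the stated mean and standard deviation approximately. The seed value `0` and the sample size are arbitrary choices.

```python
import torch

torch.manual_seed(0)            # fix the random number generator state
first = torch.randn(3, 4)
torch.manual_seed(0)            # reset to the same state
second = torch.randn(3, 4)
print(torch.equal(first, second))

# With many samples, the empirical mean and standard deviation
# should be close to 0 and 1, respectively.
sample = torch.randn(100_000)
print(sample.mean().item(), sample.std().item())
```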

#### Constructing Tensors Directly

Finally, we can construct tensors by
[**supplying the exact values for each element**] 
as (possibly nested) Python list(s) 
containing numerical literals.
Here, we construct a matrix with a list of lists,
where the outermost list corresponds to axis 0,
and the inner list corresponds to axis 1.


In [13]:
torch.tensor([[2, 1, 4, 3], [1, 2, 3, 4], [4, 3, 2, 1]])

tensor([[2, 1, 4, 3],
        [1, 2, 3, 4],
        [4, 3, 2, 1]])
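Note that the output above has no decimal points: the element type is inferred from the literals. The sketch below illustrates this, assuming PyTorch's default floating-point type is unchanged; the names `ints`, `floats`, and `forced` are illustrative.

```python
import torch

ints = torch.tensor([[2, 1], [4, 3]])        # only integer literals
floats = torch.tensor([[2.0, 1], [4, 3]])    # a single float literal is enough
print(ints.dtype)
print(floats.dtype)

# An explicit dtype overrides the inference.
forced = torch.tensor([[2, 1], [4, 3]], dtype=torch.float32)
print(forced.dtype)
```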

## Indexing and Slicing

As with Python lists,
we can access tensor elements 
by indexing (starting with 0).
To access an element based on its position
relative to the end of the tensor,
we can use negative indexing.

We can also access whole ranges of indices 
via slicing (e.g., `X[start:stop]`), 
where the returned value includes 
the first index (`start`) *but not the last* (`stop`).
Finally, when only one index (or slice)
is specified for a $k^\textrm{th}$-order tensor,
it is applied along axis 0.
Thus, in the following code,
[**`[-1]` selects the last row and `[1:3]`
selects the second and third rows**].


In [14]:
print("X=", X)
print("X[-1]=",X[-1])
print("X[1:3]=",X[1:3])

X= tensor([[ 0.,  1.,  2.],
        [ 3.,  4.,  5.],
        [ 6.,  7.,  8.],
        [ 9., 10., 11.]])
X[-1]= tensor([ 9., 10., 11.])
X[1:3]= tensor([[ 3.,  4.,  5.],
        [ 6.,  7.,  8.]])
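Indices and slices can also be given for more than one axis at a time, separated by commas. The sketch below rebuilds a 4 x 3 matrix so that it runs on its own; the variable names are illustrative.

```python
import torch

X = torch.arange(12, dtype=torch.float32).reshape(4, 3)

first_column = X[:, 0]        # every row, column 0
last_of_middle = X[1:3, -1]   # second and third rows, last column
corner = X[-1, -1]            # last row, last column

print(first_column)
print(last_of_middle)
print(corner)
```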


**QUESTION 2)**

For the tensor $X$ above, write code that will extract the array $[0.,1.,2.]$ and the array $[6.,7.,8.]$ using ONLY `reshape` and slicing.

In [15]:
#write your answer here

X=torch.arange(12)
X = X.reshape(4,3)


y=X[0::2]
print(y)


tensor([[0, 1, 2],
        [6, 7, 8]])


Beyond reading them, (**we can also *write* elements of a matrix by specifying indices.**)


In [16]:
X[1, 2] = 17
X

tensor([[ 0,  1,  2],
        [ 3,  4, 17],
        [ 6,  7,  8],
        [ 9, 10, 11]])

If we want [**to assign multiple elements the same value,
we apply the indexing on the left-hand side 
of the assignment operation.**]
For instance, `[:2, :]`  accesses 
the first and second rows,
where `:` takes all the elements along axis 1 (column).
While we discussed indexing for matrices,
this also works for vectors
and for tensors of more than two dimensions.


In [17]:
X[:2, :] = 11
print(X)
L=torch.zeros(10,50)
L[0, :5]=12
L[-1,-5:]=12
print(L)

tensor([[11, 11, 11],
        [11, 11, 11],
        [ 6,  7,  8],
        [ 9, 10, 11]])
tensor([[12., 12., 12., 12., 12.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,
          0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,
          0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,
          0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.],
        [ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,
          0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,
          0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,
          0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.],
        [ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,
          0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,
          0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,
          0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.],
        [ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,
          0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  

**QUESTION 3)**

Create first the following tensor:

$$
Z=\begin{bmatrix} 1 & 2 & 3 & 4 \\ 5 & 6 & 7 & 8 \\ 9 & 10 & 11 & 12 \\ 13 & 14 &15 & 16 
\end{bmatrix}
$$

Using **only** the tools discussed above, modify this tensor so that it becomes:
$$
Z=\begin{bmatrix} 2 & 2 & 2  & 2\\ 2 & 2& 2 &2 \\9 & 10 & 11 & 12\\ 7& 7& 7 & 7
\end{bmatrix}
$$
Your code should print the resulting tensor Z.

In [18]:
# Write your answer to Question 3 here


# Create the array
Z=torch.arange(1,17)
# Form the array into the correct shape
Z=Z.reshape(4,4)
# Turn the first two rows to '2's
Z[: 2]=2
# Turn the final row to '7's
Z[3:]=7
# Print the array
print(Z)

tensor([[ 2,  2,  2,  2],
        [ 2,  2,  2,  2],
        [ 9, 10, 11, 12],
        [ 7,  7,  7,  7]])


## Operations in Tensors

Now that we know how to construct tensors
and how to read from and write to their elements,
we can begin to manipulate them
with various mathematical operations.

Among the most useful of these 
are the *elementwise* operations.
These apply a standard scalar operation
to each element of a tensor.

For functions that take two tensors as inputs,
elementwise operations apply some standard binary operator
on each pair of corresponding elements.

We can create an elementwise function 
from any function that maps 
from a scalar to a scalar.

In mathematical notation, we denote such
*unary* scalar operators (taking one input)
by the signature 
$f: \mathbb{R} \rightarrow \mathbb{R}$.
This just means that the function maps
from any real number onto some other real number.
Most standard operators, including unary ones like $e^x$, can be applied elementwise.


In [19]:
x=torch.arange(12)
x[0:8] = 12

print(x)
torch.exp(x)

tensor([12, 12, 12, 12, 12, 12, 12, 12,  8,  9, 10, 11])


tensor([162754.7969, 162754.7969, 162754.7969, 162754.7969, 162754.7969,
        162754.7969, 162754.7969, 162754.7969,   2980.9580,   8103.0840,
         22026.4648,  59874.1406])

Likewise, we denote *binary* scalar operators,
which map pairs of real numbers
to a (single) real number
via the signature 
$f: \mathbb{R}\times \mathbb{R} \rightarrow \mathbb{R}$.

Given any two vectors $\mathbf{u}$ 
and $\mathbf{v}$ *of the same shape*,
and a binary operator $f$, we can produce a vector
$\mathbf{c} $
by setting $c_i = f(u_i, v_i)$ for all $i$,
where $c_i, u_i$, and $v_i$ are the $i^\textrm{th}$ elements
of vectors $\mathbf{c}, \mathbf{u}$, and $\mathbf{v}$.

We can write $\mathbf{c}=F(\mathbf{u},\mathbf{v})$, where $F: \mathbb{R}^d\times \mathbb{R}^d \rightarrow \mathbb{R}^d$ is created by *lifting* the binary scalar function $f$ to a vector operation.

The common standard arithmetic operators
for addition (`+`), subtraction (`-`), 
multiplication (`*`), division (`/`), 
and exponentiation (`**`)
have all been *lifted* to elementwise operations
for identically-shaped tensors of arbitrary shape.


In [20]:
x = torch.tensor([1.0, 2, 4, 8])
y = torch.tensor([2., 2.,3. , 1.])
print("x=",x)
print("y=",y)
print("x+y=",x + y)
print("x-y=", x - y)
print("x*y=", x * y)
print("x/y=", x / y)
print("x**y=", x ** y)

x= tensor([1., 2., 4., 8.])
y= tensor([2., 2., 3., 1.])
x+y= tensor([3., 4., 7., 9.])
x-y= tensor([-1.,  0.,  1.,  7.])
x*y= tensor([ 2.,  4., 12.,  8.])
x/y= tensor([0.5000, 1.0000, 1.3333, 8.0000])
x**y= tensor([ 1.,  4., 64.,  8.])
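To make the idea of *lifting* concrete, the sketch below builds $F$ from a scalar function $f$ by applying it to each pair $(u_i, v_i)$ in turn, and checks that the result agrees with the built-in elementwise operator. The helper `lift` is hypothetical and written only for illustration; it is not a PyTorch function, and the Python loop is far slower than the built-in operators.

```python
import torch

def lift(f):
    """Hypothetical helper: turn a scalar function f(u_i, v_i)
    into a function F(u, v) on vectors of the same shape."""
    def F(u, v):
        assert u.shape == v.shape, "u and v must have the same shape"
        pairs = zip(u.tolist(), v.tolist())
        return torch.tensor([f(ui, vi) for ui, vi in pairs])
    return F

u = torch.tensor([1.0, 2.0, 4.0, 8.0])
v = torch.tensor([2.0, 2.0, 3.0, 1.0])

F = lift(lambda a, b: a * b)   # lift scalar multiplication
c = F(u, v)
print(c)
print(torch.equal(c, u * v))   # agrees with the elementwise operator
```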


In addition to elementwise computations,
we can also perform linear algebraic operations,
such as dot products and matrix multiplications.
We will elaborate on these
in the section on linear algebra.



#### Method `cat`

We can also [***concatenate* multiple tensors,**]
stacking them end-to-end to form a larger one.
We just need to provide a list of tensors
and tell the system along which axis to concatenate.


The example below shows what happens when we concatenate
two matrices along rows (axis 0)
and along columns (axis 1). We can see that the first output's axis-0 length ($6$)
is the sum of the two input tensors' axis-0 lengths ($3 + 3$),
while the second output's axis-1 length ($8$)
is the sum of the two input tensors' axis-1 lengths ($4 + 4$).

In [21]:
X = torch.arange(12, dtype=torch.float32).reshape((3,4))
print("tensor X:",X)
Y = torch.tensor([[2.0, 1, 4, 3], [1, 2, 3, 4], [4, 3, 2, 1]])
print("tensor Y:",Y)
print("Concatenating X and Y along axis 0, rows")
print(torch.cat((X, Y), dim=0))
print("Concatenating X and Y along axis 1, columns")
print(torch.cat((X, Y), dim=1))

tensor X: tensor([[ 0.,  1.,  2.,  3.],
        [ 4.,  5.,  6.,  7.],
        [ 8.,  9., 10., 11.]])
tensor Y: tensor([[2., 1., 4., 3.],
        [1., 2., 3., 4.],
        [4., 3., 2., 1.]])
Concatenating X and Y along axis 0, rows
tensor([[ 0.,  1.,  2.,  3.],
        [ 4.,  5.,  6.,  7.],
        [ 8.,  9., 10., 11.],
        [ 2.,  1.,  4.,  3.],
        [ 1.,  2.,  3.,  4.],
        [ 4.,  3.,  2.,  1.]])
Concatenating X and Y along axis 1, columns
tensor([[ 0.,  1.,  2.,  3.,  2.,  1.,  4.,  3.],
        [ 4.,  5.,  6.,  7.,  1.,  2.,  3.,  4.],
        [ 8.,  9., 10., 11.,  4.,  3.,  2.,  1.]])
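Concatenation only requires the tensors to agree on every axis *other than* the one being joined. The sketch below uses two small illustrative tensors (not the ones above) with different numbers of rows to show one case that works and one that fails.

```python
import torch

X = torch.zeros(3, 4)
Y = torch.ones(2, 4)     # different number of rows, same number of columns

# Axis 0 works: only the axis-0 lengths differ, and they add up (3 + 2 = 5).
stacked = torch.cat((X, Y), dim=0)
print(stacked.shape)

# Axis 1 fails: the axis-0 lengths (3 and 2) would have to match.
failed = False
try:
    torch.cat((X, Y), dim=1)
except RuntimeError as err:
    failed = True
    print("cat failed:", err)
```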


**QUESTION 4)**
Consider two PyTorch tensors defined as follows:
$$E1=\begin{bmatrix} 1 & 2 &3 \\ 10 & 11 & 12\\ \end{bmatrix}$$
$$E2=\begin{bmatrix} 20 & 22 &23\\3 & 4 &5\end{bmatrix}$$

Write **Python** code using `PyTorch` operations to perform the following element-wise operations on tensors E1 and E2:

1. Elementwise addition (E1 + E1)
2. Elementwise multiplication (E2 * E2)
3. Elementwise subtraction (E2 - E2)
4. Elementwise division (E1 / E1)

Additionally, concatenate tensors E1 and E2 along:

* Axis 0 (vertical stacking)
* Axis 1 (horizontal stacking)

Your code must be commented and must print all the important results.

In [22]:
# Write your answer to Question 4) here
# E1: rows 0 and 3 of the 4x3 matrix holding the values 1..12
E1 = torch.arange(1,13).reshape(-1,3)[::3]
E2 = torch.tensor([[20,22,23],[3,4,5]])

# Elementwise addition, multiplication, subtraction, and division
print(E1 + E1)
print(E2 * E2)
print(E2 - E2)
print(E1 / E1)

# Concatenation along axis 0 (rows), then along axis 1 (columns);
# torch.cat expects the tensors as a single tuple or list
print(torch.cat((E1, E2), dim=0))
print(torch.cat((E1, E2), dim=1))

tensor([[ 2,  4,  6],
        [20, 22, 24]])
tensor([[400, 484, 529],
        [  9,  16,  25]])
tensor([[0, 0, 0],
        [0, 0, 0]])
tensor([[1., 1., 1.],
        [1., 1., 1.]])
tensor([[ 1,  2,  3],
        [10, 11, 12],
        [20, 22, 23],
        [ 3,  4,  5]])
tensor([[ 1,  2,  3, 20, 22, 23],
        [10, 11, 12,  3,  4,  5]])


#### Constructing Binary Tensors

Sometimes, we want to 
[**construct a binary tensor via *logical statements*.**]
Take `X == Y` as an example.
For each position `i, j`, if `X[i, j]` and `Y[i, j]` are equal, 
then the corresponding entry in the result takes value `True`,
otherwise it takes value `False`.


In [23]:
X == Y

tensor([[False,  True, False,  True],
        [False, False, False, False],
        [False, False, False, False]])
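The boolean result can be converted to 1s and 0s, or summed to count how many positions satisfy the condition. The sketch below redefines `X` and `Y` with the same values as above so that it runs on its own; the names `mask`, `matches`, and `greater` are illustrative.

```python
import torch

X = torch.arange(12, dtype=torch.float32).reshape(3, 4)
Y = torch.tensor([[2.0, 1, 4, 3], [1, 2, 3, 4], [4, 3, 2, 1]])

mask = X == Y
print(mask.dtype)                  # entries are booleans
print(mask.int())                  # the same mask written with 1s and 0s

matches = mask.sum().item()        # number of positions where X and Y agree
greater = (X > Y).sum().item()     # other comparison operators work the same way
print(matches, greater)
```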

#### Summing a tensor

[**Summing all the elements in the tensor**] yields a tensor with only one element.


In [24]:
X.sum()

tensor(66.)
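As a brief sketch beyond the full sum, `sum` also accepts a `dim` argument that sums along a single axis and removes that axis from the result. The example rebuilds the same 3 x 4 matrix so that it runs on its own.

```python
import torch

X = torch.arange(12, dtype=torch.float32).reshape(3, 4)

total = X.sum()             # all 12 elements
per_column = X.sum(dim=0)   # collapse axis 0: one sum per column
per_row = X.sum(dim=1)      # collapse axis 1: one sum per row

print(total)
print(per_column)
print(per_row)
```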

## Broadcasting

By now, you know how to perform 
elementwise binary operations
on two tensors of the same shape. 

Under certain conditions,
even when shapes differ, 
we can still [**perform elementwise binary operations
by invoking the *broadcasting mechanism*.**]

Broadcasting works according to 
the following two-step procedure:

1. expand one or both arrays by copying elements along axes with length 1 so that after this transformation, the two tensors have the same shape;
2. perform an elementwise operation on the resulting arrays.


### Example

Consider two matrices:
$$a=\begin{bmatrix} 0 \\ 1 \\ 2\end{bmatrix}$$
$$b=\begin{bmatrix} 0 & 1\end{bmatrix}$$

In [25]:
a = torch.arange(3).reshape((3, 1))
b = torch.arange(2).reshape((1, 2))
a, b

(tensor([[0],
         [1],
         [2]]),
 tensor([[0, 1]]))

Since `a` and `b` are $3\times1$ 
and $1\times2$ matrices, respectively,
their shapes do not match up.

So, for example, we cannot add them directly, unless we use broadcasting.


Broadcasting produces a larger $3\times2$ matrix 
by replicating matrix `a` along the columns
and matrix `b` along the rows
before adding them elementwise.

$$\text{new } a=\begin{bmatrix} 0 & 0\\ 1  & 1 \\ 2 & 2\end{bmatrix}$$
$$\text{ new }b=\begin{bmatrix} 0 & 1\\ 0& 1 \\0 & 1\end{bmatrix}$$


In [26]:
a + b

tensor([[0, 1],
        [1, 2],
        [2, 3]])
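The two-step procedure can be reproduced by hand. The sketch below uses `expand` to stretch each length-1 axis to the common shape (3, 2) and then adds the results; note that `expand` returns a view rather than physically copying the data, so it is only a stand-in for the conceptual "copying" step. The last part shows a pair of shapes that cannot be broadcast; the names `a_big`, `b_big`, `manual`, and `compatible` are illustrative.

```python
import torch

a = torch.arange(3).reshape(3, 1)
b = torch.arange(2).reshape(1, 2)

# Step 1: stretch each length-1 axis to the common shape (3, 2).
a_big = a.expand(3, 2)
b_big = b.expand(3, 2)
print(a_big)
print(b_big)

# Step 2: add elementwise; the result matches the broadcast sum a + b.
manual = a_big + b_big
print(torch.equal(manual, a + b))

# Shapes that disagree on an axis where neither length is 1 cannot be broadcast.
try:
    torch.ones(3, 2) + torch.ones(1, 3)
    compatible = True
except RuntimeError:
    compatible = False
print(compatible)
```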

**QUESTION 5)**

![image](question5.png)



WRITE YOUR ANSWERS HERE IN MARKDOWN

The dimension of the final matrix will be 3x2. y is only a 1x2 matrix so it will be copied to create a matrix with three identical rows.
The resulting matrix will be:

[[11, 22],
[13, 24],
[15, 26]]

The dimension of the final matrix will be 3x2. y is only a 1x2 matrix so it will be copied to create a matrix with three identical rows.
The resulting matrix will be:

[[10, 40],
[30, 80],
[50, 120]]



## Saving Memory

[**Running operations can cause new memory to be
allocated to host results.**]
For example, if we write `Y = Y + X`,
we dereference the tensor that `Y` used to point to
and instead point `Y` at the newly allocated memory.

> We can demonstrate this issue with Python's `id()` function,
which gives us the exact address 
of the referenced object in memory.

Note that after we run `Y = Y + X`,
`id(Y)` points to a different location.
That is because Python first evaluates `Y + X`,
allocating new memory for the result 
and then points `Y` to this new location in memory.


In [27]:
before = id(Y)
Y = Y + X
id(Y) == before

False

This might be undesirable for two reasons.
* First, we do not want to run around allocating memory unnecessarily all the time. In machine learning, we often have hundreds of megabytes of parameters and update all of them multiple times per second. Whenever possible, we want to perform these updates *in place*.
* Second, we might point at the  same parameters from multiple variables. If we do not update in place,  we must be careful to update all of these references, lest we spring a memory leak  or inadvertently refer to stale parameters.


Fortunately, (**performing in-place operations**) is easy.
We can assign the result of an operation
to a previously allocated array `Y`
by using slice notation: `Y[:] = <expression>`.

To illustrate this concept, 
we overwrite the values of tensor `Z`,
after initializing it, using `zeros_like`,
to have the same shape as `Y`.


In [28]:
Z = torch.zeros_like(Y)
print(Z)
print('id(Z):', id(Z))
Z[:] = X + Y
print('id(Z):', id(Z))

tensor([[0., 0., 0., 0.],
        [0., 0., 0., 0.],
        [0., 0., 0., 0.]])
id(Z): 140573837240224
id(Z): 140573837240224


[**If the value of `X` is not reused in subsequent computations,
we can also use `X[:] = X + Y` or `X += Y`
to reduce the memory overhead of the operation.**]


In [29]:
before = id(X)
X += Y
id(X) == before

True
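Python's `id()` identifies the tensor object; as a complementary sketch, `data_ptr()` reports the address of the tensor's underlying data buffer, which makes the difference between writing in place and rebinding a name visible directly. The example also uses the `out=` argument of `torch.add`, which is not covered in the text above; all tensors here are small illustrative ones.

```python
import torch

X = torch.ones(3, 4)
Y = torch.ones(3, 4)
Z = torch.zeros_like(Y)

buffer_before = Z.data_ptr()      # address of Z's data buffer

Z[:] = X + Y                      # copies the result into the existing buffer
same_after_slice = Z.data_ptr() == buffer_before

torch.add(X, Y, out=Z)            # writes the sum directly into Z
same_after_out = Z.data_ptr() == buffer_before

Z = X + Y                         # rebinding: Z now names a newly allocated tensor
same_after_rebind = Z.data_ptr() == buffer_before

print(same_after_slice, same_after_out, same_after_rebind)
```

Note that `Z[:] = X + Y` still allocates a temporary tensor for `X + Y` before copying it into `Z`, whereas the `out=` form avoids that temporary.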

## Conversion to Other Python Objects


[**Converting to a NumPy tensor (`ndarray`)**], or vice versa, is easy.
The torch tensor and NumPy array 
will share their underlying memory, 
and changing one through an in-place operation 
will also change the other.


In [30]:
A = X.numpy()
B = torch.from_numpy(A)
type(A), type(B)

(numpy.ndarray, torch.Tensor)
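The shared memory can be observed directly. In the sketch below, an in-place change made through either object is visible through the others, while `clone()` is one way to obtain an independent copy first. The tensors are small illustrative ones, defined here so the example runs on its own, and the example assumes a CPU tensor.

```python
import torch

X = torch.zeros(3)
A = X.numpy()                # NumPy array backed by the same memory as X
B = torch.from_numpy(A)      # tensor backed by the same memory as A

A[0] = 1.0                   # in-place change through the NumPy array
X += 10                      # in-place change through the tensor
print(X, A, B)

C = X.clone().numpy()        # clone() first to get an independent copy
X[1] = -1                    # does not affect C
print(C)
```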

To (**convert a size-1 tensor to a Python scalar**),
we can invoke the `item` method or Python's built-in functions.


In [31]:
a = torch.tensor([3.5])
a, a.item(), float(a), int(a)

(tensor([3.5000]), 3.5, 3.5, 3)

## Summary

The tensor class is the main interface for storing and manipulating data in deep learning libraries.
Tensors provide a variety of functionalities including construction routines; indexing and slicing; basic mathematics operations; broadcasting; memory-efficient assignment; and conversion to and from other Python objects.


**END OF WORKSHEET**

Make sure that you have answered all the questions on time. This completed `Jupyter Notebook` will be collected and graded. 

Once the `Jupyter Notebook` is collected, it cannot be modified.