## 2.1 Tensor Basics

In [11]:
import torch  # import the torch module

In [2]:
a = torch.ones(3)  # create a 1D tensor of size 3 filled with ones
a

tensor([1., 1., 1.])

In [3]:
a[1]

tensor(1.)

In [5]:
float(a[1])

1.0

In [6]:
a[2] = 2.0

#### List example
 Python lists or tuples of numbers are collections of Python objects that are individually allocated in memory.

In [7]:
b = [1,2,3]

In [9]:
float(b[1])

2.0

#### Back to tensor

In [10]:
a

tensor([1., 1., 2.])

#### Analysis: 
Python lists or tuples of numbers are collections of Python objects that are individually allocated in memory. PyTorch tensors or NumPy arrays, on the other hand, are views over (typically) contiguous memory blocks containing unboxed C numeric types, not Python objects.

In this case, each element is a 32-bit (4-byte) float, as you see on the right side of figure 2.3. So a 1D tensor of 1 million float numbers requires 4 million contiguous bytes to be stored, plus a small overhead for the metadata (dimensions, numeric type, and so on).
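
The arithmetic above can be checked directly. The following is a minimal sketch rather than part of the original notebook; it assumes the default float32 dtype, and the variable names are chosen only for illustration.

```python
import torch

# One million float32 values (float32 is PyTorch's default floating-point dtype)
big = torch.zeros(1_000_000)

bytes_per_value = big.element_size()   # size of one element in bytes
n_values = big.nelement()              # total number of elements
payload_bytes = bytes_per_value * n_values

print(bytes_per_value, n_values, payload_bytes)
```

The product counts only the numeric payload; the small metadata overhead mentioned above is not included.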

-----

#### Example: 
Suppose that you have a list of 2D coordinates that you’d like to manage to represent
 a geometrical object, such as a triangle. Instead of having coordinates as numbers in a
 Python list, you can use a one-dimensional tensor by storing xs in the even indices and
 ys in the odd indices, like so:


In [12]:
points = torch.zeros(6)  # allocate a tensor of the right size, initialized to zeros

In [14]:
# overwrites zeros with the desired values 
points[0] = 1.0 
points[1] = 4.0 
points[2] = 2.0 
points[3] = 1.0 
points[4] = 3.0 
points[5] = 5.0 

In [15]:
points

tensor([1., 4., 2., 1., 3., 5.])

You can also pass a Python list to the constructor to the same effect:

In [16]:
points = torch.tensor([1.0, 4.0, 2.0, 1.0, 3.0, 5.0])

In [17]:
points

tensor([1., 4., 2., 1., 3., 5.])

In [21]:
# to get coordinates of first point
float(points[0]),float(points[1])

(1.0, 4.0)

#### Analysis:
This technique is OK, although it would be practical to have the first index refer to individual 2D points rather than point coordinates. For this purpose, use a 2D tensor:

In [26]:
# points using 2D tensor
points = torch.tensor([[1.0,4.0],[2.0,1.0],[3.0,5.0]]) # we pass list of lists

In [23]:
points

tensor([[1., 4.],
        [2., 1.],
        [3., 5.]])

In [24]:
points[0]

tensor([1., 4.])

In [28]:
# check the shape of tensor
points.shape  # (number of rows, number of columns)

torch.Size([3, 2])

In [31]:
# we can also initialize a tensor by passing the size of each dimension
points = torch.zeros(3,2)
points

tensor([[0., 0.],
        [0., 0.],
        [0., 0.]])

In [33]:
# initialize from a list of lists with the FloatTensor constructor (always float32)
points = torch.FloatTensor([[1.0, 4.0], [2.0, 1.0],[3.0,5.0]])
points


tensor([[1., 4.],
        [2., 1.],
        [3., 5.]])

In [34]:
# Now you can access an individual element in the tensor by using two indices:
points[0]

tensor([1., 4.])

In [36]:
points[0, 1]  # y coordinate of the first point (index 0)

tensor(4.)
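
To connect the flat layout used earlier with the 2D layout, the sketch below reinterprets the same six numbers as three points. It is an illustration under stated assumptions: the view method is not used in the notebook above, and the names flat and pts are made up for the example.

```python
import torch

# The six coordinates stored flat: x0, y0, x1, y1, x2, y2
flat = torch.tensor([1.0, 4.0, 2.0, 1.0, 3.0, 5.0])

# Reinterpret the same data as 3 points with 2 coordinates each
pts = flat.view(3, 2)

# First point, read through flat indexing and through (point, coordinate) indexing
x0, y0 = float(flat[0]), float(flat[1])
x0_2d, y0_2d = float(pts[0, 0]), float(pts[0, 1])

print((x0, y0), (x0_2d, y0_2d))
print(pts.shape)
```

With the 2D form, the first index selects a point and the second selects its x or y coordinate, which is the convenience the analysis above asks for.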

## 2.2  Tensors and storages

Values are allocated in contiguous chunks of memory, managed by torch.Storage instances.

A storage is a one-dimensional array of numerical data, such as a contiguous
 block of memory containing numbers of a given type, perhaps a float or int32. A PyTorch Tensor is a view over such a Storage that’s capable of indexing into that
 storage by using an offset and per-dimension strides.
 

The underlying memory is allocated only once, however, so creating alternative tensor views on the data can be done quickly, regardless of the size of the data managed by the Storage instance.
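
One way to see that views reuse a single allocation is to compare memory addresses. This is a hedged sketch: it relies on data_ptr, which returns the address of a tensor's first element, and on row indexing and t(), which are covered in more detail later in these notes.

```python
import torch

points = torch.tensor([[1.0, 4.0], [2.0, 1.0], [3.0, 5.0]])
second_point = points[1]      # a view, not a copy
points_t = points.t()         # another view

# Both tensors start at the same address, so no data was copied
same_start = points.data_ptr() == points_t.data_ptr()

# The row view starts two float32 elements (2 * 4 bytes) into the same block
byte_gap = second_point.data_ptr() - points.data_ptr()

print(same_start, byte_gap)
```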

Next, you see how indexing into the storage works in practice with 2D points. You can access the storage for a given tensor by calling its storage method:

In [37]:
points = torch.tensor([[1.0, 4.0], [2.0, 1.0], [3.0, 5.0]]) 
points.storage()

 1.0
 4.0
 2.0
 1.0
 3.0
 5.0
[torch.FloatStorage of size 6]

Even though the tensor reports itself as having three rows and two columns, the storage under the hood is a contiguous array of size 6. In this sense, the tensor knows how
 to translate a pair of indices into a location in the storage. 
 
 You can also index into a storage manually:


In [38]:
points_storage = points.storage() 
points_storage[0]

1.0

In [40]:
# another way
points.storage()[1]

4.0

You can’t index a storage of a 2D tensor by using two indices. The layout of a storage is
 always one-dimensional, irrespective of the dimensionality of any tensors that may
 refer to it. 

Changing the value of a storage
 changes the content of its referring tensor:


In [41]:
points = torch.tensor([[1.0, 4.0], [2.0, 1.0], [3.0, 5.0]]) 
points_storage = points.storage()
points_storage[0] = 2.0 
points

tensor([[2., 4.],
        [2., 1.],
        [3., 5.]])

#### Analysis:
You’ll seldom, if ever, use storage instances directly, but understanding the relationship between a tensor and the underlying storage is useful for understanding the cost (or lack thereof) of certain operations later.

## 2.3 Size, storage offset, and strides

The size (or shape, in NumPy parlance) is a tuple indicating how many elements across
 each dimension the tensor represents.
 
  The storage offset is the index in the storage that
 corresponds to the first element in the tensor. 
 
 The stride is the number of elements in
 the storage that need to be skipped to obtain the next element along each dimension. 

In [42]:
# You can get the second point in the tensor by providing the corresponding index:
points = torch.tensor([[1.0, 4.0], [2.0, 1.0], [3.0, 5.0]])
second_point = points[1]
second_point.storage_offset()

2

In [43]:
second_point.size()

torch.Size([2])

The resulting tensor has offset 2 in the storage (because we need to skip the first point, which has two items), and the size is an instance of the Size class containing one element because the tensor is one-dimensional.

Note: this is the same information contained in the shape property of tensor objects:

In [44]:
second_point.shape

torch.Size([2])

Last, stride is a tuple indicating the number of elements in the storage that have to be skipped when the index is increased by 1 in each dimension. The points tensor, for example, has a stride of (2, 1):

In [47]:
points.stride()

(2, 1)

Accessing an element i, j in a 2D tensor results in accessing the storage_offset + stride[0] * i + stride[1] * j element in the storage. The offset will usually be zero; if this tensor is a view into a storage created to hold a larger tensor, the offset might be a positive value.

For example, with a storage offset of 0 and strides (2, 1):

In [54]:
for i in range(5):
    for j in range(5):
        print(i, j)
        print(0 + 2*i + 1*j)  # storage_offset=0, stride[0]=2, stride[1]=1
        break  # show only j = 0 for each i

0 0
0
1 0
2
2 0
4
3 0
6
4 0
8
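
The indexing formula can also be checked against a real tensor. The sketch below is an illustration only: storage_index is a hypothetical helper written for this example, and the 3 x 4 tensor and its sub-view are made-up data chosen so that the offset is not zero.

```python
import torch

def storage_index(offset, strides, index):
    """Flat storage position of a multi-dimensional index (hypothetical helper)."""
    return offset + sum(s * i for s, i in zip(strides, index))

base = torch.arange(12.0).reshape(3, 4)   # contiguous, values 0..11
flat = base.reshape(-1)                   # same values in storage order

sub = base[1:, 1:]                        # a view that starts inside the storage
offset, strides = sub.storage_offset(), sub.stride()

# Every element of the view should match the storage at the computed position
matches = all(
    sub[i, j].item() == flat[storage_index(offset, strides, (i, j))].item()
    for i in range(sub.shape[0])
    for j in range(sub.shape[1])
)
print(offset, strides, matches)
```

Here the view skips the first row and first column of a 4-column tensor, so its first element sits at position 1 * 4 + 1 = 5 in the storage.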


The indirection between Tensor and Storage makes some operations, such as transposing a tensor or extracting a subtensor, inexpensive, because they don’t lead to memory reallocations; instead, they consist of allocating a new tensor object with a different value for size, storage offset, or stride.

In [55]:
points = torch.tensor([[1.0, 4.0], [2.0, 1.0], [3.0, 5.0]]) 
second_point = points[1] 
second_point.size()

torch.Size([2])

In [57]:
second_point.storage_offset()

2

In [58]:
second_point.stride()

(1,)

Bottom line, the subtensor has one fewer dimension (as you’d expect) while still
 indexing the same storage as the original points tensor. Changing the subtensor has a
 side effect on the original tensor too:

In [59]:
points = torch.tensor([[1.0, 4.0], [2.0, 1.0], [3.0, 5.0]]) 
second_point = points[1]
second_point[0] = 10.0 
points

tensor([[ 1.,  4.],
        [10.,  1.],
        [ 3.,  5.]])

This effect may not always be desirable, so you can clone the subtensor into a new tensor:

In [60]:
points = torch.tensor([[1.0, 4.0], [2.0, 1.0], [3.0, 5.0]])
second_point = points[1].clone()

In [61]:
second_point

tensor([2., 1.])

In [62]:
second_point[0] = 10.0

In [63]:
points

tensor([[1., 4.],
        [2., 1.],
        [3., 5.]])

Try transposing now. Take your points tensor, which has individual points in the rows
 and x and y coordinates in columns, and turn it around so that individual points
 are along the columns. 

In [64]:
points = torch.tensor([[1.0, 4.0], [2.0, 1.0], [3.0, 5.0]])
points

tensor([[1., 4.],
        [2., 1.],
        [3., 5.]])

In [66]:
points_t = points.t()
points_t

tensor([[1., 2., 3.],
        [4., 1., 5.]])

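We can easily verify that the two tensors share the same storage by comparing the addresses of the underlying data:

In [68]:
# comparing id() of temporary storage objects is unreliable, so compare the data pointers
points.storage().data_ptr() == points_t.storage().data_ptr()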

True

In [70]:
points_t.storage()

 1.0
 4.0
 2.0
 1.0
 3.0
 5.0
[torch.FloatStorage of size 6]

They differ only in shape and stride:

In [71]:
points.stride()

(2, 1)

In [72]:
points_t.stride()

(1, 2)
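
To make the point that a transpose is only a change of metadata, the sketch below rebuilds it by hand. It assumes torch.as_strided, which is not used elsewhere in these notes, and is meant as an illustration rather than a recommended way to transpose.

```python
import torch

points = torch.tensor([[1.0, 4.0], [2.0, 1.0], [3.0, 5.0]])

# Build the transpose by hand: swap the sizes (3, 2) -> (2, 3)
# and the strides (2, 1) -> (1, 2) over the same storage
manual_t = torch.as_strided(points, size=(2, 3), stride=(1, 2))

same_values = torch.equal(manual_t, points.t())
same_memory = manual_t.data_ptr() == points.data_ptr()

print(manual_t)
print(same_values, same_memory)
```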

Transposing in PyTorch isn’t limited to matrices. You can transpose a multidimensional array by specifying the two dimensions along which transposing (such as flipping shape and stride) should occur:

In [73]:
some_tensor = torch.ones(3, 4, 5)
some_tensor_t = some_tensor.transpose(0,2)

In [78]:
some_tensor[2]

tensor([[1., 1., 1., 1., 1.],
        [1., 1., 1., 1., 1.],
        [1., 1., 1., 1., 1.],
        [1., 1., 1., 1., 1.]])

In [82]:
some_tensor_t.shape

torch.Size([5, 4, 3])

In [105]:
aa = torch.zeros(4,2,4)

In [107]:
aa.shape

torch.Size([4, 2, 4])

In [110]:
aa.transpose(2,0).shape

torch.Size([4, 2, 4])
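
The shape of aa looks unchanged after transposing because dimensions 0 and 2 both have size 4; the strides show that the two dimensions were in fact swapped. A short sketch, reusing the tensors defined above:

```python
import torch

some_tensor = torch.ones(3, 4, 5)
some_tensor_t = some_tensor.transpose(0, 2)

# Sizes and strides of dimensions 0 and 2 trade places
print(some_tensor.shape, some_tensor.stride())
print(some_tensor_t.shape, some_tensor_t.stride())

aa = torch.zeros(4, 2, 4)
aa_t = aa.transpose(2, 0)

# Same shape before and after, but different strides
print(aa.shape == aa_t.shape, aa.stride(), aa_t.stride())
```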

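A tensor whose values are laid out in the storage starting from the rightmost dimension onward (moving along rows for a 2D tensor, for example) is said to be contiguous. In this case, points is contiguous but its transpose is not.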


In [111]:
points.is_contiguous()

True

In [112]:
points_t.is_contiguous()

False

You can obtain a new contiguous tensor from a noncontiguous one by using the contiguous method. The content of the tensor stays the same, but the stride changes, as does the storage:

In [115]:
points = torch.tensor([[1.0, 4.0], [2.0, 1.0], [3.0, 5.0]]) 
points_t = points.t()
points_t

tensor([[1., 2., 3.],
        [4., 1., 5.]])

In [129]:
points

tensor([[1., 4.],
        [2., 1.],
        [3., 5.]])

In [116]:
points_t.storage()

 1.0
 4.0
 2.0
 1.0
 3.0
 5.0
[torch.FloatStorage of size 6]

In [119]:
points.storage()

 1.0
 4.0
 2.0
 1.0
 3.0
 5.0
[torch.FloatStorage of size 6]

In [120]:
points_t.stride()

(1, 2)

In [121]:
points_t_cont = points_t.contiguous()

In [122]:
points_t_cont

tensor([[1., 2., 3.],
        [4., 1., 5.]])

In [123]:
points_t_cont.stride()

(3, 1)

In [124]:
points.stride()

(2, 1)

In [128]:
points_t_cont.storage() # row by row in a line

 1.0
 2.0
 3.0
 4.0
 1.0
 5.0
[torch.FloatStorage of size 6]
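
The difference between the two cases can be summarized in a few lines. This is a sketch under the assumption that comparing data_ptr values is an acceptable stand-in for comparing storages; the names reuses_memory and copied are made up for the example.

```python
import torch

points = torch.tensor([[1.0, 4.0], [2.0, 1.0], [3.0, 5.0]])
points_t = points.t()

# Already contiguous: no new memory is needed
reuses_memory = points.contiguous().data_ptr() == points.data_ptr()

# Not contiguous: the values are copied into a freshly laid-out block
points_t_cont = points_t.contiguous()
copied = points_t_cont.data_ptr() != points_t.data_ptr()

# The copy is independent, so writing to it leaves points untouched
points_t_cont[0, 0] = 100.0
print(reuses_memory, copied, points[0, 0])
```

Unlike transposing, calling contiguous on a noncontiguous tensor does pay for a copy, so it is worth calling only when an operation actually requires a contiguous layout.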

## 2.4 Numeric types

The data type specifies the possible values that the
 tensor can hold (integers versus floating point numbers) and the number of bytes per value.
 
The dtype argument is deliberately similar to the standard NumPy argument of the same name.

To allocate a tensor of the right numeric type, you can specify the proper dtype as an argument to the constructor, as follows:

In [131]:
double_points = torch.ones(10, 2, dtype=torch.double) 
short_points = torch.tensor([[1, 2], [3, 4]], dtype=torch.short)

In [132]:
# to check dtype of tensor

In [134]:
double_points.dtype

torch.float64

In [135]:
short_points.dtype

torch.int16

You can also cast the output of a tensor-creation function to the right type by using the
 corresponding casting method.

In [136]:
double_points = torch.zeros(10, 2).double()
short_points = torch.ones(10, 2).short()

Or use the more convenient to method:

In [137]:
double_points = torch.zeros(10, 2).to(torch.double) 
short_points = torch.ones(10, 2).to(dtype=torch.short)

Under the hood, type and to perform the same type check-and-convert-if-needed operation, but the to method can take additional arguments.

You can always cast a tensor of one type as a tensor of another type by using the type method:


In [143]:
points = torch.randn(10,2)
short_points = points.type(torch.short)
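
Casting changes both the values a tensor can hold and the bytes used per value. The sketch below uses made-up numbers to show what happens when floats are cast to a 16-bit integer type; it is an illustration, not part of the original notebook.

```python
import torch

values = torch.tensor([1.7, -1.7, 2.0])      # float32 by default

as_short = values.to(torch.short)            # float -> int16 drops the fraction
as_double = values.double()                  # float32 -> float64

print(as_short)
print(values.element_size(), as_short.element_size(), as_double.element_size())

# Converting to the dtype a tensor already has does not copy the data
no_copy = values.to(torch.float32).data_ptr() == values.data_ptr()
print(no_copy)
```

Because the float-to-integer cast truncates toward zero, casting the output of randn to short, as in the cell above, yields mostly zeros.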

## 2.5 Indexing tensors
Tensors can be indexed and sliced with the same range notation used for Python lists:

In [144]:
points[1:] # all rows after first (all columns)

tensor([[ 0.0262, -0.5986],
        [ 0.3913, -0.2973],
        [ 2.2562, -2.1667],
        [ 0.4971, -1.0562],
        [ 1.6252, -1.0766],
        [-0.1461,  0.4239],
        [ 1.0319,  0.2249],
        [ 0.9232,  0.8186],
        [-0.7398, -0.7280]])

In [146]:
points.shape

torch.Size([10, 2])

In [149]:
points[1:,:] #equivalent to points[1:]

tensor([[ 0.0262, -0.5986],
        [ 0.3913, -0.2973],
        [ 2.2562, -2.1667],
        [ 0.4971, -1.0562],
        [ 1.6252, -1.0766],
        [-0.1461,  0.4239],
        [ 1.0319,  0.2249],
        [ 0.9232,  0.8186],
        [-0.7398, -0.7280]])

In [151]:
points[1:,0]  # all rows after the first, first column

tensor([ 0.0262,  0.3913,  2.2562,  0.4971,  1.6252, -0.1461,  1.0319,  0.9232,
        -0.7398])

In addition to using ranges, PyTorch features a powerful form of indexing called advanced indexing.
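
A brief sketch of what advanced indexing looks like, using the small points tensor from earlier rather than the random one above. The examples (a list of row indices and a boolean mask) are only two of the supported forms, and the variable names are chosen for illustration.

```python
import torch

points = torch.tensor([[1.0, 4.0], [2.0, 1.0], [3.0, 5.0]])

picked = points[[0, 2]]            # index with a list: rows 0 and 2
mask = points[:, 0] > 1.0          # boolean mask over the rows
filtered = points[mask]            # rows whose x coordinate exceeds 1

# Advanced indexing copies; a plain slice is a view onto the same storage
picked[0, 0] = 100.0               # does not touch points
tail = points[1:]
tail[0, 0] = -2.0                  # writes through to points[1, 0]

print(picked)
print(filtered)
print(points)
```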

## 2.6 NumPy interoperability 
 PyTorch tensors can be converted to NumPy arrays
 and vice versa efficiently. By doing so, you can leverage the huge swath of functionality
 in the wider Python ecosystem that has built up around the NumPy array type. This zero-copy interoperability with NumPy arrays is due to the storage system that works
 with the Python buffer protocol.

In [157]:
# To get a NumPy array out of your points tensor, call:

points = torch.ones(3, 4) 
points_np = points.numpy() 
points

tensor([[1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.]])

In [158]:
points_np

array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]], dtype=float32)

In [159]:
points_np[0]=4

The numpy method returns a NumPy multidimensional array of the right size, shape, and numerical type. Interestingly, the returned array shares an underlying buffer with the tensor storage. As a result, the numpy method can be executed at essentially no cost as long as the data sits in CPU RAM, and modifying the NumPy array leads to a change in the originating tensor.

If the tensor is allocated on the GPU, its content must first be copied to CPU memory (for example, with the cpu method) before it can be converted to a NumPy array. Conversely, you can obtain a PyTorch tensor from a NumPy array this way:

In [161]:
points = torch.from_numpy(points_np)

In [162]:
points

tensor([[4., 4., 4., 4.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.]])
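
Two details of the conversion are easy to miss: from_numpy keeps the NumPy dtype, and only from_numpy shares memory with the array. The sketch below contrasts it with torch.tensor, which is an addition not used for this purpose in the notebook; the array arr is made-up data.

```python
import numpy as np
import torch

# NumPy defaults to float64, and from_numpy keeps that dtype
arr = np.ones((2, 3))
shared = torch.from_numpy(arr)

# torch.tensor copies the data instead of sharing it
copied = torch.tensor(arr)

arr[0, 0] = 7.0      # modify the NumPy array in place
print(shared.dtype)
print(shared[0, 0].item(), copied[0, 0].item())
```

The shared tensor sees the change made through the array, while the copy keeps the original value.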

## 2.7 Serializing tensors

Creating a tensor on the fly is all well and good, but if the data inside it is of any value
 to you, you want to save it to a file and load it back at some point.
 
PyTorch uses pickle under the hood to serialize the tensor object, as well as
 dedicated serialization code for the storage. 
 
Here’s how you can save your points tensor to an ourpoints.t file:

In [164]:
torch.save(points, 'C:/Users/Haier/ourpoints.t') 

In [163]:
cd

C:\Users\Haier


As an alternative, you can pass an open file object in lieu of the filename:

In [165]:
with open('C:/Users/Haier/ourpoints.t','wb') as f:  
    torch.save(points, f)

Loading back:

In [166]:
points = torch.load('C:/Users/Haier/ourpoints.t')

In [167]:
points

tensor([[4., 4., 4., 4.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.]])

In [169]:
# the equivalent :
with open('C:/Users/Haier/ourpoints.t','rb') as f: 
    points = torch.load(f) 
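
Because save and load accept file-like objects, the round trip can also be tried without touching the disk. The sketch below assumes an in-memory io.BytesIO buffer in place of a real file; the name restored is made up for the example.

```python
import io
import torch

points = torch.ones(3, 4)

buffer = io.BytesIO()          # in-memory stand-in for a file opened with 'wb'
torch.save(points, buffer)

buffer.seek(0)                 # rewind before reading back
restored = torch.load(buffer)

print(torch.equal(points, restored), restored.dtype)
```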

This technique allows you to save tensors quickly in case you only want to load them
 with PyTorch, but the file format itself isn’t interoperable.

When you do need interoperability, you can use the HDF5 format and library. HDF5 is a portable, widely supported format for representing serialized multidimensional arrays, organized in a nested key-value dictionary.

Python supports HDF5 through the h5py library, which accepts and returns data in the form of NumPy arrays.

In [173]:
import h5py

f = h5py.File('C:/Users/Haier/ourpoints.hdf5', 'w') 
dset = f.create_dataset('coords', data=points.numpy()) 
f.close() 

With this format, you can load only the part of the data you need:

In [174]:
f = h5py.File('C:/Users/Haier/ourpoints.hdf5', 'r')
dset = f['coords']

In [175]:
dset

<HDF5 dataset "coords": shape (3, 4), type "<f4">

In [176]:
last_points = dset[1:]

In [177]:
last_points

array([[1., 1., 1., 1.],
       [1., 1., 1., 1.]], dtype=float32)

In [178]:
# convert the NumPy array to a tensor
last_points = torch.from_numpy(last_points)

In [179]:
last_points

tensor([[1., 1., 1., 1.],
        [1., 1., 1., 1.]])

<------------------------------------------- End CH # 2 ---------------------------------->

### Exercise :