ses3B_dotProd

In [1]:
import array as arr
import numpy as np
import time

`array.array` objects are more memory-efficient than lists because they store raw typed values rather than references to Python objects.

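Performance can be measured with two timers from the `time` module:

1) `process_time_ns()`: Measures only the CPU time (user + system) used by the current process; time spent sleeping is not counted.

2) `perf_counter_ns()`: Uses the highest-resolution clock available and measures elapsed wall-clock time, including time spent sleeping or waiting on I/O, which makes it well suited to overall performance measurements.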
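
The difference between the two timers can be seen with a short sleep. This is a minimal sketch; the variable names and the 50 ms sleep are illustrative assumptions, and the exact numbers depend on the machine.

```python
import time

# Read both clocks before and after a sleep of about 50 ms.
cpu_start = time.process_time_ns()
wall_start = time.perf_counter_ns()
time.sleep(0.05)  # the process is idle here, so very little CPU time is used
cpu_elapsed = time.process_time_ns() - cpu_start
wall_elapsed = time.perf_counter_ns() - wall_start

print('CPU time in nsecs:  ', cpu_elapsed)    # small: sleeping is not counted
print('Wall time in nsecs: ', wall_elapsed)   # roughly 50 ms or more
```

The wall-clock reading includes the sleep, while the CPU-time reading stays close to zero.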

In [2]:
# Array is created with the values from 0 to 99999
A = arr.array('d', range(100000)) 
B = arr.array('d', range(100000))
# Classic dot product of vectors in pure Python
Dot = 0.0
# Measure the start and end times in nanoseconds.
#startTime = time.process_time_ns() 
startTime = time.perf_counter_ns() 
for i in range(len(A)): 
    Dot += A[i] * B[i]
      
endTime = time.perf_counter_ns() 

print('Dot product computed without NumPy: ', Dot)
print('Perf count in nsecs without NumPy: ', endTime-startTime)
pyPerfCount = endTime-startTime
print(pyPerfCount)
print(type(endTime))

Dot product computed without NumPy:  333328333350000.0
Perf count in nsecs without NumPy:  18030600
18030600
<class 'int'>


Now measure the time taken to compute the same dot product with `np.array`.

In [3]:
npA = np.array(A)
npB = np.array(B)
startTime = time.perf_counter_ns() 

# Using NumPy copies of the same arrays A and B, compute the dot product with the NumPy function dot()
npDot = np.dot(npA, npB) # Here pointers to the arrays are passed
endTime = time.perf_counter_ns()
print('Type of returned value from np.dot() is ', type(npDot))
print('Dot product computed with NumPy: ', npDot)
print('Perf count in nsecs with NumPy: ', endTime-startTime)
npPerfCount = endTime-startTime

Type of returned value from np.dot() is  <class 'numpy.float64'>
Dot product computed with NumPy:  333328333350000.0
Perf count in nsecs with NumPy:  726800


In [4]:
print("Improvement in performance because of NumPy is ", 
        int(pyPerfCount/npPerfCount))


Improvement in performance because of NumPy is  24


`np.dot` on `np.array` inputs computes the dot product faster than the Python loop over the built-in `array.array` for the following reasons:

1) Vectorization in NumPy:

  The operation is highly optimized and runs in compiled C code, not in the Python interpreter.
  
  Vectorization refers to the ability to apply an operation (like addition or multiplication) to entire arrays without writing explicit loops in Python. In contrast, in a Python loop (like the `for` loop over `array.array` above), the interpreter executes each iteration, which adds overhead for managing the loop, indexing, and creating a Python float object for every element.

2) Low-Level Optimizations in NumPy:

  NumPy's compiled loops can use SIMD (Single Instruction, Multiple Data) instructions and other low-level optimizations to process large chunks of data at once.

3) Memory Contiguity:

  A NumPy array stores its elements in one contiguous, typed block of memory, which the compiled code can read directly and in a cache-friendly order.
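
Memory contiguity can be inspected through an array's `flags` and `strides`. The sketch below uses illustrative names (`vec`, `every_other`) that are not part of the notebook; it only shows how to check the layout, not how NumPy chooses its internal code path.

```python
import numpy as np

vec = np.arange(10, dtype=np.float64)
print(vec.flags['C_CONTIGUOUS'])          # True: elements sit back to back
print(vec.strides)                        # (8,): 8 bytes from one float64 to the next

every_other = vec[::2]                    # a view that skips every second element
print(every_other.flags['C_CONTIGUOUS'])  # False: there are gaps between elements
print(every_other.strides)                # (16,)

print(np.dot(vec, vec))                   # 285.0, computed without a Python loop
```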


`out` parameter: `np.dot()` can write its result into an existing array passed as `out`.

In [5]:
npVal = np.array(0.0)
chkVal = np.dot(A, B, out=npVal)
print('type of npVal is ', type(npVal), ' and npVal = ', npVal)
print('Typeof chkVal is ', type(chkVal), ' and chkVal is ', chkVal)

type of npVal is  <class 'numpy.ndarray'>  and npVal =  333328333350000.0
Typeof chkVal is  <class 'numpy.float64'>  and chkVal is  333328333350000.0
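
The `out` parameter is easier to see with matrices. This is a small sketch with assumed names (`matA`, `matB`, `result`); the preallocated array must already have the shape and dtype that `np.dot` would produce.

```python
import numpy as np

matA = np.array([[1.0, 1.0], [2.0, 2.0]])
matB = np.array([[3.0, 5.0], [4.0, 6.0]])

# Preallocate the destination; np.dot fills it instead of allocating a new array.
result = np.empty((2, 2), dtype=np.float64)
returned = np.dot(matA, matB, out=result)

print(result)                               # the product is now stored in result
print(np.shares_memory(returned, result))   # True: the return value uses the same buffer
```

Reusing one output buffer can avoid repeated allocations when the same product is computed many times.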


In [6]:
npA = np.array([1, 2, 3])
npB = np.array([2, 2, 2])
eleMult = npA * npB
print('eleMult is: ', eleMult)

eleMult is:  [2 4 6]


Using the `@` operator to compute the dot product:

In [7]:
dotProduct = npA @ npB
print('dotProduct is: ', dotProduct)
A = np.array([[1, 1], 
              [2, 2]])
B = np.array([[3, 5], 
              [4, 6]])
npDot = np.dot(A, B)
print(npDot)

dotProduct is:  12
[[ 7 11]
 [14 22]]
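
For 2-D arrays the difference between `*` and `@` is worth spelling out. The sketch below reuses the matrix values from the cell above under the illustrative names `matA` and `matB`.

```python
import numpy as np

matA = np.array([[1, 1],
                 [2, 2]])
matB = np.array([[3, 5],
                 [4, 6]])

print(matA * matB)   # element-wise product: [[3 5] [8 12]]
print(matA @ matB)   # matrix product:       [[7 11] [14 22]]

# For 2-D inputs, @ and np.dot give the same result.
print(np.array_equal(matA @ matB, np.dot(matA, matB)))   # True
```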


Ses3B_dtypes


In [8]:
import sys
import platform

In [9]:
print('Python interpreter version running on Windows PE')
print(platform.architecture())

Python interpreter version running on Windows PE
('64bit', 'WindowsPE')


In [10]:
class X:
    def __init__(self, name):
        self.name = name
    
    def __str__(self):
      return 'I am am instance of class X\n'  + 'My name is ' + self.name

class Y:
    def __init__(self, name):
        self.name = name
    
    def __str__(self):
      return 'I am am instance of class Y\n'  + 'My name is ' + self.name

class Z:
    def __init__(self, name):
        self.name = name
    
    def __str__(self):
      return 'I am am instance of class Z\n'  + 'My name is ' + self.name

objX = X('ObjX')
objY = Y('ObjY')
objZ = Z('ObjZ')

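`np.object_` is a special dtype in NumPy that allows you to store arbitrary Python objects in NumPy arrays.

In the next cell, `objX` (an instance of class `X`) is passed to `np.object_()`. As the output shows, the result is still an instance of `X`: the call returns the object itself rather than a wrapper.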

In [11]:
npObjX = np.object_(objX)
print('\nAbout dtype object')
print('Type: ', type(npObjX))
print('Value: ', npObjX)



About dtype object
Type:  <class '__main__.X'>
Value:  I am am instance of class X
My name is ObjX


The next cell creates a NumPy array `npObjArr` of length 3 with `dtype=object`. This allows arbitrary Python objects (here, the instances of `X`, `Y`, and `Z`) to be stored in the array.

In [12]:
npObjArr = np.ndarray(3, dtype=object)

npObjArr[0] = objX
npObjArr[1] = objY
npObjArr[2] = objZ

`itemsize`: This attribute returns the size (in bytes) of each element in the array. For arrays with `dtype=object`, each element is a pointer to the stored object, so `itemsize` is the size of a pointer, not the size of the object itself. It is platform-dependent: typically 8 bytes (64 bits) on 64-bit systems and 4 bytes (32 bits) on 32-bit systems.

`size`: This attribute returns the total number of elements in the array.

The total size (in bytes) of the array's data can be computed by multiplying `size` by `itemsize`.

In [13]:
print('\nAbout NumPy array of objects')
print('Item size of objXArr is ', npObjArr.itemsize)
print('Number of elements in the npObjXArr is ', npObjArr.size)
print('Total size in bytes, taken by npObjXArr is ', 
      npObjArr.size * npObjArr.itemsize)


About NumPy array of objects
Item size of objXArr is  8
Number of elements in the npObjXArr is  3
Total size in bytes, taken by npObjXArr is  24
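
That `itemsize` is a pointer size, independent of what is stored, can be checked directly. In this sketch `objArr` and `pointer_size` are illustrative names, and `struct.calcsize('P')` is used as one way to ask the platform for its pointer size.

```python
import struct
import sys
import numpy as np

objArr = np.empty(3, dtype=object)
objArr[0] = 'a short string'
objArr[1] = list(range(1000))   # a much larger object
objArr[2] = 3.14

pointer_size = struct.calcsize('P')   # size of a C pointer on this platform
print(objArr.itemsize, pointer_size)  # both are the pointer size
print(objArr.nbytes == objArr.size * objArr.itemsize)   # True
print(sys.getsizeof(objArr[1]))       # the list itself is far larger than one pointer
```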


In [14]:
for i in range(3):
    print('\nDetails of npObjArr[', i, ']', sep='')
    print('Type: ', type(npObjArr[i]))
    print('Value: ', npObjArr[i])


Details of npObjArr[0]
Type:  <class '__main__.X'>
Value:  I am am instance of class X
My name is ObjX

Details of npObjArr[1]
Type:  <class '__main__.Y'>
Value:  I am am instance of class Y
My name is ObjY

Details of npObjArr[2]
Type:  <class '__main__.Z'>
Value:  I am am instance of class Z
My name is ObjZ


Here we use the NumPy `power` function to compute 100^8 with two different data types, `np.int64` and `np.int32`.

The example shows how integer overflow occurs in NumPy with fixed-width integer types: 100^8 = 10^16 exceeds the `int32` maximum of 2^31 - 1 = 2147483647, so the result wraps around.

Python's built-in `int` type is **arbitrary-precision**, meaning it grows dynamically to accommodate large values and does not overflow.

In [15]:
# About Overflows in NumPy which does not happen with Python ints
npPow64 = np.power(100, 8, dtype=np.int64)
print('\nAbout overflows in NumPy')
print('Value of 100^8 using int64', npPow64)
npPow32 = np.power(100, 8, dtype=np.int32)
print('Value of 100^8 using int32', npPow32)



About overflows in NumPy
Value of 100^8 using int64 10000000000000000
Value of 100^8 using int32 1874919424
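
The wrapped `int32` value is not random: it is the exact result reduced to its low 32 bits. The sketch below, with the illustrative names `exact` and `wrapped`, reproduces it with Python's arbitrary-precision integers.

```python
import numpy as np

exact = 100 ** 8                                  # Python int: exactly 10**16
wrapped = np.array([100], dtype=np.int32) ** 8    # int32 arithmetic wraps silently

print(np.iinfo(np.int32).max)   # 2147483647, the largest int32 value
print(exact % 2**32)            # 1874919424, the low 32 bits of the exact result
print(wrapped[0])               # 1874919424, the same value
```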


In [16]:
# 100^100 overflows even a 64-bit int; float64 can hold it, but only approximately
npPow64 = np.power(100, 100, dtype=np.int64) 
print('Value of 100^100 using npPow64', npPow64)
npPowF64 = np.power(100, 100, dtype=np.float64) 
print('Value of 100^100 using npPowF64', npPowF64)

Value of 100^100 using npPow64 0
Value of 100^100 using npPowF64 1e+200
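
Switching to `float64` trades exactness for range. As a rough illustration (the name `big` is an assumption for this sketch), a `float64` has a 53-bit significand, so large integers are rounded:

```python
import numpy as np

print(np.finfo(np.float64).max)   # largest finite float64, about 1.8e308

big = 2 ** 53
# Above 2**53, neighbouring integers can no longer be told apart in float64.
print(np.float64(big) == np.float64(big + 1))         # True

# The float64 result of 100^100 is close to, but not exactly, the integer 10**200.
print(int(np.float64(100.0) ** 100) == 100 ** 100)    # False
```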


**Byte order** (also known as **endianness**)

In the next cell, `'i2'` specifies a signed 2-byte (16-bit) integer. In NumPy type strings, `'i'` stands for a signed integer and the number 2 gives the size in bytes.

`dt.byteorder` gives the byte order (endianness) of the data type.

The `byteorder` attribute can return:

`=`: Native endianness, which means the byte order is the same as the system's architecture (depends on whether your machine uses little-endian or big-endian).

`<`: Little-endian.

`>`: Big-endian.

`|`: Not applicable (for example, single-byte types).

In [17]:
# About byteorder
dt = np.dtype('i2') # A signed integer with two bytes
print('Byte order of two bytes int is: ', dt.byteorder) # Native endianness
print('Item size is ', dt.itemsize)

Byte order of two bytes int is:  =
Item size is  2
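
The byte order can also be requested explicitly in the type string. This sketch uses the helper names `native_char` and `swapped_char`, which are assumptions introduced here so that the example behaves the same on little-endian and big-endian machines.

```python
import sys
import numpy as np

native_char = '<' if sys.byteorder == 'little' else '>'
swapped_char = '>' if sys.byteorder == 'little' else '<'

# An explicit order that matches the machine is reported as native ('=').
print(np.dtype(native_char + 'i2').byteorder)
# An explicit order that differs from the machine is reported as given.
print(np.dtype(swapped_char + 'i2').byteorder)

# The same value 1 laid out in each byte order.
print(np.array([1], dtype='<i2').tobytes())   # b'\x01\x00'
print(np.array([1], dtype='>i2').tobytes())   # b'\x00\x01'
```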


In [18]:
# Learn about the endianness of the machine this is running on
print('Endianess: ', np.dtype(np.uintc).byteorder)
print('Endianess: ', np.dtype(np.ubyte).byteorder)

print('System byteorder: ', sys.byteorder)

intNonNative = np.dtype(np.uintc).newbyteorder('S')
print(type(intNonNative))
print('Non-native byteorder is ', intNonNative.byteorder)
intNative = np.dtype(np.uintc).newbyteorder('S').newbyteorder('S')
print('Native byteorder is ', intNative.byteorder)

Endianess:  =
Endianess:  |
System byteorder:  little
<class 'numpy.dtypes.UIntDType'>
Non-native byteorder is  >
Native byteorder is  <


Ses3B_ndarrays_in_cls

In [19]:
import numpy as np

a = np.array([0, 1, 2, 3])
print(a)




[0 1 2 3]


In [20]:
b = np.array([[0, 1, 2], [3, 4, 5]])    # 2 x 3 array
print(b)

print(b.shape)
print(b.ndim)
print(b[0])
print(b[1])


[[0 1 2]
 [3 4 5]]
(2, 3)
2
[0 1 2]
[3 4 5]


In [21]:
a = np.arange(10) # 0 .. n-1  (!)
print(a)

aFloat = np.array([1, 2, 3], dtype=float)
print(aFloat)

aComplex = np.array([1+2j, 3+4j, 5+6*1j])
print(aComplex)
print(aComplex.dtype)



[0 1 2 3 4 5 6 7 8 9]
[1. 2. 3.]
[1.+2.j 3.+4.j 5.+6.j]
complex128


Ses3B_endianess.py

`id(arr)` returns the identity of the array object `arr`. Each object in Python has an ID that is unique for its lifetime; in CPython it is the object's memory address.

In [22]:
arr = np.array([0, 1, 2, 3])
print(arr)
print("id(arr) = ", id(arr))

[0 1 2 3]
id(arr) =  2338078241200


In [23]:
# Individual elements of a NumPy array can be modified without changing its size
arr[0] = 10
print(arr)
print("id(arr) = ", id(arr))

[10  1  2  3]
id(arr) =  2338078241200


**NumPy arrays** have a fixed size once created. Functions such as `np.append` do not grow the array in place; they return a new array, which is why `id(arr)` changes in the next cell.

In [24]:
# Adding a new element or changing the size of NumPy array creates a new array
arr = np.append(arr, 4)
print(arr)
print("id(arr) = ", id(arr))


[10  1  2  3  4]
id(arr) =  2338078523536
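
That `np.append` builds a separate array can be checked without relying on `id()`. The names `original` and `extended` below are illustrative.

```python
import numpy as np

original = np.array([0, 1, 2, 3])
extended = np.append(original, 4)    # returns a new array; original is untouched

print(original.size, extended.size)              # 4 5
print(np.shares_memory(original, extended))      # False: separate data buffers

original[0] = 10                     # in-place change to the original only
print(original)                      # [10  1  2  3]
print(extended)                      # [0 1 2 3 4]
```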


On this little-endian machine, the number is stored in **little-endian** format, meaning the least significant byte comes first.

In [25]:
# Let us understand the contents of different integer values in NumPy
npMyInt32 = np.int32(5)
print("value in npMyInt32 is ", npMyInt32)
print("Size of npMyInt32 is ", npMyInt32.nbytes)
print("Binary contents of npMyInt32 is ", npMyInt32.tobytes())

value in npMyInt32 is  5
Size of npMyInt32 is  4
Binary contents of npMyInt32 is  b'\x05\x00\x00\x00'
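
The same byte layouts can be produced with plain Python, which makes the two orders easy to compare. This sketch fixes the byte order explicitly (`'<i4'` and `'>i4'`) so the output does not depend on the machine; `value` is an illustrative name.

```python
import numpy as np

value = 5
print(value.to_bytes(4, byteorder='little'))     # b'\x05\x00\x00\x00'
print(value.to_bytes(4, byteorder='big'))        # b'\x00\x00\x00\x05'

# NumPy produces the same bytes when the byte order is stated in the dtype.
print(np.array(value, dtype='<i4').tobytes())    # b'\x05\x00\x00\x00'
print(np.array(value, dtype='>i4').tobytes())    # b'\x00\x00\x00\x05'
```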


Calculation of the bits of `np.int64(-2)` using two's complement:

1) First, find the binary representation of +2 in 64 bits:

`00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000010`

2) Invert all bits:

`11111111 11111111 11111111 11111111 11111111 11111111 11111111 11111101`

3) Add 1:

`11111111 11111111 11111111 11111111 11111111 11111111 11111111 11111110`

`\xfe` is the hex representation of the least significant byte `11111110`.

`\xff` represents `11111111`, which is the value of each of the remaining 7 bytes.

In [26]:
npMyInt64 = np.int64(-2)
print("value in npMyInt64 is ", npMyInt64)
print("Size of npMyInt64 is ", npMyInt64.nbytes)
print("Binary contents of npMyInt64 is ", npMyInt64.tobytes())

value in npMyInt64 is  -2
Size of npMyInt64 is  8
Binary contents of npMyInt64 is  b'\xfe\xff\xff\xff\xff\xff\xff\xff'
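
The two's complement steps (invert, then add 1) can be replayed with Python integers. Python ints have no fixed width, so this sketch masks to 64 bits by hand; `mask`, `inverted`, and `minus_two_bits` are illustrative names.

```python
mask = (1 << 64) - 1           # keep only the low 64 bits

plus_two = 2
inverted = ~plus_two & mask    # step 2: invert all 64 bits
minus_two_bits = (inverted + 1) & mask   # step 3: add 1

print(hex(minus_two_bits))                          # 0xfffffffffffffffe
print(minus_two_bits.to_bytes(8, 'little'))         # b'\xfe\xff\xff\xff\xff\xff\xff\xff'
print((-2).to_bytes(8, 'little', signed=True))      # the same bytes
```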


In [27]:
print(sys.byteorder)

little


Different values of the `byteorder` attribute in NumPy

`byteorder` is an attribute of NumPy data types (`dtype`); for an array, it is read through `arr.dtype.byteorder`.

‘=’ : native 

‘<’ : little-endian

‘>’ : big-endian

‘|’   : not applicable

`i1` represents a signed 8-bit integer in NumPy.

For a single byte (like `i1`), endianness is irrelevant because a single byte is the smallest addressable unit of data. With only one byte, there is nothing to reorder.

In [28]:
# Is endianess relevant for a byte value?
print("Endianess of a byte: ", np.dtype('i1').byteorder)

Endianess of a byte:  |


In [29]:

# Endianness is relevant for a multi-byte value such as int16
print("Endianess of a short or int16: ", np.dtype('i2').byteorder)

Endianess of a short or int16:  =


In [30]:
npMyInt32 = np.int32(8)

In [31]:
print("Value in npMyInt32 is ", npMyInt32)
print("Value in npMyInt32 in hex is ", hex(npMyInt32))

Value in npMyInt32 is  8
Value in npMyInt32 in hex is  0x8


In [32]:
print("Endianess or byteorder of npMyInt32 is ", npMyInt32.dtype.byteorder)
print("Print the byte contents of npMyInt32: ", npMyInt32.tobytes())

Endianess or byteorder of npMyInt32 is  =
Print the byte contents of npMyInt32:  b'\x08\x00\x00\x00'


After reversing the byte order, the 32-bit value, written with the most significant byte first, is:

 `00001000 00000000 00000000 00000000`

 **2^27 = 134217728**

The single `1` bit is at position 27 (counting from the right, starting at position 0), so the value is `2^27`. All the other bits are zeros.



In [33]:

# Change the byteorder of the NumPy variable (in-place is not allowed)
npMyInt32New = npMyInt32.byteswap()
print("Value in npMyInt32New is ", npMyInt32New)
print("Value in npMyInt32New in hex is ", hex(npMyInt32New))

print("Swapped byte contents of npMyInt32New: ", npMyInt32New.tobytes())

Value in npMyInt32New is  134217728
Value in npMyInt32New in hex is  0x8000000
Swapped byte contents of npMyInt32New:  b'\x00\x00\x00\x08'
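
The effect of a byte swap can be cross-checked with `int.from_bytes`. The sketch also shows one way, assumed here for illustration, to keep the original value after swapping: reinterpret the swapped bytes with a dtype of the opposite byte order. The names `arr32`, `swapped`, and `relabelled` are illustrative.

```python
import numpy as np

original_bytes = np.array(8, dtype='<i4').tobytes()     # b'\x08\x00\x00\x00'
print(int.from_bytes(original_bytes, 'little'))         # 8
print(int.from_bytes(original_bytes[::-1], 'little'))   # 134217728, i.e. 2**27

arr32 = np.array([8], dtype=np.int32)
swapped = arr32.byteswap()            # bytes reversed, dtype unchanged: value changes
relabelled = swapped.view(arr32.dtype.newbyteorder('S'))   # dtype order swapped too
print(swapped[0], relabelled[0])      # 134217728 8
```

Swapping the bytes alone changes the value; swapping both the bytes and the dtype's byte order leaves the value as it was.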


In [34]:
# Byteswap in-place is not allowed on a scalar value in NumPy
#npMyInt32.byteswap(inplace=True)

intNonNative = np.dtype(np.uintc).newbyteorder('S')
print('Non-native byteorder is ', intNonNative.byteorder)
intNative = np.dtype(np.uintc).newbyteorder('S').newbyteorder('S')
print('Native byteorder is ', intNative.byteorder)

Non-native byteorder is  >
Native byteorder is  <


Ses3B_in_SecA_cls-.py

In [35]:
import numpy as np
import sys
import platform

In [36]:
print('Python interpreter version running on Windows PE')
print(platform.architecture())


Python interpreter version running on Windows PE
('64bit', 'WindowsPE')


In [37]:
class X:
    def __init__(self, name):
        self.name = name
    
    def __str__(self):
      return 'I am am instance of class X\n'  + 'My name is ' + self.name

class Y:
    def __init__(self, name):
        self.name = name
    
    def __str__(self):
      return 'I am am instance of class Y\n'  + 'My name is ' + self.name

class Z:
    def __init__(self, name):
        self.name = name
    
    def __str__(self):
      return 'I am am instance of class Z\n'  + 'My name is ' + self.name

objX = X('ObjX')
objY = Y('ObjY')
objZ = Z('ObjZ')

print(objX)

I am am instance of class X
My name is ObjX


Storing the objects in an array

In [38]:

npObjArr = np.ndarray(3, dtype=object)
npObjArr[0] = objX
npObjArr[1] = objY
npObjArr[2] = objZ

In [39]:
arr = np.array([0, 1, 2, 3])
print(arr)
print("id(arr) = ", id(arr))

[0 1 2 3]
id(arr) =  2338078244464


In [40]:
# Individual elements of a NumPy array can be modified without changing its size
arr[0] = 10
print(arr)
print("id(arr) = ", id(arr))

[10  1  2  3]
id(arr) =  2338078244464


In [41]:
# Adding a new element or changing the size of NumPy array creates a new array
arr = np.append(arr, 4)
print(arr)
print("id(arr) = ", id(arr))

[10  1  2  3  4]
id(arr) =  2338078523824


In [42]:
# Let us understand the contents of different integer values in NumPy
npMyInt32 = np.int32(5)
print("value in npMyInt32 is ", npMyInt32)
print("Size of npMyInt32 is ", npMyInt32.nbytes)
print("Binary contents of npMyInt32 is ", npMyInt32.tobytes())

value in npMyInt32 is  5
Size of npMyInt32 is  4
Binary contents of npMyInt32 is  b'\x05\x00\x00\x00'


In [43]:
npMyInt64 = np.int64(-2)
print("value in npMyInt64 is ", npMyInt64)
print("Size of npMyInt64 is ", npMyInt64.nbytes)
print("Binary contents of npMyInt64 is ", npMyInt64.tobytes())

value in npMyInt64 is  -2
Size of npMyInt64 is  8
Binary contents of npMyInt64 is  b'\xfe\xff\xff\xff\xff\xff\xff\xff'


Ses3B_in_SecC_cls-.py

In [47]:
import numpy as np
import sys
import platform

print('Python interpreter version running on Windows PE')
print(platform.architecture())

class X:
    def __init__(self, name):
        self.name = name
    
    def __str__(self):
      return 'I am am instance of class X\n'  + 'My name is ' + self.name

class Y:
    def __init__(self, name):
        self.name = name
    
    def __str__(self):
      return 'I am am instance of class Y\n'  + 'My name is ' + self.name

class Z:
    def __init__(self, name):
        self.name = name
    
    def __str__(self):
      return 'I am am instance of class Z\n'  + 'My name is ' + self.name


objX = X('ObjX')
objY = Y('ObjY')
objZ = Z('ObjZ')

npObjX = np.object_(objX)
print('\nAbout dtype object')
print('Type: ', type(npObjX))
print('Value: ', npObjX)

Python interpreter version running on Windows PE
('64bit', 'WindowsPE')

About dtype object
Type:  <class '__main__.X'>
Value:  I am am instance of class X
My name is ObjX


In [49]:
npObjArr = np.ndarray(3, dtype=object)

npObjArr[0] = objX
npObjArr[1] = objY
npObjArr[2] = objZ

`itemsize` typically returns the size of a pointer to an object in memory, not the size of the object itself. This is usually the same size as a reference (or pointer) on the underlying platform (e.g., 4 bytes on a 32-bit system and 8 bytes on a 64-bit system).


In [50]:
print('\nAbout NumPy array of objects')
print('Item size of objXArr is ', npObjArr.itemsize)
print('Number of elements in the npObjXArr is ', npObjArr.size)
print('Total size in bytes, taken by npObjXArr is ', 
      npObjArr.size * npObjArr.itemsize)


About NumPy array of objects
Item size of objXArr is  8
Number of elements in the npObjXArr is  3
Total size in bytes, taken by npObjXArr is  24
