# 100 numpy exercises

This is a collection of exercises gathered from the numpy mailing list, Stack Overflow,
and the numpy documentation. The goal of this collection is to offer a quick reference for both old
and new users, and also to provide a set of exercises for those who teach.


If you find an error or think you have a better way to solve some of them, feel
free to open an issue at <https://github.com/rougier/numpy-100>.

File automatically generated. See the documentation to update questions/answers/hints programmatically.

Run the `initialise.py` module; then, for each question, you can query a
hint or the answer with `hint(n)` or `answer(n)`, where `n` is the question number.

In [1]:
%run initialise.py
hint(1)
answer(1)

import numpy as np
rng = np.random.default_rng(42)

hint: import … as
import numpy as np


#### 1. Import the numpy package under the name `np` (★☆☆)

In [2]:
# import numpy as np

#### 2. Print the numpy version and the configuration (★☆☆)

In [3]:
np.__version__

'2.3.4'

In [4]:
np.show_config()

Build Dependencies:
  blas:
    detection method: pkgconfig
    found: true
    include directory: c:/Users/sk/miniforge3/Library/include
    lib directory: c:/Users/sk/miniforge3/Library/lib
    name: blas
    openblas configuration: unknown
    pc file directory: c:/Users/sk/miniforge3/Library/lib/pkgconfig
    version: 3.9.0
  lapack:
    detection method: pkgconfig
    found: true
    include directory: c:/Users/sk/miniforge3/Library/include
    lib directory: c:/Users/sk/miniforge3/Library/lib
    name: lapack
    openblas configuration: unknown
    pc file directory: c:/Users/sk/miniforge3/Library/lib/pkgconfig
    version: 3.9.0
Compilers:
  c:
    commands: cl.exe
    linker: link
    name: msvc
    version: 19.44.35217
  c++:
    commands: cl.exe
    linker: link
    name: msvc
    version: 19.44.35217
  cython:
    commands: cython
    linker: cython
    name: cython
    version: 3.1.5
Machine Information:
  build:
    cpu: x86_64
    endian: little
    family: x86_64
    sys

#### 3. Create a null vector of size 10 (★☆☆)

In [5]:
arr = np.zeros(10) # OR np.zeros( (10) ) / np.zeros( shape=(10,) )
arr

array([0., 0., 0., 0., 0., 0., 0., 0., 0., 0.])

#### 4. How to find the total memory size of any array (★☆☆)

In [6]:
arr.nbytes  # total bytes = arr.size (number of elements) * arr.itemsize (bytes per element)
# don't use sys.getsizeof(arr[0]): it reports the size of a new Python scalar wrapper object, not of the array data

80

#### 5. How to get the documentation of the numpy add function from the command line? (★☆☆)

In [7]:
print(np.add.__doc__)

add(x1, x2, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature])

Add arguments element-wise.

Parameters
----------
x1, x2 : array_like
    The arrays to be added.
    If ``x1.shape != x2.shape``, they must be broadcastable to a common
    shape (which becomes the shape of the output).
out : ndarray, None, or tuple of ndarray and None, optional
    A location into which the result is stored. If provided, it must have
    a shape that the inputs broadcast to. If not provided or None,
    a freshly-allocated array is returned. A tuple (possible only as a
    keyword argument) must have length equal to the number of outputs.
where : array_like, optional
    This condition is broadcast over the input. At locations where the
    condition is True, the `out` array will be set to the ufunc result.
    Elsewhere, the `out` array will retain its original value.
    Note that if an uninitialized `out` array is created via the default
    ``out=None``,

In [8]:
np.info(np.add)

add(x1, x2, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature])

Add arguments element-wise.

Parameters
----------
x1, x2 : array_like
    The arrays to be added.
    If ``x1.shape != x2.shape``, they must be broadcastable to a common
    shape (which becomes the shape of the output).
out : ndarray, None, or tuple of ndarray and None, optional
    A location into which the result is stored. If provided, it must have
    a shape that the inputs broadcast to. If not provided or None,
    a freshly-allocated array is returned. A tuple (possible only as a
    keyword argument) must have length equal to the number of outputs.
where : array_like, optional
    This condition is broadcast over the input. At locations where the
    condition is True, the `out` array will be set to the ufunc result.
    Elsewhere, the `out` array will retain its original value.
    Note that if an uninitialized `out` array is created via the default
    ``out=None``,

#### 6. Create a null vector of size 10 but the fifth value which is 1 (★☆☆)

In [9]:
arr = np.zeros(10)
arr[4] = 1
arr

array([0., 0., 0., 0., 1., 0., 0., 0., 0., 0.])

#### 7. Create a vector with values ranging from 10 to 49 (★☆☆)

In [10]:
arr = np.arange(10,50,1)
arr

array([10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26,
       27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43,
       44, 45, 46, 47, 48, 49])

#### 8. Reverse a vector (first element becomes last) (★☆☆)

In [11]:
arr[::-1]

array([49, 48, 47, 46, 45, 44, 43, 42, 41, 40, 39, 38, 37, 36, 35, 34, 33,
       32, 31, 30, 29, 28, 27, 26, 25, 24, 23, 22, 21, 20, 19, 18, 17, 16,
       15, 14, 13, 12, 11, 10])

#### 9. Create a 3x3 matrix with values ranging from 0 to 8 (★☆☆)

In [12]:
np.arange(0,9).reshape(3,3)

array([[0, 1, 2],
       [3, 4, 5],
       [6, 7, 8]])

#### 10. Find indices of non-zero elements from [1,2,0,0,4,0] (★☆☆)

In [13]:
arr = np.array([1,2,0,0,4,0])
print(  list(filter(lambda x: x!=0, arr))   ) # it will be list, less efficient
# or
print(  np.nonzero(arr) )   # its index

print(  arr[np.nonzero(arr)] )
print(  arr[arr!=0] )   # a boolean mask: [ True  True False False  True False]

[np.int64(1), np.int64(2), np.int64(4)]
(array([0, 1, 4]),)
[1 2 4]
[1 2 4]


#### 11. Create a 3x3 identity matrix (★☆☆)

In [14]:
np.eye(3)   # k=-1 would place the ones on the diagonal one line below the main diagonal

array([[1., 0., 0.],
       [0., 1., 0.],
       [0., 0., 1.]])

#### 12. Create a 3x3x3 array with random values (★☆☆)

In [15]:
# Modern approach
rng = np.random.default_rng(42)
rng.random((3,3,3))             # Uniform [0,1)
# rng.standard_normal((3, 3, 3))    # Standard normal

array([[[0.77395605, 0.43887844, 0.85859792],
        [0.69736803, 0.09417735, 0.97562235],
        [0.7611397 , 0.78606431, 0.12811363]],

       [[0.45038594, 0.37079802, 0.92676499],
        [0.64386512, 0.82276161, 0.4434142 ],
        [0.22723872, 0.55458479, 0.06381726]],

       [[0.82763117, 0.6316644 , 0.75808774],
        [0.35452597, 0.97069802, 0.89312112],
        [0.7783835 , 0.19463871, 0.466721  ]]])

In [16]:
# Or simply use this
np.random.random((3,3,3)) # OR np.random.rand(3,3,3)

array([[[0.32991942, 0.39675202, 0.2943448 ],
        [0.6947414 , 0.16322833, 0.82272343],
        [0.54105884, 0.96363128, 0.2550003 ]],

       [[0.09744535, 0.53312283, 0.01536884],
        [0.61717569, 0.93755122, 0.34631293],
        [0.6033785 , 0.16729445, 0.72719677]],

       [[0.43282168, 0.27446915, 0.94527791],
        [0.92774941, 0.72428457, 0.50287336],
        [0.28867366, 0.21442997, 0.52836935]]])

## So, re-use `rng`

```python
rng = np.random.default_rng(42)
rng.random_function(size_in_tuple)   # placeholder for any Generator method, e.g. rng.random((3, 3))
```

In [17]:
# Re-create the generator with a fixed seed to start fresh and get the same random output every time
rng = np.random.default_rng(42)   # seeded; with no seed the output differs on each run
# print(rng.random((3, 4)))  # same output every time the cell is re-run from here
rng

Generator(PCG64) at 0x1B0BA293BC0
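
A quick check of the seeding behaviour described above. This is only a minimal sketch: the names `rng_a`, `rng_b`, `first`, `replay` and `second` are illustrative and not part of the original notebook.

```python
import numpy as np

# Two generators built from the same seed start from the same state
rng_a = np.random.default_rng(42)
rng_b = np.random.default_rng(42)

first = rng_a.random(3)    # first draw from rng_a
replay = rng_b.random(3)   # first draw from rng_b: identical values
second = rng_a.random(3)   # second draw from rng_a: the stream has advanced

print(np.array_equal(first, replay))   # True
print(np.array_equal(first, second))   # False
```

So a seeded generator is reproducible across fresh starts, but successive calls on the same generator keep producing new values.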

#### 13. Create a 10x10 array with random values and find the minimum and maximum values (★☆☆)

In [18]:
# arr = np.random.randint(1,100,(10,10))
arr = rng.integers(1,100,(10,10))   # 100 is exclusive
arr

array([[ 9, 77, 65, 44, 43, 86,  9, 70, 20, 10],
       [53, 97, 73, 76, 72, 78, 51, 13, 84, 45],
       [50, 37, 19, 92, 78, 64, 40, 82, 54, 44],
       [45, 23, 10, 55, 88,  7, 85, 82, 28, 63],
       [17, 76, 70, 36,  7, 97, 45, 89, 68, 78],
       [76, 20, 37, 47, 50,  5, 55, 16, 74, 68],
       [92, 74, 37, 96, 41, 33, 90, 37,  8, 47],
       [79, 19, 46, 13, 68, 48, 33, 23, 56, 67],
       [94, 44, 16, 83, 63, 70, 10, 31, 77, 83],
       [44, 80, 84, 39, 89, 29, 24, 68, 64, 14]])

In [19]:
# for reduction operations both work but arr.method() is recommended
arr.min() , np.max(arr)

(np.int64(5), np.int64(97))

In [20]:
np.max(arr) == 100   # np.False_ treated as False

np.False_

In [21]:
np.True_ == True, np.True_ is True

(np.True_, False)

#### 14. Create a random vector of size 30 and find the mean value (★☆☆)

In [22]:
# np.random.rand(30,30).mean()

rng.random((30,30)).mean()    # axis=None; note this averages a 30x30 matrix, for a vector of size 30 use rng.random(30).mean()

np.float64(0.49492976908403896)

#### 15. Create a 2d array with 1 on the border and 0 inside (★☆☆)

In [23]:
arr = np.ones((5,10))
arr

array([[1., 1., 1., 1., 1., 1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1., 1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1., 1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1., 1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1., 1., 1., 1., 1., 1.]])

In [24]:
arr[1:-1, 1:-1] = 0     # arr[rowsindexes, colmsindexes] or just arr[rowsindexes]
arr

array([[1., 1., 1., 1., 1., 1., 1., 1., 1., 1.],
       [1., 0., 0., 0., 0., 0., 0., 0., 0., 1.],
       [1., 0., 0., 0., 0., 0., 0., 0., 0., 1.],
       [1., 0., 0., 0., 0., 0., 0., 0., 0., 1.],
       [1., 1., 1., 1., 1., 1., 1., 1., 1., 1.]])

#### 16. How to add a border (filled with 0's) around an existing array? (★☆☆)
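The reverse of the pattern above: zeros on the border, ones inside.

In [25]:
ar = np.ones((5,10))    # first approach: overwrite the outer rows and columns in place; np.pad can instead add a new border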
ar

array([[1., 1., 1., 1., 1., 1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1., 1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1., 1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1., 1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1., 1., 1., 1., 1., 1.]])

In [26]:
ar[:, [0,-1]] = 0
ar

array([[0., 1., 1., 1., 1., 1., 1., 1., 1., 0.],
       [0., 1., 1., 1., 1., 1., 1., 1., 1., 0.],
       [0., 1., 1., 1., 1., 1., 1., 1., 1., 0.],
       [0., 1., 1., 1., 1., 1., 1., 1., 1., 0.],
       [0., 1., 1., 1., 1., 1., 1., 1., 1., 0.]])

In [27]:
ar[[0,-1], :] = 0
ar

array([[0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 1., 1., 1., 1., 1., 1., 1., 1., 0.],
       [0., 1., 1., 1., 1., 1., 1., 1., 1., 0.],
       [0., 1., 1., 1., 1., 1., 1., 1., 1., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.]])

In [28]:
# OR with np.pad: shape (3, 8) grows by 1 on every side to (5, 10)
np.pad(np.ones((3,8)), pad_width=1, mode='constant', constant_values=0)

array([[0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 1., 1., 1., 1., 1., 1., 1., 1., 0.],
       [0., 1., 1., 1., 1., 1., 1., 1., 1., 0.],
       [0., 1., 1., 1., 1., 1., 1., 1., 1., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.]])

#### 17. What is the result of the following expression? (★☆☆)
```python
0 * np.nan
np.nan == np.nan
np.inf > np.nan
np.nan - np.nan
np.nan in set([np.nan])
0.3 == 3 * 0.1
```

In [29]:
a = np.array([1,2,np.nan,3])
b = np.array([1,2,np.nan,3])

a is b

False

In [30]:
np.astype(np.array([np.nan]), np.int8)  # position-only arg


array([0], dtype=int8)

In [31]:
print(0 * np.nan)       # nan: arithmetic with nan propagates nan
print(np.nan == np.nan) # False: nan is not equal to anything, itself included (only != returns True)
print(np.inf > np.nan)  # False: ordering comparisons with nan are always False
print(np.nan - np.nan)  # nan
print(np.nan in set([np.nan]))  # True: membership tests check identity first, and np.nan is a single object
print(0.3 == 3 * 0.1)   # False: tenths such as 0.1, 0.2, 0.3, 0.4 have no exact binary representation,
# so 3 * 0.1 is not exactly 0.3. Only fractions whose denominator is a power of two are exact,
# e.g. 0.5 = 5/10 = 1/2 and 0.25 = 25/100 = 1/4

nan
False
False
nan
True
False
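
Since `==` is unreliable for both `nan` and rounded floats, the usual workaround is to test for them explicitly. A minimal sketch, assuming the two small arrays from the earlier cell; `nan_mask`, `strict_equal`, `nan_aware_equal` and `close_enough` are illustrative names.

```python
import math
import numpy as np

a = np.array([1.0, 2.0, np.nan, 3.0])
b = np.array([1.0, 2.0, np.nan, 3.0])

nan_mask = np.isnan(a)                                   # locate nan explicitly
strict_equal = np.array_equal(a, b)                      # nan != nan, so this fails
nan_aware_equal = np.array_equal(a, b, equal_nan=True)   # treat nan positions as equal
close_enough = math.isclose(0.3, 3 * 0.1)                # tolerance-based float comparison

print(nan_mask)                                      # [False False  True False]
print(strict_equal, nan_aware_equal, close_enough)   # False True True
```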


#### 18. Create a 5x5 matrix with values 1,2,3,4 just below the diagonal (★☆☆)

In [32]:
np.diag(v=[1,2,3,4], k=-1)      # np.diag constructs a diagonal array from 1D input and extracts a diagonal from 2D input

array([[0, 0, 0, 0, 0],
       [1, 0, 0, 0, 0],
       [0, 2, 0, 0, 0],
       [0, 0, 3, 0, 0],
       [0, 0, 0, 4, 0]])

In [33]:
x = np.array([  [1,2],
                [3,4]]  )
np.diag(x)  # a k that falls outside the matrix returns an empty array

array([1, 4])

#### 19. Create a 8x8 matrix and fill it with a checkerboard pattern (★☆☆)

In [34]:
arr = np.zeros((8,8))
arr

array([[0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0.]])

In [35]:
arr[::2, 1::2] = 1
arr[1::2, ::2] = 1
arr

array([[0., 1., 0., 1., 0., 1., 0., 1.],
       [1., 0., 1., 0., 1., 0., 1., 0.],
       [0., 1., 0., 1., 0., 1., 0., 1.],
       [1., 0., 1., 0., 1., 0., 1., 0.],
       [0., 1., 0., 1., 0., 1., 0., 1.],
       [1., 0., 1., 0., 1., 0., 1., 0.],
       [0., 1., 0., 1., 0., 1., 0., 1.],
       [1., 0., 1., 0., 1., 0., 1., 0.]])

In [36]:
M = np.fromfunction(lambda i, j: (i + j) % 2, (8, 8), dtype=int) # i-th row, j-th col, dtype of i&j is int
M

array([[0, 1, 0, 1, 0, 1, 0, 1],
       [1, 0, 1, 0, 1, 0, 1, 0],
       [0, 1, 0, 1, 0, 1, 0, 1],
       [1, 0, 1, 0, 1, 0, 1, 0],
       [0, 1, 0, 1, 0, 1, 0, 1],
       [1, 0, 1, 0, 1, 0, 1, 0],
       [0, 1, 0, 1, 0, 1, 0, 1],
       [1, 0, 1, 0, 1, 0, 1, 0]])

#### 20. Consider a (6,7,8) shape array, what is the index (x,y,z) of the 100th element? (★☆☆)

In [37]:
np.unravel_index(99,(6,7,8))    # flat index 99 is the 100th element; also useful with np.argmax(), which returns the index of the largest value as a single flat number

(np.int64(1), np.int64(5), np.int64(3))
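
The result can be cross-checked with the inverse function and with the row-major arithmetic done by hand. A small sketch; `shape`, `idx`, `flat` and `manual` are illustrative names.

```python
import numpy as np

shape = (6, 7, 8)
idx = np.unravel_index(99, shape)          # flat index -> (x, y, z)
flat = np.ravel_multi_index(idx, shape)    # (x, y, z) -> flat index

# Same arithmetic by hand for C (row-major) order
manual = idx[0] * (7 * 8) + idx[1] * 8 + idx[2]

print(tuple(int(i) for i in idx), int(flat), int(manual))   # (1, 5, 3) 99 99
```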

#### 21. Create a checkerboard 8x8 matrix using the tile function (★☆☆)

In [38]:
arr = np.fromfunction(lambda i,j: (i+j)%2, (8,8))
arr

array([[0., 1., 0., 1., 0., 1., 0., 1.],
       [1., 0., 1., 0., 1., 0., 1., 0.],
       [0., 1., 0., 1., 0., 1., 0., 1.],
       [1., 0., 1., 0., 1., 0., 1., 0.],
       [0., 1., 0., 1., 0., 1., 0., 1.],
       [1., 0., 1., 0., 1., 0., 1., 0.],
       [0., 1., 0., 1., 0., 1., 0., 1.],
       [1., 0., 1., 0., 1., 0., 1., 0.]])

In [39]:
base_pattern = np.array([[0, 1], [1, 0]])
np.tile(base_pattern, (4, 4))

array([[0, 1, 0, 1, 0, 1, 0, 1],
       [1, 0, 1, 0, 1, 0, 1, 0],
       [0, 1, 0, 1, 0, 1, 0, 1],
       [1, 0, 1, 0, 1, 0, 1, 0],
       [0, 1, 0, 1, 0, 1, 0, 1],
       [1, 0, 1, 0, 1, 0, 1, 0],
       [0, 1, 0, 1, 0, 1, 0, 1],
       [1, 0, 1, 0, 1, 0, 1, 0]])

#### 22. Normalize a 5x5 random matrix (★☆☆)

In [40]:
# arr = np.random.random_integers(1,100,(5,5))
arr = rng.integers(1,100,(5,5))
arr

array([[53, 53, 72, 25, 33],
       [25, 30, 16,  4, 94],
       [ 8, 88,  1, 77, 33],
       [52, 28, 49, 45, 53],
       [39, 54, 82, 44, 36]])

In [41]:
( arr - np.mean(arr) ) / np.std(arr)

array([[ 0.37450291,  0.37450291,  1.14458465, -0.76035439, -0.43610945],
       [-0.76035439, -0.5577013 , -1.12512995, -1.61149736,  2.03625824],
       [-1.44937489,  1.79307453, -1.73308922,  1.34723774, -0.43610945],
       [ 0.33397229, -0.63876254,  0.21238044,  0.05025797,  0.37450291],
       [-0.19292574,  0.41503353,  1.54989083,  0.00972735, -0.31451759]])

In [42]:
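# Not a normalization of arr: this draws new samples from N(0, 1),
# so their sample mean and std are only approximately 0 and 1
rng.normal(loc=0, scale=1, size=(5,5))
# rng.standard_normal((5,5))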

array([[ 0.00601668,  0.44832407,  1.16530754,  1.647394  ,  0.30962008],
       [ 0.58954689, -1.15086451, -0.08787674,  0.94028946,  0.86596864],
       [ 0.21160973,  0.88639396,  0.49076671,  1.20030626,  0.28935916],
       [-0.35569832,  0.33584126, -2.93059438,  0.38288574, -3.64841283],
       [-1.72346341,  0.4517686 ,  0.47752934, -1.16242878, -0.71210204]])
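
To make explicit what normalizing guarantees, the sketch below checks the z-score version and shows min-max rescaling as a second common reading of "normalize". The seed `0` and the names `standardized` and `rescaled` are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)        # seed chosen only for reproducibility
Z = rng.integers(1, 100, (5, 5))

standardized = (Z - Z.mean()) / Z.std()          # z-score: mean 0, std 1
rescaled = (Z - Z.min()) / (Z.max() - Z.min())   # min-max: values in [0, 1]

print(np.isclose(standardized.mean(), 0.0), np.isclose(standardized.std(), 1.0))
print(rescaled.min(), rescaled.max())
```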

#### 23. Create a custom dtype that describes a color as four unsigned bytes (RGBA) (★☆☆)

In [43]:
mera_datatype = np.dtype(
    [("r", np.ubyte),   # unsigned byte: 1 byte = 8 bits, so 2^8 = 256 values (0 to 255); 255 is full intensity
    ("g", np.ubyte),
    ("b", np.ubyte),
    ("a", np.ubyte)]    # alpha: 0 is fully transparent, 255 is opaque
)
mera_datatype

dtype([('r', 'u1'), ('g', 'u1'), ('b', 'u1'), ('a', 'u1')])

In [44]:
# Create an array with this dtype
color_arr = np.array([(255, 0, 0, 255), (0, 255, 0, 128)], dtype=mera_datatype)

# Access a specific field
print(color_arr['r'])  # Output: [255, 0]

[255   0]


#### 24. Multiply a 5x3 matrix by a 3x2 matrix (real matrix product) (★☆☆)

In [45]:
# mat1 = np.random.randint(1,10,(5,3))
# mat2 = np.random.randint(2,20,(3,2))

mat1 = rng.integers(1,10,(5,3))
mat2 = rng.integers(2,20,(3,2))

np.matmul(mat1,mat2)    # OR mat1 @ mat2, in 2d dot product is also same

array([[145, 143],
       [157, 125],
       [178,  81],
       [242, 174],
       [171,  62]])

#### 25. Given a 1D array, negate all elements which are between 3 and 8, in place. (★☆☆)

In [46]:
# arr = np.random.randint(1,10,10)

arr = rng.integers(1,10,10)
print(arr)
arr[(3 <= arr) & (arr <= 8)] *= -1  # boolean mask
print(arr)

[5 9 3 3 8 1 4 5 8 2]
[-5  9 -3 -3 -8  1 -4 -5 -8  2]


#### 26. What is the output of the following script? (★☆☆)
```python
# Author: Jake VanderPlas

print(sum(range(5),-1))
from numpy import *
print(sum(range(5),-1))
```

In [47]:
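print(sum(range(5),-1)) # 9: builtin sum, -1 is the start value
# from numpy import *   # not recommended
print(sum(range(5),-1)) # would be 10 after the star import (np.sum treats -1 as axis); still 9 here because the import is commented out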

9
9
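
The difference can be reproduced without the star import by calling each function under its own name. A minimal sketch; `builtin_result` and `numpy_result` are illustrative names.

```python
import builtins
import numpy as np

builtin_result = builtins.sum(range(5), -1)   # second argument is the start value: (0+1+2+3+4) - 1
numpy_result = np.sum(range(5), -1)           # second argument is the axis: sum over the last axis

print(builtin_result)      # 9
print(int(numpy_result))   # 10
```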


#### 27. Consider an integer vector Z, which of these expressions are legal? (★☆☆)
```python
Z**Z
2 << Z >> 2
Z <- Z
1j*Z
Z/1/1
Z<Z>Z
```

In [48]:
Z = rng.integers(-10,12,10)
print(Z)
# print(  Z**Z  )            # Z to the power Z: raises ValueError here, integers to negative integer powers are not allowed
# print(  2 << Z >> 2  )     # parsed as (2 << Z) >> 2, i.e. a×2^n for a left shift by n; negative shift counts give no meaningful result
print(  Z <- Z  )         # same as Z < -Z
print(  1j*Z  )           # e.g. -2 and -10 become -0.-2.j and -0.-10.j
print(  Z/1/1 )           # (Z/1)/1
# print(  Z<Z>Z )         # same as (Z < Z) and (Z > Z); `and` is invalid for arrays:
                # The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
mask = (Z > -5) & (Z < 5)
print(mask.all())

[-1 -7  7  3  0 -2 -7 10  4 -3]
[ True  True False False False  True  True False False  True]
[-0. -1.j -0. -7.j  0. +7.j  0. +3.j  0. +0.j -0. -2.j -0. -7.j  0.+10.j
  0. +4.j -0. -3.j]
[-1. -7.  7.  3.  0. -2. -7. 10.  4. -3.]
False


#### 28. What are the result of the following expressions? (★☆☆)
```python
np.array(0) / np.array(0)
np.array(0) // np.array(0)
np.array([np.nan]).astype(int).astype(float)
np.array([1]) / np.array([-0.0])
```

In [49]:
# every operation below emits a RuntimeWarning
a = np.array(0) / np.array(0)
print(a)    # nan (np.float64)
b = np.array(0) // np.array(0)
print(b)    # 0 (np.int64)
c = np.array([np.nan]).astype(int) #.astype(float)
print(c)    # -9223372036854775808, the minimum int64
# Casting nan, inf, or out-of-range floats to int is not well defined; the result is platform dependent
# (here int64 yields its minimum, while the int8 cast in In [30] yielded 0)
print(  np.array([1]) / np.array([0.0])     )    # array([inf])
print(  np.array([1]) / np.array([-0.0])    )    # array([-inf]): 0.0 == -0.0, but the sign of zero still matters

nan
0
[-9223372036854775808]
[inf]
[-inf]



#### 29. How to round a float array, away from zero? (★☆☆)

###### like [-2.3, -1.1, 0.2, 1.4] -> [-3. -2.  1.  2.]

In [50]:
Z = rng.uniform(-10, 10, size=5)    # uniform gives float
print(Z)

print(np.copysign(np.ceil(np.abs(Z)), Z))           # np.copysign(magnitude, sign_source)

# More readable but less efficient
# print(np.where(Z>0 ))   # with one argument, returns the indices where Z > 0 (like np.nonzero), not the boolean mask
print(np.where(Z>0, np.ceil(Z), np.floor(Z)))

print("WRONG WAY")
print(np.trunc(Z))
print(np.floor(Z))

[ 4.52360257 -0.7780114   3.19902602  1.99390713 -0.54432054]
[ 5. -1.  4.  2. -1.]
[ 5. -1.  4.  2. -1.]
WRONG WAY
[ 4. -0.  3.  1. -0.]
[ 4. -1.  3.  1. -1.]


#### 30. How to find common values between two arrays? (★☆☆)

In [51]:
a1 = np.array([2,10,33,22,1,0,23,1,-1])
print(a1)
a2 = np.array([20,1,3,2,8,00,2,1,-1])
print(a2)

print(np.intersect1d(a1,a2))

[ 2 10 33 22  1  0 23  1 -1]
[20  1  3  2  8  0  2  1 -1]
[-1  0  1  2]


#### 31. How to ignore all numpy warnings (not recommended)? (★☆☆)

In [52]:
np.geterr()

{'divide': 'warn', 'over': 'warn', 'under': 'ignore', 'invalid': 'warn'}

In [53]:
# change error handling behavior and store previous settings
previous_settings = np.seterr(all='raise')
previous_settings

{'divide': 'warn', 'over': 'warn', 'under': 'ignore', 'invalid': 'warn'}

In [54]:
# Back to sanity
np.seterr(**previous_settings)

{'divide': 'raise', 'over': 'raise', 'under': 'raise', 'invalid': 'raise'}

In [55]:
# Equivalently with a context manager
with np.errstate(all="ignore"):
    np.arange(3) / 0

In [56]:
import warnings
# warnings.filterwarnings("ignore") would silence every warning from every library (NumPy, pandas, sklearn, etc.) in one go
# valid actions: "error", "ignore", "always", "default", "module", or "once"
warnings.filterwarnings("default")  # "default": show the first occurrence of each warning per location

In [57]:
Z = np.ones(1) / 0
Z


array([inf])
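
The context manager is usually the safer choice because it restores the previous settings on exit. A small sketch that checks this; `before`, `inside` and `after` are illustrative names.

```python
import numpy as np

before = np.geterr()
with np.errstate(all="ignore"):
    inside = np.geterr()     # every category is set to 'ignore' here
    Z = np.ones(1) / 0       # no warning is emitted
after = np.geterr()          # settings are back to what they were

print(set(inside.values()))   # {'ignore'}
print(after == before)        # True
```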

#### 32. Is the following expressions true? (★☆☆)
```python
np.sqrt(-1) == np.emath.sqrt(-1)
```

In [58]:
np.sqrt(-1) == np.emath.sqrt(-1)    # left side is nan (with a RuntimeWarning), right side is 1j


np.False_

#### 33. How to get the dates of yesterday, today and tomorrow? (★☆☆)

In [59]:
import datetime
tode = datetime.date.today()    # or datetime

yesterday = tode - datetime.timedelta(days=1)
tomorrow = tode + datetime.timedelta(days=1)

print(yesterday)
print(tode)
print(tomorrow)

2026-01-28
2026-01-29
2026-01-30


In [60]:
# or using np

yesterday = np.datetime64('today') - np.timedelta64(1)      # coercion ie forced/automatic type conversion to 'D'
tode      = np.datetime64('today')
tomorrow  = np.datetime64('today') + np.timedelta64(1, 'D')

print(yesterday, tode, tomorrow, sep='\n')  # numpy.datetime64

2026-01-28
2026-01-29
2026-01-30


##### Other supported units
```
Y  (years)
M  (months)
W  (weeks)
D  (days)
h  (hours)
m  (minutes)
s  (seconds)
ms (milliseconds)
us (microseconds)
ns (nanoseconds)
ps (picoseconds)
fs (femtoseconds)
as (attoseconds)
```

In [61]:
# Fixed-length units (W, D, h, m, s and smaller) are safe to mix.
# Variable-length units: year (Y) and month (M).

# The rule: fixed and variable units cannot be combined directly, because a year or a month has no single length in days.

# Workaround: cast the year to days explicitly (the cast gives 365 days), then add
year_in_days = np.timedelta64(1, 'Y').astype('timedelta64[D]')
result = year_in_days + np.timedelta64(1, 'D') 
# result is 366 days
print(result.dtype)

timedelta64[D]


#### 34. How to get all the dates corresponding to the month of July 2016? (★★☆)

In [62]:
# print(np.datetime64.__doc__)
np.datetime64(30, 'D')  # from 1StJan1970 add 30 days

np.datetime64('1970-01-31')

In [63]:
start = np.datetime64('2025-08-01') # ISO format YYYY-MM-DD; datetime64 counts from the epoch 1970-01-01T00:00:00 (UTC)
end   = np.datetime64('2025-09-01')

dates = np.arange(start, end, dtype='datetime64[D]')    # [D] - days.. ms , W
dates

array(['2025-08-01', '2025-08-02', '2025-08-03', '2025-08-04',
       '2025-08-05', '2025-08-06', '2025-08-07', '2025-08-08',
       '2025-08-09', '2025-08-10', '2025-08-11', '2025-08-12',
       '2025-08-13', '2025-08-14', '2025-08-15', '2025-08-16',
       '2025-08-17', '2025-08-18', '2025-08-19', '2025-08-20',
       '2025-08-21', '2025-08-22', '2025-08-23', '2025-08-24',
       '2025-08-25', '2025-08-26', '2025-08-27', '2025-08-28',
       '2025-08-29', '2025-08-30', '2025-08-31'], dtype='datetime64[D]')
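
The cell above lists August 2025; for the month the exercise actually asks about, the same `np.arange` call works with month-precision strings. A minimal sketch; `july_2016` is an illustrative name.

```python
import numpy as np

# Month strings are promoted to day precision by the dtype argument;
# the end point ('2016-08') is excluded, as with any arange
july_2016 = np.arange('2016-07', '2016-08', dtype='datetime64[D]')

print(july_2016.size)                # 31
print(july_2016[0], july_2016[-1])   # 2016-07-01 2016-07-31
```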

#### 35. How to compute ((A+B)*(-A/2)) in place (without copy)? (★★☆)

In [64]:
A = np.ones(3)
B = np.ones(3)*2
np.add(A,B,out=B)
np.divide(A,2,out=A)
np.negative(A,out=A)
np.multiply(A,B,out=A)
A

array([-1.5, -1.5, -1.5])
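
As a sanity check, the in-place sequence can be compared against the plain expression, which allocates temporaries. This is only a sketch; `expected` and `A_alias` are illustrative names.

```python
import numpy as np

A = np.ones(3)
B = np.ones(3) * 2
expected = (A + B) * (-A / 2)   # plain expression, computed before A and B are overwritten
A_alias = A                     # second reference to the same buffer

np.add(A, B, out=B)        # B <- A + B
np.divide(A, 2, out=A)     # A <- A / 2
np.negative(A, out=A)      # A <- -A
np.multiply(A, B, out=A)   # A <- A * B

print(np.array_equal(A, expected))   # True
print(A_alias)                       # the alias sees the result, so A was modified in place
```

Note that `B` is overwritten as well, which is the price of avoiding temporaries.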

#### 36. Extract the integer part of a random array of positive numbers using 4 different methods (★★☆)

In [65]:
arr = rng.uniform(1, 10, 5)
arr

array([9.55054985, 4.08442278, 2.59511951, 7.10284655, 8.61406805])

In [66]:
print(  np.floor(arr)   )
print(  np.trunc(arr)   )   # np.fix(arr)
print(  arr // 1        )
print(  arr.astype(int) )   # np.astype(arr, int)
print(  arr - arr%1     )

[9. 4. 2. 7. 8.]
[9. 4. 2. 7. 8.]
[9. 4. 2. 7. 8.]
[9 4 2 7 8]
[9. 4. 2. 7. 8.]


#### 37. Create a 5x5 matrix with row values ranging from 0 to 4 (★★☆)

In [67]:
Z = np.zeros((5,5))
Z += np.arange(5)   #  Broadcast
print(Z)

# without broadcasting
Z = np.tile(np.arange(0, 5), (5,1)) # Construct an array by repeating A the number of times given by reps.
                        # 5 → repeat rows 5 times , 1 → repeat columns once
print(Z)

[[0. 1. 2. 3. 4.]
 [0. 1. 2. 3. 4.]
 [0. 1. 2. 3. 4.]
 [0. 1. 2. 3. 4.]
 [0. 1. 2. 3. 4.]]
[[0 1 2 3 4]
 [0 1 2 3 4]
 [0 1 2 3 4]
 [0 1 2 3 4]
 [0 1 2 3 4]]


#### 38. Consider a generator function that generates 10 integers and use it to build an array (★☆☆)

In [68]:
# defining - generator function
def generate():
    for x in range(10):
        yield x

def master_gen():
    yield from range(3)       # 0, 1, 2
    yield from ["A", "B"]     # "A", "B"
    yield from generate()     # 0 through 9 from your other function
    return 

print(*master_gen())

0 1 2 A B 0 1 2 3 4 5 6 7 8 9


In [69]:
# defining - generator function
def generate():
    for x in range(10):
        yield x
# OR
def generate():
    yield from range(10)
# gen = (x for x in range(10))
# generate    # <function __main__.generate()>
generate()  # <generator object at ...>

<generator object generate at 0x000001B0BA420AC0>

In [70]:
gen = (x for x in range(10))    # Generator Expression
print(  np.fromiter(gen, int)                           )   # numpy.int64
print(  np.fromiter(generate(), dtype=int, count=-1)    )   # count=-1 means all, 50 for first 50 etc

[0 1 2 3 4 5 6 7 8 9]
[0 1 2 3 4 5 6 7 8 9]


#### 39. Create a vector of size 10 with values ranging from 0 to 1, both excluded (★★☆)

In [71]:
np.finfo('float16') # FYI, similarly np.iinfo(np.int8) etc

finfo(resolution=0.001, min=-6.55040e+04, max=6.55040e+04, dtype=float16)

In [72]:
arr = rng.uniform(0, 1, 10)
arr2 = np.clip(arr, 1e-6, 1 - 1e-6)  # if value is 0 it will be 0.000001
arr2

# tighter bounds: use machine epsilon instead of 1e-6
# eps = np.finfo(float).eps
# arr2 = np.clip(arr, eps, 1 - eps)

array([0.04025374, 0.44963269, 0.89248769, 0.74961862, 0.99180772,
       0.53141381, 0.65999572, 0.30248028, 0.9475356 , 0.36633549])
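
If the exercise is read as asking for evenly spaced values rather than random ones, `np.linspace` gives a deterministic answer. A minimal sketch under that assumption:

```python
import numpy as np

# 11 points on [0, 1) without the end point, then drop the leading 0:
# this leaves 10 values strictly between 0 and 1
Z = np.linspace(0, 1, 11, endpoint=False)[1:]

print(Z.size)                     # 10
print(Z.min() > 0, Z.max() < 1)   # True True
```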

#### 40. Create a random vector of size 10 and sort it (★★☆)

In [73]:
arr = rng.uniform(0, 1+ 1e-10, 10)
sortedarr = np.sort(arr)
print(sortedarr)

[0.34236243 0.4031402  0.40815667 0.41930099 0.50893415 0.56181893
 0.6470718  0.71978254 0.73851071 0.90708633]


In [74]:
arr.sort()  # in-place sort
arr

array([0.34236243, 0.4031402 , 0.40815667, 0.41930099, 0.50893415,
       0.56181893, 0.6470718 , 0.71978254, 0.73851071, 0.90708633])

#### 41. How to sum a small array faster than np.sum? (★★☆)

In [75]:
arr = rng.uniform(0, 1+ 1e-10, 10)
sum(arr)
s = arr.sum()       # faster
np.add.reduce(arr)  # fastest uses ufunc directly

np.float64(3.965772739519008)

#### 42. Consider two random arrays A and B, check if they are equal (★★☆)

In [76]:
arr1 = rng.integers(1,10,5)
arr2 = rng.integers(1,10,5)
print( arr1==arr2 )             # exact elementwise comparison; these three are unreliable for floats because of rounding errors
print( np.equal(arr1,arr2) )
print( np.array_equal(arr1,arr2) )  # single boolean


print( np.allclose(arr1,arr2) ) # options: rtol=, atol=, equal_nan=False
                                # suited to floats, which are rarely exactly equal.
                                # Works elementwise and then AND-reduces to a single boolean.
print( np.isclose(arr1, arr2).all())  # Same

# np.testing.assert_allclose(arr1, arr2)   # returns None; raises AssertionError if any element is outside the tolerance

[False False False False False]
[False False False False False]
False
False
False
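
The integer arrays above do not show why the tolerance-based functions matter. A small float example, with values chosen for illustration, makes the difference visible:

```python
import numpy as np

a = np.array([0.1 + 0.2, 0.3 * 3])   # carries rounding error
b = np.array([0.3, 0.9])

print(a == b)                 # [False False]
print(np.array_equal(a, b))   # False
print(np.allclose(a, b))      # True
```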


#### 43. Make an array immutable (read-only) (★★☆)

In [78]:
arr = np.array([1, 2, 3])
arr.flags.writeable = False
# arr[0] = 100   # raises ValueError: assignment destination is read-only

In [79]:
arr.flags

# C_CONTIGUOUS -> C-style, row-major layout: memory is laid out row1 → row2 → row3…
# F_CONTIGUOUS -> Fortran-style, column-major layout
# OWNDATA -> this array owns its memory: it is not a view or slice referencing another array
# WRITEABLE -> False for read-only arrays, e.g. after flags.writeable = False or np.frombuffer() on immutable bytes
# ALIGNED -> data is aligned for the CPU, good for performance
# WRITEBACKIFCOPY -> this array is a temporary copy whose contents are written back to the base array when resolved through the C API

  C_CONTIGUOUS : True
  F_CONTIGUOUS : True
  OWNDATA : True
  WRITEABLE : False
  ALIGNED : True
  WRITEBACKIFCOPY : False
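
A short sketch of what the read-only flag does in practice: writes raise `ValueError`, while a copy is writeable again. The names `write_failed` and `writable_copy` are illustrative.

```python
import numpy as np

arr = np.array([1, 2, 3])
arr.flags.writeable = False

try:
    arr[0] = 100
    write_failed = False
except ValueError as exc:
    write_failed = True
    print(exc)   # assignment destination is read-only

writable_copy = arr.copy()   # a copy owns fresh, writeable memory
writable_copy[0] = 100

print(arr[0], writable_copy[0])   # 1 100
```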

#### 44. Consider a random 10x2 matrix representing cartesian coordinates, convert them to polar coordinates (★★☆)

In [80]:
pts = rng.standard_normal((10, 2))
print(pts)

x = pts[:, 0]
y = pts[:, 1]

# polar coordinates
r = np.sqrt(x**2 + y**2)        # radius
theta = np.arctan2(y, x)        # angle in radians

polar = np.stack((r, theta), axis=1)   # along columns
print(theta)
print(polar)

[[-3.36831968e-01  6.19701965e-01]
 [ 3.39875361e-01  3.16047871e-01]
 [ 4.09828457e-01  6.16134668e-01]
 [-2.10795334e+00 -3.64438252e-01]
 [-2.18021006e+00  3.60599268e-02]
 [-4.63297222e-03  1.04553227e+00]
 [ 1.18763439e+00  2.02775048e-01]
 [-5.00361077e-01  4.85161113e-01]
 [-5.27917637e-01 -1.39265144e-03]
 [ 9.86136317e-01 -5.57771346e-01]]
[ 2.06866527  0.74908748  0.98383716 -2.97039769  3.1250545   1.57522751
  0.16910794  2.37161653 -3.13895465 -0.51475095]
[[ 0.70532709  2.06866527]
 [ 0.46411369  0.74908748]
 [ 0.73998736  0.98383716]
 [ 2.13922475 -2.97039769]
 [ 2.18050825  3.1250545 ]
 [ 1.04554254  1.57522751]
 [ 1.2048208   0.16910794]
 [ 0.6969523   2.37161653]
 [ 0.52791947 -3.13895465]
 [ 1.13294912 -0.51475095]]
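
One way to validate the conversion is a round trip back to cartesian coordinates. The sketch below also uses `np.hypot` as an alternative way to compute the radius; the seed `0` and the name `back` are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)       # seed chosen only for reproducibility
pts = rng.standard_normal((10, 2))
x, y = pts[:, 0], pts[:, 1]

r = np.hypot(x, y)          # same as sqrt(x**2 + y**2)
theta = np.arctan2(y, x)    # angle in radians, in [-pi, pi]

# Convert back: x = r*cos(theta), y = r*sin(theta)
back = np.stack((r * np.cos(theta), r * np.sin(theta)), axis=1)

print(np.allclose(back, pts))   # True
```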


In [81]:
a1 = np.array([1,2,3])
a2 = np.array([1,2,5])
np.stack( (a1,a2), axis=1 )

array([[1, 1],
       [2, 2],
       [3, 5]])

#### 45. Create random vector of size 10 and replace the maximum value by 0 (★★☆)

In [82]:
arr = rng.integers(1,5,10)
arr[arr == arr.max()] = 0
print(arr)

# np.where(arr==max(arr),0,arr)

[1 0 0 0 2 2 0 3 3 2]


In [83]:
a = np.array([[4, 1], [np.nan, 3]]) 
np.argmax(a), np.argmin(a)      # nan propagates, so both return the flat index of the first nan

(np.int64(2), np.int64(2))

In [84]:
# fix
np.nanargmax(a), np.nanargmin(a)

(np.int64(0), np.int64(1))

#### 46. Create a structured array with `x` and `y` coordinates covering the [0,1]x[0,1] area (★★☆)
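
No answer is recorded for this exercise in the notebook. One possible approach, sketched under the assumption of a 5x5 grid (the resolution `n` is an arbitrary choice), fills the two fields from `np.meshgrid`:

```python
import numpy as np

n = 5   # grid resolution, chosen arbitrarily for illustration
Z = np.zeros((n, n), dtype=[('x', float), ('y', float)])

# meshgrid returns the x coordinates and y coordinates as two (n, n) arrays
Z['x'], Z['y'] = np.meshgrid(np.linspace(0, 1, n), np.linspace(0, 1, n))

print(Z.shape)         # (5, 5)
print(Z.dtype.names)   # ('x', 'y')
```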

#### 47. Given two arrays, X and Y, construct the Cauchy matrix C (Cij =1/(xi - yj)) (★★☆)
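
No answer is recorded for this exercise either. A minimal sketch using `np.subtract.outer`, with `X` and `Y` chosen for illustration so that no `xi - yj` is zero:

```python
import numpy as np

X = np.arange(8)
Y = X + 0.5   # offset guarantees xi != yj, so there is no division by zero

# outer difference: D[i, j] = X[i] - Y[j], then take the elementwise reciprocal
C = 1.0 / np.subtract.outer(X, Y)

print(C.shape)    # (8, 8)
print(C[0, 0])    # -2.0
```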

#### 48. Print the minimum and maximum representable values for each numpy scalar type (★★☆)

In [85]:
np.iinfo(int), np.iinfo(np.int8), np.iinfo(np.int64), np.iinfo('int'), np.iinfo(np.int32)   # similarly finfo

(iinfo(min=-9223372036854775808, max=9223372036854775807, dtype=int64),
 iinfo(min=-128, max=127, dtype=int8),
 iinfo(min=-9223372036854775808, max=9223372036854775807, dtype=int64),
 iinfo(min=-9223372036854775808, max=9223372036854775807, dtype=int64),
 iinfo(min=-2147483648, max=2147483647, dtype=int32))

In [86]:
np.iinfo(np.int32).max      # uint16, float64

2147483647
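
To cover several scalar types at once, the two info functions can be looped over. A sketch; the selection of dtypes and the dictionary `limits` are illustrative choices.

```python
import numpy as np

limits = {}

# Integer types: np.iinfo exposes min and max
for dtype in [np.int8, np.int32, np.int64]:
    info = np.iinfo(dtype)
    limits[dtype.__name__] = (info.min, info.max)

# Floating-point types: np.finfo exposes min, max and also eps
for dtype in [np.float32, np.float64]:
    info = np.finfo(dtype)
    limits[dtype.__name__] = (info.min, info.max)
    print(dtype.__name__, "eps =", info.eps)

for name, (lo, hi) in limits.items():
    print(name, lo, hi)
```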

#### 49. How to print all the values of an array? (★★☆)
By default, numpy abbreviates large arrays, e.g. `array([1, 2, 3, ..., 999, 1000])`.

In [87]:
np.get_printoptions()

{'edgeitems': 3,
 'threshold': 1000,
 'floatmode': 'maxprec',
 'precision': 8,
 'suppress': False,
 'linewidth': 75,
 'nanstr': 'nan',
 'infstr': 'inf',
 'sign': '-',
 'formatter': None,
 'legacy': False,
 'override_repr': None}

In [88]:
np.set_printoptions(threshold=float("inf")) # or sys.maxsize; the default is 1000 (arrays with more elements are abbreviated with "...")
Z = np.zeros((4,120))
print(Z)

[[0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
  0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
  0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
  0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
  0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
  0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
  0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
  0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
  0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
  0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
  0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
  0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0

#### 50. How to find the closest value (to a given scalar) in a vector? (★★☆)

In [89]:
matrix = np.array([ [10, 20],
                    [30, 4.5] ])
scalar_m = 14

# 1. Absolute difference between every element and the scalar
abs_diff_m = np.abs(matrix - scalar_m)

# 2. argmin flattens its input by default, so this is a flat index
index_flat = np.argmin(abs_diff_m)

# 3. Use unravel_index to convert the flat index back to (row, col) coordinates
row, col = np.unravel_index(index_flat, matrix.shape)

closest_value_m = matrix[row, col]
print(closest_value_m)

# Alternative to step 3: index the flattened view directly
closest_value = matrix.flat[index_flat]
print(closest_value)

10.0
10.0


#### 51. Create a structured array representing a position (x,y) and a color (r,g,b) (★★☆)

In [90]:
# Define the structure of the array using a dtype
point_dtype = np.dtype([
    ('position', 'f8', (2,)),  # 'f8' for float64, (2,) for a pair of (x, y)
    ('color',    'u1', (3,))   # 'u1' for unsigned int 8-bit (0-255), (3,) for (r, g, b)
])          
# 'f8' = float of 8 bytes (64 bits), i.e. float64; likewise 'i4' = int32, 'u1' = uint8, 'b1' = bool

print(point_dtype)

structured_array = np.array([
    ((10.5, 20.1), (255, 0, 0)),     # Point 1: (10.5, 20.1) and Red
    ((5.0, -1.2),  (0, 128, 255)),   # Point 2: (5.0, -1.2) and Aqua-Blue
    ((99.9, 0.0),  (50, 50, 50))     # Point 3: (99.9, 0.0) and Dark Gray
], dtype=point_dtype)

print(structured_array)

[('position', '<f8', (2,)), ('color', 'u1', (3,))]
[([10.5, 20.1], [255,   0,   0]) ([ 5. , -1.2], [  0, 128, 255])
 ([99.9,  0. ], [ 50,  50,  50])]


In [91]:
print("This text is being sent to the file instead of the screen! 1")
with open('my_log.txt', 'a+') as f:
    print("This text is being sent to the file instead of the screen! 2", file=f)

This text is being sent to the file instead of the screen! 1


#### 52. Consider a random vector with shape (100,2) representing coordinates, find point by point distances (★★☆)

In [92]:
pts = rng.integers(1,10,(100,2))
dists = np.linalg.norm(pts[1:] - pts[:-1], axis=1)  # difference between each point and the previous one,
dists                                               # then its Euclidean norm sqrt(dx^2 + dy^2) along axis=1 (99 consecutive distances, not all pairs)

array([4.47213595, 1.        , 7.21110255, 5.        , 6.08276253,
       6.08276253, 8.24621125, 8.        , 6.        , 2.82842712,
       5.        , 2.23606798, 3.16227766, 3.        , 3.        ,
       3.60555128, 2.23606798, 2.23606798, 6.70820393, 3.        ,
       6.70820393, 2.23606798, 4.24264069, 5.65685425, 3.        ,
       6.08276253, 6.40312424, 5.09901951, 4.47213595, 5.83095189,
       7.28010989, 6.        , 8.24621125, 3.60555128, 5.        ,
       5.        , 8.54400375, 3.16227766, 6.70820393, 5.83095189,
       5.38516481, 2.82842712, 4.47213595, 6.32455532, 6.08276253,
       4.47213595, 3.16227766, 2.        , 2.        , 5.09901951,
       7.61577311, 0.        , 4.        , 4.        , 8.48528137,
       5.38516481, 6.08276253, 2.        , 1.41421356, 3.        ,
       6.32455532, 4.24264069, 5.65685425, 4.        , 2.        ,
       4.        , 5.09901951, 4.12310563, 4.12310563, 5.83095189,
       4.        , 7.07106781, 6.70820393, 7.61577311, 6.08276
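
If "point by point" is read as every point against every other point, the full distance matrix can be obtained with broadcasting. This is a sketch under that reading; the seed and the names `diff` and `D` are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)        # seed chosen arbitrarily
pts = rng.random((100, 2))

# (100, 1, 2) - (1, 100, 2) broadcasts to all pairwise differences, shape (100, 100, 2)
diff = pts[:, None, :] - pts[None, :, :]
D = np.sqrt((diff ** 2).sum(axis=-1))  # D[i, j] = distance between point i and point j
print(D.shape)
```

`D` is symmetric with a zero diagonal, since each point is at distance 0 from itself.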

#### 53. How to convert a float (32 bits) array into an integer (32 bits) array in place?

In [93]:
# astype allocates a new array whenever the dtype changes; copy=False only skips the copy
# when the dtype already matches. So the line below rebinds `ar`, it does not convert in place.

ar = rng.uniform(10,20,(10,5)).astype(np.float32)
print(ar)

ar = ar.astype(np.int32, copy=False)
print(ar)

[[12.191893  12.083794  17.514908  11.189103  10.859823 ]
 [11.762914  11.753465  18.231075  13.985066  19.903885 ]
 [11.372112  16.535398  14.487295  13.929182  18.750841 ]
 [19.755957  18.72859   11.923007  12.207979  16.56074  ]
 [12.890836  17.347252  15.6642065 15.509088  18.28547  ]
 [17.105328  10.265778  10.494592  16.015585  14.86192  ]
 [12.601847  14.186561  17.579103  18.266115  15.61223  ]
 [13.853695  12.706967  15.219185  13.147496  15.640951 ]
 [16.766523  10.660792  10.010674  12.122771  18.945114 ]
 [16.462908  11.719604  18.90264   14.699559  14.760918 ]]
[[12 12 17 11 10]
 [11 11 18 13 19]
 [11 16 14 13 18]
 [19 18 11 12 16]
 [12 17 15 15 18]
 [17 10 10 16 14]
 [12 14 17 18 15]
 [13 12 15 13 15]
 [16 10 10 12 18]
 [16 11 18 14 14]]
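
Because float32 and int32 have the same item size, the same buffer can be reinterpreted through a view and overwritten with the converted values. A minimal sketch of that idea, with sample data chosen for illustration:

```python
import numpy as np

Z = np.arange(10, dtype=np.float32) * 1.5

Y = Z.view(np.int32)   # same memory, reinterpreted as int32 (no copy)
Y[:] = Z               # write the truncated float values into the shared buffer

print(Y)
print(np.shares_memory(Y, Z))
```

After the assignment `Z` no longer holds meaningful floats, because its bytes now encode integers; only `Y` should be used from that point on.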


#### 54. How to read the following file (file1.txt)? (★★☆)
```
1, 2, 3, 4, 5
6,  ,  , 7, 8
 ,  , 9,10,11
```

In [94]:
arr = np.genfromtxt('file1.txt', delimiter=',')
arr, arr.dtype  # genfromtxt parses to float64 by default; empty fields become nan

(array([[ 1.,  2.,  3.,  4.,  5.],
        [ 6., nan, nan,  7.,  8.],
        [nan, nan,  9., 10., 11.]]),
 dtype('float64'))

#### 55. What is the equivalent of enumerate for numpy arrays? (★★☆)

In [95]:
it = np.ndenumerate(arr)     # iterator of ((row, col), value) pairs; arr itself is left intact
print(*it)

((0, 0), np.float64(1.0)) ((0, 1), np.float64(2.0)) ((0, 2), np.float64(3.0)) ((0, 3), np.float64(4.0)) ((0, 4), np.float64(5.0)) ((1, 0), np.float64(6.0)) ((1, 1), np.float64(nan)) ((1, 2), np.float64(nan)) ((1, 3), np.float64(7.0)) ((1, 4), np.float64(8.0)) ((2, 0), np.float64(nan)) ((2, 1), np.float64(nan)) ((2, 2), np.float64(9.0)) ((2, 3), np.float64(10.0)) ((2, 4), np.float64(11.0))


#### 56. Generate a generic 2D Gaussian-like array (★★☆)
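
A minimal sketch: evaluate a Gaussian of the distance from the origin on a regular grid. The grid size (10x10), the range [-1, 1], and the values of `sigma` and `mu` are assumptions made for illustration.

```python
import numpy as np

X, Y = np.meshgrid(np.linspace(-1, 1, 10), np.linspace(-1, 1, 10))
D = np.sqrt(X ** 2 + Y ** 2)          # distance of each grid point from the origin

sigma, mu = 1.0, 0.0                  # illustrative width and centre
G = np.exp(-((D - mu) ** 2) / (2.0 * sigma ** 2))
print(G.round(2))
```

The values peak near the centre of the grid and fall off towards the corners.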

#### 57. How to randomly place p elements in a 2D array? (★★☆)

In [96]:
m, n = 5, 6   # shape
p = 7         # number of elements to place

arr = np.zeros((m, n), dtype=int)

# pick p unique flat indices
idx = rng.choice(m*n, size=p, replace=False)    # p distinct flat indices drawn from range(m*n)
print(idx)

# assign them
values = rng.integers(10, 50, size=p)
arr.flat[idx] = values
print(arr)

[ 1 14  6 23 20 12  9]
[[ 0 18  0  0  0  0]
 [13  0  0 26  0  0]
 [14  0 45  0  0  0]
 [ 0  0 18  0  0 26]
 [ 0  0  0  0  0  0]]


#### 58. Subtract the mean of each row of a matrix (★★☆)

In [97]:
X = rng.random((5, 4))
print(X)
# print(X.mean())     # mean of the whole array (default axis=None)

Y = X - X.mean(axis=1, keepdims=True)   # keepdims=True keeps the row means as shape (5, 1)
        # without it the means have shape (5,), which cannot broadcast against (5, 4)
        # alternative: X.mean(axis=1).reshape(-1, 1)
Y

[[0.1761383  0.13469202 0.86028189 0.3203972 ]
 [0.36075869 0.05527041 0.35736962 0.26480718]
 [0.61048507 0.2053903  0.88836545 0.92966662]
 [0.09983106 0.10740296 0.11532644 0.61956926]
 [0.12196896 0.85238232 0.75427151 0.81980886]]


array([[-0.19673905, -0.23818533,  0.48740453, -0.05248015],
       [ 0.10120721, -0.20428106,  0.09781815,  0.0052557 ],
       [-0.04799179, -0.45308656,  0.22988859,  0.27118976],
       [-0.13570137, -0.12812947, -0.12020599,  0.38403683],
       [-0.51513895,  0.2152744 ,  0.1171636 ,  0.18270094]])

#### 59. How to sort an array by the nth column? (★★☆)

In [98]:
arr = np.array([
    [3, 7, 1],
    [2, 5, 9],
    [8, 1, 4]
])
            # arr[ rows, cols]
arr_sorted = arr[  arr[:, 1].argsort()  ]   # sorted index
arr_sorted

array([[8, 1, 4],
       [2, 5, 9],
       [3, 7, 1]])

In [99]:
arr = np.array([
    [3, 7, 1],
    [2, 5, 9],
    [8, 1, 4]
])
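arr[ 0 ]        # first row, shape (3,)
arr[ [0] ]      # first row kept 2-D, shape (1, 3)
arr[ 0, 1]      # single element: 7
arr[ [0,1] ]    # rows 0 and 1, shape (2, 3)
arr[ :, 1 ]     # second column, shape (3,)
# arr[ [:, 1] ] # SyntaxError: a slice cannot appear inside a list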


array([7, 5, 1])

#### 60. How to tell if a given 2D array has null columns? (★★☆)

In [100]:
arr = np.array([
    [3, np.nan, 1],
    [2, np.nan, 9],
    [8, np.nan, 4]
])

# print(  np.isnan(arr)  )    # boolean matrix
# print(  np.isnan(arr).all() )   # False
print(  np.isnan(arr).all(axis=0) )   # [False  True False]

[False  True False]
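
"Null" can also be read as "all zeros" rather than "all NaN". Under that reading, a sketch with an assumed sample array looks like this:

```python
import numpy as np

Z = np.array([[1, 0, 3],
              [4, 0, 6],
              [7, 0, 0]])

null_cols = ~Z.any(axis=0)   # True where every entry of the column is 0
print(null_cols)             # per-column flags
print(null_cols.any())       # single yes/no answer: is there at least one null column?
```

The final `.any()` turns the per-column flags into the single boolean the question asks for; the same step applies to the NaN variant.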


#### 61. Find the nearest value from a given value in an array (★★☆)

In [101]:
Z = rng.uniform(0,1,(10,3))
z = 0.5
m = Z.flat[np.abs(Z - z).argmin()] # argmin returns a flat index, so index through Z.flat for n-dim arrays
print(m)

0.4986395468763152


#### 62. Considering two arrays with shape (1,3) and (3,1), how to compute their sum using an iterator? (★★☆)

In [102]:
a = np.array([[1, 2, 3]])   # shape (1,3)
b = np.array([[4], [5], [6]])  # shape (3,1)

it = np.nditer([a, b, None])

for x, y, z in it:
    z[...] = x + y

print(it.operands[2])
# or without iter (directly using broadcasting)
print(a+b)

[[5 6 7]
 [6 7 8]
 [7 8 9]]
[[5 6 7]
 [6 7 8]
 [7 8 9]]


In [103]:
a = np.array([[1, 2, 3]])   # shape (1,3)
b = np.array([[4], [5], [6]])  # shape (3,1)
out = np.empty((3,3)) 

it = np.nditer([a, b, out], op_flags=[['readonly'], ['readonly'], ['writeonly']])

for x, y, z in it:
    z[...] = x + y

print(out)

[[5. 6. 7.]
 [6. 7. 8.]
 [7. 8. 9.]]


#### 63. Create an array class that has a name attribute (★★☆)
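
A minimal sketch of an `ndarray` subclass carrying a `name` attribute. The class name `NamedArray` and the default `"no name"` are illustrative choices; `__array_finalize__` is the hook NumPy calls for views and slices, so the name is copied to them there.

```python
import numpy as np

class NamedArray(np.ndarray):
    def __new__(cls, array, name="no name"):
        obj = np.asarray(array).view(cls)   # reinterpret the data as NamedArray
        obj.name = name
        return obj

    def __array_finalize__(self, obj):
        # Called for views and slices; copy the name from the parent if it has one
        if obj is None:
            return
        self.name = getattr(obj, "name", "no name")

Z = NamedArray(np.arange(10), "range_10")
print(Z.name)
print(Z[2:5].name)   # slices keep the name
```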

#### 64. Consider a given vector, how to add 1 to each element indexed by a second vector (be careful with repeated indices)? (★★★)

In [104]:
a = np.array([0, 0, 0, 0, 0])
idx = np.array([0, 1, 1, 3])   # index 1 is repeated

# np.add.at is unbuffered and works in place; a[idx] += 1 would apply a repeated index only once.
# Unlike np.bincount, it also accepts negative indices.
np.add.at(a, idx, 1)

print(a)

[1 2 0 1 0]
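
To make the repeated-index pitfall concrete, the sketch below runs the three variants side by side on the same index vector. The variable names are illustrative.

```python
import numpy as np

idx = np.array([0, 1, 1, 3])          # index 1 is repeated

buffered = np.zeros(5, dtype=int)
buffered[idx] += 1                    # repeated index 1 is incremented only once

unbuffered = np.zeros(5, dtype=int)
np.add.at(unbuffered, idx, 1)         # every occurrence is applied

counted = np.bincount(idx, minlength=5)   # same counts, returned as a new array

print(buffered, unbuffered, counted, sep="\n")
```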


#### 65. How to accumulate elements of a vector (X) to an array (F) based on an index list (I)? (★★★)

In [105]:
X = np.array([10, 20, 30, 40])
I = np.array([0, 2, 2, 1])   # notice index 2 repeats

F = np.zeros(4)

np.add.at(F, I, X)

print(F)

[10. 40. 50.  0.]


#### 66. Considering a (w,h,3) image of (dtype=ubyte), compute the number of unique colors (★★☆)
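
One way to count colors, as a sketch: flatten the image to a list of RGB rows and let `np.unique` de-duplicate along axis 0. The image size, the seed, and the restricted value range (0 to 3, so that duplicates occur) are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)                      # seed chosen arbitrarily
w, h = 16, 16
I = rng.integers(0, 4, (w, h, 3)).astype(np.ubyte)  # small value range so colors repeat

colors = np.unique(I.reshape(-1, 3), axis=0)        # one row per distinct (r, g, b)
n_colors = len(colors)
print(n_colors)
```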

#### 67. Considering a four dimensions array, how to get sum over the last two axis at once? (★★★)

In [106]:
arr = np.array([    # shape (2, 2, 2, 3): the last two axes are the rows and columns of each 2x3 block
    [
        [[1, 2, 3],
         [4, 5, 6]],

        [[7, 8, 9],
         [10,11,12]]
    ],

    [
        [[2, 1, 0],
         [3, 3, 3]],

        [[5, 5, 5],
         [1, 1, 1]]
    ]
])

print( arr.sum(axis= (-2,-1)))   # sum over the last two axes (2 and 3); they are removed, leaving shape (2, 2)

[[21 57]
 [12 18]]


In [107]:
arr[0,0]  # the block behind result[0,0]: 1+2+3+4+5+6 = 21

array([[1, 2, 3],
       [4, 5, 6]])

#### 68. Considering a one-dimensional vector D, how to compute means of subsets of D using a vector S of same size describing subset  indices? (★★★)

In [108]:
# def printer(var): # HW: takes one or multiple input prints variable name, then the respective value (like x: next line 8), each in newline. and for this cell output in jupyter nb don't collapse
#     print(f"{var}: var")

In [109]:
D = rng.uniform(0,1,100)     # wgt/vals
S = rng.integers(0,10,100)   # Index
print(D, S, sep='\n')
D_sums = np.bincount(S, weights=D) # D_sums[k] = sum of D[i] over all i with S[i] == k
D_counts = np.bincount(S)   # D_counts[k] = number of times k occurs in S, for k = 0 .. S.max()
print('D_counts: ')
print(D_counts)
D_means = D_sums / D_counts
print(D_means)

[0.4510465  0.56793423 0.46181791 0.65875341 0.55439902 0.32071555
 0.34573081 0.37859279 0.09463977 0.16606263 0.7231138  0.38909671
 0.21453897 0.56267241 0.75553954 0.25742236 0.82678988 0.92750486
 0.59739453 0.66894766 0.05257599 0.94551989 0.39248473 0.9239004
 0.57897752 0.0046188  0.038542   0.68041881 0.56266801 0.02653244
 0.74345996 0.85214534 0.49641748 0.46575744 0.00547311 0.78664971
 0.33071607 0.87876362 0.37306059 0.56507791 0.27034179 0.16058831
 0.7749079  0.49659048 0.53648792 0.96541595 0.96357991 0.85564212
 0.18760929 0.59427269 0.87814742 0.37446394 0.10110617 0.8109629
 0.48176355 0.56439268 0.9821443  0.60721143 0.42497939 0.36998437
 0.41761929 0.11839908 0.79475056 0.83771686 0.28439713 0.30379662
 0.87433426 0.02397036 0.52045744 0.46367571 0.69012677 0.1633917
 0.66833074 0.37673401 0.61871478 0.04009501 0.4617272  0.52225089
 0.2143264  0.13962365 0.1884743  0.34414108 0.09681015 0.9473198
 0.40499406 0.8089696  0.97441712 0.08073528 0.65166947 0.87140909
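
A small hand-checkable example may help to see what the two `np.bincount` calls return. The five-element vectors below are made up for illustration.

```python
import numpy as np

D = np.array([1.0, 2.0, 3.0, 4.0, 5.0])   # values
S = np.array([0, 1, 0, 2, 1])             # subset index of each value

sums = np.bincount(S, weights=D)    # subset 0: 1+3, subset 1: 2+5, subset 2: 4
counts = np.bincount(S)             # subset sizes: 2, 2, 1
means = sums / counts

print(sums, counts, means, sep="\n")
```

If some index in `0 .. S.max()` never occurs, its count is 0 and the division yields `nan` with a warning, so real data may need a guard.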

#### 69. How to get the diagonal of a dot product? (★★★)

In [110]:
A = rng.uniform(0,1,(5,5))
B = rng.uniform(0,1,(5,5))

# Slow version
np.diag(np.dot(A, B))   # same as np.diag(A @ B); note that A*B is element-wise, not a matrix product

array([1.18223139, 1.14835441, 1.32162252, 0.96223346, 1.42982017])
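
The full product is not needed to get its diagonal: entry `i` is just the dot product of row `i` of `A` with column `i` of `B`. A sketch of two ways to compute only that, with an arbitrary seed:

```python
import numpy as np

rng = np.random.default_rng(0)        # seed chosen arbitrarily
A = rng.uniform(0, 1, (5, 5))
B = rng.uniform(0, 1, (5, 5))

slow = np.diag(A @ B)                 # computes all 25 entries, keeps 5
fast = np.sum(A * B.T, axis=1)        # row i of A times column i of B
via_einsum = np.einsum("ij,ji->i", A, B)

print(np.allclose(slow, fast), np.allclose(slow, via_einsum))
```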

#### 70. Consider the vector [1, 2, 3, 4, 5], how to build a new vector with 3 consecutive zeros interleaved between each value? (★★★)

In [111]:
vec = np.array([1, 2, 3, 4, 5])

new = np.zeros( (len(vec) + 3*((len(vec)-1) )))
print(new)
new[::4]= vec
print(new)

[0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[1. 0. 0. 0. 2. 0. 0. 0. 3. 0. 0. 0. 4. 0. 0. 0. 5.]


#### 71. Consider an array of dimension (5,5,3), how to multiply it by an array with dimensions (5,5)? (★★★)

In [112]:
a = np.ones((5,5,3))
b = np.ones((5,5))
res = a * b.reshape((5,5,1))   # equivalently: a * b[:, :, None]
res.shape

(5, 5, 3)

#### 72. How to swap two rows of an array? (★★★)

In [113]:
arr = np.array([[1,2,3],[4,5,6],[7,8,9]])
# print(  arr[0,2,1][:]  )  # IndexError: three indices given for a 2-D array
# arr[[2,1,0]]  # a list of indices selects multiple rows (here in reverse order)
arr[[0, 2]] = arr[[2, 0]]   # the right-hand side is a copy, so rows 0 and 2 swap safely
print(arr)

[[7 8 9]
 [4 5 6]
 [1 2 3]]


#### 73. Consider a set of 10 triplets describing 10 triangles (with shared vertices), find the set of unique line segments composing all the  triangles (★★★)
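
A sketch of one approach: list the three edges of every triangle, sort each edge so that `(a, b)` and `(b, a)` compare equal, then de-duplicate rows. Two triangles sharing an edge are used here instead of ten random ones so the result can be checked by hand.

```python
import numpy as np

faces = np.array([[0, 1, 2],
                  [1, 2, 3]])         # two triangles sharing the edge (1, 2)

# The three edges of each triangle: (v0, v1), (v1, v2), (v2, v0)
edges = np.concatenate([faces[:, [0, 1]], faces[:, [1, 2]], faces[:, [2, 0]]])

edges = np.sort(edges, axis=1)        # orientation does not matter
segments = np.unique(edges, axis=0)   # drop duplicates
print(segments)
```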

#### 74. Given a sorted array C that corresponds to a bincount, how to produce an array A such that np.bincount(A) == C? (★★★)

In [114]:
C = np.bincount([1,1,2,3,4,4,6])  # A
A = np.repeat(a= np.arange(len(C)), repeats= C)
print(A)

[1 1 2 3 4 4 6]


#### 75. How to compute averages using a sliding window over an array? (★★★)

In [115]:
def moving_average(a, n=3):
    ret = np.cumsum(a, dtype=float)
    ret[n:] = ret[n:] - ret[:-n]
    return ret[n - 1:] / n

arr = np.arange(20)
print(np.cumsum(arr))
print(moving_average(arr, n=3))

[  0   1   3   6  10  15  21  28  36  45  55  66  78  91 105 120 136 153
 171 190]
[ 1.  2.  3.  4.  5.  6.  7.  8.  9. 10. 11. 12. 13. 14. 15. 16. 17. 18.]


In [116]:
# sliding window - easier
from numpy.lib.stride_tricks import sliding_window_view

arr = np.arange(20)
print(sliding_window_view(arr, window_shape=3).mean(axis=-1))

[ 1.  2.  3.  4.  5.  6.  7.  8.  9. 10. 11. 12. 13. 14. 15. 16. 17. 18.]


#### 76. Consider a one-dimensional array Z, build a two-dimensional array whose first row is (Z[0],Z[1],Z[2]) and each subsequent row is  shifted by 1 (last row should be (Z[-3],Z[-2],Z[-1]) (★★★)

In [117]:
Z = np.array([1,2,3,4,5,6]) 
result = np.vstack([Z[i:i+3] for i in range(len(Z)-2)])
print(result)

[[1 2 3]
 [2 3 4]
 [3 4 5]
 [4 5 6]]


In [118]:
# similar to sliding window - easier
from numpy.lib.stride_tricks import sliding_window_view

Z = np.array([1,2,3,4,5,6])
out = sliding_window_view(Z, window_shape=3)

print(out)

[[1 2 3]
 [2 3 4]
 [3 4 5]
 [4 5 6]]


#### 77. How to negate a boolean, or to change the sign of a float inplace? (★★★)

In [119]:
arr = np.array([True, False, True])
np.logical_not(arr, out=arr)    # OR, arr ^= True # ^ in NumPy means bitwise XOR for int, logical XOR for bool
print(arr)

[False  True False]


In [120]:
arr = np.array([1, False, 3.01, -2])        # mixed input is upcast to float64; False becomes 0.0
# arr = ~arr    # TypeError: ~ (bitwise invert) is not defined for float arrays
np.negative(arr, out=arr)   # in place; arr *= -1 also works
print(arr)

[-1.   -0.   -3.01  2.  ]


#### 78. Consider 2 sets of points P0,P1 describing lines (2d) and a point p, how to compute distance from p to each line i (P0[i],P1[i])? (★★★)
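
A sketch using the 2-D cross product: for a line through `P0[i]` and `P1[i]`, the distance from `p` is `|d x r| / |d|` with `d = P1[i] - P0[i]` and `r = p - P0[i]`. This treats each line as infinite, not as a segment; the function name and the sample points are assumptions.

```python
import numpy as np

def distance_to_lines(P0, P1, p):
    d = P1 - P0                                  # direction of each line, shape (n, 2)
    r = p - P0                                   # vector from each P0 to p, shape (n, 2)
    cross = d[:, 0] * r[:, 1] - d[:, 1] * r[:, 0]
    return np.abs(cross) / np.linalg.norm(d, axis=1)

P0 = np.array([[0.0, 0.0], [0.0, 0.0]])
P1 = np.array([[1.0, 0.0], [0.0, 2.0]])          # the x-axis and the y-axis
p = np.array([3.0, 4.0])

dist = distance_to_lines(P0, P1, p)
print(dist)   # distance to the x-axis is 4, to the y-axis is 3
```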

#### 79. Consider 2 sets of points P0,P1 describing lines (2d) and a set of points P, how to compute distance from each point j (P[j]) to each line i (P0[i],P1[i])? (★★★)
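
The single-point formula `|d x r| / |d|` extends to many points by adding a broadcast axis. In this sketch the result has one row per point and one column per line; lines are treated as infinite, and the function name and sample data are assumptions.

```python
import numpy as np

def distances_to_lines(P0, P1, P):
    d = P1 - P0                                   # line directions, shape (n, 2)
    r = P[:, None, :] - P0[None, :, :]            # point-to-P0 vectors, shape (m, n, 2)
    cross = d[None, :, 0] * r[..., 1] - d[None, :, 1] * r[..., 0]
    return np.abs(cross) / np.linalg.norm(d, axis=1)   # shape (m, n)

P0 = np.array([[0.0, 0.0], [0.0, 0.0]])
P1 = np.array([[1.0, 0.0], [0.0, 2.0]])           # the x-axis and the y-axis
P = np.array([[3.0, 4.0], [1.0, 1.0]])

D = distances_to_lines(P0, P1, P)
print(D)   # D[j, i] = distance from P[j] to line i
```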

#### 80. Consider an arbitrary array, write a function that extracts a subpart with a fixed shape and centered on a given element (pad with a `fill` value when necessary) (★★★)
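
A sketch of one way to do this: start from an output array filled with `fill`, then copy in the part of the requested window that actually overlaps the source. The function name `extract_centered` and its signature are hypothetical.

```python
import numpy as np

def extract_centered(Z, center, shape, fill=0):
    """Return a block of `shape` centred on `center`, padded with `fill`."""
    Z = np.asarray(Z)
    out = np.full(shape, fill, dtype=Z.dtype)
    src, dst = [], []
    for c, s, n in zip(center, shape, Z.shape):
        start = c - s // 2                       # first wanted index along this axis
        stop = start + s
        lo, hi = max(start, 0), min(stop, n)     # the part that exists in Z
        src.append(slice(lo, hi))
        dst.append(slice(lo - start, hi - start))
    out[tuple(dst)] = Z[tuple(src)]
    return out

Z = np.arange(1, 26).reshape(5, 5)
corner = extract_centered(Z, (0, 0), (3, 3))     # window hangs over the top-left corner
middle = extract_centered(Z, (2, 2), (3, 3))     # window fully inside
print(corner, middle, sep="\n")
```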

#### 81. Consider an array Z = [1,2,3,4,5,6,7,8,9,10,11,12,13,14], how to generate an array R = [[1,2,3,4], [2,3,4,5], [3,4,5,6], ..., [11,12,13,14]]? (★★★)

In [121]:
Z = np.array([1,2,3,4,5,6,7,8,9,10,11,12,13,14])    # generate, R = [[1,2,3,4], [2,3,4,5], [3,4,5,6], ..., [11,12,13,14]]

from numpy.lib.stride_tricks import sliding_window_view, as_strided
print(  sliding_window_view(Z, 4)  )

# R = [ Z[i:i+4] for i in range(len(Z)-3)   ]   # pythonic, so slow
# print(R)

[[ 1  2  3  4]
 [ 2  3  4  5]
 [ 3  4  5  6]
 [ 4  5  6  7]
 [ 5  6  7  8]
 [ 6  7  8  9]
 [ 7  8  9 10]
 [ 8  9 10 11]
 [ 9 10 11 12]
 [10 11 12 13]
 [11 12 13 14]]


#### 82. Compute a matrix rank (★★★)

In [122]:
rank = np.linalg.matrix_rank(Z)   # Z is still the 1-D vector from exercise 81; a non-zero vector has rank 1
print(rank)

1


#### 83. How to find the most frequent value in an array?

In [123]:
Z = rng.integers(10,20,10)
print(Z)
print(np.bincount(Z))   # index=integer, value=freq
print(np.bincount(Z).argmax())    # finds freq, then index of max freq

[14 18 12 14 17 10 18 18 17 16]
[0 0 0 0 0 0 0 0 0 0 1 0 1 0 2 0 1 2 3]
18


#### 84. Extract all the contiguous 3x3 blocks from a random 10x10 matrix (★★★)

In [124]:
arr = rng.integers(1,11,(10,10))
arr

array([[ 8,  8, 10,  5,  5,  7,  2,  9,  8,  7],
       [ 4,  4,  5,  7,  8,  7, 10, 10,  8,  4],
       [ 6,  5,  7,  9,  4,  6,  1,  8,  5,  5],
       [ 6,  3,  6,  8,  4,  2,  4,  2,  4,  7],
       [ 2,  8,  3,  9,  8,  3,  6,  2,  1, 10],
       [ 9,  2,  3,  2,  4,  3,  4,  6,  2,  9],
       [ 8,  8,  2,  8,  9,  2,  8,  4,  8,  5],
       [ 9,  2,  4,  3,  3,  4,  4,  2,  8,  6],
       [ 9,  8,  2,  1,  5,  5,  3,  8,  6,  8],
       [ 2,  9,  3,  5,  5,  9,  3,  3,  4,  4]])

In [125]:
print(np.lib.stride_tricks.sliding_window_view(arr, window_shape=(3, 3)))

[[[[ 8  8 10]
   [ 4  4  5]
   [ 6  5  7]]

  [[ 8 10  5]
   [ 4  5  7]
   [ 5  7  9]]

  [[10  5  5]
   [ 5  7  8]
   [ 7  9  4]]

  [[ 5  5  7]
   [ 7  8  7]
   [ 9  4  6]]

  [[ 5  7  2]
   [ 8  7 10]
   [ 4  6  1]]

  [[ 7  2  9]
   [ 7 10 10]
   [ 6  1  8]]

  [[ 2  9  8]
   [10 10  8]
   [ 1  8  5]]

  [[ 9  8  7]
   [10  8  4]
   [ 8  5  5]]]


 [[[ 4  4  5]
   [ 6  5  7]
   [ 6  3  6]]

  [[ 4  5  7]
   [ 5  7  9]
   [ 3  6  8]]

  [[ 5  7  8]
   [ 7  9  4]
   [ 6  8  4]]

  [[ 7  8  7]
   [ 9  4  6]
   [ 8  4  2]]

  [[ 8  7 10]
   [ 4  6  1]
   [ 4  2  4]]

  [[ 7 10 10]
   [ 6  1  8]
   [ 2  4  2]]

  [[10 10  8]
   [ 1  8  5]
   [ 4  2  4]]

  [[10  8  4]
   [ 8  5  5]
   [ 2  4  7]]]


 [[[ 6  5  7]
   [ 6  3  6]
   [ 2  8  3]]

  [[ 5  7  9]
   [ 3  6  8]
   [ 8  3  9]]

  [[ 7  9  4]
   [ 6  8  4]
   [ 3  9  8]]

  [[ 9  4  6]
   [ 8  4  2]
   [ 9  8  3]]

  [[ 4  6  1]
   [ 4  2  4]
   [ 8  3  6]]

  [[ 6  1  8]
   [ 2  4  2]
   [ 3  6  2]]

  [[ 1  8  5]
   [ 4  2  4]


#### 85. Create a 2D array subclass such that Z[i,j] == Z[j,i] (★★★)

In [126]:
Z = rng.integers(10,21,(4,4))
print(Z)
Z_sym = (Z + Z.T) / 2
Z_sym

[[11 11 12 15]
 [13 14 17 12]
 [20 18 11 13]
 [12 13 14 11]]


array([[11. , 12. , 16. , 13.5],
       [12. , 14. , 17.5, 12.5],
       [16. , 17.5, 11. , 13.5],
       [13.5, 12.5, 13.5, 11. ]])
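
The exercise asks for a subclass rather than a one-off symmetrization. A minimal sketch: override `__setitem__` so that every write to `[i, j]` is mirrored to `[j, i]`. The names `Symmetric` and `symmetric` are illustrative, and this version only handles indices given as an `(i, j)` pair.

```python
import numpy as np

class Symmetric(np.ndarray):
    def __setitem__(self, index, value):
        i, j = index                              # assumes a two-element index
        super().__setitem__((i, j), value)
        super().__setitem__((j, i), value)        # mirror the write

def symmetric(Z):
    # Z + Z.T counts the diagonal twice, so subtract it once
    return np.asarray(Z + Z.T - np.diag(Z.diagonal())).view(Symmetric)

rng = np.random.default_rng(0)                    # seed chosen arbitrarily
S = symmetric(rng.integers(0, 10, (5, 5)))
S[2, 3] = 42
print(S)
```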

#### 86. Consider a set of p matrices with shape (n,n) and a set of p vectors with shape (n,1). How to compute the sum of the p matrix products at once? (result has shape (n,1)) (★★★)

In [127]:
p, n = 10, 20
M = rng.integers(1,11,(p,n,n))
V = rng.integers(0,21,(p,n,1))

result = (M @ V).sum(axis=0) # shape of M@V is (p, n, 1), after sum at axis0 (ie p) its (n,1)
result

array([[10837],
       [ 9678],
       [10797],
       [10426],
       [11108],
       [ 9718],
       [10984],
       [10612],
       [10864],
       [10119],
       [10264],
       [10621],
       [10488],
       [ 9753],
       [10963],
       [11139],
       [10688],
       [10171],
       [10008],
       [10518]])

#### 87. Consider a 16x16 array, how to get the block-sum (block size is 4x4)? The blocks do not overlap, i.e. they tile the array like a checkerboard (★★★)

In [128]:
ar = rng.integers(0,11,(16,16))

blocks = np.lib.stride_tricks.sliding_window_view(ar,(4,4))
print(blocks.shape) # shape (13, 13, 4, 4)

# Slice with a step of 4 to remove the overlaps
# result shape becomes (4, 4, 4, 4)
non_overlapping = blocks[::4, ::4]

# Sum the last two dimensions (the 4x4 windows themselves)
block_sum = non_overlapping.sum(axis=(-1, -2))
block_sum

(13, 13, 4, 4)


array([[ 76,  92,  73,  76],
       [ 72,  86,  98,  68],
       [ 93, 104,  75,  93],
       [ 81,  87,  73,  80]])
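
Since the blocks do not overlap, a plain reshape is enough and avoids building the sliding windows first. A sketch with a deterministic input so the sums can be checked:

```python
import numpy as np

Z = np.arange(256).reshape(16, 16)
k = 4                                             # block size

# Axes after reshape: (block row, row in block, block column, column in block)
S = Z.reshape(16 // k, k, 16 // k, k).sum(axis=(1, 3))
print(S.shape)
print(S[0, 0] == Z[:4, :4].sum())
```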

#### 88. How to implement the Game of Life using numpy arrays? (★★★)
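
A minimal sketch of one generation step using shifted slices to count neighbours. Cells on the outer border are treated as permanently dead, which is a simplifying assumption; the function name `step` and the blinker test pattern are illustrative.

```python
import numpy as np

def step(Z):
    """Return the next generation; the outer border stays dead."""
    # Number of live neighbours of every interior cell
    N = (Z[0:-2, 0:-2] + Z[0:-2, 1:-1] + Z[0:-2, 2:] +
         Z[1:-1, 0:-2]                 + Z[1:-1, 2:] +
         Z[2:,   0:-2] + Z[2:,   1:-1] + Z[2:,   2:])
    inner = Z[1:-1, 1:-1]
    birth = (N == 3) & (inner == 0)
    survive = ((N == 2) | (N == 3)) & (inner == 1)
    out = np.zeros_like(Z)
    out[1:-1, 1:-1][birth | survive] = 1
    return out

blinker = np.zeros((5, 5), dtype=int)
blinker[2, 1:4] = 1                  # horizontal bar of three cells
after = step(blinker)                # becomes a vertical bar
print(after)
```

Applying `step` twice to the blinker returns the starting pattern, which is a quick sanity check for the rules.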

#### 89. How to get the n largest values of an array (★★★)

In [129]:
n = 3

# regular way: full sort, O(N log N) for N elements
ar = rng.integers(1,1001,(16,16))
sorted_ar = np.sort(ar.flatten())[::-1]
sorted_ar[:n]

array([1000,  992,  987])

In [130]:
# more efficient way: np.partition is O(N) on average, then only the top n values are sorted
n = 3
ar = np.random.randint(1, 1001, (16, 16))

# Flatten first since it's a 2D array
flat = ar.flatten()

# 1. Partition: Everything to the right of index '-n' will be the largest
# This puts the n largest values at the end of the array (unsorted)
partitioned = np.partition(flat, -n)
print(partitioned)

# 2. Slice the last n and sort them if you want descending order
result = np.sort(partitioned[-n:])[::-1]

print(result)

[   1    1    2    9   12   13   18   25   34   37   38   39   39   41
   46   47   47   50   56   57   63   66   73   81   81   86   87   94
   98   99  103  112  113  114  115  124  125  126  126  127  128  134
  135  135  142  143  152  152  154  154  156  162  162  162  163  176
  179  179  186  191  200  202  206  211  211  214  214  216  217  222
  231  233  236  238  238  240  257  258  260  268  276  280  285  295
  300  300  308  310  310  316  319  321  322  322  329  334  334  343
  347  350  358  361  364  370  374  377  380  386  386  392  397  401
  409  430  436  439  443  446  453  468  472  498  505  507  511  513
  514  516  525  528  529  531  532  534  547  549  549  554  565  566
  566  569  573  576  581  587  588  599  601  605  638  638  638  639
  641  642  644  646  648  648  649  653  653  654  655  655  659  661
  662  666  667  673  676  682  683  693  701  703  705  707  710  712
  713  721  726  729  732  735  738  739  747  750  752  753  759  766
  769 

#### 90. Given an arbitrary number of vectors, build the cartesian product (every combination of every item) (★★★)

In [131]:
def cartesian_product(*arrays):
    # 1. Build one coordinate grid per input array; together they enumerate the Cartesian product
    grid = np.meshgrid(*arrays, indexing='ij')
    
    print(np.stack(grid, axis=-1))
    # 2. Stack the grids on a new last axis, then flatten to one row per combination
    return np.stack(grid, axis=-1).reshape(-1, len(arrays))

# Usage:
v1 = np.array([1, 2])
v2 = np.array([4, 5])
v3 = np.array([0])

result = cartesian_product(v1, v2, v3)
print(result)

[[[[1 4 0]]

  [[1 5 0]]]


 [[[2 4 0]]

  [[2 5 0]]]]
[[1 4 0]
 [1 5 0]
 [2 4 0]
 [2 5 0]]


#### 91. How to create a record array from a regular array? ie we want to see a particular column by arr.attribute (★★★)

In [132]:
structured_input = np.array([(1, 'Alice'), (2, 'Bob')],   dtype=[('id', 'i4'), ('name', 'U10')])

# Cast it to a record array
rec = structured_input.view(np.recarray)

print(rec.name) # Result: ['Alice' 'Bob']

['Alice' 'Bob']


#### 92. Consider a large vector Z, compute Z to the power of 3 using 3 different methods (★★★)

In [134]:
Z = np.array([0, -13523, 2097132, 2097142, 2097151]).astype(int)
print( Z**3 )

print( Z*Z*Z )

print( np.power(Z,3) )
print( np.einsum('i,i,i->i',Z,Z,Z) ) # Einstein summation (subscripts, *operands): element-wise product of the three operands; benchmark before assuming it is the fastest

[                  0      -2472971686667 9223108156580683968
 9223240096088587288 9223358842721533951]
[                  0      -2472971686667 9223108156580683968
 9223240096088587288 9223358842721533951]
[                  0      -2472971686667 9223108156580683968
 9223240096088587288 9223358842721533951]
[                  0      -2472971686667 9223108156580683968
 9223240096088587288 9223358842721533951]


#### 93. Consider two arrays A and B of shape (8,3) and (2,2). How to find rows of A that contain elements of each row of B regardless of the order of the elements in B? (★★★)

In [135]:
A = np.array([
    [1,2,3],
    [4,5,6],
    [2,3,1],
    [7,8,9],
    [3,1,5],
    [2,4,1],
    [3,2,6],
    [1,9,8]
])

B = np.array([
    [1,3],
    [4,2]
])

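print( np.isin(A, [1,3]) )   # element-wise: is each entry of A one of [1, 3]?

# For each row of B, flag the rows of A that contain every element of that row.
# Counting np.isin hits per row would miscount rows with repeated values (e.g. [1, 1, 2]),
# so each element of b_row is tested against the row separately.
matches = [(A[:, :, None] == b_row).any(axis=1).all(axis=1) for b_row in B]
matches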

[[ True False  True]
 [False False False]
 [False  True  True]
 [False False False]
 [ True  True False]
 [False False  True]
 [ True False False]
 [ True False False]]


[array([ True, False,  True, False,  True, False, False, False]),
 array([False, False, False, False, False,  True, False, False])]

#### 94. Considering a 10x3 matrix, extract rows with unequal values (e.g. [2,2,3]) (★★★)

In [136]:
A = rng.integers(1,6,(10,3))

# print(A)
# print( A[:, [0]] )
# print(A == A[:, [0]])   # for each row comparing first value with all

rows_unequal = A[~np.all(A == A[:, [0]], axis=1)]
rows_unequal

array([[3, 5, 3],
       [3, 4, 5],
       [1, 3, 1],
       [3, 5, 5],
       [2, 2, 3],
       [2, 1, 2],
       [3, 3, 5],
       [3, 2, 4],
       [4, 3, 3],
       [2, 5, 5]])

#### 95. Convert a vector of ints into a matrix binary representation (★★★)

In [137]:
ar = rng.integers(1,11,(3,4))
print(ar)

# Vectorize the function to apply it to every element
v_bin = np.vectorize(np.binary_repr)

# Width=4 ensures all strings have the same length (padded with 0s)
bin_ar = v_bin(ar, width=4)
print(bin_ar)

[[ 2  4  3 10]
 [ 1  3  4  6]
 [ 3  6  8  9]]
[['0010' '0100' '0011' '1010']
 ['0001' '0011' '0100' '0110']
 ['0011' '0110' '1000' '1001']]


In [138]:
# FAST
ar = rng.integers(1,11,(3,4))
print(ar)

# The "Shift and Mask" trick
width = 4   # 4 bits: Max 2^4 - 1 = 15  <-- our max input num 10 fits here!
# Create a mask for each bit: like [8, 4, 2, 1]
powers = 1 << np.arange(width)[::-1]
print(powers)

# Add a trailing axis and AND each value with every mask
# Result has shape (3, 4, 4): one row of 4 bits per element
bin_matrix = ((ar[..., None] & powers) > 0).astype(int)

print(bin_matrix[0, 0]) # The binary bits for the first number

[[ 1  2  1 10]
 [ 7  6  8  6]
 [ 3  6  8  1]]
[8 4 2 1]
[0 0 0 1]


#### 96. Given a two dimensional array, how to extract unique rows? (★★★)

In [139]:
ar = rng.integers(1,3,(4,3))
print(ar)

# mask = ar[0,0]==ar[0] # this is for first row; i need to implement this for all rows
# mask.all(axis=-1)

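# Note: the mask built below flags rows whose entries are all equal (as in exercise 94);
# it does not de-duplicate rows.
# 1. Extract the first column but keep it 2D (shape (4, 1))
first_column = ar[:, [0]]
# 2. Compare the whole array to that first column
# This broadcasts the (4, 1) column against the (4, 3) array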
mask = (ar == first_column)

# 3. Check if all values in a row are True
result = mask.all(axis=-1)
result

[[2 1 1]
 [1 1 1]
 [1 2 1]
 [1 2 1]]


array([False,  True, False, False])
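
For de-duplicating rows, `np.unique` with `axis=0` is one direct option. The sketch below reuses the 4x3 array printed above; note that `np.unique` returns the rows in sorted order, so `return_index` is used when the original order should be kept.

```python
import numpy as np

Z = np.array([[2, 1, 1],
              [1, 1, 1],
              [1, 2, 1],
              [1, 2, 1]])

uZ = np.unique(Z, axis=0)                         # unique rows, sorted lexicographically
print(uZ)

# Keep the order of first appearance instead
_, first = np.unique(Z, axis=0, return_index=True)
in_order = Z[np.sort(first)]
print(in_order)
```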

#### 97. Considering 2 vectors A & B, write the einsum equivalent of inner, outer, sum, and mul function (★★★)

In [140]:
A = np.array([1, 2, 3])
B = np.array([4, 5, 6])

# 1. Inner Product (Dot Product)
# Returns: 1*4 + 2*5 + 3*6 = 32
inner = np.einsum('i,i->', A, B)

# 2. Outer Product
# Returns a grid of all combinations
outer = np.einsum('i,j->ij', A, B)

# 3. Element-wise Multiplication
# Returns: [1*4, 2*5, 3*6]
mul = np.einsum('i,i->i', A, B)

# 4. Sum
total = np.einsum('i->', A)

print(inner, outer, mul, total, sep='\n**************\n')

32
**************
[[ 4  5  6]
 [ 8 10 12]
 [12 15 18]]
**************
[ 4 10 18]
**************
6


#### 98. Considering a path described by two vectors (X,Y), how to sample it using equidistant samples (★★★)?

In [141]:
# 1. Setup an example path (e.g., a spiral)
t = np.linspace(0, 10, 100)
X = t * np.cos(t)
Y = t * np.sin(t)

# 2. Calculate the distance between consecutive points
dx = np.diff(X)
dy = np.diff(Y)
step_distances = np.sqrt(dx**2 + dy**2)

# 3. Build the cumulative distance (the "odometer" reading)
# Prepend 0 to match the original array length
cum_dist = np.concatenate(([0], np.cumsum(step_distances)))

# 4. Define your new equidistant points
# Let's say we want 50 points perfectly spaced from start to finish
num_samples = 50
new_dist = np.linspace(0, cum_dist[-1], num_samples)

# 5. Interpolate X and Y separately based on the distance
X_equi = np.interp(new_dist, cum_dist, X)
Y_equi = np.interp(new_dist, cum_dist, Y)

print(X_equi, Y_equi, sep='\n**************\n')

[ 0.          0.55325068 -0.0304268  -1.04382305 -2.03310452 -2.77963111
 -3.19952075 -3.27042732 -3.01539445 -2.4856575  -1.7292315  -0.81739066
  0.19365692  1.24549916  2.28492662  3.26760392  4.15682875  4.91673204
  5.52759184  5.97747936  6.26073015  6.35266471  6.27953509  6.03492167
  5.63167418  5.08433764  4.40732881  3.61653517  2.73837678  1.77972875
  0.7724668  -0.26771851 -1.32193838 -2.37012189 -3.39650029 -4.37944105
 -5.30619968 -6.16561444 -6.9461892  -7.63774729 -8.21933888 -8.6947168
 -9.05933771 -9.30954804 -9.44321808 -9.45971565 -9.35986045 -9.14586087
 -8.82123462 -8.39071529]
**************
[ 0.          0.75170123  1.58683234  1.81123278  1.47548265  0.7406851
 -0.22071378 -1.26866525 -2.28804644 -3.19649083 -3.92785067 -4.45331405
 -4.7478156  -4.80350336 -4.63345598 -4.25380025 -3.68785451 -2.95723796
 -2.0989419  -1.1466416  -0.13118429  0.91901932  1.97003535  2.99543879
  3.96889856  4.86985613  5.67746513  6.37458629  6.95797145  7.39701517
  7.70813755

#### 99. Given an integer n and a 2D array X, select from X the rows which can be interpreted as draws from a multinomial distribution with n degrees, i.e., the rows which only contain integers and which sum to n. (★★★)

We need to apply two logical filters to the rows of $X$:
1. Sum constraint: the row must sum exactly to $n$.
2. Integer constraint: every element in the row must be a whole number.

In [142]:
def find_multinomial_rows(X, n):
    # 1. Check if the sum of each row equals n (axis=1 sums across the columns)
    # np.isclose instead of == tolerates a row that sums to 9.9999999999 rather than 10
    sum_mask = np.isclose(np.sum(X, axis=1), n)
    
    # 2. Check if all elements in the row are integers
    # np.mod(X, 1) == 0 is the exact test, but it rejects values such as 5.00000001;
    # comparing with the rounded value tolerates small errors on either side (4.99999999 too)
    int_mask = np.all(np.isclose(X, np.round(X), atol=0.00001), axis=1)
    
    # 3. Combine both constraints (AND logic)
    final_mask = sum_mask & int_mask
    
    # 4. Extract the matching rows
    return X[final_mask]

# Example Usage
n = 10
X = np.array([[5, 5, 0],    # Pass (sums to 10, all ints)
              [4.5, 5.5, 0],# Fail (sums to 10, but has floats)
              [1, 1, 1],    # Fail (all ints, but sums to 3)
              [10, 0, 0]])  # Pass

result = find_multinomial_rows(X, n)
print(result)

[[ 5.  5.  0.]
 [10.  0.  0.]]


#### 100. Compute bootstrapped 95% confidence intervals for the mean of a 1D array X (i.e., resample the elements of an array with replacement N times, compute the mean of each sample, and then compute percentiles over the means). (★★★)

In [143]:
def bootstrap_ci(X, n_bootstrap=10000, ci=95):
    # 1. Generate a matrix of random indices with replacement
    # Shape: (n_bootstrap, len(X))
    indices = rng.integers(0, len(X), size=(n_bootstrap, len(X)))
    
    # 2. Resample the data using the indices (integer-array indexing, result shape (n_bootstrap, len(X)))
    resamples = X[indices]
    
    # 3. Compute the mean of each bootstrap sample (average along axis=1, one value per row)
    sample_means = np.mean(resamples, axis=1)
    
    # 4. Calculate percentiles for the Confidence Interval
    lower_bound = (100 - ci) / 2
    upper_bound = 100 - lower_bound
    
    return np.percentile(sample_means, [lower_bound, upper_bound])

# Usage
X = rng.normal(loc=50, scale=10, size=100)
ci_lower, ci_upper = bootstrap_ci(X)

print(f"95% CI for the mean: [{ci_lower:.2f}, {ci_upper:.2f}]")

95% CI for the mean: [45.82, 49.99]
