# 100 numpy exercises

This is a collection of exercises gathered from the numpy mailing list, Stack Overflow,
and the numpy documentation. The goal of this collection is to offer a quick reference for both old
and new users and to provide a set of exercises for those who teach.


If you find an error or think you have a better way to solve some of them, feel
free to open an issue at <https://github.com/rougier/numpy-100>.

File automatically generated. See the documentation to update questions/answers/hints programmatically.

Run the `initialise.py` module, then for each question you can query the
answer or a hint with `answer(n)` or `hint(n)`, where `n` is the question number.

In [2]:
%run initialise.py

#### 1. Import the numpy package under the name `np` (★☆☆)

In [3]:
import numpy as np

#### 2. Print the numpy version and the configuration (★☆☆) !

In [4]:
print(np.__version__)
print(np.__config__)
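np.show_config # without parentheses this only references the function; np.show_config() prints the build configuration
# np.__config__ is the module that holds the build information; np.show_config is the function that prints it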
hint(2)

2.3.3
<module 'numpy.__config__' from '/home/abdulrahman/code/numpy_practice/.venv/lib/python3.12/site-packages/numpy/__config__.py'>
hint: np.__version__, np.show_config)


#### 3. Create a null vector of size 10 (★☆☆)

In [5]:
print(np.zeros(10))
hint(3)
answer(3)

# a null vector is the zero vector: every component is 0, which is why the array is filled with zeros.

[0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
hint: np.zeros
Z = np.zeros(10)
print(Z)


#### 4. How to find the memory size of any array (★☆☆) !

In [6]:
print(np.zeros(10).size) # .size is the number of elements, not the memory used in bytes
answer(4)

print(np.zeros(10).size * np.zeros(10).itemsize)
print(np.zeros(10).nbytes)

10
Z = np.zeros((10,10))
print("%d bytes" % (Z.size * Z.itemsize))

# Simpler alternative
print("%d bytes" % Z.nbytes)
80
80


#### 5. How to get the documentation of the numpy add function from the command line? (★☆☆) !

In [7]:
print(np.add.__doc__) # prints the docstring from inside Python, not from the command line
answer(5)

# !python -c "import numpy; numpy.info(numpy.add)"

add(x1, x2, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature])

Add arguments element-wise.

Parameters
----------
x1, x2 : array_like
    The arrays to be added.
    If ``x1.shape != x2.shape``, they must be broadcastable to a common
    shape (which becomes the shape of the output).
out : ndarray, None, or tuple of ndarray and None, optional
    A location into which the result is stored. If provided, it must have
    a shape that the inputs broadcast to. If not provided or None,
    a freshly-allocated array is returned. A tuple (possible only as a
    keyword argument) must have length equal to the number of outputs.
where : array_like, optional
    This condition is broadcast over the input. At locations where the
    condition is True, the `out` array will be set to the ufunc result.
    Elsewhere, the `out` array will retain its original value.
    Note that if an uninitialized `out` array is created via the default
    ``out=None``,

#### 6. Create a null vector of size 10 but the fifth value which is 1 (★☆☆)

In [8]:
vec = np.zeros(10)
vec[4] = 1
vec

array([0., 0., 0., 0., 1., 0., 0., 0., 0., 0.])

#### 7. Create a vector with values ranging from 10 to 49 (★☆☆)

In [9]:
vec = np.arange(10,50)
vec

array([10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26,
       27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43,
       44, 45, 46, 47, 48, 49])

#### 8. Reverse a vector (first element becomes last) (★☆☆) !

In [10]:
# vec[-1:0:-1] # drops the first element, because the stop index 0 is exclusive

vec[::-1]

array([49, 48, 47, 46, 45, 44, 43, 42, 41, 40, 39, 38, 37, 36, 35, 34, 33,
       32, 31, 30, 29, 28, 27, 26, 25, 24, 23, 22, 21, 20, 19, 18, 17, 16,
       15, 14, 13, 12, 11, 10])
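The commented-out slice in the cell above loses the first element because a slice's stop index is exclusive. A minimal sketch of the difference, using a short illustrative array `v` that is not part of the original exercise:

```python
import numpy as np

v = np.arange(10, 15)      # [10 11 12 13 14]

# The stop index 0 is exclusive, so the element at index 0 is never reached.
partial = v[-1:0:-1]       # [14 13 12 11]

# Leaving start and stop empty lets the slice cover the whole array.
full = v[::-1]             # [14 13 12 11 10]

print(partial)
print(full)
```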

#### 9. Create a 3x3 matrix with values ranging from 0 to 8 (★☆☆)

In [11]:
vec = np.arange(9).reshape(3,3)
vec

array([[0, 1, 2],
       [3, 4, 5],
       [6, 7, 8]])

#### 10. Find indices of non-zero elements from [1,2,0,0,4,0] (★☆☆) !

In [12]:
# vec = np.array([1,2,0,0,4,0])
# vec[vec != 0] # this returns the non-zero values themselves, not their indices

np.nonzero([1,2,0,0,4,0])

(array([0, 1, 4]),)

#### 11. Create a 3x3 identity matrix (★☆☆) !

In [13]:
np.identity(3) # also: np.eye(N, M=None, k=0, dtype=None) -> it's more versatile
# N = rows, M = cols, k = diagonal position.

array([[1., 0., 0.],
       [0., 1., 0.],
       [0., 0., 1.]])

#### 12. Create a 3x3x3 array with random values (★☆☆)

In [14]:
np.random.randn(3,3,3)

array([[[-0.56308413,  0.68440302,  1.09689101],
        [ 1.57437778,  0.55434083,  0.12805035],
        [ 1.05284538,  1.08807741, -0.69784261]],

       [[-0.00328146,  1.29544992, -0.91586725],
        [-0.62200601,  0.49626939, -0.08068791],
        [ 0.41430022, -0.67488356, -0.73490328]],

       [[ 0.45257755,  0.63309757,  0.37317321],
        [ 0.91126671, -0.09628005,  0.50033414],
        [-0.14203455,  0.61068895, -1.22693682]]])

#### 13. Create a 10x10 array with random values and find the minimum and maximum values (★☆☆)

In [15]:
vec = np.random.randn(10,10)
print(vec)
print(vec.max(), vec.min())

[[ 0.2075257   0.0608442  -1.18405131 -0.04799426  0.55620239  0.91936085
  -0.16759357 -0.52781304  0.06724732 -0.16597325]
 [ 0.04322133  2.04826995  0.88452836 -0.28445202 -0.14983675 -0.54095281
  -1.21491527  2.05449311 -1.61686927  0.18519617]
 [ 0.74983029 -0.15555639  1.27050971  0.83280813 -0.54012992 -0.25715094
  -0.27429965 -0.93112366  0.48904202  0.78421985]
 [-2.8115385  -0.23221599  0.56015715  1.362401   -0.63536551 -0.13677162
  -0.67268532  1.47534371 -1.24183912 -0.36184485]
 [ 0.61640114 -0.08369057 -0.7495363  -0.5489775   0.79432383  0.32589686
   0.01039242 -0.11659614  0.24293633  0.33672862]
 [ 0.52607326 -0.57810409  0.11753433 -0.87946769  0.50958895 -0.66740492
   0.59218608 -1.01187603  0.79375664  0.49201399]
 [-1.11365229  0.50735269  0.79730157 -1.84119797  0.70635721 -1.88820162
  -1.62924537  1.27461861 -1.01339347  0.87434852]
 [ 0.16768714 -0.26449715 -1.0471995   0.67969945 -0.06738676 -0.5322639
   0.01807947 -0.34653674 -1.07402489  1.23488845]
 

#### 14. Create a random vector of size 30 and find the mean value (★☆☆)

In [16]:
np.random.randn(30).mean()

np.float64(-0.29162491888392383)

#### 15. Create a 2d array with 1 on the border and 0 inside (★☆☆) !

In [17]:
vec = np.ones([10,10])

vec[1:-1,1:-1] = 0

vec

array([[1., 1., 1., 1., 1., 1., 1., 1., 1., 1.],
       [1., 0., 0., 0., 0., 0., 0., 0., 0., 1.],
       [1., 0., 0., 0., 0., 0., 0., 0., 0., 1.],
       [1., 0., 0., 0., 0., 0., 0., 0., 0., 1.],
       [1., 0., 0., 0., 0., 0., 0., 0., 0., 1.],
       [1., 0., 0., 0., 0., 0., 0., 0., 0., 1.],
       [1., 0., 0., 0., 0., 0., 0., 0., 0., 1.],
       [1., 0., 0., 0., 0., 0., 0., 0., 0., 1.],
       [1., 0., 0., 0., 0., 0., 0., 0., 0., 1.],
       [1., 1., 1., 1., 1., 1., 1., 1., 1., 1.]])

#### 16. How to add a border (filled with 0's) around an existing array? (★☆☆) !

In [18]:
# ??

vec = np.ones([10,10])
vec = np.pad(vec, pad_width=1, mode='constant', constant_values=[0])
vec

array([[0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 0.],
       [0., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 0.],
       [0., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 0.],
       [0., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 0.],
       [0., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 0.],
       [0., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 0.],
       [0., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 0.],
       [0., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 0.],
       [0., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 0.],
       [0., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.]])

#### 17. What is the result of the following expression? (★☆☆) !
```python
0 * np.nan
np.nan == np.nan
np.inf > np.nan
np.nan - np.nan
np.nan in set([np.nan])
0.3 == 3 * 0.1
```

In [19]:
print(0 * np.nan)
print(np.nan == np.nan)
print(np.inf > np.nan)
print(np.nan - np.nan)
print(np.nan in set([np.nan]))
print(0.3 == 3 * 0.1)

nan
False
False
nan
True
False
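The `True` for `np.nan in set([np.nan])` can look odd next to `np.nan == np.nan` being `False`. A short sketch of the likely reason, relying on the fact that Python containers check object identity before equality (`nan_a` and `nan_b` are illustrative names):

```python
import numpy as np

nan_a = np.nan
nan_b = float("nan")          # a different NaN object

print(nan_a == nan_a)         # False: NaN never compares equal, even to itself
print(nan_a in {np.nan})      # True: same object, membership checks identity first
print(nan_b in {np.nan})      # False: different object and == is False
print(np.isnan(nan_b))        # True: np.isnan is the reliable NaN test
```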


#### 18. Create a 5x5 matrix with values 1,2,3,4 just below the diagonal (★☆☆) !

In [20]:
vec = np.zeros([5,5])

vec[4,0:4] = [1,2,3,4] # wrong attempt: this fills the last row, not the diagonal below the main one

vec

# correct solution:
np.diag(1+np.arange(4), k=-1) 
# 1. k=-1 places the values on the diagonal just below the main one (k=0)
# 2. the values are [1,2,3,4], which is equal to (1 + np.arange(4))
# 3. the result is 5x5 because a sub-diagonal of 4 elements requires a main diagonal of 5 elements
# 4. all remaining elements default to zero

array([[0, 0, 0, 0, 0],
       [1, 0, 0, 0, 0],
       [0, 2, 0, 0, 0],
       [0, 0, 3, 0, 0],
       [0, 0, 0, 4, 0]])

#### 19. Create a 8x8 matrix and fill it with a checkerboard pattern (★☆☆) ! 

In [21]:
# ??
vec = np.zeros([8,8])
vec[1::2, ::2] = 1
vec[::2, 1::2] = 1

vec


array([[0., 1., 0., 1., 0., 1., 0., 1.],
       [1., 0., 1., 0., 1., 0., 1., 0.],
       [0., 1., 0., 1., 0., 1., 0., 1.],
       [1., 0., 1., 0., 1., 0., 1., 0.],
       [0., 1., 0., 1., 0., 1., 0., 1.],
       [1., 0., 1., 0., 1., 0., 1., 0.],
       [0., 1., 0., 1., 0., 1., 0., 1.],
       [1., 0., 1., 0., 1., 0., 1., 0.]])

#### 20. Consider a (6,7,8) shape array, what is the index (x,y,z) of the 100th element? (★☆☆)

In [22]:
vec = np.zeros([6,7,8])

flat = vec.flatten()

flat[99] = 6

vec = flat.reshape(6,7,8)

np.where(vec == 6)


# another solution: 
np.unravel_index(99,(6,7,8)) # finding the corresponding index to a flat index in a multidimensional shape.

(np.int64(1), np.int64(5), np.int64(3))
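The flat-index arithmetic behind `np.unravel_index` can be checked by hand. The sketch below assumes the default C (row-major) order and uses `np.ravel_multi_index` as the inverse mapping:

```python
import numpy as np

shape = (6, 7, 8)

# The 100th element has the zero-based flat index 99.
idx = np.unravel_index(99, shape)
print(idx)

# The same result by hand: 99 = 1*(7*8) + 5*8 + 3
x, rest = divmod(99, 7 * 8)
y, z = divmod(rest, 8)
print(x, y, z)                # 1 5 3

# ravel_multi_index maps the index tuple back to the flat index.
flat = np.ravel_multi_index((1, 5, 3), shape)
print(flat)                   # 99
```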

#### 21. Create a checkerboard 8x8 matrix using the tile function (★☆☆)

In [23]:
np.tile([[0,1],[1,0]], (4,4)) # tiling the 1D [0,1] only gives vertical stripes; a 2x2 block is needed for a checkerboard

array([[0, 1, 0, 1, 0, 1, 0, 1],
       [1, 0, 1, 0, 1, 0, 1, 0],
       [0, 1, 0, 1, 0, 1, 0, 1],
       [1, 0, 1, 0, 1, 0, 1, 0],
       [0, 1, 0, 1, 0, 1, 0, 1],
       [1, 0, 1, 0, 1, 0, 1, 0],
       [0, 1, 0, 1, 0, 1, 0, 1],
       [1, 0, 1, 0, 1, 0, 1, 0]])

#### 22. Normalize a 5x5 random matrix (★☆☆)

In [24]:
val = np.random.randn(5,5)

print((val - val.min())/(val.max() - val.min()) )

[[0.78824081 0.36484441 0.59829518 0.97779463 0.19576494]
 [0.41842877 0.         1.         0.39042216 0.55703471]
 [0.75781057 0.43496554 0.71797226 0.69338056 0.4740786 ]
 [0.66889274 0.58926184 0.19610862 0.448447   0.64038315]
 [0.89621302 0.14511067 0.40913038 0.89203691 0.26675122]]


#### 23. Create a custom dtype that describes a color as four unsigned bytes (RGBA) (★☆☆)

In [25]:
# !
color = np.dtype([
    ('R', 'u1'),
    ('G', 'uint8'),
    ('B', 'uint8'),
    ('A', 'uint8')
])
# u1 = uint8

colors = np.empty(3, dtype=color)

colors

array([(156, 188, 1, 0), (  0,   0, 0, 0), (  0,   0, 0, 0)],
      dtype=[('R', 'u1'), ('G', 'u1'), ('B', 'u1'), ('A', 'u1')])

#### 24. Multiply a 5x3 matrix by a 3x2 matrix (real matrix product) (★☆☆)

In [26]:
val = np.random.randint(1,10, [5,3]) 
val2 = np.random.randint(1,10, [3,2])

print(val)

print(val2)

val @ val2 #!

[[8 8 8]
 [9 7 2]
 [3 6 2]
 [7 4 6]
 [8 5 9]]
[[4 4]
 [1 6]
 [3 6]]


array([[ 64, 128],
       [ 49,  90],
       [ 24,  60],
       [ 50,  88],
       [ 64, 116]])

#### 25. Given a 1D array, negate all elements which are between 3 and 8, in place. (★☆☆)

In [27]:
val = np.arange(10)

val[(3 < val) & (val < 8)] *= -1 # the boolean mask selects 4..7 and *= negates them in place

val

# answer(25)
# # Author: Evgeni Burovski

# Z = np.arange(11)
# Z[(3 < Z) & (Z < 8)] *= -1 # !
# print(Z)

array([ 0,  1,  2,  3, -4, -5, -6, -7,  8,  9])

#### 26. What is the output of the following script? (★☆☆)
```python
# Author: Jake VanderPlas

print(sum(range(5),-1))
from numpy import *
print(sum(range(5),-1))
```

In [28]:
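print(sum(range(5), -1))     # built-in sum: -1 is the start value, so 0+1+2+3+4-1 = 9
print(np.sum(range(5), -1))  # numpy sum (what `from numpy import *` brings in): -1 is the axis, so the result is 10

9
10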

#### 27. Consider an integer vector Z, which of these expressions are legal? (★☆☆)
```python
Z**Z
2 << Z >> 2
Z <- Z
1j*Z
Z/1/1
Z<Z>Z
```

In [29]:
# !

I = 3
2 << I >> 2
I <- I # parsed as I < (-I)

answer(27)
Z = 3 # scalar stand-in: with an integer array, Z<Z>Z raises ValueError (ambiguous truth value)
print(Z**Z)
print(2 << Z >> 2)
print(Z <- Z)
print(1j*Z)
print(Z/1/1)
print(Z<Z>Z)

Z**Z
2 << Z >> 2
Z <- Z
1j*Z
Z/1/1
Z<Z>Z
27
4
False
3j
3.0
False
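The printed results above come from a scalar `Z = 3`, while the question asks about an integer vector. A minimal sketch with an actual vector (the values `[1, 2, 3]` are an arbitrary choice) shows that every expression is legal except the chained comparison:

```python
import numpy as np

Z = np.arange(1, 4)           # integer vector [1 2 3]

print(Z ** Z)                 # [ 1  4 27]
print(2 << Z >> 2)            # (2 << Z) >> 2 -> [1 2 4]
print(Z < -Z)                 # what "Z <- Z" means -> [False False False]
print(1j * Z)
print(Z / 1 / 1)

# "Z < Z > Z" expands to (Z < Z) and (Z > Z), which needs bool(array).
try:
    Z < Z > Z
    chained_ok = True
except ValueError:
    chained_ok = False
print(chained_ok)             # False
```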


#### 28. What are the result of the following expressions? (★☆☆)
```python
np.array(0) / np.array(0)
np.array(0) // np.array(0)
np.array([np.nan]).astype(int).astype(float)
```

In [30]:
# !
print(np.array(0) / np.array(0))
print(np.array(0) // np.array(0))
print(np.array([np.nan]).astype(int).astype(float))

nan
0
[-9.22337204e+18]



#### 29. How to round away from zero a float array ? (★☆☆)

In [31]:
answer(29) # I don't understand the first answer

# Author: Charles R Harris

Z = np.random.uniform(-10,+10,10)
print(np.copysign(np.ceil(np.abs(Z)), Z))

# More readable but less efficient
print(np.where(Z>0, np.ceil(Z), np.floor(Z)))
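The first answer is easier to follow when split into two steps: round the magnitudes up, then restore the original signs. A small sketch with fixed values chosen for illustration:

```python
import numpy as np

Z = np.array([-2.5, -0.1, 0.3, 4.0])

magnitude = np.ceil(np.abs(Z))        # round the absolute values up: [3. 1. 1. 4.]
rounded = np.copysign(magnitude, Z)   # copy the sign of Z back on: [-3. -1.  1.  4.]

print(rounded)
```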


#### 30. How to find common values between two arrays? (★☆☆)

In [32]:
answer(30) # ok !

Z1 = np.random.randint(0,10,10)
Z2 = np.random.randint(0,10,10)
print(np.intersect1d(Z1,Z2))


#### 31. How to ignore all numpy warnings (not recommended)? (★☆☆)

In [33]:
# with np.errstate(all="ignore"): # ! (that's a context manager)
#     np.arange(3) / 0

#### 32. Is the following expressions true? (★☆☆)
```python
np.sqrt(-1) == np.emath.sqrt(-1)
```

In [34]:
# My first guess was True, but the result is False.

np.sqrt(-1) == np.emath.sqrt(-1)

# Why is it False?

# np.sqrt(-1) # floating point output (nan)
# np.emath.sqrt(-1) # complex number output (1j)
# nan compares unequal to everything, so the comparison is False.

# What is numpy.lib.scimath?
# The module is officially called numpy.lib.scimath and is usually accessed through the convenient alias np.emath. It provides versions of mathematical functions that return complex results where the standard NumPy functions would return nan.

# The key difference is how inputs that are invalid for real numbers are handled, for example the square root of a negative number.


np.False_

#### 33. How to get the dates of yesterday, today and tomorrow? (★☆☆)

In [35]:
np.datetime64('now') # current timestamp with second resolution; np.datetime64('today') gives the date only
np.datetime64('now') - np.timedelta64(1,'D') # yesterday
np.datetime64('now') + np.timedelta64(1,'D') # tomorrow

np.datetime64('2025-10-06T08:01:52')

#### 34. How to get all the dates corresponding to the month of July 2016? (★★☆)

In [36]:
first = np.datetime64('2016-07-01')
while np.datetime64(first, 'M').astype(int) % 12 + 1 == 7:
    print(first)
    first+=np.timedelta64(1,'D')

answer(34)

dates = np.arange('2016-07', '2016-08', dtype='datetime64[D]') # the [D] unit sets day resolution, so the range steps one day at a time.
print(dates)

2016-07-01
2016-07-02
2016-07-03
2016-07-04
2016-07-05
2016-07-06
2016-07-07
2016-07-08
2016-07-09
2016-07-10
2016-07-11
2016-07-12
2016-07-13
2016-07-14
2016-07-15
2016-07-16
2016-07-17
2016-07-18
2016-07-19
2016-07-20
2016-07-21
2016-07-22
2016-07-23
2016-07-24
2016-07-25
2016-07-26
2016-07-27
2016-07-28
2016-07-29
2016-07-30
2016-07-31
Z = np.arange('2016-07', '2016-08', dtype='datetime64[D]')
print(Z)
['2016-07-01' '2016-07-02' '2016-07-03' '2016-07-04' '2016-07-05'
 '2016-07-06' '2016-07-07' '2016-07-08' '2016-07-09' '2016-07-10'
 '2016-07-11' '2016-07-12' '2016-07-13' '2016-07-14' '2016-07-15'
 '2016-07-16' '2016-07-17' '2016-07-18' '2016-07-19' '2016-07-20'
 '2016-07-21' '2016-07-22' '2016-07-23' '2016-07-24' '2016-07-25'
 '2016-07-26' '2016-07-27' '2016-07-28' '2016-07-29' '2016-07-30'
 '2016-07-31']


#### 35. How to compute ((A+B)*(-A/2)) in place (without copy)? (★★☆)

In [37]:
# KEY LEARNING SUMMARY FROM NUMPY IN-PLACE OPERATIONS:

# 1. MEMORY MANAGEMENT
# - Normal operations (A + B) create new arrays each time
# - In-place operations (A += B) modify existing arrays, saving memory
A = np.array([1, 2, 3])
B = np.array([4, 5, 6])
C = A + B  # New array created: [5, 7, 9]
A += B     # A modified in-place: [5, 7, 9] (no new array)

# 2. OUT PARAMETER USAGE
# - np.add(A, B, out=result) stores the result in pre-allocated memory
result = np.empty_like(A)
np.add(A, B, out=result)  # result = [9, 12, 15] (A is already [5, 7, 9]; no temporary)
np.add(A, B, out=A)       # A = [9, 12, 15] (overwrites the original)

# 3. OPERATORS VS FUNCTIONS
# - Operators for simplicity, functions for control
C = A * B                    # Simple: [36, 60, 90]
np.multiply(A, B, out=A)     # Controlled: A = [36, 60, 90]
np.multiply(A, B, where=[True,False,True])  # Conditional: without out=, positions where the mask is False are left uninitialized

# 4. TEMPORARY VARIABLE STRATEGY
# - One copy avoids multiple intermediates
A = np.array([1, 2, 3], dtype=float); B = np.array([4, 5, 6])
temp = A.copy()     # temp = [1, 2, 3]; a plain (temp = A) would only create an alias, so A += B would change temp too
A += B              # A = [5, 7, 9] 
temp *= -0.5        # temp = [-0.5, -1, -1.5]
A *= temp           # A = [-2.5, -7, -13.5] (only 3 arrays total)

# 5. EXPRESSION EXECUTION FLOW
# - Complex expressions create hidden temporaries
result = ((A + B) * (-A / 2))
# Actually creates: temp1=A+B, temp2=-A, temp3=temp2/2, result=temp1*temp3
# Total: 4 new arrays instead of 1!

# 6. EINSTEIN SUMMATION (EINSUM)
# - Compact notation for complex operations
A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])
result = np.einsum('ij,jk->ik', A, B)  # Matrix multiplication: [[19, 22], [43, 50]]

# 7. PERFORMANCE TRADEOFFS
# - Readable code vs Memory efficiency
# result = A + B - C * D  # Readable but allocates intermediate arrays for A + B and C * D
# vs
# result = A.copy(); result += B; temp = C.copy(); temp *= D; result -= temp  # Fewer allocations but harder to read

answer(35)

A = np.ones(3)*1
B = np.ones(3)*2
np.add(A,B,out=B)
np.divide(A,2,out=A)
np.negative(A,out=A)
np.multiply(A,B,out=A)
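One way to convince yourself that the answer really works in place is to compare the buffer address of `A` before and after. This is only a sketch; `addr_before` is an illustrative name, and `ndarray.ctypes.data` is used here as the address of the array's data buffer:

```python
import numpy as np

A = np.ones(3) * 1
B = np.ones(3) * 2
addr_before = A.ctypes.data     # address of A's data buffer

np.add(A, B, out=B)             # B = A + B          -> [3. 3. 3.]
np.divide(A, 2, out=A)          # A = A / 2          -> [0.5 0.5 0.5]
np.negative(A, out=A)           # A = -A             -> [-0.5 -0.5 -0.5]
np.multiply(A, B, out=A)        # A = (A+B) * (-A/2) -> [-1.5 -1.5 -1.5]

print(A)
print(A.ctypes.data == addr_before)   # True: A still uses the same buffer
```

Note that `B` is overwritten along the way, which is the price of avoiding a temporary array.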


#### 36. Extract the integer part of a random array of positive numbers using 4 different methods (★★☆)

In [38]:
arr = np.random.uniform(0,10, 10)
print((arr//1))
print(np.floor(arr))
print(np.trunc(arr))
print((arr - arr%1))
print(arr.astype(int))
print(arr)

answer(36)

[3. 5. 7. 8. 4. 3. 8. 5. 1. 8.]
[3. 5. 7. 8. 4. 3. 8. 5. 1. 8.]
[3. 5. 7. 8. 4. 3. 8. 5. 1. 8.]
[3. 5. 7. 8. 4. 3. 8. 5. 1. 8.]
[3 5 7 8 4 3 8 5 1 8]
[3.02552607 5.98468179 7.94333511 8.11380112 4.51639871 3.22884281
 8.91809649 5.44207976 1.46000817 8.08143416]
Z = np.random.uniform(0,10,10)

print(Z - Z%1)
print(Z // 1)
print(np.floor(Z))
print(Z.astype(int))
print(np.trunc(Z))


#### 37. Create a 5x5 matrix with row values ranging from 0 to 4 (★★☆)

In [39]:
print(np.ones([5,5]) * np.arange(5)) 


answer(37)

print(np.zeros([5,5]) + np.arange(5)) # just like my solution
print(10*'#')
print(np.tile(np.arange(5), [5,1]))

# both np.ones, np.zeros produce floating point numbers

[[0. 1. 2. 3. 4.]
 [0. 1. 2. 3. 4.]
 [0. 1. 2. 3. 4.]
 [0. 1. 2. 3. 4.]
 [0. 1. 2. 3. 4.]]
Z = np.zeros((5,5))
Z += np.arange(5)
print(Z)

# without broadcasting
Z = np.tile(np.arange(0, 5), (5,1))
print(Z)
[[0. 1. 2. 3. 4.]
 [0. 1. 2. 3. 4.]
 [0. 1. 2. 3. 4.]
 [0. 1. 2. 3. 4.]
 [0. 1. 2. 3. 4.]]
##########
[[0 1 2 3 4]
 [0 1 2 3 4]
 [0 1 2 3 4]
 [0 1 2 3 4]
 [0 1 2 3 4]]


#### 38. Consider a generator function that generates 10 integers and use it to build an array (★☆☆)

In [40]:
def fun():
    for i in range(10):
        yield i
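# np.array(fun()) # wrong: this wraps the generator object in a 0-d object array instead of consuming it

print(np.fromiter(fun(), dtype=float)) # count=-1 is the default (read until the iterator is exhausted), so adding it does not change the result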

answer(38)

[0. 1. 2. 3. 4. 5. 6. 7. 8. 9.]
def generate():
    for x in range(10):
        yield x
Z = np.fromiter(generate(),dtype=float,count=-1)
print(Z)
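The effect of `count` and the difference from `np.array` can be seen side by side. A minimal sketch reusing the same generator (`all_items`, `first_four`, and `wrapped` are illustrative names):

```python
import numpy as np

def generate():
    for x in range(10):
        yield x

all_items = np.fromiter(generate(), dtype=float)             # default count=-1 reads everything
first_four = np.fromiter(generate(), dtype=float, count=4)   # stops after 4 items
wrapped = np.array(generate())                                # 0-d object array holding the generator

print(all_items.shape)               # (10,)
print(first_four)                    # [0. 1. 2. 3.]
print(wrapped.shape, wrapped.dtype)  # () object
```

Passing an explicit `count` also lets NumPy allocate the output once instead of growing it while iterating.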


#### 39. Create a vector of size 10 with values ranging from 0 to 1, both excluded (★★☆)

In [41]:
print(np.random.uniform(0,1, 10)) # samples from [0, 1): 1 is excluded but 0 can occur, and the values are random rather than evenly spaced

answer(39)

print(np.linspace(0,1,11, endpoint=False)[1:])
# linspace produces an array with equal spacing between its consecutive elements
# the endpoint=False part excludes the "1" from the array
# the [1:] part drops the "0", so we generate 11 points in order to keep 10 elements

print(np.linspace(0,1,12)[1:-1]) # the same result as above

[0.85377128 0.99099821 0.33624922 0.06541582 0.69163416 0.49878551
 0.40017859 0.91683204 0.32695294 0.42486525]
Z = np.linspace(0,1,11,endpoint=False)[1:]
print(Z)
[0.09090909 0.18181818 0.27272727 0.36363636 0.45454545 0.54545455
 0.63636364 0.72727273 0.81818182 0.90909091]
[0.09090909 0.18181818 0.27272727 0.36363636 0.45454545 0.54545455
 0.63636364 0.72727273 0.81818182 0.90909091]


#### 40. Create a random vector of size 10 and sort it (★★☆)

In [42]:
r = np.random.rand(10); r.sort(); print(r)  # in place sorting

print( np.sort( np.random.rand(10) ) ) # returns a new sorted array

# # np.argsort() - Returns indices that would sort array
# indices = np.argsort(arr)                    # [1, 3, 6, 0, 2, 4, 5]
# sorted_via_indices = arr[indices]            # Same as np.sort(arr)


# # Sort entire array flattened
# flat_sorted = np.sort(arr_2d, axis=None)     # [1, 1, 2, 3, 4, 5, 5, 6, 9]

answer(40)

[0.03083735 0.03667817 0.09669425 0.10221864 0.15532864 0.18004099
 0.32983835 0.352561   0.49249289 0.79312163]
[0.03103105 0.05143477 0.12882467 0.16073008 0.21170144 0.24235983
 0.26616485 0.43007777 0.52412383 0.81702645]
Z = np.random.random(10)
Z.sort()
print(Z)


#### 41. How to sum a small array faster than np.sum? (★★☆)

In [43]:
# I don't know !

answer(41)
print('#'*10)
print( np.add.reduce([3,10,3]) ) 

# Author: Evgeni Burovski

Z = np.arange(10)
np.add.reduce(Z)
##########
16


#### 42. Consider two random arrays A and B, check if they are equal (★★☆)

In [44]:
a = np.random.rand(10)
b = np.random.rand(10)

print( a == b )
print('#'*10)

answer(42)

print( np.allclose(a,b) )

print( np.array_equal(a,b) )

[False False False False False False False False False False]
##########
A = np.random.randint(0,2,5)
B = np.random.randint(0,2,5)

# Assuming identical shape of the arrays and a tolerance for the comparison of values
equal = np.allclose(A,B)
print(equal)

# Checking both the shape and the element values, no tolerance (values have to be exactly equal)
equal = np.array_equal(A,B)
print(equal)
False
False
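The difference between the two checks shows up as soon as floating point rounding is involved. A small sketch with hand-picked values:

```python
import numpy as np

A = np.array([0.1 + 0.2, 1.0])
B = np.array([0.3, 1.0])

print(A == B)                  # [False  True]: 0.1 + 0.2 is not exactly 0.3
print(np.array_equal(A, B))    # False: exact comparison
print(np.allclose(A, B))       # True: equal within the default tolerance
```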


#### 43. Make an array immutable (read-only) (★★☆)

In [45]:
# what ?

# for more about the topic: https://note.nkmk.me/en/python-numpy-ndarray-immutable-read-only/

answer(43)

# another way = 
# z = np.zeros(10)
# z.setflags(write=False)
# z[1] = 1

Z = np.zeros(10)
Z.flags.writeable = False
Z[0] = 1
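Running the answer raises an exception on the assignment, which is the point of the exercise. A sketch that catches the error so the behaviour is visible (`failed` and `W` are illustrative names):

```python
import numpy as np

Z = np.zeros(10)
Z.flags.writeable = False

try:
    Z[0] = 1
    failed = False
except ValueError as err:
    failed = True
    print(err)                 # assignment destination is read-only

# A copy gets its own buffer and is writeable again.
W = Z.copy()
W[0] = 1
print(W.flags.writeable, Z[0], W[0])
```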


#### 44. Consider a random 10x2 matrix representing cartesian coordinates, convert them to polar coordinates (★★☆)

In [46]:
original = np.random.randint(10, size=[10,2])
print(original)
print('#'*10)

# useful: https://github.com/numpy/numpy/issues/5228

# for example: 

# def cart2pol(x, y):
#     """Converts Cartesian coordinates to polar coordinates."""
#     rho = np.hypot(x, y)
#     phi = np.arctan2(y, x)
#     return rho, phi

# def pol2cart(rho, phi):
#     """Converts polar coordinates to Cartesian coordinates."""
#     x = rho * np.cos(phi)
#     y = rho * np.sin(phi)
#     return x, y

# print(original[:,1]) # getting all the rows but only the second column (number: 1)
# we use arctan2 rather than arctan because it takes y and x separately, so it handles x = 0 and returns the angle in the correct quadrant

answer(44)

[[4 5]
 [3 5]
 [0 5]
 [4 2]
 [5 3]
 [5 9]
 [8 2]
 [6 5]
 [8 5]
 [4 2]]
##########
Z = np.random.random((10,2))
X,Y = Z[:,0], Z[:,1]
R = np.sqrt(X**2+Y**2)
T = np.arctan2(Y,X)
print(R)
print(T)
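The commented `cart2pol` idea can be applied directly to an integer coordinate array like `original`. A minimal sketch with a few hand-picked points (the array `points` is made up for illustration) that also shows why `arctan2` matters:

```python
import numpy as np

points = np.array([[4, 5], [0, 5], [3, 0], [-1, -1]])
x, y = points[:, 0], points[:, 1]

rho = np.hypot(x, y)           # radius, same as sqrt(x**2 + y**2)
phi = np.arctan2(y, x)         # angle in radians, valid for x = 0 and all quadrants

polar = np.column_stack((rho, phi))
print(polar)
```

For the point `(0, 5)` a plain `arctan(y / x)` would divide by zero, and for `(-1, -1)` it would return the angle of `(1, 1)`.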


#### 45. Create random vector of size 10 and replace the maximum value by 0 (★★☆)

In [47]:
arr = np.random.randint(10,size=[10])
print(arr)
print('#'*10)

arr[ np.where(arr == arr.max())[0] ] = 0

print(arr)

answer(45)

# note: arr.argmax() returns only the first index of the maximum, while np.where(arr == arr.max())[0] returns all of them

[7 7 3 8 4 2 3 5 6 7]
##########
[7 7 3 0 4 2 3 5 6 7]
Z = np.random.random(10)
Z[Z.argmax()] = 0
print(Z)
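The two approaches differ only when the maximum occurs more than once. A short sketch with a repeated maximum (the values are chosen for illustration):

```python
import numpy as np

arr = np.array([7, 9, 3, 9, 4])

print(arr.argmax())                       # 1: first occurrence only
print(np.where(arr == arr.max())[0])      # [1 3]: every occurrence

first_only = arr.copy()
first_only[first_only.argmax()] = 0       # [7 0 3 9 4]

all_max = arr.copy()
all_max[all_max == all_max.max()] = 0     # [7 0 3 0 4]

print(first_only)
print(all_max)
```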


#### 46. Create a structured array with `x` and `y` coordinates covering the [0,1]x[0,1] area (★★☆)

In [48]:
arr = np.array([[0,0], [0,1], [1,0], [1,1]])
print(arr)

answer(46)

arr = np.zeros((5,5), [('x', float), ('y', float)]) # naming columns - that's a structured array!
print(arr)
print(arr['x'], arr['y'], sep='\n')
print('#'*10)

arr['x'], arr['y'] = np.meshgrid(np.linspace(0,1,5), np.linspace(0,1,5)) # creating all possible combinations between the two 1D arrays (np.linspace) to create one 2D array.

print(arr)
print(arr['x'], arr['y'], sep='\n')


# first, the question does not fix the grid size; 5 points in each direction is just the choice made in the answer
# second, I've learned that the fields of a numpy array can be named (a structured array)
# third, the 'meshgrid' function can be used to create 2D arrays with all possible combinations of two 1D arrays.

[[0 0]
 [0 1]
 [1 0]
 [1 1]]
Z = np.zeros((5,5), [('x',float),('y',float)])
Z['x'], Z['y'] = np.meshgrid(np.linspace(0,1,5),
                             np.linspace(0,1,5))
print(Z)
[[(0., 0.) (0., 0.) (0., 0.) (0., 0.) (0., 0.)]
 [(0., 0.) (0., 0.) (0., 0.) (0., 0.) (0., 0.)]
 [(0., 0.) (0., 0.) (0., 0.) (0., 0.) (0., 0.)]
 [(0., 0.) (0., 0.) (0., 0.) (0., 0.) (0., 0.)]
 [(0., 0.) (0., 0.) (0., 0.) (0., 0.) (0., 0.)]]
[[0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0.]]
[[0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0.]]
##########
[[(0.  , 0.  ) (0.25, 0.  ) (0.5 , 0.  ) (0.75, 0.  ) (1.  , 0.  )]
 [(0.  , 0.25) (0.25, 0.25) (0.5 , 0.25) (0.75, 0.25) (1.  , 0.25)]
 [(0.  , 0.5 ) (0.25, 0.5 ) (0.5 , 0.5 ) (0.75, 0.5 ) (1.  , 0.5 )]
 [(0.  , 0.75) (0.25, 0.75) (0.5 , 0.75) (0.75, 0.75) (1.  , 0.75)]
 [(0.  , 1.  ) (0.25, 1.  ) (0.5 , 1.  ) (0.75, 1.  ) (1.  , 1.  )]]
[[0.   0.25 0.5  0.75 1.  ]
 [0.   0.25 0.

#### 47. Given two arrays, X and Y, construct the Cauchy matrix C (Cij =1/(xi - yj)) (★★☆)

In [49]:
x = np.array([1,2,3])
y = np.array([4,5,6])


C = np.zeros((3,3), [('x', float), ('y', float)])
C['x'], C['y'] = np.meshgrid(x, y, indexing='ij') # 'ij' makes x vary along the rows, so C[i,j] = 1/(x[i] - y[j]); the default 'xy' gives the transpose
C = 1 / (C['x'] - C['y'])
print(C, '#'*10, sep='\n')

answer(47)

X = np.array([1,2,3])
Y = np.array([4,5,6])
print(X,Y)
C = 1.0 / np.subtract.outer(X, Y)
print(C)
print(np.linalg.det(C))

[[-0.33333333 -0.25       -0.2       ]
 [-0.5        -0.33333333 -0.25      ]
 [-1.         -0.5        -0.33333333]]
##########
# Author: Evgeni Burovski

X = np.arange(8)
Y = X + 0.5
C = 1.0 / np.subtract.outer(X, Y)
print(np.linalg.det(C))
[1 2 3] [4 5 6]
[[-0.33333333 -0.25       -0.2       ]
 [-0.5        -0.33333333 -0.25      ]
 [-1.         -0.5        -0.33333333]]
0.00046296296296296146
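`np.subtract.outer` can also be written with plain broadcasting, which makes the index convention `C[i, j] = 1 / (x[i] - y[j])` explicit. A minimal sketch using the same `X` and `Y` as above (`C_outer` and `C_bcast` are illustrative names):

```python
import numpy as np

X = np.array([1, 2, 3])
Y = np.array([4, 5, 6])

C_outer = 1.0 / np.subtract.outer(X, Y)

# X as a column, Y as a row: the difference broadcasts to shape (3, 3).
C_bcast = 1.0 / (X[:, None] - Y[None, :])

print(np.allclose(C_outer, C_bcast))   # True
print(C_bcast[0, 2])                   # 1 / (1 - 6) = -0.2
```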


#### 48. Print the minimum and maximum representable values for each numpy scalar type (★★☆)

In [50]:
answer(48)

types = [np.int8, np.int16, np.int32, np.int64, np.uint8, np.uint16, np.uint32, np.uint64]
types_f = [np.float16, np.float32, np.float64]
for dtype in types:
    print(np.iinfo(dtype))
for dtype in types_f:
    print(np.finfo(dtype))
    
print(np.iinfo(np.intp).max, np.iinfo(np.uintp).max, sep='\n')

for dtype in [np.int8, np.int32, np.int64]:
   print(np.iinfo(dtype).min)
   print(np.iinfo(dtype).max)
for dtype in [np.float32, np.float64]:
   print(np.finfo(dtype).min)
   print(np.finfo(dtype).max)
   print(np.finfo(dtype).eps)
Machine parameters for int8
---------------------------------------------------------------
min = -128
max = 127
---------------------------------------------------------------

Machine parameters for int16
---------------------------------------------------------------
min = -32768
max = 32767
---------------------------------------------------------------

Machine parameters for int32
---------------------------------------------------------------
min = -2147483648
max = 2147483647
---------------------------------------------------------------

Machine parameters for int64
---------------------------------------------------------------
min = -9223372036854775808
max = 9223372036854775807
---------------------------------------------------------------

Ma

#### 49. How to print all the values of an array? (★★☆)

In [51]:
print(np.zeros([40,40])) # the default threshold is 1000

answer(49)

[[0. 0. 0. ... 0. 0. 0.]
 [0. 0. 0. ... 0. 0. 0.]
 [0. 0. 0. ... 0. 0. 0.]
 ...
 [0. 0. 0. ... 0. 0. 0.]
 [0. 0. 0. ... 0. 0. 0.]
 [0. 0. 0. ... 0. 0. 0.]]
np.set_printoptions(threshold=float("inf"))
Z = np.zeros((40,40))
print(Z)


In [52]:
np.set_printoptions(threshold=float('inf')) # np.inf or sys.maxsize work as well
print(np.zeros([40,40]))

[[0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
  0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
  0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
  0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
  0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
  0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
  0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
  0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
  0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]


#### 50. How to find the closest value (to a given scalar) in a vector? (★★☆)

In [53]:
arr = np.random.rand(10)
value = 4.5
diff = np.abs(arr - value)
print(arr[diff.argmin()]) # not min() -> argmin is the index while min is the value

0.9746078801457974


#### 51. Create a structured array representing a position (x,y) and a color (r,g,b) (★★☆)

In [54]:
np.set_printoptions(threshold=1000) # restore the default threshold


arr = np.zeros((5,5), [('x', int), ('y', int)])
print(arr)

arr = np.zeros((128,128,128), [('r', int), ('g', int), ('b', int)])
print(arr)

answer(51)


# to combine them in one array: Z = np.zeros(10, dtype = [ ('position', [ ('x', float), ('y', float)]),
                            #    ('color',    [ ('r', float), ('g', float), ('b', float)])])

[[(0, 0) (0, 0) (0, 0) (0, 0) (0, 0)]
 [(0, 0) (0, 0) (0, 0) (0, 0) (0, 0)]
 [(0, 0) (0, 0) (0, 0) (0, 0) (0, 0)]
 [(0, 0) (0, 0) (0, 0) (0, 0) (0, 0)]
 [(0, 0) (0, 0) (0, 0) (0, 0) (0, 0)]]
[[[(0, 0, 0) (0, 0, 0) (0, 0, 0) ... (0, 0, 0) (0, 0, 0) (0, 0, 0)]
  [(0, 0, 0) (0, 0, 0) (0, 0, 0) ... (0, 0, 0) (0, 0, 0) (0, 0, 0)]
  [(0, 0, 0) (0, 0, 0) (0, 0, 0) ... (0, 0, 0) (0, 0, 0) (0, 0, 0)]
  ...
  [(0, 0, 0) (0, 0, 0) (0, 0, 0) ... (0, 0, 0) (0, 0, 0) (0, 0, 0)]
  [(0, 0, 0) (0, 0, 0) (0, 0, 0) ... (0, 0, 0) (0, 0, 0) (0, 0, 0)]
  [(0, 0, 0) (0, 0, 0) (0, 0, 0) ... (0, 0, 0) (0, 0, 0) (0, 0, 0)]]

 [[(0, 0, 0) (0, 0, 0) (0, 0, 0) ... (0, 0, 0) (0, 0, 0) (0, 0, 0)]
  [(0, 0, 0) (0, 0, 0) (0, 0, 0) ... (0, 0, 0) (0, 0, 0) (0, 0, 0)]
  [(0, 0, 0) (0, 0, 0) (0, 0, 0) ... (0, 0, 0) (0, 0, 0) (0, 0, 0)]
  ...
  [(0, 0, 0) (0, 0, 0) (0, 0, 0) ... (0, 0, 0) (0, 0, 0) (0, 0, 0)]
  [(0, 0, 0) (0, 0, 0) (0, 0, 0) ... (0, 0, 0) (0, 0, 0) (0, 0, 0)]
  [(0, 0, 0) (0, 0, 0) (0, 0, 0) ... (0, 0, 0) 
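A single array can hold both the position and the color by nesting one structured dtype inside another. The sketch below is one possible layout; the name `point_dtype` and the choice of `uint8` for the color channels are assumptions, not part of the exercise:

```python
import numpy as np

# Hypothetical layout: float coordinates and 8-bit color channels.
point_dtype = np.dtype([
    ("position", [("x", float), ("y", float)]),
    ("color",    [("r", np.uint8), ("g", np.uint8), ("b", np.uint8)]),
])

Z = np.zeros(3, dtype=point_dtype)
Z["position"]["x"] = [0.0, 0.5, 1.0]   # field access returns a view, so this writes into Z
Z["color"][0] = (255, 0, 0)            # set one record of the nested field

print(Z)
print(Z["color"]["r"])                 # [255   0   0]
```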

#### 52. Consider a random vector with shape (100,2) representing coordinates, find point by point distances (★★☆)

In [55]:
answer(52)

z = np.random.random((10,2))
x, y = np.atleast_2d(z[:,0], z[:,1])
d = np.sqrt( (x - x.T)**2 + (y - y.T)**2 )
print(d)

# no need to fully understand this !

Z = np.random.random((10,2))
X,Y = np.atleast_2d(Z[:,0], Z[:,1])
D = np.sqrt( (X-X.T)**2 + (Y-Y.T)**2)
print(D)

# Much faster with scipy
import scipy
# Thanks Gavin Heverly-Coulson (#issue 1)
import scipy.spatial

Z = np.random.random((10,2))
D = scipy.spatial.distance.cdist(Z,Z)
print(D)
[[0.         0.57870328 0.17245881 0.40883489 0.31230457 0.24036319
  0.07574199 0.37359637 0.25575967 0.62818464]
 [0.57870328 0.         0.56635553 0.68868117 0.28953257 0.387331
  0.61243397 0.54706104 0.32731799 0.07272055]
 [0.17245881 0.56635553 0.         0.24029478 0.27686871 0.17905282
  0.24732904 0.52226972 0.2963944  0.59853767]
 [0.40883489 0.68868117 0.24029478 0.         0.42759361 0.34481229
  0.48079555 0.76043336 0.4992788  0.69747258]
 [0.31230457 0.28953257 0.27686871 0.42759361 0.         0.09801006
  0.36427719 0.44980806 0.12728891 0.32566749]
 [0.24036319 0.387331   0.17905282 0.34481229 0.09801006 0.
  0.303918   0.46281117 0.15904905 0.42069309]
 [0.07574199 0.61243397 0.247
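
The `atleast_2d` trick used above is broadcasting in disguise. A minimal sketch of the same computation with explicit `None` axes, checked against the original formulation; the names `points`, `diff`, and `dist` and the fixed seed are illustrative choices, not part of the exercise.

```python
import numpy as np

rng = np.random.default_rng(0)
points = rng.random((10, 2))          # 10 points, columns are x and y

# points[:, None, :] has shape (10, 1, 2) and points[None, :, :] has shape (1, 10, 2),
# so their difference broadcasts to (10, 10, 2): one (dx, dy) pair per pair of points.
diff = points[:, None, :] - points[None, :, :]
dist = np.sqrt((diff ** 2).sum(axis=-1))

# Same result as the atleast_2d formulation used in the answer.
x, y = np.atleast_2d(points[:, 0], points[:, 1])
dist_ref = np.sqrt((x - x.T) ** 2 + (y - y.T) ** 2)
print(np.allclose(dist, dist_ref))
```

The distance matrix is symmetric with a zero diagonal, which is a quick sanity check for any implementation.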

#### 53. How to convert a float (32 bits) array into an integer (32 bits) array in place?

In [56]:
z = np.arange(10, dtype=np.float32)
z.dtype = np.int32
z.dtype

answer(53)

# 1. ORIGINAL ARRAY: [1.0, 2.7, 4.5] as float32
#    Memory: [3F800000, 402CCCCD, 40900000] (float bit patterns)

# 2. DANGEROUS: z.dtype = np.int32
#    SAME memory, just reinterpreted: [1065353216, 1076677837, 1083179008]
#    WRONG values - no actual conversion!

# 3. SAFE BUT MEMORY-HEAVY: z.astype(np.int32)
#    CREATES NEW array: [1, 2, 4]
#    ORIGINAL memory unchanged, NEW memory allocated

# 4. MEMORY-EFFICIENT: Y = Z.view(np.int32); Y[:] = Z
#    Step 1: Y views SAME memory as int32: [1065353216, 1076677837, 1083179008]
#    Step 2: Y[:] = Z OVERWRITES memory with converted values: [1, 2, 4]
#    Result: same memory reused, proper conversion, no second result array

# Thanks Vikas (https://stackoverflow.com/a/10622758/5989906)
# & unutbu (https://stackoverflow.com/a/4396247/5989906)
Z = (np.random.rand(10)*100).astype(np.float32)
Y = Z.view(np.int32)
Y[:] = Z
print(Y)
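
The difference between reinterpreting and converting can be checked on a three-element array. This is a small sketch using the same values as the notes above; `src`, `bits`, `buf`, and `as_int` are illustrative names.

```python
import numpy as np

src = np.array([1.0, 2.7, 4.5], dtype=np.float32)

# Reinterpreting the bytes changes the label, not the stored bits.
bits = src.view(np.int32)
print(bits)            # raw IEEE-754 bit patterns read as integers

# Converting in place: write the float values through an int32 view.
buf = src.copy()
as_int = buf.view(np.int32)
as_int[:] = buf        # values are truncated toward zero on assignment
print(as_int)
print(np.shares_memory(buf, as_int))
```

After the assignment `buf` itself no longer holds meaningful floats, since its bytes now store integers.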


#### 54. How to read the following file? (★★☆)
```
1, 2, 3, 4, 5
6,  ,  , 7, 8
 ,  , 9,10,11
```

In [57]:
f = np.genfromtxt('file.csv', delimiter=',')
# =============================================
# SUMMARY: READING FILES IN NUMPY
# =============================================

# 1. For simple text files (CSV, TSV):
# ------------------------------------
# data = np.loadtxt('file.txt')                    # Space-separated
# data = np.loadtxt('file.csv', delimiter=',')     # CSV files
# data = np.loadtxt('file.csv', delimiter=',', skiprows=1, dtype=int)  # Skip header, specify type
# data = np.loadtxt('file.csv', delimiter=',', usecols=(0, 2))         # Select specific columns

# 2. For text files with missing values:
# --------------------------------------
# data = np.genfromtxt('file.csv', delimiter=',')           # Auto-fill missing with NaN
# data = np.genfromtxt('file.csv', delimiter=',', filling_values=0)  # Fill missing with 0
# data = np.genfromtxt('file.csv', delimiter=',', usemask=True)      # Create masked array

# 3. For NumPy binary files (.npy, .npz):
# ---------------------------------------
# array = np.load('data.npy')                                # Single array
# with np.load('archive.npz') as data:                       # Multiple arrays
#     array1 = data['array1']
#     array2 = data['array2']

# 4. For memory-mapped large arrays:
# ----------------------------------
# large_array = np.load('large_array.npy', mmap_mode='r')

# 5. For raw binary data (use with caution):
# ------------------------------------------
# data = np.fromfile('binary_data.bin', dtype=np.float32)

# =============================================
# RECOMMENDATIONS:
# - Use .npy/.npz format for reliable storage
# - Avoid fromfile() for long-term storage
# - Use genfromtxt() for messy data with missing values
# =============================================

answer(55)


Z = np.arange(9).reshape(3,3)
for index, value in np.ndenumerate(Z):
    print(index, value)
for index in np.ndindex(Z.shape):
    print(index, Z[index])
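
For exercise 54 itself, `np.genfromtxt` can be tried without creating a file on disk. The sketch below assumes an in-memory `io.StringIO` object stands in for `file.csv`, with the contents shown in the exercise.

```python
import io
import numpy as np

# Stand-in for file.csv with the contents shown in the exercise.
text = io.StringIO("1, 2, 3, 4, 5\n6,  ,  , 7, 8\n ,  , 9,10,11\n")

data = np.genfromtxt(text, delimiter=",")
print(data)                 # missing fields appear as nan

text.seek(0)                # rewind before reading the same buffer again
filled = np.genfromtxt(text, delimiter=",", filling_values=0)
print(filled)               # missing fields replaced by 0
```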


In [58]:
# =============================================
# SUMMARY: CREATING NUMPY .NPY/.NPZ FILES
# =============================================

# 1. Saving single array to .npy:
# -------------------------------
# np.save('filename.npy', array)           # Saves single array efficiently
# loaded_array = np.load('filename.npy')   # Load it back

# 2. Saving multiple arrays to .npz:
# ----------------------------------
# np.savez('archive.npz', key1=array1, key2=array2)  # Uncompressed
# np.savez_compressed('archive.npz', key1=array1, key2=array2)  # Compressed

# 3. Loading .npz files:
# ----------------------
# data = np.load('archive.npz')
# array1 = data['key1']                    # Access by key name
# array2 = data['key2']
# print(data.files)                        # List all keys

# =============================================
# CREATING CSV FILES FROM NUMPY
# =============================================

# 1. Basic CSV export:
# --------------------
# np.savetxt('data.csv', array, delimiter=',')           # Comma-separated
# np.savetxt('data.tsv', array, delimiter='\t')          # Tab-separated

# 2. CSV with header and formatting:
# ----------------------------------
# np.savetxt('data.csv', array, delimiter=',', 
#            header='col1,col2,col3',        # Column names
#            fmt='%.2f',                     # Float formatting (2 decimal places)
#            comments='')                    # Remove '#' from header

# 3. CSV with specific data types:
# --------------------------------
# np.savetxt('data.csv', array, delimiter=',', 
#            fmt='%d')                       # Integer format

# 4. For structured arrays with mixed data:
# -----------------------------------------
# Use pandas: pd.DataFrame(array).to_csv('data.csv', index=False)

# =============================================
# QUICK EXAMPLES:
# =============================================

# Create and save arrays
# array = np.random.rand(5, 3)
# np.save('data.npy', array)                           # Binary format
# np.savetxt('data.csv', array, delimiter=',')         # CSV format

# Create structured data example
# structured_array = np.array([(1, 2.5, 'A'), (2, 3.7, 'B')], 
#                            dtype=[('id', 'i4'), ('value', 'f4'), ('label', 'U1')])
# Use pandas for mixed data types: 
# pd.DataFrame(structured_array).to_csv('structured.csv', index=False)

# =============================================
# KEY POINTS:
# - Use .npy/.npz for NumPy-to-NumPy (fast, preserves everything)
# - Use CSV for sharing/exchanging data with other programs
# - Use pandas for complex/mixed data types in CSV
# =============================================

#### 55. What is the equivalent of enumerate for numpy arrays? (★★☆)

In [59]:
arr = np.random.randint(1, 10, 10)
total = 0
# ndenumerate yields (index, value) pairs; use np.ndindex(arr.shape) for indices only
for index, value in np.ndenumerate(arr):
    print(value)
    total += value
print(total)

answer(55)

8
5
9
4
2
4
9
8
2
1
52
Z = np.arange(9).reshape(3,3)
for index, value in np.ndenumerate(Z):
    print(index, value)
for index in np.ndindex(Z.shape):
    print(index, Z[index])


#### 56. Generate a generic 2D Gaussian-like array (★★☆)

In [60]:
print(np.random.normal(loc=0.0, scale=1.0, size=[100,2]))

answer(56)

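# Note: np.random.normal draws random samples from a normal distribution.
# The exercise asks instead for the surface G(x,y) = exp(-(x² + y²) / (2σ²)) evaluated on a 2D grid.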

[[ 1.38911504 -1.42013748]
 [ 0.20322255 -1.61682412]
 [ 0.94451182 -0.68590711]
 [ 0.51364478  0.60033878]
 [-0.43408285  0.33189599]
 [-0.31912234  0.43451177]
 [ 0.39046452 -0.00742856]
 [-0.75012489 -1.72207791]
 [-1.1507511  -1.15423603]
 [-0.37090728 -0.4793732 ]
 [-0.44050571  0.99369674]
 [-0.4707938   0.58799988]
 [-0.48836157 -0.87893327]
 [-0.53861051 -1.20758338]
 [ 1.70616321  0.58151797]
 [-0.43820925  0.75632754]
 [-0.03235427  1.31527192]
 [ 1.72466345 -0.74122612]
 [-0.0255818  -0.02625289]
 [ 0.4462751  -0.39590596]
 [-0.06026679 -0.50096646]
 [-0.70295338  1.27733039]
 [-0.24741266  0.50770058]
 [ 1.8131512  -0.67711528]
 [ 0.97846201 -1.9756037 ]
 [-0.09043482  0.56028558]
 [-1.01605802  0.15719728]
 [-0.78198932 -0.42968742]
 [ 1.04672418 -0.98820412]
 [-0.06141112  0.015029  ]
 [-1.57639818  0.29331953]
 [ 0.71482833  0.70024087]
 [ 0.31468115 -0.58820594]
 [-0.31703548  0.96914046]
 [-0.65685851 -0.19832422]
 [-0.6378845  -0.36887553]
 [-0.1209582  -0.23575728]
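
A "Gaussian-like array" is usually read as the surface exp(-(x² + y²) / (2σ²)) sampled on a grid, rather than random draws. A minimal sketch under that reading; the grid range [-1, 1], the 11 points per axis, and `sigma = 1.0` are assumptions, since the exercise fixes none of them.

```python
import numpy as np

# Grid of (x, y) coordinates covering [-1, 1] x [-1, 1].
x, y = np.meshgrid(np.linspace(-1, 1, 11), np.linspace(-1, 1, 11))

sigma = 1.0   # assumed width; the exercise does not fix a value
gauss = np.exp(-(x**2 + y**2) / (2.0 * sigma**2))

print(gauss.shape)
print(gauss.max())   # the peak sits at the grid centre
```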
 

#### 57. How to randomly place p elements in a 2D array? (★★☆)

In [61]:
answer(57)

arr = np.full(shape=[10,10], fill_value=0) # array with a custom shape and a default fill value
p = 3


# np.random.choice draws `size` items from range(a).
# replace=False means no item can be drawn twice.
# Here it picks p distinct flat indices out of the arr.size = 100 available.
indices = np.random.choice(a=arr.size, size=p, replace=False)
print(indices)


# np.put writes the values v into arr at the given flat indices.
# If v is shorter than ind it is repeated as needed, so 1, 2, or 3 values all work for 3 indices.
np.put(a=arr, ind=indices, v=[20])
print(arr)


# Author: Divakar

n = 10
p = 3
Z = np.zeros((n,n))
np.put(Z, np.random.choice(range(n*n), p, replace=False),1)
print(Z)
[86 94 96]
[[ 0  0  0  0  0  0  0  0  0  0]
 [ 0  0  0  0  0  0  0  0  0  0]
 [ 0  0  0  0  0  0  0  0  0  0]
 [ 0  0  0  0  0  0  0  0  0  0]
 [ 0  0  0  0  0  0  0  0  0  0]
 [ 0  0  0  0  0  0  0  0  0  0]
 [ 0  0  0  0  0  0  0  0  0  0]
 [ 0  0  0  0  0  0  0  0  0  0]
 [ 0  0  0  0  0  0 20  0  0  0]
 [ 0  0  0  0 20  0 20  0  0  0]]
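
The flat indices printed above map back to (row, column) positions with `np.unravel_index`. A short sketch that places `p` ones and confirms where they landed; the seed and the names `grid` and `flat_idx` are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 10, 3
grid = np.zeros((n, n), dtype=int)

# Draw p distinct flat indices, then write through them.
flat_idx = rng.choice(grid.size, size=p, replace=False)
np.put(grid, flat_idx, 1)

# np.unravel_index converts flat indices back to (row, col) pairs.
rows, cols = np.unravel_index(flat_idx, grid.shape)
print(list(zip(rows.tolist(), cols.tolist())))
print(grid.sum())   # equals p because the indices are distinct
```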


#### 58. Subtract the mean of each row of a matrix (★★☆)

In [62]:
arr = np.random.rand(4,4)
l = []

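# arr.mean(axis=0) can replace all of the following: the loop computes column means
# (row means, which the exercise asks for, are arr.mean(axis=1))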
for i in range(arr.shape[1]): 
    l.append(arr[:,i].mean())

print(l)
print('#'*10)
print(arr)
print('#'*10)

answer(58)


old_arr = np.random.rand(5, 10)

# Recent versions of numpy
new_arr = old_arr - old_arr.mean(axis=1, keepdims=True)

# Older versions of numpy
new_arr = old_arr - old_arr.mean(axis=1).reshape(-1, 1)


print('#'*10)
print(old_arr)
print('#'*10)
print(new_arr)
# =============================================
# NUMPY keepdims=True - KEY POINTS
# =============================================

# 1. WHAT IT DOES:
# ---------------
# keepdims=True preserves array dimensions during reduction operations
# Without it: (5, 10) → mean(axis=1) → (5,)    # Dimension lost!
# With it:    (5, 10) → mean(axis=1, keepdims=True) → (5, 1)  # Dimension kept!

# 2. WHY IT MATTERS FOR BROADCASTING:
# -----------------------------------
# X.shape = (5, 10)
# 
# Without keepdims:
# mean = X.mean(axis=1)        # shape (5,) 
# Y = X - mean                 # ERROR! (5,10) vs (5,) - incompatible
#
# With keepdims:
# mean = X.mean(axis=1, keepdims=True)  # shape (5, 1)
# Y = X - mean                          # WORKS! (5,10) vs (5,1) - broadcasts

# 3. PRACTICAL EXAMPLES:
# ----------------------
# Standardization:
# normalized = (X - X.mean(axis=1, keepdims=True)) / X.std(axis=1, keepdims=True)

# Centering:
# centered = X - X.mean(axis=0, keepdims=True)  # Center columns

# 4. ALTERNATIVE WITHOUT keepdims:
# --------------------------------
# mean_manual = X.mean(axis=1).reshape(-1, 1)  # Manual reshaping needed
# Y_manual = X - mean_manual

# 5. BOTTOM LINE:
# ---------------
# Use keepdims=True for cleaner code and automatic broadcasting compatibility
# Especially useful with: mean(), sum(), max(), min(), std(), var()
# =============================================
# reshape(-1,1) means: force the number of columns to be 1 and let NumPy infer the number of rows (-1) from the array size

[np.float64(0.4813287148335565), np.float64(0.39814656509649915), np.float64(0.262728471364426), np.float64(0.23259706535700708)]
##########
[[0.13895868 0.35695373 0.68778427 0.34612549]
 [0.96041823 0.25222112 0.27108971 0.12594874]
 [0.46322457 0.15772706 0.04843935 0.02603067]
 [0.36271338 0.82568435 0.04360054 0.43228337]]
##########
# Author: Warren Weckesser

X = np.random.rand(5, 10)

# Recent versions of numpy
Y = X - X.mean(axis=1, keepdims=True)

# Older versions of numpy
Y = X - X.mean(axis=1).reshape(-1, 1)

print(Y)
##########
[[0.69006014 0.98989904 0.4947872  0.83232398 0.99093792 0.05943735
  0.35247217 0.56209827 0.434965   0.28450008]
 [0.85556265 0.15798301 0.78615639 0.78305643 0.20224852 0.06818676
  0.57551701 0.75921576 0.8581126  0.75174616]
 [0.95236336 0.85027009 0.12580083 0.56097098 0.69075078 0.31266634
  0.44770154 0.27724411 0.81491723 0.63672179]
 [0.50048345 0.48322089 0.97080722 0.72426595 0.92038495 0.52786653
  0.0671676  0.34606795 0.18865921 0.600

#### 59. How to sort an array by the nth column? (★★☆)

In [63]:
arr = np.random.randint(0,10,[10,10])
print(arr)
print('#'*10)
n = 3
# arr.sort(axis=0, order=3)

answer(59)

print(arr[ arr[:,n].argsort() ]) # argsort returns the row order that sorts column n; indexing arr with it reorders whole rows.

# =============================================
# NUMPY SORTING - COMPLETE SUMMARY
# =============================================

# 1. BASIC SORTING FUNCTIONS:
# ---------------------------
# sorted_arr = np.sort(arr)           # Returns sorted COPY (all dimensions)
# arr.sort()                          # In-place sort (modifies original)
# indices = np.argsort(arr)           # Returns indices that sort the array
# sorted_arr = arr[indices]           # Sort using indices

# 2. MULTI-DIMENSIONAL SORTING:
# -----------------------------
# np.sort(arr_2d, axis=0)            # Sort within each column (down the rows)
# np.sort(arr_2d, axis=1)            # Sort within each row (across the columns)
# np.sort(arr_2d, axis=None)         # Flatten and sort entire array

# 3. PARTIAL SORTING (Efficient for top/bottom k):
# ------------------------------------------------
# 1. partial = np.partition(arr, k)      # Element k lands in its sorted position; smaller-or-equal values before it, larger-or-equal after
# 2. indices = np.argpartition(arr, k)   # Indices for partial sort
# 3. Get 3 largest elements: np.partition(arr, -3)[-3:]
    # Step 1: np.partition(arr, -3) - This rearranges the array so that the 3rd largest element
    #   is in the 3rd position from the end, with all larger-or-equal elements to its right
    # Step 2: [-3:] - This takes a slice of the last 3 elements from the rearranged array
    # Example: If arr = [5, 2, 8, 1, 9, 3, 7]
    #          np.partition(arr, -3) might give: [1, 2, 3, 5, 7, 8, 9]
    #          Then [-3:] takes: [7, 8, 9] which are the 3 largest (their relative order is not guaranteed)

# 4. SORTING ALGORITHMS (kind parameter):
# ---------------------------------------
# np.sort(arr, kind='quicksort')     # Fastest, not stable (DEFAULT)
# np.sort(arr, kind='mergesort')     # Stable (preserves equal element order)
# np.sort(arr, kind='heapsort')      # Good worst-case performance

# 5. STRUCTURED ARRAYS (sort by fields):
# --------------------------------------
# dt = [('name', 'U10'), ('age', int)]
# people = np.array([('Alice',25), ('Bob',20)], dtype=dt)
# by_age = np.sort(people, order='age')           # Sort by age field
# by_multi = np.sort(people, order=['age','name'])# Sort by multiple fields

# 6. SPECIAL DATA TYPES:
# ----------------------
# Complex: sorted by real part, then imaginary
# Strings: lexicographical order
# Custom order: NumPy sorts take no key function; build a key array and index with its argsort (np.lexsort for several keys)

# 7. COMMON PATTERNS:
# -------------------
# Descending order: np.sort(arr)[::-1]
# Sort rows by column: data[data[:,0].argsort()]
# Get top k: arr[np.argpartition(arr, -k)[-k:]]
# Stable sort for complex cases: kind='mergesort'

# =============================================
# QUICK REFERENCE - WHEN TO USE WHAT:
# =============================================

# np.sort()        - Need sorted copy, any dimension
# .sort()          - In-place, memory efficient  
# np.argsort()     - Need sorting order for multiple arrays
# np.partition()   - Only care about top/bottom k elements (FAST)
# kind='mergesort' - When equal elements should keep original order
# axis parameter   - Control sorting direction in multi-dim arrays

# =============================================
# PERFORMANCE TIPS:
# =============================================
# - Use partition() instead of sort() for top/bottom k
# - quicksort is fastest for most cases
# - mergesort when stability matters
# - argsort + indexing for sorting multiple related arrays
# =============================================

[[5 3 6 3 5 8 6 2 4 8]
 [4 5 1 3 8 1 9 4 9 9]
 [2 1 4 7 3 7 6 5 1 2]
 [8 6 1 0 5 6 6 7 6 4]
 [2 6 6 0 8 2 0 0 0 6]
 [9 8 4 3 1 3 5 8 2 7]
 [7 8 3 7 6 0 4 7 0 2]
 [9 5 9 5 3 0 5 5 2 0]
 [9 8 9 8 0 5 0 4 6 6]
 [7 9 0 1 6 8 4 9 0 1]]
##########
# Author: Steve Tjoa

Z = np.random.randint(0,10,(3,3))
print(Z)
print(Z[Z[:,1].argsort()])
[[8 6 1 0 5 6 6 7 6 4]
 [2 6 6 0 8 2 0 0 0 6]
 [7 9 0 1 6 8 4 9 0 1]
 [4 5 1 3 8 1 9 4 9 9]
 [9 8 4 3 1 3 5 8 2 7]
 [5 3 6 3 5 8 6 2 4 8]
 [9 5 9 5 3 0 5 5 2 0]
 [2 1 4 7 3 7 6 5 1 2]
 [7 8 3 7 6 0 4 7 0 2]
 [9 8 9 8 0 5 0 4 6 6]]
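
When the sort column contains ties, the order of the tied rows depends on the sort kind. A small sketch with a hand-made array (illustrative values) comparing a stable `argsort` with `np.lexsort`, which breaks ties using a second column.

```python
import numpy as np

Z = np.array([[3, 1, 9],
              [1, 1, 4],
              [2, 0, 7],
              [0, 1, 5]])
n = 1

# kind="stable" keeps rows with equal keys in their original order.
order = Z[:, n].argsort(kind="stable")
print(Z[order])

# np.lexsort treats the LAST key as primary: column n first, ties broken by column 0.
order2 = np.lexsort((Z[:, 0], Z[:, n]))
print(Z[order2])
```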


#### 60. How to tell if a given 2D array has null columns? (★★☆)

In [64]:
print(f[f is None]) # wrong!

answer(60)

# =============================================
# CHECKING FOR NULL/NaN VALUES IN NUMPY ARRAYS
# =============================================

# 1. BASIC NaN CHECKING:
# ----------------------
# nan_mask = np.isnan(arr)            # Returns Boolean array (True where NaN exists)
# has_any_nan = np.any(np.isnan(arr)) # Returns True if ANY NaN found in array
# all_finite = np.all(np.isfinite(arr)) # Returns True if ALL elements are valid numbers

# 2. PRACTICAL EXAMPLES:
# ----------------------
# arr = np.array([1.0, 2.0, np.nan, 4.0])
# 
# np.isnan(arr)     → [False, False, True, False]  # See WHERE NaNs are
# np.any(np.isnan(arr)) → True                     # Check IF any NaN exists
# np.all(np.isfinite(arr)) → False                 # Check if ALL are valid numbers

# 3. IMPORTANT NOTES:
# -------------------
# - Integer arrays CANNOT contain NaN values (only works with Floats)
# - Use np.isnan(np.sum(arr)) for efficient yes/no checking
# - Use np.isnan(arr) to locate exact positions of NaNs
# - np.isfinite() also checks for infinite values (inf, -inf)

# 4. QUICK REFERENCE:
# -------------------
# "Is there any NaN?" → np.any(np.isnan(arr))
# "Where are the NaNs?" → np.isnan(arr)  
# "Are all values valid numbers?" → np.all(np.isfinite(arr))
# =============================================

print('#'*10)
# checking int 2D arrays for all-zero columns: (~arr.any(axis=0)).any()   (the parentheses matter: ~ binds looser than the method call)
# checking float 2D arrays for all-NaN columns: np.isnan(arr).all(axis=0)
# .


[]
# Author: Warren Weckesser

# null : 0 
Z = np.random.randint(0,3,(3,10))
print((~Z.any(axis=0)).any())

# null : np.nan
Z=np.array([
    [0,1,np.nan],
    [1,2,np.nan],
    [4,5,np.nan]
])
print(np.isnan(Z).all(axis=0))
##########
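
Both readings of "null column" can be checked on small hand-made arrays. A sketch with illustrative data; `Z` has one all-zero column and `W` has one all-NaN column.

```python
import numpy as np

Z = np.array([[0, 1, 0],
              [0, 2, 3],
              [0, 0, 4]])

zero_cols = ~Z.any(axis=0)        # True where a column is entirely zero
print(zero_cols)
print(zero_cols.any())            # is there at least one such column?

W = np.array([[0.0, 1.0, np.nan],
              [1.0, np.nan, np.nan]])
nan_cols = np.isnan(W).all(axis=0)   # True where a column is entirely NaN
print(nan_cols)
```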


#### 61. Find the nearest value from a given value in an array (★★☆)

In [65]:
answer(61)
arr = np.random.uniform(0,1,10)
value = 0.5
print('#'*10)
print(arr)
print('#'*10)

print(arr[np.abs(arr - value).argmin()]) 
# For an nD array (1D, 2D, 3D) the same logic works with arr.flat[...] instead of arr[...], because argmin returns a flat index.
# How it works: subtract "value" from every element of "arr"; the element closest to "value" has the smallest absolute difference.
#   argmin returns the index of that smallest difference (min() would return the difference itself).

Z = np.random.uniform(0,1,10)
z = 0.5
m = Z.flat[np.abs(Z - z).argmin()]
print(m)
##########
[0.70597176 0.65646214 0.68223378 0.062006   0.07320623 0.58485328
 0.56510102 0.82144679 0.61666512 0.28245176]
##########
0.5651010185739924


#### 62. Considering two arrays with shape (1,3) and (3,1), how to compute their sum using an iterator? (★★☆)

In [66]:
hint(62)
print('#'*10)
a = np.random.rand(1,3)
b = np.random.rand(3,1)
print(a,b)
total = 0  # named total to avoid shadowing the built-in sum()
for i in np.nditer(a):
    total += i
print(total)
print('#'*10)
answer(62)

hint: np.nditer
##########
[[0.21037845 0.14174329 0.33271039]] [[0.3696795 ]
 [0.14794074]
 [0.4389596 ]]
0.6848321361837506
##########
A = np.arange(3).reshape(3,1)
B = np.arange(3).reshape(1,3)
it = np.nditer([A,B,None])
for x,y,z in it: z[...] = x + y
print(it.operands[2])
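
The iterator version can be verified against ordinary broadcasting. A minimal sketch that follows the same pattern as the answer; `result` is an illustrative name.

```python
import numpy as np

A = np.arange(3).reshape(3, 1)
B = np.arange(3).reshape(1, 3)

# Passing None as the third operand asks nditer to allocate the output,
# shaped like the broadcast of A and B.
it = np.nditer([A, B, None])
for x, y, out in it:
    out[...] = x + y
result = it.operands[2]

print(result)
print(np.array_equal(result, A + B))
```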


#### 63. Create an array class that has a name attribute (★★☆)

In [67]:
class arr:
    name: int

answer(63)

class arr2(np.ndarray):
    def __new__():
        pass


class NamedArray(np.ndarray):
    def __new__(cls, array, name="no name"):
        obj = np.asarray(array).view(cls)
        obj.name = name
        return obj
    def __array_finalize__(self, obj):
        if obj is None: return
        self.name = getattr(obj, 'name', "no name")

Z = NamedArray(np.arange(10), "range_10")
print (Z.name)


#### 64. Consider a given vector, how to add 1 to each element indexed by a second vector (be careful with repeated indices)? (★★★)

In [68]:
# I think the question asks about making two random int arrays, then
#   adding one to each element after reindexing the first array by the second.

a = np.ones(10)
b = np.random.randint(0,len(a),10)
c = np.random.randint(0,len(a),10)

# a.index_by(b)

answer(64)
# What the question actually asks: count how often each index occurs in the index array, then add that count to the element at that index (so a repeated index adds more than one).


# that is my solution: 
d = np.bincount(b, minlength=10)
print(d+1.0)
print('#'*10)

# another solution:
# By default np.bincount returns an array one longer than the largest value in its input.
# minlength forces a fixed output size regardless of the values in the input.
d = np.bincount(b, minlength=10)
print(d+a)
print('#'*10)

# another:
np.add.at(a, b, 1)
print(a)

# Author: Brett Olsen

Z = np.ones(10)
I = np.random.randint(0,len(Z),20)
Z += np.bincount(I, minlength=len(Z))
print(Z)

# Another solution
# Author: Bartosz Telenczuk
np.add.at(Z, I, 1)
print(Z)
[2. 2. 2. 1. 1. 2. 1. 2. 3. 4.]
##########
[2. 2. 2. 1. 1. 2. 1. 2. 3. 4.]
##########
[2. 2. 2. 1. 1. 2. 1. 2. 3. 4.]
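
The "be careful with repeated indices" warning refers to how fancy-indexed `+=` behaves. A sketch with a fixed index vector (illustrative values) comparing it with `np.add.at` and `np.bincount`.

```python
import numpy as np

idx = np.array([0, 0, 0, 2, 2, 5])

# Fancy-indexed += is buffered: each repeated index is incremented only once.
naive = np.ones(6)
naive[idx] += 1

# np.add.at is unbuffered, so every occurrence counts.
exact = np.ones(6)
np.add.at(exact, idx, 1)

# Same result from counting the occurrences of each index.
counted = np.ones(6) + np.bincount(idx, minlength=6)

print(naive)
print(exact)
print(counted)
```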


#### 65. How to accumulate elements of a vector (X) to an array (F) based on an index list (I)? (★★★)

In [69]:
x = np.random.randint(0,10,10)
f = np.ones(10)
l = np.random.randint(0,10,10)

print('x:', x)
print('f:', f)
print('l: ', l)

print('#'*10)

f[l] += x # wrong when (l) contains repeated indices: only the last matching element of (x) is added, not the sum of all of them.

print(f)
print('#'*10)

answer(65)
print('#'*10)

print(1.0 + np.bincount(l, weights=x, minlength=len(f))) # now that's correct! 1.0 stands for the original all-ones f; minlength keeps the result the same length as f.
# With weights=x, each occurrence of an index in (l) contributes the matching element of (x) instead of 1.
# So position k of the result holds the sum of every x[j] for which l[j] == k, not just the number of times k occurs in (l).
# Repeated indices in (l) therefore accumulate all of their (x) values at the correct position.

x: [0 8 1 4 0 6 6 9 2 4]
f: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
l:  [0 9 6 5 9 4 5 7 3 8]
##########
[ 1.  1.  1.  3.  7.  7.  2. 10.  5.  1.]
##########
# Author: Alan G Isaac

X = [1,2,3,4,5,6]
I = [1,3,9,3,4,1]
F = np.bincount(I,X)
print(F)
##########
[ 1.  1.  1.  3.  7. 11.  2. 10.  5.  9.]
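
When `F` already exists and is longer than `max(I) + 1`, `np.bincount` needs `minlength` to line up with it, while `np.add.at` accumulates in place. A sketch reusing the `X` and `I` values from the answer; the length 12 for `F` is an arbitrary choice for illustration.

```python
import numpy as np

X = np.array([1, 2, 3, 4, 5, 6], dtype=float)
I = np.array([1, 3, 9, 3, 4, 1])

F = np.zeros(12)                       # longer than max(I) + 1 on purpose
F += np.bincount(I, weights=X, minlength=len(F))

G = np.zeros(12)
np.add.at(G, I, X)                     # unbuffered accumulation, same totals

print(F)
print(np.array_equal(F, G))
```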


#### 66. Considering a (w,h,3) image of (dtype=ubyte), compute the number of unique colors (★★☆)

In [70]:
# ???
# First guess: 256 * 256 * 256 (every color a ubyte triple can represent).

# No: the question is about an actual image stored as a 3D array (width, height, 3 color channels) and asks how many distinct color triples occur in it.
image = np.random.randint(0,256, size=[100,10,3], dtype=np.ubyte) # the upper bound is exclusive, so 256 covers 0..255

# the first solution by deepseek:
#   reshape -> calculate unique colors/rows -> len
pixels = image.reshape(-1, 3) # Reshape to (w*h, 3)
unique_colors = np.unique(pixels, axis=0) # Find unique colors/rows
number_of_unique_colors = len(unique_colors)

print(f"The image has {number_of_unique_colors} unique colors.")

# an efficient solution by deepseek:
#   reshape -> convert to void -> unique -> len
pixels = image.reshape(-1, 3)
void_view = np.ascontiguousarray(pixels).view(np.dtype((np.void, pixels.dtype.itemsize * pixels.shape[1])))
unique_colors = np.unique(void_view)

print('from void: ', len(unique_colors))
print('#'*10)

answer(66)
print('#'*10)


w, h = 256, 256
# Each of the 3 channels takes only 4 possible values, so there are at most 4 * 4 * 4 = 64 colors, which makes the answer easy to validate.
I = np.random.randint(0,4,(h,w,3), dtype=np.uint8)

# View each pixel as a single 24-bit integer, rather than three 8-bit bytes
I24 = np.dot(I.astype(np.uint32),[1,256,65536])

# How the previous line works:
    # [1, 256, 65536] are powers of 256: 256⁰, 256¹, 256²
    # Each RGB triple [R, G, B] becomes: R×1 + G×256 + B×65536
    # This packs the 3 bytes into one 24-bit integer
    # np.dot does this for every pixel by summing the products over the last axis: a1*b1 + a2*b2 + ... + an*bn



# Count unique colours
n = len(np.unique(I24))
print(n)

The image has 1000 unique colors.
from void:  1000
##########
# Author: Fisher Wang

w, h = 256, 256
I = np.random.randint(0, 4, (h, w, 3)).astype(np.ubyte)
colors = np.unique(I.reshape(-1, 3), axis=0)
n = len(colors)
print(n)

# Faster version
# Author: Mark Setchell
# https://stackoverflow.com/a/59671950/2836621

w, h = 256, 256
I = np.random.randint(0,4,(h,w,3), dtype=np.uint8)

# View each pixel as a single 24-bit integer, rather than three 8-bit bytes
I24 = np.dot(I.astype(np.uint32),[1,256,65536])

# Count unique colours
n = len(np.unique(I24))
print(n)
##########
64
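
The row-based and packed-integer approaches should always agree on the count. A sketch that checks this on a small random image; the 64x64 size and the seed are arbitrary, and at most 4 * 4 * 4 = 64 colors can occur.

```python
import numpy as np

rng = np.random.default_rng(0)
img = rng.integers(0, 4, size=(64, 64, 3), dtype=np.uint8)

# Method 1: unique rows of the (w*h, 3) pixel table.
n_rows = len(np.unique(img.reshape(-1, 3), axis=0))

# Method 2: pack each pixel into one integer, R + 256*G + 65536*B.
weights = np.array([1, 256, 65536], dtype=np.uint32)
packed = img.astype(np.uint32) @ weights
n_packed = len(np.unique(packed))

print(n_rows, n_packed)
```

Casting to `uint32` before the product matters: in `uint8` the multiplication by 256 would overflow.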


#### 67. Considering a four dimensions array, how to get sum over the last two axis at once? (★★★)

In [71]:
arr = np.random.randint(0,10, [4,4,4,4])
print(arr)
print('#'*10)

print(np.sum((np.sum(arr, axis=3)), axis=2)) # that's correct !! :)
print('#'*10)

answer(67)
print('#'*10)


# solution by passing a tuple of axes (introduced in numpy 1.7.0)
total = arr.sum(axis=(-2,-1))  # named total to avoid shadowing the built-in sum()
print(total)
# solution by flattening the last two dimensions into one
# (useful for functions that don't accept tuples for axis argument)
total = arr.reshape(arr.shape[:-2] + (-1,)).sum(axis=-1)

# What the previous line does:
# 1. reshape accepts the new shape either as one tuple (dim1, dim2, ...) or as separate integer arguments dim1, dim2, ...
# 2. arr.shape[:-2] + (-1,) builds that tuple by concatenating two tuples.
# 3. arr.shape[:-2] is the shape of the array without its last two dimensions.
# 4. (-1,) is a one-element tuple; -1 tells NumPy to infer the size of that dimension, which here is the product of the two dimensions that were dropped.
# 5. .sum(axis=-1) sums over that merged last dimension, whose length is original dim3 * original dim4.
# .
print(total)

[[[[1 3 2 8]
   [9 7 9 8]
   [3 8 4 0]
   [1 6 2 1]]

  [[6 6 7 8]
   [4 9 9 5]
   [8 7 5 2]
   [9 3 9 3]]

  [[9 5 5 7]
   [8 6 7 3]
   [6 5 2 3]
   [5 9 5 8]]

  [[6 9 6 1]
   [0 5 5 0]
   [1 7 9 3]
   [2 4 1 6]]]


 [[[0 4 1 2]
   [4 7 8 8]
   [3 3 6 2]
   [3 3 6 9]]

  [[5 9 4 9]
   [9 4 8 4]
   [6 9 3 8]
   [6 0 7 7]]

  [[5 0 3 0]
   [1 5 9 4]
   [8 8 5 2]
   [6 6 3 7]]

  [[0 7 1 4]
   [0 4 3 3]
   [9 7 7 8]
   [6 8 1 3]]]


 [[[9 0 2 9]
   [9 0 2 9]
   [5 3 5 4]
   [6 0 6 8]]

  [[4 4 2 0]
   [2 6 4 3]
   [0 3 2 3]
   [9 1 3 9]]

  [[8 9 6 9]
   [9 6 7 7]
   [1 3 0 1]
   [7 7 6 9]]

  [[4 6 5 7]
   [8 5 0 1]
   [5 0 7 6]
   [7 0 5 2]]]


 [[[3 8 8 8]
   [9 3 9 7]
   [7 8 3 2]
   [8 7 2 3]]

  [[4 3 4 0]
   [3 1 8 3]
   [8 6 9 2]
   [0 9 4 5]]

  [[6 7 0 2]
   [6 1 4 9]
   [4 9 1 5]
   [6 5 4 3]]

  [[6 2 8 0]
   [6 8 9 0]
   [9 9 2 8]
   [4 7 9 3]]]]
##########
[[ 72 100  93  65]
 [ 69  98  72  71]
 [ 77  55  95  68]
 [ 95  69  72  90]]
##########
A = np.random.randint(0,10,(3,
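
The three ways of summing over the last two axes can be compared directly. A minimal sketch on a (3, 4, 5, 6) array; the shape and seed are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.integers(0, 10, size=(3, 4, 5, 6))

s_tuple = A.sum(axis=(-2, -1))                          # tuple of axes
s_flat = A.reshape(A.shape[:-2] + (-1,)).sum(axis=-1)   # merge last two axes, then sum
s_nested = A.sum(axis=3).sum(axis=2)                    # two successive sums

print(s_tuple.shape)
print(np.array_equal(s_tuple, s_flat), np.array_equal(s_tuple, s_nested))
```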

#### 68. Considering a one-dimensional vector D, how to compute means of subsets of D using a vector S of same size describing subset  indices? (★★★)

In [72]:
DVector = np.random.randint(0,10,10)
SSubset = np.random.randint(0,5,10)

# the solution is correct !
print(np.bincount(SSubset, DVector)/ np.bincount(SSubset)) # my own solution; most of the code that comes before each answer() call is mine as well
print('#'*10)

answer(68)

# import pandas as pd
# print(pd.Series(DVector).groupby(SSubset).mean())

[3.         5.33333333 8.         4.75       3.        ]
##########
# Author: Jaime Fernández del Río

D = np.random.uniform(0,1,100)
S = np.random.randint(0,10,100)
D_sums = np.bincount(S, weights=D)
D_counts = np.bincount(S)
D_means = D_sums / D_counts
print(D_means)

# Pandas solution as a reference due to more intuitive code
import pandas as pd
print(pd.Series(D).groupby(S).mean())
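
The sums-over-counts formula divides by zero when some subset index never occurs in `S`. A sketch of one way to handle that, assuming NaN is an acceptable mean for an empty subset; the data are hand-made so that subset 1 is empty.

```python
import numpy as np

D = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
S = np.array([0, 0, 2, 2, 2, 3])        # subset 1 is empty on purpose

sums = np.bincount(S, weights=D)
counts = np.bincount(S)

# Divide only where a subset has members; empty subsets keep the NaN fill.
means = np.full(sums.shape, np.nan)
np.divide(sums, counts, out=means, where=counts > 0)
print(means)
```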


#### 69. How to get the diagonal of a dot product? (★★★)

In [73]:
a = np.random.randint(0,10,[5,5])
b = np.random.randint(0,10,[5,5])
c = np.dot(a,b)

print(a,b,c, sep='\n') # does a dot product have a diagonal? -> yes: when both operands are 2D, the product is a matrix
print('#'*10)

answer(69)
print('#'*10)

print(np.diag(c)) # diag(A·B)[i] = Σⱼ A[i,j] * B[j,i]

[[2 4 9 5 1]
 [9 3 2 9 7]
 [6 2 5 5 4]
 [7 8 7 3 9]
 [6 5 4 4 7]]
[[8 6 4 4 7]
 [6 3 1 5 0]
 [8 1 3 5 7]
 [2 7 0 9 8]
 [4 2 7 3 6]]
[[126  70  46 121 123]
 [152 142  94 163 191]
 [126  90  69 116 141]
 [202 112 120 157 176]
 [146  97  90 126 144]]
##########
# Author: Mathieu Blondel

A = np.random.uniform(0,1,(5,5))
B = np.random.uniform(0,1,(5,5))

# Slow version
np.diag(np.dot(A, B))

# Fast version
np.sum(A * B.T, axis=1)

# Faster version
np.einsum("ij,ji->i", A, B)
##########
[126 142  69 157 144]
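
All three versions in the answer compute the same five numbers; the faster ones simply skip the off-diagonal entries. A short sketch that checks the equivalence on random input (seed chosen arbitrarily).

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.random((5, 5))
B = rng.random((5, 5))

d_full = np.diag(A @ B)                 # computes all 25 entries, keeps 5
d_sum = np.sum(A * B.T, axis=1)         # sum over j of A[i, j] * B[j, i]
d_einsum = np.einsum("ij,ji->i", A, B)  # same contraction written with indices

print(np.allclose(d_full, d_sum), np.allclose(d_full, d_einsum))
```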


#### 70. Consider the vector [1, 2, 3, 4, 5], how to build a new vector with 3 consecutive zeros interleaved between each value? (★★★)

In [74]:
arr = [1, 2, 3, 4, 5]
zeros = 3
# arr2= np.array(arr[0])
# for i in arr[1:]:
#     arr2 = np.concatenate([arr2, np.array([0, 0, 0, i])])
# print(arr2)

answer(70)
print('#'*10)

new_arr = np.zeros(len(arr) + (len(arr)-1)*zeros) # new size = the original elements plus 3 zeros after every element except the last.
new_arr[::zeros+1] = arr # every (zeros+1)-th slot of new_arr receives one element of (arr); the slots in between stay zero.
print(new_arr)


# Author: Warren Weckesser

Z = np.array([1,2,3,4,5])
nz = 3
Z0 = np.zeros(len(Z) + (len(Z)-1)*(nz))
Z0[::nz+1] = Z
print(Z0)
##########
[1. 0. 0. 0. 2. 0. 0. 0. 3. 0. 0. 0. 4. 0. 0. 0. 5.]


#### 71. Consider an array of dimension (5,5,3), how to multiply it by an array with dimensions (5,5)? (★★★)

In [75]:
# by making the array with dimensions (5,5) -> (5,5,1)
a = np.ones((5,5,3))
b = 2*np.ones((5,5))
c = b.reshape(5,5,1)
print(a,b,a*c, sep='\n')
print('#'*10)

answer(71)
print('#'*10)

print(a * b[:,:,None])

# ========================
# RESHAPE vs NEWAXIS (None)
# ========================

# RESHAPE USE CASES:
# - Complete dimension reorganization: arr.reshape(2,3,4)
# - When total elements must be preserved: 24 elements -> (2,3,4)
# - Using -1 for auto-calculation: arr.reshape(5,-1,2)
# - Flattening arrays: arr.reshape(-1)
# - Converting between different dimensional layouts

# NEWAXIS (None) USE CASES:
# - Simply adding single dimensions: arr[:,None] or arr[None,:]
# - Preparing for broadcasting: A * B[:,None]
# - Converting 1D to column/row vectors: 
#   col_vector = arr[:,None]  # (n,) -> (n,1)
#   row_vector = arr[None,:]  # (n,) -> (1,n)
# - Quick dimension addition without full reshape

# KEY DIFFERENCES:
# reshape() - Manual control, must specify all dimensions
# newaxis   - Automatic, preserves existing dimensions
# reshape() - Can fail if element count doesn't match
# newaxis   - Always works for dimension addition
# reshape() - More explicit but verbose
# newaxis   - Cleaner syntax for simple dimension adding

# PERFORMANCE:
# Both typically return views (memory efficient)
# No significant performance difference for most cases

# CODE EXAMPLES:
# arr.reshape(5,5,1)    # Explicit full reshape
# arr[:,:,None]         # Clean dimension addition
# arr.reshape(-1,1)     # Auto-calc with reshape
# arr[:,None]           # Same result with newaxis

[[[1. 1. 1.]
  [1. 1. 1.]
  [1. 1. 1.]
  [1. 1. 1.]
  [1. 1. 1.]]

 [[1. 1. 1.]
  [1. 1. 1.]
  [1. 1. 1.]
  [1. 1. 1.]
  [1. 1. 1.]]

 [[1. 1. 1.]
  [1. 1. 1.]
  [1. 1. 1.]
  [1. 1. 1.]
  [1. 1. 1.]]

 [[1. 1. 1.]
  [1. 1. 1.]
  [1. 1. 1.]
  [1. 1. 1.]
  [1. 1. 1.]]

 [[1. 1. 1.]
  [1. 1. 1.]
  [1. 1. 1.]
  [1. 1. 1.]
  [1. 1. 1.]]]
[[2. 2. 2. 2. 2.]
 [2. 2. 2. 2. 2.]
 [2. 2. 2. 2. 2.]
 [2. 2. 2. 2. 2.]
 [2. 2. 2. 2. 2.]]
[[[2. 2. 2.]
  [2. 2. 2.]
  [2. 2. 2.]
  [2. 2. 2.]
  [2. 2. 2.]]

 [[2. 2. 2.]
  [2. 2. 2.]
  [2. 2. 2.]
  [2. 2. 2.]
  [2. 2. 2.]]

 [[2. 2. 2.]
  [2. 2. 2.]
  [2. 2. 2.]
  [2. 2. 2.]
  [2. 2. 2.]]

 [[2. 2. 2.]
  [2. 2. 2.]
  [2. 2. 2.]
  [2. 2. 2.]
  [2. 2. 2.]]

 [[2. 2. 2.]
  [2. 2. 2.]
  [2. 2. 2.]
  [2. 2. 2.]
  [2. 2. 2.]]]
##########
A = np.ones((5,5,3))
B = 2*np.ones((5,5))
print(A * B[:,:,None])
##########
[[[2. 2. 2.]
  [2. 2. 2.]
  [2. 2. 2.]
  [2. 2. 2.]
  [2. 2. 2.]]

 [[2. 2. 2.]
  [2. 2. 2.]
  [2. 2. 2.]
  [2. 2. 2.]
  [2. 2. 2.]]

 [[2. 2. 2.]
  [2.
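
The reshape/newaxis notes in this cell can be checked directly. The following is a minimal sketch that reuses the shapes from this exercise; the variable names (`B_reshape`, `B_newaxis`, `B_ellipsis`) are illustrative and not part of the original answer.

```python
import numpy as np

A = np.ones((5, 5, 3))
B = 2 * np.ones((5, 5))

# Three ways to add a trailing axis of length 1 to B
B_reshape = B.reshape(5, 5, 1)
B_newaxis = B[:, :, np.newaxis]   # np.newaxis is an alias for None
B_ellipsis = B[..., None]

print(B_reshape.shape, B_newaxis.shape, B_ellipsis.shape)

# For a contiguous array such as B, all of these are views (no data copied)
print(np.shares_memory(B, B_reshape), np.shares_memory(B, B_newaxis))

# Broadcasting (5,5,3) * (5,5,1) gives the same result either way
print(np.array_equal(A * B_reshape, A * B_newaxis))

# Without the extra axis, shapes (5,5,3) and (5,5) do not broadcast
try:
    A * B
except ValueError as err:
    print("ValueError:", err)
```

Broadcasting aligns shapes from the last axis backwards, which is why the length-1 axis has to be added at the end here.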

#### 72. How to swap two rows of an array? (★★★)

In [76]:
# using a temp var ?
answer(72)

a = np.arange(25).reshape((5,5))

a[[0,1]] = a[[1,0]] # fancy indexing copies the right-hand side first, so no temp variable is needed
print(a)

# Author: Eelco Hoogendoorn

A = np.arange(25).reshape(5,5)
A[[0,1]] = A[[1,0]]
print(A)
[[ 5  6  7  8  9]
 [ 0  1  2  3  4]
 [10 11 12 13 14]
 [15 16 17 18 19]
 [20 21 22 23 24]]
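
A note on the "temp var" question: the plain-Python tuple swap does not carry over to array rows, because `A[0]` and `A[1]` are views into the same buffer. The sketch below contrasts the two approaches; the second array `B` is only there for comparison.

```python
import numpy as np

A = np.arange(25).reshape(5, 5)

# Tuple swap: A[1] and A[0] on the right-hand side are views, not copies.
# Row 0 is overwritten first, so row 1 is then assigned its own data back.
A[0], A[1] = A[1], A[0]
print(A[:2])   # both rows now hold the original row 1

B = np.arange(25).reshape(5, 5)

# Fancy indexing on the right-hand side makes a copy before the assignment.
B[[0, 1]] = B[[1, 0]]
print(B[:2])   # rows 0 and 1 are swapped
```

If a tuple swap is preferred, copying one side explicitly (for example `A[1].copy()`) avoids the aliasing.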


#### 73. Consider a set of 10 triplets describing 10 triangles (with shared vertices), find the set of unique line segments composing all the triangles (★★★)

In [None]:
# there is a solution using: reshape, meshgrid, sort, unique ... all together ... but I won't write it.

answer(73)

faces = np.random.randint(0,100,(10,3)) # each row holds the three vertex indices of one triangle

F = np.roll(faces.repeat(2,axis=1),-1,axis=1) # build the three edges of each triangle as consecutive vertex pairs
# faces.repeat(2,axis=1) doubles each vertex: [a,b,c] → [a,a,b,b,c,c]
# np.roll(...,-1,axis=1) shifts left by 1: [a,a,b,b,c,c] → [a,b,b,c,c,a]
# Result: consecutive pairs are the edges: [(a,b), (b,c), (c,a)]

F = F.reshape(len(F)*3,2) # (10,6) to (30,2)
F = np.sort(F,axis=1) # Ensures [3,1] and [1,3] both become [1,3]

G = F.view( dtype=[('p0',F.dtype),('p1',F.dtype)] )
# view() reinterprets each [p0,p1] row as a single structured element with two fields,
# so np.unique() compares whole edges instead of individual vertex indices.
# Called on the plain 2D array without axis=0, np.unique() would flatten it
# and return the unique vertex indices rather than the unique edges.

G = np.unique(G) # without the structured view this would be: np.unique(F, axis=0)
print(G)

# Author: Nicolas P. Rougier

faces = np.random.randint(0,100,(10,3))
F = np.roll(faces.repeat(2,axis=1),-1,axis=1)
F = F.reshape(len(F)*3,2)
F = np.sort(F,axis=1)
G = F.view( dtype=[('p0',F.dtype),('p1',F.dtype)] )
G = np.unique(G)
print(G)
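
As noted in the comments above, `np.unique` with `axis=0` can stand in for the structured-view step. The sketch below uses a small fixed `faces` array (two triangles sharing one edge, chosen here for illustration rather than taken from the original answer) so the result is easy to check by hand.

```python
import numpy as np

# Two triangles sharing the edge (1, 2); fixed data instead of random vertices
faces = np.array([[0, 1, 2],
                  [2, 1, 3]])

# Same edge construction as in the answer above
F = np.roll(faces.repeat(2, axis=1), -1, axis=1)
F = F.reshape(-1, 2)        # one row per edge
F = np.sort(F, axis=1)      # [2,1] and [1,2] both become [1,2]

# Unique rows directly, without the structured-dtype view
edges = np.unique(F, axis=0)
print(edges)
```

The six edges of the two triangles reduce to five, since the shared edge (1, 2) is counted once.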


#### 74. Given a sorted array C that corresponds to a bincount, how to produce an array A such that np.bincount(A) == C? (★★★)

#### 75. How to compute averages using a sliding window over an array? (★★★)

#### 76. Consider a one-dimensional array Z, build a two-dimensional array whose first row is (Z[0],Z[1],Z[2]) and each subsequent row is shifted by 1 (last row should be (Z[-3],Z[-2],Z[-1])) (★★★)

#### 77. How to negate a boolean, or to change the sign of a float inplace? (★★★)

#### 78. Consider 2 sets of points P0,P1 describing lines (2d) and a point p, how to compute distance from p to each line i (P0[i],P1[i])? (★★★)

#### 79. Consider 2 sets of points P0,P1 describing lines (2d) and a set of points P, how to compute distance from each point j (P[j]) to each line i (P0[i],P1[i])? (★★★)

#### 80. Consider an arbitrary array, write a function that extracts a subpart with a fixed shape and centered on a given element (pad with a `fill` value when necessary) (★★★)

#### 81. Consider an array Z = [1,2,3,4,5,6,7,8,9,10,11,12,13,14], how to generate an array R = [[1,2,3,4], [2,3,4,5], [3,4,5,6], ..., [11,12,13,14]]? (★★★)

#### 82. Compute a matrix rank (★★★)

#### 83. How to find the most frequent value in an array? (★★★)

#### 84. Extract all the contiguous 3x3 blocks from a random 10x10 matrix (★★★)

#### 85. Create a 2D array subclass such that Z[i,j] == Z[j,i] (★★★)

#### 86. Consider a set of p matrices with shape (n,n) and a set of p vectors with shape (n,1). How to compute the sum of the p matrix products at once? (result has shape (n,1)) (★★★)

#### 87. Consider a 16x16 array, how to get the block-sum (block size is 4x4)? (★★★)

#### 88. How to implement the Game of Life using numpy arrays? (★★★)

#### 89. How to get the n largest values of an array (★★★)

#### 90. Given an arbitrary number of vectors, build the cartesian product (every combination of every item) (★★★)

#### 91. How to create a record array from a regular array? (★★★)

#### 92. Consider a large vector Z, compute Z to the power of 3 using 3 different methods (★★★)

#### 93. Consider two arrays A and B of shape (8,3) and (2,2). How to find rows of A that contain elements of each row of B regardless of the order of the elements in B? (★★★)

#### 94. Considering a 10x3 matrix, extract rows with unequal values (e.g. [2,2,3]) (★★★)

#### 95. Convert a vector of ints into a matrix binary representation (★★★)

#### 96. Given a two dimensional array, how to extract unique rows? (★★★)

#### 97. Considering 2 vectors A & B, write the einsum equivalent of inner, outer, sum, and mul function (★★★)

#### 98. Considering a path described by two vectors (X,Y), how to sample it using equidistant samples? (★★★)

#### 99. Given an integer n and a 2D array X, select from X the rows which can be interpreted as draws from a multinomial distribution with n degrees, i.e., the rows which only contain integers and which sum to n. (★★★)

#### 100. Compute bootstrapped 95% confidence intervals for the mean of a 1D array X (i.e., resample the elements of an array with replacement N times, compute the mean of each sample, and then compute percentiles over the means). (★★★)