# 100 numpy exercises

This is a collection of exercises gathered from the NumPy mailing list, Stack Overflow,
and the NumPy documentation. The goal is to offer a quick reference for both old
and new users, and to provide a set of exercises for those who teach.


If you find an error or think you have a better way to solve some of them, feel
free to open an issue at <https://github.com/rougier/numpy-100>.

File automatically generated. See the documentation to update questions/answers/hints programmatically.

Run the `initialise.py` module, then call a random question with `pick()`, a hint towards its solution with
`hint(n)`, and the answer with `answer(n)`, where n is the number of the picked question.

In [1]:
%run initialise.py

In [2]:
import numpy as np

In [57]:
pick()

83. How to find the most frequent value in an array?


#### 38. Consider a generator function that generates 10 integers and use it to build an array (★☆☆)

In [4]:
# Problem 38, first attempt: this builds a plain list, not a generator
def generate() :
    arr = []
    for i in range(10) :
        arr.append(np.random.randint(1500))
    return arr

x = generate()
print(x)

[617, 689, 1223, 904, 7, 538, 946, 1253, 1490, 396]


In [5]:
def generate_answer() :
    for x in range(10):
        yield x 

print(np.fromiter(generate_answer(), float, count=10))


[0. 1. 2. 3. 4. 5. 6. 7. 8. 9.]


##### answer

In [6]:
answer(38)

def generate():
    for x in range(10):
        yield x
Z = np.fromiter(generate(),dtype=float,count=-1)
print(Z)


In [7]:
def generate():
    for x in range(10):
        yield x  # must yield, not return: fromiter consumes the values one at a time by iterating
Z = np.fromiter(generate(),dtype=float,count=-1)
print(Z)
# np.fromiter takes an iterable and a dtype, so the iterable can come from a generator function!
# count is the number of items to read (-1 reads them all). Could this build a multiplication table over (i < j), or binomial coefficients?

[0. 1. 2. 3. 4. 5. 6. 7. 8. 9.]
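
The note about `count` can be made concrete with a small sketch. It assumes a hypothetical generator `upper_products(n)` (not part of the original exercise) that yields the products `i * j` for every pair with `i < j`, and passes the known number of items to `np.fromiter` so the output array can be allocated up front.

```python
import numpy as np

def upper_products(n):
    """Hypothetical helper: yield i * j for every pair 1 <= i < j <= n."""
    for i in range(1, n + 1):
        for j in range(i + 1, n + 1):
            yield i * j

n = 4
n_pairs = n * (n - 1) // 2  # number of pairs with i < j
products = np.fromiter(upper_products(n), dtype=int, count=n_pairs)
print(products)
```

With `count=-1` the result is the same, but NumPy has to grow the array as items arrive.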


##### Which is the fastest code?

In [8]:
%timeit -n 10000 np.fromiter(generate(), dtype = int, count = -1)
%timeit -n 10000 generate_answer
%timeit -n 10000 np.fromiter((i for i in range(10)), dtype  = int, count = -1)

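# Careful: the second line only times a lookup of the name generate_answer.
# The function is never called, so its 13.2 ns is not comparable to the fromiter timings.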

1.94 µs ± 633 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
13.2 ns ± 0.072 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
1.53 µs ± 24.9 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
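
For a like-for-like comparison, every candidate has to be called. The sketch below uses the standard-library `timeit` module rather than the `%timeit` magic so that it also runs outside IPython. The names `gen`, `via_fromiter` and `via_list` are illustrative, and no timings are quoted because they depend on the machine.

```python
import timeit
import numpy as np

def gen():
    """Generator yielding 0..9, as in the answer above."""
    for x in range(10):
        yield x

def via_fromiter():
    # Feed the generator straight into NumPy.
    return np.fromiter(gen(), dtype=int, count=10)

def via_list():
    # Materialise a Python list first, then convert it.
    return np.array(list(gen()), dtype=int)

# Both callables are actually invoked, so the timings are comparable.
t_fromiter = timeit.timeit(via_fromiter, number=10_000)
t_list = timeit.timeit(via_list, number=10_000)
print(f"fromiter: {t_fromiter:.4f} s, list: {t_list:.4f} s")
```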


#### 21. Create a checkerboard 8x8 matrix using the tile function (★☆☆)

In [9]:
x = np.tile([[0, 1],[1, 0]], (4, 4))
print(x)
type(np.array([[0,1],[1,0]]))

[[0 1 0 1 0 1 0 1]
 [1 0 1 0 1 0 1 0]
 [0 1 0 1 0 1 0 1]
 [1 0 1 0 1 0 1 0]
 [0 1 0 1 0 1 0 1]
 [1 0 1 0 1 0 1 0]
 [0 1 0 1 0 1 0 1]
 [1 0 1 0 1 0 1 0]]


numpy.ndarray

In [10]:
answer(21)

Z = np.tile( np.array([[0,1],[1,0]]), (4,4))
print(Z)


In [11]:
Z = np.tile( np.array([[0,1],[1,0]]), (4,4))
print(Z)
type(Z)

[[0 1 0 1 0 1 0 1]
 [1 0 1 0 1 0 1 0]
 [0 1 0 1 0 1 0 1]
 [1 0 1 0 1 0 1 0]
 [0 1 0 1 0 1 0 1]
 [1 0 1 0 1 0 1 0]
 [0 1 0 1 0 1 0 1]
 [1 0 1 0 1 0 1 0]]


numpy.ndarray

##### Is it faster to construct from an ndarray than from a list?

In [12]:
%timeit -n 10000 np.tile( np.array([[0,1],[1,0]]), (4,4))
%timeit -n 10000 np.tile([[0, 1],[1, 0]], (4, 4))
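# The two are similar!
# The list version (second line) is slightly faster here, though the gap is smaller than the spread of the first measurement.
# Passing an ndarray still feels like the more stable input than a list.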

4.91 µs ± 1.06 µs per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
4.49 µs ± 91.2 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
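
For comparison, the same board can be built without `np.tile` by assigning to strided slices. This is only a sketch of an alternative, not the exercise's intended answer, which asks for `tile`.

```python
import numpy as np

board = np.zeros((8, 8), dtype=int)
board[::2, 1::2] = 1   # even rows, odd columns
board[1::2, ::2] = 1   # odd rows, even columns

tiled = np.tile([[0, 1], [1, 0]], (4, 4))
print(np.array_equal(board, tiled))
```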


#### 45. Create random vector of size 10 and replace the maximum value by 0 (★★☆)

In [13]:
arr = np.random.randint(1500, size = 10)
arr

array([1162,  147, 1137,  438,  439,  723, 1480,  460, 1065,  454])

In [14]:
arr[np.argmax(arr)] = 0
arr

array([1162,  147, 1137,  438,  439,  723,    0,  460, 1065,  454])

In [15]:
answer(45)

Z = np.random.random(10)
Z[Z.argmax()] = 0
print(Z)


In [16]:
def answers() :
    Z = np.random.random(10)
    Z[Z.argmax()] = 0
    return Z

def my_answer() :
    arr = np.random.randint(1500, size = 10)
    arr[np.argmax(arr)] = 0
    return arr

%timeit -n 10000 answers()
%timeit -n 10000 my_answer()

1.68 µs ± 541 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
8.71 µs ± 31.4 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
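
One detail worth keeping in mind with both versions: `argmax` returns only the first position of the maximum. With integer data from `randint`, ties are possible, so a boolean mask is one way to zero every maximum. A minimal sketch with a hand-picked array:

```python
import numpy as np

Z = np.array([3, 9, 1, 9, 4])

first_only = Z.copy()
first_only[first_only.argmax()] = 0              # argmax returns the first maximum only

all_maxima = Z.copy()
all_maxima[all_maxima == all_maxima.max()] = 0   # the mask selects every maximum

print(first_only)
print(all_maxima)
```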


#### 52. Consider a random vector with shape (100,2) representing coordinates, find point by point distances (★★☆)

In [17]:
answer(52)

Z = np.random.random((10,2))
X,Y = np.atleast_2d(Z[:,0], Z[:,1])
D = np.sqrt( (X-X.T)**2 + (Y-Y.T)**2)
print(D)

# Much faster with scipy
import scipy
# Thanks Gavin Heverly-Coulson (#issue 1)
import scipy.spatial

Z = np.random.random((10,2))
D = scipy.spatial.distance.cdist(Z,Z)
print(D)


In [18]:
Z = np.random.random((10,2))
X, Y = Z[: , 0].reshape(1, -1), Z[:, 1].reshape(1, -1)
# X,Y = np.atleast_2d(Z[:,0], Z[:,1])
D = np.sqrt( (X-X.T)**2 + (Y-Y.T)**2)
print(D)

# Much faster with scipy
import scipy
# Thanks Gavin Heverly-Coulson (#issue 1)
import scipy.spatial

Z = np.random.random((10,2))
D = scipy.spatial.distance.cdist(Z, Z)
print(D)

[[0.         0.09065707 0.11629565 0.46034087 0.59045405 0.87107897
  0.90830291 0.47134055 0.84136273 0.67215028]
 [0.09065707 0.         0.12702559 0.39067888 0.57572799 0.79765869
  0.82536639 0.38919193 0.77422163 0.66450651]
 [0.11629565 0.12702559 0.         0.5129491  0.69434048 0.91503002
  0.93008534 0.41010737 0.89616649 0.77973887]
 [0.46034087 0.39067888 0.5129491  0.         0.38165117 0.41127335
  0.4726627  0.45609241 0.38357949 0.47770226]
 [0.59045405 0.57572799 0.69434048 0.38165117 0.         0.58718035
  0.73335718 0.8102564  0.50453472 0.09858104]
 [0.87107897 0.79765869 0.91503002 0.41127335 0.58718035 0.
  0.1819923  0.7281535  0.0939839  0.64381358]
 [0.90830291 0.82536639 0.93008534 0.4726627  0.73335718 0.1819923
  0.         0.66632351 0.27374291 0.80112503]
 [0.47134055 0.38919193 0.41010737 0.45609241 0.8102564  0.7281535
  0.66632351 0.         0.75233826 0.90880572]
 [0.84136273 0.77422163 0.89616649 0.38357949 0.50453472 0.0939839
  0.27374291 0.75233826
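
The same distance matrix can also be written with explicit broadcasting over a new axis, which generalises to more than two coordinates. This is a sketch under the assumption that the full `(n, n, 2)` intermediate fits in memory; the seeded generator is only there to make the example repeatable.

```python
import numpy as np

rng = np.random.default_rng(0)
Z = rng.random((10, 2))

# diff[i, j] is the coordinate difference between point i and point j, shape (10, 10, 2).
diff = Z[:, np.newaxis, :] - Z[np.newaxis, :, :]
D = np.sqrt((diff ** 2).sum(axis=-1))

print(D.shape)
```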

#### 23. Create a custom dtype that describes a color as four unsigned bytes (RGBA) (★☆☆)

In [19]:
answer(23)

color = np.dtype([("r", np.ubyte),
                  ("g", np.ubyte),
                  ("b", np.ubyte),
                  ("a", np.ubyte)])


In [20]:
color = np.dtype([("r", np.ubyte),
                  ("g", np.ubyte),
                  ("b", np.ubyte),
                  ("a", np.ubyte)])

In [21]:
a = np.array([(1, 1, 0, 1), 2], dtype= color)
# Scalars are broadcast to every field automatically, e.g. np.array([1], dtype=color) gives [(1, 1, 1, 1)]
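
A short sketch of how the `color` dtype can be used once defined: creating records from tuples, reading one field across all records, and assigning a scalar to every field. The array name `pixels` and its values are made up for illustration.

```python
import numpy as np

color = np.dtype([("r", np.ubyte),
                  ("g", np.ubyte),
                  ("b", np.ubyte),
                  ("a", np.ubyte)])

# One tuple per element, one value per field.
pixels = np.array([(255, 0, 0, 255), (0, 128, 255, 64)], dtype=color)
print(pixels["r"])      # red channel of every pixel
print(color.itemsize)   # bytes per pixel

# Assigning a scalar to a structured array fills every field with it.
pixels[:] = 7
print(pixels)
```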

#### 65. How to accumulate elements of a vector (X) to an array (F) based on an index list (I)? (★★★)

In [22]:
# The easiest way => brute force with an explicit loop
def my_answer_65(index_list, x) :
    ans = [0 for i in range(np.amax(index_list)+1)]
    for (i, val) in zip(index_list, x) :
        ans[i]+=val
    return ans

##### But the loop above is too slow

In [23]:
answer(65)

# Author: Alan G Isaac

X = [1,2,3,4,5,6]
I = [1,3,9,3,4,1]
F = np.bincount(I,X)
print(F)


In [24]:
index_list = np.random.randint(30, size = 30)
x = np.random.randint(100, size = index_list.size)
# I[0] = 1 => F[1] is incremented, and since the weight X[0] is 1 it goes up by 1
# Likewise I[5] = 1 with weight X[5] = 6, so F[1] goes up by 6 more => 7

##### Which code is faster?

In [25]:
%timeit -n 10000 my_answer_65(index_list, x)
%timeit -n 10000 np.bincount(index_list, x)
# As expected, the built-in function wins

9.29 µs ± 1.15 µs per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
1.06 µs ± 8.32 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
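
The accumulation that `np.bincount` performs with weights can be cross-checked against `np.add.at`, which adds in place without buffering so that repeated indices accumulate. The sketch reuses `X` and `I` from the reference answer; the names `F_bincount` and `F_add_at` are illustrative.

```python
import numpy as np

X = np.array([1, 2, 3, 4, 5, 6], dtype=float)
I = np.array([1, 3, 9, 3, 4, 1])

F_bincount = np.bincount(I, weights=X)

# Unbuffered in-place addition: index 1 receives 1 + 6, index 3 receives 2 + 4.
F_add_at = np.zeros(I.max() + 1)
np.add.at(F_add_at, I, X)

print(F_bincount)
print(F_add_at)
```

A plain `F[I] += X` would not work here, because with repeated indices only the last write per index survives.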


#### 11. Create a 3x3 identity matrix (★☆☆)

In [26]:
mat = np.identity(3, dtype = int)
mat
# As expected, the built-in function wins...
# The eye function is more flexible, but what about its speed?

array([[1, 0, 0],
       [0, 1, 0],
       [0, 0, 1]])

In [27]:
%timeit -n 100000 np.identity(3, dtype = int)
%timeit -n 100000 np.eye(3, dtype = int)
# eye is even faster; is that a bug? (Probably not: np.identity is implemented as a call to np.eye.)

1.79 µs ± 24 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
1.28 µs ± 3.72 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
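
What makes `np.eye` more flexible than `np.identity` is that it accepts a column count and a diagonal offset `k`. A brief sketch; the variable names are illustrative.

```python
import numpy as np

square = np.eye(3, dtype=int)         # same result as np.identity(3, dtype=int)
wide = np.eye(2, 3, dtype=int)        # non-square: 2 rows, 3 columns
shifted = np.eye(3, k=1, dtype=int)   # ones on the first superdiagonal

print(np.array_equal(square, np.identity(3, dtype=int)))
print(wide)
print(shifted)
```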


#### 8. Reverse a vector (first element becomes last) (★☆☆)

In [28]:
v = np.random.random(size = 10)
print(v)
v = v[::-1]
print(v)

[0.98971831 0.42060858 0.78186692 0.94949196 0.89435194 0.16976713
 0.66274529 0.87808361 0.20819172 0.26951913]
[0.26951913 0.20819172 0.87808361 0.66274529 0.16976713 0.89435194
 0.94949196 0.78186692 0.42060858 0.98971831]


In [29]:
answer(8)

Z = np.arange(50)
Z = Z[::-1]
print(Z)
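
Reversing with `[::-1]` returns a view, not a copy, so writes through the reversed array change the original. The following sketch demonstrates this and shows that `np.flip` gives the same result for a 1-D array; `R` is an illustrative name.

```python
import numpy as np

Z = np.arange(5)
R = Z[::-1]                    # a view with a negative stride, no data copied

print(np.shares_memory(Z, R))
R[0] = 99                      # writes through to the last element of Z
print(Z)

print(np.array_equal(np.flip(Z), Z[::-1]))
```

Call `.copy()` on the reversed array when an independent result is needed.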


#### 30. How to find common values between two arrays? (★☆☆)

In [30]:
a = np.random.randint(10, size = 8)
b = np.random.randint(10, size = 8)
ans = np.intersect1d(a, b)
print(a, b, ans, sep='\n')

[5 5 7 0 0 7 2 2]
[9 5 6 4 8 4 3 3]
[5]


In [31]:
answer(30)

Z1 = np.random.randint(0,10,10)
Z2 = np.random.randint(0,10,10)
print(np.intersect1d(Z1,Z2))
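
If the positions of the common values are needed as well, `np.intersect1d` accepts `return_indices=True`. The sketch below reuses the two arrays printed in the cell above rather than drawing new random ones.

```python
import numpy as np

a = np.array([5, 5, 7, 0, 0, 7, 2, 2])
b = np.array([9, 5, 6, 4, 8, 4, 3, 3])

common, idx_a, idx_b = np.intersect1d(a, b, return_indices=True)
print(common)   # sorted, unique common values
print(idx_a)    # index of the first occurrence of each common value in a
print(idx_b)    # index of the first occurrence of each common value in b
```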


#### 16. How to add a border (filled with 0's) around an existing array? (★☆☆)

In [32]:
x = np.ones((5, 5))
x = np.pad(x, pad_width = 1, mode = 'constant', constant_values=0) # same idea as zero padding in a CNN
x

array([[0., 0., 0., 0., 0., 0., 0.],
       [0., 1., 1., 1., 1., 1., 0.],
       [0., 1., 1., 1., 1., 1., 0.],
       [0., 1., 1., 1., 1., 1., 0.],
       [0., 1., 1., 1., 1., 1., 0.],
       [0., 1., 1., 1., 1., 1., 0.],
       [0., 0., 0., 0., 0., 0., 0.]])
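
The same border can be produced without `np.pad` by allocating a larger array of zeros and copying the original into its interior. A minimal sketch; `bordered` is an illustrative name.

```python
import numpy as np

x = np.ones((5, 5))

# Two extra rows and columns make room for a border of width 1.
bordered = np.zeros((x.shape[0] + 2, x.shape[1] + 2), dtype=x.dtype)
bordered[1:-1, 1:-1] = x   # copy the original into the interior

padded = np.pad(x, pad_width=1, mode="constant", constant_values=0)
print(np.array_equal(bordered, padded))
```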

#### 3. Create a null vector of size 10 (★☆☆)

In [33]:
np.zeros(shape = (10))

array([0., 0., 0., 0., 0., 0., 0., 0., 0., 0.])

#### 92. Consider a large vector Z, compute Z to the power of 3 using 3 different methods (★★★)

In [47]:
x = np.array([1, 2, 3])
# np.einsum()
np.einsum('i,i,i->i', x, x, x)

array([ 1,  8, 27])

In [48]:
np.power(x, 3)

array([ 1,  8, 27])

In [49]:
x*x*x

array([ 1,  8, 27])

In [50]:
%timeit -n 15000 np.einsum('i, i, i -> i', x, x, x)
%timeit -n 15000 np.power(x, 3)
%timeit -n 15000 x*x*x

2.44 µs ± 740 ns per loop (mean ± std. dev. of 7 runs, 15,000 loops each)
1.13 µs ± 15.2 ns per loop (mean ± std. dev. of 7 runs, 15,000 loops each)
925 ns ± 4.96 ns per loop (mean ± std. dev. of 7 runs, 15,000 loops each)
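
The exercise asks about a large vector, while the cells above use three elements, where call overhead dominates the timings. The sketch below checks that the three methods agree on a million random floats. The size and the seeded generator are assumptions for illustration, and no timings are quoted because they depend on the machine.

```python
import numpy as np

rng = np.random.default_rng(0)
Z = rng.random(1_000_000)

cubed_power = np.power(Z, 3)
cubed_mul = Z * Z * Z
cubed_einsum = np.einsum("i,i,i->i", Z, Z, Z)

print(np.allclose(cubed_power, cubed_mul))
print(np.allclose(cubed_mul, cubed_einsum))
```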


#### 6. Create a null vector of size 10 but the fifth value which is 1 (★☆☆)

In [56]:
x = np.zeros(10)
x[4] = 1
x

array([0., 0., 0., 0., 1., 0., 0., 0., 0., 0.])

In [55]:
answer(6)

Z = np.zeros(10)
Z[4] = 1
print(Z)


#### 83. How to find the most frequent value in an array?

In [61]:
x = np.random.randint(10, size = 1500)

In [62]:
x

array([6, 7, 0, ..., 6, 2, 2])

In [65]:
np.argmax(np.bincount(x))

9

##### What is the reference answer?

In [66]:
answer(83)

Z = np.random.randint(0,10,50)
print(np.bincount(Z).argmax())
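
`np.bincount` only accepts non-negative integers. For floats or negative values, one alternative is `np.unique` with `return_counts=True`. This sketch uses made-up data and is not the exercise's reference answer.

```python
import numpy as np

values = np.array([-2.5, 1.0, -2.5, 3.0, 1.0, -2.5])

uniq, counts = np.unique(values, return_counts=True)
most_frequent = uniq[counts.argmax()]   # on ties, the smallest value wins
print(most_frequent)
```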
