## Introduction to NumPy

This notebook is based on [QuantEcon](https://lectures.quantecon.org/).

In [1]:
import numpy as np

In [2]:
# Draw 1,000,000 samples from the uniform distribution on [0, 1)
x = np.random.uniform(0, 1, size=1000000)  # 0 is included, 1 is excluded
x.mean()

0.49944826374395634

In [3]:
a = np.zeros(3)
a

array([0., 0., 0.])

In [4]:
type(a)

numpy.ndarray

- All elements of a NumPy array must have the same data type (the array is homogeneous).
- That type must be one of the data types (dtypes) provided by NumPy.

The default data type is float64.

In [5]:
type(a[0])

numpy.float64

In [6]:
# Specify the data type explicitly (the integer width of int is platform-dependent)
a = np.zeros(3, dtype=int)
type(a[0])

numpy.int32
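As a small illustration of homogeneity, the sketch below shows how NumPy settles on a single dtype when the inputs are mixed, and how an explicit `dtype` or `astype` converts values. The variable names (`mixed`, `ints`, `as_float`) are illustrative and not part of the original notebook.

```python
import numpy as np

# Mixed int/float input is upcast to one common dtype
mixed = np.array([1, 2.5, 3])
print(mixed.dtype)    # float64

# An explicit dtype converts the values on creation (floats are truncated)
ints = np.array([1.9, 2.1], dtype=int)
print(ints)           # [1 2]

# astype returns a new array with the requested dtype
as_float = ints.astype(float)
print(as_float.dtype) # float64
```

The exact integer type behind `dtype=int` depends on the platform and NumPy version, which is why the output above reports `int32` in this run.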

In [7]:
z = np.zeros(10)  # z is a flat (1-D) array: neither a row nor a column vector
z.shape

(10,)

In [8]:
# Reshape it into a column vector
z.shape = (10,1)
z

array([[0.],
       [0.],
       [0.],
       [0.],
       [0.],
       [0.],
       [0.],
       [0.],
       [0.],
       [0.]])

In [9]:
z = np.zeros(4)
z.shape = (2, 2)  # reshape into a 2x2 matrix
z

array([[0., 0.],
       [0., 0.]])

In [10]:
# or create the 2x2 array directly by passing the shape as a tuple
z = np.zeros((2,2))
z

array([[0., 0.],
       [0., 0.]])
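Assigning to `.shape` changes the array in place. A common alternative is the `reshape` method, which leaves the original array's shape untouched. The following is a minimal sketch with illustrative names (`flat`, `grid`, `column`):

```python
import numpy as np

flat = np.arange(6)            # array([0, 1, 2, 3, 4, 5])
grid = flat.reshape(2, 3)      # 2x3 array; flat keeps shape (6,)
print(flat.shape, grid.shape)  # (6,) (2, 3)

# -1 asks NumPy to infer that dimension from the array's size
column = flat.reshape(-1, 1)
print(column.shape)            # (6, 1)
```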

In [11]:
z = np.empty(3)  # allocates memory without initializing it; populate it before use
z

array([0., 0., 0.])

In [12]:
# np.linspace returns a flat (1-D) array
z = np.linspace(2, 4, 5)  # 5 evenly spaced points from 2 to 4, endpoints included
type(z)

numpy.ndarray

In [13]:
z.shape

(5,)

In [14]:
z 

array([2. , 2.5, 3. , 3.5, 4. ])

In [15]:
z = np.identity(2) # identity matrix
z

array([[1., 0.],
       [0., 1.]])

NumPy arrays can also be created from Python lists, tuples, and other sequences.


In [16]:
z = np.array((10, 20), dtype=float)    
#'float' is equivalent to 'np.float64'
z
type(z[0])

numpy.float64

In [17]:
z = np.array([[1, 2], [3, 4]])
# matrix from a list of lists
z

array([[1, 2],
       [3, 4]])

In [18]:
# To convert existing data to an array, use np.asarray:
# it does not copy the input when it is already an ndarray of the requested dtype
na = np.linspace(10,20,2)
na is np.asarray(na) 

True

In [19]:
na is np.array(na)

False
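The difference between the two calls can also be checked with `np.shares_memory`. The sketch below uses illustrative names (`same`, `copied`, `converted`) and also shows that `np.asarray` does have to create a new array when a different dtype is requested.

```python
import numpy as np

na = np.linspace(10, 20, 2)             # float64 array

same = np.asarray(na)                   # already an ndarray: no copy
copied = np.array(na)                   # np.array copies by default
converted = np.asarray(na, dtype=int)   # a dtype change forces a new array

print(same is na)                       # True
print(np.shares_memory(na, copied))     # False
print(np.shares_memory(na, converted))  # False
```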

Indexing a flat array uses the same syntax as indexing a Python list.

In [20]:
z = np.linspace(1, 2, 5)  # both endpoints are included
# np.linspace is the exception here: slices and np.arange exclude the stop value
z

array([1.  , 1.25, 1.5 , 1.75, 2.  ])

In [21]:
z[0]

1.0

In [22]:
z[0:2]  # pick the first two elements; the stop index 2 is excluded

array([1.  , 1.25])

In [23]:
a1 = np.arange(1, 10)  # the stop value 10 is excluded
a1

array([1, 2, 3, 4, 5, 6, 7, 8, 9])

In [24]:
a2 = a1[3:6]
a2

array([4, 5, 6])
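Although the slicing syntax matches Python lists, one behavioral difference is worth keeping in mind: a basic slice of a NumPy array is a view onto the same data, whereas a list slice is a copy. A minimal sketch (the names `py_list` and `py_slice` are illustrative):

```python
import numpy as np

a1 = np.arange(1, 10)
a2 = a1[3:6]         # a view onto a1, not a copy
a2[0] = 100          # writes through to a1
print(a1[3])         # 100

py_list = list(range(1, 10))
py_slice = py_list[3:6]   # list slices are independent copies
py_slice[0] = 100
print(py_list[3])    # 4
```

If an independent array is needed, `a1[3:6].copy()` avoids the shared data.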

In [25]:
# 2d array
z = np.array([[1, 2], [3, 4]])
z

array([[1, 2],
       [3, 4]])

In [26]:
z[0, 0]

1

In [27]:
z[0,1]

2

In [28]:
z = np.linspace(2, 4, 5)
z

array([2. , 2.5, 3. , 3.5, 4. ])

In [29]:
indices = np.array((0, 2, 3))
z[indices]

array([2. , 3. , 3.5])

In [30]:
d = np.array([0, 1, 1, 0, 0], dtype=bool)
d

array([False,  True,  True, False, False])

In [31]:
z[d]

array([2.5, 3. ])

In [32]:
z = np.empty(3)  # uninitialized: may show leftover values from memory
z

array([2. , 3. , 3.5])

In [33]:
z[:] = 42
z

array([42., 42., 42.])

In [34]:
a = np.array((4, 3, 2, 1))
a

array([4, 3, 2, 1])

We have already covered the computational complexity of sorting.

NumPy provides a fast sorting algorithm.

Other array methods are also provided and are highly optimized.

In [35]:
a.sort()  # sorts in place, modifying the original array
a

array([1, 2, 3, 4])
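When the original order must be preserved, the function `np.sort` returns a sorted copy instead of sorting in place, and `np.argsort` returns the indices that would sort the array. A short sketch with illustrative names (`sorted_copy`, `order`):

```python
import numpy as np

a = np.array((4, 3, 2, 1))

sorted_copy = np.sort(a)   # sorted copy; a itself is unchanged
order = np.argsort(a)      # indices that would sort a

print(a)            # [4 3 2 1]
print(sorted_copy)  # [1 2 3 4]
print(order)        # [3 2 1 0]
```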

In [36]:
a.sum()

10

In [37]:
a.max()

4

In [38]:
a.mean()

2.5

In [39]:
a.argmax()  # returns the index of the maximal element

3

In [40]:
a.cumsum() # Cumulative sum

array([ 1,  3,  6, 10], dtype=int32)

In [41]:
a.cumprod() # cumulative product

array([ 1,  2,  6, 24], dtype=int32)

In [42]:
a.var()

1.25

In [43]:
a.std()

1.118033988749895

In [44]:
a.shape =(2,2)
a.T # transpose

array([[1, 3],
       [2, 4]])

If `z` is a nondecreasing array, then `z.searchsorted(a)` returns the index of the first element of `z` that is `>= a`.

In [45]:
z = np.linspace(2, 4, 5)
z

array([2. , 2.5, 3. , 3.5, 4. ])

In [46]:
z.searchsorted(2.2)

1
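The definition above can be checked directly against a boolean comparison. The sketch below is illustrative only; `first_ge` is a hypothetical helper variable, and the `side='right'` call shows the alternative convention for values already present in the array.

```python
import numpy as np

z = np.linspace(2, 4, 5)    # [2.0, 2.5, 3.0, 3.5, 4.0]
a = 3.0

idx = z.searchsorted(a)             # default is side='left'
first_ge = int(np.argmax(z >= a))   # first index where z >= a
print(idx, first_ge)                # 2 2

# side='right' returns the index just after any entries equal to a
print(z.searchsorted(a, side='right'))   # 3
```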

In [47]:
a = np.array((4, 3, 2, 1))
np.sum(a) # same as a.sum()

10

In [48]:
%timeit np.sum(a)

6.62 µs ± 287 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)


In [49]:
%timeit a.sum()

4.77 µs ± 489 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)


## Operations on Arrays

In [50]:
a = np.array([1,2,3,4])
b = np.array([5,6,7,8])
a + b

array([ 6,  8, 10, 12])

In [51]:
a * b 

array([ 5, 12, 21, 32])

In [52]:
a + 10

array([11, 12, 13, 14])

In [53]:
a * 10

array([10, 20, 30, 40])

In [54]:
A = np.ones((2, 2))
B = np.ones((2, 2))
A + B

array([[2., 2.],
       [2., 2.]])

In [55]:
A + 10

array([[11., 11.],
       [11., 11.]])
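Adding a scalar to an array is the simplest case of what NumPy calls broadcasting: the smaller operand is stretched to match the shape of the larger one. The same rule applies to a flat array or a column combined with a matrix, as in this sketch (the names `row` and `col` are illustrative):

```python
import numpy as np

A = np.ones((2, 3))
row = np.array([10, 20, 30])   # shape (3,): added to every row of A
col = np.array([[1], [2]])     # shape (2, 1): added to every column of A

print(A + row)   # [[11, 21, 31], [11, 21, 31]]
print(A + col)   # [[2, 2, 2], [3, 3, 3]]
```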

In [56]:
A*B

array([[1., 1.],
       [1., 1.]])

In [57]:
A = np.ones((2, 2))
B = np.ones((2, 2))
A @ B

array([[2., 2.],
       [2., 2.]])

In [58]:
A = np.array((1, 2))
B = np.array((10, 20))
A @ B # inner product of flat arrays

50

In [59]:
A = np.array(((1, 2), (3, 4)))
A @ (0,1)

array([2, 4])
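With matrices of ones, `*` and `@` are easy to confuse. The sketch below repeats the comparison with matrices chosen (for illustration only) so that the two results differ visibly.

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[0, 1], [1, 0]])

elementwise = A * B   # multiplies matching entries
matmul = A @ B        # row-by-column matrix product

print(elementwise)    # [[0, 2], [3, 0]]
print(matmul)         # [[2, 1], [4, 3]]
```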

In [60]:
a = np.array([42, 44])
a

array([42, 44])

In [61]:
a[-1] =0
a

array([42,  0])

## Assignment by reference versus copying by value

In [62]:
a = np.random.randn(3)
a

array([ 1.10975549,  0.66992125, -0.50769634])

In [63]:
b = a 
print(id(b))
id(a) == id(b)

1809081842464


True

In [64]:
b[0] = 0.0
print(id(b))

1809081842464


In [65]:
id(a) == id(b)

True

In [66]:
a = np.random.randn(4)
a

array([ 0.63273816,  0.95828252, -1.0268434 , -1.34818946])

If you do not want changes to affect the original array, make an independent copy with `np.copy`.

In [67]:
b = np.copy(a) 

In [68]:
b[:]= 1
id(a) == id(b)

False

In [69]:
a

array([ 0.63273816,  0.95828252, -1.0268434 , -1.34818946])

In [70]:
b

array([1., 1., 1., 1.])
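Instead of comparing `id` values, `np.shares_memory` can be used to test whether two arrays use the same underlying data. A minimal sketch, with the illustrative names `alias` and `clone`:

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])
alias = a            # a second name for the same array
clone = a.copy()     # independent data, equivalent to np.copy(a)

alias[0] = -1.0      # visible through a
clone[1] = -2.0      # not visible through a

print(a)                            # [-1.  2.  3.]
print(np.shares_memory(a, alias))   # True
print(np.shares_memory(a, clone))   # False
```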

## Vectorized functions

In [71]:
z = np.array([1,2,3])
np.sin(z)

array([0.84147098, 0.90929743, 0.14112001])

In [72]:
( 1/ np.sqrt(2 * np.pi)) * np.exp(- 0.5 * z**2)

array([0.24197072, 0.05399097, 0.00443185])
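The expression above evaluates the standard normal density at every element of `z` without an explicit loop. As a rough comparison, the sketch below computes the same values with a Python loop over the `math` module and confirms that both approaches agree (the names `looped` and `vectorized` are illustrative).

```python
import math
import numpy as np

z = np.array([1, 2, 3])

# Explicit Python loop, one element at a time
looped = np.array([math.exp(-0.5 * v**2) / math.sqrt(2 * math.pi) for v in z])

# Vectorized version: the NumPy functions act on the whole array
vectorized = np.exp(-0.5 * z**2) / np.sqrt(2 * np.pi)

print(np.allclose(looped, vectorized))   # True
```

On large arrays the vectorized form is usually much faster, since the loop runs in compiled code rather than in the Python interpreter.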

In [73]:
x = np.random.randn(4)
x

array([-0.76486314, -0.40462903, -0.23857884, -1.91687995])

In [74]:
np.where(x > 0, 1, 0)  
# 1 where x > 0, otherwise 0

array([0, 0, 0, 0])

In [75]:
z = np.linspace(0, 10, 5)
z

array([ 0. ,  2.5,  5. ,  7.5, 10. ])

In [76]:
z[z > 3]

array([ 5. ,  7.5, 10. ])
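Boolean conditions can be combined with `&` and `|`, and `np.where` also accepts arrays in place of the constants. The following sketch is illustrative (`mask` and `clipped` are hypothetical names):

```python
import numpy as np

z = np.linspace(0, 10, 5)      # [0, 2.5, 5, 7.5, 10]

# Each comparison needs its own parentheses when combined with &
mask = (z > 3) & (z < 10)
print(z[mask])                 # [5.  7.5]

# Keep z where the condition fails, replace with 5.0 where it holds
clipped = np.where(z > 5, 5.0, z)
print(clipped)                 # [0.  2.5 5.  5.  5. ]
```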

In [77]:
z = np.random.randn(10000)  # Generate standard normals
y = np.random.binomial(10, 0.5, size=1000)    
# 1,000 draws from Bin(10, 0.5)
y.mean()

5.078
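The sample mean above is close to the theoretical mean of Bin(10, 0.5), which is 10 × 0.5 = 5. To make such draws reproducible, one option is NumPy's `Generator` interface with a fixed seed. In this sketch the seed value and sample size are arbitrary choices, not taken from the original notebook.

```python
import numpy as np

rng = np.random.default_rng(seed=0)       # arbitrary seed for reproducibility
y = rng.binomial(10, 0.5, size=100_000)   # 100,000 draws from Bin(10, 0.5)

theoretical_mean = 10 * 0.5
print(y.mean())
print(abs(y.mean() - theoretical_mean) < 0.05)
```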

In [78]:
A = np.array([[1, 2], [3, 4]])

np.linalg.det(A)  # compute the determinant

-2.0000000000000004

In [79]:
np.linalg.inv(A)  # inverse matrix

array([[-2. ,  1. ],
       [ 1.5, -0.5]])
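A quick sanity check on the inverse is that `A @ A_inv` should be close to the identity matrix, up to floating-point error like that visible in the determinant above. The sketch below also solves a linear system with `np.linalg.solve`. The right-hand side `b` is an arbitrary illustrative choice.

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
A_inv = np.linalg.inv(A)

# Compare with a tolerance rather than exact equality
print(np.allclose(A @ A_inv, np.eye(2)))   # True

# Solve A x = b without forming the inverse explicitly
b = np.array([5, 6])
x = np.linalg.solve(A, b)
print(np.allclose(A @ x, b))               # True
```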