# NumPy Arrays

**Python objects:**

1. high-level number objects: integers, floating point
2. containers: lists (cheap insertion and append), dictionaries (fast lookup)

**NumPy provides:**

1. an extension package to Python for multi-dimensional arrays
2. data structures closer to the hardware (efficiency)
3. a design aimed at scientific computation (convenience)
4. a programming style also known as array-oriented computing

## Importing the NumPy module

In [1]:
import numpy as np

In [2]:
a = np.array([0, 1, 2, 3])
print(a)

[0 1 2 3]


In [4]:
a1 = np.arange(10)
print(a1)

[0 1 2 3 4 5 6 7 8 9]


**Why it is useful:** a NumPy array is a memory-efficient container that provides fast numerical operations.

In [0]:
# Pure Python: square each element with a list comprehension
L = range(1000)
%timeit [i**2 for i in L]

307 µs ± 17.6 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)


In [0]:
a = np.arange(1000)
%timeit a**2

1.35 µs ± 126 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
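The memory side of the claim can be checked in a similar way. This is a minimal sketch, assuming CPython's `sys.getsizeof` as a rough measure of a list's footprint; the exact byte counts vary by platform and Python version, and the names `values`, `list_bytes` and `array_bytes` are only illustrative.

```python
import sys
import numpy as np

values = list(range(1000))
arr = np.arange(1000)

# A list stores pointers to separate int objects, so count both parts.
list_bytes = sys.getsizeof(values) + sum(sys.getsizeof(v) for v in values)

# An array stores the raw values contiguously: size * itemsize bytes of data.
array_bytes = arr.nbytes

print(list_bytes, array_bytes)
```

The array's data buffer is exactly `arr.size * arr.itemsize` bytes, while the list pays for a pointer plus a full Python object per element.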


In [29]:
a2 = np.array([[0, 1, 2, 3],[1,2,3,4]])

In [30]:
a2.ndim

2

In [31]:
a2.shape

(2, 4)

In [32]:
a2

array([[0, 1, 2, 3],
       [1, 2, 3, 4]])

In [34]:
a2.reshape(4,2)

array([[0, 1],
       [2, 3],
       [1, 2],
       [3, 4]])

In [38]:
m = np.linspace(1,100,10)
m

array([  1.,  12.,  23.,  34.,  45.,  56.,  67.,  78.,  89., 100.])

In [45]:
a = [[[1,2],[3,4]],[[1,2],[3,4]]]
a

[[[1, 2], [3, 4]], [[1, 2], [3, 4]]]

In [48]:
a[0][1][1]

4

In [54]:
np.random.rand(5,2)

array([[0.25999969, 0.99248675],
       [0.93986485, 0.47484492],
       [0.31490487, 0.42990104],
       [0.68449364, 0.32879217],
       [0.89474888, 0.29301181]])

In [56]:
np.random.randn(5,5)

array([[ 0.10485215,  0.11339477,  1.94645996,  1.17684529, -0.31509438],
       [-0.18687574,  0.74452249,  0.07433619, -2.07970076, -0.18046743],
       [ 0.15960398, -0.91726017, -0.38102805,  0.72012072,  0.13728477],
       [ 0.39858266,  0.5088256 ,  0.86601489,  0.9053049 ,  0.09358064],
       [ 1.35625072, -0.01473464, -1.36338459, -1.42842865,  1.08737014]])

In [61]:
np.random.randint(1,100)

16

In [69]:
np.random.randint(1, 101)  # replaces the deprecated random_integers(1, 100); randint excludes the upper bound


82
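The random cells above use NumPy's legacy module-level functions. As a hedged alternative, the sketch below draws the same kinds of samples through the newer `Generator` interface (`np.random.default_rng`); the seed value 42 is an arbitrary assumption chosen only to make the run repeatable.

```python
import numpy as np

rng = np.random.default_rng(42)  # arbitrary seed, for repeatability only

uniform = rng.random((5, 2))              # uniform in [0, 1), like np.random.rand(5, 2)
normal = rng.standard_normal((5, 5))      # standard normal, like np.random.randn(5, 5)
k = rng.integers(1, 100, endpoint=True)   # one integer in [1, 100], both ends included

print(uniform.shape, normal.shape, k)
```

Passing `endpoint=True` makes the upper bound inclusive, which is the behaviour the older `random_integers` call offered.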

In [72]:
m = np.random.rand(2,2)
n = np.random.rand(2,2)
print(m)
n

[[0.06225777 0.76791893]
 [0.11980706 0.73718583]]


array([[0.83553903, 0.4947996 ],
       [0.20613114, 0.35630049]])

In [75]:
np.vstack((m,n))

array([[0.06225777, 0.76791893],
       [0.11980706, 0.73718583],
       [0.83553903, 0.4947996 ],
       [0.20613114, 0.35630049]])

In [76]:
np.hstack((m,n))

array([[0.06225777, 0.76791893, 0.83553903, 0.4947996 ],
       [0.11980706, 0.73718583, 0.20613114, 0.35630049]])

In [79]:
np.eye(3,3).astype(int)

array([[1, 0, 0],
       [0, 1, 0],
       [0, 0, 1]])

In [81]:
m = np.diag([1,2,3,4])
m

array([[1, 0, 0, 0],
       [0, 2, 0, 0],
       [0, 0, 3, 0],
       [0, 0, 0, 4]])

In [86]:
m.diagonal()

array([1, 2, 3, 4])

In [87]:
m.all()

False

In [88]:
m.any()

True

In [89]:
m.argmax(),m.argmin()

(15, 1)

In [90]:
m.max(),m.min()

(4, 0)

In [92]:
m.dtype

dtype('int32')

**Each built-in data type has a character code that uniquely identifies it.**

- 'b' − boolean
- 'i' − (signed) integer
- 'u' − unsigned integer
- 'f' − floating-point
- 'c' − complex floating-point
- 'm' − timedelta
- 'M' − datetime
- 'O' − (Python) objects
- 'S', 'a' − (byte-)string
- 'U' − Unicode
- 'V' − raw data (void)
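The character codes listed above can be read back from any dtype through its `kind` attribute. A short sketch, where the particular sample dtypes in `samples` are just an illustrative selection:

```python
import numpy as np

samples = (np.bool_, np.int32, np.uint8, np.float64, np.complex128,
           "timedelta64[s]", "datetime64[D]", object, "S5", "U5")

# Collect the one-character kind code of each sample dtype.
kinds = [np.dtype(s).kind for s in samples]

for s, kind in zip(samples, kinds):
    d = np.dtype(s)
    print(f"{d.name:>16}  kind={kind!r}  itemsize={d.itemsize}")
```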

In [111]:
m = np.array([4 + 2j])
m.dtype

dtype('complex128')

In [113]:
m = np.array([2+3j, 4])
m

array([2.+3.j, 4.+0.j])

In [114]:
m.dtype

dtype('complex128')

In [115]:
m = np.array([2+3j, 4, 'python'])
m

array(['(2+3j)', '4', 'python'], dtype='<U64')

In [119]:
m = np.arange(0,4).reshape(2,2)
m

array([[0, 1],
       [2, 3]])

In [117]:
n

array([[0.83553903, 0.4947996 ],
       [0.20613114, 0.35630049]])

In [120]:
np.equal(m,n)

array([[False, False],
       [False, False]])

In [123]:
one = np.ones(9,dtype=int).reshape(3,3)
one 

array([[1, 1, 1],
       [1, 1, 1],
       [1, 1, 1]])

In [124]:
five = 5 * one
five

array([[5, 5, 5],
       [5, 5, 5],
       [5, 5, 5]])

In [125]:
np.dot(one,five)

array([[15, 15, 15],
       [15, 15, 15],
       [15, 15, 15]])

In [126]:
one * five

array([[5, 5, 5],
       [5, 5, 5],
       [5, 5, 5]])
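The two cells above give different results because `np.dot` computes a matrix product while `*` multiplies element by element. A small sketch of the same comparison using the `@` operator, which for 2-D arrays gives the same result as `np.dot`:

```python
import numpy as np

one = np.ones((3, 3), dtype=int)
five = 5 * one

matmul = one @ five        # matrix product: each entry sums three products 1 * 5
elementwise = one * five   # element-by-element product

print(matmul)
print(elementwise)
```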

In [130]:
six = one + five
six

array([[6, 6, 6],
       [6, 6, 6],
       [6, 6, 6]])

In [132]:
a2 = one[:2,:2]
a2

array([[1, 1],
       [1, 1]])

In [133]:
bol = np.array([[True,False],[True,False]])
bol

array([[ True, False],
       [ True, False]])

In [134]:
a2[bol]

array([1, 1])

In [135]:
a = np.arange(480,555)
a

array([480, 481, 482, 483, 484, 485, 486, 487, 488, 489, 490, 491, 492,
       493, 494, 495, 496, 497, 498, 499, 500, 501, 502, 503, 504, 505,
       506, 507, 508, 509, 510, 511, 512, 513, 514, 515, 516, 517, 518,
       519, 520, 521, 522, 523, 524, 525, 526, 527, 528, 529, 530, 531,
       532, 533, 534, 535, 536, 537, 538, 539, 540, 541, 542, 543, 544,
       545, 546, 547, 548, 549, 550, 551, 552, 553, 554])

In [138]:
e = a % 2 == 0
a[e]

array([480, 482, 484, 486, 488, 490, 492, 494, 496, 498, 500, 502, 504,
       506, 508, 510, 512, 514, 516, 518, 520, 522, 524, 526, 528, 530,
       532, 534, 536, 538, 540, 542, 544, 546, 548, 550, 552, 554])

In [139]:
year = np.arange(1950,2050)
year

array([1950, 1951, 1952, 1953, 1954, 1955, 1956, 1957, 1958, 1959, 1960,
       1961, 1962, 1963, 1964, 1965, 1966, 1967, 1968, 1969, 1970, 1971,
       1972, 1973, 1974, 1975, 1976, 1977, 1978, 1979, 1980, 1981, 1982,
       1983, 1984, 1985, 1986, 1987, 1988, 1989, 1990, 1991, 1992, 1993,
       1994, 1995, 1996, 1997, 1998, 1999, 2000, 2001, 2002, 2003, 2004,
       2005, 2006, 2007, 2008, 2009, 2010, 2011, 2012, 2013, 2014, 2015,
       2016, 2017, 2018, 2019, 2020, 2021, 2022, 2023, 2024, 2025, 2026,
       2027, 2028, 2029, 2030, 2031, 2032, 2033, 2034, 2035, 2036, 2037,
       2038, 2039, 2040, 2041, 2042, 2043, 2044, 2045, 2046, 2047, 2048,
       2049])

In [145]:
leap = year[np.logical_or(year % 400 == 0,np.logical_and(year % 4 == 0,year % 100 != 0))]
leap

array([1952, 1956, 1960, 1964, 1968, 1972, 1976, 1980, 1984, 1988, 1992,
       1996, 2000, 2004, 2008, 2012, 2016, 2020, 2024, 2028, 2032, 2036,
       2040, 2044, 2048])

### Repeating the elements

In [208]:
i = leap.repeat(2)
i

array([1952, 1952, 1956, 1956, 1960, 1960, 1964, 1964, 1968, 1968, 1972,
       1972, 1976, 1976, 1980, 1980, 1984, 1984, 1988, 1988, 1992, 1992,
       1996, 1996, 2000, 2000, 2004, 2004, 2008, 2008, 2012, 2012, 2016,
       2016, 2020, 2020, 2024, 2024, 2028, 2028, 2032, 2032, 2036, 2036,
       2040, 2040, 2044, 2044, 2048, 2048])

In [209]:
len(i)

50

In [210]:
j = i.reshape(2,5,5)
j

array([[[1952, 1952, 1956, 1956, 1960],
        [1960, 1964, 1964, 1968, 1968],
        [1972, 1972, 1976, 1976, 1980],
        [1980, 1984, 1984, 1988, 1988],
        [1992, 1992, 1996, 1996, 2000]],

       [[2000, 2004, 2004, 2008, 2008],
        [2012, 2012, 2016, 2016, 2020],
        [2020, 2024, 2024, 2028, 2028],
        [2032, 2032, 2036, 2036, 2040],
        [2040, 2044, 2044, 2048, 2048]]])

In [211]:
j.repeat(2,axis = 0)

array([[[1952, 1952, 1956, 1956, 1960],
        [1960, 1964, 1964, 1968, 1968],
        [1972, 1972, 1976, 1976, 1980],
        [1980, 1984, 1984, 1988, 1988],
        [1992, 1992, 1996, 1996, 2000]],

       [[1952, 1952, 1956, 1956, 1960],
        [1960, 1964, 1964, 1968, 1968],
        [1972, 1972, 1976, 1976, 1980],
        [1980, 1984, 1984, 1988, 1988],
        [1992, 1992, 1996, 1996, 2000]],

       [[2000, 2004, 2004, 2008, 2008],
        [2012, 2012, 2016, 2016, 2020],
        [2020, 2024, 2024, 2028, 2028],
        [2032, 2032, 2036, 2036, 2040],
        [2040, 2044, 2044, 2048, 2048]],

       [[2000, 2004, 2004, 2008, 2008],
        [2012, 2012, 2016, 2016, 2020],
        [2020, 2024, 2024, 2028, 2028],
        [2032, 2032, 2036, 2036, 2040],
        [2040, 2044, 2044, 2048, 2048]]])

In [212]:
j.sum()

100000

In [213]:
j.prod(axis = 1)

array([[-534413312,  395575296, 1745911808, -635183104,  273842176],
       [-316571648, 1745543168,  618086400, 1166016512, -706740224]])
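The negative products above are most likely a sign of 32-bit integer overflow: this notebook reports `int32` arrays, and each true column product is far larger than the `int32` maximum. The sketch below rebuilds `j` and compares a 32-bit accumulator with a 64-bit one; the names `narrow` and `wide` are illustrative.

```python
import numpy as np

year = np.arange(1950, 2050)
leap = year[(year % 400 == 0) | ((year % 4 == 0) & (year % 100 != 0))]
j = leap.repeat(2).reshape(2, 5, 5)

# Force a 32-bit accumulator to reproduce the silent wrap-around on any platform.
narrow = j.astype(np.int32).prod(axis=1, dtype=np.int32)

# A 64-bit accumulator is wide enough for these products.
wide = j.prod(axis=1, dtype=np.int64)

print(narrow)
print(wide)
```

NumPy does not raise an error when integer reductions overflow, so passing `dtype=np.int64` (or converting to float) is worth considering whenever products can grow large.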

In [214]:
j.cumsum()

array([  1952,   3904,   5860,   7816,   9776,  11736,  13700,  15664,
        17632,  19600,  21572,  23544,  25520,  27496,  29476,  31456,
        33440,  35424,  37412,  39400,  41392,  43384,  45380,  47376,
        49376,  51376,  53380,  55384,  57392,  59400,  61412,  63424,
        65440,  67456,  69476,  71496,  73520,  75544,  77572,  79600,
        81632,  83664,  85700,  87736,  89776,  91816,  93860,  95904,
        97952, 100000], dtype=int32)

In [215]:
j.mean()

2000.0

In [216]:
j.size

50

In [217]:
j.tolist()

[[[1952, 1952, 1956, 1956, 1960],
  [1960, 1964, 1964, 1968, 1968],
  [1972, 1972, 1976, 1976, 1980],
  [1980, 1984, 1984, 1988, 1988],
  [1992, 1992, 1996, 1996, 2000]],
 [[2000, 2004, 2004, 2008, 2008],
  [2012, 2012, 2016, 2016, 2020],
  [2020, 2024, 2024, 2028, 2028],
  [2032, 2032, 2036, 2036, 2040],
  [2040, 2044, 2044, 2048, 2048]]]

In [218]:
list(j)

[array([[1952, 1952, 1956, 1956, 1960],
        [1960, 1964, 1964, 1968, 1968],
        [1972, 1972, 1976, 1976, 1980],
        [1980, 1984, 1984, 1988, 1988],
        [1992, 1992, 1996, 1996, 2000]]),
 array([[2000, 2004, 2004, 2008, 2008],
        [2012, 2012, 2016, 2016, 2020],
        [2020, 2024, 2024, 2028, 2028],
        [2032, 2032, 2036, 2036, 2040],
        [2040, 2044, 2044, 2048, 2048]])]

In [219]:
j.tobytes()  # tostring() is a deprecated alias of tobytes()

b'\xa0\x07\x00\x00\xa0\x07\x00\x00\xa4\x07\x00\x00\xa4\x07\x00\x00\xa8\x07\x00\x00\xa8\x07\x00\x00\xac\x07\x00\x00\xac\x07\x00\x00\xb0\x07\x00\x00\xb0\x07\x00\x00\xb4\x07\x00\x00\xb4\x07\x00\x00\xb8\x07\x00\x00\xb8\x07\x00\x00\xbc\x07\x00\x00\xbc\x07\x00\x00\xc0\x07\x00\x00\xc0\x07\x00\x00\xc4\x07\x00\x00\xc4\x07\x00\x00\xc8\x07\x00\x00\xc8\x07\x00\x00\xcc\x07\x00\x00\xcc\x07\x00\x00\xd0\x07\x00\x00\xd0\x07\x00\x00\xd4\x07\x00\x00\xd4\x07\x00\x00\xd8\x07\x00\x00\xd8\x07\x00\x00\xdc\x07\x00\x00\xdc\x07\x00\x00\xe0\x07\x00\x00\xe0\x07\x00\x00\xe4\x07\x00\x00\xe4\x07\x00\x00\xe8\x07\x00\x00\xe8\x07\x00\x00\xec\x07\x00\x00\xec\x07\x00\x00\xf0\x07\x00\x00\xf0\x07\x00\x00\xf4\x07\x00\x00\xf4\x07\x00\x00\xf8\x07\x00\x00\xf8\x07\x00\x00\xfc\x07\x00\x00\xfc\x07\x00\x00\x00\x08\x00\x00\x00\x08\x00\x00'

In [220]:
np.power(j,2)

array([[[3810304, 3810304, 3825936, 3825936, 3841600],
        [3841600, 3857296, 3857296, 3873024, 3873024],
        [3888784, 3888784, 3904576, 3904576, 3920400],
        [3920400, 3936256, 3936256, 3952144, 3952144],
        [3968064, 3968064, 3984016, 3984016, 4000000]],

       [[4000000, 4016016, 4016016, 4032064, 4032064],
        [4048144, 4048144, 4064256, 4064256, 4080400],
        [4080400, 4096576, 4096576, 4112784, 4112784],
        [4129024, 4129024, 4145296, 4145296, 4161600],
        [4161600, 4177936, 4177936, 4194304, 4194304]]], dtype=int32)

In [221]:
np.array_equal(j,a)

False

In [227]:
j[0:2,0] = 0

In [228]:
j

array([[[   0,    0,    0,    0,    0],
        [1960, 1964, 1964, 1968, 1968],
        [1972, 1972, 1976, 1976, 1980],
        [1980, 1984, 1984, 1988, 1988],
        [1992, 1992, 1996, 1996, 2000]],

       [[   0,    0,    0,    0,    0],
        [2012, 2012, 2016, 2016, 2020],
        [2020, 2024, 2024, 2028, 2028],
        [2032, 2032, 2036, 2036, 2040],
        [2040, 2044, 2044, 2048, 2048]]])

In [244]:
j[[1],[1],[0]]

array([2012])

In [245]:
np.arange(480,890)

array([480, 481, 482, 483, 484, 485, 486, 487, 488, 489, 490, 491, 492,
       493, 494, 495, 496, 497, 498, 499, 500, 501, 502, 503, 504, 505,
       506, 507, 508, 509, 510, 511, 512, 513, 514, 515, 516, 517, 518,
       519, 520, 521, 522, 523, 524, 525, 526, 527, 528, 529, 530, 531,
       532, 533, 534, 535, 536, 537, 538, 539, 540, 541, 542, 543, 544,
       545, 546, 547, 548, 549, 550, 551, 552, 553, 554, 555, 556, 557,
       558, 559, 560, 561, 562, 563, 564, 565, 566, 567, 568, 569, 570,
       571, 572, 573, 574, 575, 576, 577, 578, 579, 580, 581, 582, 583,
       584, 585, 586, 587, 588, 589, 590, 591, 592, 593, 594, 595, 596,
       597, 598, 599, 600, 601, 602, 603, 604, 605, 606, 607, 608, 609,
       610, 611, 612, 613, 614, 615, 616, 617, 618, 619, 620, 621, 622,
       623, 624, 625, 626, 627, 628, 629, 630, 631, 632, 633, 634, 635,
       636, 637, 638, 639, 640, 641, 642, 643, 644, 645, 646, 647, 648,
       649, 650, 651, 652, 653, 654, 655, 656, 657, 658, 659, ..., 889])

In [282]:
j = np.random.rand(5,5)
j

array([[0.59154465, 0.40432598, 0.40629042, 0.04940587, 0.80267282],
       [0.15696665, 0.48129777, 0.13889431, 0.05290738, 0.7009945 ],
       [0.34030498, 0.17267325, 0.49926741, 0.00699208, 0.63944395],
       [0.66423412, 0.77530783, 0.11631509, 0.6627762 , 0.33084882],
       [0.41411563, 0.55914351, 0.70106346, 0.53403925, 0.76641013]])

In [286]:
j.sort()
j

array([[0.00699208, 0.11631509, 0.15696665, 0.17267325, 0.33084882],
       [0.04940587, 0.13889431, 0.34030498, 0.40432598, 0.63944395],
       [0.05290738, 0.40629042, 0.41411563, 0.48129777, 0.7009945 ],
       [0.49926741, 0.53403925, 0.55914351, 0.59154465, 0.76641013],
       [0.6627762 , 0.66423412, 0.70106346, 0.77530783, 0.80267282]])

In [288]:
j.sort(axis = 0)


In [287]:
j.argsort()

array([[0, 1, 2, 3, 4],
       [0, 1, 2, 3, 4],
       [0, 1, 2, 3, 4],
       [0, 1, 2, 3, 4],
       [0, 1, 2, 3, 4]], dtype=int64)
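Because `j.sort()` sorts in place, the `argsort` result above is just `0..4` in every row: the rows were already sorted when it ran. The sketch below uses a small fixed array (`x`, an illustrative stand-in) to separate the three operations: `np.sort` returns a copy, `argsort` returns indices, and the `sort` method modifies the array itself.

```python
import numpy as np

x = np.array([[9, 1, 8],
              [3, 7, 2]])

sorted_copy = np.sort(x, axis=1)   # new array with each row sorted; x is unchanged
order = x.argsort(axis=1)          # indices that would sort each row

# Applying the argsort indices row by row reproduces the sorted copy.
rebuilt = np.take_along_axis(x, order, axis=1)

x.sort(axis=0)                     # in place: each column of x is now sorted

print(sorted_copy)
print(order)
print(x)
```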

In [280]:
j

array([[0.87802998, 0.04032474, 0.78903187, 0.64192567, 0.21271053],
       [0.11063265, 0.59057302, 0.21143659, 0.0347463 , 0.88829496],
       [0.86044692, 0.8253898 , 0.96069419, 0.51575314, 0.52921158],
       [0.60314643, 0.91714107, 0.39967668, 0.53603127, 0.09023255],
       [0.40814956, 0.5439583 , 0.75516571, 0.54451929, 0.2189376 ]])

In [289]:
np.median(j)

0.48129777200405455

In [292]:
np.median(j, axis = 0)

array([0.05290738, 0.40629042, 0.41411563, 0.48129777, 0.7009945 ])

In [293]:
np.median(j, axis = 1)

array([0.15696665, 0.34030498, 0.41411563, 0.55914351, 0.70106346])

In [299]:
j.std()

0.24815231814137928

In [300]:
j.var()

0.06157957299894032

In [304]:
help(j.flatten)

Help on built-in function flatten:

flatten(...) method of numpy.ndarray instance
    a.flatten(order='C')
    
    Return a copy of the array collapsed into one dimension.
    
    Parameters
    ----------
    order : {'C', 'F', 'A', 'K'}, optional
        'C' means to flatten in row-major (C-style) order.
        'F' means to flatten in column-major (Fortran-
        style) order. 'A' means to flatten in column-major
        order if `a` is Fortran *contiguous* in memory,
        row-major order otherwise. 'K' means to flatten
        `a` in the order the elements occur in memory.
        The default is 'C'.
    
    Returns
    -------
    y : ndarray
        A copy of the input array, flattened to one dimension.
    
    See Also
    --------
    ravel : Return a flattened array.
    flat : A 1-D flat iterator over the array.
    
    Examples
    --------
    >>> a = np.array([[1,2], [3,4]])
    >>> a.flatten()
    array([1, 2, 3, 4])
    >>> a.flatten('F')
    array([1, 3, 2, 4])

In [309]:
n = j.flatten()

In [312]:
j.T

array([[0.00699208, 0.04940587, 0.05290738, 0.49926741, 0.6627762 ],
       [0.11631509, 0.13889431, 0.40629042, 0.53403925, 0.66423412],
       [0.15696665, 0.34030498, 0.41411563, 0.55914351, 0.70106346],
       [0.17267325, 0.40432598, 0.48129777, 0.59154465, 0.77530783],
       [0.33084882, 0.63944395, 0.7009945 , 0.76641013, 0.80267282]])

In [316]:
j[::-1]

array([[0.6627762 , 0.66423412, 0.70106346, 0.77530783, 0.80267282],
       [0.49926741, 0.53403925, 0.55914351, 0.59154465, 0.76641013],
       [0.05290738, 0.40629042, 0.41411563, 0.48129777, 0.7009945 ],
       [0.04940587, 0.13889431, 0.34030498, 0.40432598, 0.63944395],
       [0.00699208, 0.11631509, 0.15696665, 0.17267325, 0.33084882]])

In [319]:
help(j.swapaxes)

Help on built-in function swapaxes:

swapaxes(...) method of numpy.ndarray instance
    a.swapaxes(axis1, axis2)
    
    Return a view of the array with `axis1` and `axis2` interchanged.
    
    Refer to `numpy.swapaxes` for full documentation.
    
    See Also
    --------
    numpy.swapaxes : equivalent function



In [323]:
j.swapaxes(0,1)

array([[0.00699208, 0.04940587, 0.05290738, 0.49926741, 0.6627762 ],
       [0.11631509, 0.13889431, 0.40629042, 0.53403925, 0.66423412],
       [0.15696665, 0.34030498, 0.41411563, 0.55914351, 0.70106346],
       [0.17267325, 0.40432598, 0.48129777, 0.59154465, 0.77530783],
       [0.33084882, 0.63944395, 0.7009945 , 0.76641013, 0.80267282]])
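The help texts above say that `flatten` returns a copy while `swapaxes` returns a view. A short sketch of what that difference means in practice, using a small illustrative array named `grid`:

```python
import numpy as np

grid = np.arange(6).reshape(2, 3)

t = grid.T               # view: shares memory with grid
s = grid.swapaxes(0, 1)  # view: same data, axes exchanged
f = grid.flatten()       # copy: independent data

t[0, 0] = 100            # writes through to grid
f[1] = -1                # leaves grid untouched

print(grid)
print(np.shares_memory(grid, t), np.shares_memory(grid, s), np.shares_memory(grid, f))
```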

In [329]:
np.__version__  # version string of the installed NumPy; `numpy --ver--` is not valid Python

In [335]:
n = np.array([np.nan,2,3,5,5])
n

array([nan,  2.,  3.,  5.,  5.])

In [340]:
n.dtype

dtype('float64')
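The array is `float64` because `np.nan` is a floating-point value, so the integers are upcast alongside it. A brief sketch of how a NaN affects reductions and two common ways to work around it:

```python
import numpy as np

n = np.array([np.nan, 2, 3, 5, 5])

plain_mean = n.mean()        # nan: a NaN propagates through ordinary reductions
mask = np.isnan(n)           # boolean mask marking the missing entry
clean_mean = np.nanmean(n)   # mean over the non-NaN values only
valid = n[~mask]             # the non-NaN values, selected with the mask

print(plain_mean, clean_mean, valid)
```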