# NumPy ndarray
> Avoid `from numpy import *`, which pollutes the namespace and can cause name conflicts with other modules.

In [1]:
import numpy as np
x = np.array([[1, 2, 3], [2, 3, 4]])
print(x)

[[1 2 3]
 [2 3 4]]


* Use the **ndim** attribute to get the number of dimensions of an array

In [2]:
x.ndim

2

In [3]:
x.shape

(2, 3)

## Array indexing and slicing

In [3]:
x = np.random.random((3, 4, 5))
x

array([[[ 0.26437847,  0.80164233,  0.82625471,  0.59607897,  0.83895227],
        [ 0.76398099,  0.71879652,  0.37390207,  0.25422914,  0.68812365],
        [ 0.93781487,  0.65187547,  0.72149537,  0.79652745,  0.71108631],
        [ 0.73442666,  0.13090776,  0.70074357,  0.74525315,  0.6073456 ]],

       [[ 0.42210814,  0.91178805,  0.06946539,  0.73285394,  0.15558986],
        [ 0.64783686,  0.77577682,  0.748246  ,  0.38964352,  0.58910809],
        [ 0.59608485,  0.55435544,  0.80492638,  0.74173767,  0.5881487 ],
        [ 0.81669809,  0.301606  ,  0.49532253,  0.96698657,  0.06939879]],

       [[ 0.68660551,  0.36002865,  0.35094605,  0.52450347,  0.0518192 ],
        [ 0.09791141,  0.83985057,  0.78365145,  0.39651623,  0.78734257],
        [ 0.48637775,  0.55254813,  0.88448288,  0.21418904,  0.5891296 ],
        [ 0.59833343,  0.80132861,  0.05866252,  0.74726322,  0.67708837]]])

In [10]:
x[(1, 2, 3)]

0.63885411730593111

In [17]:
x[1, 1:, 1]

array([ 0.45043811,  0.73449053,  0.97101124])

In [18]:
y = np.random.random((3, 4))
y

array([[ 0.35275133,  0.54398307,  0.94688182,  0.12506365],
       [ 0.66073845,  0.9606453 ,  0.34734895,  0.44136876],
       [ 0.26642123,  0.28365086,  0.45140476,  0.3218879 ]])

In [19]:
y[1, :]  # 2nd row

array([ 0.66073845,  0.9606453 ,  0.34734895,  0.44136876])

In [21]:
y[1, :-1]  # 2nd row, all columns except the last

array([ 0.66073845,  0.9606453 ,  0.34734895])

In [22]:
y[:, 1]  # 2nd column

array([ 0.54398307,  0.9606453 ,  0.28365086])

In [24]:
y[:, ::-1]  # reverse the order of the columns

array([[ 0.12506365,  0.94688182,  0.54398307,  0.35275133],
       [ 0.44136876,  0.34734895,  0.9606453 ,  0.66073845],
       [ 0.3218879 ,  0.45140476,  0.28365086,  0.26642123]])

In [25]:
y[::-1, :]  # reverse the order of the rows

array([[ 0.26642123,  0.28365086,  0.45140476,  0.3218879 ],
       [ 0.66073845,  0.9606453 ,  0.34734895,  0.44136876],
       [ 0.35275133,  0.54398307,  0.94688182,  0.12506365]])

In [5]:
x[1, 2]

array([ 0.59608485,  0.55435544,  0.80492638,  0.74173767,  0.5881487 ])
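
Because the arrays above are random, the indexing rules are easier to verify on an array with predictable values. The following is a minimal sketch, assuming an illustrative array `a` built with `np.arange` (not part of the original notebook), in which element `[i, j, k]` has the value `i*20 + j*5 + k`:

```python
import numpy as np

# Deterministic 3-D array: element [i, j, k] has value i*20 + j*5 + k.
a = np.arange(60).reshape(3, 4, 5)

scalar = a[1, 2, 3]            # full index: a single element
same_scalar = a[(1, 2, 3)]     # a tuple index is equivalent
row = a[1, 2]                  # partial index: 1-D array of length 5
column_slice = a[1, 1:, 1]     # block 1, rows 1 to 3, column 1
reversed_last = a[0, 0, ::-1]  # a negative step reverses an axis

print(scalar)         # 33
print(row)            # [30 31 32 33 34]
print(column_slice)   # [26 31 36]
print(reversed_last)  # [4 3 2 1 0]
```

An index with fewer entries than the array has dimensions returns a sub-array, while a full index returns a scalar.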

## Memory layout of ndarray
The **flags** attribute holds information about the memory layout of the array.
* C_CONTIGUOUS indicates whether the array is stored in C-style (row-major) order, where the last index varies fastest in memory
* F_CONTIGUOUS indicates whether the array is stored in Fortran-style (column-major) order, where the first index varies fastest in memory

Knowing the layout matters for performance: traversing an array along its contiguous axis is noticeably faster than traversing across it.

In [6]:
x.flags

  C_CONTIGUOUS : True
  F_CONTIGUOUS : False
  OWNDATA : True
  WRITEABLE : True
  ALIGNED : True
  UPDATEIFCOPY : False

**Example**:

In [18]:
c_array = np.random.rand(10000, 10000)
f_array = np.asfortranarray(c_array)
def sum_row(x):
    return np.sum(x[0, :])
def sum_column(x):
    return np.sum(x[:, 0])

%timeit sum_row(c_array)
%timeit sum_row(f_array)
%timeit sum_column(c_array)
%timeit sum_column(f_array)

The slowest run took 5.90 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 6.99 µs per loop
The slowest run took 5.25 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 19.3 µs per loop
10000 loops, best of 3: 90 µs per loop
100000 loops, best of 3: 7.16 µs per loop
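
One way to see why the timings differ is to compare the **strides** attribute of the two layouts, which gives the number of bytes to step along each axis. This is a small sketch using illustrative arrays `c_arr` and `f_arr` (names chosen here, not taken from the notebook above):

```python
import numpy as np

c_arr = np.zeros((3, 4), dtype=np.float64)  # C order is the default
f_arr = np.asfortranarray(c_arr)            # same values, column-major layout

print(c_arr.flags['C_CONTIGUOUS'], c_arr.flags['F_CONTIGUOUS'])  # True False
print(f_arr.flags['C_CONTIGUOUS'], f_arr.flags['F_CONTIGUOUS'])  # False True

# Strides are (bytes per step along axis 0, bytes per step along axis 1).
print(c_arr.strides)  # (32, 8): neighbours within a row are 8 bytes apart
print(f_arr.strides)  # (8, 24): neighbours within a column are 8 bytes apart
```

Summing a row of the C-ordered array or a column of the Fortran-ordered array reads adjacent memory, which is consistent with the faster timings above.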


## Views and copies
Slicing and indexing can return data in two ways: as a view, which accesses the elements of the original array directly, or as a copy, a new array that contains only the selected elements.
Use **may_share_memory** to check whether two arrays might share memory, that is, whether one could be a view of the other. It does the job in most cases, but it is not always reliable: it only compares the memory bounds of the two arrays, so it can return True for arrays that do not actually share any element.
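
The limits of the bounds check can be sketched with two interleaved slices. The example below assumes NumPy's exact (and potentially slower) check `np.shares_memory` is available, and the array names are illustrative only:

```python
import numpy as np

a = np.arange(10)
evens = a[::2]   # elements 0, 2, 4, 6, 8
odds = a[1::2]   # elements 1, 3, 5, 7, 9

# The two slices interleave in memory but never touch the same element.
bounds_overlap = np.may_share_memory(evens, odds)  # bounds check only
exact_overlap = np.shares_memory(evens, odds)      # exact check
print(bounds_overlap, exact_overlap)  # True False

view = a[:5]          # basic slicing returns a view
copy = a[:5].copy()   # an explicit copy owns its own data
print(np.shares_memory(a, view), np.shares_memory(a, copy))  # True False
print(view.base is a, copy.base is None)                     # True True
```

The **base** attribute offers another quick check: for a view it refers to the array that owns the data, and for an array that owns its data it is None.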

In [2]:
x = np.random.rand(100, 10)

In [3]:
y = x[:5, :]

In [4]:
np.may_share_memory(x, y)  # y is a view of x

True

* y is a view of x (it refers to the same data), so changing y changes x too.

In [6]:
y[:] = 0
x[:5,:]

array([[ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.]])

* Here y holds a copy of the first five rows of x and is independent of x, so changing y does not affect x.

In [13]:
x = np.random.rand(100, 10)
y = np.empty([5, 10])
y[:] = x[:5, :]
print('y: ', y)
y[:] = 0
print('x: ', x[:5, :])
print('y: ', y)

y:  [[ 0.31446063  0.40567077  0.52808131  0.92165188  0.71207395  0.40662951
   0.56544913  0.9139226   0.13805357  0.8929419 ]
 [ 0.63991351  0.31606078  0.39307608  0.79875712  0.00159579  0.90698844
   0.07898949  0.21082609  0.16894868  0.35943743]
 [ 0.85636721  0.56875043  0.28457974  0.42140064  0.18784317  0.05623059
   0.81073741  0.98003296  0.09038954  0.57298738]
 [ 0.10199465  0.56411442  0.0921287   0.78169324  0.29586504  0.02769341
   0.88835059  0.72881954  0.17525631  0.68224582]
 [ 0.85123524  0.18607454  0.84929094  0.67369407  0.04138161  0.1624432
   0.6355854   0.44047373  0.61986576  0.81339425]]
x:  [[ 0.31446063  0.40567077  0.52808131  0.92165188  0.71207395  0.40662951
   0.56544913  0.9139226   0.13805357  0.8929419 ]
 [ 0.63991351  0.31606078  0.39307608  0.79875712  0.00159579  0.90698844
   0.07898949  0.21082609  0.16894868  0.35943743]
 [ 0.85636721  0.56875043  0.28457974  0.42140064  0.18784317  0.05623059
   0.81073741  0.98003296  0.09038954  0.57

## Creating Arrays
Arrays can be created from:
* instances of other data structures
* files on disk
* data retrieved from the web

In this section, we create arrays from lists and from NumPy functions.

### Creating arrays from lists
To create a valid array object, arguments to array functions need to adhere to at least one of the following conditions:
* It has to be a valid iterable value or sequence, which may be nested
* It must have an **__array__** method that returns a valid numpy array
> The np.array() function normally casts all input elements to a single common data type that can represent every one of them.

In [19]:
x = np.array([1, 2, 3, 'hello'])  # integers are cast to strings
y = np.array(['hello', 'world'])  # already all strings
z = np.array([1, 2, 3, 4.5, 6.9, 'hello'])  # integers and floats are cast to strings
a = np.array((1, 2, 3, 4.5, 6.9, 'hello'))  # a tuple behaves the same as a list
print(x, y, z, a)

['1' '2' '3' 'hello'] ['hello' 'world'] ['1' '2' '3' '4.5' '6.9' 'hello'] ['1' '2' '3' '4.5' '6.9' 'hello']


In [21]:
np.arange(5)  # arange creates an array from a range of integers

array([0, 1, 2, 3, 4])

In [22]:
np.array([[1, 2, 3, 4], [1, 2, 3, '5']])  # a nested list creates a 2-dimensional array

array([['1', '2', '3', '4'],
       ['1', '2', '3', '5']], 
      dtype='<U21')
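
The two conditions and the casting rule can be sketched together. The class `Temperatures` below is a hypothetical container invented for this illustration. Its `__array__` signature accepts optional `dtype` and `copy` arguments on the assumption that this keeps it compatible with both older and newer NumPy releases:

```python
import numpy as np

class Temperatures:
    """Hypothetical container, used only to illustrate __array__."""

    def __init__(self, values):
        self._values = list(values)

    def __array__(self, dtype=None, copy=None):
        # Build a fresh array from the stored values on every call.
        return np.array(self._values, dtype=dtype)

ints = np.array([1, 2, 3])             # all integers: integer dtype
mixed = np.array([1, 2, 3.5])          # an int/float mix is cast to float
from_obj = np.array(Temperatures([20.5, 21.0]))  # converted via __array__
forced = np.array([1, 2, 3], dtype=np.float32)   # explicit dtype overrides the default

print(mixed.dtype)     # float64
print(from_obj.shape)  # (2,)
print(forced.dtype)    # float32
```

Passing `dtype` explicitly is a simple way to avoid surprises such as numbers silently becoming strings.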

### Creating random arrays
* Create random arrays
* Create random permutations of arrays
* Generate arrays with specific probability distributions

In [27]:
x = np.random.rand(2, 2, 3)
x.shape

(2, 2, 3)

In [28]:
y = np.random.random((2, 2, 3))
y.shape

(2, 2, 3)

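> rand is a convenience function for random: it takes the dimensions as separate arguments instead of a tuple. Both functions can only create arrays of floats, drawn from the half-open interval [0, 1).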

Use **randint()** to create arrays of integers.
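
The three kinds of random arrays listed at the start of this section can be sketched as follows. The seed, bounds, and sizes are arbitrary choices made for this illustration, and `np.random.normal` stands in for just one of the available distributions:

```python
import numpy as np

np.random.seed(0)  # arbitrary seed, only so that reruns are repeatable

# Random integers in the half-open interval [0, 10).
ints = np.random.randint(0, 10, size=(2, 3))

# Random permutations: of the integers 0..4, and of an existing array.
perm = np.random.permutation(5)
shuffled = np.random.permutation(np.array([10, 20, 30]))

# Samples from a specific distribution (normal, mean 0, standard deviation 1).
normal = np.random.normal(loc=0.0, scale=1.0, size=1000)

print(ints.shape, ints.dtype.kind)  # (2, 3) i
print(sorted(perm.tolist()))        # [0, 1, 2, 3, 4]
print(normal.shape)                 # (1000,)
```

Note that permutation returns a shuffled copy and leaves its input unchanged.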