<h3> What is NumPy? </h3>

NumPy, short for Numerical Python, is one of the most important foundational packages
for numerical computing in Python. Most computational packages providing
scientific functionality use NumPy’s array objects as the lingua franca for data
exchange.

<h4> What NumPy provides: </h4>

<ul>
    <li> ndarray, an efficient multidimensional array providing fast array-oriented arithmetic
operations and flexible broadcasting capabilities. </li>
    <li> Mathematical functions for fast operations on entire arrays of data without having
to write loops. </li>
    <li> Linear algebra and random number generation. </li>
    <li> A C API for connecting NumPy with libraries written in C or C++. </li>
</ul>

<h4> Why use it instead of Python's list? </h4>

<ul>
    <li> It provides matrix and vector operations. </li>
    <li> It is written in optimized C code, so it is much faster and more memory efficient: it can be up to hundreds of times faster and tens of times more memory efficient than the equivalent pure-Python code. </li>
    <li> It provides an easy-to-use C API, so it is straightforward to pass data to
external libraries written in a low-level language, and for external libraries to
        return data to Python as NumPy arrays.</li>
</ul>
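As a small illustration of the first point, the sketch below compares an explicit Python loop with the equivalent array expression. The sample values and variable names are made up for this example.

```python
import numpy as np

# Hypothetical sample data, chosen only for illustration.
values = [1.5, 2.0, 3.25, 4.0]

# Pure-Python version: an explicit loop (written as a list comprehension).
doubled_list = [v * 2 for v in values]

# NumPy version: one expression applied to every element at once.
doubled_arr = np.array(values) * 2

print(doubled_list)          # [3.0, 4.0, 6.5, 8.0]
print(doubled_arr.tolist())  # [3.0, 4.0, 6.5, 8.0]

# Multiplying a plain list by 2 repeats the list instead of
# multiplying its elements.
print(len(values * 2))       # 8
```

Both versions produce the same numbers; the array version simply leaves the looping to NumPy.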

<h3> Difference between a NumPy array and a Python list </h3><br><br><br>
<img src="https://raw.githubusercontent.com/h8hawk/Datacamp-Scientific-Python/master/files/array_vs_list.png"/>

<h3> What is covered in this course? </h3>

<ul>
    <li> Fast vectorized array operations for data munging and cleaning, subsetting and
        filtering, transformation, and any other kinds of computations </li>
    <li> Common array algorithms like sorting, unique, and set operations </li>
    <li> Efficient descriptive statistics and aggregating/summarizing data </li>
    <li> Data alignment and relational data manipulations for merging and joining
        together heterogeneous datasets </li>
    <li> Expressing conditional logic as array expressions instead of loops with
        if-elif-else branches </li>
    <li> Group-wise data manipulations (aggregation, transformation, function
        application)</li>
</ul>


<h4> How NumPy works </h4>

<ul>
    <li> NumPy internally stores data in a contiguous block of memory, independent of
other built-in Python objects. NumPy’s library of algorithms written in the C language can operate on this memory without any type checking or other overhead.
        NumPy arrays also use much less memory than built-in Python sequences. </li>
    <li> NumPy operations perform complex computations on entire arrays without the
    need for Python for loops. </li>
    
</ul>
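The memory layout described above can be inspected from Python. The following is a minimal sketch using a small example array (the name `grid` is made up here); it only reads attributes that every ndarray exposes.

```python
import numpy as np

# A 2 x 3 array of 64-bit integers, used only as an example.
grid = np.arange(6, dtype=np.int64).reshape(2, 3)

print(grid.itemsize)               # 8: bytes per int64 element
print(grid.nbytes)                 # 48: 6 elements * 8 bytes
print(grid.strides)                # (24, 8): bytes to step to the next row / next column
print(grid.flags['C_CONTIGUOUS'])  # True: rows are stored one after another
```

The strides show that moving to the next column means skipping one element (8 bytes), while moving to the next row means skipping a whole row of three elements (24 bytes).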


<h4> The NumPy ndarray: A Multidimensional Array Object </h4>

<p> One of the key features of NumPy is its N-dimensional array object, or ndarray,
which is a fast, flexible container for large datasets in Python. Arrays enable you to
perform mathematical operations on whole blocks of data using similar syntax to the
equivalent operations between scalar elements. </p>




<h4> A first look at a NumPy array: </h4>

Import the numpy module under the conventional alias <b>np</b>:

In [1]:
import numpy as np

Generate some random data.  
<a href="https://docs.scipy.org/doc/numpy/reference/generated/numpy.random.RandomState.html#numpy.random.RandomState">How does NumPy generate random numbers? </a>

In [2]:
data = np.random.randn(2,3)

In [3]:
data

array([[-2.63470329,  1.76767156,  0.48230588],
       [ 0.33645167,  1.70366514, -0.40070067]])

<p> A mathematical operation on <b>data</b>: </p>

In [5]:
data * 10

array([[-26.34703294,  17.67671565,   4.8230588 ],
       [  3.36451674,  17.03665141,  -4.00700671]])

<p> All of the elements of <b>data</b> have been multiplied by 10. </p>
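The values in data differ on every run because they are drawn at random. If reproducible numbers are needed, one option is to create a seeded generator. The sketch below uses np.random.RandomState, the class the link above points to; the seed value 42 is an arbitrary choice for this example.

```python
import numpy as np

seed = 42  # arbitrary value, chosen only for illustration

# Two generators created with the same seed produce the same numbers.
first = np.random.RandomState(seed).randn(2, 3)
second = np.random.RandomState(seed).randn(2, 3)

print(first.shape)                    # (2, 3)
print(np.array_equal(first, second))  # True
```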

<h3> Some properties of NumPy arrays (ndarray) </h3>
<ul>
    <li> An ndarray is a generic multidimensional container for homogeneous data : all
        of the elements must be the same type. </li>
    <li> Every array has a <b>shape</b> : a tuple indicating the
        size of each dimension </li>
    <li> Every array has a <b>dtype</b> : an object describing the data type of the array </li>
</ul>

For 'data':

In [7]:
data.shape

(2, 3)

In [8]:
data.dtype

dtype('float64')

<h3> Creating ndarrays </h3>
<ul>
    <li> Easiest way: use the <b>array</b> function </li>
</ul>

First, make a Python list:

In [9]:
data1 = [4, 5.3, 8, 9, 12]

In [10]:
arr1 = np.array(data1)

In [11]:
print(arr1)

[ 4.   5.3  8.   9.  12. ]


Nested sequence :

In [13]:
nested_seq = [[1, 2, 3], [4, 5, 6]]

In [14]:
nested_arr = np.array(nested_seq)

In [15]:
print(nested_arr)

[[1 2 3]
 [4 5 6]]


In [16]:
nested_arr.ndim

2

In [17]:
nested_arr.shape

(2, 3)

In [18]:
nested_arr.dtype

dtype('int64')

<h3> Some properties of NumPy arrays (ndarray) </h3>
<ul>
    <li> Unless explicitly specified, <b>np.array</b> tries to infer a good data
        type for the array that it creates. The data type is stored in the array's <b>dtype</b> attribute. </li>
</ul>
<br>
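A short sketch of this inference, using made-up inputs. The exact integer width (for example int64 or int32) can depend on the platform, so the example checks only the dtype's kind for the integer and string cases.

```python
import numpy as np

ints = np.array([1, 2, 3])      # all integers
mixed = np.array([1, 2.5, 3])   # one float promotes the whole array
text = np.array(['a', 'bc'])    # Python strings

print(ints.dtype.kind)   # i  (a signed integer type)
print(mixed.dtype)       # float64
print(text.dtype.kind)   # U  (a fixed-length Unicode type)

# Passing dtype explicitly overrides the inference.
explicit = np.array([1, 2, 3], dtype=np.float64)
print(explicit.dtype)    # float64
```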

<h4> There are a number of other functions for creating new
arrays: </h4>
<ul>
    <li> <b>zeros</b> : create arrays of 0s </li>
    <li> <b>ones</b> : create arrays of 1s </li>
    <li> <b>empty</b> : creates an array without initializing its values to any particular value.</li>
    <li> <b>arange</b> : Like the built-in <b>range</b> but returns an ndarray</li>
    <li> <b> ones_like </b> : produces a ones array of the same shape and dtype </li>
    <li> <b> zeros_like </b> : Like <b>ones_like</b> but for zeros </li>
    <li> <b> full </b> : Produce an array of the given shape and dtype with all values set to the indicated “fill value”</li>
    <li> <b> eye</b>, <b>identity </b> : Create a square N × N identity matrix (1s on the diagonal and 0s elsewhere) </li>
</ul>

In [20]:
np.zeros((4,3))

array([[0., 0., 0.],
       [0., 0., 0.],
       [0., 0., 0.],
       [0., 0., 0.]])

In [21]:
np.ones((4,3))

array([[1., 1., 1.],
       [1., 1., 1.],
       [1., 1., 1.],
       [1., 1., 1.]])

In [30]:
np.empty((4,6))

array([[1.39408725e-316, 1.41115267e-316, 6.95160038e-310,
        6.95160691e-310, 6.95160038e-310, 6.95160038e-310],
       [6.95160038e-310, 6.95160691e-310, 6.95160038e-310,
        6.95160687e-310, 6.95160689e-310, 6.95160523e-310],
       [6.95160691e-310, 6.95160523e-310, 6.95160685e-310,
        5.50948807e-317, 6.95160690e-310, 6.95160038e-310],
       [6.95160207e-310, 6.95160523e-310, 6.95160449e-310,
        6.95160526e-310, 6.95160523e-310, 6.95160038e-310]])

<p> <b>empty</b>, unlike <b>zeros</b>, does not set the array values to zero, and may therefore be marginally faster. <br><b>empty</b> has nothing to do with creating an array that is "empty" in the sense of having no elements. It just means the array doesn't have its values initialized (i.e., they are unpredictable and depend on whatever happens to be in the memory allocated for the array).</p>

In [38]:
np.full((3,4), 4)

array([[4, 4, 4, 4],
       [4, 4, 4, 4],
       [4, 4, 4, 4]])

In [40]:
np.ones_like(nested_arr)

array([[1, 1, 1],
       [1, 1, 1]])

In [85]:
np.arange(10)

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [86]:
np.arange(1,10,.5)

array([1. , 1.5, 2. , 2.5, 3. , 3.5, 4. , 4.5, 5. , 5.5, 6. , 6.5, 7. ,
       7.5, 8. , 8.5, 9. , 9.5])
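The functions from the list that are not run above (zeros_like, eye, identity) behave as sketched below. The `template` array is a stand-in created for this example.

```python
import numpy as np

template = np.array([[1, 2, 3], [4, 5, 6]])

# zeros_like copies the shape and dtype of its argument.
zeros = np.zeros_like(template)
print(zeros.shape)                    # (2, 3)
print(zeros.dtype == template.dtype)  # True

# eye and identity both build a square identity matrix.
print(np.array_equal(np.eye(3), np.identity(3)))  # True

# full infers the dtype from the fill value unless dtype is given.
filled = np.full((2, 2), 7.5)
print(filled.dtype)                   # float64
```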

<h4>Data Types for ndarrays:</h4>
<ul>
<li> The data type or dtype is a special object containing the information the ndarray needs to interpret a chunk of memory as a particular
type of data. </li>
</ul>

In [42]:
arr1 = np.array([1, 2, 3, 4], dtype=np.float32)

In [43]:
arr1

array([1., 2., 3., 4.], dtype=float32)

In [44]:
arr2 = np.array([1.2, 3, -0.3], dtype=np.int32)

In [45]:
arr2

array([1, 3, 0], dtype=int32)

<h6> Casting floating-point to integer: the decimal part is truncated. </h6>
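Truncation means the fractional part is dropped, so values move toward zero rather than to the nearest integer. A minimal sketch with made-up values, including one way to round first if that is what is wanted:

```python
import numpy as np

floats = np.array([1.7, -1.7, 0.5, -0.5])

# astype drops the fractional part (truncation toward zero).
truncated = floats.astype(np.int32)
print(truncated.tolist())  # [1, -1, 0, 0]

# To round to the nearest integer instead, round before casting.
# np.rint rounds halves to the nearest even integer.
rounded = np.rint(floats).astype(np.int32)
print(rounded.tolist())    # [2, -2, 0, 0]
```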

<h3> NumPy data types </h3>
<ul>
    <li> <b> int8, uint8 </b> : Signed and unsigned 8-bit (1 byte) integer types </li>
    <li> <b> int16, uint16 </b> : Signed and unsigned 16-bit integer types </li>
    <li> <b> int32, uint32 </b> : Signed and unsigned 32-bit integer types </li>
    <li> <b> int64, uint64 </b> : Signed and unsigned 64-bit integer types </li>
    <li> <b> float16 </b> : Half-precision floating point </li>
    <li> <b> float32 </b> : Standard single-precision floating point; compatible with C float </li>
    <li> <b> float64 </b> : Standard double-precision floating point; compatible with C double and
    Python float object </li>
    <li> <b> float128 </b> : Extended-precision floating point </li>
    <li> <b> complex64, complex128 </b> : Complex numbers represented by two 32-bit or two 64-bit floats, respectively </li>
    <li> <b> bool </b> : Boolean type storing True and False values </li>
    <li> <b> object </b> : Python object type; a value can be any Python object </li>
    <li> <b> string_ </b> : Fixed-length ASCII string type (1 byte per character); for example, to create a
    string dtype with length 10, use 'S10'. This name is an alias of <b>bytes_</b> and was removed in NumPy 2.0 </li>
    <li> <b> unicode_ </b> : Fixed-length Unicode type (number of bytes platform specific); same
    specification semantics as string_ (e.g., 'U10'). This name is an alias of <b>str_</b> and was removed in NumPy 2.0 </li>
</ul><br><br>
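The sizes and ranges in the list above can be checked directly. This sketch only queries np.dtype and np.iinfo; the selection of type names is arbitrary.

```python
import numpy as np

# Bytes used per element for a few of the types listed above.
sizes = {name: np.dtype(name).itemsize
         for name in ('int8', 'uint8', 'int64', 'float32', 'float64', 'complex128')}
print(sizes)
# {'int8': 1, 'uint8': 1, 'int64': 8, 'float32': 4, 'float64': 8, 'complex128': 16}

# Representable range of the 8-bit integer types.
print(np.iinfo(np.int8).min, np.iinfo(np.int8).max)  # -128 127
print(np.iinfo(np.uint8).max)                        # 255

# A fixed-length byte string of length 10 uses 1 byte per character.
print(np.dtype('S10').itemsize)                      # 10
```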
<h5> Converting an array's dtype: the <b>astype</b> method </h5>



In [46]:
arr = np.array([1, 2, 3, 4, 5], dtype=np.int64)

In [47]:
float_arr = arr.astype(np.float32)

In [48]:
float_arr.dtype

dtype('float32')

<h6> Casting strings to numbers: <b>astype</b> raises a ValueError if any string cannot be parsed as a number: </h6>

In [53]:
string_arr = np.array(['ab', '1' , 'fj'], dtype=np.bytes_)  # np.string_ was an alias of np.bytes_ and was removed in NumPy 2.0

In [54]:
string_arr.dtype

dtype('S2')

In [55]:
string_arr.astype(np.int64)

ValueError: invalid literal for int() with base 10: 'ab'
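When every string represents a number, the conversion succeeds. The sketch below shows both outcomes with made-up inputs and catches the error instead of letting it stop the program. It uses np.bytes_, which is assumed here as the portable spelling of the same fixed-length byte-string type.

```python
import numpy as np

# All entries are numeric, so the cast works.
numeric_strings = np.array(['1.25', '-9.6', '42'], dtype=np.bytes_)
as_floats = numeric_strings.astype(np.float64)
print(as_floats.tolist())  # [1.25, -9.6, 42.0]

# 'ab' and 'fj' are not numbers, so the cast raises ValueError.
mixed_strings = np.array(['ab', '1', 'fj'], dtype=np.bytes_)
try:
    mixed_strings.astype(np.int64)
    conversion_failed = False
except ValueError as exc:
    conversion_failed = True
    print('conversion failed:', exc)
```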

<h3> Arithmetic with NumPy Arrays </h3>
<ul>
<li>Arrays are important because they enable you to express batch operations on data
without writing any for loops. NumPy users call this vectorization. Any arithmetic
operation between equal-size arrays applies the operation element-wise: </li>
</ul>

In [57]:
arr = np.array([[1, 2 , 3], [4, 5, 6]], dtype=np.float64)

In [59]:
arr * arr

array([[ 1.,  4.,  9.],
       [16., 25., 36.]])

In [60]:
arr - arr

array([[0., 0., 0.],
       [0., 0., 0.]])

<p>Arithmetic operations with scalars propagate the scalar argument to each element in
the array:</p>

In [61]:
1/arr

array([[1.        , 0.5       , 0.33333333],
       [0.25      , 0.2       , 0.16666667]])

In [62]:
arr ** 5

array([[1.000e+00, 3.200e+01, 2.430e+02],
       [1.024e+03, 3.125e+03, 7.776e+03]])

<p>Comparisons between arrays of the same size yield boolean arrays:</p>

In [70]:
arr2 = np.array([[0, 5, 1], 
                 [6, 5 , 10]], dtype=np.float64)

In [71]:
arr2 > arr

array([[False,  True, False],
       [ True, False,  True]])

In [72]:
arr2 == arr

array([[False, False, False],
       [False,  True, False]])

<h4>Important term : broadcasting </h4>

<ul>
    <li> Performing operations between arrays of different shapes is called <b>broadcasting</b>. </li>
    <li> Broadcasting is the process of making arrays with different shapes compatible for arithmetic operations. </li>
</ul>

<p> In the examples below, we say that the scalar value 4 has been broadcast to all of the elements of
the array: </p>

In [74]:
arr * 4

array([[ 4.,  8., 12.],
       [16., 20., 24.]])

In [75]:
arr + 4

array([[ 5.,  6.,  7.],
       [ 8.,  9., 10.]])
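Broadcasting also applies when both operands are arrays, as long as their shapes are compatible. The following sketch uses made-up arrays to show a row and a column being stretched across a 2 x 3 array, and one shape that cannot be broadcast.

```python
import numpy as np

grid = np.array([[1, 2, 3], [4, 5, 6]])  # shape (2, 3)
row = np.array([10, 20, 30])             # shape (3,)
column = np.array([[100], [200]])        # shape (2, 1)

# The row is added to every row of grid.
row_sum = grid + row
print(row_sum.tolist())     # [[11, 22, 33], [14, 25, 36]]

# The column is added to every column of grid.
column_sum = grid + column
print(column_sum.tolist())  # [[101, 102, 103], [204, 205, 206]]

# A length-2 array does not match the last dimension (3), so this fails.
try:
    grid + np.array([1, 2])
    shapes_compatible = True
except ValueError:
    shapes_compatible = False
print(shapes_compatible)    # False
```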

<h3> Indexing and Slicing</h3>
<ul>
    <li> <b>arr[start:stop:step]</b> for 1D arrays </li>
    <li> <b>arr[start:stop:step, start:stop:step, ...]</b> for arrays with more than one dimension </li>
</ul>

In [112]:
arr = np.arange(20)

In [113]:
arr

array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16,
       17, 18, 19])

In [114]:
arr[4]

4

In [115]:
arr[4:7]

array([4, 5, 6])

In [116]:
arr[:6]

array([0, 1, 2, 3, 4, 5])

In [117]:
arr[2:10:2]

array([2, 4, 6, 8])

<p><b>Tip: </b> In both NumPy arrays and Python lists, the slice [a:b] covers the half-open interval [a, b) in mathematical notation: it starts at index a and stops at index b-1.</p>

In [118]:
arr[3:5] = 256
# arr.__setitem__(slice(3,5), 256)

In [119]:
arr

array([  0,   1,   2, 256, 256,   5,   6,   7,   8,   9,  10,  11,  12,
        13,  14,  15,  16,  17,  18,  19])

<h4>Advanced Tip :</h4>
<ul>
    <li> <b>arr[start:stop:step]</b> means: <b>arr[slice(start, stop, step)]</b> </li>
    <li> <b> arr[index] </b> means: <b>arr.__getitem__(index)</b> </li>
    <li> <b>arr[index] = value </b> means: <b>arr.__setitem__(index, value)</b> </li>
</ul>
<p> NumPy builds on these Python hooks to provide its indexing and slicing syntax. </p><br>

In [120]:
arr_slice = arr[1:4]

In [121]:
arr_slice

array([  1,   2, 256])

In [122]:
arr_slice[0]=9999

In [123]:
arr

array([   0, 9999,    2,  256,  256,    5,    6,    7,    8,    9,   10,
         11,   12,   13,   14,   15,   16,   17,   18,   19])

<p> When you change values in arr_slice, the mutations are reflected in the original array arr, because a slice is a view on the same data rather than a copy. </p><br>
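If an independent copy is wanted instead of a view, the slice can be copied explicitly. A minimal sketch with made-up variable names:

```python
import numpy as np

original = np.arange(10)

view = original[1:4]                # shares memory with original
independent = original[1:4].copy()  # owns its own data

print(np.shares_memory(original, view))         # True
print(np.shares_memory(original, independent))  # False

# Changing the copy leaves the original untouched.
independent[0] = 9999
print(original[1])  # 1

# Changing the view changes the original.
view[0] = -1
print(original[1])  # -1
```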

<h4> Slicing in higher dimension arrays: </h4><br>

In [131]:
arr2d = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

In [132]:
arr2d

array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])

In [125]:
arr2d[1]

array([4, 5, 6])

In [126]:
arr2d[2][0]

7

In [127]:
arr2d[2, 1]

8

<h4>Indexing elements in a NumPy array</h4><br>
<img src="files/numpy_2darray.jpg"/>

In [129]:
arr2d[:2, 1:] # rows 0 and 1 (stop index 2 is excluded), columns 1 through the end

array([[2, 3],
       [5, 6]])

In [130]:
arr2d[:2, 2] 

array([3, 6])

In [133]:
arr2d[:, :1]

array([[1],
       [4],
       [7]])

In [134]:
arr2d[:, 1]

array([2, 5, 8])
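The last two results differ in shape: a slice keeps the dimension it is applied to, while an integer index removes it. The sketch below rebuilds the same 3 x 3 values under the made-up name `m` to show this.

```python
import numpy as np

m = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# An integer index drops that dimension: the result is 1D.
print(m[:, 1].shape)       # (3,)

# A slice keeps the dimension, even when it selects one column.
print(m[:, 1:2].shape)     # (3, 1)
print(m[:, :1].shape)      # (3, 1)

# Integers and slices can be mixed freely.
print(m[1, :2].tolist())   # [4, 5]
```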

<h4> Boolean Indexing </h4>
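<p> A comparison on an array produces a boolean array of the same shape. This boolean array can be passed when indexing the array to select only the elements where it is True. </p><br>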

In [140]:
data = np.random.randn(4, 4)

In [146]:
boolean_index = (data < -.5) | (data > .5)

In [147]:
boolean_index

array([[False,  True, False,  True],
       [ True, False,  True,  True],
       [False,  True, False,  True],
       [False, False, False,  True]])

In [148]:
data[boolean_index]

array([-0.62499107, -1.78162578, -1.19346015,  1.28649311,  1.43627954,
        1.35764248, -1.56761333,  1.40906143])
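Because the data above is random, the sketch below repeats the idea with fixed, made-up values so the results can be followed by hand. It also shows negating a mask with ~ and assigning through a mask.

```python
import numpy as np

samples = np.array([[-1.2, 0.3, 0.9],
                    [0.1, -0.7, 0.4]])

# Combine conditions with | (or) and & (and); Python's `or`/`and`
# keywords do not work element-wise on arrays.
mask = (samples < -0.5) | (samples > 0.5)

# Selection returns a 1D array of the matching elements, row by row.
print(samples[mask].tolist())   # [-1.2, 0.9, -0.7]

# ~ inverts the mask.
print(samples[~mask].tolist())  # [0.3, 0.1, 0.4]

# A mask can also be used on the left side of an assignment.
clipped = samples.copy()
clipped[mask] = 0.0
print(clipped.tolist())         # [[0.0, 0.3, 0.0], [0.1, 0.0, 0.4]]
```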

<h4> Fancy Indexing :</h4><br>

In [151]:
arr = np.random.randn(8, 4)

In [152]:
arr

array([[-0.57626695,  1.69600111,  1.24572976,  0.57590031],
       [-1.03964033,  0.28504213,  0.06454751, -0.47459827],
       [ 0.71509542, -0.56081488, -0.87134776, -0.75725122],
       [ 0.78368665,  0.25980447, -0.02587608,  0.47869727],
       [-0.75215995,  0.38554363,  1.13447456, -0.40413109],
       [ 0.05629162, -0.61290746,  1.42266215,  0.13370658],
       [ 0.83932092, -0.68305147, -0.9750427 ,  0.20258635],
       [ 0.70025208, -0.66512579, -1.6310869 , -1.19664332]])

In [153]:
arr[[4, 3, 0, 6]]

array([[-0.75215995,  0.38554363,  1.13447456, -0.40413109],
       [ 0.78368665,  0.25980447, -0.02587608,  0.47869727],
       [-0.57626695,  1.69600111,  1.24572976,  0.57590031],
       [ 0.83932092, -0.68305147, -0.9750427 ,  0.20258635]])

In [156]:
arr[[1, 6], [3, 1]]

array([-0.47459827, -0.68305147])
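In the last example, the two index lists are paired up: the result holds the elements at positions (1, 3) and (6, 1), not a 2 x 2 block. The sketch below uses a made-up array of consecutive integers so that this is easy to see, and shows two ways to select the rectangular block instead.

```python
import numpy as np

table = np.arange(32).reshape(8, 4)  # row r holds 4*r .. 4*r + 3

# Paired indices: elements (1, 3) and (6, 1).
pairs = table[[1, 6], [3, 1]]
print(pairs.tolist())       # [7, 25]

# Rows 1 and 6, then columns 3 and 1 of those rows.
block = table[[1, 6]][:, [3, 1]]
print(block.tolist())       # [[7, 5], [27, 25]]

# np.ix_ builds the same rectangular selection in one step.
same_block = table[np.ix_([1, 6], [3, 1])]
print(np.array_equal(block, same_block))  # True

# Unlike slicing, fancy indexing copies the data.
rows = table[[4, 3]]
rows[0, 0] = -1
print(table[4, 0])          # 16
```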