# NUMPY BASICS

NumPy, which stands for ‘Numerical Python’, is a library meant for scientific calculations. 
The basic data structure of NumPy is an array. 


link to NUMPY DOCUMENTATION : https://numpy.org/doc/1.20/ 

        
Others :
    
    https://www.w3schools.com/python/numpy/numpy_intro.asp
    https://www.tutorialspoint.com/numpy/index.htm
        
        
    
NumPy is the fundamental package for scientific computing in Python. 
It is a Python library that provides a multidimensional array object, various derived objects (such as masked arrays and matrices),
and an assortment of routines for fast operations on arrays, including mathematical, logical, shape manipulation, sorting, selecting, I/O, discrete Fourier transforms, basic linear algebra, basic statistical operations, random simulation and much more.


At the core of the NumPy package is the ndarray object.
This encapsulates n-dimensional arrays of homogeneous data types, with many operations being performed in compiled code for performance. There are several important differences between NumPy arrays and the standard Python sequences:

  

    
NumPy arrays have a fixed size at creation, unlike Python lists (which can grow dynamically).
Changing the size of an ndarray will create a new array and delete the original.

The elements in a NumPy array are all required to be of the same data type, and thus will be the same size in memory.
The exception: one can have arrays of (Python, including NumPy) objects, thereby allowing for arrays of different sized elements.

NumPy arrays facilitate advanced mathematical and other types of operations on large numbers of data. 
Typically, such operations are executed more efficiently and with less code than is possible using Python’s built-in sequences.
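
The differences listed above can be seen in a small sketch. The variable names below are illustrative only, and np.append is used here simply as one convenient way to show that "resizing" produces a new array.

```python
import numpy as np

# A Python list can grow in place and hold mixed types.
values = [1, 2, 3]
values.append("four")

# A NumPy array has a fixed size; np.append returns a new, larger array.
arr = np.array([1, 2, 3])
bigger = np.append(arr, 4)
print(arr.size, bigger.size)  # the original array keeps its size
print(bigger is arr)          # a different object

# All elements share one dtype, so a float assigned into an
# integer array is truncated to fit.
arr[0] = 9.7
print(arr)

# dtype=object is the exception: elements may differ in type and size.
mixed = np.array([1, "two", 3.0], dtype=object)
print(mixed.dtype)
```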



## Why is NumPy Fast?


Vectorization describes the absence of any explicit looping, indexing, etc., in the code - these things are taking place, of course, just “behind the scenes” in optimized, pre-compiled C code. Vectorized code has many advantages, among which are:


vectorized code is more concise and easier to read

fewer lines of code generally means fewer bugs

the code more closely resembles standard mathematical notation (making it easier, typically, to correctly code mathematical constructs)

vectorization results in more “Pythonic” code. Without vectorization, our code would be littered with inefficient and difficult to read for loops.
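
As a rough illustration of the points above, the sketch below computes the same element-wise product twice: once with an explicit loop and once with a vectorized expression. The arrays a and b are made-up sample data.

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0, 4.0])
b = np.array([10.0, 20.0, 30.0, 40.0])

# Explicit loop: index every element by hand.
looped = np.empty_like(a)
for i in range(len(a)):
    looped[i] = a[i] * b[i]

# Vectorized: the loop runs inside NumPy's compiled code.
vectorized = a * b

print(vectorized)
print(np.array_equal(looped, vectorized))
```

Both versions give the same result; the vectorized line is shorter and reads like the mathematical expression it implements.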

           
            



import numpy as np


In [5]:
np.array( [1.0, 2, 3.5, 0.2, True] )


array([1. , 2. , 3.5, 0.2, 1. ])

In [6]:
np.array( [1.0, 2, 3.5, 0.2, "True"] )

array(['1.0', '2', '3.5', '0.2', 'True'], dtype='<U32')

In [8]:
np.array[1.0, 2, 3.5, 0.2, True]

TypeError: 'builtin_function_or_method' object is not subscriptable

In [44]:
arr = np.array( 1.0, 2, 3.5, 0.2, "True" )


TypeError: array() takes from 1 to 2 positional arguments but 5 were given

In [11]:
type(np.array)

builtin_function_or_method

In [13]:
type(np.array( [1.0, 2, 3.5, 0.2, "True"] ) )

numpy.ndarray

In [14]:
np.array( [1.0, 2, 3.5, 0.2, "True"] )

array(['1.0', '2', '3.5', '0.2', 'True'], dtype='<U32')

In [42]:
#if we don't want the type to be displayed, print the array instead

arr = np.array([1.0, 2, 3.5, 0.2, "True"])

print(arr)

#the printed output is still a NumPy array: elements are displayed without commas, unlike lists

['1.0' '2' '3.5' '0.2' 'True']


### 0-D Arrays

0-D arrays, or Scalars, are the elements in an array. Each value in an array is a 0-D array.

In [25]:
#Create a 0-D array with value 42


arr = np.array(42)

print(arr)

42


###  1-D Arrays


An array that has 0-D arrays as its elements is called uni-dimensional or 1-D array.

These are the most common and basic arrays.

In [26]:
#Create a 1-D array containing the values 1,2,3,4,5:

arr = np.array([1, 2, 3, 4, 5])

print(arr)

[1 2 3 4 5]


### 2-D Arrays

An array that has 1-D arrays as its elements is called a 2-D array.

These are often used to represent matrices or 2nd-order tensors.

In [None]:
arr = np.array([[1, 2, 3], [4, 5, 6]])

print(arr)

### 3-D arrays
An array that has 2-D arrays (matrices) as its elements is called a 3-D array.

These are often used to represent a 3rd order tensor.

In [27]:
arr = np.array([[[1, 2, 3], [4, 5, 6]], [[1, 2, 3], [4, 5, 6]]])

print(arr)

[[[1 2 3]
  [4 5 6]]

 [[1 2 3]
  [4 5 6]]]


# ndarray

NumPy’s array class is called ndarray. It is also known by the alias array. Note that numpy.array is not the same as the Standard Python Library class array.array, which only handles one-dimensional arrays and offers less functionality. The more important attributes of an ndarray object are:

    


### ndarray.ndim
the number of axes (dimensions) of the array.

### ndarray.shape
the dimensions of the array. This is a tuple of integers indicating the size of the array in each dimension. For a matrix with n rows and m columns, shape will be (n,m). The length of the shape tuple is therefore the number of axes, ndim.
If, for example, you have a 2-D array with 2 rows and 3 columns, the shape of your array is (2, 3).

### ndarray.size
the total number of elements of the array. This is equal to the product of the elements of shape.


### ndarray.dtype
an object describing the type of the elements in the array. One can create or specify dtype’s using standard Python types. Additionally NumPy provides types of its own. numpy.int32, numpy.int16, and numpy.float64 are some examples.

### ndarray.itemsize
the size in bytes of each element of the array. For example, an array of elements of type float64 has itemsize 8 (=64/8), while one of type int32 has itemsize 4 (=32/8). It is equivalent to ndarray.dtype.itemsize.

### ndarray.data
the buffer containing the actual elements of the array. Normally, we won’t need to use this attribute because we will access the elements in an array using indexing facilities.
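
The relationships between these attributes can be checked on a small array. This is only a sketch: the array m is arbitrary sample data, and nbytes is an additional ndarray attribute not covered in the list above.

```python
import numpy as np

m = np.zeros((2, 3), dtype=np.float64)

print(m.ndim)      # number of axes
print(m.shape)     # size along each axis
print(m.size)      # total number of elements
print(m.itemsize)  # bytes per element
print(m.nbytes)    # total bytes used by the elements

# ndim is the length of shape, and size is the product of shape.
print(len(m.shape) == m.ndim)
print(m.size == m.shape[0] * m.shape[1])
```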



### Check Number of Dimensions?

NumPy arrays provide the ndim attribute, an integer that tells us how many dimensions the array has.

In [28]:
a = np.array(42)
b = np.array([1, 2, 3, 4, 5])
c = np.array([[1, 2, 3], [4, 5, 6]])
d = np.array([[[1, 2, 3], [4, 5, 6]], [[1, 2, 3], [4, 5, 6]]])

print(a.ndim)
print(b.ndim)
print(c.ndim)
print(d.ndim)

0
1
2
3


In [36]:
a = np.arange(15).reshape(3, 5)
a


array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14]])

In [37]:
a.shape

(3, 5)

In [38]:
a.ndim

2

In [39]:
a.dtype.name

'int32'

In [45]:
a.itemsize

4

In [46]:
a.size

15

In [49]:
b = np.array([1.2, 3.5, 5.1])
b


array([1.2, 3.5, 5.1])

In [50]:
b.dtype

dtype('float64')

In [54]:
#array transforms sequences of sequences into two-dimensional arrays,
#sequences of sequences of sequences into three-dimensional arrays, and so on.

b = np.array([(1.5,2,3), (4,5,6)])
b


array([[1.5, 2. , 3. ],
       [4. , 5. , 6. ]])

In [53]:
#The type of the array can also be explicitly specified at creation time:


c = np.array( [ [1,2], [3,4] ], dtype=complex )
c

array([[1.+0.j, 2.+0.j],
       [3.+0.j, 4.+0.j]])

# NumPy Array Indexing and Slicing


Indexing refers to extracting a single element from an array, while slicing refers to extracting a subset of elements from
an array. For 1-D arrays, both use the same syntax as Python lists.
Sharing one syntax between lists and NumPy arrays keeps the library easy to learn; multi-dimensional arrays extend it with comma-separated indices.



Array indexing is the same as accessing an array element. You can access an array element by referring to its index number.

The indexes in NumPy arrays start with 0, meaning that the first element has index 0, and the second has index 1 etc.
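
To compare list and array indexing side by side, here is a minimal sketch with made-up sample values. It also shows one behavioural difference that is easy to miss: slicing a list produces a copy, while a basic slice of a NumPy array is a view onto the same data.

```python
import numpy as np

items = [10, 20, 30, 40, 50]
arr = np.array(items)

# Same syntax, same elements.
print(items[0], arr[0])
print(items[1:4], arr[1:4])
print(items[-1], arr[-1])

# Writing through a list slice leaves the list untouched...
list_slice = items[1:4]
list_slice[0] = 99
print(items)

# ...but writing through an array slice changes the original array.
arr_slice = arr[1:4]
arr_slice[0] = 99
print(arr)
```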

In [2]:
#Get the first element from the following array:

import numpy as np

arr = np.array([1, 2, 3, 4])

print(arr[0])

1


In [3]:
#Get third and fourth elements from the following array and add them.



arr = np.array([1, 2, 3, 4])

print(arr[2] + arr[3])

7


In [4]:
#Access 2-D Arrays
#To access elements from 2-D arrays we can use comma separated integers representing the dimension and the index of the element.

#Example
#Access the 2nd element on 1st dim:

import numpy as np

arr = np.array([[1,2,3,4,5], [6,7,8,9,10]])

print('2nd element on 1st dim: ', arr[0, 1])

2nd element on 1st dim:  2


In [5]:
#Access 3-D Arrays
#To access elements from 3-D arrays we can use comma separated integers representing the dimensions and the index of the element.

#Example
#Access the third element of the second array of the first array:

import numpy as np

arr = np.array([[[1, 2, 3], [4, 5, 6]], [[7, 8, 9], [10, 11, 12]]])

print(arr[0, 1, 2])

6


In [6]:
#Negative Indexing
#Use negative indexing to access an array from the end.

#Example
#Print the last element from the 2nd dim:

import numpy as np

arr = np.array([[1,2,3,4,5], [6,7,8,9,10]])

print('Last element from 2nd dim: ', arr[1, -1])

Last element from 2nd dim:  10


In [7]:
arr = np.array([1, 2, 3, 4, 5, 6, 7])

print(arr[1:5])

[2 3 4 5]


In [10]:
#Slice elements from index 4 to the end of the array:



arr = np.array([1, 2, 3, 4, 5, 6, 7])

print(arr[4:])

[5 6 7]


In [11]:
#Slice elements from the beginning to index 4 (not included):


arr = np.array([1, 2, 3, 4, 5, 6, 7])

print(arr[:4])

[1 2 3 4]


In [12]:
#Return every other element from index 1 to index 5:



arr = np.array([1, 2, 3, 4, 5, 6, 7])

print(arr[1:5:2])

[2 4]


In [13]:
#Slicing 2-D Arrays
#Example
#From the second element, slice elements from index 1 to index 4 (not included):



arr = np.array([[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]])

print(arr[1, 1:4])

[7 8 9]


In [3]:
#Example
#From both elements, return index 2:

import numpy as np



arr = np.array([[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]])

print(arr[0:2, 2])

[3 8]


In [15]:
#From both elements, slice index 1 to index 4 (not included), this will return a 2-D array:

arr = np.array([[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]])
print(arr[0:2, 1:4])

[[2 3 4]
 [7 8 9]]


## Data Types in NumPy

NumPy has some extra data types, and refers to each kind of data type with one character, like i for integers, u for unsigned integers etc.

Below is a list of the kinds of data types in NumPy and the characters used to represent them (the value reported by dtype.kind).


i - integer

b - boolean

u - unsigned integer

f - float

c - complex float

m - timedelta

M - datetime

O - object

S - byte string

U - unicode string

V - fixed chunk of memory for other type ( void )
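
One way to see these characters in practice is to inspect dtype.kind on a few arrays. The sketch below uses arbitrary sample arrays; the labels in the dictionary are only for display.

```python
import numpy as np

samples = {
    "integer": np.array([1, 2, 3]),
    "unsigned integer": np.array([1, 2, 3], dtype=np.uint8),
    "float": np.array([1.5, 2.5]),
    "complex float": np.array([1 + 2j]),
    "boolean": np.array([True, False]),
    "unicode string": np.array(["apple", "banana"]),
    "byte string": np.array([b"abc"]),
    "object": np.array([None, "x"], dtype=object),
}

# dtype.kind is the one-character code for the kind of data stored.
kinds = {label: arr.dtype.kind for label, arr in samples.items()}
for label, kind in kinds.items():
    print(kind, "-", label)
```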


## Data Types in Python

By default, Python has these built-in data types:

strings - used to represent text data; the text is given in quote marks, e.g. "ABCD"

integer - used to represent integer numbers, e.g. -1, -2, -3

float - used to represent real numbers, e.g. 1.2, 42.42

boolean - used to represent True or False

complex - used to represent complex numbers, e.g. 1.0 + 2.0j, 1.5 + 2.5j



In [16]:
arr = np.array(['apple', 'banana', 'cherry'])

print(arr.dtype)


<U6


In [19]:
# For the types i, u, f, S and U we can define the size as well, e.g. 'i4' is a 4-byte (32-bit) integer.

#What if a Value Can Not Be Converted?
#If a dtype is given to which the elements cannot be cast, NumPy raises a ValueError.

#ValueError: in Python, a ValueError is raised when an argument has the right type but an inappropriate value.


arr = np.array([1, 2, 3, 4], dtype='i4')

print(arr)
print(arr.dtype)

[1 2 3 4]
int32
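
The failed-conversion case mentioned in the comments can be reproduced with a short sketch. The string values are made up for illustration; the exact wording of the error message may differ between NumPy versions, so it is printed rather than assumed.

```python
import numpy as np

error_message = None
try:
    # 'a' cannot be interpreted as an integer, so the conversion fails.
    np.array(["a", "2", "3"], dtype="i")
except ValueError as exc:
    error_message = str(exc)
    print("Conversion failed:", error_message)

# Strings that do look like integers convert without error.
converted = np.array(["1", "2", "3"], dtype="i")
print(converted, converted.dtype)
```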


## MIN MAX MEAN Functions

Unlike lists, NumPy arrays let you subset your data directly with conditions.

To do this, you compare the array using operators such as ‘<’ and ‘>’, which produces a Boolean mask that selects the matching elements. NumPy also has built-in functions such as max(), min()
and mean(), which compute summary statistics over an array directly.
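
A minimal sketch of condition-based selection, using made-up values: comparing an array with a number gives a Boolean array, and indexing with that Boolean array keeps only the elements where it is True.

```python
import numpy as np

scores = np.array([21, 5, 34, 12, 30, 6])

# The comparison is applied element by element.
mask = scores > 10
print(mask)

# Indexing with the mask keeps the elements where mask is True.
selected = scores[mask]
print(selected)

# Summary statistics of the selected elements.
print(selected.min(), selected.max(), selected.mean())
```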



numpy.mean(a, axis=None, dtype=None, out=None, keepdims=<no value>)

This function can take the five arguments shown above. Their purposes are described below:

a (the input array) :

It is a mandatory argument: the array whose values are averaged.



axis : 

It is an optional argument whose value can be an integer or a tuple of integers. 
It applies to multi-dimensional arrays. With axis=0 the function averages down the rows and returns one mean per column; with axis=1 it averages across the columns and returns one mean per row. With the default, axis=None, the mean of all elements is returned.



dtype :

It is an optional argument that sets the data type used to compute the mean. For integer input the default is float64; for floating-point input it is the same as the input dtype.


out :

It is an optional argument naming an existing array in which to store the result.
Its shape must match the shape of the expected output, not the shape of the input array. The default value of this argument is None.


keepdims :

It is an optional Boolean argument. 
If it is True, the reduced axes are kept in the result with size 1, so the result broadcasts correctly against the input array.


If out is None, the function returns a new array of mean values (a single scalar when axis is None);
otherwise it returns a reference to the output array.

In [20]:
np_array = np.array([6, 4, 9, 3, 1])

# Print array and mean values

print("The values of the one-dimensional NumPy array are:\n ", np_array)

print("The mean value of the one-dimensional array is:\n", np.mean(np_array))

# Create a two-dimensional array

np_array = np.array([[5, 3, 5], [5, 4, 3]])

# Print array and mean values

print("\nThe values of the two-dimensional NumPy array are:\n  ", np_array)

print("The mean values of the two-dimensional array are:\n", np.mean(np_array, axis=0))

The values of the one-dimensional NumPy array are:
  [6 4 9 3 1]
The mean value of the one-dimensional array is:
 4.6

The values of the two-dimensional NumPy array are:
   [[5 3 5]
 [5 4 3]]
The mean values of the two-dimensional array are:
 [5.  3.5 4. ]
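
The example above uses axis=0. As a complementary sketch, the same two-dimensional array can be averaged along axis=1, with and without keepdims. The variable names row_means, kept and centered are illustrative only.

```python
import numpy as np

np_array = np.array([[5, 3, 5], [5, 4, 3]])

# One mean per row.
row_means = np.mean(np_array, axis=1)
print(row_means, row_means.shape)

# keepdims=True keeps the reduced axis with size 1.
kept = np.mean(np_array, axis=1, keepdims=True)
print(kept.shape)

# The (2, 1) result broadcasts against the (2, 3) input,
# so each row can be centered on its own mean.
centered = np_array - kept
print(centered.sum(axis=1))
```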


In [22]:
#Use of max() function
#The syntax of the max() function is given below.

#Syntax:

#numpy.max(a, axis=None, out=None, keepdims=<no value>, initial=<no value>, where=<no value>)

# Create NumPy array of integers

np_array = np.array([21, 5, 34, 12, 30, 6])

# Find the maximum value from the array

max_value = np.max(np_array)

# Print the maximum value

print('The maximum value of the array is: ', max_value)

The maximum value of the array is:  34


In [23]:
np_array = np.array([21, 5, 34, 12, 30, 6])

# Find the minimum value from the array

min_value = np.min(np_array)

# Print the minimum value

print('The minimum value of the array is: ', min_value)

The minimum value of the array is:  5


## Converting Data Type on Existing Arrays

The best way to change the data type of an existing array is to make a copy of the array with the astype() method.

The astype() function creates a copy of the array, and allows you to specify the data type as a parameter.

The data type can be specified using a string, like 'f' for float, 'i' for integer etc. or you can use the data type directly
like float for float and int for integer.

In [24]:
#Example
#Change data type from float to integer by using 'i' as parameter value:


arr = np.array([1.1, 2.1, 3.1])

newarr = arr.astype('i')

print(newarr)
print(newarr.dtype)

[1 2 3]
int32


In [25]:
#Change data type from float to integer by using int as parameter value:


arr = np.array([1.1, 2.1, 3.1])
newarr = arr.astype(int)

print(newarr)
print(newarr.dtype)

[1 2 3]
int32


In [26]:
#Example
#Change data type from integer to boolean:



arr = np.array([1, 0, 3])

newarr = arr.astype(bool)

print(newarr)
print(newarr.dtype)

[ True False  True]
bool
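
Because astype() returns a copy, the original array is left untouched. The sketch below, with made-up values, also shows that converting floats to integers drops the fractional part rather than rounding.

```python
import numpy as np

arr = np.array([1.9, -2.9, 3.1])
newarr = arr.astype(int)

print(arr)     # the original floats are unchanged
print(newarr)  # fractional parts are dropped (truncation toward zero)

# The two arrays do not share memory.
print(np.shares_memory(arr, newarr))
```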


In [43]:
# Create an uninitialized array with 6 elements (the values are whatever was already in memory)

np.empty(6)

# o/p may vary

array([4.24399158e-314, 8.48798317e-314, 1.27319747e-313, 1.69759663e-313,
       2.12199579e-313, 2.54639495e-313])

In [38]:
np.ones(2)


array([1., 1.])

In [39]:
np.zeros(2)


array([0., 0.])

In [46]:
#np.arange creates an array of evenly spaced values.
#To do this, you specify the first number, the stop value (which is not included), and the step size.

np.arange(2, 9, 2)

array([2, 4, 6, 8])

In [45]:
#You can also use np.linspace() to create an array with values that are spaced linearly in a specified interval:
    
np.linspace(0, 10, num=5)

array([ 0. ,  2.5,  5. ,  7.5, 10. ])
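
The main practical difference between the two functions is how they treat the end of the interval. A short sketch, using arbitrary bounds: np.arange stops before the stop value, while np.linspace includes it unless endpoint=False is passed.

```python
import numpy as np

# arange: start, stop (excluded), step
stepped = np.arange(2, 10, 2)
print(stepped)

# linspace: start, stop (included by default), number of points
inclusive = np.linspace(2, 10, num=5)
print(inclusive)

# endpoint=False leaves the stop value out, like arange
exclusive = np.linspace(2, 10, num=4, endpoint=False)
print(exclusive)
```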

In [47]:
#While the default data type is floating point (np.float64),
#you can explicitly specify which data type you want using the dtype keyword.

x = np.ones(2, dtype=np.int64)
x

array([1, 1], dtype=int64)

## SORTING

In addition to sort, which returns a sorted copy of an array, you can use:

argsort, which is an indirect sort along a specified axis,

lexsort, which is an indirect stable sort on multiple keys,

searchsorted, which will find elements in a sorted array, and

partition, which is a partial sort.

In [50]:
arr = np.array([2, 1, 5, 3, 7, 4, 6, 8])

#we can quickly sort the numbers in ascending order with:

np.sort(arr)

array([1, 2, 3, 4, 5, 6, 7, 8])
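
The other sorting helpers listed above can be sketched on the same array. The key arrays a and b used for lexsort are sample data chosen for illustration.

```python
import numpy as np

arr = np.array([2, 1, 5, 3, 7, 4, 6, 8])

# argsort returns the indices that would sort the array.
order = np.argsort(arr)
print(order)
print(arr[order])

# searchsorted finds where a value would be inserted in a sorted array.
sorted_arr = np.sort(arr)
position = np.searchsorted(sorted_arr, 5)
print(position)

# partition places the k-th smallest element at index k; smaller
# values come before it, larger after, in no guaranteed order.
part = np.partition(arr, 2)
print(part[2])

# lexsort sorts by the last key first: here by a, then by b for ties.
a = np.array([1, 5, 1, 4, 3, 4, 4])
b = np.array([9, 4, 0, 4, 0, 2, 1])
lex_order = np.lexsort((b, a))
print(lex_order)
```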

In [52]:
## CONCATENATION

a = np.array([1, 2, 3, 4])
b = np.array([5, 6, 7, 8])

np.concatenate((a, b))

array([1, 2, 3, 4, 5, 6, 7, 8])

In [55]:
# 2-D array

x = np.array([[1, 2], [3, 4]])
y = np.array([[5, 6]])

np.concatenate((x, y), axis=0)

array([[1, 2],
       [3, 4],
       [5, 6]])
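
Concatenating along axis=1 instead requires the arrays to have the same number of rows. A small sketch with the same x and y: y is transposed so that it has two rows, and the untransposed attempt is shown to fail.

```python
import numpy as np

x = np.array([[1, 2], [3, 4]])
y = np.array([[5, 6]])

# y.T has shape (2, 1), so it can be attached as a new column.
side_by_side = np.concatenate((x, y.T), axis=1)
print(side_by_side)

# Without the transpose the row counts differ (2 vs 1), which is an error.
mismatch_error = None
try:
    np.concatenate((x, y), axis=1)
except ValueError as exc:
    mismatch_error = str(exc)
    print("Cannot concatenate:", mismatch_error)
```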