# **READ Arrays**

Now that I have studied different methods to create arrays, it's time to read them.

In [1]:
import numpy as np

## **Inspecting Arrays**

This is crucial because we need to know what kind of array/dataset we have before modifying it. It also helps when debugging.

In [2]:
ins_1d = np.array([10, 20, 30, 40, 50, 60, 70, 80, 90, 100]) #1d array

ins_2d = np.array([ #2d array
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12]
])

ins_3d = np.array([ #3d array
    [[100, 200, 300],
        [400, 500, 600]],

    [[719, 819, 919],
    [1019, 1119, 1129]]
])

In [3]:
# ways to inspect array:
ins_1d.shape # gives (10,), which means one axis holding 10 elements (a 1d array has no rows/columns)
ins_1d.ndim # gives the number of dimensions/axes of the given array, which in this case is only 1
ins_1d.size # total number of elements present in the array -> 10
ins_1d.dtype # gives the data type of the elements; an array holds only one data type
ins_2d.ravel() # flattens an array with ndim >= 2 into a 1d array (a view when possible, otherwise a copy)
ins_2d.T # gives the transpose of the array -> places rows as columns and vice versa
ins_2d.copy() # returns a deep (new) copy of the array; changing the copy doesn't change the main array
ins_2d.view() # returns a new array object that shares the same data, so changes made through it will change the original array
np.info(ins_3d) # prints the low-level details of the array (shape, strides, dtype, memory layout)

class:  ndarray
shape:  (2, 2, 3)
strides:  (48, 24, 8)
itemsize:  8
aligned:  True
contiguous:  True
fortran:  False
data pointer: 0xe9cdb90
byteorder:  little
byteswap:  False
type: int64
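
The cell above only displays the last result, so here is a small sketch that prints the attributes explicitly and checks the copy/view difference. The names `arr`, `arr_copy` and `arr_view` are made up for this example, and `np.shares_memory` is used only to confirm which result shares data with the original.

```python
import numpy as np

arr = np.array([
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12]
])

# basic metadata
shape = arr.shape        # (3, 4): 3 rows, 4 columns
ndim = arr.ndim          # 2
size = arr.size          # 12
flat = arr.ravel()       # 1d array with 12 elements
transposed = arr.T       # shape (4, 3)
print(shape, ndim, size, flat.shape, transposed.shape)

# copy() owns its data: changing the copy leaves arr alone
arr_copy = arr.copy()
arr_copy[0, 0] = -1

# view() shares data with arr: changing the view changes arr too
arr_view = arr.view()
arr_view[0, 1] = -2

print(arr[0, 0], arr[0, 1])              # 1 -2
print(np.shares_memory(arr, arr_copy))   # False
print(np.shares_memory(arr, arr_view))   # True
```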


In [4]:
data_1d = np.array([10, 20, 30, 40, 50, 60, 70, 80, 90, 100]) #1d array

data_2d = np.array([ #2d array
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12]
])

data_3d = np.array([ #3d array
    [[100, 200, 300],
        [400, 500, 600]],

    [[719, 819, 919],
    [1019, 1119, 1129]]
])

## **Indexing**

Use indexing when the position of the data point is known and we need to retrieve only one value from the array.

In [5]:
# indexing 1d arrays is simple as indexing in lists

data_1d[0] # gives first value -> 10; in most languages the first item is indexed by 0, not 1
data_1d[-1] # gives last value -> 100
data_1d[4] # 5th value


50

In [6]:
# indexing 2d arrays

# since this is a 2d array, we have 2 dimensions. first we need to get the desired row and then only the column

data_2d[0][3] # depicts 1st row and 4th column (0 is first, 3 is fourth) gives -> 4
data_2d[2][2] # 3rd row 3rd column/data -> 11

11

In [7]:
# indexing 3d arrays

# same as a 2d array but with one extra layer of indexing: we give 3 indices, where the first picks the block (2d layer) and the second and third pick the row and column respectively

data_3d[0][0][2] # first block, first row, third column -> 300

300
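
The chained form `arr[i][j]` used above works, but NumPy also accepts all indices in a single pair of brackets, which avoids building an intermediate array for each step. A small sketch comparing the two forms on the same data (the variable names `chained_2d`, `tuple_2d` and so on are just for this example):

```python
import numpy as np

data_2d = np.array([
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12]
])

data_3d = np.array([
    [[100, 200, 300],
     [400, 500, 600]],

    [[719, 819, 919],
     [1019, 1119, 1129]]
])

chained_2d = data_2d[0][3]      # row 0 first, then element 3 of that row
tuple_2d = data_2d[0, 3]        # same element in one indexing step

chained_3d = data_3d[0][0][2]
tuple_3d = data_3d[0, 0, 2]

last_3d = data_3d[-1, -1, -1]   # negative indices count from the end on every axis

print(chained_2d, tuple_2d)     # 4 4
print(chained_3d, tuple_3d)     # 300 300
print(last_3d)                  # 1129
```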

## **Slicing Arrays**

Slicing is used when we need a part of the array from A to B. Slicing by itself does not change the original array, but the result is a view that shares data with it, so assigning into the slice does change the original. That's why we store the sliced array in a variable, and call `.copy()` on it if we plan to modify it.

Use it when you need a contiguous range of elements, like from index x to index y.
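
A basic slice is a view onto the same data as the array it came from, so writing into the slice writes into the original as well. The sketch below shows this with throwaway names (`original`, `window`, `safe`) that are not part of the notebook:

```python
import numpy as np

original = np.array([10, 20, 30, 40, 50])

# a basic slice is a view: it shares data with original
window = original[1:4]
window[0] = 999
print(original)          # [ 10 999  30  40  50]

# .copy() gives an independent array
safe = original[1:4].copy()
safe[1] = -1
print(original)          # still [ 10 999  30  40  50]
```

If the slice is only read and never assigned to, the view is fine and avoids copying data.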

In [8]:
slice_1d = np.array([10, 20, 30, 40, 50, 60, 70, 80, 90, 100]) #1d array

slice_2d = np.array([ #2d array
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12]
])

slice_3d = np.array([ #3d array
    [[1002, 200, 300, 283782, 72632],
    [400, 500, 600, 238723, 2376237],
    [99999, 89898, 989898, 93983298, 23723]],

    [[719, 819, 919, 5678, 999],
    [1019, 1119, 1129, 100, 7777],
    [263723, 238723, 2365, 237632, 37632]],

    [[7149, 8419, 91349, 567348, 999],
    [101439, 134119, 143129, 12300, 723777],
    [2633443723, 23834723, 2365, 23742632, 276238]]
])

In [9]:
# slicing 1d array

# arr[start:stop]
sliced_1d = slice_1d[0:3] # gives the array from index 0 to index 2, or first element to 3rd element -> [10, 20, 30]
sliced_1d

# arr[start:upto]
sliced_1d_from_first = slice_1d[:6] # from 0 index to index 5 -> [10, 20, 30, 40, 50, 60]
sliced_1d_from_first

# arr[from:end]
sliced_1d_to_last = slice_1d[6:] # from index 6 to last -> [70, 80, 90, 100]
sliced_1d_to_last

# arr[::step]
sliced_1d_steps = slice_1d[::3] # stepping 3 from first number and on
sliced_1d_steps

# arr[::-1]
sliced_1d_reverse = slice_1d[::-1] # reverse
sliced_1d_reverse

array([100,  90,  80,  70,  60,  50,  40,  30,  20,  10])
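
The patterns above can also be combined into the full `arr[start:stop:step]` form, and negative numbers work for any of the three parts. A short sketch on the same values (the result names are only for this example):

```python
import numpy as np

slice_1d = np.array([10, 20, 30, 40, 50, 60, 70, 80, 90, 100])

every_third = slice_1d[::3]      # indices 0, 3, 6, 9
odd_positions = slice_1d[1:8:2]  # indices 1, 3, 5, 7
last_three = slice_1d[-3:]       # the last 3 elements
backwards = slice_1d[8:2:-2]     # indices 8, 6, 4 (stop index 2 is excluded)

print(every_third)    # [ 10  40  70 100]
print(odd_positions)  # [20 40 60 80]
print(last_three)     # [ 80  90 100]
print(backwards)      # [90 70 50]
```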

In [10]:
# slicing 2d array

# accessing one row, all columns
sliced_2d_one_row = slice_2d[1, :] # selects only the row at index 1 (the 2nd row) and all its columns; if we add : after 1 (slice_2d[1:, :]), it shows all rows from index 1 onward with their columns
sliced_2d_one_row # [5, 6, 7, 8]

# accessing one column, all rows
sliced_2d_one_column = slice_2d[:, 1] # selects 2nd column of all the rows
sliced_2d_one_column



# arr[rows, columns], rectangular subset of the array
sliced_2d = slice_2d[1:3, 1:3] # this means give rows 1 and 2, with their columns 1 and 2
sliced_2d

sliced_2d_b = slice_2d[0:3, 3:4]
sliced_2d_b

# arr[::steps, ::steps]
sliced_2d_steps = slice_2d[::2, ::2] # all row and column by 2 steps
sliced_2d_steps

sliced_2d_steps_row = slice_2d[1::2, :] # from row 2, show all rows with 2 steps and show all columns of those rows
sliced_2d_steps_row

sliced_2d_steps_column = slice_2d[0:3, ::2] # generate rows from row 1 to row 3 with their columns with 2 steps
sliced_2d_steps_column

array([[ 1,  3],
       [ 5,  7],
       [ 9, 11]])
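
One detail worth checking in 2d slicing is the shape of the result: an integer index drops that axis, while a slice of length 1 keeps it. That is why `slice_2d[:, 3]` and `slice_2d[0:3, 3:4]` hold the same values but have different shapes. A small sketch (the result names are only for this example):

```python
import numpy as np

slice_2d = np.array([
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12]
])

row_1d = slice_2d[1, :]       # integer index drops the row axis
row_2d = slice_2d[1:2, :]     # slice keeps it
col_1d = slice_2d[:, 3]       # integer index drops the column axis
col_2d = slice_2d[0:3, 3:4]   # slice keeps it, giving a column shape
block = slice_2d[1:3, 1:3]    # rows 1-2, columns 1-2

print(row_1d.shape, row_2d.shape)   # (4,) (1, 4)
print(col_1d.shape, col_2d.shape)   # (3,) (3, 1)
print(block)
# [[ 6  7]
#  [10 11]]
```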

In [11]:
# slicing 3d array

sliced_3d = slice_3d[1, 1, 0:2]
sliced_3d

sliced_3d_steps = slice_3d[::2, :2, 1::2]
sliced_3d_steps

array([[[   200, 283782],
        [   500, 238723]],

       [[  8419, 567348],
        [134119,  12300]]])
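
The 3d case follows the same rule, one slice per axis in the order block, row, column. To make the picked positions easier to follow, this sketch uses a hypothetical `cube` built with `np.arange` (not the `slice_3d` array above), so each value tells you where it sits:

```python
import numpy as np

# hypothetical 3 x 3 x 5 array holding 0..44, used only for illustration
cube = np.arange(45).reshape(3, 3, 5)

# block 1, row 1, columns 0 and 1 -> two integer indices drop two axes
picked = cube[1, 1, 0:2]
print(picked, picked.shape)     # [20 21] (2,)

# every 2nd block (0 and 2), first 2 rows, columns 1 and 3
stepped = cube[::2, :2, 1::2]
print(stepped.shape)            # (2, 2, 2)
print(stepped)
# [[[ 1  3]
#   [ 6  8]]
#
#  [[31 33]
#   [36 38]]]
```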

## **Boolean Indexing**

Use this when you need to know what numbers are present by performing Logics: is greater, is smaller, is equal, and, or, etc

In [12]:
bool_1d = np.array([5, 12, 3, 20, 8, 15, 1, 18, 7, 10])

bool_2d = np.array([
    [100, 2, 50, 7, 10, 150, 1, 80, 25, 5],
    [12, 1000, 3, 400, 5, 600, 7, 800, 9, 10],
    [90, 8, 70, 6, 50, 4, 30, 2, 10, 0]
])

bool_3d = np.array([
    [[1, 11, 21, 31, 41, 51, 61, 71, 81, 91],
     [2, 12, 22, 32, 42, 52, 62, 72, 82, 92],
     [3, 13, 23, 33, 43, 53, 63, 73, 83, 93]],

    [[100, 200, 300, 400, 500, 600, 700, 800, 900, 1000],
     [101, 201, 301, 401, 501, 601, 701, 801, 901, 1001],
     [102, 202, 302, 402, 502, 602, 702, 802, 902, 1002]]
])

In [13]:
# Boolean indexing in 1d

# >, <, == and !=
great_9 = bool_1d[bool_1d > 9] # gives array with numbers greater than 9
great_9

less_9 = bool_1d[ bool_1d < 9 ] # gives array with numbers lesser than 9
less_9

equal_5 = bool_1d[ bool_1d == 5] # gives if there is/are number equal to 5
equal_5

except_5 = bool_1d [bool_1d != 5] # wow, i thought it would give True, but != builds an element-wise mask, so indexing with it excludes the number 5
except_5

# we dont use "and", "or" here, we use &, | because we need to do element-wise operations in numpy boolean indexing

# &, | and ~
or_5_20 = bool_1d [(bool_1d == 5) | (bool_1d == 20)] # show if either of them is true (because we told it to find if number either is 5 or 20)
or_5_20

and_5_20 = bool_1d [(bool_1d == 5) & (bool_1d == 20)] # it shows nothing because number/numbers can't be both 5 and 20 at the same time
and_5_20

# but if..
and_5_2 = bool_1d [(bool_1d < 5) & (bool_1d > 2)] # it gives result because number/numbers can be both greater than 2 and less than 5
and_5_2

# however..
and_5_10 = bool_1d [(bool_1d < 5) & (bool_1d > 10)] # it will give nothing because number/numbers can't be less than 5 and greater than 10 at the same time... less than 5 is -inf to 4, but greater than 10 is 11 to +inf
and_5_10

not_5 = bool_1d [~(bool_1d == 5)] # excludes 5
not_5

array([12,  3, 20,  8, 15,  1, 18,  7, 10])
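
It helps to look at the mask itself before using it as an index: the comparison returns an array of True/False with the same shape as the data, and indexing keeps only the positions that are True. The sketch below also shows what happens if Python's `and` is used instead of `&`. The names `mask`, `between` and `and_failed` are only for this example.

```python
import numpy as np

bool_1d = np.array([5, 12, 3, 20, 8, 15, 1, 18, 7, 10])

# the comparison alone gives the mask
mask = bool_1d > 9
print(mask)
# [False  True False  True False  True False  True False  True]

print(bool_1d[mask])    # [12 20 15 18 10]
print(mask.sum())       # 5 -> True counts as 1, so this counts the matches

# each condition needs its own parentheses because & binds tighter than < and >
between = bool_1d[(bool_1d > 2) & (bool_1d < 5)]
print(between)          # [3]

# Python's "and" tries to turn a whole array into one True/False and fails
try:
    bool_1d[(bool_1d > 2) and (bool_1d < 5)]
    and_failed = False
except ValueError:
    and_failed = True
print(and_failed)       # True
```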

In [14]:
# 2d boolean indexing

# the comparison creates the mask: an array of True and False depending upon the logic. Then, indexing with a mask that has the same shape as the array gives the matching values as a 1d array.

# >, <, ==, !=
great_2d = bool_2d [(bool_2d > 5)] # get all the values over 5
great_2d

great_2d_2 = bool_2d [bool_2d[:, 2] > 5] # get all rows where the value in column index 2 (the 3rd column) is greater than 5
great_2d_2

## great_2d_3 = bool_2d [bool_2d[1] > 5] # try to filter using the row at index 1 (this gives an IndexError because bool_2d has (3, 10) shape, but the shape of (bool_2d[1] > 5) is (10,). a 1d mask is matched against the first axis, which has length 3, so the mask length doesn't fit)
## great_2d_3

small_2d = bool_2d [bool_2d < 10]
small_2d

small_2d_2 = bool_2d [bool_2d[:, 3] < 9] # get all rows if the 4th column of the row is less than 9
small_2d_2

equal_2d = bool_2d [bool_2d[:, 0] == 100]
equal_2d

equal_100 = bool_2d [bool_2d == 100]
equal_100

or_2d = bool_2d [(bool_2d > 10) | (bool_2d < 5)]
or_2d

or_2d_2 = bool_2d [(bool_2d[:, 2] > 5) | (bool_2d[:, 7] < 20)]
or_2d_2

and_2d = bool_2d [(bool_2d > 20) & (bool_2d < 100)]
and_2d

and_2d_2 = bool_2d [(bool_2d[:, 2] > 10) & (bool_2d[:, 3] < 6)] # gives nothing because no row has both column 2 > 10 and column 3 < 6
and_2d_2


array([], shape=(0, 10), dtype=int64)
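
The results above come in two shapes depending on the mask. A mask with the same shape as the array picks individual values and returns a 1d array, while a 1d mask of length 3 picks whole rows. The sketch below checks both, plus the mismatch error from the commented-out line. Putting the length-10 mask on the column axis is shown as one possible way to make it fit, not as what the original line intended. The result names are only for this example.

```python
import numpy as np

bool_2d = np.array([
    [100, 2, 50, 7, 10, 150, 1, 80, 25, 5],
    [12, 1000, 3, 400, 5, 600, 7, 800, 9, 10],
    [90, 8, 70, 6, 50, 4, 30, 2, 10, 0]
])

# same-shape mask: picks single values, result is 1d
values = bool_2d[bool_2d > 500]
print(values)               # [1000  600  800]

# length-3 mask built from one column: picks whole rows
row_mask = bool_2d[:, 2] > 5
rows = bool_2d[row_mask]
print(row_mask)             # [ True False  True]
print(rows.shape)           # (2, 10)

# length-10 mask on the row axis does not fit an axis of length 3
try:
    bool_2d[bool_2d[1] > 5]
    mismatch_failed = False
except IndexError:
    mismatch_failed = True
print(mismatch_failed)      # True

# the same mask fits the column axis, which has length 10
cols = bool_2d[:, bool_2d[1] > 5]
print(cols.shape)           # (3, 8)
```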