
# Arrays

## Create an array

In [2]:
import numpy as np

In [2]:
# create array
l = [1.0, 2.0, 3.0]
a = np.array(l)
# display array
print(a)
# display array shape
print(a.shape)
# display array data type
print(a.dtype)

[1. 2. 3.]
(3,)
float64
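
As a small extension of the cell above, the sketch below shows how the element type is inferred from the list and how an explicit `dtype` overrides it. The names `ints` and `floats` are illustrative and do not appear elsewhere in this notebook.

```python
import numpy as np

# a list of Python ints gives an integer array by default
ints = np.array([1, 2, 3])
# an explicit dtype overrides the inferred type
floats = np.array([1, 2, 3], dtype=np.float64)

# 'i' means signed integer; the exact width depends on the platform
print(ints.dtype.kind)
print(floats.dtype)
# number of dimensions
print(floats.ndim)
```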


## Functions to create arrays

In [3]:
# create an uninitialized array (values are arbitrary until assigned)
a = np.empty([3,3])
print(a)

[[ 3.45126646e-31 -6.90253292e-31  1.72563323e-31]
 [-6.90253292e-31  1.50130091e-30 -4.65920972e-31]
 [ 1.72563323e-31 -4.65920972e-31  2.67473151e-31]]


In [5]:
# create zero array
a = np.zeros([3,5])
print(a)

[[0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0.]]


In [6]:
# create array of ones
a = np.ones([5])
print(a)

[1. 1. 1. 1. 1.]
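
Because `np.empty` does not initialize its memory, its contents should be assigned before they are read. The sketch below fills such an array in place and compares it with `np.full`, which may be a more direct choice when the fill value is known up front. The variable names are illustrative.

```python
import numpy as np

# np.empty only allocates memory, so assign values before reading them
a = np.empty((2, 3))
a[:] = 7.0  # fill every element in place
print(a)

# np.full builds an equivalent array in one step
b = np.full((2, 3), 7.0)
print(np.array_equal(a, b))

# the shape can be a tuple, and the dtype can be set explicitly
z = np.zeros((2, 2), dtype=int)
print(z.dtype.kind)
```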


## Combining arrays

### Vertical stack (vstack)

In [7]:
# create array with vstack
# create first array
a1 = np.array([1,2,3])
print(a1)
# create second array
a2 = np.array([4,5,6])
print(a2)
# vertical stack
a3 = np.vstack((a1, a2))
print(a3)
print(a3.shape)

[1 2 3]
[4 5 6]
[[1 2 3]
 [4 5 6]]
(2, 3)


### Horizontal stack (hstack)

In [8]:
# create array with hstack
# create first array
a1 = np.array([1,2,3])
print(a1)
# create second array
a2 = np.array([4,5,6])
print(a2)
# horizontal stack
a3 = np.hstack((a1, a2))
print(a3)
print(a3.shape)

[1 2 3]
[4 5 6]
[1 2 3 4 5 6]
(6,)
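
The two cells above stack 1D arrays. As a hedged illustration of how the same functions behave on 2D inputs, the sketch below stacks two 2x2 arrays; `m1` and `m2` are made-up example arrays.

```python
import numpy as np

m1 = np.array([[1, 2], [3, 4]])
m2 = np.array([[5, 6], [7, 8]])

# rows of m2 are appended below m1
v = np.vstack((m1, m2))
# columns of m2 are appended to the right of m1
h = np.hstack((m1, m2))

print(v.shape)  # (4, 2)
print(h.shape)  # (2, 4)
print(h)
```

For 2D inputs, `vstack` requires matching column counts and `hstack` requires matching row counts.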


## From list to array

In [5]:
# create 1D array
data = [11, 22, 33, 44, 55]
print(type(data))
# array of data
data = np.array(data)
print(data)
print(type(data))

<class 'list'>
[11 22 33 44 55]
<class 'numpy.ndarray'>


In [6]:
# create 2D array
# list of data
data = [[11, 22],
[33, 44],
[55, 66]]
print(type(data))
# array of data
data = np.array(data)
print(data)
print(type(data))

<class 'list'>
[[11 22]
 [33 44]
 [55 66]]
<class 'numpy.ndarray'>


## Array indexing

In [13]:
# define array
data = np.array([11, 22, 33, 44, 55])
# index data
print(data[0])
print(data[4])
# you can use negative indices to retrieve values offset from the end of the array.
# for example, the index -1 refers to the last item in the array, -2 to the second-to-last item, and so on.
print(data[-1])
print(data[-4])

11
55
55
22


In [18]:
# index 2D array
from numpy import array
# define array
data = array([
[11, 22],
[33, 44],
[55, 66]])
# index data
print(data[0,0])
print(data[1,1])
# all items in the first row
print(data[0,])
# all items in the third row
print(data[2,])

11
44
[11 22]
[55 66]
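
The cell above selects single items and whole rows. A minimal sketch of the complementary cases, selecting a column and using a negative row index, might look like this (same example data, using the `np` alias):

```python
import numpy as np

data = np.array([
    [11, 22],
    [33, 44],
    [55, 66]])

# all items in the first column
print(data[:, 0])  # [11 33 55]
# data[0] and data[0, :] select the same row as data[0,]
print(np.array_equal(data[0], data[0, :]))  # True
# the last row, using a negative index
print(data[-1])  # [55 66]
```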


## Slicing

### Slicing is specified using the colon operator : with a from index before the colon and a to index after it.
### The slice starts at the from index and ends one item before the to index.

In [26]:
# slice a 1D array
# define array
data = np.array([11, 22, 33, 44, 55])
# this prints all elements in the array
print(data[:])
# this prints the first item in the array
print(data[0:1])
# this prints the last 2 items in the array using negative indices
print(data[-2:])

[11 22 33 44 55]
[11]
[44 55]
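
One detail worth knowing when slicing: a basic slice of a NumPy array is a view onto the same memory, not a copy, so writing through the slice changes the original. The sketch below illustrates this; `view` and `safe` are illustrative names.

```python
import numpy as np

data = np.array([11, 22, 33, 44, 55])

# a basic slice is a view onto the same memory
view = data[1:3]
view[0] = 99
print(data)  # [11 99 33 44 55]

# .copy() gives an independent array
safe = data[1:3].copy()
safe[0] = 0
print(data)  # [11 99 33 44 55]
```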


## Split input and output features

It is common to split your loaded data into input variables (X) and the output variable (y). We
can do this by slicing all rows and all columns up to, but not including, the last column, then
separately indexing the last column. For the input features, we can select all rows and all columns
except the last one by specifying : in the rows index and :-1 in the columns index.

### X = data[:, :-1]

For the output column, we can select all rows again using : and index just the last column
by specifying the -1 index.

### y = data[:, -1]

In [27]:
# split input and output data
# define array
data = array([
[11, 22, 33],
[44, 55, 66],
[77, 88, 99]])
# separate data
X, y = data[:, :-1], data[:, -1]
print(X)
print(y)

[[11 22]
 [44 55]
 [77 88]]
[33 66 99]
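
Note that indexing with `-1` drops a dimension, so `y` comes back as a 1D array. If a 2D column is needed instead, slicing with `-1:` is one way to keep it. The sketch below checks the shapes; `y_col` is an illustrative name.

```python
import numpy as np

data = np.array([
    [11, 22, 33],
    [44, 55, 66],
    [77, 88, 99]])

X, y = data[:, :-1], data[:, -1]
print(X.shape)  # (3, 2)
print(y.shape)  # (3,)

# slicing instead of indexing keeps the result two-dimensional
y_col = data[:, -1:]
print(y_col.shape)  # (3, 1)
```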


## Split train and test rows

It is common to split a loaded dataset into separate train and test sets. This is a splitting of
rows where some portion will be used to train the model and the remaining portion will be used
to estimate the skill of the trained model. This involves slicing all columns by specifying :
in the second dimension index. The training dataset is all rows from the beginning up to, but
not including, the split point, and the test dataset is all rows from the split point to the end.

### train = data[:split, :]

### test = data[split:, :]

In [28]:
# split train and test data
# define array
data = array([
[11, 22, 33],
[44, 55, 66],
[77, 88, 99]])
# separate data
split = 2
train,test = data[:split,:],data[split:,:]
print(train)
print(test)

[[11 22 33]
 [44 55 66]]
[[77 88 99]]
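
In practice the split point is often derived from a fraction of the rows rather than hard-coded. The sketch below is one possible way to do that; `split_rows` and its `train_fraction` parameter are hypothetical names, not part of NumPy, and the rows are taken in their existing order without shuffling.

```python
import numpy as np

def split_rows(data, train_fraction=0.67):
    """Hypothetical helper: split rows into train and test by a fraction."""
    # number of rows that go into the training set
    split = int(len(data) * train_fraction)
    return data[:split, :], data[split:, :]

data = np.array([
    [11, 22, 33],
    [44, 55, 66],
    [77, 88, 99]])

train, test = split_rows(data)
print(train.shape)  # (2, 3)
print(test.shape)   # (1, 3)
```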
