# C. Structure and Creation

A `Numpy array` is the core data structure for handling data with Numpy.
Numpy provides a variety of vectorized functions and attributes, and arrays are how you access that functionality quickly and easily.
It is therefore important to understand the structure of Numpy arrays to get the most out of `Numpy`.

In lesson C, we'll cover the basic attributes of a Numpy array (its values, `shape`, and `dtype`) and learn how to create arrays based on these attributes.


### _Objective_
1. **Numpy Array Attributes**: Understanding the structure of Numpy arrays and their attributes (values, `shape`, `dtype`, etc.)
2. **Numpy Array Creation**: Understanding how to create Numpy arrays 

In [None]:
import numpy as np # importing Numpy

# \[1. Structure of Numpy Arrays\]

When creating a Numpy array, you need to set the values of its elements and the **shape** of the array. Note that all elements of a Numpy array must be of the same data type.

Let's take a closer look at two important features of Numpy arrays: **shape** and **data type**.

## 1. Data Type

+ As mentioned in Basic Python, a variable's type is set automatically when a value is assigned to it, and a variable can hold values of any data type.

+ In contrast, Numpy arrays are homogeneous and only accept values of the **same data type** for elements. 

### (1) Data types of list elements and Numpy array elements

A Python list is heterogeneous and accepts values of any data type. For instance, you can create a list composed of an integer, a floating-point number, and a Boolean.

In [None]:
# List
A_ = [1,1.2,False]
A_

[1, 1.2, False]

On the other hand, a Numpy array is homogeneous and only accepts values of a single data type for its elements. For instance, if you convert the list of an integer, a floating-point number, and a Boolean, `[1, 1.2, False]`, to a Numpy array, every element is converted to a floating-point number, so the Boolean `False` becomes `0.`.


In [None]:
# Numpy array
A = np.array([1,1.2,False])
A

array([1. , 1.2, 0. ])
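As a minimal sketch of this automatic conversion, the snippet below builds arrays from a few mixed lists and inspects the resulting `dtype`. The variable names are only illustrative; the point is that Numpy picks one common type able to hold every element.

```python
import numpy as np

# Boolean + integer -> integer array (True becomes 1)
int_arr = np.array([True, 2, 3])

# integer + float + Boolean -> float array (False becomes 0.0)
float_arr = np.array([1, 1.2, False])

# number + string -> string array (every element becomes text)
str_arr = np.array([1, "a"])

print(int_arr.dtype.kind)    # 'i' : signed integer
print(float_arr.dtype)       # float64
print(str_arr.dtype.kind)    # 'U' : unicode string
```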

### (2) **Type Casting** - Data type conversion
You can use **`.astype(target_datatype)`** to convert the elements of an array to a specified data type. By default it returns a new array and leaves the original unchanged.

In [None]:
A = np.array([0,1,2,3,4,5])
A

array([0, 1, 2, 3, 4, 5])

In [None]:
# Converting integers to floating-point numbers
# (the `np.float` alias was removed in Numpy 1.24; use `np.float64` or `float`)
A.astype(np.float64)

array([0., 1., 2., 3., 4., 5.])

In [None]:
# Converting integers to Boolean
# (the `np.bool` alias was removed in Numpy 1.24; the built-in `bool` works everywhere)
A.astype(bool)

array([False,  True,  True,  True,  True,  True])
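Casting can also lose information. The following sketch (with illustrative values that are not part of the lesson) shows what happens when floating-point numbers are cast to integers and to Booleans, and that the source array keeps its own type.

```python
import numpy as np

A = np.array([1.7, -1.7, 0.0])

as_int = A.astype(np.int64)   # fractional part is dropped (truncation toward zero)
as_bool = A.astype(bool)      # any non-zero value becomes True, zero becomes False

print(as_int)     # values 1, -1, 0
print(as_bool)    # values True, True, False
print(A.dtype)    # float64: the original array is unchanged
```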

### (3) Data type of Numpy array elements
You can use `.dtype` to check the data type of given array elements.


#### Boolean array

In [None]:
A = np.array([True,False])
A

array([ True, False])

In [None]:
A.dtype

dtype('bool')

#### int array

In [None]:
A = np.array([1,2,3])
A

array([1, 2, 3])

In [None]:
A.dtype

dtype('int32')
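The default integer type depends on the platform and the Numpy version (the `int32` above is typical of older Windows builds, while many other systems report `int64`). A hedged sketch of how to request a type explicitly rather than rely on the default:

```python
import numpy as np

default_int = np.array([1, 2, 3])                     # dtype chosen by Numpy (platform dependent)
fixed_int = np.array([1, 2, 3], dtype=np.int64)       # 64-bit integers requested explicitly
fixed_float = np.array([1, 2, 3], dtype=np.float32)   # 32-bit floats requested explicitly

print(default_int.dtype)    # int32 or int64, depending on the system
print(fixed_int.dtype)      # int64
print(fixed_float.dtype)    # float32
print(fixed_int.itemsize)   # 8 (bytes per element)
```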

## 2. Shape

+ In Numpy, **shape** means the dimensional shape of the array.

+ You can use 
 - **`.shape`** to check the dimensional shape in terms of the number of dimensions as well as the length of each dimension; 
 - **`.ndim`** to check the number of dimensions; 
 - **`.size`** to check the total number of elements in an array.

### (1) Shape of a Numpy array

When creating a Numpy array, every sub-list along the same axis must have the same number of elements. For instance, in a (3, 5) array, each of the 3 rows must consist of exactly 5 elements. If this condition is not satisfied, Numpy raises an error.

In [None]:
A = np.array([
    [1,2,3,4,5], # row 0 with 5 elements
    [2,3,4,5], # row 1 with 4 elements
    [3,4,5,6,7] # row 2 with 5 elements
],dtype=np.int64)
A

ValueError: ignored

In [None]:
B = np.array([
    [1,2,3,4,5], # row 0 with 5 elements
    [2,3,4,5,6], # row 1 with 5 elements
    [3,4,5,6,7] # row 2 with 5 elements
    ],dtype=np.int64)
B

array([[1, 2, 3, 4, 5],
       [2, 3, 4, 5, 6],
       [3, 4, 5, 6, 7]])

You can check the number of elements in each dimension with `.shape`.

In [None]:
B.shape

(3, 5)

The above tuple contains two comma-separated elements. The number of elements in the tuple tells you how many dimensions the array has: there are 2 values, `3` and `5`, so the array is 2-dimensional. In addition, each number corresponds to the number of elements on each axis, so `B` has 3 elements on axis 0 and 5 elements on axis 1.

Here is the convention for the shape of Numpy arrays:
- For a 1D array, `.shape` returns a tuple with only 1 element, e.g. `(n,)`.
- For a 2D array, `.shape` returns a tuple with 2 elements, e.g. `(n, m)`.
- For a 3D array, `.shape` returns a tuple with 3 elements, e.g. `(n, m, k)`.
- For a 4D array, `.shape` returns a tuple with 4 elements, e.g. `(n, m, k, j)`.
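The convention can be checked with a short sketch. `np.zeros()`, covered later in this lesson, is used here only as a convenient way to build arrays of a given shape; the concrete shapes are arbitrary examples.

```python
import numpy as np

shapes = [(4,), (4, 3), (4, 3, 2), (4, 3, 2, 5)]

for shape in shapes:
    arr = np.zeros(shape)
    # the length of the shape tuple equals the number of dimensions
    print(arr.shape, arr.ndim)

ndims = [np.zeros(shape).ndim for shape in shapes]
```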

To check the total number of elements in a Numpy array, regardless of the number of dimensions, use **`.size`**.


In [None]:
B.size

15

### (2) Dimension

A Numpy array is an n-dimensional data structure and can take a variety of shapes. You set the number of dimensions of the array by nesting square brackets `[]`.


#### 1-dimensional array `(n,)` 

A one-dimensional array is called a **`vector`**. In a vector, the elements are enclosed in a single pair of **`[]`**.

In [None]:
A_1D = np.array([1,2,3,4,5])
A_1D

array([1, 2, 3, 4, 5])

In [None]:
A_1D.ndim # number of dimensions

1

In [None]:
A_1D.shape

(5,)

In [None]:
A_1D.size

5

#### 2-dimensional array `(n, m)`

A 2-dimensional array is called a `matrix`. A matrix consists of rows and columns and can be created by enclosing elements in two layers of square brackets as **`np.array([[elements]])`**.


In [None]:
B_2D = np.array([
    [1,2,3,4,5],
    [2,3,4,5,6],
    [3,4,5,6,7]
],dtype=np.int64)
B_2D

array([[1, 2, 3, 4, 5],
       [2, 3, 4, 5, 6],
       [3, 4, 5, 6, 7]], dtype=int64)

In [None]:
B_2D.ndim # checking the array dimension

2

In [None]:
B_2D.shape

(3, 5)

In [None]:
B_2D.size

15

#### 3-dimensional array `(n, m, d)`

In mathematics, an array of three or more dimensions is generally regarded as a **tensor or a multidimensional matrix**. A 3-dimensional array is created by enclosing elements in three layers of square brackets.


In [None]:
C_3D = np.array([
    [
        [1,2,3,4,5],
        [2,3,4,5,6],
        [3,4,5,6,7]
    ],
    [
        [1,2,3,4,5],
        [2,3,4,5,6],
        [3,4,5,6,7]
    ]
])
C_3D

array([[[1, 2, 3, 4, 5],
        [2, 3, 4, 5, 6],
        [3, 4, 5, 6, 7]],

       [[1, 2, 3, 4, 5],
        [2, 3, 4, 5, 6],
        [3, 4, 5, 6, 7]]])

In [None]:
C_3D.ndim # number of dimensions of the array

3

In [None]:
C_3D.shape # 2x3x5

(2, 3, 5)

In [None]:
C_3D.size  # 2x3x5 = 30 is the total number of elements the array has

30
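The relationship between the three attributes can be verified directly. In this small sketch, `math.prod` from the standard library multiplies the entries of the shape tuple; the array itself is just a placeholder of shape (2, 3, 5).

```python
import math
import numpy as np

C = np.zeros((2, 3, 5))

print(C.ndim == len(C.shape))         # True: one shape entry per dimension
print(C.size == math.prod(C.shape))   # True: 2 * 3 * 5 = 30
```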

### (3) Axis

A Numpy array is an n-dimensional data structure composed of n axes.
Since Numpy supports not only element-wise operations but also axis-wise operations such as aggregation and addition, it is crucial to understand the concept of an **axis** before we move on to axis-wise operations.
Note that Numpy uses zero-based indexing, so axes are likewise numbered from 0.

#### Axis in 1-dimensional arrays



![1d](https://user-images.githubusercontent.com/71862853/97671398-80694d00-1acb-11eb-823d-9a665c8eebc3.PNG)

A 1-dimensional array has only 1 axis, and that axis is **'axis 0'**.<br>
The above array is of shape (4, ), which means the array consists of 1 axis with 4 elements.

In [None]:
A = np.array([2, 5, 6, 9])


In [None]:
A.shape

(4,)

#### Axes of a 2-dimensional array


![2d](https://user-images.githubusercontent.com/71862853/97675103-0c7e7300-1ad2-11eb-9fec-75b5ba72388f.PNG)

The 2-dimensional array `A` above is of shape (2, 3).<br>
For a 2-dimensional array, the shape tuple holds two comma-separated values, each giving the number of elements on the corresponding axis, `(axis_0, axis_1)`. In this case, `A` consists of 2 rows `(axis = 0)` and 3 columns `(axis = 1)`.



In [None]:
A = np.array([[3.5, 4.0, 6.5], [.4, .9, 4.7]])
A

array([[3.5, 4. , 6.5],
       [0.4, 0.9, 4.7]])

In [None]:
A.shape

(2, 3)

#### Axes of a 3-dimensional array


![3d](https://user-images.githubusercontent.com/71862853/97675954-469c4480-1ad3-11eb-935a-1ca750940a45.PNG)

Consider a 3-dimensional array `A` of shape (4, 3, 2).<br>
For a 3-dimensional array, the shape tuple holds 3 comma-separated values, each giving the number of elements on the corresponding axis from axis 0 to axis 2, `(axis_0, axis_1, axis_2)`. In this case, `A` consists of 4 rows `(axis = 0)` and 3 columns `(axis = 1)` with a depth of 2 `(axis = 2)`. Note that the code example below builds a different array, of shape (2, 2, 3).



In [None]:
A = np.array([[[1,2,3], [4,5,6]], [[1,2,3], [4,5,6]]])
A

array([[[1, 2, 3],
        [4, 5, 6]],

       [[1, 2, 3],
        [4, 5, 6]]])

In [None]:
A.shape

(2, 2, 3)
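To connect axes with the axis-wise operations mentioned at the start of this section, here is a minimal sketch using `.sum()` with the `axis` argument. The array values are illustrative. Summing along an axis collapses that axis, so it disappears from the resulting shape.

```python
import numpy as np

A = np.array([[1, 2, 3],
              [4, 5, 6]])      # shape (2, 3)

col_sums = A.sum(axis=0)       # collapse axis 0 (the rows)    -> shape (3,)
row_sums = A.sum(axis=1)       # collapse axis 1 (the columns) -> shape (2,)

print(col_sums)   # [5 7 9]
print(row_sums)   # [ 6 15]
```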

# \[2. Creating Numpy Arrays\]

There are many ways to create a Numpy array. You can build one from data held in other data structures or by setting its configuration manually.

## 1. Creating an Array based on Other Data Structures 

You can create a new Numpy array from data held in other data structures, such as existing lists or tuples.




### (1) Creating an Array using existing lists or tuples - `np.array()`

The most basic way to create an array with Numpy is to convert a list or tuple with `np.array()`.

In [None]:
A = np.array([1,2,3,4,5])
A

array([1, 2, 3, 4, 5])
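Tuples and nested sequences work the same way as flat lists. A short sketch with illustrative variable names:

```python
import numpy as np

from_tuple = np.array((1, 2, 3))                  # tuple -> 1-dimensional array
from_nested = np.array([(1, 2, 3), (4, 5, 6)])    # list of tuples -> 2-dimensional array

print(from_tuple.shape)    # (3,)
print(from_nested.shape)   # (2, 3)
```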

### (2) Copying an existing Array - `np.copy()`

Suppose you want to make a copy of `A`.

In [None]:
A = np.array([1,2,3,4,5])
A

array([1, 2, 3, 4, 5])

Suppose you create `B` by simple assignment, as below. In that case, `B` does not occupy its own memory space; it is just another name pointing to the same object as `A`, which is why both `id()` calls print the same value.

In [None]:
B = A
print(id(B))
print(id(A))

2573078041184
2573078041184


So what should you do if you want an array that copies the values of `A` but not its memory address, so that the copy behaves as a completely independent variable?<br>
For that, you can use the `.copy()` method.

In [1]:
A = np.array([1,2,3,4,5])
A

array([1, 2, 3, 4, 5])

In [2]:
B = A.copy() # copying the values of `A` into a new array `B`
print(id(B))
print(id(A)) # `A` and `B` are stored at different memory addresses

In [3]:
B[0] = 0 # changing `B` does not affect `A`

In [None]:
A # `A` stays unchanged

array([1, 2, 3, 4, 5])
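The difference between assignment and `.copy()` can be seen side by side in the following sketch. The names `alias` and `copied` are illustrative, and `np.shares_memory()` is used as one way to confirm that the copy owns its data.

```python
import numpy as np

A = np.array([1, 2, 3, 4, 5])

alias = A            # another name for the same array object
copied = A.copy()    # new array with its own data

alias[0] = 100       # also changes `A`
copied[1] = -1       # leaves `A` untouched

print(A)                             # [100   2   3   4   5]
print(alias is A)                    # True
print(np.shares_memory(A, copied))   # False
```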

<br>

## 2. Creating an Array by Setting Configuration

+ You can create a Numpy array by manually setting the values, the shape, and the data type.

+ You can create a Numpy array, all elements of which are the same.

+ You can create a Numpy array by only copying the shape of an existing array, but with different values.

+ You can set the values of the elements at equal intervals.

### (1) Creating an array filled with a single value: 
- `np.zeros()` - a Numpy array filled with zeros.
- `np.ones()` - a Numpy array filled with ones.
- `np.full()` - a Numpy array filled with a given `fill_value`.

With these functions the element values are preset, so you only need to specify the `shape` and `dtype` (plus `fill_value` for `np.full()`).

In [None]:
# Creating a matrix of 3 rows and 2 columns filled with 0
A = np.zeros(shape=(3,2), dtype=np.float64)
A

array([[0., 0.],
       [0., 0.],
       [0., 0.]])

In [None]:
# Creating a matrix of 3 rows and 2 columns filled with 1
A = np.ones(shape=(3,2), dtype=np.float64)
A

array([[1., 1.],
       [1., 1.],
       [1., 1.]])

In [None]:
# Creating a matrix of 3 rows and 2 columns filled with the floating-point value 3.0
A = np.full(shape=(3,2), fill_value=3., dtype=np.float64)
A

array([[3., 3.],
       [3., 3.],
       [3., 3.]])
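The `dtype` argument is optional in all three functions. As a hedged sketch of what happens when it is omitted or changed: `np.zeros()` and `np.ones()` default to floating-point, while `np.full()` infers the type from the fill value.

```python
import numpy as np

Z = np.zeros((2, 3))               # dtype omitted -> float64
O = np.ones((2, 3), dtype=bool)    # ones stored as Booleans -> every element is True
F = np.full((2, 3), 7)             # dtype inferred from the integer fill value

print(Z.dtype)        # float64
print(O.all())        # True
print(F.dtype.kind)   # 'i' : signed integer
```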

### (2) Creating an array with the shape of an existing array
- `np.zeros_like()`
- `np.ones_like()`
- `np.full_like()`

These functions take an existing array and create a new array of the same shape, filled with a single value.


In [None]:
A = np.array([1,2,3,4,5])
A

array([1, 2, 3, 4, 5])

In [None]:
# Creating an array with the same shape as `A`, filled with 0
# (the `np.int` alias was removed in Numpy 1.24; use `np.int64` or `int`)
np.zeros_like(A, dtype=np.int64)

array([0, 0, 0, 0, 0])

In [None]:
# Creating an array with the same shape as `A`, filled with 1
np.ones_like(A, dtype=np.int64)

array([1, 1, 1, 1, 1])

In [None]:
# Creating an array with the same shape as `A`, filled with 13
np.full_like(A, fill_value=13, dtype=np.int64)

array([13, 13, 13, 13, 13])
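When `dtype` is omitted, the `_like` functions also inherit the data type of the source array, which can silently change the fill value. A minimal sketch with an illustrative integer array:

```python
import numpy as np

A = np.array([[1, 2, 3],
              [4, 5, 6]])                        # integer array of shape (2, 3)

same_dtype = np.full_like(A, fill_value=2.7)     # 2.7 is cast to A's integer type -> 2
as_float = np.full_like(A, fill_value=2.7, dtype=np.float64)   # type overridden -> 2.7

print(same_dtype)
print(as_float)
```

Passing `dtype` explicitly, as in the cells above, avoids this kind of surprise.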

### (3) Creating an array of values over a specified interval: 
- `np.linspace()`
- `np.arange()`

With Numpy, you can create an array of numbers spaced at regular intervals.<br> 
For instance, you can create an array of evenly spaced values from 1 to 10 using the following Numpy functions. 

* `np.linspace()` - creates evenly spaced numbers over an interval, where you specify **how many values** to generate (`num`). The `stop` value is included by default.


* `np.arange()` - creates evenly spaced numbers within an interval, where you specify **the step** between values. The `stop` value is excluded.




In [None]:
np.linspace(start=1, stop=10, num=19) # generating 19 values from 1 (inclusive) to 10 (inclusive) at equal intervals

array([ 1. ,  1.5,  2. ,  2.5,  3. ,  3.5,  4. ,  4.5,  5. ,  5.5,  6. ,
        6.5,  7. ,  7.5,  8. ,  8.5,  9. ,  9.5, 10. ])

In [None]:
np.arange(1, 10, 0.5) # generating values from 1 (inclusive) to 10 (exclusive) in steps of 0.5

array([1. , 1.5, 2. , 2.5, 3. , 3.5, 4. , 4.5, 5. , 5.5, 6. , 6.5, 7. ,
       7.5, 8. , 8.5, 9. , 9.5])
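The two functions can be made to agree. The sketch below uses the `retstep` and `endpoint` arguments of `np.linspace()` to show the step it computes and to drop the stop value, which reproduces the `np.arange()` result above.

```python
import numpy as np

# linspace: fix the number of samples; the stop value is included by default
values, step = np.linspace(1, 10, num=19, retstep=True)

# excluding the stop value reproduces the arange result
no_stop = np.linspace(1, 10, num=18, endpoint=False)
by_step = np.arange(1, 10, 0.5)

print(step)                            # 0.5
print(values[-1], no_stop[-1])         # 10.0 9.5
print(np.allclose(no_stop, by_step))   # True
```

For non-integer steps, `np.linspace()` is often the safer choice because the number of elements is stated explicitly rather than derived from floating-point arithmetic.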