![logo](../../img/license_header_logo.png)
> **Copyright &copy; 2021 CertifAI Sdn. Bhd.**<br>
 <br>
This program and the accompanying materials are made available under the
terms of the [Apache License, Version 2.0](https://www.apache.org/licenses/LICENSE-2.0). <br>
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
License for the specific language governing permissions and limitations
under the License. <br>
<br>**SPDX-License-Identifier: Apache-2.0**

# 02 - Array Creation
Authored by: [Kian Yang Lee](https://github.com/KianYang-Lee) - kianyang.lee@certifai.ai

## <a name="description">Notebook Description</a>

This notebook describes various ways to create a `numpy` `ndarray`. It is especially suitable for beginners who are new to the `numpy` module.

By the end of this tutorial, you will be able to:

1. Explain how a `numpy` `ndarray` can be created
2. Apply `numpy.arange`, `numpy.array`, `numpy.zeros`, `numpy.ones` and `numpy.empty` to create a `numpy` `ndarray`
3. Explain and apply the `dtype` argument and the `astype()` method to specify and modify the data type of an `ndarray`

## Notebook Outline
Below is the outline for this tutorial:
1. [Notebook Description](#description)
2. [Notebook Configurations](#configuration)
3. [Method 1: Create a Range of Evenly-spaced Values](#arange)
4. [Method 2: Create from `list` or `tuple`](#array)
5. [Method 3: Zeros, Ones, and Empty](#zeros)
6. [`ndarray` Data-type](#dtype)
7. [Summary](#summary)
8. [Reference](#reference)

## <a name="configuration">Notebook Configurations</a>
This notebook relies only on the `numpy` module, a popular Python library for numerical computation. It is conventionally imported under the alias `np`.

In [1]:
### BEGIN SOLUTION
import numpy as np
### END SOLUTION

## <a name="arange">Method 1: Create a Range of Evenly-spaced Values</a>
`numpy.arange()` is a simple way to create an `ndarray`. It is akin to Python's built-in `range`, but instead of returning a `range` object, it returns an `ndarray` object. Its only mandatory argument is `stop`: the array contains values up to but not including the `stop` value provided. In this case, `start` defaults to 0 and `step` defaults to 1, as shown below.

In [2]:
# 10 is not included in the ndarray
### BEGIN SOLUTION
np.arange(10) 
### END SOLUTION

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
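As a rough illustration of the comparison with `range`, the sketch below builds the same values both ways and inspects the resulting types. The variable names are only illustrative.

```python
import numpy as np

py_range = range(10)      # built-in range object
np_range = np.arange(10)  # NumPy array holding the same values

print(type(py_range))
print(type(np_range))

# Both describe the integers 0 to 9, with 10 excluded
same_values = list(py_range) == np_range.tolist()
print(same_values)
```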

Users can also specify the `start` and `step` arguments for finer control over the `ndarray` object. Bear in mind that the generated values lie within the half-open interval `[start, stop)`.

In [3]:
# demonstrating with only start and stop arguments
# 2 is included but not 10
### BEGIN SOLUTION
np.arange(start=2, stop=10) 
### END SOLUTION

array([2, 3, 4, 5, 6, 7, 8, 9])

In [4]:
# demonstrating with start, stop and step arguments
# each value is 2 more than the previous one
### BEGIN SOLUTION
np.arange(start=2, stop=10, step=2) 
### END SOLUTION

array([2, 4, 6, 8])

The downside of this method is that it can only create an `ndarray` with one axis. An additional reshaping step is needed to produce an `ndarray` with more than one axis.
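A minimal sketch of that reshaping step, assuming a target shape of `(2, 3)` chosen purely for illustration:

```python
import numpy as np

flat = np.arange(6)        # one axis: [0 1 2 3 4 5]
grid = flat.reshape(2, 3)  # two axes: 2 rows, 3 columns

print(flat.shape)  # (6,)
print(grid.shape)  # (2, 3)
print(grid)
```

The total number of elements must match the requested shape, which is why six values are used here.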

## <a name="array">Method 2: Create from `list` or `tuple`</a>
Another simple way to create an `ndarray` is to call `numpy.array()` with a `list` or `tuple` as the argument. This is intuitive for users who are already familiar with Python, since `list` and `tuple` are native data structures.

In [5]:
# using list 
### BEGIN SOLUTION
np.array([0, 1, 2])
### END SOLUTION

array([0, 1, 2])

In [6]:
# using tuple
### BEGIN SOLUTION
np.array((0, 1, 2))
### END SOLUTION

array([0, 1, 2])

The two cells above pass different arguments, a `list` and a `tuple`, to `np.array()`, yet produce the same result.

Nonetheless, a common error among beginners is calling `array()` with multiple arguments instead of a single sequence, as shown below:

In [7]:
# this raises ValueError on older NumPy releases;
# newer releases raise TypeError instead, so both are caught

try:
    ### BEGIN SOLUTION
    np.array(0, 1, 2)
    ### END SOLUTION
except (ValueError, TypeError) as error:
    print(error)
    print("Multiple arguments are provided! Please provide only one argument!")

only 2 non-keyword arguments accepted
Multiple arguments are provided! Please provide only one argument!


Below is the same block of code, except that only one argument is provided, so it runs successfully.

In [8]:
# this will not result in ValueError
try:
    ### BEGIN SOLUTION
    np.array((0, 1, 2))
    ### END SOLUTION
except ValueError as error:
    print(error)
    print("Multiple arguments are provided! Please provide only one argument!")
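The error message shown earlier mentions two non-keyword arguments because `np.array()` does accept a second positional argument, which is interpreted as the `dtype` rather than as another element. A small sketch of that behaviour (the variable name is illustrative):

```python
import numpy as np

# The second positional argument is the dtype, not another value
as_float = np.array((0, 1, 2), float)

print(as_float)        # [0. 1. 2.]
print(as_float.dtype)  # float64
```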

To construct an `ndarray` with multiple dimensions, pass a sequence of sequences as the argument.

In [9]:
# create 2-dimensional ndarray
### BEGIN SOLUTION
np.array([ [ 0, 1, 2], 
          [3, 4, 5] ])
### END SOLUTION

array([[0, 1, 2],
       [3, 4, 5]])

In [10]:
# create 3-dimensional ndarray
### BEGIN SOLUTION
np.array(
    [ 
        [ 
            [ 0, 1, 2 ], 
            [ 3, 4, 5 ], 
            [ 6, 7, 8 ]
        ],
        [
            [9, 10, 11],
            [12, 13, 14],
            [15, 16, 17]
        ]
    ])
### END SOLUTION

array([[[ 0,  1,  2],
        [ 3,  4,  5],
        [ 6,  7,  8]],

       [[ 9, 10, 11],
        [12, 13, 14],
        [15, 16, 17]]])
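To confirm how the nesting depth maps to dimensions, one option is to inspect the `ndim` and `shape` attributes. The sketch below rebuilds the two arrays from the cells above under illustrative names:

```python
import numpy as np

arr_2d = np.array([[0, 1, 2], [3, 4, 5]])
arr_3d = np.array([[[0, 1, 2], [3, 4, 5], [6, 7, 8]],
                   [[9, 10, 11], [12, 13, 14], [15, 16, 17]]])

# ndim is the number of axes; shape gives the length of each axis
print(arr_2d.ndim, arr_2d.shape)  # 2 (2, 3)
print(arr_3d.ndim, arr_3d.shape)  # 3 (2, 3, 3)
```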

## <a name="zeros">Method 3: Zeros, Ones, and Empty</a>
We will introduce three functions here: `numpy.zeros()`, `numpy.ones()` and `numpy.empty()`. As the names imply, `numpy.zeros()` generates an `ndarray` of only zeros and `numpy.ones()` generates an `ndarray` of only ones. `numpy.empty()` generates an `ndarray` whose contents are uninitialized, so the values are arbitrary and depend on the state of memory. Users only need to specify the shape of the `ndarray` as the argument.

The default dtype used is `float64`.

The cells below create an `ndarray` with the shape `(2, 3)` using each of these in turn.

In [11]:
### BEGIN SOLUTION
np.zeros((2, 3))
### END SOLUTION

array([[0., 0., 0.],
       [0., 0., 0.]])

In [12]:
### BEGIN SOLUTION
np.ones((2, 3))
### END SOLUTION

array([[1., 1., 1.],
       [1., 1., 1.]])

In [13]:
### BEGIN SOLUTION
np.empty((2, 3))
### END SOLUTION

array([[1., 1., 1.],
       [1., 1., 1.]])

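The trailing dots in the displayed values show that the elements are floating-point numbers, consistent with the default `float64` dtype. Note that the ones shown for `np.empty()` are simply whatever happened to be in memory, so the contents may differ from run to run.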
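Because the contents of an array created with `np.empty()` cannot be relied upon, a common pattern is to fill it before reading from it. A minimal sketch, with the fill value `7` chosen arbitrarily:

```python
import numpy as np

buffer = np.empty((2, 3))  # contents are arbitrary at this point
buffer[:] = 7              # overwrite every element before using it

print(buffer)
print(buffer.dtype)  # float64
```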

## <a name="dtype">`ndarray` Data-type</a>
The data type of an `ndarray` can be specified when it is created by passing the desired type as the `dtype` argument.

In [14]:
# specifying an ndarray of ones with int32 dtype
### BEGIN SOLUTION
arr_1 = np.ones((2, 3), dtype="int32")
### END SOLUTION

In [15]:
### BEGIN SOLUTION
arr_1.dtype
### END SOLUTION

dtype('int32')
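The `dtype` argument is not limited to strings or to `np.ones()`. As a brief sketch, the same type can be given as a NumPy type object, and other creation functions such as `np.arange()` accept it too (variable names are illustrative):

```python
import numpy as np

from_string = np.zeros((2, 3), dtype="int32")   # dtype given as a string
from_object = np.zeros((2, 3), dtype=np.int32)  # dtype given as a NumPy type

print(from_string.dtype == from_object.dtype)  # True

float_range = np.arange(3, dtype="float64")
print(float_range)  # [0. 1. 2.]
```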

One can also change the `dtype` after the `ndarray` is created. However, doing so might cause a loss of numerical precision. Below is an example of converting the `dtype` of an existing `ndarray` using the `astype()` method.

In [16]:
### BEGIN SOLUTION
arr_2 = np.array(
    [
        [2.2354, 3.23423542, 1.234],
        [5.3452, 6.35468, 9.2343]
    ])
arr_2
### END SOLUTION

array([[2.2354    , 3.23423542, 1.234     ],
       [5.3452    , 6.35468   , 9.2343    ]])

In [17]:
### BEGIN SOLUTION
arr_2.dtype
### END SOLUTION

dtype('float64')

In [18]:
### BEGIN SOLUTION
arr_2.astype('int32')
### END SOLUTION

array([[2, 3, 1],
       [5, 6, 9]])

We see the loss of numerical precision: each element of the converted `ndarray` keeps only its integer part, because the fractional part is discarded. `numpy` supports a wider variety of numerical types than the native ones offered by Python. You can check them out on `numpy`'s official [website](https://numpy.org/devdocs/user/basics.types.html).
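Two details of `astype()` are worth a quick sketch: converting floats to integers drops the fractional part rather than rounding, and the conversion returns a new array instead of changing the original. The sample values and the use of `np.rint()` for rounding first are illustrative choices, not part of the lesson above.

```python
import numpy as np

values = np.array([-1.7, 1.7, 2.5])

truncated = values.astype("int32")         # fractional part dropped
rounded = np.rint(values).astype("int32")  # round to nearest first

print(truncated)     # [-1  1  2]
print(rounded)       # [-2  2  2]
print(values.dtype)  # float64, the original array is unchanged
```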

##  <a name="summary">Summary</a>
To conclude, you should now be able to:
1. Explain how a `numpy` `ndarray` can be created
2. Apply `numpy.arange`, `numpy.array`, `numpy.zeros`, `numpy.ones` and `numpy.empty` to create a `numpy` `ndarray`
3. Explain and apply the `dtype` argument and the `astype()` method to specify and modify the data type of an `ndarray`<br><br>
Congratulations, that concludes this lesson.    

## <a name="reference">Reference</a>
* [numpy.ndarray.dtype](https://numpy.org/doc/stable/reference/generated/numpy.ndarray.dtype.html)
* [NumPy quickstart](https://numpy.org/doc/stable/user/quickstart.html)