![*INTERTECHNICA - SOLON EDUCATIONAL PROGRAMS - TECHNOLOGY LINE*](https://solon.intertechnica.com/assets/IntertechnicaSolonEducationalPrograms-TechnologyLine.png)

# Data Manipulation with Python - The NumPy Library - Array Creation

*Basic initialization of the workspace.*

In [18]:
!python -m pip install numpy
import numpy as np
print("NumPy installed at version: {}".format(np.__version__))

NumPy installed at version: 1.19.5


# 1. Creating Arrays

## 1.1 Creating standard arrays

Standard arrays are the foundation of data representation in NumPy. All items of a standard array share a single data type.

One of the simplest ways to create a NumPy array is to build it from a standard Python list (nested lists produce multidimensional arrays):

In [19]:
x_1d = np.array([ 1,   2,   3,   4,   5,   6,   7,   8,   9,  10])
print("The simple array's values are: \n {} ".format(x_1d))

The simple array's values are: 
 [ 1  2  3  4  5  6  7  8  9 10] 


In [20]:
x_2d = np.array(
      [[  1,   2,   3,   4,   5,   6,   7,   8,   9,  10],
       [ 11,  12,  13,  14,  15,  16,  17,  18,  19,  20],
       [ 21,  22,  23,  24,  25,  26,  27,  28,  29,  30],
       [ 31,  32,  33,  34,  35,  36,  37,  38,  39,  40],
       [ 41,  42,  43,  44,  45,  46,  47,  48,  49,  50],
       [ 51,  52,  53,  54,  55,  56,  57,  58,  59,  60],
       [ 61,  62,  63,  64,  65,  66,  67,  68,  69,  70],
       [ 71,  72,  73,  74,  75,  76,  77,  78,  79,  80],
       [ 81,  82,  83,  84,  85,  86,  87,  88,  89,  90],
       [ 91,  92,  93,  94,  95,  96,  97,  98,  99, 100]])

print("The two dimensional array's values are: \n {} ".format(x_2d))

The two dimensional array's values are: 
 [[  1   2   3   4   5   6   7   8   9  10]
 [ 11  12  13  14  15  16  17  18  19  20]
 [ 21  22  23  24  25  26  27  28  29  30]
 [ 31  32  33  34  35  36  37  38  39  40]
 [ 41  42  43  44  45  46  47  48  49  50]
 [ 51  52  53  54  55  56  57  58  59  60]
 [ 61  62  63  64  65  66  67  68  69  70]
 [ 71  72  73  74  75  76  77  78  79  80]
 [ 81  82  83  84  85  86  87  88  89  90]
 [ 91  92  93  94  95  96  97  98  99 100]] 
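
To make the single data type property concrete, the short sketch below inspects a few standard attributes of a small array (`ndim`, `shape`, `size`, `dtype`). The variable names and values are illustrative and are not part of the original notebook.

```python
import numpy as np

# a small 2x3 array built from a nested Python list (illustrative values)
sample_2d = np.array([[1, 2, 3],
                      [4, 5, 6]])

print("Number of dimensions: {}".format(sample_2d.ndim))
print("Shape (rows, columns): {}".format(sample_2d.shape))
print("Total number of items: {}".format(sample_2d.size))
print("Item data type: {}".format(sample_2d.dtype))

# mixing integers and floats still yields a single data type:
# NumPy promotes every item to a floating point type
mixed = np.array([1, 2.5, 3])
print("Data type after mixing int and float: {}".format(mixed.dtype))
```

The integer data type reported for `sample_2d` depends on the platform and NumPy version, which is why it is printed rather than assumed.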


NumPy also allows creation of arrays from Python tuples:

In [21]:
x_1d_tuple = np.array(( 1,   2,   3,   4,   5,   6,   7,   8,   9,  10))

print("The values of the array created from a tuple are \n {} ".format(x_1d_tuple))

The values of the array created from a tuple are 
 [ 1  2  3  4  5  6  7  8  9 10] 


## 1.2 Specifying data types of array elements

NumPy allows the data type of the array items to be specified at creation time via the **dtype** parameter.

In [22]:
# using a standard Python list for data initialization
initialization_data = [0., 1., 2., 3., 4., 5., 6., 7., 8., 9.]

# using the same initialization data it is possible to create NumPy arrays with
# different item data types

int_array = np.array(initialization_data, dtype=np.int16)
float_array = np.array(initialization_data, dtype=np.float16)

print("The array created with the integer data type items is: \n {} ".format(int_array))
print("The array created with the float data type items is: \n {} ".format(float_array))

The array created with the integer data type items is: 
 [0 1 2 3 4 5 6 7 8 9] 
The array created with the float data type items is: 
 [0. 1. 2. 3. 4. 5. 6. 7. 8. 9.] 
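
The choice of data type affects both the stored values and the memory used per item. The sketch below is a minimal illustration under assumed sample values (they are not from the original notebook); it converts an existing array with `astype` and inspects `dtype` and `itemsize`.

```python
import numpy as np

# illustrative values with fractional parts (64-bit floats by default)
source = np.array([1.7, 2.2, -1.5])

# converting to an integer type drops the fractional part
# (the values are truncated toward zero, not rounded)
as_int = source.astype(np.int16)

# float16 keeps the fractional part, but with reduced precision
as_half = source.astype(np.float16)

print("Original: {} ({} bytes per item)".format(source, source.itemsize))
print("As int16: {} ({} bytes per item)".format(as_int, as_int.itemsize))
print("As float16: {} ({} bytes per item)".format(as_half, as_half.itemsize))
```

Smaller types save memory at the cost of range and precision, so the data type is worth choosing with the expected values in mind.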


## 1.3 Creating record arrays

Arrays with a single simple data type may not be sufficient when we need to represent more complex data. For this case, NumPy provides an additional type of array: **record arrays** (the NumPy documentation refers to arrays created this way as *structured arrays*).

Record arrays allow data to be accessed **not only by index but also by field name**, closely resembling the record (or structure) concept from other programming languages.

Record arrays can be created from lists as well:

In [23]:
# record arrays can be created from lists of tuples containing
# the values of each record
# the names and the types of the fields must also be
# specified at creation time

x_1d_record = np.array(
    [("One", 1), ("Two", 2), ("Three", 3)],
    dtype=[("Literal Form", "U10"), ("Numeric Value", "i1")])

print("The values of record array are: \n{}".format(x_1d_record))

The values of record array are: 
[('One', 1) ('Two', 2) ('Three', 3)]


In [24]:
# the record arrays allow accessing of their data 
# via the field names

print("Numerical form values are: \n {}".format(x_1d_record["Numeric Value"]))
print("Literal form values are: \n {}".format(x_1d_record["Literal Form"]))

Numerical form values are: 
 [1 2 3]
Literal form values are: 
 ['One' 'Two' 'Three']
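
Index-based and field-based access can also be combined. The sketch below is an illustration only: it uses shorter, hypothetical field names (`literal`, `numeric`) so that the same data can also be viewed as a `np.recarray`, which exposes fields as attributes.

```python
import numpy as np

# same data as above, with hypothetical field names that are valid identifiers
numbers = np.array(
    [("One", 1), ("Two", 2), ("Three", 3)],
    dtype=[("literal", "U10"), ("numeric", "i1")])

# access a single record by position, then one of its fields by name
first = numbers[0]
print("First record: {}".format(first))
print("Literal form of the first record: {}".format(first["literal"]))

# use one field to filter whole records
selected = numbers[numbers["numeric"] > 1]
print("Records with a numeric value above 1: \n {}".format(selected))

# viewing the array as a recarray allows attribute-style field access
as_records = numbers.view(np.recarray)
print("Numeric values via attribute access: {}".format(as_records.numeric))
```

Attribute access only works for field names that are valid Python identifiers, so names containing spaces (such as "Literal Form") still require the bracket syntax.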


## 1.4 Creating arrays using dedicated functions

NumPy offers the ability to create and initialize arrays without the need to specify each value in the array. This is done by using dedicated functions that create and initialize arrays based on different algorithms.

One of these functions is **arange**, which creates a one-dimensional array and, by default, initializes it with a sequence of the specified length starting from zero.

In [25]:
# creating an array initialized with a sequence of length 10, starting from 0:
x_1d_arange = np.arange(10)
print("The array is initialized with a sequence of length 10 starting from 0: \n {}".format(x_1d_arange))

The array is initialized with a sequence of length 10 starting from 0: 
 [0 1 2 3 4 5 6 7 8 9]
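
Besides the default form, **arange** also accepts a start value, an exclusive stop value and a step. The following sketch uses illustrative values to show these forms.

```python
import numpy as np

# start at 2, stop before 20, advance in steps of 3
stepped = np.arange(2, 20, 3)
print("From 2 up to (not including) 20, step 3: {}".format(stepped))

# a negative step produces a descending sequence
descending = np.arange(10, 0, -2)
print("From 10 down to (not including) 0, step -2: {}".format(descending))

# floating point steps are allowed as well
fractions = np.arange(0, 1, 0.25)
print("From 0 up to (not including) 1, step 0.25: {}".format(fractions))
```

The stop value is never part of the result, which mirrors the behavior of Python's built-in `range`.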


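Another useful function is **empty**, which creates arrays of a given shape without initializing their content. The values shown are whatever happened to be in memory at that location, so they should be overwritten before being used.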

In [26]:
# creating a non-initialized one-dimensional array with length 10
x_1d_empty = np.empty(10)
print("The uninitialized array of size 10 has the following values: \n {}".format(x_1d_empty))

The uninitialized array of size 10 has the following values: 
 [4.66239408e-310 2.46151512e-312 6.79038654e-313 2.48273508e-312
 2.05833592e-312 5.43472210e-322 0.00000000e+000 0.00000000e+000
 0.00000000e+000 0.00000000e+000]
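
Because the content of an array created with **empty** is arbitrary, a typical pattern is to allocate it and then overwrite every item before reading any of them. The sketch below illustrates this with assumed values; only the shape and data type are known right after creation.

```python
import numpy as np

# allocate space for 5 items without initializing them
buffer = np.empty(5)

# shape and data type are well defined, the values are not
print("Shape: {}, data type: {}".format(buffer.shape, buffer.dtype))

# overwrite every item before using the array
buffer[:] = np.arange(5) ** 2
print("Array after filling it with squares: {}".format(buffer))
```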


In [27]:
# creating a non-initialized two-dimensional array with size 10x10
x_2d_empty = np.empty((10,10))
print("The uninitialized matrix of size 10x10 has the following values: \n {}".format(x_2d_empty))

The uninitialized matrix of size 10x10 has the following values: 
 [[0.45312295 0.81398533 0.75689227 0.57045031 0.23736193 0.7245475
  0.62462887 0.58675476 0.20726435 0.02596758]
 [0.94495315 0.50034402 0.89112753 0.17214315 0.69291615 0.67883403
  0.95456079 0.91122453 0.09260818 0.77066613]
 [0.44785109 0.69660702 0.98238654 0.98001356 0.96110468 0.9281232
  0.12998717 0.72710296 0.3242581  0.10067118]
 [0.99470485 0.94342468 0.31644422 0.72605578 0.50876751 0.19744056
  0.97800597 0.5820716  0.17311892 0.08366135]
 [0.21645218 0.80087516 0.4298721  0.60501183 0.93538641 0.05731942
  0.77964217 0.03092757 0.0875072  0.78264553]
 [0.00409939 0.11272811 0.2798413  0.54348495 0.48151979 0.51922303
  0.79467661 0.3571446  0.83808115 0.8671135 ]
 [0.98623733 0.28077696 0.85820915 0.50892119 0.96653078 0.0228148
  0.02134239 0.56356991 0.70446205 0.8861447 ]
 [0.24979302 0.10217656 0.26702591 0.78698881 0.43585204 0.78360419
  0.60658421 0.32234444 0.70274914 0.38436204]
 [0.05186706 0.498

The **zeros** function can be used to create arrays initialized with 0 values: 

In [28]:
# creating an array of length 10 initialized with the value 0
x_1d_zeros = np.zeros(10)
print("An array initialized with the value 0: \n {}".format(x_1d_zeros))

An array initialized with the value 0: 
 [0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]


In [29]:
# NumPy also allows the creation of multidimensional arrays
# that are initialized with the value 0
x_2d_zeros = np.zeros((10,10))
print("A matrix initialized with the value 0: \n {}".format(x_2d_zeros))

A matrix initialized with the value 0: 
 [[0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]]


The **ones** function behaves like the **zeros** function but initializes the array with 1 values.
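
A minimal sketch of **ones** follows, using illustrative sizes. As with **zeros**, the items are floating point by default, and the `dtype` parameter can be used to request another type.

```python
import numpy as np

# creating an array of length 5 initialized with the value 1
x_1d_ones = np.ones(5)
print("An array initialized with the value 1: \n {}".format(x_1d_ones))

# creating a 2x3 matrix of ones with an explicit integer data type
x_2d_ones = np.ones((2, 3), dtype=np.int16)
print("An integer matrix initialized with the value 1: \n {}".format(x_2d_ones))
```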

The **full** function creates arrays of a given shape initialized with a specific value:

In [30]:
# creating an array with length 10 initialized with the value -1
x_1d_full = np.full(10, -1)
print("An array initialized with the value -1: \n {}".format(x_1d_full))

An array initialized with the value -1: 
 [-1 -1 -1 -1 -1 -1 -1 -1 -1 -1]


In [31]:
# creating a matrix with size 10x10 initialized with the value -1
x_2d_full = np.full((10,10), -1)
print("A matrix initialized with the value -1: \n {}".format(x_2d_full))

A matrix initialized with the value -1: 
 [[-1 -1 -1 -1 -1 -1 -1 -1 -1 -1]
 [-1 -1 -1 -1 -1 -1 -1 -1 -1 -1]
 [-1 -1 -1 -1 -1 -1 -1 -1 -1 -1]
 [-1 -1 -1 -1 -1 -1 -1 -1 -1 -1]
 [-1 -1 -1 -1 -1 -1 -1 -1 -1 -1]
 [-1 -1 -1 -1 -1 -1 -1 -1 -1 -1]
 [-1 -1 -1 -1 -1 -1 -1 -1 -1 -1]
 [-1 -1 -1 -1 -1 -1 -1 -1 -1 -1]
 [-1 -1 -1 -1 -1 -1 -1 -1 -1 -1]
 [-1 -1 -1 -1 -1 -1 -1 -1 -1 -1]]
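
Unlike **zeros** and **ones**, the arrays above contain integers: when no `dtype` is given, **full** derives the data type from the fill value. The sketch below illustrates this with assumed sizes and values.

```python
import numpy as np

# an integer fill value produces an integer array
full_int = np.full(3, -1)

# a floating point fill value produces a floating point array
full_float = np.full(3, 0.5)

# the data type can still be set explicitly
full_explicit = np.full(3, -1, dtype=np.float32)

print("Integer fill value: {} ({})".format(full_int, full_int.dtype))
print("Float fill value: {} ({})".format(full_float, full_float.dtype))
print("Explicit data type: {} ({})".format(full_explicit, full_explicit.dtype))
```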


An array can also be created with randomly initialized content by using the **random** function from the **numpy.random** package:

In [32]:
# creating an array with length 10 initialized with random values 
# (continuous uniform distribution)

x_1d_random = np.random.random(10)
print("An array initialized with random values : \n {}".format(x_1d_random))

An array initialized with random values : 
 [0.69813117 0.81153532 0.71437081 0.93448037 0.31384583 0.98060334
 0.92945289 0.69299115 0.73143961 0.56018569]
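
Random content changes on every run, which can make results hard to reproduce. As a hedged alternative to the call above, the sketch below uses a seeded generator created with `np.random.default_rng` (available in NumPy 1.17 and later); the seed value is an arbitrary choice made for illustration.

```python
import numpy as np

seed = 42  # arbitrary illustrative seed

# two generators created with the same seed produce the same sequence
rng_a = np.random.default_rng(seed)
rng_b = np.random.default_rng(seed)

sample_a = rng_a.random(10)
sample_b = rng_b.random(10)
print("Both samples are identical: {}".format(np.array_equal(sample_a, sample_b)))

# the generator accepts a shape as well; values lie in the interval [0, 1)
matrix = rng_a.random((3, 4))
print("A 3x4 matrix of random values: \n {}".format(matrix))
```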


In [33]:
# creating a matrix with size 10x10 initialized with random values 
# (continuous uniform distribution)

x_2d_random = np.random.random((10, 10))
print("A matrix initialized with random values : \n {}".format(x_2d_random))

A matrix initialized with random values : 
 [[0.8335289  0.68501361 0.58698901 0.24498961 0.85731526 0.44837911
  0.26204749 0.44643229 0.65750284 0.01573836]
 [0.34771703 0.96138425 0.62267703 0.90497998 0.34052733 0.87830332
  0.72058513 0.72325584 0.53343763 0.28070261]
 [0.26554679 0.74546412 0.91483399 0.63203577 0.41221213 0.07126426
  0.13191451 0.00817626 0.15497715 0.0279649 ]
 [0.7706395  0.94405119 0.08073474 0.07735681 0.52028745 0.26911091
  0.34021234 0.80737584 0.81851471 0.94300796]
 [0.96101479 0.23888679 0.32615911 0.03041207 0.26850885 0.94456208
  0.25830955 0.46926552 0.53854179 0.48167477]
 [0.97409732 0.8126102  0.71607053 0.36178692 0.61262465 0.42857971
  0.07842773 0.05754303 0.36626358 0.42838105]
 [0.70419913 0.5834547  0.9358273  0.74644734 0.80291737 0.94846215
  0.78521285 0.44695404 0.04762247 0.2727418 ]
 [0.23366449 0.47433807 0.25061257 0.76416789 0.21989346 0.27084472
  0.88801353 0.01562867 0.83734231 0.2598803 ]
 [0.99154648 0.0366127  0.78284778 0

## 1.5 Creating arrays using data loading

NumPy can load text-based data (typically CSV data) via the **loadtxt** function. This function reads delimited data from a file or text stream and transforms it into an initialized NumPy array.

This function also accepts parameters that control the loading operation, such as the number of rows to skip, the data delimiter, or the record data type to associate with the data.
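
These parameters can be tried without any network access by loading from an in-memory text stream. The sketch below is a minimal illustration: the CSV content is a small inline sample written for this example, not the course dataset, and the field names and formats are assumptions chosen to match it.

```python
import io
import numpy as np

# a small inline CSV sample (illustrative, not the course dataset)
csv_text = "Country,Rank,Score\nFinland,1,7.77\nDenmark,2,7.6\n"

sample_data = np.loadtxt(
    io.StringIO(csv_text),
    skiprows=1,            # skip the header row
    delimiter=",",         # values are separated by commas
    dtype={"names": ("Country", "Rank", "Score"),
           "formats": ("U20", "int16", "float64")}
)

print("Loaded records: \n {}".format(sample_data))
print("Countries: {}".format(sample_data["Country"]))
print("Ranks: {}".format(sample_data["Rank"]))
```

Because a record data type is supplied, the result is a one-dimensional record array with one item per CSV row.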

In [34]:
# import packages for remote data load
import requests
import io

# read data remotely and stop early if the download failed
data_url = "https://raw.githubusercontent.com/INTERTECHNICA-BUSINESS-SOLUTIONS-SRL/CourseDataManipulationWithPython/main/Module%202%20-%20The%20Numpy%20Library/Session%202%20-%20NumPy%20Basics/data/country_happines_rank_2020.csv"
response = requests.get(data_url)
response.raise_for_status()

# load the string data into a record array,
# skipping the header row and splitting the values on commas
loaded_data = np.loadtxt(
    io.StringIO(response.text),
    skiprows=1,
    delimiter=",",
    dtype={"names": ("Country", "Rank", "Score", "Population"),
           "formats": ("U20", "int8", "float16", "float32")}
)

print("The CSV loaded data is: \n {}".format(loaded_data))

The CSV loaded data is: 
 [('Finland',    1, 7.77 , 5.54072021e+03)
 ('Denmark',    2, 7.6  , 5.79220215e+03)
 ('Norway',    3, 7.555, 5.42124121e+03)
 ('Iceland',    4, 7.492, 3.41243011e+02)
 ('Netherlands',    5, 7.49 , 1.71348711e+04)
 ('Switzerland',    6, 7.48 , 8.65462207e+03)
 ('Sweden',    7, 7.344, 1.00992646e+04)
 ('New Zealand',    8, 7.31 , 4.82223291e+03)
 ('Canada',    9, 7.277, 3.77421523e+04)
 ('Austria',   10, 7.246, 9.00639844e+03)
 ('Australia',   11, 7.227, 2.54998848e+04)
 ('Costa Rica',   12, 7.168, 5.09411816e+03)
 ('Israel',   13, 7.14 , 8.65553516e+03)
 ('Luxembourg',   14, 7.09 , 6.25978027e+02)
 ('United Kingdom',   15, 7.055, 6.78860078e+04)
 ('Ireland',   16, 7.02 , 4.93778613e+03)
 ('Germany',   17, 6.984, 8.37839453e+04)
 ('Belgium',   18, 6.92 , 1.15896230e+04)
 ('United States',   19, 6.89 , 3.31002656e+05)
 ('Czech Republic',   20, 6.85 , 1.07089814e+04)
 ('United Arab Emirates',   21, 6.824, 9.89040234e+03)
 ('Malta',   22, 6.727, 4.41542999e+02)
 ('