
*This notebook contains an excerpt from the [Python Data Science Handbook](http://shop.oreilly.com/product/0636920034919.do) by Jake VanderPlas; the content is available [on GitHub](https://github.com/jakevdp/PythonDataScienceHandbook).*


# Creating Arrays from Python Lists

## NumPy:
NumPy is a Python library for working with arrays. It also provides functions for linear algebra, Fourier transforms, and matrix operations. NumPy was created in 2005 by Travis Oliphant.


In [2]:
import numpy as np
# Import the NumPy library. Its core is written in C, which makes array creation, slicing, and element modification fast.

To create arrays from Python lists, we can use `np.array`:

In [3]:
#integer array:
np.array([1,4,2,5,3])

array([1, 4, 2, 5, 3])

> NumPy is constrained to arrays whose elements all share the same data type. Make sure the types match.

---
Unlike a Python list, a NumPy array holds elements of a single type.
If the types do not match, NumPy will upcast if possible (here, integers are upcast to floating point):


In [4]:
np.array([3, 1.4]) # the integer 3 is upcast to floating point


array([3. , 1.4])

In [5]:
np.array([1, "Sai Kiran"]) # the integer 1 is converted to a string

array(['1', 'Sai Kiran'], dtype='<U21')
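The upcasting rule can be checked by inspecting the `dtype` of each result. The following is a minimal sketch; the variable names are illustrative, and `np.result_type` is used only to query the promoted type without building an array.

```python
import numpy as np

# Mixed int/float input: the integers are upcast to float64.
mixed_numeric = np.array([3, 1.4])
print(mixed_numeric.dtype)       # float64

# Mixed int/str input: every element becomes a fixed-width Unicode string.
mixed_text = np.array([1, "Sai Kiran"])
print(mixed_text.dtype.kind)     # U

# Query the promoted dtype for a pair of dtypes directly.
promoted = np.result_type(np.int32, np.float32)
print(promoted)                  # float64
```

Note that promotion picks a type that can represent both inputs, which is why `int32` combined with `float32` yields `float64` rather than `float32`.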

To set the data type of the resulting array explicitly, use the `dtype` keyword:

In [6]:
np.array([1,2,3,4], dtype='float32')

array([1., 2., 3., 4.], dtype=float32)
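As a small sketch of what the `dtype` keyword controls, the example below inspects the per-element storage size and converts an existing array with `astype`. The variable names are assumptions made for illustration.

```python
import numpy as np

values = np.array([1, 2, 3, 4], dtype='float32')
print(values.dtype, values.itemsize)   # float32 4

# astype returns a converted copy; the original array keeps its dtype.
as_int = values.astype('int64')
print(as_int.dtype, values.dtype)      # int64 float32
```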

Finally, unlike Python lists, NumPy arrays can explicitly be multidimensional; here's one way of initializing a multidimensional array using a list of lists:


In [7]:
# nested lists result in multidimensional arrays
np.array([range(i,i+3) for i in [2,4,6]])

array([[2, 3, 4],
       [4, 5, 6],
       [6, 7, 8]])
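To confirm that the nested lists really produce a two-dimensional array, the attributes `ndim`, `shape`, and `size` can be inspected. This is a minimal sketch; `grid` is an illustrative name.

```python
import numpy as np

grid = np.array([range(i, i + 3) for i in [2, 4, 6]])

print(grid.ndim)    # 2
print(grid.shape)   # (3, 3)
print(grid.size)    # 9

# Elements are addressed as [row, column].
print(grid[1, 2])   # 6
```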

## Creating Arrays from Scratch

Especially for larger arrays, it is more efficient to create arrays from scratch using routines built into NumPy.
Here are several examples:

In [8]:
# create a length 10 integer array filled with zeros
np.zeros(10, dtype=int)

array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0])

In [15]:
# create a 3x5 floating-point array filled with ones
# NOTE: the shape tuple is (rows, columns), so this creates 3 rows and 5 columns
np.ones((3,5), dtype=float)

array([[1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1.]])

In [16]:
# create a 4x4 array filled with 3.14
np.full((4,4), 3.14)

array([[3.14, 3.14, 3.14, 3.14],
       [3.14, 3.14, 3.14, 3.14],
       [3.14, 3.14, 3.14, 3.14],
       [3.14, 3.14, 3.14, 3.14]])
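The built-in routines give the same result as converting a Python list, but without first building the list. The sketch below compares the two approaches and also shows `np.full_like`, which reuses the shape of an existing array; the names and the size `n` are illustrative.

```python
import numpy as np

n = 1_000

# Builds a Python list of n integers first, then copies it into an array.
from_list = np.array([0] * n)

# Allocates and fills the array directly.
from_routine = np.zeros(n, dtype=int)

print(np.array_equal(from_list, from_routine))   # True

# full_like takes its shape and dtype from an existing array.
template = np.ones((3, 5), dtype=float)
filled = np.full_like(template, 3.14)
print(filled.shape)                              # (3, 5)
```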

Create an array filled with a linear sequence,
starting at 0, ending before 20, and stepping by 2.
> This is similar to the built-in `range()` function.

In [20]:
np.arange(0,20,2)
# yields every second integer from 0 up to, but not including, 20


array([ 0,  2,  4,  6,  8, 10, 12, 14, 16, 18])

In [23]:
# create an array of five values evenly spaced between 0 and 1
np.linspace(0,1,5)

array([0.  , 0.25, 0.5 , 0.75, 1.  ])
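The two routines treat the end value differently: `np.arange` excludes the stop value, while `np.linspace` includes it by default. The following sketch makes that difference explicit; the variable names are illustrative.

```python
import numpy as np

# arange takes a step; the stop value is excluded.
stepped = np.arange(0, 20, 2)
print(stepped[-1], len(stepped))   # 18 10

# linspace takes a count; the end value is included by default.
spaced = np.linspace(0, 1, 5)
print(spaced[-1])                  # 1.0

# endpoint=False makes linspace half-open, like arange.
half_open = np.linspace(0, 1, 5, endpoint=False)
print(half_open)                   # [0.  0.2 0.4 0.6 0.8]
```

As a rule of thumb, `arange` fits when the step size matters and `linspace` fits when the number of points matters.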

In [26]:
# create a 3x3 array of uniformly distributed random values in the interval [0, 1)
np.random.random((3,3))

array([[0.57509204, 0.91029227, 0.75195422],
       [0.25095154, 0.75570875, 0.29377533],
       [0.07702809, 0.04670889, 0.74698428]])

In [27]:
# create a 3x3 array of normally distributed random values with mean 0 and standard deviation 1
np.random.normal(0,1, (3,3))

array([[ 0.36381764, -0.95437253, -1.19764589],
       [ 0.6287819 ,  1.98287595, -0.03581534],
       [ 1.86410416, -1.37274057,  1.53250398]])

In [31]:
# create a 3x3 array of random integers in the interval [0, 10)
np.random.randint(0,10, (3, 3))

array([[8, 6, 2],
       [8, 8, 0],
       [7, 2, 0]])
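The random outputs above change on every run. One way to make them reproducible is NumPy's `Generator` interface, created with `np.random.default_rng` and a fixed seed. This is a sketch of an alternative to the `np.random.*` functions used above, not the notebook's own approach; the seed value 42 and the variable names are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(seed=42)

uniform = rng.random((3, 3))             # uniform values in [0, 1)
normal = rng.normal(0, 1, (3, 3))        # mean 0, standard deviation 1
integers = rng.integers(0, 10, (3, 3))   # integers in [0, 10)

# A second generator with the same seed replays the same sequence.
replay = np.random.default_rng(seed=42)
print(np.array_equal(uniform, replay.random((3, 3))))   # True
```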

In [33]:
# create a 3x3 identity matrix
np.eye(3)

array([[1., 0., 0.],
       [0., 1., 0.],
       [0., 0., 1.]])
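The defining property of an identity matrix is that multiplying by it leaves another matrix unchanged. A minimal check, with illustrative names:

```python
import numpy as np

identity = np.eye(3)
matrix = np.arange(9.0).reshape(3, 3)

# Matrix multiplication by the identity returns the original matrix.
print(np.array_equal(identity @ matrix, matrix))   # True

# eye returns floats by default; pass dtype for an integer identity.
int_identity = np.eye(3, dtype=int)
print(int_identity.dtype.kind)                     # i
```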

In [36]:
# create an uninitialized array of three floats (float64 is the default dtype)
# The values will be whatever happens to already exist at that memory location
np.empty(3)

array([1., 1., 1.])
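Because the contents of an `np.empty` array are arbitrary, a typical pattern is to allocate the buffer and then overwrite every element before reading it. A small sketch, assuming the illustrative name `buffer`:

```python
import numpy as np

buffer = np.empty(3)
print(buffer.dtype)           # float64

# Assign every element before using the array.
buffer[:] = [1.0, 2.0, 3.0]
print(buffer)                 # [1. 2. 3.]
```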

# NumPy Standard Data Types

NumPy arrays contain values of a single type, so it is important to have detailed knowledge of those types and their limitations. Because NumPy is built in C, the types will be familiar to users of C, Fortran, and other related languages.

The standard NumPy data types are listed in the table below. Note that when constructing an array, they can be specified using a string:


In [37]:
np.zeros(10, dtype='int16')

array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0], dtype=int16)

Or using the associated NumPy object:

In [39]:
np.zeros(10, dtype=np.int16)

array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0], dtype=int16)
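The string and the NumPy object refer to the same data type, which can be verified by comparing the resulting dtypes. The sketch below also uses the short code `'i2'` (a 2-byte signed integer) as a third spelling; the variable names are illustrative.

```python
import numpy as np

from_string = np.zeros(10, dtype='int16')
from_object = np.zeros(10, dtype=np.int16)

print(from_string.dtype == from_object.dtype)   # True
print(np.dtype('int16') == np.int16)            # True
print(np.dtype('i2') == np.int16)               # True
```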

| Data type     | Description |
|---------------|-------------|
| ``bool_``     | Boolean (True or False) stored as a byte |
| ``int_``      | Default integer type (same as C ``long``; normally either ``int64`` or ``int32``)|
| ``intc``      | Identical to C ``int`` (normally ``int32`` or ``int64``)|
| ``intp``      | Integer used for indexing (same as C ``ssize_t``; normally either ``int32`` or ``int64``)|
| ``int8``      | Byte (-128 to 127)|
| ``int16``     | Integer (-32768 to 32767)|
| ``int32``     | Integer (-2147483648 to 2147483647)|
| ``int64``     | Integer (-9223372036854775808 to 9223372036854775807)|
| ``uint8``     | Unsigned integer (0 to 255)|
| ``uint16``    | Unsigned integer (0 to 65535)|
| ``uint32``    | Unsigned integer (0 to 4294967295)|
| ``uint64``    | Unsigned integer (0 to 18446744073709551615)|
| ``float_``    | Shorthand for ``float64`` (this alias was removed in NumPy 2.0; use ``float64`` there).|
| ``float16``   | Half precision float: sign bit, 5 bits exponent, 10 bits mantissa|
| ``float32``   | Single precision float: sign bit, 8 bits exponent, 23 bits mantissa|
| ``float64``   | Double precision float: sign bit, 11 bits exponent, 52 bits mantissa|
| ``complex_``  | Shorthand for ``complex128`` (this alias was removed in NumPy 2.0; use ``complex128`` there).|
| ``complex64`` | Complex number, represented by two 32-bit floats|
| ``complex128``| Complex number, represented by two 64-bit floats|
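The ranges and bit widths in the table can be queried at runtime with `np.iinfo` (integer types) and `np.finfo` (floating-point types). The sketch below also shows one practical consequence of the fixed ranges: arithmetic on integer arrays wraps around silently when a result falls outside the range. The variable names are illustrative.

```python
import numpy as np

# Integer limits, matching the table rows for int8 and uint16.
print(np.iinfo(np.int8).min, np.iinfo(np.int8).max)   # -128 127
print(np.iinfo(np.uint16).max)                        # 65535

# Mantissa widths, matching the float32 and float64 rows.
print(np.finfo(np.float32).nmant)                     # 23
print(np.finfo(np.float64).nmant)                     # 52

# Array arithmetic that overflows int8 wraps around instead of raising.
a = np.array([127], dtype=np.int8)
b = np.array([1], dtype=np.int8)
wrapped = a + b
print(wrapped)                                        # [-128]
```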