# Chapter 4. Understanding Data Types in Python

## A Python Integer is More Than Just an Integer
---

The standard Python implementation (CPython) is written in C. This means that every Python object is a cleverly disguised C structure. When we define an integer in Python, such as `x = 1000`, `x` is not just a "raw" integer. It's actually a pointer to a compound C structure containing several values.

A single integer in Python contains four pieces:
- `ob_refcnt`
  : a reference count that helps Python silently handle memory allocation and deallocation
- `ob_type`
  : encodes the type of the variable
- `ob_size`
  : specifies the size of the following data members
- `ob_digit`
  : contains the actual integer value that we expect the Python variable to represent
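
The overhead of this structure can be observed from Python itself. The following is a minimal sketch, assuming CPython; the exact byte counts depend on the Python version and build, so no specific sizes are claimed here.

```python
import sys

x = 1000

# sys.getsizeof reports the size of the whole object,
# not just the bytes needed to hold the value itself.
int_object_size = sys.getsizeof(x)

# For comparison, a raw 64-bit C integer needs only 8 bytes.
raw_c_int_size = 8

print("Python int object:", int_object_size, "bytes")
print("Raw 64-bit integer:", raw_c_int_size, "bytes")

# Larger values need more internal digits, so the object grows.
big_object_size = sys.getsizeof(10 ** 100)
print("Python int holding 10**100:", big_object_size, "bytes")
```

The extra bytes beyond the raw value are the bookkeeping fields described above.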


## A Python List is More Than Just a List
---

Let's consider what happens when we use a Python data structure that holds many Python objects. The standard mutable multi-element container in Python is the list.

In [1]:
L = list(range(10))
L2 = [str(c) for c in L]
print('L = ', L, '\n', type(L[0]))
print('\nL2 = ', L2, "\n", type(L2[0]))

L =  [0, 1, 2, 3, 4, 5, 6, 7, 8, 9] 
 <class 'int'>

L2 =  ['0', '1', '2', '3', '4', '5', '6', '7', '8', '9'] 
 <class 'str'>


Because of Python's dynamic typing, we can create heterogeneous lists.

In [2]:
L3 = [True, '2', 3.0, 4]
[type(item) for item in L3]

[bool, str, float, int]

This flexibility comes at a cost: to allow flexible types, each item in the list must carry its own type information, reference count, and other metadata; that is, each item is a complete Python object. When all the items are of the same type, much of this information is redundant, and it can be much more **efficient** to store the data in a fixed-type array.

The difference lies in the memory layout. A fixed-type array essentially contains a single pointer to one contiguous block of data. The Python list, on the other hand, contains a pointer to a block of pointers, each of which in turn points to a full Python object. The advantage of the list is that it can be filled with data of any desired type. Fixed-type **NumPy**-style arrays lack this flexibility, but are much more efficient for storing and manipulating data.
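
The difference in memory use between the two layouts can be estimated directly. This is a rough sketch, assuming CPython and a 64-bit integer dtype; `nbytes` counts only the array's data buffer (not its small fixed header), and the list total varies by Python version.

```python
import sys
import numpy as np

n = 1000
py_list = list(range(n))
np_array = np.arange(n, dtype=np.int64)

# The list stores n pointers, and each pointer refers to a full int object.
list_bytes = sys.getsizeof(py_list) + sum(sys.getsizeof(item) for item in py_list)

# The array stores n raw 8-byte integers in one contiguous buffer.
array_bytes = np_array.nbytes

print("list (pointers + int objects):", list_bytes, "bytes")
print("array data buffer:", array_bytes, "bytes")
```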

## Creating Arrays from Python Lists
---

We'll start with the standard NumPy import, under the alias `np`.

In [3]:
import numpy as np

We can use `np.array` to create arrays from Python lists.

In [4]:
# Integer array
np.array([1, 2, 3, 4, 5])

array([1, 2, 3, 4, 5])

Unlike Python lists, NumPy arrays can only contain data of a single type. If the types do not match, NumPy will **upcast** them according to its type promotion rules; in the example here, integers are upcast to floating point.

In [5]:
np.array([3.14, 1, 2, 3])

array([3.14, 1.  , 2.  , 3.  ])
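
The result of upcasting can be confirmed by inspecting the `dtype` attribute. The short sketch below also uses `np.result_type`, which reports the dtype NumPy's promotion rules would select for a given combination; the variable names are illustrative only.

```python
import numpy as np

mixed = np.array([3.14, 1, 2, 3])
ints_only = np.array([1, 2, 3])

print(mixed.dtype)      # a floating-point dtype, because of upcasting
print(ints_only.dtype)  # the platform's default integer dtype

# Ask NumPy which dtype results from combining an integer and a float type.
promoted = np.result_type(np.int64, np.float64)
print(promoted)
```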

Use the `dtype` keyword to explicitly set the data type of the resulting array:

In [6]:
np.array([1, 2, 3, 4], dtype = np.float32)

array([1., 2., 3., 4.], dtype=float32)

NumPy arrays can be multidimensional. Here's one way of initializing a multidimensional array using a list of lists.

In [7]:
# Nested lists resulting in multidimensional arrays
np.array([range(i, i + 3) for i in [1, 2, 3]])

array([[1, 2, 3],
       [2, 3, 4],
       [3, 4, 5]])
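
The inner lists are treated as the rows of the resulting two-dimensional array. As a small illustration, the `ndim`, `shape`, and `size` attributes describe its layout (the name `grid` is chosen here only for the example):

```python
import numpy as np

grid = np.array([range(i, i + 3) for i in [1, 2, 3]])

print(grid.ndim)   # number of dimensions: 2
print(grid.shape)  # size along each dimension: (3, 3)
print(grid.size)   # total number of elements: 9
```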

## Creating Arrays from Scratch
---

Especially for larger arrays, it is more efficient to create arrays from scratch using routines built into NumPy.

In [12]:
# Create a length-10 integer array filled with 0s
np.zeros(10, dtype = int)

array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0])

In [13]:
# Create a 3x5 floating-point array filled with 1s
np.ones((3,5), dtype = float)

array([[1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1.]])

In [14]:
# Create a 3x5 floating-point array filled with 3.14
np.full((3,5), 3.14)

array([[3.14, 3.14, 3.14, 3.14, 3.14],
       [3.14, 3.14, 3.14, 3.14, 3.14],
       [3.14, 3.14, 3.14, 3.14, 3.14]])

In [16]:
# Create an array filled with a linear sequence starting at 0, stopping before 20, stepping by 2
np.arange(0, 20, 2)

array([ 0,  2,  4,  6,  8, 10, 12, 14, 16, 18])

In [17]:
# Create an array of five values evenly spaced between 0 and 1
np.linspace(0, 1, 5)

array([0.  , 0.25, 0.5 , 0.75, 1.  ])
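
One difference between these two routines is easy to miss: `np.arange` excludes its stop value, while `np.linspace` includes it by default. A brief sketch comparing the two (the variable names are illustrative):

```python
import numpy as np

stepped = np.arange(0, 20, 2)  # takes a step size; the stop value 20 is excluded
spaced = np.linspace(0, 1, 5)  # takes a number of points; the stop value 1 is included

print(len(stepped), stepped[-1])  # 10 elements, last one is 18
print(len(spaced), spaced[-1])    # 5 elements, last one is 1.0
```

As a rule of thumb, `np.arange` is convenient when the step matters, and `np.linspace` when the number of points matters.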

In [18]:
# Create a 3x3 array of uniformly distributed pseudorandom values between 0 and 1
np.random.random((3,3))

array([[0.67230287, 0.87408131, 0.90666851],
       [0.59439528, 0.15295102, 0.48039788],
       [0.23361915, 0.24837909, 0.19367176]])

In [19]:
# Create a 3x3 array of normally distributed pseudorandom values with mean 0 and standard deviation 1
np.random.normal(0, 1, (3,3))

array([[-0.04157178,  0.04911098, -0.13414949],
       [-0.4704755 ,  0.29945024,  0.6916425 ],
       [-0.56272128, -0.4854479 , -1.7880196 ]])

In [22]:
# Create a 3x3 array of pseudorandom integers in the interval [0,10)
np.random.randint(0, 10, (3,3))

array([[3, 7, 4],
       [2, 4, 8],
       [1, 4, 9]])
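
The pseudorandom outputs shown above will differ from run to run. When reproducible values are needed, a seeded generator can be used. The sketch below uses `np.random.default_rng`, NumPy's newer `Generator` interface (available since NumPy 1.17), rather than the legacy `np.random.*` functions used in the cells above; the seed value 42 is arbitrary.

```python
import numpy as np

# Two generators created with the same seed produce the same sequence.
rng_a = np.random.default_rng(seed=42)
rng_b = np.random.default_rng(seed=42)

uniform_a = rng_a.random((3, 3))
uniform_b = rng_b.random((3, 3))
print(np.array_equal(uniform_a, uniform_b))  # True

# Generator counterparts of the integer and normal draws shown above.
ints = rng_a.integers(0, 10, size=(3, 3))  # integers in the interval [0, 10)
normals = rng_a.normal(0, 1, size=(3, 3))  # mean 0, standard deviation 1
print(ints)
print(normals)
```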

In [23]:
# Create a 3x3 identity matrix
np.eye(3)

array([[1., 0., 0.],
       [0., 1., 0.],
       [0., 0., 1.]])

In [24]:
# Create an uninitialized array of three floats (the default dtype);
# the values will be whatever happens to already exist at that memory location
np.empty(3)

array([1., 1., 1.])
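
Without a `dtype` argument, `np.empty` returns a floating-point array, which is why the output above shows floats. A short sketch of requesting an integer array instead; the contents are arbitrary in both cases, so only the dtypes are printed:

```python
import numpy as np

default_empty = np.empty(3)         # default dtype is floating point
int_empty = np.empty(3, dtype=int)  # explicitly request integers

print(default_empty.dtype)
print(int_empty.dtype)
```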

## NumPy Standard Data Types
---

Note that when constructing an array, the standard NumPy data types can be specified using a string:
```
np.zeros(10, dtype='int16')
```
Or using the associated NumPy object:
```
np.zeros(10, dtype=np.int16)
```
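
Both spellings produce the same dtype. A quick check, which also shows how the chosen type determines storage per element (variable names are illustrative):

```python
import numpy as np

from_string = np.zeros(10, dtype='int16')
from_object = np.zeros(10, dtype=np.int16)

print(from_string.dtype == from_object.dtype)  # True
print(from_string.itemsize)  # bytes per element: 2
print(from_string.nbytes)    # total bytes in the data buffer: 20
```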

A table of standard NumPy data types:

|Data type |Description|
|----------|-----------|
|bool_|Boolean (True or False) stored as a byte|
|int_|Default integer type (normally either int32 or int64)|
|intc|Identical to C `int` (normally int32 or int64)|
|intp|Integer used for indexing (same as C `ssize_t`; normally either int32 or int64)|
|int8|Byte (-128 to 127)|
|int16|Integer (-32768 to 32767)|
|int32|Integer (-2147483648 to 2147483647)|
|int64|Integer (-9223372036854775808 to 9223372036854775807)|
|uint8|Unsigned integer (0 to 255)|
|uint16|Unsigned integer (0 to 65535)|
|uint32|Unsigned integer (0 to 4294967295)|
|uint64|Unsigned integer (0 to 18446744073709551615)|
|float_|Shorthand for `float64`|
|float16|Half-precision float: sign bit, 5 bits exponent, 10 bits mantissa|
|float32|Single-precision float: sign bit, 8 bits exponent, 23 bits mantissa|
|float64|Double-precision float: sign bit, 11 bits exponent, 52 bits mantissa|
|complex_|Shorthand for `complex128`|
|complex64|Complex number, represented by two 32-bit floats|
|complex128|Complex number, represented by two 64-bit floats|
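
The ranges and bit layouts listed in the table can be queried from NumPy rather than memorized. A minimal sketch using `np.iinfo` for integer types and `np.finfo` for floating-point types:

```python
import numpy as np

int8_info = np.iinfo(np.int8)
uint16_info = np.iinfo(np.uint16)
float32_info = np.finfo(np.float32)

print(int8_info.min, int8_info.max)      # -128 127
print(uint16_info.min, uint16_info.max)  # 0 65535

# Total width in bits and number of mantissa bits for float32.
print(float32_info.bits, float32_info.nmant)  # 32 23
```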