# Python Data Types

Dynamic typing:

While a statically-typed language like C or Java requires each variable to be explicitly declared, a dynamically-typed language like Python skips this specification. For example, in C you might specify a particular operation as follows:

```C
/* C code */
int result = 0;
for(int i=0; i<100; i++){
    result += i;
}
```

While in Python the equivalent operation could be written this way:

```python
# Python code
result = 0
for i in range(100):
    result += i
```

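Notice the main difference: in C, the data type of each variable is explicitly declared, while in Python the types are dynamically inferred. This means, for example, that we can assign any kind of data to any variable: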

```python
# Python code
x = 4
x = "four"
```

Here we've switched the contents of ``x`` from an integer to a string. The same thing in C would lead (depending on compiler settings) to a compilation error:

```C
/* C code */
int x = 4;
x = "four";  // FAILS
```
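To see that the type travels with the value rather than with the variable name, a quick check with the built-in ``type`` function can help. This is a minimal sketch; the variable name ``x`` simply mirrors the snippets above.

```python
# A name in Python is just a label; the type belongs to the object it points to.
x = 4
type_before = type(x)      # int

x = "four"                 # rebinding the same name to a string is allowed
type_after = type(x)       # str

print(type_before.__name__, "->", type_after.__name__)
```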

### Python Integers: More Than Just an Integer

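In the standard C implementation of Python (CPython), every Python object is a cleverly disguised C structure that contains not only its value but other information as well. When we define an integer in Python, such as ``x = 10000``, ``x`` is not just a "raw" integer; it is a pointer to a compound C structure: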

```C
struct _longobject {
    long ob_refcnt;
    PyTypeObject *ob_type;
    size_t ob_size;
    long ob_digit[1];
};
```

A single integer in Python 3.4 actually contains four pieces:

- ``ob_refcnt``, a reference count that helps Python silently handle memory allocation and deallocation
- ``ob_type``, which encodes the type of the variable
- ``ob_size``, which specifies the size of the following data members
- ``ob_digit``, which contains the actual integer value that we expect the Python variable to represent.

This means that there is some overhead in storing an integer in Python as compared to an integer in a compiled language like C, as illustrated in the following figure:

![Integer Memory Layout](../figures/cint_vs_pyint.png)
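The overhead can be observed directly with ``sys.getsizeof``, which reports the size of an object in bytes. The exact numbers depend on the interpreter build, so treat this as a rough sketch that assumes CPython rather than a specification.

```python
import sys

small = 1
large = 10 ** 100          # needs many more internal digits than a small integer

size_small = sys.getsizeof(small)
size_large = sys.getsizeof(large)

# A raw C integer typically takes 4 or 8 bytes; even a small Python int is
# larger because it also stores a reference count, a type pointer, and a size.
print("small int:", size_small, "bytes")
print("large int:", size_large, "bytes")
```

The second number is larger because ``ob_digit`` grows with the magnitude of the value.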

### Python Lists: More Than Just a List

Consider now what happens when we use a Python data structure that holds many Python objects. The standard mutable multi-element container in Python is the list.

In [1]:
L = list(range(10))
L

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

In [2]:
type(L[0])

int

Or, similarly, a list of strings:

In [3]:
L2 = [str(c) for c in L]
L2

['0', '1', '2', '3', '4', '5', '6', '7', '8', '9']

In [4]:
type(L2[0])

str

Because of Python's dynamic typing, we can even create heterogeneous lists:

In [5]:
L3 = [True, "2", 3.0, 4]
[type(item) for item in L3]

[bool, str, float, int]

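This flexibility comes at a cost: to allow these flexible types, each item in the list must carry its own type information, reference count, and other bookkeeping; that is, __each item is a complete Python object__. In the special case where all items are of the same type, much of this information is redundant, and it can be much more efficient to store the data in a fixed-type array. The difference between a dynamic-type list and a fixed-type (NumPy-style) array is shown below.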

![Array Memory Layout](../figures/array_vs_list.png)
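A rough way to see this redundancy is to add up the memory used by a list and by every object it points to. The sketch below uses only the standard library; the helper name ``list_footprint`` is hypothetical, and the byte counts assume CPython.

```python
import sys

def list_footprint(items):
    """Approximate bytes used by a list plus the distinct objects it references."""
    container = sys.getsizeof(items)                  # the list's own pointer array
    unique_objects = {id(obj): obj for obj in items}  # count shared objects once
    contents = sum(sys.getsizeof(obj) for obj in unique_objects.values())
    return container + contents

values = list(range(1000, 2000))     # 1000 separate int objects
total = list_footprint(values)
per_item = total / len(values)

# Each element costs a pointer in the list *and* a full int object elsewhere,
# so the per-item cost is well above the 8 bytes a raw 64-bit integer needs.
print(f"{total} bytes total, about {per_item:.0f} bytes per item")
```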

### Fixed-Type Arrays in Python

Python offers several different options for storing data in efficient, fixed-type data buffers.
The built-in ``array`` module can be used to create dense arrays of a uniform type. In the cell below, ``'i'`` is a type code indicating that the contents are integers.

In [6]:
import array
L = list(range(10))
A = array.array('i', L)
A

array('i', [0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
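Two properties of ``array.array`` are worth checking for yourself: every element occupies the same fixed number of bytes, and values of the wrong type are rejected. A minimal sketch (the size of type code ``'i'`` is platform dependent, commonly 4 bytes):

```python
import array

A = array.array('i', range(10))

# itemsize is the number of bytes used by each element in the packed buffer
bytes_per_item = A.itemsize
buffer_bytes = bytes_per_item * len(A)
print(bytes_per_item, "bytes per item,", buffer_bytes, "bytes of data")

# Unlike a list, a typed array refuses values that do not match its type code
try:
    A.append(3.14)
    rejected = False
except TypeError:
    rejected = True
print("float rejected:", rejected)
```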

__NumPy's ``ndarray`` is much more useful__. While Python's ``array`` object provides efficient __storage__ of array-based data, NumPy adds efficient __operations__ on that data. We start with the standard NumPy import, under the alias ``np``:

In [7]:
import numpy as np
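The difference between storage and operations shows up as soon as you try arithmetic. In the sketch below, multiplying a standard-library ``array`` by 2 repeats its contents (sequence semantics), while multiplying a NumPy array by 2 doubles each element. The variable names are illustrative only.

```python
import array
import numpy as np

packed = array.array('i', [1, 2, 3])
nd = np.array([1, 2, 3])

repeated = packed * 2        # sequence repetition, like a list
doubled = nd * 2             # element-wise arithmetic

print(repeated)              # array('i', [1, 2, 3, 1, 2, 3])
print(doubled)               # [2 4 6]
```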

### Creating Arrays from Python Lists

First, we can use ``np.array`` to create arrays from Python lists:

In [8]:
# integer array:
np.array([1, 4, 2, 5, 3])

array([1, 4, 2, 5, 3])

Unlike Python lists, a NumPy array is constrained to hold elements of a single type. If the types do not match, NumPy will upcast if possible (here, integers are upcast to floating point):

In [9]:
np.array([3.14, 4, 2, 3])

array([ 3.14,  4.  ,  2.  ,  3.  ])
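The result of upcasting can be inspected through the array's ``dtype`` attribute. A small sketch; the exact integer and string widths vary by platform, so only the general kind of type is printed for those cases:

```python
import numpy as np

ints = np.array([1, 4, 2, 5, 3])
mixed = np.array([3.14, 4, 2, 3])      # one float forces every element to float
with_text = np.array([1, 2, "three"])  # a string forces every element to string

# dtype.kind is a one-letter code: 'i' integer, 'f' float, 'U' unicode string
print(ints.dtype.kind, mixed.dtype.kind, with_text.dtype.kind)
print(mixed.dtype)                     # float64
```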

If we want to explicitly set the data type of the resulting array, we can use the ``dtype`` keyword:

In [10]:
np.array([1, 2, 3, 4], dtype='float32')

array([ 1.,  2.,  3.,  4.], dtype=float32)
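The chosen ``dtype`` determines how many bytes each element takes, and an existing array can be converted with ``astype``. A brief sketch; note that converting floats to integers truncates toward zero rather than rounding:

```python
import numpy as np

single = np.array([1, 2, 3, 4], dtype='float32')
double = np.array([1, 2, 3, 4], dtype='float64')
print(single.itemsize, double.itemsize)    # 4 and 8 bytes per element

# astype returns a new array with the requested type
truncated = np.array([1.7, -2.9, 3.2]).astype(int)
print(truncated)                           # [ 1 -2  3]
```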

Unlike Python lists, NumPy arrays can explicitly be multidimensional. Here's one way of initializing a multidimensional array using a list of lists:

In [11]:
# nested lists result in multi-dimensional arrays
np.array([range(i, i + 3) for i in [2, 4, 6]])

array([[2, 3, 4],
       [4, 5, 6],
       [6, 7, 8]])

The inner lists are treated as rows of the resulting two-dimensional array.
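The ``ndim`` and ``shape`` attributes confirm how the nested sequences were interpreted. A minimal sketch using the same input as above (the name ``grid`` is illustrative):

```python
import numpy as np

grid = np.array([range(i, i + 3) for i in [2, 4, 6]])

print(grid.ndim)      # 2 dimensions
print(grid.shape)     # (3, 3): three rows, three columns
first_row = grid[0]   # the first inner sequence becomes the first row
print(first_row)      # [2 3 4]
```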

### Creating Arrays from Scratch

Especially for larger arrays, it is more efficient to create arrays from scratch using routines built into NumPy.

In [12]:
np.zeros(10, dtype=int)                     # Create a length-10 integer array filled with zeros

array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0])

In [13]:
np.ones((3, 5), dtype=float)                # Create a 3x5 floating-point array filled with ones

array([[ 1.,  1.,  1.,  1.,  1.],
       [ 1.,  1.,  1.,  1.,  1.],
       [ 1.,  1.,  1.,  1.,  1.]])

In [14]:
np.full((3, 5), 3.14)                       # Create a 3x5 array filled with 3.14

array([[ 3.14,  3.14,  3.14,  3.14,  3.14],
       [ 3.14,  3.14,  3.14,  3.14,  3.14],
       [ 3.14,  3.14,  3.14,  3.14,  3.14]])
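One detail worth knowing: ``np.zeros`` and ``np.ones`` default to floating point when no ``dtype`` is given, while ``np.full`` infers the type from the fill value. A quick sketch (the integer width depends on the platform, so only its kind is printed):

```python
import numpy as np

z = np.zeros(3)                  # no dtype given: floating point by default
f_float = np.full((2, 2), 3.14)  # float fill value gives a float array
f_int = np.full((2, 2), 7)       # integer fill value gives an integer array

print(z.dtype, f_float.dtype, f_int.dtype.kind)
```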

In [15]:
np.arange(0, 20, 2)                         # Create an array filled with a linear sequence
                                            # starting at 0, stopping before 20, stepping by 2
                                            # (this is similar to the built-in range() function)

array([ 0,  2,  4,  6,  8, 10, 12, 14, 16, 18])

In [16]:
np.linspace(0, 1, 5)                        # Create an array of five values evenly spaced between 0 and 1

array([ 0.  ,  0.25,  0.5 ,  0.75,  1.  ])
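``np.arange`` and ``np.linspace`` treat the end point differently: ``arange`` stops before it, while ``linspace`` includes it by default. A short sketch comparing the two:

```python
import numpy as np

stepped = np.arange(0, 20, 2)        # step size fixed, stop value excluded
spaced = np.linspace(0, 1, 5)        # number of points fixed, end point included

print(stepped[-1])                   # 18
print(spaced[-1])                    # 1.0

# endpoint=False makes linspace exclude the end point as well
half_open = np.linspace(0, 1, 5, endpoint=False)
print(half_open)
```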

In [17]:
np.random.random((3, 3))                    # Create a 3x3 array of uniformly distributed
                                            # random values between 0 and 1

array([[ 0.31877439,  0.77140697,  0.72730994],
       [ 0.82651878,  0.43438584,  0.07286639],
       [ 0.11250735,  0.94641138,  0.12343997]])

In [18]:
np.random.normal(0, 1, (3, 3))              # 3x3 array, normally distributed random values, mean=0, stdev=1

array([[ 0.62565626, -1.40674003,  0.80571348],
       [-0.7971686 ,  1.01344498, -1.31942534],
       [-0.29057927, -0.45832044, -0.602875  ]])

In [19]:
np.random.randint(0, 10, (3, 3))            # 3x3 array, random integers in the half-open interval [0, 10)

array([[5, 1, 9],
       [0, 2, 6],
       [1, 7, 5]])
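Random routines return different values on every run, so the outputs shown above will not match yours. If repeatable results are needed, one option is to seed the generator first; the seed value 0 below is arbitrary. The sketch also checks that ``randint`` never returns its upper bound.

```python
import numpy as np

np.random.seed(0)                          # arbitrary seed for repeatability
first = np.random.randint(0, 10, (3, 3))

np.random.seed(0)                          # same seed, same sequence
second = np.random.randint(0, 10, (3, 3))

print(np.array_equal(first, second))       # True
print(first.min() >= 0, first.max() < 10)  # True True
```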

In [20]:
np.eye(3)                                   # 3x3 identity matrix

array([[ 1.,  0.,  0.],
       [ 0.,  1.,  0.],
       [ 0.,  0.,  1.]])

In [21]:
np.empty(3)                                 # Create an uninitialized array; values are whatever already exists in memory

array([ 1.,  1.,  1.])
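Because the contents of an ``np.empty`` array are arbitrary, it is safest to assign every element before reading any of them. A minimal sketch of that pattern (the name ``buf`` is illustrative):

```python
import numpy as np

buf = np.empty(3)            # shape and dtype are set, values are not
print(buf.shape, buf.dtype)  # (3,) float64

# Fill the buffer before using it; never rely on the initial contents
for i in range(buf.size):
    buf[i] = i ** 2
print(buf)                   # [0. 1. 4.]
```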

## NumPy Standard Data Types

NumPy arrays contain values of a single type, so it is important to understand those types and their limitations.
Because NumPy is built in C, the types will be familiar to users of C, Fortran, and other related languages. The standard types are listed in the table below. When constructing an array, a data type can be specified using a string:

```python
np.zeros(10, dtype='int16')
```

Or using the associated NumPy object:

```python
np.zeros(10, dtype=np.int16)
```
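Both spellings produce the same result, which can be confirmed by comparing the resulting ``dtype`` objects. A quick sketch:

```python
import numpy as np

from_string = np.zeros(10, dtype='int16')
from_object = np.zeros(10, dtype=np.int16)

print(from_string.dtype == from_object.dtype)   # True
print(from_string.itemsize)                     # 2 bytes per element
```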

| Data type     | Description |
|---------------|-------------|
| ``bool_``     | Boolean (True or False) stored as a byte |
| ``int_``      | Default integer type (same as C ``long``; normally either ``int64`` or ``int32``)| 
| ``intc``      | Identical to C ``int`` (normally ``int32`` or ``int64``)| 
| ``intp``      | Integer used for indexing (same as C ``ssize_t``; normally either ``int32`` or ``int64``)| 
| ``int8``      | Byte (-128 to 127)| 
| ``int16``     | Integer (-32768 to 32767)|
| ``int32``     | Integer (-2147483648 to 2147483647)|
| ``int64``     | Integer (-9223372036854775808 to 9223372036854775807)| 
| ``uint8``     | Unsigned integer (0 to 255)| 
| ``uint16``    | Unsigned integer (0 to 65535)| 
| ``uint32``    | Unsigned integer (0 to 4294967295)| 
| ``uint64``    | Unsigned integer (0 to 18446744073709551615)| 
| ``float_``    | Shorthand for ``float64`` (this alias was removed in NumPy 2.0)|
| ``float16``   | Half precision float: sign bit, 5 bits exponent, 10 bits mantissa| 
| ``float32``   | Single precision float: sign bit, 8 bits exponent, 23 bits mantissa| 
| ``float64``   | Double precision float: sign bit, 11 bits exponent, 52 bits mantissa| 
| ``complex_``  | Shorthand for ``complex128`` (this alias was removed in NumPy 2.0)|
| ``complex64`` | Complex number, represented by two 32-bit floats| 
| ``complex128``| Complex number, represented by two 64-bit floats| 
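
The ranges and bit layouts in the table need not be memorized: ``np.iinfo`` and ``np.finfo`` report them at run time. The sketch below also shows that fixed-width integer arrays wrap around silently on overflow, which is one of the limitations to keep in mind.

```python
import numpy as np

i8 = np.iinfo(np.int8)
print(i8.min, i8.max)                   # -128 127

f32 = np.finfo(np.float32)
print(f32.bits, f32.nmant)              # 32 bits in total, 23 mantissa bits

# Adding 1 to the largest int8 value wraps around to the smallest
top = np.array([127], dtype=np.int8)
one = np.array([1], dtype=np.int8)
wrapped = top + one
print(wrapped)                          # [-128]
```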