# Creating ndarrays
-----------------

Effective data-driven science and computation requires understanding how data is stored and manipulated.
The main difference between a statically typed language such as C and a dynamically typed language such as Python:
* in C the data types of each variable are **explicitly declared**
* in Python the types are **dynamically inferred**

This flexibility is part of what makes Python convenient and easy to use.

### 1. A Python integer is more than just an integer
-------------------------

The standard Python implementation (CPython) is written in C:
* every Python object is simply a cleverly-disguised C structure, which contains not only its value, but other information as well
* a variable holding an integer is actually a pointer to a compound C structure that looks roughly like this:

```C
struct _longobject {
    long ob_refcnt;
    PyTypeObject *ob_type;
    size_t ob_size;
    long ob_digit[1];
};
```
This structure contains four pieces:
- ``ob_refcnt``, a reference count that helps Python silently handle memory allocation and deallocation
- ``ob_type``, which encodes the type of the variable
- ``ob_size``, which specifies the size of the following data members
- ``ob_digit``, which contains the actual integer value that we expect the Python variable to represent.

![Integer Memory Layout](cint_vs_pyint.png)

* A C integer is essentially a label for a position in memory whose bytes encode an integer value.
* A Python integer is a pointer to a position in memory containing all the Python object information:
    * ``PyObject_HEAD`` is the part of the structure containing the reference count, type code, and other pieces
    * This extra information is what allows Python to be coded so freely and dynamically
    * This extra information comes at a cost, which becomes especially apparent in structures that combine many of these objects
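The per-object overhead described above can be observed directly. The following is a minimal sketch, assuming CPython and an installed NumPy; the exact byte counts depend on the Python version and platform, so none are hard-coded here.

```python
import sys
import numpy as np

small = 1
big = 2 ** 100  # needs more internal "digits" than a small integer

# Size of the whole Python object: header fields plus the digits
size_small = sys.getsizeof(small)
size_big = sys.getsizeof(big)

# Size of a raw fixed-width 64-bit integer, as stored in a NumPy array
raw = np.dtype(np.int64).itemsize  # 8

print(size_small, size_big, raw)
```

On a typical CPython build the Python integer takes several times the 8 bytes of the raw value, and it grows as the number gets larger.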

### 2. A Python list is more than just a list
-------------------
 
The list is the standard mutable multi-element container in Python.


In [None]:
ls = list(range(10))
ls

In [None]:
type(ls[0])

Because of Python's dynamic typing, lists can be heterogeneous:

In [None]:
lm = [True, "2", 3.0, 4]
[type(item) for item in lm]

* To allow these flexible types, each item in the list must contain its own extra information.
* In the special case that all variables are of **the same type**, this information is redundant: it can be much more efficient to store data in a fixed-type array.
* Fixed-type **NumPy** arrays lack Python flexibility, but are much more efficient for **storing and manipulating data**.

![Array Memory Layout](array_vs_list.png)
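A rough way to see the redundancy described above is to compare the memory used by a list of integers with a fixed-type array holding the same values. This is only a sketch, assuming CPython: the sum over list items is an estimate, since small integers are cached and shared by the interpreter.

```python
import sys
import numpy as np

n = 1000
values = list(range(n))
arr = np.arange(n, dtype=np.int64)

# The list object itself: a header plus an array of n pointers
list_pointers = sys.getsizeof(values)
# The integer objects those pointers refer to (rough estimate)
list_objects = sum(sys.getsizeof(v) for v in values)
# The NumPy data buffer: n values of 8 bytes each
array_data = arr.nbytes

print(list_pointers + list_objects, array_data)
```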

### 3. Fixed-type arrays 
--------------------
* storing data in efficient, fixed-type data buffers
* built-in ``array.array`` -- dense arrays of a uniform type

In [None]:
import array
la = list(range(10))
a = array.array('i', la)
a

Here ``'i'`` is a type code indicating the contents are integers.
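As a small illustration of what "fixed-type" means here, the sketch below inspects the type code and item size of an ``array.array`` and shows that a value of another type is rejected. The item size for ``'i'`` is platform dependent, so it is printed rather than assumed.

```python
import array

a = array.array('i', range(10))
print(a.typecode, a.itemsize)  # type code and bytes per item

# Unlike a list, the array refuses values that do not fit its type
try:
    a.append(1.5)
    rejected = False
except TypeError:
    rejected = True

print(rejected)  # True
```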

### 4. [NumPy](http://numpy.org/) ``ndarray`` objects 
--------------------------
* provide efficient storage of array-based data
* add efficient *operations* on that data

In [None]:
import numpy as np
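To illustrate what "efficient operations" refers to, here is a minimal sketch in which an arithmetic expression is applied to every element of an array at once, next to the equivalent list comprehension.

```python
import numpy as np

values = np.arange(5)  # array([0, 1, 2, 3, 4])

# Element-wise operation on the whole array, no explicit Python loop
squared = values ** 2
total = values.sum()

# The same result with a plain Python list needs an explicit loop
loop_squared = [v ** 2 for v in values.tolist()]

print(squared.tolist(), total)  # [0, 1, 4, 9, 16] 10
```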

#### 4.1. Creating an array
-------------------

* Creating an array by Factory Method:

```ipython
array(object, dtype=None, *, copy=True, order='K', subok=False, ndmin=0, like=None)
```

* Type:      builtin_function_or_method

``object``  -- **array_like** -- an array, any object exposing the array interface, an object whose
    ``__array__`` method returns an array, or any (nested) sequence.
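The sketch below shows three kinds of array_like input. The ``Readings`` class is hypothetical and exists only for illustration; its ``__array__`` method accepts optional ``dtype`` and ``copy`` arguments so that it works with both older and newer NumPy releases.

```python
import numpy as np

class Readings:
    """Hypothetical container, used only to illustrate __array__."""

    def __init__(self, values):
        self._values = list(values)

    def __array__(self, dtype=None, copy=None):
        # Return the stored values as an ndarray
        return np.array(self._values, dtype=dtype)

from_nested = np.array([[1, 2], [3, 4]])      # nested sequence
from_tuple = np.array((1, 2, 3))              # any sequence, not only lists
from_object = np.array(Readings([1.5, 2.5]))  # object with __array__

print(from_nested.shape, from_tuple.shape, from_object)
```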

In [None]:
np.array?

* Creating arrays from lists

In [None]:
a = np.array([1, 4, 2, 5, 3])
a

* Upcasting

In [None]:
np.array([3.14, 4, 2, 3])
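Upcasting can be checked by inspecting the ``dtype`` attribute of the result. A minimal sketch: the default integer width is platform dependent, so only the kind of the integer array is shown.

```python
import numpy as np

ints = np.array([1, 4, 2, 5, 3])
mixed = np.array([3.14, 4, 2, 3])          # one float upcasts every element
with_complex = np.array([1, 2.0, 3 + 0j])  # one complex upcasts every element

print(ints.dtype.kind)     # 'i' (integer; width depends on the platform)
print(mixed.dtype)         # float64
print(with_complex.dtype)  # complex128
```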

* Explicitly setting the data type

In [None]:
np.array([1, 2, 3, 4], dtype='float32')

* Arrays can be explicitly multi-dimensional. When built from a list of lists, the inner lists are treated as rows of the resulting two-dimensional array:

In [None]:
np.array([range(i, i + 3) for i in [2, 4, 6]])
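The "inner lists become rows" rule can be confirmed from the array's ``ndim`` and ``shape`` attributes, as in this short sketch.

```python
import numpy as np

grid = np.array([range(i, i + 3) for i in [2, 4, 6]])

print(grid.ndim)   # 2
print(grid.shape)  # (3, 3): three rows, each built from one inner sequence
print(grid[0])     # first row: [2 3 4]
```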

#### 4.2. Creating arrays using routines 
------------------------

In [None]:
# Create a length-10 integer array filled with zeros
np.zeros(10, dtype=int)

In [None]:
# Create a 3x5 floating-point array filled with ones
np.ones((3, 5), dtype=float)

In [None]:
# Create a 3x5 array filled with 3.14
np.full((3, 5), 3.14)

In [None]:
# Create a 3x3 identity matrix
np.eye(3)

In [None]:
# Create an uninitialized array of three floating-point values (the default dtype)
# The values will be whatever happens to already exist at that memory location
np.empty(3)

In [None]:
# Create an array filled with a linear sequence
# Starting at 0, ending before 20, stepping by 2 (similar to the built-in range())
np.arange(0, 20, 2)

In [None]:
# Create an array of five values evenly spaced between 0 and 1
np.linspace(0, 1, 5)
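One difference between the two routines is easy to miss: ``np.arange`` excludes the stop value, while ``np.linspace`` includes it by default. A short sketch:

```python
import numpy as np

stepped = np.arange(0, 20, 2)  # stop value 20 is excluded
spaced = np.linspace(0, 1, 5)  # both endpoints are included

print(len(stepped), stepped[-1])  # 10 18
print(spaced)                     # [0.   0.25 0.5  0.75 1.  ]
```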

In [None]:
# Create a 3x3 array of uniformly distributed
# random values between 0 and 1
np.random.random((3, 3))

In [None]:
# Create a 3x3 array of normally distributed random values
# with mean 0 and standard deviation 1
np.random.normal(0, 1, (3, 3))

In [None]:
# Create a 3x3 array of random integers in the interval [0, 10)
np.random.randint(0, 10, (3, 3))
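The three random routines above draw from NumPy's global random state. As an alternative sketch (not part of the original notebook), the same kinds of arrays can be produced with a seeded ``Generator`` from ``np.random.default_rng``, which makes the results reproducible. The seed value 42 is arbitrary.

```python
import numpy as np

rng = np.random.default_rng(seed=42)

uniform = rng.random((3, 3))            # uniform values in [0, 1)
normal = rng.normal(0, 1, (3, 3))       # mean 0, standard deviation 1
integers = rng.integers(0, 10, (3, 3))  # integers in [0, 10)

# A second generator with the same seed reproduces the same draws
repeat = np.random.default_rng(seed=42).random((3, 3))
print(np.array_equal(uniform, repeat))  # True
```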

In [None]:
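# ndmin=3 prepends axes so that the result has at least three dimensions
a = np.array([[1, 2, 3, 4], [5, 6, 7, 8]], ndmin=3)
# First row of the original 2x4 data
a[0][0]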
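The effect of ``ndmin`` is easiest to see by comparing shapes with and without it, as in this short sketch.

```python
import numpy as np

data = [[1, 2, 3, 4], [5, 6, 7, 8]]

plain = np.array(data)
padded = np.array(data, ndmin=3)

print(plain.shape)   # (2, 4)
print(padded.shape)  # (1, 2, 4): a new leading axis of length 1
print(padded[0][0])  # [1 2 3 4]
```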

### 5. NumPy standard data types
---------------------

| Data type     | Description |
|---------------|-------------|
| ``bool_``     | Boolean (True or False) stored as a byte |
| ``int_``      | Default integer type (same as C ``long``; normally either ``int64`` or ``int32``)| 
| ``intc``      | Identical to C ``int`` (normally ``int32`` or ``int64``)| 
| ``intp``      | Integer used for indexing (same as C ``ssize_t``; normally either ``int32`` or ``int64``)| 
| ``int8``      | Byte (-128 to 127)| 
| ``int16``     | Integer (-32768 to 32767)|
| ``int32``     | Integer (-2147483648 to 2147483647)|
| ``int64``     | Integer (-9223372036854775808 to 9223372036854775807)| 
| ``uint8``     | Unsigned integer (0 to 255)| 
| ``uint16``    | Unsigned integer (0 to 65535)| 
| ``uint32``    | Unsigned integer (0 to 4294967295)| 
| ``uint64``    | Unsigned integer (0 to 18446744073709551615)| 
| ``float_``    | Shorthand for ``float64`` (alias removed in NumPy 2.0; use ``float64``).| 
| ``float16``   | Half precision float: sign bit, 5 bits exponent, 10 bits mantissa| 
| ``float32``   | Single precision float: sign bit, 8 bits exponent, 23 bits mantissa| 
| ``float64``   | Double precision float: sign bit, 11 bits exponent, 52 bits mantissa| 
| ``complex_``  | Shorthand for ``complex128`` (alias removed in NumPy 2.0; use ``complex128``).| 
| ``complex64`` | Complex number, represented by two 32-bit floats| 
| ``complex128``| Complex number, represented by two 64-bit floats| 
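The ranges and bit layouts in the table can be checked from NumPy itself with ``np.iinfo``, ``np.finfo``, and the ``itemsize`` of a dtype. A minimal sketch:

```python
import numpy as np

int8_info = np.iinfo(np.int8)
uint16_info = np.iinfo(np.uint16)
print(int8_info.min, int8_info.max)  # -128 127
print(uint16_info.max)               # 65535

# Number of mantissa bits for single and double precision
print(np.finfo(np.float32).nmant)    # 23
print(np.finfo(np.float64).nmant)    # 52

# Bytes per element: two 64-bit floats for complex128
print(np.dtype("complex128").itemsize)  # 16
```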

More advanced type specification is possible, including compound data types; see the [NumPy documentation](http://numpy.org/).
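As a brief sketch of a compound data type, the example below builds a structured array whose elements each hold a name, an age, and a weight. The field names and values are invented for illustration.

```python
import numpy as np

# Each element has three named fields with their own types
person = np.dtype([("name", "U10"), ("age", "i4"), ("weight", "f8")])

people = np.zeros(2, dtype=person)
people["name"] = ["Alice", "Bob"]
people["age"] = [25, 45]
people["weight"] = [55.0, 85.5]

print(people.dtype.names)  # ('name', 'age', 'weight')
print(people["age"])       # [25 45]
print(people[0]["name"])   # Alice
```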