# Numpy basics
---
# 1 Data types
---
## 1.1 Array types and conversions between types
---

<table border="1" class="docutils">
<colgroup>
<col width="17%" />
<col width="83%" />
</colgroup>
<thead valign="bottom">
<tr class="row-odd"><th class="head">Data type</th>
<th class="head">Description</th>
</tr>
</thead>
<tbody valign="top">
<tr class="row-even"><td><code class="docutils literal"><span class="pre">bool_</span></code></td>
<td>Boolean (True or False) stored as a byte</td>
</tr>
<tr class="row-odd"><td><code class="docutils literal"><span class="pre">int_</span></code></td>
<td>Default integer type (same as C <code class="docutils literal"><span class="pre">long</span></code>; normally either
<code class="docutils literal"><span class="pre">int64</span></code> or <code class="docutils literal"><span class="pre">int32</span></code>)</td>
</tr>
<tr class="row-even"><td>intc</td>
<td>Identical to C <code class="docutils literal"><span class="pre">int</span></code> (normally <code class="docutils literal"><span class="pre">int32</span></code> or <code class="docutils literal"><span class="pre">int64</span></code>)</td>
</tr>
<tr class="row-odd"><td>intp</td>
<td>Integer used for indexing (same as C <code class="docutils literal"><span class="pre">ssize_t</span></code>; normally
either <code class="docutils literal"><span class="pre">int32</span></code> or <code class="docutils literal"><span class="pre">int64</span></code>)</td>
</tr>
<tr class="row-even"><td>int8</td>
<td>Byte (-128 to 127)</td>
</tr>
<tr class="row-odd"><td>int16</td>
<td>Integer (-32768 to 32767)</td>
</tr>
<tr class="row-even"><td>int32</td>
<td>Integer (-2147483648 to 2147483647)</td>
</tr>
<tr class="row-odd"><td>int64</td>
<td>Integer (-9223372036854775808 to 9223372036854775807)</td>
</tr>
<tr class="row-even"><td>uint8</td>
<td>Unsigned integer (0 to 255)</td>
</tr>
<tr class="row-odd"><td>uint16</td>
<td>Unsigned integer (0 to 65535)</td>
</tr>
<tr class="row-even"><td>uint32</td>
<td>Unsigned integer (0 to 4294967295)</td>
</tr>
<tr class="row-odd"><td>uint64</td>
<td>Unsigned integer (0 to 18446744073709551615)</td>
</tr>
<tr class="row-even"><td><code class="docutils literal"><span class="pre">float_</span></code></td>
<td>Shorthand for <code class="docutils literal"><span class="pre">float64</span></code>.</td>
</tr>
<tr class="row-odd"><td>float16</td>
<td>Half precision float: sign bit, 5 bits exponent,
10 bits mantissa</td>
</tr>
<tr class="row-even"><td>float32</td>
<td>Single precision float: sign bit, 8 bits exponent,
23 bits mantissa</td>
</tr>
<tr class="row-odd"><td>float64</td>
<td>Double precision float: sign bit, 11 bits exponent,
52 bits mantissa</td>
</tr>
<tr class="row-even"><td><code class="docutils literal"><span class="pre">complex_</span></code></td>
<td>Shorthand for <code class="docutils literal"><span class="pre">complex128</span></code>.</td>
</tr>
<tr class="row-odd"><td>complex64</td>
<td>Complex number, represented by two 32-bit floats (real
and imaginary components)</td>
</tr>
<tr class="row-even"><td>complex128</td>
<td>Complex number, represented by two 64-bit floats (real
and imaginary components)</td>
</tr>
</tbody>
</table>
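
The ranges and bit layouts listed in the table can be checked at runtime. The following is a minimal sketch using `np.iinfo` and `np.finfo`; the variable names are illustrative only.

```python
import numpy as np

# Integer limits reported by NumPy for two of the types in the table.
int8_info = np.iinfo(np.int8)
uint64_info = np.iinfo(np.uint64)
print(int8_info.min, int8_info.max)   # -128 127
print(uint64_info.max)                # 18446744073709551615

# Exponent and mantissa widths (in bits) of the floating-point types.
float_layout = {
    t.__name__: (np.finfo(t).nexp, np.finfo(t).nmant)
    for t in (np.float16, np.float32, np.float64)
}
print(float_layout)
```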

- To convert the type of an array, use the `.astype()` method, or call the type itself as a function.

In [2]:
import numpy as np
z = np.arange(3, dtype=np.uint8)
z

array([0, 1, 2], dtype=uint8)

In [4]:
z.astype(float)

array([ 0.,  1.,  2.])

In [5]:
np.int8(z)

array([0, 1, 2], dtype=int8)

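**Note**: NumPy knows that `int` refers to `np.int_`, `bool` to `np.bool_`, `float` to `np.float_`, and `complex` to `np.complex_`. (NumPy 2.0 removed the `np.float_` and `np.complex_` aliases; use `np.float64` and `np.complex128` there.)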
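
As a small sketch of the mapping described in the note, the built-in Python types can be passed to `np.dtype` and compared with the explicit NumPy types. The integer case is left out because its width depends on the platform; `z_float` is just an illustrative name.

```python
import numpy as np

# Python built-in types are translated to NumPy dtypes.
print(np.dtype(float) == np.float64)        # True
print(np.dtype(complex) == np.complex128)   # True
print(np.dtype(bool) == np.bool_)           # True

# astype() returns a new array and leaves the original untouched.
z = np.arange(3, dtype=np.uint8)
z_float = z.astype(float)
print(z.dtype, z_float.dtype)               # uint8 float64
```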

- To determine the type of an array, look at the dtype attribute.

In [7]:
z.dtype

dtype('uint8')

- dtype objects also contain information about the type.

In [8]:
d = np.dtype(int)
d

dtype('int64')

In [9]:
np.issubdtype(d, np.integer)

True

In [10]:
np.issubdtype(d, np.floating)

False
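
Beyond `issubdtype`, a dtype object carries a few descriptive attributes. A short sketch, assuming the `uint8` type used earlier:

```python
import numpy as np

d = np.dtype(np.uint8)
print(d.name)       # uint8
print(d.kind)       # u
print(d.itemsize)   # 1

# issubdtype() checks where a dtype sits in NumPy's type hierarchy.
print(np.issubdtype(d, np.unsignedinteger))   # True
print(np.issubdtype(d, np.signedinteger))     # False
```

Here `kind` is a one-character code (`u` for unsigned integer) and `itemsize` is the number of bytes per element.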

## 1.2 Array Scalars
---
NumPy generally returns elements of arrays as array scalars (a scalar with an associated dtype).
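
A minimal sketch of what this means in practice: indexing a single element gives back a NumPy scalar object that remembers its dtype, not a plain Python `int`. The name `element` is illustrative.

```python
import numpy as np

z = np.arange(3, dtype=np.uint8)
element = z[1]

print(type(element) is np.uint8)         # True
print(element.dtype)                     # uint8
print(isinstance(element, np.generic))   # True: base class of array scalars
print(isinstance(element, int))          # False: not a Python int
```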

## 1.3 Extended Precision
---
Python's floating-point numbers are usually 64-bit floating-point numbers, nearly equivalent to `np.float64`.
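
The near-equivalence can be checked directly. This sketch assumes a typical platform where Python's `float` is an IEEE 754 double:

```python
import sys
import numpy as np

# np.float64 subclasses Python's float and shares its precision.
print(issubclass(np.float64, float))                        # True
print(np.finfo(np.float64).eps == sys.float_info.epsilon)   # True
print(np.finfo(np.float64).bits)                            # 64
```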

# 2 Array creation
---
## 2.1 Introduction
---
There are 5 general mechanisms for creating arrays:
1. Conversion from other Python structures (e.g., lists, tuples)
2. Intrinsic NumPy array creation objects (e.g., arange, zeros, ones, etc.)
3. Reading arrays from disk, either from standard or custom formats
4. Creating arrays from raw bytes through the use of strings or buffers
5. Use of special library functions (e.g., random)
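
The five mechanisms can be sketched side by side as follows. This is only an illustration: the variable names are made up, and an in-memory buffer stands in for a file on disk so the example stays self-contained.

```python
import io
import numpy as np

# 1. Conversion from a Python list.
from_list = np.array([1, 2, 3])

# 2. Intrinsic creation function.
from_zeros = np.zeros(3)

# 3. Reading from "disk": a BytesIO buffer plays the role of a .npy file.
buffer = io.BytesIO()
np.save(buffer, from_list)
buffer.seek(0)
from_disk = np.load(buffer)

# 4. Interpreting raw bytes.
from_bytes = np.frombuffer(b"\x01\x02\x03", dtype=np.uint8)

# 5. Special library function (random numbers, seeded for repeatability).
from_random = np.random.default_rng(0).random(3)
```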

## 2.2 Converting Python array_like objects to NumPy arrays
---
    x = np.array([1, 2, 3])  # any Python list or tuple
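
Nested sequences work as well. In the sketch below, lists and tuples are mixed, and NumPy picks a dtype that can hold every element:

```python
import numpy as np

# The complex entry forces the whole array to be upcast to complex.
x = np.array([[1, 2.0], [0, 0], (1 + 1j, 3.0)])
print(x.shape)   # (3, 2)
print(x.dtype)   # complex128
```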

## 2.3 Intrinsic Numpy Array Creation
---
    np.zeros(shape[, dtype])
    np.ones(shape[, dtype])
    np.arange([start, ]stop[, step][, dtype])
- `linspace()` will create arrays with a specified number of elements, and spaced equally between ...

In [11]:
np.linspace(1,9,20)

array([ 1.        ,  1.42105263,  1.84210526,  2.26315789,  2.68421053,
        3.10526316,  3.52631579,  3.94736842,  4.36842105,  4.78947368,
        5.21052632,  5.63157895,  6.05263158,  6.47368421,  6.89473684,
        7.31578947,  7.73684211,  8.15789474,  8.57894737,  9.        ])

- `indices()` will create a set of arrays (stacked as a one-higher-dimensional array), one per dimension, with each representing variation in that dimension.

In [14]:
np.indices((3, 3))

array([[[0, 0, 0],
        [1, 1, 1],
        [2, 2, 2]],

       [[0, 1, 2],
        [0, 1, 2],
        [0, 1, 2]]])
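
The creation functions in this section can be compared in one short sketch. The shapes and values are arbitrary choices for illustration.

```python
import numpy as np

zeros = np.zeros((2, 3))                # 2x3 array of 0.0 (float64 by default)
ones = np.ones((2, 3), dtype=np.int8)   # the dtype can be chosen explicitly
steps = np.arange(2, 10, 2)             # start, stop (exclusive), step
points = np.linspace(1, 9, 5)           # 5 points, both endpoints included
rows, cols = np.indices((3, 3))         # one index array per dimension

print(steps)                      # [2 4 6 8]
print(points)                     # [1. 3. 5. 7. 9.]
print(rows[2, 1], cols[2, 1])     # 2 1
```

Note that `arange` excludes the stop value while `linspace` includes it.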

## 2.4 Reading Arrays From Disk
---
### 2.4.1 Standard Binary Formats
---
### 2.4.2 Common ASCII Formats
---
### 2.4.3 Custom Binary Formats
---
### 2.4.4 Use of Special Libraries
---

# 3 I/O with Numpy
---
## 3.1 Import data with `genfromtxt`

    numpy.genfromtxt(
                    fname,                     # str, list, or generator
                    dtype=<class 'float'>,     # 
                    comments='#',              # comment marker (char)
                    delimiter=None,            # separator: a string, or an int for fixed-width fields
                    skip_header=0,             # skip the first n lines (int)
                    skip_footer=0,             # skip the last n lines (int)
                    converters=None,           # {index_or_name: conversion function}
                    missing_values=None,       # {index_or_name: missing-value marker}
                    filling_values=None,       # {index_or_name: replacement value}
                    usecols=None,              # columns to read (tuple)
                    names=None,                # column names, e.g. 'a,b,c'
                    excludelist=None,          # 
                    deletechars=None,          # 
                    replace_space='_',         # 
                    autostrip=False,           # strip whitespace
                    case_sensitive=True,       # 
                    defaultfmt='f%i',          # default column-name format when names=None
                    unpack=None,               # 
                    usemask=False,             # 
                    loose=True,                # 
                    invalid_raise=True,        # 
                    max_rows=None,             # 
                    encoding='bytes'           #
                    )
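
A few of these parameters in action, as a minimal sketch. The CSV text is made up for illustration, and `io.StringIO` stands in for a real file name.

```python
import io
import numpy as np

# Hypothetical CSV content with a header row and one missing value.
text = "x,y,z\n1,2,3\n4,,6\n7,8,9\n"

# Plain 2-D float array: skip the header, keep the first two columns,
# and replace missing entries with -1.
data = np.genfromtxt(io.StringIO(text), delimiter=",", skip_header=1,
                     usecols=(0, 1), filling_values=-1)
print(data.shape)          # (3, 2)

# Structured array: take the field names from the first line instead.
# Missing floats become nan when no filling value is given.
named = np.genfromtxt(io.StringIO(text), delimiter=",", names=True)
print(named.dtype.names)   # ('x', 'y', 'z')
```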


# 4 Indexing
---

# 5 Broadcasting
---

# 6 Byte-swapping
---

# 7 Structured arrays
---

## 7.1 Introduction
---

In [2]:
import numpy as np
x = np.array([('Rex', 9, 81.0), ('Fido', 3, 27.0)],
             dtype=[('name', 'U10'), ('age', 'i4'), ('weight', 'f4')])

In [3]:
x

array([('Rex', 9,  81.), ('Fido', 3,  27.)], 
      dtype=[('name', '<U10'), ('age', '<i4'), ('weight', '<f4')])

In [5]:
# modify individual fields of a structured array by indexing with the field name
x['age'] = 5
x

array([('Rex', 5,  81.), ('Fido', 5,  27.)], 
      dtype=[('name', '<U10'), ('age', '<i4'), ('weight', '<f4')])
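
Reading works the same way as writing. The sketch below rebuilds the original array and shows the two kinds of indexing: by position (one record) and by field name (one column).

```python
import numpy as np

x = np.array([('Rex', 9, 81.0), ('Fido', 3, 27.0)],
             dtype=[('name', 'U10'), ('age', 'i4'), ('weight', 'f4')])

# Integer indexing selects one record (all fields of one element).
second = x[1]
print(second['name'])    # Fido

# Field-name indexing selects one field across all records.
print(x['name'])         # ['Rex' 'Fido']

# The two can be combined to reach a single value.
print(x[0]['weight'])    # 81.0
```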

## 7.2 Structured Datatypes
---
### 7.2.1 Structured Datatype Creation
---
- A list of tuples, one tuple per field.

  Each tuple has the form `(fieldname, datatype, shape)`, where `shape` is optional. `fieldname` is a string, `datatype` is any object convertible to a datatype, and `shape` is a tuple of integers.

In [6]:
np.dtype([('x', 'f4'),('', np.float32),('z', 'f4', (2,2))])

dtype([('x', '<f4'), ('f1', '<f4'), ('z', '<f4', (2, 2))])

- A string of comma-separated dtype specifications.

In [8]:
np.dtype('i8, f4, S3')

dtype([('f0', '<i8'), ('f1', '<f4'), ('f2', 'S3')])

In [10]:
np.dtype('3int8, float32, (2,3)float64')

dtype([('f0', 'i1', (3,)), ('f1', '<f4'), ('f2', '<f8', (2, 3))])

- A dictionary of field parameter arrays

In [11]:
np.dtype({'names':['col1', 'col2'],
          'formats': ['i4', 'f4'],
          'offsets': [0,4],
          'itemsize': 12})

dtype({'names':['col1','col2'], 'formats':['<i4','<f4'], 'offsets':[0,4], 'itemsize':12})

- A dictionary of field names

In [12]:
np.dtype({'col1': ('i1', 0), 'col2': ('f4', 1)})

dtype([('col1', 'i1'), ('col2', '<f4')])

### 7.2.2 Manipulating and Displaying Structured Datatypes
---

In [15]:
d = np.dtype([('x', 'i8'), ('y', 'f4'), ('z', 'f')])
d.names

('x', 'y', 'z')

In [16]:
d.fields

mappingproxy({'x': (dtype('int64'), 0),
              'y': (dtype('float32'), 8),
              'z': (dtype('float32'), 12)})

### 7.2.3 Automatic Byte Offsets and Alignment
---

In [17]:
def print_offsets(d):
    print("offsets:", [d.fields[name][1] for name in d.names])
    print("itemsize:", d.itemsize)
print_offsets(np.dtype('u1,u1,i4,u1,i8,u2'))

offsets: [0, 1, 2, 6, 7, 15]
itemsize: 17


In [18]:
print_offsets(np.dtype('u1,u1,i4,u1,i8,u2', align=True))

offsets: [0, 1, 4, 8, 16, 24]
itemsize: 32
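
The two layouts can also be checked programmatically. This sketch avoids hard-coding the aligned offsets, since alignment requirements can differ between platforms; `packed`, `aligned`, and `offsets_ok` are illustrative names.

```python
import numpy as np

packed = np.dtype('u1,u1,i4,u1,i8,u2')
aligned = np.dtype('u1,u1,i4,u1,i8,u2', align=True)

# Packed layout: the itemsize is the sum of the field sizes.
field_sizes = [packed.fields[name][0].itemsize for name in packed.names]
print(sum(field_sizes) == packed.itemsize)    # True

# Aligned layout: every offset is a multiple of that field's alignment.
offsets_ok = all(
    offset % field_dtype.alignment == 0
    for field_dtype, offset in (aligned.fields[name][:2] for name in aligned.names)
)
print(offsets_ok, aligned.isalignedstruct)    # True True
```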


### 7.2.4 Field Titles
---

In [19]:
np.dtype([(('my title', 'name'), 'f4')])

dtype([(('my title', 'name'), '<f4')])

In [20]:
np.dtype({'name': ('i4', 0, 'my title')})

dtype([(('my title', 'name'), '<i4')])
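
A short sketch of how a title behaves once it is attached: it works as an alternate key for the same field. The name `titled` is illustrative.

```python
import numpy as np

titled = np.dtype([(('my title', 'name'), 'f4')])

# Titles do not appear in names ...
print(titled.names)                                         # ('name',)
# ... but they are extra keys in fields, and the field tuple gains a
# third element holding the title.
print(titled.fields['name'] == titled.fields['my title'])   # True
print(titled.fields['name'][2])                             # my title

# A title can also be used to index an array with this dtype.
a = np.zeros(2, dtype=titled)
a['my title'] = 1.5
print(a['name'])                                            # [1.5 1.5]
```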

In [28]:
d

dtype([('x', '<i8'), ('y', '<f4'), ('z', '<f4')])

In [29]:
for name in d.names:
    print(d.fields[name])

(dtype('int64'), 0)
(dtype('float32'), 8)
(dtype('float32'), 12)


### 7.2.5 Union types
---

## 7.3 Indexing and Assignment to Structured arrays
---
### 7.3.1 Assigning data to a Structured Array
---
- Assignment from Python native types (tuples)

In [34]:
x = np.array([(1,2,3),(4,5,6)], dtype='i8, f4, f8')
x

array([(1,  2.,  3.), (4,  5.,  6.)], 
      dtype=[('f0', '<i8'), ('f1', '<f4'), ('f2', '<f8')])

- Assignment from Scalars
- Assignment from other Structured Arrays
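
The last two cases can be sketched as follows, using simple numeric fields. The arrays and field names are made up for illustration.

```python
import numpy as np

# Assignment from a scalar: the value is assigned to every field.
x = np.zeros(2, dtype='i8, f4')
x[:] = 3
print(x['f0'], x['f1'])    # [3 3] [3. 3.]

# Assignment from another structured array: fields are matched by
# position, not by name, so 'x' fills 'a' and 'y' fills 'b'.
a = np.zeros(2, dtype=[('a', 'i8'), ('b', 'f4')])
b = np.array([(1, 2), (3, 4)], dtype=[('x', 'i4'), ('y', 'f8')])
a[:] = b
print(a['a'], a['b'])      # [1 3] [2. 4.]
```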

### 7.3.2 Indexing Structured Arrays
---
- Accessing individual Fields

In [35]:
x = np.array([(1,2),(3,4)], dtype=[('foo', 'i8'), ('bar', 'f4')])
x['foo'] = 10
x

array([(10,  2.), (10,  4.)], 
      dtype=[('foo', '<i8'), ('bar', '<f4')])

- Accessing Multiple Fields

In [36]:
a = np.zeros(3, dtype=[('a','i4'),('b', 'i4'),('c','f4')])
a[['a','c']]

array([(0,  0.), (0,  0.), (0,  0.)], 
      dtype=[('a', '<i4'), ('c', '<f4')])
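
A multi-field index can also appear on the left-hand side of an assignment. This is a minimal sketch assuming a reasonably recent NumPy release; older versions handled multi-field indexing differently.

```python
import numpy as np

a = np.zeros(3, dtype=[('a', 'i4'), ('b', 'i4'), ('c', 'f4')])

# Assigning a tuple through a multi-field index sets the selected
# fields of every element; field 'b' is left untouched.
a[['a', 'c']] = (2, 3)
print(a['a'], a['b'], a['c'])   # [2 2 2] [0 0 0] [3. 3. 3.]
```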

### 7.3.3 Viewing Structured Arrays Containing Objects
---
### 7.3.4 Structure Comparison
---

## 7.4 Record Arrays
---


# 8 Subclassing ndarray
---