24-Input and Output, Part 5
====
**Memory-mapped files**

`memmap`: Create a memory-map to an array stored in a binary file on disk.

# numpy.memmap
`class numpy.memmap`

Create a memory-map to an array stored in a binary file on disk.

Memory-mapped files are used for accessing small segments of large files on disk, without reading the entire file into memory. NumPy's memmaps are array-like objects. This differs from Python's `mmap` module, which uses file-like objects.

This subclass of ndarray has some unpleasant interactions with some operations, because it doesn't quite fit properly as a subclass. An alternative to using this subclass is to create the `mmap` object yourself, then create an ndarray with `ndarray.__new__` directly, passing the object created in its `buffer=` parameter.

This class may at some point be turned into a factory function which returns a view into an mmap buffer.

Delete the memmap instance to close the memmap file.
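The alternative mentioned above (building a plain ndarray on top of your own `mmap` object) might look roughly like the sketch below. It uses the `np.ndarray(...)` constructor, which is the usual way of reaching `ndarray.__new__`; the file name `raw.dat` and the variable names are made up for illustration.

```python
import mmap
import os
import tempfile

import numpy as np

# Scratch file holding 12 float32 values (the file name is arbitrary).
path = os.path.join(tempfile.mkdtemp(), "raw.dat")
np.arange(12, dtype="float32").tofile(path)

with open(path, "r+b") as f:
    # Length 0 maps the whole file; the mapping stays valid after f is closed.
    mm = mmap.mmap(f.fileno(), 0)

# A plain ndarray whose data buffer is the mmap object.
arr = np.ndarray(shape=(3, 4), dtype="float32", buffer=mm)
arr[0, 0] = 42.0          # the write goes to the mapped pages
mm.flush()                # push the change to the file

is_plain_ndarray = type(arr) is np.ndarray
first_on_disk = float(np.fromfile(path, dtype="float32")[0])

# Release the array before closing the mapping; mmap refuses to close
# while a buffer export is still alive.
del arr
mm.close()
```

The result is an ordinary `ndarray` rather than a `memmap` instance, so it avoids the subclass behaviour, but you become responsible for flushing and closing the mapping yourself.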

## Parameters:	
**filename** : str, file-like object, or pathlib.Path instance

The file name or file object to be used as the array data buffer.

**dtype** : data-type, optional

The data-type used to interpret the file contents. Default is `uint8`.

**mode** : {‘r+’, ‘r’, ‘w+’, ‘c’}, optional

The file is opened in this mode:

- `'r'`: Open existing file for reading only.
- `'r+'`: Open existing file for reading and writing.
- `'w+'`: Create or overwrite existing file for reading and writing.
- `'c'`: Copy-on-write: assignments affect data in memory, but changes are not saved to disk. The file on disk is read-only.

Default is `'r+'`.
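As a rough illustration of how the modes differ in practice, the sketch below creates a file with `'w+'`, shows that a write through an `'r'` mapping is rejected, and then changes the file through an `'r+'` mapping. The file name `modes.dat` and the variable names are invented for this example.

```python
import os
import tempfile

import numpy as np

path = os.path.join(tempfile.mkdtemp(), "modes.dat")

# 'w+' creates the file (or overwrites an existing one).
fp = np.memmap(path, dtype="float32", mode="w+", shape=(4,))
fp[:] = [1.0, 2.0, 3.0, 4.0]
fp.flush()
del fp

# 'r' maps the file read-only, so assignment raises ValueError.
ro = np.memmap(path, dtype="float32", mode="r")
try:
    ro[0] = 99.0
    write_rejected = False
except ValueError:
    write_rejected = True

# 'r+' maps the existing file for reading and writing.
rw = np.memmap(path, dtype="float32", mode="r+")
rw[0] = 10.0
rw.flush()

# Read the raw file back to see what actually reached the disk.
on_disk = np.fromfile(path, dtype="float32").tolist()
```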

**offset** : int, optional

In the file, array data starts at this offset. Since offset is measured in bytes, it should normally be a multiple of the byte-size of `dtype`. When `mode != 'r'`, even positive offsets beyond the end of the file are valid; the file will be extended to accommodate the additional data. By default, `memmap` will start at the beginning of the file, even if `filename` is a file pointer `fp` and `fp.tell() != 0`.
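A common use of `offset` is skipping a fixed-size header. The sketch below assumes a made-up file layout (an 8-byte header followed by six `float32` values); the header contents and file name are purely illustrative.

```python
import os
import tempfile

import numpy as np

path = os.path.join(tempfile.mkdtemp(), "with_header.dat")

# Hypothetical layout: an 8-byte header followed by six float32 values.
header = b"HDR\x00\x00\x00\x00\x00"
payload = np.arange(6, dtype="float32")
with open(path, "wb") as f:
    f.write(header)
    f.write(payload.tobytes())

# Skip the header; 8 is a multiple of the 4-byte float32 item size.
body = np.memmap(path, dtype="float32", mode="r", offset=len(header))
values = body.tolist()
mapped_offset = body.offset
```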

**shape** : tuple, optional

The desired shape of the array. If `mode == 'r'` and the number of remaining bytes after `offset` is not a multiple of the byte-size of `dtype`, you must specify `shape`. By default, the returned array will be 1-D with the number of elements determined by file size and data-type.
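To make the default concrete, here is a small sketch that maps the same 12-element file with and without `shape`. The file name and variable names are illustrative only.

```python
import os
import tempfile

import numpy as np

path = os.path.join(tempfile.mkdtemp(), "shape_demo.dat")
np.arange(12, dtype="float32").tofile(path)   # 48 bytes on disk

# No shape: 1-D, length derived from file size / item size.
flat = np.memmap(path, dtype="float32", mode="r")

# Explicit shape covering the whole file.
grid = np.memmap(path, dtype="float32", mode="r", shape=(3, 4))

# A shape smaller than the file maps only the leading elements.
head = np.memmap(path, dtype="float32", mode="r", shape=(2, 4))

shapes = (flat.shape, grid.shape, head.shape)
last_mapped = float(head[-1, -1])
```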

**order** : {‘C’, ‘F’}, optional

Specify the order of the ndarray memory layout: row-major (C-style) or column-major (Fortran-style). This only has an effect if the shape is greater than 1-D. The default order is `'C'`.
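The effect of `order` is easiest to see by mapping the same bytes twice. In this sketch (file name and variable names are illustrative), the file holds the values 0 through 11 and is interpreted once row-major and once column-major.

```python
import os
import tempfile

import numpy as np

path = os.path.join(tempfile.mkdtemp(), "order_demo.dat")
np.arange(12, dtype="float32").tofile(path)

c_view = np.memmap(path, dtype="float32", mode="r", shape=(3, 4), order="C")
f_view = np.memmap(path, dtype="float32", mode="r", shape=(3, 4), order="F")

# Row-major: consecutive file values fill a row.
c_first_row = c_view[0].tolist()
# Column-major: consecutive file values fill a column, so a row
# picks every third value.
f_first_row = f_view[0].tolist()
```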

**Notes:**

The memmap object can be used anywhere an ndarray is accepted. Given a memmap `fp`, `isinstance(fp, numpy.ndarray)` returns `True`.

Memory-mapped files cannot be larger than 2GB on 32-bit systems.

When a memmap causes a file to be created or extended beyond its current size in the filesystem, the contents of the new part are unspecified. On systems with POSIX filesystem semantics, the extended part will be filled with zero bytes.
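A quick check of the first note, as a minimal sketch (the file name is arbitrary): a memmap passes the `isinstance` test and can be handed to ordinary NumPy functions.

```python
import os
import tempfile

import numpy as np

path = os.path.join(tempfile.mkdtemp(), "notes_demo.dat")
fp = np.memmap(path, dtype="float32", mode="w+", shape=(2, 3))
fp[:] = 1.0   # fill explicitly instead of relying on the initial contents

is_ndarray = isinstance(fp, np.ndarray)
total = float(np.sum(fp))             # regular NumPy reductions accept it
column_means = fp.mean(axis=0).tolist()
```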

## Examples

    >>> import numpy as np
    >>> data = np.arange(12, dtype='float32')
    >>> data.resize((3,4))

This example uses a temporary file so that doctest doesn't write files to your directory. You would use a 'normal' filename.

    >>> from tempfile import mkdtemp
    >>> import os.path as path
    >>> filename = path.join(mkdtemp(), 'newfile.dat')

Create a memmap with dtype and shape that matches our data:

    >>> fp = np.memmap(filename, dtype='float32', mode='w+', shape=(3,4))
    >>> fp
    memmap([[0., 0., 0., 0.],
            [0., 0., 0., 0.],
            [0., 0., 0., 0.]], dtype=float32)

Write data to memmap array:

    >>> fp[:] = data[:]
    >>> fp
    memmap([[ 0.,  1.,  2.,  3.],
            [ 4.,  5.,  6.,  7.],
            [ 8.,  9., 10., 11.]], dtype=float32)

    >>> fp.filename == path.abspath(filename)
    True

Deleting the memmap flushes in-memory changes to disk before the object is removed:

    >>> del fp

Load the memmap and verify that the data was stored:

    >>> newfp = np.memmap(filename, dtype='float32', mode='r', shape=(3,4))
    >>> newfp
    memmap([[ 0.,  1.,  2.,  3.],
            [ 4.,  5.,  6.,  7.],
            [ 8.,  9., 10., 11.]], dtype=float32)

Read-only memmap:

    >>> fpr = np.memmap(filename, dtype='float32', mode='r', shape=(3,4))
    >>> fpr.flags.writeable
    False

Copy-on-write memmap:

    >>> fpc = np.memmap(filename, dtype='float32', mode='c', shape=(3,4))
    >>> fpc.flags.writeable
    True

It is possible to assign to a copy-on-write array, but the values are written only to the in-memory copy of the array, not to disk:

    >>> fpc
    memmap([[ 0.,  1.,  2.,  3.],
            [ 4.,  5.,  6.,  7.],
            [ 8.,  9., 10., 11.]], dtype=float32)

    >>> fpc[0,:] = 0
    >>> fpc
    memmap([[ 0.,  0.,  0.,  0.],
            [ 4.,  5.,  6.,  7.],
            [ 8.,  9., 10., 11.]], dtype=float32)

The file on disk is unchanged:

    >>> fpr
    memmap([[ 0.,  1.,  2.,  3.],
            [ 4.,  5.,  6.,  7.],
            [ 8.,  9., 10., 11.]], dtype=float32)

Offset into a memmap:

    >>> fpo = np.memmap(filename, dtype='float32', mode='r', offset=16)
    >>> fpo
    memmap([ 4.,  5.,  6.,  7.,  8.,  9., 10., 11.], dtype=float32)

## Attributes:	
**filename** : str or pathlib.Path instance

Path to the mapped file.

**offset** : int

Offset position in the file.

**mode** : str

File mode.

## Methods

`flush()`: Write any changes in the array to the file on disk.
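As a small sketch of `flush()` in use (file name and variable names are illustrative), the memmap below stays open and usable across repeated flushes, and each flush makes the current contents visible to an independent read of the file.

```python
import os
import tempfile

import numpy as np

path = os.path.join(tempfile.mkdtemp(), "flush_demo.dat")

fp = np.memmap(path, dtype="int32", mode="w+", shape=(5,))
fp[:] = np.arange(5)
fp.flush()                                   # persist without deleting fp
saved = np.fromfile(path, dtype="int32").tolist()

# The mapping is still open, so further edits and flushes are possible.
fp[0] = 100
fp.flush()
saved_again = np.fromfile(path, dtype="int32").tolist()
```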