<!-- Copyright (C) The IETF Trust (2011-2012) -->
<!-- Copyright (C) The Internet Society (2011-2012) -->
<section anchor='op:read_plus' title='Operation 65: READ_PLUS'>
READ_PLUS is a new variant of the NFSv4.1 READ
operation <xref target="ref:RFC5661" />. Besides being able
to support all of the data semantics of READ, it can also be
used by the server to return either holes or ADHs to the
client. For holes, READ_PLUS extends the response
to avoid returning data for portions of
the file which are either initialized and contain no backing store
or would appear so to the client. That is, if the result would be
a data block composed entirely of zeros, then it is cheaper to return
a hole. Returning data blocks of uninitialized data wastes computational
and network resources, thus reducing performance.
For ADHs, READ_PLUS is used to return the metadata describing the
portions of the file which are either initialized and contain no backing
store.
If the client sends a READ operation, it is explicitly
stating that it supports neither sparse files nor ADHs. So
if a READ occurs on a sparse file or ADH, then the server must
expand such data into raw bytes. If a READ occurs in
the middle of a hole or ADH, the server can only send back
bytes starting from that offset. In contrast, if a READ_PLUS
occurs in the middle of a hole or ADH, the server can send
back a range which starts before the offset and extends past
the requested range.
READ is inefficient for transfer of sparse sections of the file. As
such, READ is marked as OBSOLETE in NFSv4.2. Instead, a client
should issue READ_PLUS. Note that
as the client has no a priori knowledge of whether either an ADH
or a hole is present or not, it should always use READ_PLUS.
<section toc='exclude' title="ARGUMENT">
<?rfc include='autogen/read_plus_args.xml'?>
<section toc='exclude' title="RESULT">
<?rfc include='autogen/read_plus_res.xml'?>
<section toc='exclude' anchor='op:read_plus:desc' title="DESCRIPTION">
The READ_PLUS operation is based upon the NFSv4.1 READ operation
<xref target="ref:RFC5661" /> and similarly reads data from the
regular file identified by the current filehandle.
The client provides an rpa_offset at which the READ_PLUS is to start and an
rpa_count of how many bytes are to be read. An rpa_offset of zero means to
read data starting at the beginning of the file. If rpa_offset is
greater than or equal to the size of the file, the status NFS4_OK is
returned with di_length (the data length) set to zero and eof
set to TRUE.
The READ_PLUS result comprises an array of rpr_contents, each of
which describes a data_content4 type of data (<xref target="ss:adh:dc" />).
For NFSv4.2, the allowed
values are data, ADH, and hole. A server is required to support the
data type, but not the ADH or hole types. An ADH or a hole must be
returned in its entirety; clients must be prepared to get more
information than they requested. Both the start and the end of the
hole may exceed what was requested.
READ_PLUS has to support all of the errors which are returned by READ
plus NFS4ERR_UNION_NOTSUPP. If the client asks for a hole and the server
does not support that arm of the discriminated union, but does support
one or more additional arms, it can signal to the client that it supports
the operation, but not the arm with NFS4ERR_UNION_NOTSUPP.
If the data to be returned is comprised entirely of zeros, then
the server may elect to return that data as a hole. The server
differentiates this to the client by setting di_allocated to TRUE
in this case. Note that in such a scenario, the server is not
required to determine the full extent of the "hole" - it does not
need to determine where the zeros start and end.
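The zero-detection choice described above can be sketched as follows. This is a minimal illustration in Python, not a normative algorithm: the function name describe_block and the dictionary layout are hypothetical, and only the field names di_offset, di_length, and di_allocated are taken from this document.

```python
def describe_block(offset, data):
    """Return a hypothetical content descriptor for one block of file data."""
    if len(data) > 0 and not any(data):
        # The block has backing store but holds only zeros: report it as a
        # hole and set di_allocated so the client can tell the difference.
        return {"type": "hole", "di_offset": offset,
                "di_length": len(data), "di_allocated": True}
    # Anything else is returned unchanged as data.
    return {"type": "data", "offset": offset, "data": bytes(data)}
```

Note that the sketch inspects only the block it was handed, mirroring the statement that the server need not search for where the zeros start and end.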
The server may elect to return adjacent elements of the same type. For
example, the guard pattern or block size of an ADH might change, which
would require adjacent elements of type ADH. Likewise if the server
has a range of data comprised entirely of zeros and then a hole, it
might want to return two adjacent holes to the client.
If the client specifies a rpa_count value of zero, the READ_PLUS succeeds
and returns zero bytes of data. In all situations, the server may choose
to return fewer
bytes than specified by the client. The client needs to check for
this condition and handle it appropriately.
If the client specifies rpa_offset and rpa_count values that are entirely
contained within a hole of the file, then the di_offset and di_length
returned must be for the entire hole. This result is considered valid
until the file is changed (detected via the change attribute). The
server MUST provide the same semantics for the hole as if the client
read the region and received zeroes; the lifetime of the implied hole
contents MUST be exactly the same as that of any other read data.
If the client specifies rpa_offset and rpa_count values that
begin in a non-hole of the file but extend into a hole, the server
should return an array comprised of both data and a hole. The
client MUST be prepared for the server to return a short read
describing just the data. The client will
then issue another READ_PLUS for the remaining bytes, to which the server
will respond with information about the hole in the file.
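A client-side loop that copes with such short reads might look like the sketch below. It is illustrative only: read_range and the read_plus callable are hypothetical names, and the (kind, offset, length) tuples are an assumed in-memory form of the rpr_contents array rather than the wire encoding.

```python
def read_range(read_plus, offset, count):
    """Collect (kind, offset, length) segments covering a requested range.

    read_plus is a hypothetical callable taking (offset, count) and
    returning (eof, segments), each segment being (kind, offset, length).
    """
    end = offset + count
    collected = []
    while end > offset:
        eof, segments = read_plus(offset, end - offset)
        collected.extend(segments)
        next_offset = offset
        if segments:
            kind, seg_offset, seg_length = segments[-1]
            # A hole may extend past the request, so this can overshoot end.
            next_offset = seg_offset + seg_length
        # Stop at end-of-file, or if the server made no forward progress.
        if eof or not next_offset > offset:
            break
        offset = next_offset
    return collected
```

Advancing to the end of the last returned segment, rather than by the number of bytes requested, lets the client skip follow-up requests for a hole it has already been told about.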
Except when special stateids are used, the stateid value for a
READ_PLUS request represents a value returned from a previous
byte-range lock or share reservation request or the stateid associated
with a delegation. The stateid identifies the associated owners if
any and is used by the server to verify that the associated locks are
still valid (e.g., have not been revoked).
If the read ended at the end-of-file (formally, in a correctly formed
READ_PLUS operation, if rpa_offset + rpa_count is equal to the size of the
file), or the READ_PLUS operation extends beyond the size of the file
(if rpa_offset + rpa_count is greater than the size of the file), eof is
returned as TRUE; otherwise, it is FALSE. A successful READ_PLUS of
an empty file will always return eof as TRUE.
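The eof rule can be written as a one-line predicate. The sketch below follows only the formal definition given above for a correctly formed request and does not model a server that returns fewer bytes than requested; the function name read_plus_eof is hypothetical.

```python
def read_plus_eof(rpa_offset, rpa_count, file_size):
    """Return the eof flag for a correctly formed READ_PLUS request."""
    # eof is TRUE when the request ends at, or extends beyond, the file size.
    return rpa_offset + rpa_count >= file_size
```

For an empty file, file_size is zero, so the predicate is true for any offset and count, which matches the last sentence above.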
If the current filehandle is not an ordinary file, an error will be
returned to the client. In the case that the current filehandle
represents an object of type NF4DIR, NFS4ERR_ISDIR is returned. If
the current filehandle designates a symbolic link, NFS4ERR_SYMLINK is
returned. In all other cases, NFS4ERR_WRONG_TYPE is returned.
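The file type check amounts to a small lookup, sketched here in Python. The names NF4REG and NF4LNK are the nfs_ftype4 values for regular files and symbolic links from RFC 5661 and do not appear in this section; the function name check_file_type and the use of strings for status codes are assumptions made for illustration.

```python
def check_file_type(ftype):
    """Map an nfs_ftype4 name to the status READ_PLUS would return for it."""
    if ftype == "NF4REG":
        return "NFS4_OK"  # regular file: the read may proceed
    if ftype == "NF4DIR":
        return "NFS4ERR_ISDIR"
    if ftype == "NF4LNK":
        return "NFS4ERR_SYMLINK"
    # Every other object type is rejected with the generic error.
    return "NFS4ERR_WRONG_TYPE"
```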
For a READ_PLUS with a stateid value of all bits equal to zero, the
server MAY allow the READ_PLUS to be serviced subject to mandatory
byte-range locks or the current share deny modes for the file. For a
READ_PLUS with a stateid value of all bits equal to one, the server
MAY allow READ_PLUS operations to bypass locking checks at the server.
On success, the current filehandle retains its value.
<section toc='exclude' title="IMPLEMENTATION">
In general, the IMPLEMENTATION notes for READ in Section 18.22.4 of
<xref target="ref:RFC5661" /> also apply to READ_PLUS. One delta is
that when the owner has a locked byte range, the server
MUST return an array of rpr_contents with values inside that range.
<section toc='exclude' title="Additional pNFS Implementation Information">
With pNFS, the semantics of using READ_PLUS remain the same. Any
data server MAY return a hole or ADH result for a READ_PLUS request that
it receives. When a data server chooses to return such a result, it has the
option of returning information for the data stored on that
data server (as defined by the data layout), but it MUST NOT return
results for a byte range that includes data managed by another data server.
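One way a data server could honor this restriction is to clip each hole or ADH result to the range it manages. The sketch below assumes, purely for illustration, that a data server is responsible for a single contiguous byte range given by ds_start and ds_end (exclusive); real layouts may stripe a file across many such ranges, and none of these names come from this document.

```python
def clip_to_data_server(seg_offset, seg_length, ds_start, ds_end):
    """Clip a hole or ADH result to the byte range one data server manages.

    Returns (offset, length) for the overlapping part, or None when the
    result lies entirely outside the data server's range.
    """
    lo = max(seg_offset, ds_start)
    hi = min(seg_offset + seg_length, ds_end)
    if lo >= hi:
        return None
    return (lo, hi - lo)
```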
A data server should do its best to return as much information about
an ADH as is feasible without having to contact the metadata server.
If communication with the metadata server is required, then every
attempt should be made to minimize the number of requests.
If mandatory locking is enforced, then the data server must also
ensure that it returns only information that is within
the owner’s locked byte range.
<section toc='exclude' title="READ_PLUS with Sparse Files Example">
The following table describes a sparse file. For each byte
range, the file contains either non-zero data or a hole. In addition,
the server in this example uses a Hole Threshold of 32K.
<texttable anchor="space_example">
<ttcol align='left' >Byte-Range</ttcol>
<ttcol align='left' >Contents</ttcol>
<c>0-15999 </c> <c>Hole </c>
<c>16K-31999 </c> <c>Non-Zero</c>
<c>32K-255999 </c> <c>Hole </c>
<c>256K-287999</c> <c>Non-Zero</c>
<c>288K-353999</c> <c>Hole </c>
<c>354K-417999</c> <c>Non-Zero</c>
Under the given circumstances, if a client were to read from the file
with a maximum read size of 64K, the following would be
the results for the given READ_PLUS calls. This assumes the client has already opened the file,
acquired a valid stateid ('s' in the example), and just needs to issue READ_PLUS requests.
<list style="numbers">
READ_PLUS(s, 0, 64K) --> NFS4_OK, eof = false, &lt;data[0,32K], hole[32K,224K]&gt;.
Since the first hole is less than the server's Hole Threshold, the
first 32K of the file is returned as data and the remaining 32K is
returned as a hole which actually extends to 256K.
READ_PLUS(s, 32K, 64K) --> NFS4_OK, eof = false, &lt;hole[32K,224K]&gt;.
The requested range was all zeros, and the current hole begins at offset 32K and is 224K
in length. Note that the client should not have followed up the previous
READ_PLUS request with this one as the hole information from the previous
call extended past what the client was requesting.
READ_PLUS(s, 256K, 64K) --> NFS4_OK, eof = false, &lt;data[256K, 288K], hole[288K, 354K]&gt;.
Returns an array of the 32K data and the hole which extends to 354K.
READ_PLUS(s, 354K, 64K) --> NFS4_OK, eof = true, &lt;data[354K, 418K]&gt;.
Returns the final 64K of data and informs the client there is no more data in the file.
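The four calls above can be reproduced with a small model of the example file. This is a sketch under stated assumptions, not a server implementation: K is taken to be 1000 because the table pairs 16K with 15999, holes shorter than the Hole Threshold are returned as data as the first call suggests, and every segment is written as (kind, offset, length). All names in the code are hypothetical.

```python
K = 1000  # assumption: the table pairs 16K with 15999, so K is 1000 here
HOLE_THRESHOLD = 32 * K

# (kind, start, end) extents of the example file; end is exclusive.
EXTENTS = [
    ("hole", 0, 16 * K),
    ("data", 16 * K, 32 * K),
    ("hole", 32 * K, 256 * K),
    ("data", 256 * K, 288 * K),
    ("hole", 288 * K, 354 * K),
    ("data", 354 * K, 418 * K),
]
FILE_SIZE = 418 * K


def read_plus(offset, count):
    """Return (eof, segments), each segment being (kind, offset, length)."""
    end = min(offset + count, FILE_SIZE)
    segments = []
    for kind, start, stop in EXTENTS:
        if offset >= stop or start >= end:
            continue  # this extent does not overlap the request
        if kind == "hole" and stop - start >= HOLE_THRESHOLD:
            # Large holes are reported whole, even beyond the request.
            segments.append(("hole", start, stop - start))
            continue
        # Data, or a hole below the threshold, is returned as data clipped
        # to the request and merged with any data segment just before it.
        lo, hi = max(start, offset), min(stop, end)
        if segments and segments[-1][0] == "data":
            lo = segments.pop()[1]
        segments.append(("data", lo, hi - lo))
    eof = offset + count >= FILE_SIZE
    return eof, segments
```

In this model, read_plus(256 * K, 64 * K) yields a 32K data segment followed by a hole of length 66K starting at 288K, which is the hole that extends to 354K in the third call.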
