Adjustments to fread documentation (#3361)
- explicitly mention that `columns` parameter will only work for CSV files;
- add more details on how `s3` buckets are read;
- minor text corrections.

Closes #3360
oleksiyskononenko committed Sep 26, 2022
1 parent 7e806eb commit 79fcaff
Showing 1 changed file with 10 additions and 8 deletions.
18 changes: 10 additions & 8 deletions docs/api/dt/fread.rst
@@ -20,9 +20,9 @@
Parameters
----------
anysource: str | bytes | file | Pathlike | List
- The first (unnamed) argument to fread is the *input source*.
- Multiple types of sources are supported, and they can be named
- explicitly: `file`, `text`, `cmd`, and `url`. When the source is
+ The first argument to fread is the input source.
+ Multiple types of sources are supported and can be named
+ explicitly as `file`, `text`, `cmd`, and `url`. When the source is
not named, fread will attempt to guess its type. The most common
type is `file`, but sometimes the argument is resolved as `text`
(if the string contains newlines) or `url` (if the string starts
@@ -37,7 +37,7 @@
``.read()``.

Generally, specifying a file name should be preferred, since
- reading from a Python ``file`` can only be done in single-threaded
+ reading from a Python ``file`` can only be done in a single-threaded
mode.

This argument also supports addressing files inside an archive,
@@ -54,17 +54,19 @@

url: str
This parameter can be used to specify the URL of the input file.
- S3 URLs are also supported. The data will first be downloaded
- into a temporary directory and then read from there. In the end
- the temporary files will be removed.
+ The data will first be downloaded into a temporary directory and
+ then read from there. In the end the temporary files will be removed.
+
+ A path to a public S3 bucket is also supported, however, internally
+ it first gets converted into the corresponding https URL.
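
To illustrate the kind of conversion the new paragraph describes, here is a minimal sketch that maps an `s3://` path to an https URL using only the standard library. The helper name `s3_to_https`, the bucket and key in the example, and the virtual-hosted-style form `https://<bucket>.s3.amazonaws.com/<key>` are all assumptions made for illustration; the commit does not show which URL form fread produces internally.

```python
from urllib.parse import quote, urlsplit


def s3_to_https(s3_path: str) -> str:
    """Hypothetical helper: map s3://bucket/key to a public https URL."""
    parts = urlsplit(s3_path)
    if parts.scheme != "s3" or not parts.netloc:
        raise ValueError(f"not an S3 path: {s3_path!r}")
    bucket = parts.netloc
    # Drop the leading slash and percent-encode unsafe characters in the key.
    key = quote(parts.path.lstrip("/"))
    # Assumed virtual-hosted-style address; fread may build the URL differently.
    return f"https://{bucket}.s3.amazonaws.com/{key}"


print(s3_to_https("s3://my-bucket/data/train.csv"))
```

Because the path is turned into a plain https URL, only publicly readable objects can be fetched this way, which matches the "public S3 bucket" wording in the added text.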

We use the standard ``urllib.request`` module to download the
data. Changing the settings of that module, for example installing
proxy, password, or cookie managers will allow you to customize
the download process.
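
The customization mentioned above can be sketched with standard `urllib.request` calls. The proxy address and the `User-Agent` value below are placeholders, and whether a given fread version routes its download through the module-wide default opener is an assumption based on the paragraph above, not something this diff confirms.

```python
import urllib.request

# Placeholder proxy address (hypothetical); substitute a real proxy.
proxy = urllib.request.ProxyHandler({"https": "http://proxy.example.com:8080"})

# Build an opener that sends https requests through the proxy.
opener = urllib.request.build_opener(proxy)
opener.addheaders = [("User-Agent", "my-fread-client/1.0")]

# Make this opener the default used by urllib.request.urlopen().
urllib.request.install_opener(opener)
```

After `install_opener()`, any code in the same process that downloads through `urllib.request.urlopen()` uses these settings, so a later `fread(url=...)` call would presumably pick them up as well.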

columns: ...
- Limit which columns to read from the input file.
+ Limit which columns to read from the CSV file.

sep: str | None
Field separator in the input file. If this value is `None`