From babacd9ba4401034be8da15dbf25c4e0f972f87d Mon Sep 17 00:00:00 2001
From: Ran
Date: Tue, 9 Jun 2020 16:40:34 +0800
Subject: [PATCH 1/2] sql: update parameters and examples of load data

---
 sql-statements/sql-statement-load-data.md | 59 ++++++++++++++++++++++-
 1 file changed, 57 insertions(+), 2 deletions(-)

diff --git a/sql-statements/sql-statement-load-data.md b/sql-statements/sql-statement-load-data.md
index c8b76c74edddb..d2f25e6306ec5 100644
--- a/sql-statements/sql-statement-load-data.md
+++ b/sql-statements/sql-statement-load-data.md
@@ -15,10 +15,52 @@ The `LOAD DATA` statement batch loads data into a TiDB table.
 
 ![LoadDataStmt](/media/sqlgram/LoadDataStmt.png)
 
+## Parameters
+
+### `LocalOpt`
+
+You can specify that the imported data file is located on the client or server by configuring the `LocalOpt` parameter. Currently TiDB only supports data import from the client, so when importing data, set `LocalOpt` to `Local`.
+
+### `Fields` and `Lines`
+
+You can specify how to process the data format by configuring the `Fields` and `Lines` parameters.
+
+- `FIELDS TERMINATED BY`: Specify the separating character of the data.
+- `FIELDS ENCLOSED BY`: Specify the enclosing character of the data.
+- `LINES TERMINATED BY`: Specify the line terminator, if you want to end a line with a certain character.
+
+Take the following data format as an example:
+
+```
+"bob","20","street 1"\r\n
+"alice","33","street 1"\r\n
+```
+
+If you want to extract `bob`, `20`, and `street 1`, specify the separating character as `','`, and the enclosing character as `'\"'`:
+
+```sql
+FIELDS TERMINATED BY ',' ENCLOSED BY '\"' LINES TERMINATED BY '\r\n'
+```
+
+If you do not specify the processing parameters, the imported data is processed in the following manner:
+
+```sql
+FIELDS TERMINATED BY '\t' ENCLOSED BY ''
+LINES TERMINATED BY '\n'
+```
+
+### `IGNORE number LINES`
+
+You can ignore the first `number` lines of the file by configuring the `IGNORE number LINES` parameter. For example, if you configure `IGNORE 1 LINES`, the first line of the file is ignored.
+
+In addition, for the `DuplicateOpt`, `CharsetOpt`, and `LoadDataSetSpecOpt` parameters, TiDB currently only supports parsing syntax.
+
 ## Examples
 
+{{< copyable "sql" >}}
+
 ```sql
-mysql> CREATE TABLE trips (
+CREATE TABLE trips (
     -> trip_id bigint NOT NULL PRIMARY KEY AUTO_INCREMENT,
     -> duration integer not null,
     -> start_date datetime,
@@ -30,10 +72,23 @@ mysql> CREATE TABLE trips (
     -> bike_number varchar(255),
     -> member_type varchar(255)
     -> );
+```
+
+```
 Query OK, 0 rows affected (0.14 sec)
+```
+
+The following example imports data using `LOAD DATA`. Comma is specified as the separating character. The double quotation marks that enclose the data is ignored. The first line of the file is ignored.
+
+If you see the error message `ERROR 1148 (42000): the used command is not allowed with this TiDB version`, refer to [ERROR 1148 (42000): the used command is not allowed with this TiDB version](/faq/tidb-faq.md#error-1148-42000-the-used-command-is-not-allowed-with-this-tidb-version)
+
+{{< copyable "sql" >}}
 
-mysql> LOAD DATA LOCAL INFILE '/mnt/evo970/data-sets/bikeshare-data/2017Q4-capitalbikeshare-tripdata.csv' INTO TABLE trips FIELDS TERMINATED BY ',' ENCLOSED BY '\"' LINES TERMINATED BY '\r\n' IGNORE 1 LINES (duration, start_date, end_date, start_station_number, start_station, end_station_number, end_station, bike_number, member_type);
+```
+LOAD DATA LOCAL INFILE '/mnt/evo970/data-sets/bikeshare-data/2017Q4-capitalbikeshare-tripdata.csv' INTO TABLE trips FIELDS TERMINATED BY ',' ENCLOSED BY '\"' LINES TERMINATED BY '\r\n' IGNORE 1 LINES (duration, start_date, end_date, start_station_number, start_station, end_station_number, end_station, bike_number, member_type);
+```
 
+```
 Query OK, 815264 rows affected (39.63 sec)
 Records: 815264 Deleted: 0 Skipped: 0 Warnings: 0
 ```

From 0927c5eb86d92bfa7908c6690eb248d400eac7dd Mon Sep 17 00:00:00 2001
From: Ran
Date: Tue, 9 Jun 2020 20:32:41 +0800
Subject: [PATCH 2/2] Apply suggestions from code review

Co-authored-by: TomShawn <41534398+TomShawn@users.noreply.github.com>
---
 sql-statements/sql-statement-load-data.md | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/sql-statements/sql-statement-load-data.md b/sql-statements/sql-statement-load-data.md
index d2f25e6306ec5..56618b9712af1 100644
--- a/sql-statements/sql-statement-load-data.md
+++ b/sql-statements/sql-statement-load-data.md
@@ -19,15 +19,15 @@ The `LOAD DATA` statement batch loads data into a TiDB table.
 
 ### `LocalOpt`
 
-You can specify that the imported data file is located on the client or server by configuring the `LocalOpt` parameter. Currently TiDB only supports data import from the client, so when importing data, set `LocalOpt` to `Local`.
+You can specify that the imported data file is located on the client or on the server by configuring the `LocalOpt` parameter. Currently, TiDB only supports data import from the client. Therefore, when importing data, set the value of `LocalOpt` to `Local`.
 
 ### `Fields` and `Lines`
 
 You can specify how to process the data format by configuring the `Fields` and `Lines` parameters.
 
-- `FIELDS TERMINATED BY`: Specify the separating character of the data.
-- `FIELDS ENCLOSED BY`: Specify the enclosing character of the data.
-- `LINES TERMINATED BY`: Specify the line terminator, if you want to end a line with a certain character.
+- `FIELDS TERMINATED BY`: Specifies the separating character of each data.
+- `FIELDS ENCLOSED BY`: Specifies the enclosing character of each data.
+- `LINES TERMINATED BY`: Specifies the line terminator, if you want to end a line with a certain character.
 
 Take the following data format as an example:
 
@@ -42,7 +42,7 @@ If you want to extract `bob`, `20`, and `street 1`, specify the separating chara
 FIELDS TERMINATED BY ',' ENCLOSED BY '\"' LINES TERMINATED BY '\r\n'
 ```
 
-If you do not specify the processing parameters, the imported data is processed in the following manner:
+If you do not specify the parameters above, the imported data is processed in the following way by default:
 
 ```sql
 FIELDS TERMINATED BY '\t' ENCLOSED BY ''
@@ -51,9 +51,9 @@ LINES TERMINATED BY '\n'
 
 ### `IGNORE number LINES`
 
-You can ignore the first `number` lines of the file by configuring the `IGNORE number LINES` parameter. For example, if you configure `IGNORE 1 LINES`, the first line of the file is ignored.
+You can ignore the first `number` lines of a file by configuring the `IGNORE number LINES` parameter. For example, if you configure `IGNORE 1 LINES`, the first line of a file is ignored.
 
-In addition, for the `DuplicateOpt`, `CharsetOpt`, and `LoadDataSetSpecOpt` parameters, TiDB currently only supports parsing syntax.
+In addition, TiDB currently only supports parsing the syntax of the `DuplicateOpt`, `CharsetOpt`, and `LoadDataSetSpecOpt` parameters.
 
 ## Examples
 
@@ -80,7 +80,7 @@ Query OK, 0 rows affected (0.14 sec)
 
 The following example imports data using `LOAD DATA`. Comma is specified as the separating character. The double quotation marks that enclose the data is ignored. The first line of the file is ignored.
 
-If you see the error message `ERROR 1148 (42000): the used command is not allowed with this TiDB version`, refer to [ERROR 1148 (42000): the used command is not allowed with this TiDB version](/faq/tidb-faq.md#error-1148-42000-the-used-command-is-not-allowed-with-this-tidb-version)
+If you see the error message `ERROR 1148 (42000): the used command is not allowed with this TiDB version`, refer to [ERROR 1148 (42000): the used command is not allowed with this TiDB version](/faq/tidb-faq.md#error-1148-42000-the-used-command-is-not-allowed-with-this-tidb-version).
 
 {{< copyable "sql" >}}