*: Include bikeshare example database in manual #596

Merged: 16 commits, Sep 12, 2018
1 change: 1 addition & 0 deletions README.md
@@ -8,6 +8,7 @@
+ Quick Start
- [TiDB Quick Start Guide](QUICKSTART.md)
- [Basic SQL Statements](try-tidb.md)
- [Bikeshare Example Database](bikeshare-example-database.md)
+ TiDB User Guide
+ TiDB Server Administration
- [The TiDB Server](sql/tidb-server.md)
67 changes: 67 additions & 0 deletions bikeshare-example-database.md
@@ -0,0 +1,67 @@
---
title: Bikeshare Example Database
summary: Install the Bikeshare Example Database
---

# Bikeshare Example Database

Examples used in the TiDB manual use [System Data](https://www.capitalbikeshare.com/system-data) from
Capital Bikeshare, released under the [Capital Bikeshare Data License Agreement](https://www.capitalbikeshare.com/data-license-agreement).

## Downloading all data files

The system data is available [for download in .zip files](https://s3.amazonaws.com/capitalbikeshare-data/index.html) organized per year. Downloading and extracting all files requires approximately 3GB of disk space. To download all files for years 2010-2017 using a bash script:

Review comment (Member): Please delete the extra spaces before and after the sentence "Downloading and extracting all files requires approximately 3GB of disk space." Leave just the one necessary space.


```

mkdir -p bikeshare-data && cd bikeshare-data

for YEAR in 2010 2011 2012 2013 2014 2015 2016 2017; do
    wget https://s3.amazonaws.com/capitalbikeshare-data/${YEAR}-capitalbikeshare-tripdata.zip
    unzip ${YEAR}-capitalbikeshare-tripdata.zip
done;

```

Review comment (Member), on the opening code fence: Please add `bash` to the code fence, and add the corresponding language tag to the other code blocks in this PR.

Review comment (Member), on the blank line after the opening fence: Remove this blank line.

Review comment (Member), on the blank line before the closing fence: Remove this blank line, too.
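
The download loop in this section could also be written a little more defensively. The following is only a sketch, not part of the PR: the function names `tripdata_url` and `download_all` are made up for illustration, and the URL pattern is simply the one used in the diff. It quotes every expansion and stops at the first failed download or extraction instead of continuing silently.

```bash
#!/usr/bin/env bash
# Hypothetical helpers for illustration; they are not part of the PR.

# Print the download URL for one year's archive (pattern taken from the diff).
tripdata_url() {
    printf 'https://s3.amazonaws.com/capitalbikeshare-data/%s-capitalbikeshare-tripdata.zip\n' "$1"
}

# Download and extract every archive from year $1 through year $2.
# Returns non-zero as soon as one step fails.
download_all() {
    local first=$1 last=$2 year
    mkdir -p bikeshare-data && cd bikeshare-data || return 1
    for year in $(seq "$first" "$last"); do
        wget "$(tripdata_url "$year")" || return 1
        unzip "${year}-capitalbikeshare-tripdata.zip" || return 1
    done
}
```

Calling `download_all 2010 2017` would cover the same years as the loop in the diff. Like the original, it changes into `bikeshare-data` and leaves the shell there.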

## Load data into TiDB

The system data can be imported into TiDB using the following schema:

```
CREATE DATABASE bikeshare;
USE bikeshare;

CREATE TABLE trips (
  trip_id bigint NOT NULL PRIMARY KEY auto_increment,
  duration integer not null,
  start_date datetime,
  end_date datetime,
  start_station_number integer,
  start_station varchar(255),
  end_station_number integer,
  end_station varchar(255),
  bike_number varchar(255),
  member_type varchar(255)
);
```
You can import files indivudally using the example `LOAD DATA` command here, or import all files using the bash loop below:

Review comment (Member): indivudally -> individually


```
LOAD DATA LOCAL INFILE '2017Q1-capitalbikeshare-tripdata.csv' INTO TABLE trips
FIELDS TERMINATED BY ',' ENCLOSED BY '"'
LINES TERMINATED BY '\r\n'
IGNORE 1 LINES
(duration, start_date, end_date, start_station_number, start_station,
end_station_number, end_station, bike_number, member_type);
```
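
One way to sanity-check a single-file import is to compare the number of data rows in the CSV with the row count in `trips`. The sketch below is not part of the PR, and the helper names `csv_data_rows` and `check_import` are hypothetical. It assumes the `trips` table was empty before the load, that the file ends with a newline, and that no quoted field contains an embedded line break.

```bash
# Hypothetical helpers for illustration; they are not part of the PR.

# Count the data rows in a CSV file, skipping the single header line.
csv_data_rows() {
    tail -n +2 "$1" | wc -l | tr -d ' '
}

# Compare the CSV row count with the number of rows in the trips table.
# Assumes trips held no rows before this file was loaded.
check_import() {
    local file=$1 expected actual
    expected=$(csv_data_rows "$file")
    # -N suppresses the column-name header in the mysql client output.
    actual=$(mysql bikeshare -N -e "SELECT COUNT(*) FROM trips")
    echo "csv rows: $expected, table rows: $actual"
    [ "$expected" = "$actual" ]
}
```

For example, `check_import 2017Q1-capitalbikeshare-tripdata.csv` returns non-zero if the two counts differ.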

### Import all files

To import all `*.csv` files into TiDB in a bash loop:

```
for FILE in `ls *.csv`; do
    echo "== $FILE =="
    mysql bikeshare -e "LOAD DATA LOCAL INFILE '${FILE}' INTO TABLE trips FIELDS TERMINATED BY ',' ENCLOSED BY '\"' LINES TERMINATED BY '\r\n' IGNORE 1 LINES (duration, start_date, end_date, start_station_number, start_station, end_station_number, end_station, bike_number, member_type);"
done;
```
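
A possible alternative to the loop above, offered as a sketch and not as part of the PR: iterate over the glob directly instead of parsing `ls` output, and build the statement in a here-document so that the quotes need no escaping. The helper names `load_stmt` and `import_all` are hypothetical. Whether `--local-infile=1` is needed depends on how the client and server are configured, so treat that flag as an assumption.

```bash
# Hypothetical helpers for illustration; they are not part of the PR.

# Print the LOAD DATA statement for one CSV file.
# Note: a file name containing a single quote would break the SQL string.
load_stmt() {
    cat <<SQL
LOAD DATA LOCAL INFILE '$1' INTO TABLE trips
  FIELDS TERMINATED BY ',' ENCLOSED BY '"'
  LINES TERMINATED BY '\r\n'
  IGNORE 1 LINES
  (duration, start_date, end_date, start_station_number, start_station,
   end_station_number, end_station, bike_number, member_type);
SQL
}

# Load every CSV file in the current directory, stopping at the first failure.
import_all() {
    local file
    for file in *.csv; do
        [ -e "$file" ] || continue   # the glob matched nothing
        echo "== $file =="
        load_stmt "$file" | mysql --local-infile=1 bikeshare || return 1
    done
}
```

Running `import_all` from inside `bikeshare-data` would load the same files as the loop in the diff. Printing `load_stmt some-file.csv` first is a cheap way to inspect the exact statement before anything is sent to the server.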