# Pisces trial database upload

This document describes the steps required to upload the Pisces trial data to MariaDB.

In the preceding step we screened and pre-processed the raw data for upload. Here we add the records to the database. Entering records one at a time through a form is impractical, so we bulk upload them. First, though, we need to create a database to upload to.

## Pisces trial database structure
To generate the database from scratch, we use SQL rather than any sort of graphical interface. It is easier to spot mistakes, and if we screw something up it's easier at this early stage to simply re-run everything and start again. First we need to log in via a terminal to the server and start MariaDB. The user of course needs to have write permissions. These were established when I set up MariaDB in the server chapter.  
**Note**:  
- We use the -p option so that MariaDB prompts for a password. Without it, the client tries to connect without one.
- SQL statements are finished with a semicolon.

```
ssh craig@77.68.125.129
>craigspassword

mariadb -p
Enter password: craigspassword
```
This brings up the MariaDB prompt. The approach I take is to keep the SQL code in a text file and copy-paste segments into the terminal step by step. Be careful if you are using a Windows text editor, because it may save line endings as a carriage return plus line feed (CRLF) rather than the plain line feed that Unix expects.

```
CREATE DATABASE IF NOT EXISTS piscestrial;

SHOW DATABASES;

USE piscestrial;

CREATE TABLE trialID(
    SampleID varchar(30) PRIMARY KEY,
    Project varchar(30) NOT NULL,
    Asset varchar(30) NOT NULL,
    StartDateDT date NOT NULL,
    StartDate varchar(10) NOT NULL,
    EndDate varchar(10) NOT NULL,
    Location varchar(30) NOT NULL,
    Activity varchar(10) NOT NULL,
    Gear varchar(20),    
    GearType varchar(20),
    GearNo INT,
    Bait varchar(30) NOT NULL,
    GearInfo varchar(30) NOT NULL,    
    LatitudeCatchApp decimal(10,7) NOT NULL,    
    LongitudeCatchApp decimal(10,7) NOT NULL,  
    SoakTowTime decimal(4),
    Position varchar(30) NOT NULL,
    Light varchar(30) NOT NULL,
    Colour varchar(30) NOT NULL,
    Flash varchar(30) NOT NULL,
    Intensity varchar(30) NOT NULL
    );

CREATE TABLE catchData(
    SampleID varchar(30) NOT NULL,
    CountID varchar(60) PRIMARY KEY,
    Species varchar(30) NOT NULL,
    ReturnedWeight INT,
    ReturnedNumber INT,
    RetainedWeight INT,
    RetainedNumber INT
    );

CREATE TABLE trackData(
    Date datetime NOT NULL,
    Latitude decimal(10,7),    
    Longitude decimal(10,7), 
    SpeedMPH INT,
    SpeedKPH INT,
    AltFeet INT,
    Altmeter INT,
    Accuracy INT,
    Type varchar(10),
    DateString varchar(40),        
    Asset varchar(30) NOT NULL,
    trackID varchar(60) PRIMARY KEY
    );


CREATE TABLE effortData(
    SampleID varchar(30) PRIMARY KEY,
    Asset varchar(30) NOT NULL,
    StartDate varchar(10) NOT NULL,
    StartDateDT date NOT NULL,
    Activity varchar(10) NOT NULL,
    TrawlStartTime varchar(30) NOT NULL,
    TrawlEndTime varchar(30) NOT NULL,
    TrawlStartLat decimal(10,7),    
    TrawlStartLon decimal(10,7), 
    TrawlEndLat decimal(10,7),    
    TrawlEndLon decimal(10,7), 
    TowTimeMinutes decimal(7,3), 
    TowTimeHours decimal(6,4)
    );

CREATE TABLE ElAnneEffortData(
    EffortIDEilidhAnne varchar(30) NOT NULL,
    Asset varchar(30) NOT NULL,
    StartDate varchar(10) NOT NULL,
    StartDateDT date NOT NULL,
    TrawlDayStart varchar(30) NOT NULL,
    TrawlDayEnd varchar(30) NOT NULL,
    TrawlDayStartLat decimal(10,7),    
    TrawlDayStartLon decimal(10,7), 
    TrawlDayEndLat decimal(10,7),    
    TrawlDayEndLon decimal(10,7), 
    TowTimeMinutes decimal(7,3), 
    TowTimeHours decimal(6,4)
    );
    

CREATE TABLE fishLengthData(
    SampleID varchar(30) NOT NULL,
    Species varchar(30) NOT NULL,
    Length INT NOT NULL
    );

CREATE TABLE sampleWeightData(
    SampleID varchar(30) NOT NULL,
    WeightSpecies varchar(30) NOT NULL,
    Weight INT NOT NULL
    );

CREATE TABLE bulkData(
    SampleID varchar(30) PRIMARY KEY,
    Bulk INT NOT NULL
    );

CREATE TABLE deckPhotoFile(
    SampleID varchar(30) PRIMARY KEY,
    PhotoFilename varchar(60) NOT NULL,
    PhotoSubject varchar(30) NOT NULL
    );
```
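
Before moving on, it can be worth confirming that the tables were created as intended. The following is a minimal sketch using standard MariaDB statements. The choice of `trialID` and `catchData` as the tables to inspect is arbitrary.

```sql
USE piscestrial;

-- List the tables in the database; the nine created above should appear
SHOW TABLES;

-- Column names, types, nullability and keys for one table
DESCRIBE trialID;

-- The full CREATE TABLE statement as MariaDB has stored it
SHOW CREATE TABLE catchData;
```

If a column type or key is wrong at this stage, it is usually simplest to drop the database and re-run the creation script.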

So far so good. MariaDB has generated a database schema and structure, and set up directories for where the tables will live. Don't be tempted to manually alter or define these. MariaDB will manage (effectively) unlimited numbers of databases, but it needs to keep track of everything internally. It is best to let it do its own organising.  
Of course we only need to make the database once. Once it's there, we need to populate it with data.  

## Transferring data to the server
To get data to the server I use FileZilla, a graphical FTP client. The database files live in:   
/home/sntech/PiscesTrialDatabase  
Within this directory there is a Data folder, which holds the csv files for upload, and folders for both the original and the renamed versions of the photos. Needless to say, you will need write permissions for this directory. 
````{margin}
```{warning}
I can't stress enough how important it is to be careful when moving files to and from the server, particularly when you have extended write permissions. A lot of damage can be done within a very short period of time on a Unix server!
```
````
Now that the files are on the server, we can again use some SQL code to upload them to the database. The exact code will depend on which tables you are uploading to. The file names in the example below are generic placeholders.

```{warning}
Deep breath...
```

Assuming we have just ssh'ed into the server, we need to start MariaDB first.

```
mariadb -p
Enter password: craigspassword

SET GLOBAL local_infile=1;

USE piscestrial;

LOAD DATA LOCAL INFILE '/home/sntech/PiscesTrialDatabase/Data/trialIDProcessed.csv' 
IGNORE INTO TABLE trialID 
FIELDS TERMINATED BY ',' 
ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 1 ROWS;

LOAD DATA LOCAL INFILE '/home/sntech/PiscesTrialDatabase/Data/catchDataProcessed.csv' 
IGNORE INTO TABLE catchData 
FIELDS TERMINATED BY ',' 
ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 1 ROWS;

LOAD DATA LOCAL INFILE '/home/sntech/PiscesTrialDatabase/Data/trackDataProcessed.csv' 
IGNORE INTO TABLE trackData 
FIELDS TERMINATED BY ',' 
ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 1 ROWS;

LOAD DATA LOCAL INFILE '/home/sntech/PiscesTrialDatabase/Data/effortDataProcessed.csv' 
IGNORE INTO TABLE effortData 
FIELDS TERMINATED BY ',' 
ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 1 ROWS;

LOAD DATA LOCAL INFILE '/home/sntech/PiscesTrialDatabase/Data/fishLengthDataProcessed.csv' 
IGNORE INTO TABLE fishLengthData 
FIELDS TERMINATED BY ',' 
ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 1 ROWS;

LOAD DATA LOCAL INFILE '/home/sntech/PiscesTrialDatabase/Data/sampleWeightDataProcessed.csv' 
IGNORE INTO TABLE sampleWeightData 
FIELDS TERMINATED BY ',' 
ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 1 ROWS;

LOAD DATA LOCAL INFILE '/home/sntech/PiscesTrialDatabase/Data/bulkDataProcessed.csv' 
IGNORE INTO TABLE bulkData 
FIELDS TERMINATED BY ',' 
ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 1 ROWS;

LOAD DATA LOCAL INFILE '/home/sntech/PiscesTrialDatabase/Data/deckPhotoFileProcessed.csv' 
IGNORE INTO TABLE deckPhotoFile 
FIELDS TERMINATED BY ',' 
ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 1 ROWS;
```

A few explanatory notes...  
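- The SET GLOBAL local_infile=1; statement makes sure the server accepts LOAD DATA LOCAL. MySQL 8.0 disables this by default as a security measure. MariaDB servers usually have it enabled already, in which case the statement is a harmless safeguard. Note that SET GLOBAL itself needs an account with administrative privileges.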
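- LOAD DATA LOCAL means the file is read by the client program and sent to the server. If you drop the LOCAL option, the server tries to read the file itself, which requires the FILE privilege. You will then receive a relatively uninformative message about permissions and tear your hair out trying to figure out what's wrong.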
- IGNORE INTO TABLE is a rather odd syntax, but it prevents upload of duplicates. With IGNORE, any row in the file whose primary key or unique index value already exists in the table is skipped rather than causing an error, so you cannot add two lines for the same data point. This only works for tables that define such a key. The tables created above without a primary key (fishLengthData, sampleWeightData and ElAnneEffortData) will accept duplicate rows, so take care not to load the same file into them twice.
- IGNORE 1 ROWS drops the header from the CSV.
- For Eilidh Anne there is a separate effort table (ElAnneEffortData), because we can only identify daily effort, not hours per haul.
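
The notes above can be illustrated with a short sketch. It assumes you are at the MariaDB prompt with `piscestrial` selected. The file name `exampleProcessed.csv` is hypothetical, and the CRLF variant is only needed if a CSV was saved with Windows line endings.

```sql
-- Check whether the server currently accepts LOAD DATA LOCAL
SHOW GLOBAL VARIABLES LIKE 'local_infile';

-- Variant for a CSV saved with Windows (CRLF) line endings.
-- The file name is a hypothetical placeholder.
LOAD DATA LOCAL INFILE '/home/sntech/PiscesTrialDatabase/Data/exampleProcessed.csv'
IGNORE INTO TABLE bulkData
FIELDS TERMINATED BY ','
ENCLOSED BY '"'
LINES TERMINATED BY '\r\n'
IGNORE 1 ROWS;

-- Run straight after a LOAD DATA statement to list the rows that
-- were skipped as duplicates or had values truncated or converted
SHOW WARNINGS;
```

SHOW WARNINGS only reports on the most recent statement, so it needs to be run after each load rather than once at the end.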

If all has gone according to plan, everything is in the database. You can check individual tables by (for example):
```
SELECT * FROM trialID;
```
Or if you want to do a visual check, see the next step.
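
For larger tables, `SELECT *` returns a lot of output. A few summary queries may be more practical. This is a sketch only: the alias `n_rows` is arbitrary, and the last query assumes that every SampleID in catchData is expected to have a matching row in trialID.

```sql
-- Row counts, to compare with the number of data rows in each CSV
SELECT COUNT(*) AS n_rows FROM trialID;
SELECT COUNT(*) AS n_rows FROM catchData;

-- Look at the first few rows only, rather than the whole table
SELECT * FROM trialID LIMIT 10;

-- Catch records whose SampleID has no matching row in trialID
SELECT c.SampleID, c.CountID
FROM catchData AS c
LEFT JOIN trialID AS t ON t.SampleID = c.SampleID
WHERE t.SampleID IS NULL;
```

An empty result from the last query means every catch record links back to a trial record.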

## An alternative...
It is possible to use a GUI to upload data. This is occasionally useful if you have to upload from a local file. However, I prefer to keep the master file(s) saved on the server so things can be reconstructed *de novo* if required.  
The recommended GUI is phpMyAdmin. 

```{figure} PhpMyAdminScreenshot2.png
```

Generally I use phpMyAdmin just as a visual check that nothing stupid has crept in.  

That completes the Pisces Trial database read-in procedure.