# Introduction

These notes present examples of data processing in R.

# Read and Write Data Files

## CSV Files

### Read CSV Files

#### Read a CSV File with Default Options
* `read.csv` returns a data.frame

The comma-separated values (CSV) file used here for demonstration has the following content:
```
"","BPchange","Dose","Run","Treatment","Animal"
"1",0.5,6.25,"C1","Control","R1"
"2",4.5,12.5,"C1","Control","R1"  
"3",10,25,"C1","Control","R1"
"4",26,50,"C1","Control","R1"
"5",37,100,"C1","Control","R1"
"6",32,200,"C1","Control","R1"
```

In [1]:
# Getting data from file "rabbit.csv"
rabbit_sample <- read.csv("datasets/rabbit.csv")

# Print the class of the variable rabbit_sample
print(class(rabbit_sample))

# Printing first few lines of the dataframe
head(rabbit_sample)

[1] "data.frame"


| | X | BPchange | Dose | Run | Treatment | Animal |
|---|---|---|---|---|---|---|
| | `<int>` | `<dbl>` | `<dbl>` | `<fct>` | `<fct>` | `<fct>` |
| 1 | 1 | 0.5 | 6.25 | C1 | Control | R1 |
| 2 | 2 | 4.5 | 12.5 | C1 | Control | R1 |
| 3 | 3 | 10.0 | 25.0 | C1 | Control | R1 |
| 4 | 4 | 26.0 | 50.0 | C1 | Control | R1 |
| 5 | 5 | 37.0 | 100.0 | C1 | Control | R1 |
| 6 | 6 | 32.0 | 200.0 | C1 | Control | R1 |
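
The `X` column in this output comes from the unnamed first column of the file, which holds the row labels. As a minimal sketch, assuming the same file layout, the `row.names` argument of `read.csv` can be used to treat that column as row names instead. The inline `csv_text` string is an assumed stand-in for the first rows of `datasets/rabbit.csv`, so the snippet runs without the file.

```r
# Stand-in for the first rows of "datasets/rabbit.csv" (assumed layout)
csv_text <- '"","BPchange","Dose","Run","Treatment","Animal"
"1",0.5,6.25,"C1","Control","R1"
"2",4.5,12.5,"C1","Control","R1"
"3",10,25,"C1","Control","R1"'

# Default options: the unnamed first column is read as a column called "X"
rabbit_default <- read.csv(text = csv_text)
print(names(rabbit_default))

# row.names = 1: the first column supplies the row names instead
rabbit_named <- read.csv(text = csv_text, row.names = 1)
print(names(rabbit_named))
print(rownames(rabbit_named))
```

Whether to keep the labels as a column or as row names depends on how the data is used later; the default is kept in the rest of these notes.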


#### Not Assuming the First Row in the CSV File is Labels
* The column labels will be "V1", "V2", and so on.
* The header row of the file is read as the first row of data.

In [2]:
# Getting data from file "rabbit.csv"
rabbit_sample <- read.csv("datasets/rabbit.csv", header = FALSE)

# Printing first few lines of the dataframe
head(rabbit_sample)

| | V1 | V2 | V3 | V4 | V5 | V6 |
|---|---|---|---|---|---|---|
| | `<int>` | `<fct>` | `<fct>` | `<fct>` | `<fct>` | `<fct>` |
| 1 | NA | BPchange | Dose | Run | Treatment | Animal |
| 2 | 1 | 0.5 | 6.25 | C1 | Control | R1 |
| 3 | 2 | 4.5 | 12.5 | C1 | Control | R1 |
| 4 | 3 | 10 | 25 | C1 | Control | R1 |
| 5 | 4 | 26 | 50 | C1 | Control | R1 |
| 6 | 5 | 37 | 100 | C1 | Control | R1 |
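
Because the header row is read as data when `header = FALSE`, columns such as `V2` and `V3` are no longer parsed as numbers. The sketch below checks this with an inline `csv_text` string, an assumed stand-in for the file. Depending on the R version and the `stringsAsFactors` setting, the text columns come back as factor or character. Skipping the first line with `skip = 1` is shown as one possible way to get numeric columns back while keeping the "V1", "V2" labels.

```r
# Stand-in for the first rows of "datasets/rabbit.csv" (assumed layout)
csv_text <- '"","BPchange","Dose","Run","Treatment","Animal"
"1",0.5,6.25,"C1","Control","R1"
"2",4.5,12.5,"C1","Control","R1"
"3",10,25,"C1","Control","R1"'

# header = FALSE: the header line becomes row 1, so V2 and V3 are not numeric
rabbit_raw <- read.csv(text = csv_text, header = FALSE)
print(sapply(rabbit_raw, class))

# Skipping the header line lets the values parse as numbers again
rabbit_skip <- read.csv(text = csv_text, header = FALSE, skip = 1)
print(sapply(rabbit_skip, class))
```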


#### Using Custom Column Names
* `col.names` takes a character vector with one name per column; these names replace the labels read from the header row.

In [3]:
# Getting data from file "rabbit.csv"
rabbit_sample <- read.csv("datasets/rabbit.csv", col.names = c("A", "B", "C", "D", "E", "F"))

# Printing first few lines of the dataframe
head(rabbit_sample)

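| | A | B | C | D | E | F |
|---|---|---|---|---|---|---|
| | `<int>` | `<dbl>` | `<dbl>` | `<fct>` | `<fct>` | `<fct>` |
| 1 | 1 | 0.5 | 6.25 | C1 | Control | R1 |
| 2 | 2 | 4.5 | 12.5 | C1 | Control | R1 |
| 3 | 3 | 10.0 | 25.0 | C1 | Control | R1 |
| 4 | 4 | 26.0 | 50.0 | C1 | Control | R1 |
| 5 | 5 | 37.0 | 100.0 | C1 | Control | R1 |
| 6 | 6 | 32.0 | 200.0 | C1 | Control | R1 |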
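
As a small sketch of the same idea, the column names can also be assigned after reading with `names(x) <- ...`. The `csv_text` string and the `new_names` vector are assumptions made for illustration; `csv_text` stands in for the first rows of `datasets/rabbit.csv`.

```r
# Stand-in for the first rows of "datasets/rabbit.csv" (assumed layout)
csv_text <- '"","BPchange","Dose","Run","Treatment","Animal"
"1",0.5,6.25,"C1","Control","R1"
"2",4.5,12.5,"C1","Control","R1"
"3",10,25,"C1","Control","R1"'

# One name per column in the file
new_names <- c("A", "B", "C", "D", "E", "F")

# Option 1: supply the names while reading
rabbit_a <- read.csv(text = csv_text, col.names = new_names)
print(names(rabbit_a))

# Option 2: read with the default labels, then rename
rabbit_b <- read.csv(text = csv_text)
names(rabbit_b) <- new_names
print(names(rabbit_b))
```

Passing `col.names` keeps the renaming next to the read call, while renaming afterwards makes it possible to inspect the original labels first.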


### Write CSV Files


#### Simple Use of `write.csv`

In [4]:
# Write the data.frame to "testing.csv"
write.csv(rabbit_sample, "datasets/testing.csv")

The file "testing.csv" contains:  
```
"","A","B","C","D","E","F"
"1",1,0.5,6.25,"C1","Control","R1"
"2",2,4.5,12.5,"C1","Control","R1"
"3",3,10,25,"C1","Control","R1"
"4",4,26,50,"C1","Control","R1"
"5",5,37,100,"C1","Control","R1"
"6",6,32,200,"C1","Control","R1"
```
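
The first column in this file holds the row names, which `write.csv` writes by default. A minimal sketch, assuming the row names are not needed, passes `row.names = FALSE` to leave them out. The small `rabbit_sample` data frame and the temporary file path below are stand-ins chosen so the snippet is self-contained.

```r
# Small stand-in data frame (not the full rabbit data set)
rabbit_sample <- data.frame(
    A = 1:3,
    B = c(0.5, 4.5, 10),
    D = c("C1", "C1", "C1")
)

# Write to a temporary file without the row-name column
out_file <- tempfile(fileext = ".csv")
write.csv(rabbit_sample, out_file, row.names = FALSE)

# Inspect the raw file content
print(readLines(out_file))

# Reading the file back gives no extra "X" column
rabbit_back <- read.csv(out_file)
print(names(rabbit_back))
```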

## XLSX Files

### Read XLSX Files

#### Read an XLSX File with Default Options

* `xlsx::read.xlsx` returns a data.frame
* The xlsx file used for demonstration contains the following data:

![title](img/Fig_02_01.png)

In [5]:
# Loading the xlsx library
library(xlsx)

# Get the iris dataset from iris.xlsx; the second argument is the index of the worksheet in the xlsx file
iris_table <- xlsx::read.xlsx("datasets/iris.xlsx", 1)

# Print first few lines of the table
head(iris_table)

| | NA. | Sepal.Length | Sepal.Width | Petal.Length | Petal.Width | Species |
|---|---|---|---|---|---|---|
| | `<fct>` | `<dbl>` | `<dbl>` | `<dbl>` | `<dbl>` | `<fct>` |
| 1 | 1 | 5.1 | 3.5 | 1.4 | 0.2 | setosa |
| 2 | 2 | 4.9 | 3.0 | 1.4 | 0.2 | setosa |
| 3 | 3 | 4.7 | 3.2 | 1.3 | 0.2 | setosa |
| 4 | 4 | 4.6 | 3.1 | 1.5 | 0.2 | setosa |
| 5 | 5 | 5.0 | 3.6 | 1.4 | 0.2 | setosa |
| 6 | 6 | 5.4 | 3.9 | 1.7 | 0.4 | setosa |


### Write XLSX Files

* The basic syntax: 
```
write.xlsx(
       x,
       file,
       sheetName = "Sheet1",
       col.names = TRUE,
       row.names = TRUE,
       append = FALSE,
       showNA = TRUE,
       password = NULL
     )
```

In [6]:
# Staff table to export
staff_table <- data.frame(
    ID = c(1L, 2L, 3L, 4L),
    Name = c("Tom", "Ann", "Peter", "Kelly"), 
    Phone = c(73490245L, 77990904L, 47876737L, 35146136L)
)

# Write the data frame to the file "staff_table.xlsx"
xlsx::write.xlsx(staff_table, "datasets/staff_table.xlsx", append = FALSE)

* The output xlsx file:

![title](img/Fig_02_02.png)
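
With the default `row.names = TRUE`, `write.xlsx` writes the row labels as an extra first column. The sketch below, under the assumption that the `xlsx` package and its Java dependency are installed, writes the staff table without row names to a temporary file and reads it back by sheet name. The sheet name "Staff" and the temporary path are illustrative choices, and the call is skipped if the package cannot be loaded.

```r
# Staff table to export (same values as in the cell above)
staff_table <- data.frame(
    ID = c(1L, 2L, 3L, 4L),
    Name = c("Tom", "Ann", "Peter", "Kelly"),
    Phone = c(73490245L, 77990904L, 47876737L, 35146136L)
)

round_trip <- NULL
if (requireNamespace("xlsx", quietly = TRUE)) {
    out_file <- tempfile(fileext = ".xlsx")

    # row.names = FALSE leaves out the extra column of row labels
    xlsx::write.xlsx(staff_table, out_file, sheetName = "Staff", row.names = FALSE)

    # Read the worksheet back by name instead of by index
    round_trip <- xlsx::read.xlsx(out_file, sheetName = "Staff")
    print(round_trip)
} else {
    message("Package 'xlsx' is not available; skipping the round trip.")
}
```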

# Database

# Packages for Data Handling