# Data Types

In R, variables are not declared with a data type. Instead, a variable is assigned an R object, and the data type of that object becomes the data type of the variable. There are many types of R objects. The most frequently used ones are:

* Vectors
* Lists
* Matrices
* Arrays
* Factors
* Data Frames
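
As a minimal sketch of what "the type follows the object" means, the snippet below reassigns one illustrative variable (`x`, a name chosen only for this example) and checks its class each time:

```R
# The same variable name can hold objects of different types over time.
x <- 42L
class(x)   # "integer"

x <- "forty-two"
class(x)   # "character"

x <- c(TRUE, FALSE)
class(x)   # "logical"
```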

The simplest of these objects is the **vector**. There are six data types of atomic vectors, also called the six classes of vectors: logical, numeric (double), integer, complex, character and raw. The other R objects are built upon atomic vectors.

**Logical**

Logical vectors can take only three possible values: `FALSE`, `TRUE`, and `NA`.

In [8]:
TRUE

In [4]:
FALSE

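Short form (`T` and `F` are ordinary variables that can be reassigned, so `TRUE` and `FALSE` are safer in scripts):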

In [1]:
T

In [2]:
F

Get the class:

In [5]:
class(TRUE)

In [6]:
class(FALSE)

In [6]:
typeof(NA)

Why is the default type of `NA` logical? Because this avoids unwanted coercion.

The presence of missing values shouldn’t affect the type of an object. Recall that there is a type hierarchy for coercion, listed here from most general to most specific: character → double → integer → logical. When NAs are combined with other atomic types, the NAs are coerced to integer (`NA_integer_`), double (`NA_real_`) or character (`NA_character_`), and not the other way round. If `NA` were a character value and were combined with a set of other values, all of them would be coerced to character as well.

In [4]:
c(NA_character_, 1) # 1 will be coerced to character vector, this is bad
# because the presence of missing values shouldn't affect the type of an object.
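
The following sketch illustrates the point above using only base R: a plain (logical) `NA` takes on the type of whatever it is combined with, so the other values keep their type.

```R
# A logical NA adapts to the other elements instead of changing them.
typeof(c(NA, TRUE))   # "logical"
typeof(c(NA, 1L))     # "integer"
typeof(c(NA, 1.5))    # "double"
typeof(c(NA, "a"))    # "character"

# By contrast, a character NA forces everything to character.
typeof(c(NA_character_, 1))   # "character"
```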

<hr>

**`as.logical`**: converting to `logical`

In [1]:
as.logical(1)

In [5]:
as.logical(list(TRUE, -1, 0))
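
As a small additional sketch, `as.logical()` treats numbers and strings differently: any non-zero number becomes `TRUE`, while only a fixed set of spellings is recognised for character input and everything else becomes `NA`.

```R
# Numbers: zero is FALSE, anything else is TRUE.
as.logical(c(0, 0.5, -3))                          # FALSE TRUE TRUE

# Strings: only spellings such as "TRUE", "true", "T", "FALSE", "false", "F" are recognised.
as.logical(c("TRUE", "true", "T", "False", "F"))   # TRUE TRUE TRUE FALSE FALSE

# Other strings, including "yes" and "0", give NA.
as.logical(c("yes", "0", "abc"))                   # NA NA NA
```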

**Numeric**

Doubles are approximations. They represent floating-point numbers, which cannot always be represented precisely with a fixed amount of memory, so you should treat every double as an approximation. For example, what is the square of the square root of two?

In [7]:
sqrt(2) ^ 2 == 2

In [8]:
# when comparing doubles, use `dplyr::near` to allow some numerical tolerance
dplyr::near(sqrt(2) ^ 2, 2)
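
If dplyr is not installed, a tolerance-based comparison can also be sketched in base R. The version below uses `all.equal()` wrapped in `isTRUE()`, plus an explicit tolerance check; the tolerance `1e-8` is an arbitrary choice for illustration.

```R
x <- sqrt(2) ^ 2

x == 2                    # FALSE: exact comparison fails
isTRUE(all.equal(x, 2))   # TRUE: equal within a small tolerance
abs(x - 2) < 1e-8         # TRUE: the same idea written by hand
```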

Doubles have four special values: `Inf`, `-Inf`, `NaN` and `NA`.

Avoid using `==` to check for these special values. Instead use the helper functions `is.finite()`, `is.infinite()`, and `is.nan()`:

In [9]:
is.finite(Inf)
is.infinite(-Inf)
is.nan(NaN)
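
A short sketch of why `==` is unreliable here: comparisons involving `NaN` or `NA` return `NA` rather than `TRUE` or `FALSE`, whereas the helper functions always give a definite answer.

```R
0 / 0           # NaN
1 / 0           # Inf
Inf - Inf       # NaN

NaN == NaN      # NA, not TRUE
NA == NA        # NA, not TRUE

is.nan(0 / 0)   # TRUE
is.na(NaN)      # TRUE: is.na() is also TRUE for NaN
is.nan(NA)      # FALSE: NA is missing, not "not a number"
is.finite(NA)   # FALSE
```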

In [9]:
1

In [10]:
1.234

In [11]:
class(1)

In [12]:
class(1.2345)

In [13]:
class(1e-3)

<hr>

**`as.numeric`**: converting to double

In [6]:
as.numeric('1.3232')

In [7]:
as.numeric(c('1.32', '32', '.1'))
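
As a sketch of what happens when a string cannot be parsed, the example below converts an illustrative vector (`mixed`, made up for this example). Unparseable entries become `NA`, and R normally emits a warning, which is silenced here with `suppressWarnings()`.

```R
mixed <- c("1.5", "abc", "2e3", "")

# "abc" and the empty string cannot be read as numbers, so they become NA.
converted <- suppressWarnings(as.numeric(mixed))
converted          # 1.5 NA 2000 NA
is.na(converted)   # FALSE TRUE FALSE TRUE
```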

**Integer**

Integers have one special value: `NA`. Integer literals are written with an `L` suffix.

In [15]:
19L

In [16]:
0L

In [17]:
-1L

In [18]:
class(19L)

In [20]:
class(-1L)

<hr>

**`as.integer`**: converting to integer

In [8]:
as.integer(1.234)

In [10]:
as.integer(c(1.343, 1.99, '.1'))
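
Two details of `as.integer()` are worth sketching: it truncates toward zero rather than rounding, and integer arithmetic that exceeds the largest representable integer gives `NA` (with a warning, suppressed below).

```R
as.integer(c(1.99, -1.99))          # 1 -1 (truncation toward zero)
as.integer(round(c(1.99, -1.99)))   # 2 -2 (round first if that is what you want)

# In the mixed vector, c() first coerces everything to character,
# then each string is converted and truncated.
as.integer(c(1.343, 1.99, ".1"))    # 1 1 0

.Machine$integer.max                # 2147483647
overflow <- suppressWarnings(.Machine$integer.max + 1L)
overflow                            # NA
```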

**Complex**

In [21]:
3 + 2i

In [22]:
class(3 + 2i)

<hr>

**`as.complex`**: convert to complex

In [13]:
as.complex(5)

In [14]:
as.complex(c(3, 1, 0))

**Character**

Character vectors are the most complex type of atomic vector, because each element of a character vector is a string, and a string can contain an arbitrary amount of data.

One important feature of the underlying string implementation is worth mentioning: R uses a global string pool. This means that each unique string is stored in memory only once, and every use of the string points to that representation. This reduces the amount of memory needed by duplicated strings.
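
A rough way to see the effect of the string pool is to compare a vector of repeated strings with a vector of distinct strings of the same length. This is only a sketch: `object.size()` reports an estimate, and the variable names are made up for the example.

```R
base_text <- "a fairly long string used many times"

repeated <- rep(base_text, 1000)       # 1000 pointers to one pooled string
distinct <- paste(base_text, 1:1000)   # 1000 different strings

# The repeated vector is estimated to need less memory than the distinct one.
object.size(repeated) < object.size(distinct)
```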

In [23]:
'a'

In [24]:
'VN Pikachu'

In [25]:
'3.14'

In [26]:
class('3.14')

<hr>

**`as.character`**: convert to character

In [16]:
as.character(5)

In [15]:
as.character(1:5)

**Raw**

In [27]:
v <- charToRaw('VN Pikachu')
v

 [1] 56 4e 20 50 69 6b 61 63 68 75

In [28]:
class(v)
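
As a brief sketch of how raw vectors relate to text, the bytes can be converted back with `rawToChar()` or viewed as decimal codes with `as.integer()`:

```R
v <- charToRaw("VN Pikachu")

length(v)       # 10: one byte per character for this ASCII text
as.integer(v)   # 86 78 32 80 105 107 97 99 104 117
rawToChar(v)    # "VN Pikachu"
```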

# Missing values

Note that each type of atomic vector has its own missing value:

In [None]:
NA             # logical
 
NA_integer_    # integer
 
NA_character_  # character

NA_real_       # double

Normally you don’t need to know about these different types because you can always use `NA` and it will be converted to the correct type by R's implicit coercion rules. However, there are some functions that are strict about their inputs, so it’s useful to have this knowledge sitting in your back pocket so you can be specific when needed.
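
One base R function that is strict in this way is `vapply()`, which checks every result against the type given in `FUN.VALUE`. The sketch below (with made-up variable names) shows a typed `NA` being accepted where a plain logical `NA` is rejected.

```R
# A character NA matches FUN.VALUE = character(1).
ok <- vapply(1:2, function(i) NA_character_, character(1))
typeof(ok)   # "character"

# A plain NA is logical, which vapply() will not turn into character.
failed <- tryCatch(
  {
    vapply(1:2, function(i) NA, character(1))
    FALSE
  },
  error = function(e) TRUE
)
failed       # TRUE: the call raised an error
```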

## Vectors 

Using **`c()`** to create a vector

In [31]:
help(c)

In [32]:
Pikachu <- c(31, 57611, 2799)
Pikachu

In [37]:
range = 1:10
range

In [35]:
#create a vector using the colon (sequence) operator
c(1, 5:7)

In [39]:
c(1, 3, 8, 'Pikachu')

In [55]:
print(c(1,5,8))

[1] 1 5 8
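
Because an atomic vector can hold only one type, `c()` coerces its arguments to the most general type present, and nested calls are flattened. A short sketch:

```R
typeof(c(TRUE, 2L))             # "integer"
typeof(c(1L, 2.5))              # "double"
typeof(c(1, 3, 8, "Pikachu"))   # "character"

# Nested vectors are flattened into a single vector.
c(1, c(2, c(3, 4)))             # 1 2 3 4
length(c(1, 5:7))               # 4
```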


## List

In [40]:
help(list)

A list is an R object that can contain elements of many different types, such as vectors, functions and even other lists.

In [54]:
products <- list(3, 1:5, c(1, 8, 5), 'Pikachu', sin)
print(products)

[[1]]
[1] 3

[[2]]
[1] 1 2 3 4 5

[[3]]
[1] 1 8 5

[[4]]
[1] "Pikachu"

[[5]]
function (x)  .Primitive("sin")
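
As a sketch of how list elements are accessed, `[` returns a smaller list while `[[` extracts the element itself. The list `items` mirrors the one above, and `named` is a made-up example of a list with names.

```R
items <- list(3, 1:5, c(1, 8, 5), "Pikachu", sin)

class(items[2])     # "list": single brackets keep the list wrapper
class(items[[2]])   # "integer": double brackets extract the element
items[[3]][2]       # 8
items[[5]](0)       # 0: the stored function can be called directly

# Named elements can also be reached with $.
named <- list(name = "VN Pikachu", level = 31)
named$level         # 31
```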



## Matrix

In [46]:
help(matrix)

A matrix is a two-dimensional rectangular data set. It can be created using a vector input to the matrix function.

In [53]:
data = matrix(c('a', 'b', 'a', 'b'), nrow = 2, ncol = 2)
print(data)

     [,1] [,2]
[1,] "a"  "a" 
[2,] "b"  "b" 
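
Note that `matrix()` fills column by column unless told otherwise. The sketch below, with illustrative names, contrasts the default with `byrow = TRUE`:

```R
by_col <- matrix(1:6, nrow = 2)                 # filled down each column
by_row <- matrix(1:6, nrow = 2, byrow = TRUE)   # filled along each row

dim(by_col)    # 2 3
by_col[2, 1]   # 2
by_row[2, 1]   # 4
```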


## Array

In [58]:
help(array)

**`array()`**

While matrices are confined to two dimensions, arrays can have any number of dimensions. The `array()` function takes a `dim` argument that sets the size of each dimension. In the example below we create an array made up of two 2x2 matrices; because the input vector `c(1,2,3)` has fewer than eight elements, its values are recycled to fill the array.

In [52]:
filter = array(c(1,2,3), dim = c(2,2,2))
print(filter)

, , 1

     [,1] [,2]
[1,]    1    3
[2,]    2    1

, , 2

     [,1] [,2]
[1,]    2    1
[2,]    3    2
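
A short sketch of inspecting and indexing the same array (named `arr` here only to keep the example self-contained):

```R
arr <- array(c(1, 2, 3), dim = c(2, 2, 2))

dim(arr)         # 2 2 2
length(arr)      # 8
as.vector(arr)   # 1 2 3 1 2 3 1 2: the input is recycled in order

arr[2, 1, 2]     # 3: row 2, column 1 of the second matrix
arr[, , 1]       # the first 2x2 matrix
```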



## Factor

In [59]:
help(factor)

**`factor()`**

In [57]:
apple_colors <- c('green','green','yellow','red','red','red','green')
factor_apples <- factor(apple_colors)
print(factor_apples)

[1] green  green  yellow red    red    red    green 
Levels: green red yellow
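
Internally a factor stores integer codes that point into its levels. As a sketch, the codes and levels of the example above can be inspected like this, and the level order can be set explicitly:

```R
apple_colors <- c("green", "green", "yellow", "red", "red", "red", "green")
f <- factor(apple_colors)

levels(f)       # "green" "red" "yellow" (sorted alphabetically by default)
as.integer(f)   # 1 1 3 2 2 2 1
table(f)        # counts per level

# An explicit level order can be supplied instead.
ordered_f <- factor(apple_colors, levels = c("red", "yellow", "green"))
levels(ordered_f)   # "red" "yellow" "green"
```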


## Data Frames

In [60]:
help(data.frame)

**`data.frame()`**

In [61]:
clan <- data.frame(
    name = c('VN Pikachu', 'Tank Cao', 'Meomeo888'),
    level = c(31, 34, 32),
    damage = c(57611, 54832, 55321)
)
clan

name       |level |damage
-----------|------|------
VN Pikachu |31    |57611
Tank Cao   |34    |54832
Meomeo888  |32    |55321
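
A sketch of common ways to inspect and subset this data frame. `stringsAsFactors = FALSE` is set explicitly here so that `name` is a character column regardless of the R version.

```R
clan <- data.frame(
    name = c("VN Pikachu", "Tank Cao", "Meomeo888"),
    level = c(31, 34, 32),
    damage = c(57611, 54832, 55321),
    stringsAsFactors = FALSE
)

nrow(clan)                             # 3
ncol(clan)                             # 3
clan$level                             # 31 34 32
clan[clan$level > 31, "name"]          # "Tank Cao" "Meomeo888"
clan[which.max(clan$damage), "name"]   # "VN Pikachu"
```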


# Variables

## Declaration

A valid variable name consists of letters, numbers, and the dot or underscore characters.  
It must start with a letter, or with a dot that is not followed by a number.

In [63]:
name <- 'VN Pikachu'

In [64]:
exp10 <- 10L

In [65]:
bessel_function <- 8.32

In [66]:
clan.name <- 'VN Champion'

In [68]:
.server = 'SG'
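
One way to check these naming rules is with `make.names()`, which returns its input unchanged only when it is already a syntactically valid name. The helper `is_syntactic()` below is a hypothetical convenience function written for this sketch, not part of base R.

```R
# Hypothetical helper: TRUE when the name needs no repair by make.names().
is_syntactic <- function(x) make.names(x) == x

is_syntactic(c("name", "exp10", "clan.name", ".server"))   # TRUE TRUE TRUE TRUE

# Starts with a digit, dot followed by a digit, starts with "_", reserved word.
is_syntactic(c("2fast", ".2way", "_tmp", "if"))            # FALSE FALSE FALSE FALSE
```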

## Assignment

Variables can be assigned values using the leftward (`<-`), rightward (`->`) and equals (`=`) operators.

In [74]:
#leftward
value <- 30L
value

In [73]:
#rightward
9.75 -> score
score

In [75]:
#equal
bytes = charToRaw('stream')
bytes

[1] 73 74 72 65 61 6d

In [77]:
cat('variable 1:', value, '\nvariable 2:', score, '\nvariable 3:', bytes)

variable 1: 30 
variable 2: 9.75 
variable 3: 73 74 72 65 61 6d
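
At the top level `<-` and `=` behave the same, but inside a function call they differ: `=` names an argument, while `<-` performs an assignment. The sketch below runs both forms inside `local()` so that the check does not depend on what already exists in the workspace; `demo_env` is a name made up for this example.

```R
demo_env <- local({
    mean(x = c(1, 2, 3))    # `=` names the argument `x`; no variable is created
    mean(y <- c(1, 2, 3))   # `<-` creates `y` here, then passes its value
    environment()           # return the local environment for inspection
})

exists("x", envir = demo_env, inherits = FALSE)   # FALSE
exists("y", envir = demo_env, inherits = FALSE)   # TRUE
```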

## Finding variables

To list all the variables currently available in the workspace, use the **`ls()`** function.

```R
ls(name, pos = -1L, envir = as.environment(pos), all.names = FALSE,
  pattern, sorted = TRUE)
```

In [79]:
print(ls())

 [1] "apple_colors"    "bessel_function" "bytes"           "clan"           
 [5] "clan.name"       "data"            "exp10"           "factor_apples"  
 [9] "filter"          "n1"              "name"            "Pikachu"        
[13] "products"        "range"           "score"           "v"              
[17] "value"          


The **`ls()`** function can use patterns to match the variable names.

In [81]:
ls(pattern = 'chu')

Variables whose names start with a **dot (.)** are hidden; they can be listed by passing the `all.names = TRUE` argument to the **`ls()`** function.

In [82]:
ls(all.names = TRUE)
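
The `pattern` argument is a regular expression, so anchors such as `^` work. The sketch below uses a separate environment (`e`, created only for this example) so that the results do not depend on the current workspace.

```R
e <- new.env()
assign("player.name", "VN Pikachu", envir = e)
assign("player.level", 31, envir = e)
assign("game", "Tank Force", envir = e)
assign(".server", "SG", envir = e)

ls(envir = e)                        # visible names only; ".server" is hidden
ls(envir = e, pattern = "^player")   # names starting with "player"
ls(envir = e, all.names = TRUE)      # includes ".server"
```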

## Deleting variables

**`remove`** and **`rm`** can be used to remove objects. These can be specified successively as names or character strings, or in the character vector `list`, or through a combination of both. All objects thus specified will be removed.

**Usage**

```R

remove(..., list = character(), pos = -1,
       envir = as.environment(pos), inherits = FALSE)

rm    (..., list = character(), pos = -1,
       envir = as.environment(pos), inherits = FALSE)
```

**Parameters**

* `...`: the objects to be removed, as names (unquoted) or character strings (quoted).
* `list`: a character vector naming objects to be removed.
* `pos`: where to do the removal. By default, uses the current environment. See ‘Details’ in `help(rm)` for other possibilities.
* `envir`: the environment to use. See ‘Details’ in `help(rm)`.
* `inherits`: should the enclosing frames of the environment be inspected?

Variables can be deleted by using the **`rm()`** function. 

In [83]:
products

In [84]:
#delete variable `products`
rm(products)

In [85]:
#try getting the value of products (will raise an error)
products

ERROR: Error in eval(expr, envir, enclos): object 'products' not found


In [87]:
#delete all variables
rm(list = ls())

<hr>

In [6]:
player.name <- 'VN Pikachu'
player.level <- 31
player.stats <- c(57611, 2799)
player.clan <- 'VN Champions'
game <- 'Tank Force'
firm <- 'Extreme Devloper'
country <- 'Russia'
ls(all.names = TRUE)

In [7]:
#remove multiple objects
rm(player.clan, player.stats)

ls(all.names = TRUE)

In [9]:
#remove multiple objects (as a character vector passed to `list`)
rm(list = c('game', 'firm'))
ls()

In [10]:
#remove an object (using string)
rm('country')
ls()
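
Since `list` accepts any character vector, it can be combined with `ls(pattern = ...)` to remove a group of related variables. The sketch below does this in a scratch environment (`e`, created only for this example) rather than the global workspace.

```R
e <- new.env()
assign("player.name", "VN Pikachu", envir = e)
assign("player.level", 31, envir = e)
assign("game", "Tank Force", envir = e)

# Remove every object whose name starts with "player".
rm(list = ls(envir = e, pattern = "^player"), envir = e)

ls(envir = e)                                        # "game"
exists("player.name", envir = e, inherits = FALSE)   # FALSE
```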

---

Many of the functions for working with vectors have generalisations for matrices and arrays:

Vector          |Matrix                 |Array
----------------|-----------------------|--------------
names()         |rownames(), colnames() |dimnames()
length()        |nrow(), ncol()         |dim()
c()             |rbind(), cbind()       |abind::abind()
(none)          |t()                    |aperm()
is.null(dim(x)) |is.matrix()            |is.array()