Data Science Fundamentals: R
[Table of Contents](../index.ipynb)
- - - 
<!--NAVIGATION-->
Module 13. [Introduction](./00.ipynb) | [Basic Syntax](./01.ipynb)  | [Data Types](./02.ipynb) | [Variables](./03.ipynb) | [Operators](./04.ipynb) | [Decision Making](./05.ipynb)  | [Functions](./06.ipynb) | [Strings](./07.ipynb) | [Vectors](./08.ipynb) | [Lists](./09.ipynb) | [Matrices](./10.ipynb) | [Arrays](./11.ipynb) | [Factors](./12.ipynb) | [Data Frames](./13.ipynb) | [Data Reshaping](./14.ipynb) | [Exercises](./15.ipynb)

Strings
---

Any value written within a pair of single or double quotes in R is treated as a string. Internally, R stores every string within double quotes, even when it was created with single quotes.

<b>Rules Applied in String Construction</b>

- The quotes at the beginning and end of a string must both be double quotes or both be single quotes. They cannot be mixed.

- Double quotes can be inserted into a string that starts and ends with single quotes.

- Single quotes can be inserted into a string that starts and ends with double quotes.

- Double quotes cannot be inserted into a string that starts and ends with double quotes, unless each one is escaped with a backslash.

- Single quotes cannot be inserted into a string that starts and ends with single quotes, unless each one is escaped with a backslash.

<b>Examples of Valid Strings</b>

The following examples illustrate the rules for creating a string in R.

In [1]:
a <- 'Start and end with single quote'
print(a)

b <- "Start and end with double quotes"
print(b)

c <- "single quote ' in between double quotes"
print(c)

d <- 'Double quotes " in between single quote'
print(d)

[1] "Start and end with single quote"
[1] "Start and end with double quotes"
[1] "single quote ' in between double quotes"
[1] "Double quotes \" in between single quote"


<b>Examples of Invalid Strings</b>

In [2]:
e <- 'Mixed quotes" 
print(e)

f <- 'Single quote ' inside single quote'
print(f)

g <- "Double quotes " inside double quotes"
print(g)

ERROR: Error in parse(text = x, srcfile = src): <text>:4:7: unexpected symbol
3: 
4: f <- 'Single
         ^
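
The parser reports only the first problem it meets: the string opened on the first line runs on until the next single quote, which is why the error points at line 4. As a minimal sketch of how the second and third statements could be repaired, each inner quote that matches the delimiter is escaped with a backslash. The variable names follow the cell above; the `same` check is added here only for illustration.

```r
# Escape a quote character with a backslash when it matches the delimiter.
f <- 'Single quote \' inside single quote'
g <- "Double quotes \" inside double quotes"

# print() displays the escape sequence; cat() writes the raw characters.
print(f)
print(g)
cat(g, "\n")

# Single-quoted and double-quoted literals produce the same string value.
same <- identical('text', "text")
print(same)
```

The mixed-quote statement has no escape-based fix; its closing quote simply has to match the opening one.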


<b>String Manipulation</b>

<b>Concatenating Strings - paste() Function</b>

Strings in R are combined using the paste() function. It can take any number of arguments to be combined together.

SYNTAX
The basic syntax for the paste() function is: paste(..., sep = " ", collapse = NULL)

The parameters are described below:

- ... represents any number of arguments to be combined.

- sep is the separator placed between the arguments. It is optional and defaults to a single space.

- collapse is an optional string used to join the elements of the result into a single string. It has no visible effect when every argument is a single string, and it never changes the spaces within a string.

In [5]:
a <- "Hello"
b <- 'How'
c <- "are you? "

print(paste(a,b,c))

print(paste(a,b,c, sep = "-"))

print(paste(a,b,c, sep = "", collapse = ""))

[1] "Hello How are you? "
[1] "Hello-How-are you? "
[1] "HelloHoware you? "
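
Because every argument in the cell above is a single string, collapse has nothing to join there. The following sketch, using an illustrative vector named `words`, shows where sep and collapse differ once a vector with several elements is involved.

```r
# A character vector with three elements (illustrative data).
words <- c("Hello", "How", "are you?")

# sep only separates different arguments, so a lone vector comes back unchanged.
by_sep <- paste(words, sep = "-")
print(by_sep)

# collapse joins the elements of the result into a single string.
by_collapse <- paste(words, collapse = "-")
print(by_collapse)

# sep and collapse together: build each label first, then join the labels.
labels <- paste("id", 1:3, sep = "_", collapse = ", ")
print(labels)

# paste0() is shorthand for paste(..., sep = "").
joined <- paste0("Hello", "How")
print(joined)
```

A rule of thumb that follows from this: use sep when combining several arguments, and collapse when a whole vector should become one string.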


<b>Formatting Numbers & Strings - format() Function</b>

Numbers and strings can be formatted to a specific style using the format() function.

SYNTAX
A simplified form of the syntax for the format() function, limited to the parameters covered here, is: format(x, digits, nsmall, scientific, width, justify = c("left", "right", "centre", "none"))

The parameters are described below:

- x is the vector input.

- digits is the number of significant digits to display.

- nsmall is the minimum number of digits to the right of the decimal point.

- scientific is set to TRUE to display scientific notation.

- width is the minimum width of the result. Numbers are padded with blanks on the left; strings are padded according to justify.

- justify aligns a string to the left, right or centre. It applies to character input only.

See the example below:

In [6]:
# Number of significant digits displayed. Last digit rounded off.
result <- format(23.123456789, digits = 9)
print(result)

# Display numbers in scientific notation.
result <- format(c(6, 13.14521), scientific = TRUE)
print(result)

# The minimum number of digits to the right of the decimal point.
result <- format(23.47, nsmall = 5)
print(result)

# Format treats everything as a string.
result <- format(6)
print(result)

# Numbers are padded with blanks on the left to reach the width.
result <- format(13.7, width = 6)
print(result)

# Left justify strings.
result <- format("Hello", width = 8, justify = "l")
print(result)

# Centre justify strings.
result <- format("Hello", width = 8, justify = "c")
print(result)

[1] "23.1234568"
[1] "6.000000e+00" "1.314521e+01"
[1] "23.47000"
[1] "6"
[1] "  13.7"
[1] "Hello   "
[1] " Hello  "
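
The calls above each format a single value. As a small sketch with made-up data (`prices` and `fruits` are illustrative names), the same parameters applied to a vector give every element a common width, which is useful when lining up columns of text.

```r
# Illustrative numeric vector.
prices <- c(1, 10.5, 100.25)

# nsmall = 2 forces two decimals; all elements share one common width.
formatted <- format(prices, nsmall = 2)
print(formatted)

# The result is a character vector, not a numeric one.
print(is.character(formatted))

# Right-justify strings within a field of 7 characters.
fruits <- format(c("apple", "fig"), width = 7, justify = "right")
print(fruits)
```

Since the result is character, arithmetic has to be done on the original numbers before formatting.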


<b>Counting Number of Characters in a String - nchar() Function</b>

This function counts the number of characters in a string, including spaces.

SYNTAX
The basic syntax for the nchar() function is: nchar(x)

The parameter is described below:

- x is the vector input.

See the example below:

In [7]:
result <- nchar("Count the number of characters")
print(result)

[1] 30
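
nchar() is vectorised, so it returns one count per element. The sketch below uses illustrative inputs to show this, along with the fact that a number is converted to a string before it is counted.

```r
# One count per element; an empty string has zero characters.
lengths <- nchar(c("R", "data frame", ""))
print(lengths)

# A numeric input is coerced to character first, so its digits are counted.
digit_count <- nchar(2024)
print(digit_count)
```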


<b>Changing the Case - toupper() & tolower() Functions</b>

These functions change the case of the characters in a string.

SYNTAX
The basic syntax for the toupper() and tolower() functions is: toupper(x) and tolower(x)

The parameter is described below:

- x is the vector input.

See the example below:

In [8]:
# Changing to Upper case.
result <- toupper("Changing To Upper")
print(result)

# Changing to lower case.
result <- tolower("Changing To Lower")
print(result)

[1] "CHANGING TO UPPER"
[1] "changing to lower"
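
A common use of these functions is comparing text without regard to case. The sketch below assumes a small vector of survey-style answers (`answers` is an illustrative name) and normalises it before comparing.

```r
# Illustrative input with inconsistent capitalisation.
answers <- c("Yes", "YES", "yes", "No")

# Convert everything to lower case, then compare against one spelling.
normalized <- tolower(answers)
is_yes <- normalized == "yes"
print(is_yes)

# toupper() works element by element in the same way.
shouted <- toupper(answers)
print(shouted)
```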


<b>Extracting Parts of a String - substring() Function</b>

This function extracts parts of a string.

SYNTAX
The basic syntax for the substring() function is: substring(text, first, last)

The parameters are described below:

- text is the character vector input.

- first is the position of the first character to be extracted.

- last is the position of the last character to be extracted. It is optional; when omitted, extraction runs to the end of the string.

See the example below:

In [9]:
# Extract characters from 5th to 7th position.
result <- substring("Extract", 5, 7)
print(result)

[1] "act"
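
Two further behaviours are worth a short sketch: last can be left out, and first and last can be vectors, in which case one substring is returned per pair of positions. The variable names here are illustrative.

```r
word <- "Extract"

# Omitting last extracts from position 5 to the end of the string.
tail_part <- substring(word, 5)
print(tail_part)

# Vector positions give one substring per (first, last) pair:
# 1 to 3, 2 to 4 and 3 to 5.
windows <- substring(word, 1:3, 3:5)
print(windows)
```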

