## Strings Data Type
* Python has rich built-in support for strings
* Single or double quotes define a string
* The backslash \ is the escape character in Python strings

In [1]:
'hello'
# The line above declares the same string as the line below
"hello"

'hello'

In [2]:
"hello"

'hello'

In [3]:
'Insert "double quote" between two single quotes'

'Insert "double quote" between two single quotes'

In [4]:
"Insert \"double quote\" between two double quotes using an escape character"

'Insert "double quote" between two double quotes using an escape character'

In [5]:
# Use double quotes to include a single quote within a string
"How's it?"

"How's it?"

In [6]:
# Use the escape character to include a single quote within a string
'How\'s it?'

"How's it?"

In [7]:
# Use the following format to include single quote as well as double quotes in a string.
# Use escape character for a single quote, and leave double quotes as such
'It\'s, "out!"'

'It\'s, "out!"'
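
The quoting rules above can be checked with a short sketch. The variable names are illustrative only and do not come from the cells above.

```python
# The quote style is only notation: both literals create the same string.
single = 'hello'
double = "hello"
print(single == double)          # True

# Escaping a quote and switching the outer quote style give the same result.
escaped = 'How\'s it?'
switched = "How's it?"
print(escaped == switched)       # True

# The backslash is not stored in the string; it only marks the next character.
print(len(escaped))              # 9
```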

## Strings are immutable
Immutable means that a string cannot be modified in place once it has been created

In [8]:
# Illustration of string immutability
string1 = "Python"
print(string1)

Python


In [11]:
string1[5-1]

'o'

In [9]:
string1[1] = "i"

TypeError: 'str' object does not support item assignment
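
Since item assignment fails, a common workaround is to build a new string from pieces of the old one. A minimal sketch, assuming the goal is to change the second character; the name modified is illustrative.

```python
string1 = "Python"

# Build a new string from slices of the old one; string1 itself is untouched.
modified = string1[:1] + "i" + string1[2:]
print(modified)   # Pithon
print(string1)    # Python
```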

## Strings and UTF-8 encoding
* A string literal creates a str object (Unicode text), not a bytes object
* For example, x = "Hello Python!" makes x a str
* A str can be converted to bytes with the encode() method
* Likewise, a bytes object can be converted back to a str with the decode() method
* UTF-8 is the default encoding for both methods

Illustration in the following cells:

In [12]:
# Assign mu to x
x = "μ"
# Check the length of the string x
len(x)

1

In [14]:
# Check whether x is a str
isinstance(x, str)

True

In [13]:
# Check whether x is a bytes object
isinstance(x, bytes)

False

In [15]:
# Convert the string x to bytes using the encode() method
y = x.encode('utf8')
print(y)

b'\xce\xbc'


In [16]:
# Check whether y is a str
isinstance(y, str)

False

In [17]:
# Check whether y is a bytes object
isinstance(y, bytes)

True

In [18]:
# Convert y back to a string using the decode() method
y.decode('utf8')

'μ'
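
The difference between characters and bytes can be seen by comparing lengths before and after encoding. This is a small sketch that reuses x and y from the cells above.

```python
x = "μ"
y = x.encode("utf-8")

# One character, but two bytes once encoded as UTF-8.
print(len(x))   # 1
print(len(y))   # 2

# Decoding with the same encoding restores the original string.
print(y.decode("utf-8") == x)   # True

# ASCII characters need only one byte each in UTF-8.
print(len("Hello Python!".encode("utf-8")))   # 13
```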

## Operations on strings
The re module (regular expressions) supports powerful pattern-based operations on strings.
A few common operations that need only the built-in string features:
* Slicing strings
* Concatenate strings
* Duplication of strings
* Strings substitution
* Formatting Strings

### Slicing Strings
Say a string contains an employee ID in this format:
* Year of joining (four digits)
* Department (three characters)
* Unique code (four digits)
* Examples:
    * 2009HRM9842
    * 2017ITO8901
* Say you want to extract the year, department or unique employee code
* You would use string slicing operations

In [19]:
# Assign employee ID 1 and Employee ID 2
empID1 = "2009HRM9842"
empID2 = "2017ITO8901"

* Python is zero indexed
* The first character is at index 0
* The last character of a string of length n is at index n-1
* The characters in between are at indexes 1 to n-2
* A slice [a:b] starts at index a and stops just before index b, so to take the characters from position m to position n (counting from 1), slice [m-1:n]

Illustrations provided below:

In [22]:
# Extract the first element in empID1
print(empID1[0])
# Extract the last element in empID1
print(empID1[10])  # or empID1[-1]

# Extract R, the sixth character
print(empID1[6-1])

2
2
R
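
The rule for converting 1-based positions into a slice can be wrapped in a small helper. This is only a sketch; chars_between is a hypothetical name, not a built-in function.

```python
def chars_between(text, m, n):
    """Return characters from 1-based position m through n, inclusive."""
    # Slices start at a 0-based index and stop before the end index,
    # so positions m..n correspond to text[m-1:n].
    return text[m - 1:n]

emp_id = "2009HRM9842"
print(chars_between(emp_id, 1, 4))   # 2009
print(chars_between(emp_id, 5, 7))   # HRM
print(emp_id[-1])                    # 2 (negative indexes count from the end)
```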


In [23]:
empID1 = "2009HRM9842"
print(empID1[0])
print(empID1[1])
print(empID1[2])
print(empID1[3])
print(empID1[4])

2
0
0
9
H


In [24]:
# Extract year from Employee ID 1 and Employee ID 2
print(empID1[0:4])
print(empID2[0:4])

2009
2017


In [25]:
# Extract Department
empID1[4:7]

'HRM'

In [26]:
empID2[4:7]

'ITO'

In [28]:
# Index 11 is past the last valid index (10), so this raises an error
empID1[11]

IndexError: string index out of range

In [None]:
# Extract Unique Code
empID1[7:11]

In [None]:
empID2[7:11]

In [29]:
# Select all values from index 2 (the third character) to the end
empID1[2:]

'09HRM9842'

In [30]:
# Select all values from index 0 up to, but not including, index 2
empID1[:2]

'20'
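
The slices shown above can be combined into one function that splits an employee ID into its three parts. A minimal sketch, assuming every ID has exactly 11 characters in the year, department, code layout; parse_emp_id is a hypothetical helper.

```python
def parse_emp_id(emp_id):
    """Split an 11-character employee ID into (year, department, code)."""
    if len(emp_id) != 11:
        raise ValueError("expected 11 characters, got %d" % len(emp_id))
    year = int(emp_id[:4])      # first four characters
    department = emp_id[4:7]    # next three characters
    code = emp_id[7:]           # remaining four characters
    return year, department, code

print(parse_emp_id("2009HRM9842"))   # (2009, 'HRM', '9842')
print(parse_emp_id("2017ITO8901"))   # (2017, 'ITO', '8901')

# Unlike indexing, a slice that runs past the end does not raise an error.
print("2009HRM9842"[7:100])          # 9842
```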

### Concatenate Strings
* Use the "+" operator to concatenate strings, or
* Write string literals next to each other with no comma between them; the parentheses are optional (Note: this works only for literals, not for strings assigned to variables)

In [32]:
# Concatenate strings with "+" operator
string1 = "BITS"
string2 = "Pilani"
print(string1)
print(string2)
print(string1 + string2) #concatenate two strings
print(string1 + " " + string2 +"!") #concatenate two strings with space and exclamation

BITS
Pilani
BITSPilani
BITS Pilani!


In [31]:
("BITS" "Pilani")

'BITSPilani'

In [33]:
# Concatenate adjacent string literals inside parentheses
print(("BITS" "Pilani")) #concatenate two strings
print(("BITS" " " "Pilani" "!")) #concatenate two strings with space and exclamation

BITSPilani
BITS Pilani!
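
The sketch below contrasts the two approaches and adds str.join, a built-in method not used in the cells above, as a third option when there are several pieces.

```python
string1 = "BITS"
string2 = "Pilani"

# Adjacent literals are joined by the parser, with or without parentheses.
print("BITS" " " "Pilani")             # BITS Pilani

# Variables need an explicit operator; writing `string1 string2` is a SyntaxError.
print(string1 + " " + string2)         # BITS Pilani

# str.join is convenient when there are many pieces and a common separator.
print(" ".join([string1, string2]))    # BITS Pilani
```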


### Duplication of strings
* Use the "*" operator to repeat a string a given number of times

In [34]:
# Repeat string1 five times
string1 = "BITS"
string1 * 5

'BITSBITSBITSBITSBITS'

In [35]:
# Duplicate string "Pilani" thrice
'Pilani'*3 #or "Pilani"*3

'PilaniPilaniPilani'

In [36]:
"Pilani"*3

'PilaniPilaniPilani'
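
One practical use of repetition is drawing a separator whose length matches some other text. The example below is an illustrative sketch; the variable title is not from the cells above.

```python
title = "BITS Pilani"

# Repeat a single character to underline a title of any length.
print(title)
print("-" * len(title))

# The order of the operands does not matter, and repeating zero times gives "".
print(3 * "ab")        # ababab
print("ab" * 0 == "")  # True
```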

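### Strings substitution
* Use the replace() method of str to substitute one substring with another; it is built into strings and does not need the re (regular expressions) module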

In [37]:
string1 = "BITS Pilani"
string2 = "Goa"

# Substitute Pilani with Goa in string1
string1.replace('Pilani', string2)

'BITS Goa'
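
The sketch below shows that replace() returns a new string, and uses re.sub for the case where the text to substitute is described by a pattern rather than a fixed substring. The name string3 and the sample inputs are illustrative.

```python
import re

string1 = "BITS Pilani"

# replace() returns a new string; string1 keeps its original value.
string3 = string1.replace("Pilani", "Goa")
print(string3)   # BITS Goa
print(string1)   # BITS Pilani

# An optional third argument limits how many occurrences are replaced.
print("a-b-c".replace("-", "+", 1))   # a+b-c

# re.sub replaces text that matches a pattern, here any run of digits.
print(re.sub(r"\d+", "#", "2009HRM9842"))   # #HRM#
```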

### Formatting Strings
There are several ways to format strings in Python. Two techniques are discussed below:
* Put a conversion specifier (such as %d) in the string and supply the value, in parentheses, after the % operator
* Use the "+" operator to join a string with a value converted by str()

Note:
* use %d for integer data types
* use %f for float data types
* use %s for string data types

Illustrations shown below:

In [38]:
# Conversion specifier followed by value (in parentheses)
"The result of 5x3 = %d" %(15)

'The result of 5x3 = 15'

In [None]:
# Conversion specifier followed by value (in parentheses): integer data type
# Note: %d truncates a float, so 17.5 is displayed as 17
"The result of 5x3.5 = %d" %(17.5)

In [39]:
# Conversion specifier followed by value (in parentheses): float data type
"The result of 10/4 = %f" %(2.5)

'The result of 10/4 = 2.500000'

In [40]:
# Conversion specifier followed by value (in parentheses): string data type
"You are programming in %s" %("Python")

'You are programming in Python'

In [41]:
# Append using + operator: integer data type
"The result of 5x3 = " +str(15)

'The result of 5x3 = 15'

In [42]:
# Append using + operator : float data type
"The result of 10/4 = " +str(10/4)

'The result of 10/4 = 2.5'

In [43]:
# Append using + operator : string data type
"You are programming in " +str("Python")

'You are programming in Python'
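
The % technique also accepts several values at once and a precision for floats. The sketch below uses made-up sample values, and ends with an f-string, an alternative available from Python 3.6 that is not covered in the cells above.

```python
# Several values are passed as a tuple, matched to the specifiers in order.
line = "%s joined in %d with rating %.2f" % ("HRM", 2009, 4.5)
print(line)   # HRM joined in 2009 with rating 4.50

# A precision such as %.1f controls the number of decimal places.
print("The result of 10/4 = %.1f" % (10 / 4))   # The result of 10/4 = 2.5

# f-strings (Python 3.6+) embed the expression directly in the literal.
result = 5 * 3
print(f"The result of 5x3 = {result}")   # The result of 5x3 = 15
```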