# **Python String**

*A string is a sequence of characters. A character is simply a symbol. For example, the English language has 26 characters.*

Computers do not deal with characters directly. They deal with numbers (binary). Even though we may see characters on our screen, internally they are stored and manipulated as combinations of 0s and 1s. Converting a character to a number is called encoding, and the reverse process is decoding. ASCII and Unicode are two of the most widely used standards for this.

In Python, a string is a sequence of Unicode characters. Unicode was introduced to include every character in all languages and bring uniformity in encoding. The Unicode HOWTO in the Python documentation covers this in more detail.
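As a small illustration of encoding and decoding, the sketch below uses the built-in $ord()$ and $chr()$ functions together with $str.encode()$ and $bytes.decode()$. The sample word and the choice of UTF-8 are assumptions made for the example, not something the text above prescribes.

```python
# Each character maps to an integer code point; ord() and chr() convert between them.
text = "caf\u00e9"  # "cafe" with an accented e, written as an escape (illustrative sample)
code_points = [ord(ch) for ch in text]
print(code_points)  # [99, 97, 102, 233]

# Encoding turns the string into bytes; decoding reverses the process.
encoded = text.encode("utf-8")
print(encoded)  # b'caf\xc3\xa9'

decoded = encoded.decode("utf-8")
print(decoded == text)  # True

# chr() goes from a code point back to a character.
print(chr(65))  # A
```

Note that the accented character occupies one position in the string but two bytes once encoded as UTF-8.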

### **References:**

> [**Python Strings - Programiz**](https://www.programiz.com/python-programming/string)

### **How to create a string in Python?**

Strings can be created by enclosing characters inside single quotes or double quotes. Triple quotes can also be used, but they are generally reserved for multi-line strings and docstrings.

In [1]:
# Define strings in Python.
my_string = 'Hello'
print(my_string)

my_string = "Hello"
print(my_string)

my_string = """Hello"""
print(my_string)

# A triple-quoted string can span multiple lines.
my_string = """Hello, welcome to
               the world of Python."""
print(my_string)

Hello
Hello
Hello
Hello, welcome to
               the world of Python.


### **How to access the characters in a string?**

We can access individual characters using indexing and a range of characters using slicing. Indexing starts from 0. Trying to access a character outside the index range raises an $IndexError$. The index must be an integer; using a float or any other type raises a $TypeError$.

Python allows negative indexing for its sequences. The index of $-1$ refers to the last item, $-2$ to the second last item, and so on. We can access a range of items in a string by using the slicing operator "$:$" (colon).

In [2]:
# Accessing string characters in Python.
string = "ARITRA"
print("string = ", string)

# First Character.
print("string[0] = ", string[0])

# Last Character.
print("string[-1] = ", string[-1])

# Slicing 2nd to 5th Character.
print("string[1:5] = ", string[1:5])

# Slicing from the 4th character up to (but not including) the 2nd last character.
print("string[3:-2] = ", string[3:-2])

# Accessing an index out of range, or using a non-integer index, raises an error.
print("string[15] = ", string[15])  # IndexError: string index out of range.

string =  ARITRA
string[0] =  A
string[-1] =  A
string[1:5] =  RITR
string[3:-2] =  T


IndexError: string index out of range
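The two error cases described above can be sketched with a small helper. The function name $safe\_char$ is hypothetical and exists only for this illustration. The last line also shows that slicing, unlike indexing, does not raise an error when the range runs past the end of the string.

```python
string = "ARITRA"


def safe_char(text, index):
    """Hypothetical helper: return text[index], or the name of the error raised."""
    try:
        return text[index]
    except IndexError:
        return "IndexError"
    except TypeError:
        return "TypeError"


print(safe_char(string, 2))    # I
print(safe_char(string, 15))   # IndexError (index past the end)
print(safe_char(string, 1.5))  # TypeError (index is not an integer)

# Slices are clipped to the string's length instead of raising.
print(string[2:15])  # ITRA
```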

### **How to change or delete a string?**

Strings are immutable. It means that elements of a string cannot be changed once they have been assigned. We can simply reassign different strings to the same name.

In [3]:
my_string = "ARITRA"
my_string[6] = "G"  # TypeError: 'str' object does not support item assignment.

TypeError: 'str' object does not support item assignment

In [4]:
my_string = "GANGULY"
print(my_string)

GANGULY
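Because strings are immutable, a "changed" string is really a new string. One possible way to get that effect, shown here only as a sketch, is to combine slices of the original. The variable names $changed$ and $removed$ are made up for the example.

```python
my_string = "ARITRA"

# Replace the last character by joining a slice with a new character.
changed = my_string[:5] + "G"
print(changed)  # ARITRG

# Drop the character at index 1 by joining the slices around it.
removed = my_string[:1] + my_string[2:]
print(removed)  # AITRA

# The original string is untouched.
print(my_string)  # ARITRA
```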


We cannot delete or remove characters from a string. But deleting the string entirely is possible using the $del$ keyword.

In [5]:
del my_string[1]  # TypeError: 'str' object doesn't support item deletion.

TypeError: 'str' object doesn't support item deletion

In [6]:
del my_string  # Delete the entire string.

## **Python String Operations**

Python supports many string operations, which makes the string one of the most used data types.

#### **Concatenation of Two or More Strings**

*Joining two or more strings into a single one is called concatenation.*

The $+$ operator does this in Python. Simply writing two string literals next to each other also concatenates them.

The $*$ operator can be used to repeat the string a given number of times.

In [7]:
# Python String Operations.
str1 = "Hello "
str2 = "World!"

print("str1 + str2 = ", str1 + str2)

print("str1 * 3 = ", str1 * 3)

str1 + str2 =  Hello World!
str1 * 3 =  Hello Hello Hello 
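The cell above covers $+$ and $*$. Concatenation of adjacent literals can be sketched as follows. It applies only to literals written next to each other, not to variables. The variable names are illustrative.

```python
# Adjacent string literals are joined into one string.
greeting = "Hello " "World!"
print(greeting)  # Hello World!

# This is handy for splitting a long literal across lines inside parentheses.
long_text = (
    "Hello, welcome to "
    "the world of Python."
)
print(long_text)  # Hello, welcome to the world of Python.

# Repetition works with any string, for example to draw a separator.
print("-" * 10)  # ----------
```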


#### **Iterate through a String**

We can iterate through a string using a $for$ loop. The example below counts the number of '$l$'s in a string.

In [8]:
# Iterate through a String.
count = 0
for letter in "Hello World":
    if letter == "l":
        count += 1
print(count, " letters found.")

3  letters found.


#### **Searching for Substrings**

We can search a string for one or more adjacent characters, also known as a substring. This lets us count the number of occurrences, determine whether a string contains a substring, or determine the index at which a substring resides in a string.

**Counting Occurrences**

String method $count()$ returns the number of times its argument occurs in the string on which the method is called.

In [9]:
sentence = "to be or not to be that is the question"
sentence.count("to")

2

If we specify a second argument, i.e., a $start\_index$, count searches only the slice $string[start\_index:]$, i.e., from the $start\_index$ through the end of the string.

In [10]:
sentence.count("to", 12)

1

If we specify the second and third arguments, i.e., the $start\_index$ and the $end\_index$, count searches only the slice $string[start\_index:end\_index]$, i.e., from the $start\_index$ up to, but not including, the $end\_index$.

In [11]:
sentence.count("that", 12, 25)

1
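To determine the index at which a substring resides, one option is the string methods $find()$, $rfind()$ and $index()$. This is a minimal sketch using the same sentence; the variable names are chosen for the example.

```python
sentence = "to be or not to be that is the question"

# find() returns the index of the first match, rfind() the index of the last.
first_be = sentence.find("be")
last_be = sentence.rfind("be")
print(first_be, last_be)  # 3 16

# find() returns -1 when the substring is absent.
missing = sentence.find("xyz")
print(missing)  # -1

# index() behaves like find() but raises ValueError instead of returning -1.
index_failed = False
try:
    sentence.index("xyz")
except ValueError:
    index_failed = True
print(index_failed)  # True
```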

#### **String Membership Test**

We can test if a substring exists within a string or not, using the keyword $in$.



In [12]:
"a" in "program"

True

In [13]:
"at" not in "battle"

False

In [14]:
"that" in sentence

True

In [15]:
"THAT" in sentence

False

**Locating a Substring at the Beginning or End of a String**

String methods $startswith()$ and $endswith()$ return $True$ if the string starts with or ends with the specified substring, and $False$ otherwise.

In [16]:
sentence = "to be or not to be that is the question"

In [17]:
sentence.startswith("to")

True

In [18]:
sentence.startswith("be")

False

In [19]:
sentence.endswith("question")

True

In [20]:
sentence.endswith("quest")

False

#### **Built-in Functions to Work with Strings**

Various built-in functions that work with sequences also work with strings.

Some of the commonly used ones are $enumerate()$ and $len()$. The $enumerate()$ function returns an enumerate object. It contains the index and value of all the items in the string as pairs. It can be useful for iteration.

In [21]:
string = "cold"

# Enumerate.
list_enumerate = list(enumerate(string))
print("list(enumerate(string)) = ", list_enumerate)

# Character Count.
print("len(string) = ", len(string))

list(enumerate(string)) =  [(0, 'c'), (1, 'o'), (2, 'l'), (3, 'd')]
len(string) =  4
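In practice, $enumerate()$ is most often used directly in a $for$ loop rather than converted to a list. The sketch below shows that pattern, along with the optional $start$ argument; the variable names are illustrative.

```python
string = "cold"

# Unpack each (index, character) pair while looping.
lines = []
for index, char in enumerate(string):
    lines.append(f"{index}: {char}")
print("\n".join(lines))

# The optional start argument changes the first index.
numbered = list(enumerate(string, start=1))
print(numbered)  # [(1, 'c'), (2, 'o'), (3, 'l'), (4, 'd')]
```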


### **Comparison Operators for Strings**

Strings may be compared with the comparison operators. Strings are compared character by character, based on the underlying integer value (code point) of each character. So uppercase letters compare as less than lowercase letters because uppercase letters have lower integer values. For example, the letter '$A$' is 65, and the letter '$a$' is 97.

In [22]:
print(f'A: {ord("A")}; a: {ord("a")}')

A: 65; a: 97


Compare the strings "**Orange**" and "**orange**" using the comparison operators.

In [23]:
"Orange" == "orange"

False

In [24]:
"Orange" != "orange"

True

In [25]:
"Orange" < "orange"

True

In [26]:
"Orange" <= "orange"

True

In [27]:
"Orange" > "orange"

False

In [28]:
"Orange" >= "orange"

False
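One practical consequence of these rules is that sorting puts all uppercase words before lowercase ones. The sketch below shows this, and one common way to compare or sort while ignoring case by converting to lowercase first. The word list is made up for the example.

```python
words = ["orange", "Orange", "apple", "Banana"]

# Default sorting uses code points, so uppercase words come first.
default_sorted = sorted(words)
print(default_sorted)  # ['Banana', 'Orange', 'apple', 'orange']

# Sorting on the lowercase form ignores case.
ignore_case = sorted(words, key=str.lower)
print(ignore_case)  # ['apple', 'Banana', 'orange', 'Orange']

# The same idea gives a case-insensitive equality test.
same = "Orange".lower() == "orange".lower()
print(same)  # True
```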

## **Python String Formatting**

**Escape Sequence**

If we want to print a text like (He said, "What's there?"), we can neither use single quotes nor double quotes. It will result in a $SyntaxError$ as the text itself contains both single and double quotes.

In [29]:
print("He said, "What's there?"")   # SyntaxError: Invalid Syntax.

print('He said, "What's there?"')   # SyntaxError: Invalid Syntax.

SyntaxError: ignored

One way to get around this problem is to use triple quotes. Alternatively, we can use escape sequences.

An escape sequence starts with a backslash and is interpreted differently from the literal characters. If we use single quotes to delimit a string, all the single quotes inside the string must be escaped. The same applies to double quotes.

In [30]:
# Use Triple Quotes.
print('''He said, "What's there?"''')

# Escaping Single Quotes.
print('He said, "What\'s there?"')

# Escaping Double Quotes.
print("He said, \"What's there?\"")

He said, "What's there?"
He said, "What's there?"
He said, "What's there?"


**Raw String to ignore escape sequence.**

Sometimes we may wish to ignore the escape sequences inside a string. To do this, we can place '$r$' or '$R$' in front of the string. It will imply that it is a raw string, and any escape sequence inside it will be ignored.

In [31]:
print("This is \x61 \ngood example")

This is a 
good example


In [32]:
print(r"This is \x61 \ngood example")

This is \x61 \ngood example
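A typical place where raw strings help is text that contains many backslashes. The path below is a made-up example, used only to show how the same characters are read with and without the $r$ prefix.

```python
# Without the prefix, \n becomes a newline and \t becomes a tab.
normal = "C:\new\table"
# With the prefix, every backslash is kept as a literal character.
raw = r"C:\new\table"

print(len(normal))  # 10
print(len(raw))     # 12

# Doubling each backslash is the non-raw way to write the same string.
escaped = "C:\\new\\table"
print(escaped == raw)  # True
```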


**The $format()$ Method for Formatting Strings.**

The $format()$ method is available on every string object and is a versatile way of formatting strings. Format strings contain curly braces **{ }** as placeholders, or replacement fields, which are replaced by the arguments passed to $format()$.

We can use positional arguments or keyword arguments to specify the order.

In [33]:
# Python string format() method.

# Default (implicit) order.
default_order = "{}, {} and {}".format("John", "Bill", "Sean")
print("\n--- Default Order ---")
print(default_order)

# Order using Positional Argument.
positional_order = "{1}, {0} and {2}".format("John", "Bill", "Sean")
print("\n--- Positional Order ---")
print(positional_order)

# Order using Keyword Argument.
keyword_order = "{s}, {b} and {j}".format(j="John", b="Bill", s="Sean")
print("\n--- Keyword Order ---")
print(keyword_order)


--- Default Order ---
John, Bill and Sean

--- Positional Order ---
Bill, John and Sean

--- Keyword Order ---
Sean, Bill and John


**Old Style Formatting**

In [34]:
x = 12.3456789
print("The value of x is %3.2f" % x)
print("The value of x is %3.4f" % x)

The value of x is 12.35
The value of x is 12.3457
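In the old style, the $\%$ operator substitutes the value on its right into the placeholder on its left. In a placeholder such as $\%3.2f$, the number before the dot is the minimum field width and the number after it is the number of decimal places. The sketch below compares the old style with $format()$ and with an f-string (the form used earlier with $ord()$); the variable names are illustrative.

```python
x = 12.3456789

# Three ways to format x with two decimal places.
old_style = "The value of x is %.2f" % x
with_format = "The value of x is {:.2f}".format(x)
f_string = f"The value of x is {x:.2f}"
print(old_style)  # The value of x is 12.35
print(old_style == with_format == f_string)  # True

# The width only has a visible effect when it exceeds the number's length.
padded = "%8.2f" % x
print(repr(padded))  # '   12.35'
```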


**Common Python String Methods**

There are numerous methods available with the string object. Some of the commonly used methods are $lower()$, $upper()$, $join()$, $split()$, $find()$, $replace()$, etc.

In [35]:
my_string = "ArItrAGanGulY"

print(my_string.upper())
print(my_string.lower())

ARITRAGANGULY
aritraganguly


In [36]:
"This will split all words into a list".split()

['This', 'will', 'split', 'all', 'words', 'into', 'a', 'list']

In [37]:
" ".join(["This", "will", "join", "all", "words", "into", "a", "string"])

'This will join all words into a string'

In [38]:
text = "Happy New Year"

In [39]:
text.find("ew")

7

In [40]:
text.replace("Happy", "Brilliant")

'Brilliant New Year'
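A few details of these methods are easy to miss: $replace()$ returns a new string and leaves the original unchanged, $find()$ returns $-1$ when nothing matches, and $split()$ accepts an explicit separator. The following sketch illustrates them with sample values chosen for the example.

```python
text = "Happy New Year"

# replace() returns a new string; text itself is unchanged.
replaced = text.replace("Happy", "Brilliant")
print(replaced)  # Brilliant New Year
print(text)      # Happy New Year

# find() returns -1 when the substring is not present.
position = text.find("Old")
print(position)  # -1

# split() with a separator, then join() with a different one.
colors = "red,green,blue".split(",")
print(colors)  # ['red', 'green', 'blue']
rejoined = "-".join(colors)
print(rejoined)  # red-green-blue
```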