# **Python Strings.**

# **Topics**

1. **Introduction** 
2. **Accessing Values in Strings**
3. **Updating strings**
4. **Basic string operations**
5. **String Formatting**
6. **Built-in String Methods**
7. **A String Peculiarity**
8. **Python F-strings**
9. **Python Raw Strings**	 
10. **Python Backslash**

## **Introduction**
* Strings are among the most commonly used types in Python. We can create them simply by enclosing characters in quotes.
* Python treats single quotes the same as double quotes.
* The Python string (str) is the built-in text sequence type. It is used to handle textual data in Python.
* Python strings are immutable sequences of Unicode code points.
* All strings in Python 3 are sequences of "pure" Unicode characters; a specific encoding such as UTF-8 only comes into play when a string is converted to bytes.
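
* As a minimal sketch of the last two points, the following snippet looks at the code points of a string and at the bytes produced when it is explicitly encoded. The sample word and the variable names are only illustrative.

```python
# A str is a sequence of Unicode code points; bytes only appear after encoding.
word = "na\u00efve"  # the word "naive" written with U+00EF (i with diaeresis)

code_points = [ord(ch) for ch in word]  # integer code point of each character
utf8_bytes = word.encode("utf-8")       # explicit encoding produces a bytes object

print(len(word))        # number of code points: 5
print(code_points)      # [110, 97, 239, 118, 101]
print(len(utf8_bytes))  # 6, because U+00EF needs two bytes in UTF-8
print(utf8_bytes.decode("utf-8") == word)  # True: decoding restores the text
```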

In [3]:
x='python strings'
print(x)
print(type(x))

python strings
<class 'str'>


In [2]:
y="python strings"
print(y)
print(type(y))

python strings
<class 'str'>


In [1]:
z=' '
print(z)
print(type(z))

 
<class 'str'>


In [6]:
s3 = 'It doesn't matter!'

SyntaxError: unterminated string literal (detected at line 1) (731531896.py, line 1)

In [4]:
#Single quotes will have to be escaped with a backslash \, if the string is defined with single quotes:
s3 = 'It doesn\'t matter!'
s3

"It doesn't matter!"

In [5]:
#This is not necessary, if the string is represented by double quotes:

s3 = "It doesn't matter!"
s3

"It doesn't matter!"

In [8]:
txt = "He said: "It doesn't matter, if you enclose a string in single or double quotes!""
print(txt)

SyntaxError: unterminated string literal (detected at line 1) (1816875603.py, line 1)

In [7]:
#Analogously, we will have to escape a double quote inside a double quoted string:

txt = "He said: \"It doesn't matter, if you enclose a string in single or double quotes!\""
print(txt) 

He said: "It doesn't matter, if you enclose a string in single or double quotes!"


* They can also be enclosed in matching groups of three single or double quotes. In this case they are called triple-quoted strings. The backslash (\) character is used to escape characters that otherwise have a special meaning, such as newline, backslash itself, or the quote character.

In [9]:
txt = '''A string in triple quotes can extend
over multiple lines like this one, and can contain
'single' and "double" quotes.'''
txt

'A string in triple quotes can extend\nover multiple lines like this one, and can contain\n\'single\' and "double" quotes.'

In [10]:
#The last character of a string can be accessed this way:
s = "Hello World"
s[len(s)-1]

'd'

* A string in Python consists of a series or sequence of characters - letters, numbers, and special characters. Strings can be subscripted or indexed. Similar to C, the first character of a string has the index 0.
![image.png](attachment:image.png)

## **Accessing Values in Strings.**
* Python does not support a character type, i.e. it doesn’t have a data type called char; single characters are treated as strings of length one and are thus also considered substrings. Hence, we can easily index and slice Python strings.
* A subscript creates a slice when it includes colons within the square brackets, as in **string[start:stop:step].**


In [6]:
s="python is the best language."
s[0]

'p'

In [7]:
s[-1]

'.'

In [8]:
s[1:4]

'yth'

In [9]:
s[1:]+s[:1]

'ython is the best language.p'

In [10]:
s[:3]

'pyt'

In [11]:
s[::1]

'python is the best language.'

In [12]:
s[::-1]

'.egaugnal tseb eht si nohtyp'

In [13]:
s[::2]

'pto stebs agae'
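
* The slicing rules can be sketched a bit further. The example below, with a made-up sample string, shows that slices tolerate out-of-range bounds while plain indexing does not.

```python
s = "python"

# Slices clamp out-of-range bounds instead of raising an error.
tail = s[2:100]    # 'thon'
empty = s[10:20]   # ''

# Negative indices count from the end of the string.
last_two = s[-2:]               # 'on'
every_other_reversed = s[::-2]  # 'nhy' (indices 5, 3, 1)

# Indexing, unlike slicing, raises IndexError when out of range.
try:
    s[10]
    index_error_raised = False
except IndexError:
    index_error_raised = True

print(tail, repr(empty), last_two, every_other_reversed, index_error_raised)
```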

## **Updating strings.**
* Strings are immutable, so we cannot replace the character at a particular index of a string.
* You can "update" an existing string by (re)assigning a variable to another string. The new value can be related to its previous value or to a completely different string altogether. 

In [14]:
s1='i do not like java and cpp'
s1[2]='c'
s1

TypeError: 'str' object does not support item assignment

In [None]:
s1+"hello"

In [None]:
s1[:2]+"hello"
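
* "Updating" by building a new string can be sketched as follows. This is only an illustration; the names s2 and s3 are not part of the cells above.

```python
s1 = "i do not like java and cpp"

# "Replace" one character by building a new string from slices.
s2 = s1[:2] + "c" + s1[3:]   # s1 itself is left untouched

# str.replace also returns a new string instead of modifying s1.
s3 = s1.replace("java", "python")

# Reassigning the name binds it to a new string object.
original = s1
s1 = s1 + " hello"

print(s2)        # i co not like java and cpp
print(s3)        # i do not like python and cpp
print(original)  # i do not like java and cpp
print(s1)        # i do not like java and cpp hello
```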

## **Basic string operations.**

In [1]:
"python"+"xv"

'pythonxv'

In [2]:
"python"*12

'pythonpythonpythonpythonpythonpythonpythonpythonpythonpythonpythonpython'

In [4]:
s1="thor"
s1+" ragnarok"

'thor ragnarok'

In [5]:
"p" in s1

False

In [6]:
"y" not in s1

True

## **String Formatting.**
* "%" is the string formatting operator.
* %s = string; the value is converted via str() prior to formatting, so it also works for lists and other objects.
* %i = signed integer values.
* %d = signed integer values (equivalent to %i).
* %f = floating-point values in fixed-point notation, with six decimal places by default.
* %c = a single character.


In [7]:
t=10
s='Ahmedabad'
d=1.14354565
c='@'
l=[1,2,3]
print("i went to %s on wednesday and left %c %i and had a cigarette worth rupee %f nd wrote a code containing list %s"%(s,c,t,d,l))

i went to Ahmedabad on wednesday and left @ 10 and had a cigarette worth rupee 1.143546 nd wrote a code containing list [1, 2, 3]


In [8]:
print("%.2f"%(d)) # float rounded to 2 decimal places.

1.14
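
* The conversion types listed above can be compared side by side. The following is a small sketch with illustrative variable names.

```python
count = 10
price = 1.14354565

as_d = "%d" % count        # '10'
as_i = "%i" % count        # '10', %i behaves like %d
default_f = "%f" % price   # '1.143546', six decimal places by default
padded = "%8.2f" % price   # '    1.14', total width 8, two decimals
from_code = "%c" % 64      # '@', %c also accepts an integer code point

print(as_d, as_i, default_f, repr(padded), from_code)
```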


### **String formatting using .format() method**

In [9]:
a='Ravi'
b=17
c='age'
d=2.25
e='@'
print("{0} is {1} old and he's very compared to teenager of his {2}. Also. he's {3}m tall and live {4} 4th street".format(a,b,c,d,e))

Ravi is 17 old and he's very compared to teenager of his age. Also. he's 2.25m tall and live @ 4th street


In [10]:
print("{0},{1},{2},{3}".format('python','is','best',4))

python,is,best,4


In [11]:
print("{0}...{1}...{2}".format(*'abc'))

a...b...c
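
* Besides positional indexes, .format() also accepts keyword arguments and format specifications after a colon. A minimal sketch, with made-up values:

```python
# Named fields are filled from keyword arguments; ':.1f' formats the float.
line = "{name} is {age} years old and {height:.1f} m tall".format(
    name="Ravi", age=17, height=1.72
)

# '<' left-aligns and '>' right-aligns within the given width.
aligned = "{:<6}|{:>6}".format("ab", "cd")

# A positional index can be reused several times.
reused = "{0}-{1}-{0}".format("a", "b")

print(line)     # Ravi is 17 years old and 1.7 m tall
print(aligned)  # ab    |    cd
print(reused)   # a-b-a
```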


## **Built-in String Methods.**
1. **str.capitalize()** = Capitalizes the first letter of the string and lowercases the rest.
2. **str.isalnum()** = Returns True if the string has at least 1 character and all characters are alphanumeric, and False otherwise.
3. **str.isalpha()** = Returns True if the string has at least 1 character and all characters are alphabetic, and False otherwise.
4. **str.isdigit()** = Returns True if the string has at least 1 character and contains only digits, and False otherwise.
5. **str.islower()** = Returns True if the string has at least 1 cased character and all cased characters are in lowercase, and False otherwise.
6. **str.isnumeric()** = Returns True if the string has at least 1 character and contains only numeric characters, and False otherwise.
7. **str.isspace()** = Returns True if the string has at least 1 character and contains only whitespace characters, and False otherwise.
8. **str.istitle()** = Returns True if the string is properly "titlecased", and False otherwise.
9. **str.isupper()** = Returns True if the string has at least one cased character and all cased characters are in uppercase, and False otherwise.
10. **str.isdecimal()** = Returns True if the string has at least 1 character and contains only decimal characters, and False otherwise.
11. **max(string)** = Returns the character with the highest code point in the string.
12. **min(string)** = Returns the character with the lowest code point in the string.
13. **string.count(sub, beg=0, end=len(string))** = Counts how many times sub occurs in the string, or in a substring of the string if starting index beg and ending index end are given.
14. **string.find(sub, beg=0, end=len(string))** = Determines whether sub occurs in the string, or in a substring of the string if starting index beg and ending index end are given; returns the lowest index if found and -1 otherwise.
15. **" ".join(seq)** = Merges (concatenates) the strings in the sequence seq into one string, using the string the method is called on as the separator. All elements of seq must be strings.
16. **len(string)** = Returns the length of the string.
17. **str.lower()** = Converts all uppercase letters in the string to lowercase.
18. **str.upper()** = Converts all lowercase letters in the string to uppercase.
19. **str.title()** = Converts the string to title case: the first letter of each word is uppercase, the rest lowercase.
20. **str.swapcase()** = Inverts the case of all letters in the string.
21. **str.split(sep)** = Splits the string at each occurrence of sep and returns the parts as a list. Without sep, it splits on runs of whitespace.
22. **str.endswith(suffix, beg=0, end=len(string))** = Determines whether the string, or a substring of the string if starting index beg and ending index end are given, ends with suffix; returns True if so and False otherwise.
23. **str.ljust(width[, fillchar])** = The method ljust() returns the string left-justified in a string of length width. Padding is done using the specified fillchar (default is a space). The original string is returned if width is less than or equal to len(string).
24. **str.rjust(width[, fillchar])** = The method rjust() returns the string right-justified in a string of length width. Padding is done using the specified fillchar (default is a space). The original string is returned if width is less than or equal to len(string).
25. **str.lstrip()** = Removes all leading whitespace from the string.
26. **str.rstrip()** = Removes all trailing whitespace from the string.
27. **str.center(width[, fillchar])** = The center() method centers the string in a string of length width, using the specified character (space is the default) as the fill character.
28. **str.rjust(width[, fillchar])** = As noted in item 24, this returns a new string of the specified length with the source string right-justified. We can specify the character to use for the padding; the default is a space. If the specified length is not larger than the source string, the source string is returned.
29. **str.ljust(width[, fillchar])** = As noted in item 23, ljust() is very similar to rjust(). The only difference is that the original string is left-justified.


In [13]:
s1='python is Good'
s1.capitalize()

'Python is good'

In [16]:
s1.islower()

False

In [17]:
s1.istitle()

False

In [18]:
s1 # the built-in methods return new strings; the original string is unchanged.

'python is Good'

In [19]:
s1.isupper()

False

In [20]:
s2='python@435#$sdf'
s2.isalnum()

False

In [21]:
s3="39431032139099"
max(s3)

'9'

In [22]:
min(s3)

'0'

In [23]:
s4='python is best language'
s4.split()

['python', 'is', 'best', 'language']

In [27]:
s4.split('is')

['python ', ' best language']

In [39]:
s4.split(",")

['python is best language']

In [40]:
s4.split(" ")

['python', 'is', 'best', 'language']

In [29]:
s5="dfsgdhsadsgdfard3546580877$%^^#"
print('s5.isdecimal(): ',s5.isdecimal())
print('s5.isnumeric(): ',s5.isnumeric())
print('s5.isdigit(): ',s5.isdigit())
print('len(s5): ',len(s5))
print('s5.count("d"): ',s5.count("d"))
print('s5.count("d",5,10): ',s5.count("d",5,10))
print('s5.find("d",2,10): ',s5.find("d",2,10))
print('s5.count("d"): ',s5.count("d"))

s5.isdecimal():  False
s5.isnumeric():  False
s5.isdigit():  False
len(s5):  31
s5.count("d"):  5
s5.count("d",5,10):  1
s5.find("d",2,10):  4
s5.count("d"):  5
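
* The difference between isdecimal(), isdigit() and isnumeric() only shows up with characters outside the ASCII digits. The following sketch uses three sample characters chosen for illustration.

```python
# Each key is one character; the value is just a readable label for it.
samples = {
    "7": "ASCII digit seven",
    "\u00b2": "superscript two",
    "\u00bd": "vulgar fraction one half",
}

results = {}
for ch, label in samples.items():
    # Tuple order: (isdecimal, isdigit, isnumeric)
    results[label] = (ch.isdecimal(), ch.isdigit(), ch.isnumeric())
    print(label, results[label])

# ASCII digit seven        (True, True, True)
# superscript two          (False, True, True)
# vulgar fraction one half (False, False, True)
```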


In [30]:
','.join(s5)

'd,f,s,g,d,h,s,a,d,s,g,d,f,a,r,d,3,5,4,6,5,8,0,8,7,7,$,%,^,^,#'

In [36]:
' '.join(s5)

'd f s g d h s a d s g d f a r d 3 5 4 6 5 8 0 8 7 7 $ % ^ ^ #'

In [37]:
''.join(s5)

'dfsgdhsadsgdfard3546580877$%^^#'

In [38]:
print("s5.swapcase(): ",s5.swapcase())
print("s5.lower(): ",s5.lower())
print("s5.upper(): ",s5.upper())
print("s5.title(): ",s5.title())

s5.swapcase():  DFSGDHSADSGDFARD3546580877$%^^#
s5.lower():  dfsgdhsadsgdfard3546580877$%^^#
s5.upper():  DFSGDHSADSGDFARD3546580877$%^^#
s5.title():  Dfsgdhsadsgdfard3546580877$%^^#


In [46]:
print('s5.endswith("f"): ',s5.endswith("f"))
print('s5.endswith(""): ',s5.endswith(""))
print('s5.endswith("f",0,5): ',s5.endswith("f",0,5))
print("s5.startswith('d'): ",s5.startswith('d'))
print("s5.startswith(''): ",s5.startswith(''))

s5.endswith("f"):  False
s5.endswith(""):  True
s5.endswith("f",0,5):  False
s5.startswith('d'):  True
s5.startswith(''):  True


In [15]:
s6="  sdfkdsfdls21313%$#%      "
s6.lstrip()

'sdfkdsfdls21313%$#%      '

In [16]:
s6.rstrip()

'  sdfkdsfdls21313%$#%'

In [18]:
len(s6)

27

In [23]:
s = 'Hello'
s.rjust(20)

'               Hello'

In [24]:
s.rjust(20,"#")

'###############Hello'

In [25]:
s.ljust(20)

'Hello               '

In [26]:
s.ljust(20,"*")

'Hello***************'

In [27]:
s.center(20)

'       Hello        '

In [28]:
s.center(20,"&")

'&&&&&&&Hello&&&&&&&&'

## **A String Peculiarity**
* Strings show a special effect, which we will illustrate in the following example. We will need the "is" operator. If both a and b are strings, "a is b" checks whether they have the same identity, i.e., whether they are the same object in memory.
* If "a is b" is True, then it trivially follows that "a == b" has to be True as well. Yet "a == b" being True doesn't imply that "a is b" is True as well!

In [29]:
a = "Linux"
b = "Linux"
a is b

True

In [30]:
a==b

True

In [2]:
# These two strings differ in case, so they are neither equal nor identical.
a ="Dharmendra" 
b ="dharmendra"
a is b

False

* Okay, but what happens if the strings are longer? We use the longest village name in the world in the following example. It's a small village with about 3000 inhabitants in the south of the island of Anglesey in the north-west of Wales:

In [3]:
a = "Llanfairpwllgwyngyllgogerychwyrndrobwllllantysiliogogogoch"
b = "Llanfairpwllgwyngyllgogerychwyrndrobwllllantysiliogogogoch"
a is b

True
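
* The statement that "a == b" does not imply "a is b" can be sketched with a string that is built at runtime. Whether two equal strings share one object is an implementation detail, so the result of the "is" comparison below is printed but should not be relied on. The use of sys.intern here is an addition for illustration.

```python
import sys

a = "Linux"
b = "".join(["Lin", "ux"])  # same text, but assembled at runtime

print(a == b)  # True: both strings contain the same characters
print(a is b)  # identity check; implementation-dependent, often False in CPython

# sys.intern returns one canonical object per distinct string value,
# so the interned versions are guaranteed to be identical.
a_interned = sys.intern(a)
b_interned = sys.intern(b)
print(a_interned is b_interned)  # True
```

* In everyday code, compare strings with == and reserve "is" for identity checks such as "x is None".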

## **Python F-strings**
* Python 3.6 introduced f-strings, which allow you to format text strings faster and more elegantly. F-strings provide a way to embed expressions inside a string literal using a clearer syntax than the format() method.

In [1]:
name = 'John'
s = f'Hello, {name}!'
print(s)

Hello, John!


### **How it works.**

* First, define a variable with the value 'John'.
* Then, place the name variable inside the curly braces {} in the literal string. Note that you need to prefix the string with the letter f to indicate that it is an f-string. It’s also valid if you use the letter in uppercase (F).
* Finally, print out the string s.

* It’s important to note that Python evaluates the expressions in f-string at runtime. It replaces the expressions inside an f-string by their values.

In [2]:
#The following example calls the upper() method to convert the name to uppercase inside the curly braces of an f-string:
name = 'John'
s = F'Hello, {name.upper()}!'
print(s)

Hello, JOHN!


In [3]:
#The following example uses multiple curly braces inside an f-string:
first_name = 'John'
last_name = 'Doe'
s = F'Hello, {first_name} {last_name}!'
print(s)

Hello, John Doe!


### **Multiline f-strings**
* Python allows you to have multiline f-strings. To create a multiline f-string, you place the letter f in each line. 
* For example:

In [4]:
name = 'John'
website = 'PythonTutorial.net'

message = (
    f'Hello {name}. '
    f"You're learning Python at {website}." 
)

print(message)

Hello John. You're learning Python at PythonTutorial.net.


* If you want to spread an f-string over multiple lines without parentheses, you can use a backslash (\) at the end of a line to continue the statement on the next line, like this:

In [5]:
name = 'John'
website = 'PythonTutorial.net'

message = f'Hello {name}. ' \
          f"You're learning Python at {website}." 

print(message)

Hello John. You're learning Python at PythonTutorial.net.


* The following example shows how to use triple quotes (""") with an f-string:

In [6]:
name = 'John'
website = 'PythonTutorial.net'

message = f"""Hello {name}.
You're learning Python at {website}."""

print(message)

Hello John.
You're learning Python at PythonTutorial.net.


### **Curly braces**
* When evaluating an f-string, Python replaces double curly braces with a single curly brace. However, the doubled curly braces do not signify the start of an expression.

* This means that Python does not evaluate the expression inside doubled curly braces; it only replaces each doubled curly brace with a single one. For example:

In [7]:
s = f'{{1+2}}'
print(s)

{1+2}


In [9]:
#The following shows an f-string with triple curly braces:
#In this example, Python evaluates the {1+2} as an expression, which returns 3. 
#Also, it replaces the remaining doubled curly braces with a single one.
s = f'{{{1+2}}}'
print(s)

{3}


In [11]:
#To add more curly braces to the result string, you use more than triple curly braces:
s = f'{{{{1+2}}}}'
print(s)
#In this example, Python replaces each pair of doubled curly braces with a single curly brace.

{{1+2}}


### **Evaluation order of expressions in Python f-strings**
* Python evaluates the expressions in an f-string in left-to-right order.
* This becomes visible when the expressions have side effects, as in the following example:

In [12]:

def inc(numbers, value):
    numbers[0] += value
    return numbers[0]

numbers = [0]

s = f'{inc(numbers,1)},{inc(numbers,2)}'
print(s)

1,3


* After the first inc() call, numbers[0] is 1. The second call then increases the first number in the numbers list by 2, which results in 3.

### **Format numbers in f-strings**
* To format a number in an f-string, you use this simplified syntax:
![image.png](attachment:image.png)

In [13]:
previous = 99.2
current = 110.3
vs_previous = (current - previous) / previous

print(f'Current vs. previous year: {vs_previous:.2%}')

Current vs. previous year: 11.19%
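
* A few more format specifications, written after the colon inside the curly braces, can be sketched as follows. The values and variable names are made up for illustration.

```python
value = 1234567.891
ratio = 0.1119

with_commas = f"{value:,.2f}"  # thousands separators, two decimals
percent = f"{ratio:.1%}"       # multiply by 100 and append a percent sign
padded = f"{42:>6d}"           # right-aligned in a field of width 6
zero_padded = f"{42:06d}"      # padded with leading zeros to width 6
scientific = f"{value:.3e}"    # scientific notation, three decimals

print(with_commas)   # 1,234,567.89
print(percent)       # 11.2%
print(repr(padded))  # '    42'
print(zero_padded)   # 000042
print(scientific)    # 1.235e+06
```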


## **Python Raw Strings**
* In Python, when you prefix a string with the letter r or R such as r'...' and R'...', that string becomes a raw string. Unlike a regular string, a raw string treats the backslashes (\) as literal characters.

* Raw strings are useful when you deal with strings that have many backslashes, for example, regular expressions or directory paths on Windows.

* To represent special characters such as tabs and newlines, Python uses the backslash (\) to signify the start of an escape sequence. For example:

In [1]:
s = 'lang\tver\nPython\t3'
print(s)

lang	ver
Python	3


* However, raw strings treat the backslash (\) as a literal character. For example:

In [2]:
s = r'lang\tver\nPython\t3'
print(s)

lang\tver\nPython\t3


* A raw string is equivalent to a regular string in which each backslash (\) is written as a double backslash (\\):

In [3]:
s1 = r'lang\tver\nPython\t3'
s2 = 'lang\\tver\\nPython\\t3'

print(s1 == s2) # True

True
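
* Regular expressions, mentioned above as a typical use case, show why this matters. The following sketch uses the standard re module with a made-up sample text.

```python
import re

text = "version 3.12 released on 2023-10-02"

# r"\d+" reaches the regex engine as backslash + d, meaning "one or more digits".
numbers = re.findall(r"\d+", text)

# The same pattern as a regular string needs the backslash doubled.
same_numbers = re.findall("\\d+", text)

print(numbers)                  # ['3', '12', '2023', '10', '02']
print(numbers == same_numbers)  # True
```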


* In a regular string, Python counts an escape sequence as a single character:

In [4]:
s = '\n'
print(len(s)) # 1

1


* However, in a raw string, Python counts the backslash (\) as one character:

In [5]:
s = r'\n'
print(len(s)) # 2

2


* Since the backslash (\) escapes the single quote (') or double quotes ("), a raw string cannot end with an odd number of backslashes.

* For example:

In [6]:
s = r'\'

SyntaxError: unterminated string literal (detected at line 1) (3865254185.py, line 1)

In [7]:
s = r'\\\'

SyntaxError: unterminated string literal (detected at line 1) (2666811007.py, line 1)
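
* If a string has to end with a single backslash, one possible workaround is to combine a raw string with a regular one. The sketch below shows two options; the path is only an example.

```python
base = r"c:\user\tasks"

# Option 1: append a regular string that contains one escaped backslash.
with_concat = base + "\\"

# Option 2: end the raw string with two backslashes and slice one off.
with_slice = r"c:\user\tasks\\"[:-1]

print(with_concat)                # c:\user\tasks\
print(with_concat == with_slice)  # True
```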

### **Use raw strings to handle file paths on Windows**
* Windows uses backslashes to separate the components of a path. For example:
![image.png](attachment:image.png)
* If you use this path as a regular string, Python raises a syntax error:

In [8]:
dir_path = 'c:\user\tasks\new'

SyntaxError: (unicode error) 'unicodeescape' codec can't decode bytes in position 2-3: truncated \uXXXX escape (4204228918.py, line 1)

* Python treats \u in the path as a Unicode escape but couldn’t decode it.

* Now, if you escape the first backslash, you’ll have other issues:

In [9]:
dir_path = 'c:\\user\tasks\new'
print(dir_path)

c:\user	asks
ew


* In this example, the \t is a tab and \n is the newline.

* To make it easy, you can turn the path into a raw string like this:

In [10]:
dir_path = r'c:\user\tasks\new'
print(dir_path)

c:\user\tasks\new


* To get a raw-string-like text representation of a regular string, you can call repr() on it. Note that the result has a quote at the beginning and at the end. To remove them, you can use slices:

In [11]:
s = '\n'
raw_string = repr(s)[1:-1]
print(raw_string)


\n


## **Python Backslash**
* In Python, the backslash(\) is a special character. If you use the backslash in front of another character, it changes the meaning of that character.

* For example, the t is a literal character. But if you use the backslash character in front of the letter t, it’ll become the tab character (\t).

* Generally, the backslash has two main purposes.

* First, the backslash character is a part of special character sequences such as the tab character \t or the new line character \n.

* The following example prints a string that has a newline character:

In [12]:
print('Hello,\n World')

Hello,
 World


* The \n is a single character, not two. For example:

In [13]:
s = '\n'
print(len(s)) # 1

1


* Second, the backslash (\) escapes other special characters.
* For example, if you have a string that has a single quote inside a single-quoted string like the following string, you need to use the backslash to escape the single quote character:

In [14]:
s = '"Python\'s awesome" She said'
print(s)

"Python's awesome" She said


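### **Backslash in f-strings**
* PEP 498 specifies that an f-string cannot contain a backslash character as part of the expression inside the curly braces {}. This restriction applies to Python versions before 3.12; PEP 701 lifted it in Python 3.12.

* On those earlier versions, the following example results in an error: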

In [15]:
colors = ['red','green','blue']
s = f'The RGB colors are:\n {'\n'.join(colors)}'
print(s)

SyntaxError: unexpected character after line continuation character (3226697727.py, line 2)

* To fix this, you need to join the strings in the colors list before placing them in the curly braces:

In [16]:
colors = ['red','green','blue']
rgb = '\n'.join(colors)
s = f"The RGB colors are:\n{rgb}"
print(s)

The RGB colors are:
red
green
blue
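
* Another variant that avoids a backslash inside the curly braces is to store the newline character in a variable first. This is only a sketch; the name newline is an arbitrary choice.

```python
colors = ['red', 'green', 'blue']

# Keep the escape sequence outside the f-string expression.
newline = '\n'
s = f"The RGB colors are:{newline}{newline.join(colors)}"

print(s)
```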


### **Backslash in raw strings**
* Raw strings treat the backslash character (\) as a literal character. The following example treats the backslash character \ as a literal character, not a special character:

In [17]:
s = r'\n'
print(s)

\n
