
### Python Programming
##### by Narendra Allam
Copyright 2019
# Chapter 2
## Strings

#### Topics Covered

- Strings
    - Commenting in python
    - Define a string - Multiple quotes and Multiple lines
    - String functions
    - String slicing - start, end & step
    - Negative indexing 
    - Scalar multiplication
- Exercise Programs

### Commenting in python

Comments describe the logic of the code and help new developers understand it better.<br>

In Python,<br>
* A hash (#) starts a single-line comment.
* Triple single quotes (''' ''') delimit a multi-line string; when such a string is not assigned to anything, it is often used as a multi-line comment.
* Triple double quotes (""" """) are conventionally used for docstrings (describing function parameters, class properties, etc.).
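
The three forms listed above can be sketched in one small example. The function name `area` is hypothetical and exists only for illustration.

```python
# A single-line comment: everything after the hash is ignored.

'''
A triple-quoted string that is not assigned to any name.
Python evaluates it and discards it, so it acts like a block comment.
'''

def area(width, height):
    """Return the area of a rectangle with the given width and height."""
    return width * height

print(area(3, 4))      # 12
print(area.__doc__)    # the docstring is kept on the function object
```

Unlike the other two forms, a docstring is stored at run time, and it is the text that `help(area)` displays.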

The cell below uses single-line (`#`) comments to annotate a quoting problem.

In [1]:
# s =  'John's Byke' # This gives an error
s =  "John's Byke" # Enclose with proper quotes
print(s)

John's Byke


In the cell below, a single-line string is written across multiple source lines using a backslash (<b> \ </b>); the adjacent string literals are joined into one string.

In [2]:
s = 'Apple is sweet. ' \
'But Orange is Sour.'
print(s)

Apple is sweet. But Orange is Sour.
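
As a minimal sketch of the other ways to define a string, the example below shows parentheses instead of a backslash, a triple-quoted string that really spans two lines, and mixed quote styles. The variable names `s1`, `s2` and `s3` are chosen only for illustration.

```python
# Adjacent literals inside parentheses are joined; no backslash is needed.
s1 = ('Apple is sweet. '
      'But Orange is Sour.')

# A triple-quoted string keeps the line break as part of the string.
s2 = '''Apple is sweet.
But Orange is Sour.'''

# Using one quote style outside lets the string contain the other style.
s3 = 'She said "hello"'

print(s1)
print(s2)
print(s3)
```

Here `s1` is a one-line string, while `s2` contains a newline character.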


## Strings

* A string is a sequence of characters.
* Any pair of matching quotes (single, double, or triple) can be used to delimit a string.
* Strings are immutable: we cannot add, delete, or modify individual characters in a string.
* In Python 2 the default string type holds ASCII (byte) characters; in Python 3 strings are Unicode.

Individual characters in a string can be accessed using square brackets and an index. Indexing starts from zero. For the string 'Apple' used below, <br>
s[0] is 'A' <br>
s[1] is 'p' <br>
and so on.

In [3]:
s = 'Apple'
print(s[0], s[1], s[2])

A p p


##### Internal representation of a string

<img src='drawing5.png' width='400'>

In [4]:
print(id(s[0]), id(s[1]), id(s[2]))

140154567352248 140154568108104 140154568108104


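In the above example, s[1] and s[2] report the same id: 'p' is stored only once and the same reference (address)
is used for index 1 and index 2. This sharing of single-character strings is a CPython implementation detail and should not be relied upon.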

__Finding the length of a string - the number of characters in it__ <br>
Use the len() function:

In [5]:
s = "Hello World!"
print(len(s)) # length of the string

12


__Strings are immutable__
* We cannot change individual characters
* We cannot add or delete characters

In [6]:
# **** Strings are immutable, we cannot change the characters
s = "Hello World!"
print(s)

Hello World!
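
To make the immutability concrete, here is a small sketch of what happens when code tries to change a character, and how a new string is built instead. The names `t` and `error_message` are chosen only for illustration.

```python
s = "Hello World!"

# Item assignment is not supported on strings, so this raises TypeError.
try:
    s[0] = 'J'
except TypeError as exc:
    error_message = str(exc)
    print("TypeError:", error_message)

# The usual approach is to build a new string from pieces of the old one.
t = 'J' + s[1:]
print(t)   # Jello World!
print(s)   # Hello World! (the original is unchanged)
```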


__ASCII and Unicode encoding__

In Python 3, strings are Unicode by default. In Python 2, the prefix 'u' is needed to define a Unicode string.
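
As a rough illustration of what Unicode strings mean in Python 3, the sketch below looks at code points with `ord()` and `chr()` and compares the number of characters with the number of UTF-8 bytes. The sample word is an assumption chosen because it contains one non-ASCII letter.

```python
print(ord('A'))              # code point of 'A'
print(chr(97))               # character with code point 97

word = 'na\u00efve'          # 'naive' with a diaeresis on the i, written as an escape
encoded = word.encode('utf-8')

# A str counts characters; the encoded bytes object counts bytes.
print(len(word), len(encoded))
print(type(word), type(encoded))
```

The accented letter is one character in the `str` but takes two bytes in UTF-8, so the two lengths differ.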

In [7]:
import sys
s = 'Apple'
print(type(s), sys.getsizeof(s))

<class 'str'> 54


### String slicing
Slicing is the technique of extracting a substring or a set of characters from a string.

syntax :<br>
```python
string[start:end:step]
```
* start - index at which slicing starts (inclusive)
* end - index at which slicing ends (exclusive)
* step - amount by which the index is incremented (or decremented, if negative) after each character

__Note:__ Default step value is 1.

Let's see some examples:

In [8]:
s = "Hello World!"
print(s[6:11]) # returns a substring of characters from 6 to 11, excluding 11

World


In [9]:
print(s)

Hello World!


In [10]:
s[:4] # assumes start as 0

'Hell'

In [11]:
s[6:] # assumes end as the length of the string

'World!'

In [12]:
s[1:9] # returns a substring of characters from 1 to 8, excluding 9

'ello Wor'

__Step count__ - Default step count is 1

In [13]:
s[1:9:3]

'eoo'

In [14]:
s[:10:2]

'HloWr'

In the `s[1:9:3]` example above (In [13]),<br>
start is 1,<br>
end is 9 and<br>
step is 3. <br>

First it takes s[1],<br>
then s[1 + step] => s[1 + 3] => s[4],<br>
takes s[4],<br>
then s[4 + step], which is s[7], and so on,<br>
until the index passes 8 (the end index 9 is exclusive).<br>
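
The indices a slice visits are the same ones `range(start, end, step)` produces, which gives a simple way to check the walk-through. This is a sketch for illustration; the names `indices` and `manual` are not part of the original notebook.

```python
s = "Hello World!"
start, end, step = 1, 9, 3

# range() yields the same indices the slice visits.
indices = list(range(start, end, step))
print(indices)                       # [1, 4, 7]

# Picking the characters one by one gives the same result as the slice.
manual = ''.join(s[i] for i in indices)
print(manual, manual == s[start:end:step])
```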

In [15]:
s[:] # Returns Entire string

'Hello World!'

In [16]:
s[::] # Returns Entire string, same as above

'Hello World!'

In [17]:
s[::2]

'HloWrd'

In the above example, the slice spans the entire string with a step of 2. The default start value is 0, so the indices produced are 0, 2, 4, 6, 8, and 10.

In [18]:
s[9:2]

''

In [19]:
s[9:2:-1]

'lroW ol'

#### Negative indexing
Python supports negative indexing. The index of the last character is -1, the last but one is -2, and so on.

In [20]:
s = "Hello World!"
s[-1]

'!'

In [21]:
s[-2]

'd'

_Slicing using negative indexing:_

In [22]:
s[-9:-3]

'lo Wor'

The default step value is 1, <br>
-9 + 1 ==> -8<br>
-8 + 1 ==> -7<br>
so the start value -9 moves towards the end value -3,<br>
-9 ==> -3, so s[-9:-3] is a valid slice. <br>
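
One way to read negative indices is that an index `-k` refers to the same position as `len(s) - k`. The sketch below checks that reading against the slice above; the name `n` is introduced only for illustration.

```python
s = "Hello World!"
n = len(s)                            # 12

# A negative index -k points at the same character as n - k.
print(s[-1] == s[n - 1])              # last character
print(s[-9:-3] == s[n - 9:n - 3])     # same slice as s[3:9]
print(s[-9:-3])
```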

In [23]:
s[-3: -10]

''

The above slice selects nothing, because<br>

the step is 1 by default.<br>
-3 + 1 ==> -2<br>
-2 + 1 ==> -1<br>
and so on.<br>
-3 <== -10<br>
Starting at -3 and moving forward never reaches -10, so no characters are selected.<br>
It returns '' (the empty string); no error is raised.<br>
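
To see why the direction of the step matters, the built-in `slice.indices()` method can translate a slice into concrete start, stop and step values for a string of a given length. This is a sketch for illustration; the names `fwd` and `bwd` are assumptions.

```python
s = "Hello World!"
n = len(s)

# Resolve the negative bounds into real positions for a string of length n.
fwd = slice(-3, -10).indices(n)        # step defaults to 1
bwd = slice(-3, -10, -1).indices(n)    # step is -1

# Walking forward from 9 never reaches 2, so no indices are produced.
print(fwd, list(range(*fwd)))
# Walking backward from 9 visits 9, 8, ..., 3.
print(bwd, list(range(*bwd)))
print(repr(s[-3:-10]), repr(s[-3:-10:-1]))
```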

Some more examples,

In [24]:
s[-3: -10:-1]

'lroW ol'

In [25]:
s[-4:-1:1]

'rld'

__Reversing a string__

In [26]:
s[::]

'Hello World!'

In [27]:
s[::-1]

'!dlroW olleH'

In [28]:
s

'Hello World!'

Slicing with a step of -1 is the idiomatic way to reverse a string in Python; strings have no built-in reverse() method. Alternatives such as ''.join(reversed(s)) exist but are more verbose.
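
For comparison, here is a short sketch of the slice-based reversal next to the `reversed()` built-in, plus one typical use. The helper `is_palindrome` is hypothetical and not part of the original notebook.

```python
s = "Hello World!"

by_slice = s[::-1]
by_reversed = ''.join(reversed(s))   # reversed() yields characters; join() rebuilds a string
print(by_slice == by_reversed)

def is_palindrome(text):
    """Hypothetical helper: True if text reads the same in both directions."""
    return text == text[::-1]

print(is_palindrome('level'), is_palindrome('Apple'))
```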

In [29]:
s[3::-1]

'lleH'

In [30]:
s[:3]

'Hel'

In [31]:
s[:3:-1]

'!dlroW o'

#### String functions
Strings provide many useful functions (methods); the most common ones are listed below.

In [32]:
s = "hello World! 123$"

__capitalize():__ Capitalizes the first character and makes the remaining characters lowercase

In [33]:
print(s.capitalize()) # no effect on non-alphabets

Hello world! 123$


__Note:__ String functions do not affect the original string; instead, they return a new, processed string.

__count():__ Counts the number of (non-overlapping) occurrences of a character or substring

In [34]:
s

'hello World! 123$'

In [35]:
s.count('l') # number of 'l's in the string

3

In [36]:
s.count('hell') # number of 'hell's in the string

1

__upper() and lower():__ changing case to upper and lower, no effect on numbers and other characters.

In [37]:
s.upper()

'HELLO WORLD! 123$'

In [38]:
s.lower()

'hello world! 123$'

In [39]:
s

'hello World! 123$'

__Validation functions__

In [40]:
s = 'hello World! 123$'

In [41]:
s.endswith("3$") # does s end with '3$'?

True

In [42]:
s.endswith("5$") # does s end with '5$'?

False

In [43]:
s.startswith("Apple") # does s start with 'Apple'?

False

In [44]:
s.startswith("hello") # does s start with 'hello'?

True

In [45]:
s = 'Apple123'
s.isalpha() # checks whether the string contains only alphabetic characters

False

In [46]:
s = 'Apple'
s.isalpha() # checks whether the string contains only alphabetic characters

True

In [47]:
s = "2314"
s.isdigit() # checks whether the string contains only digit characters

True
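
The validation functions look at every character, which has some consequences worth seeing: a minus sign or a decimal point makes `isdigit()` return False, and an empty string fails every check. The sample values below are assumptions chosen for illustration; `isalnum()` (letters or digits only) is a related built-in method.

```python
samples = ['2314', '-12', '3.14', 'Apple123', '']

# Collect (isdigit, isalpha, isalnum) for each sample.
results = {}
for text in samples:
    results[text] = (text.isdigit(), text.isalpha(), text.isalnum())
    print(repr(text), results[text])
```

For input such as '-12' or '3.14', converting with `int()` or `float()` inside a `try` block is usually a more reliable check than `isdigit()`.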

__replace():__ replaces all occurrences of a substring in the target string

In [48]:
s = 'Apple'
s.replace('p', '$')
print(s)

Apple


As discussed, the original string does not change; to keep the modified string, capture the return value, as below.

In [49]:
s = 'Apple'
s1 = s.replace('App', '$Tupp')
print(s1, s)

$Tupple Apple


__strip()__: Strips whitespace from both ends of the string. We can also pass a set of characters to strip instead; any of those characters are removed from both ends (the argument is not treated as a substring).
Below are the examples.

In [50]:
s = ' Apple '
print (len(s), s)
s = s.strip()
print (len(s), s)

7  Apple 
5 Apple


In [51]:
s = ' Apple'
print(len(s))
s = s.lstrip() # lstrip() works only on start of the string
print(len(s))

6
5


In [52]:
s = 'Apple '
print(len(s))
s = s.rstrip() # rstrip() works only on end of the string
print(len(s))

6
5


__Stripping custom characters__

In [53]:
s = '$$$Telangana'
s.strip('$')

'Telangana'

In [54]:
s

'$$$Telangana'

In [55]:
s = 'ApApTelangana'
s.strip('gaAn')

'pApTel'
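
The last example shows that the argument to strip() is treated as a set of characters, so their order does not matter. The sketch below contrasts that with removing an exact prefix, done here with `startswith()` and slicing. The names `prefix` and `without_prefix` are introduced only for illustration.

```python
s = 'ApApTelangana'

# The order of characters in the argument makes no difference.
print(s.strip('gaAn') == s.strip('Anag'))
print(s.strip('gaAn'))

# To remove an exact prefix, test for it and slice it off.
prefix = 'ApAp'
without_prefix = s[len(prefix):] if s.startswith(prefix) else s
print(without_prefix)
```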

__split():__ Splits a string into a list of words separated by whitespace. We can pass a custom separator if we want to.

In [56]:
date = '12/02/1984'
l = date.split('/') 
print(l, type(l))
print()
print(l[-1], type(l[-1]))

['12', '02', '1984'] <class 'list'>

1984 <class 'str'>


In [57]:
date = '12/02/1984'
l = date.split('/', 1) # splits one-time

print(l, type(l))
print()
print(l[-1], type(l[-1]))

['12', '02/1984'] <class 'list'>

02/1984 <class 'str'>
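
A few more split() behaviours, sketched under the assumption of the sample inputs shown: calling split() with no argument collapses runs of whitespace, an explicit ' ' separator does not, and the pieces are always strings that need converting before arithmetic.

```python
line = '  Once   upon a  time '

words = line.split()          # any run of whitespace is one separator
pieces = line.split(' ')      # every single space is a separator
print(words)
print(pieces)

date = '12/02/1984'
day, month, year = date.split('/')          # unpack the three parts
print(int(day), int(month), int(year))      # convert the strings to numbers
print(date.rsplit('/', 1))                  # split once, starting from the right
```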


__find():__ returns the lowest index at which the substring is found, or -1 if it is not found

In [58]:
s = '''Once upon a time in India, there was a king called Tippu.
India was a great country.'''

print(s.find('India'))
print(s.find('America'))

20
-1


__rfind():__ searches from the end and returns the highest index at which the substring is found

In [59]:
s.rfind('India') 

58

__index():__

In [60]:
s.index('India')

20

In [61]:
s.index('America')

ValueError: substring not found

__Note:__ The difference between find() and index() is that index() raises ValueError if the substring is not found, whereas find() returns -1.
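
The two styles lead to slightly different calling code, sketched below: find() needs a check against -1, index() needs a try/except, and the `in` operator is enough when only a yes/no answer is needed. The variable names are chosen for illustration.

```python
s = 'Once upon a time in India.'

# find(): check the return value against -1.
pos = s.find('America')
if pos == -1:
    print('not found (find)')

# index(): catch the ValueError.
try:
    s.index('America')
    found_by_index = True
except ValueError as exc:
    found_by_index = False
    print('not found (index):', exc)

# Membership test: no position, just True or False.
print('India' in s, 'America' in s)
```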

__Exercise:__ Guess the output

In [62]:
s = '''Once upon a time in India, there was a king called Tippu.
India was a great country.'''

print(s[s.find('great'):])

great country.


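__join():__ joins a list of strings into one string, using the string it is called on as the separator. List of chars to string: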

In [63]:
l = ['A', 'p', 'p', 'l', 'e']
print(''.join(l))

Apple


In [64]:
l = ['A', 'p', 'p', 'l', 'e']
print('|'.join(l))

A|p|p|l|e


In [65]:
s = 'Once upon a time in India.'
words = s.split()
print(words)

['Once', 'upon', 'a', 'time', 'in', 'India.']


In [66]:
' '.join(words)

'Once upon a time in India.'

In [67]:
emp_data = ['1234', 'John', '23400.0', 'Chicago']

print(','.join(emp_data))

1234,John,23400.0,Chicago
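
join() accepts only strings, so a list with numbers in it has to be converted first. The sketch below assumes a variant of the employee record above in which two fields are numeric.

```python
emp_data = [1234, 'John', 23400.0, 'Chicago']   # mixed types, unlike the list above

# Convert every field to str before joining; join() raises TypeError on non-strings.
row = ','.join(str(field) for field in emp_data)
print(row)

# split() reverses the operation, but every field comes back as a string.
print(row.split(','))
```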


__Program:__ Reverse every occurrence of the word 'India' in the below string.

In [68]:
s = '''Once upon a time in India, there was a king called Tippu. India was a great country.'''
word = 'India'

print(s.replace(word, word[::-1]))

Once upon a time in aidnI, there was a king called Tippu. aidnI was a great country.


__Program:__ Count all the vowels in the given string.

In [69]:
s = '''once upon a time in india, there was a king called tippu. india was a great country.'''

s.count('a')+ s.count('e') + s.count('i') + s.count('o') + s.count('u')

29
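
The same count can be written as a single pass over the string, which also makes it easy to handle capital letters. This is an alternative sketch, not the notebook's original approach; the names `vowels` and `total` are assumptions.

```python
s = '''once upon a time in india, there was a king called tippu. india was a great country.'''

vowels = 'aeiou'
# Lower-case first so that 'A', 'E', ... would be counted as well.
total = sum(1 for ch in s.lower() if ch in vowels)
print(total)
```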

#### Scalar multiplication

In [70]:
'Apple' * 5

'AppleAppleAppleAppleApple'

__Concatenating Strings__

In [71]:
'Apple' + 'Orange'

'AppleOrange'
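
Two practical notes on these operators, sketched below: repetition is handy for building separators, and concatenation works only between strings, so numbers must be converted with `str()` first. The names `line`, `count` and `label` are chosen for illustration.

```python
line = '-' * 20                # a separator made by repetition
print(line)
print('Apple' * 0 == '')       # repeating zero times gives the empty string

count = 5
label = 'Apples: ' + str(count)    # convert the number before concatenating
print(label)

# Concatenating a str and an int directly raises TypeError.
try:
    'Apples: ' + count
except TypeError as exc:
    print('TypeError:', exc)
```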

__Character encoding__

In Python 2, a prefix 'u' is required to write Unicode strings. Python 3 (3.3 and later) still accepts the prefix, but it has no effect.

In [72]:
s = u'Apple'
print(s)

Apple


In [73]:
import math

In [74]:
math.sin(90)

0.8939966636005579

In [75]:
from math import sin
sin(90)

0.8939966636005579

In [76]:
help(sin)

Help on built-in function sin in module math:

sin(...)
    sin(x)
    
    Return the sine of x (measured in radians).
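
As the help text says, sin() expects radians, so `sin(90)` above is the sine of 90 radians, not 90 degrees. A minimal sketch of the conversion, using the standard `math.radians()` function:

```python
import math

in_radians = math.sin(90)                  # 90 is interpreted as radians
in_degrees = math.sin(math.radians(90))    # convert 90 degrees to radians first

print(in_radians)
print(in_degrees)
```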



# Exercise Programs

1. Add a comma between the characters.
If the given word is 'Apple', it should become 'A,p,p,l,e'
3. Remove the given word from all the places it occurs in a string.

## Comprehension Quiz

<b> Think you’ve got it? Here’s a tiny quiz: </b>

1. What’s different about Python 2.x and Python 3.x in regards to string handling/Unicode/UTF-8?
2. What is the output when the following statement is executed?
```python
    print ('Tech'  'Beamers')
        a. Beamers
        b. Tech
        c. TechBeamers
        d. Tech Beamers 
```
3. What is the output when the following statement is executed?
```python
    print (R'Tech\nBeamers')
        a. Tech Beamers
        b. Tech\nBeamers
        c. 'RTech' then 'Beamers' in a New line
        d. 'Tech' then 'Beamers' in a New line 
```
4. Which of the following is the output of the below Python code?
```python
    str='Hello World'
    print (str.find('o'))
        a. 4
        b. 4, 7
        c. 7
        d. 2 
```
5. What is the output when the following code is executed?
```python
    str='Recurssion'
    print  (str.rfind('s'))
        a. 5
        b. 6
        c. 4
        d. 2 
```
<b> Answers are below </b>

1. In Python 3.x, strings are Unicode by default (and source files are read as UTF-8 by default), whereas in Python 2.x strings are ASCII by default and a prefix 'u' is required to write Unicode strings.
2. c (Note: adjacent string literals separated only by whitespace are concatenated.)
3. b (Note: 'R' marks a raw string; escape sequences are not interpreted, so they are printed as written.)
4. a (Note: find() returns the lowest index at which the substring is found.)
5. b (Note: rfind() returns the highest index at which the substring is found.)