# Strings

A string is a sequence of characters. 
You can access the characters one at a time with the bracket operator:

In [1]:
>>> fruit = 'banana'
>>> letter = fruit[1]

In [2]:
>>> print(letter)

a


In [3]:
>>> letter = fruit[0]
>>> print(letter)

b


So b is the 0th letter ("zero-eth") of "banana", a is the 1th letter ("one-eth"), and n is the 2th ("two-eth") letter.

In [4]:
>>> letter = fruit[1.5]

TypeError: string indices must be integers

### len
len is a built-in function that returns the number of characters in a string:

In [None]:
>>> fruit = 'apple'
>>> len(fruit)


In [7]:
>>> length = len(fruit)
>>> last = fruit[length]

IndexError: string index out of range

The reason for the IndexError is that there is no letter in 'apple' with the index 5. Since we started counting at zero, the five letters are numbered 0 to 4. To get the last character, you have to subtract 1 from length:

In [8]:
>>> last = fruit[length-1]
>>> print(last)

e


In [9]:
fruit[-1]

'e'

In [10]:
fruit[-5]

'a'

In [11]:
fruit[-2]

'l'
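The three cells above use negative indices, which count backward from the end of the string. As a small sketch of how they line up with the ordinary indices (the loop and the names `k`, `last`, and `first` are illustrative, not part of the original notebook):

```python
fruit = 'apple'
length = len(fruit)

# fruit[-k] refers to the same character as fruit[length - k]
for k in range(1, length + 1):
    assert fruit[-k] == fruit[length - k]
    print(-k, length - k, fruit[-k])

last = fruit[-1]         # same character as fruit[length - 1]
first = fruit[-length]   # same character as fruit[0]
```

Going one step further, for example `fruit[-6]` here, raises the same IndexError as `fruit[5]`.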

## Traversal through a string with a loop

A lot of computations involve processing a string one character at a time. Often they start at the beginning, select each character in turn, do something to it, and continue until the end. This pattern of processing is called a traversal. One way to write a traversal is with a while loop:

In [12]:
index = 0
while index < len(fruit):
    letter = fruit[index]
    print(letter)
    index = index + 1


a
p
p
l
e


This loop traverses the string and displays each letter on a line by itself. The loop condition is index < len(fruit), so when index is equal to the length of the string, the condition is false, and the body of the loop is not executed.
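The same pattern can run in the other direction. The following is a minimal sketch of a backward traversal, assuming we start at the last valid index and stop once the index drops below zero; the list `backward` is only there to record what was visited.

```python
fruit = 'apple'

# Start at the last valid index and walk toward index 0.
index = len(fruit) - 1
backward = []
while index >= 0:
    letter = fruit[index]
    print(letter)
    backward.append(letter)
    index = index - 1
```

Here the loop condition is `index >= 0`, so the body runs for indices 4 down to 0 and stops when `index` reaches -1.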

In [13]:
#another way:
for char in fruit:
    print(char)

a
p
p
l
e


In [18]:
>>> ''.join(reversed('apple'))

'elppa'

## String slices

In [20]:
>>> s = 'Monty Python'
>>> print(s[0:5])
>>> print(s[6:12])

Monty
Python


In [23]:
>>> fruit = 'banana'
>>> fruit[:3]



'ban'

In [24]:
>>> fruit[3:]

'ana'

### empty string

In [25]:
>>> fruit = 'banana'
>>> fruit[3:3]

''

An empty string contains no characters and has length 0, but other than that, it is the same as any other string.

In [26]:
fruit[:]

'banana'
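A slice `s[n:m]` runs from index n up to, but not including, index m, which is why `fruit[3:3]` selects nothing. The sketch below collects the edge cases from this section in one place; the variable names are illustrative.

```python
fruit = 'banana'

empty = fruit[3:3]       # start equals stop: nothing is selected
also_empty = fruit[4:2]  # start is past stop: still an empty string
whole = fruit[:]         # both bounds omitted: the entire string

print(len(empty))
print(empty == '')

# Splitting at any index and gluing the halves back gives the original.
print(fruit[:3] + fruit[3:] == fruit)

# Unlike indexing, a slice that runs past the end does not raise IndexError.
clipped = fruit[3:100]
```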

### Strings can't be changed

In [27]:
>>> greeting = 'Hello, world!'
>>> greeting[0] = 'J'

TypeError: 'str' object does not support item assignment

In [28]:
>>> greeting = 'Hello, world!'
>>> new_greeting = 'J' + greeting[1:]
>>> print(new_greeting)

Jello, world!
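The workaround above generalizes to any position: build a new string from the slice before the index, the new character, and the slice after it. `replace_at` is a hypothetical helper written for this illustration, not a built-in.

```python
def replace_at(text, index, new_char):
    """Return a copy of text with the character at index replaced."""
    return text[:index] + new_char + text[index + 1:]

greeting = 'Hello, world!'
new_greeting = replace_at(greeting, 0, 'J')

print(new_greeting)
print(greeting)   # the original string is unchanged
```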


## Looping and counting

In [29]:
# The following program counts the number of times the letter a appears in a string:
word = 'banana'
count = 0
for letter in word:
    if letter == 'a':
        count = count + 1
print(count)

3
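The counting loop can be wrapped in a function so that it works for any word and any letter. This is a sketch; the name `count_letter` is an assumption made for this example.

```python
def count_letter(word, letter):
    """Count how many times letter appears in word."""
    count = 0
    for char in word:
        if char == letter:
            count = count + 1
    return count

banana_a = count_letter('banana', 'a')
print(banana_a)

# The built-in string method count gives the same answer.
print('banana'.count('a'))
```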


## The in operator
The word in is a boolean operator that takes two strings and returns True if the first appears as a substring in the second:

In [31]:
>>> 'o' in 'poorvi'



True

In [32]:
>>> 'seed' in 'information'

False
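A few more cases may help pin down what "appears as a substring" means: the characters have to be contiguous, and the match is case-sensitive. The variable names below are illustrative.

```python
word = 'banana'

contiguous = 'nan' in word     # 'nan' occurs at indices 2 to 4
scattered = 'bnn' in word      # letters are present, but not next to each other
wrong_case = 'B' in word       # uppercase 'B' is not lowercase 'b'
empty_found = '' in word       # the empty string is found in every string

print(contiguous, scattered, wrong_case, empty_found)
```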

## String comparison

In [2]:
word=input('enter the fruit')
if word == 'Pineapple':
    print('All right, Pineapple.')

enter the fruitPineapple
All right, Pineapple.


In [3]:
word=input('enter the fruit')
if word < 'banana':
    print('Your word, ' + word + ', comes before banana.')
elif word > 'banana':
    print('Your word, ' + word + ', comes after banana.')
else:
    print('All right, bananas.')

enter the fruitPineapple
Your word, Pineapple, comes before banana.


Python does not handle uppercase and lowercase letters the same way that people do. All the uppercase letters come before all the lowercase letters, which is why Pineapple is reported as coming before banana in the cell above.
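A common way around this is to convert both strings to one case before comparing. The sketch below assumes lowercase as the common form; `compare_to_banana` is a hypothetical helper that returns a label instead of printing.

```python
def compare_to_banana(word):
    """Compare word with 'banana', ignoring upper/lower case."""
    normalized = word.lower()
    if normalized < 'banana':
        return 'before'
    elif normalized > 'banana':
        return 'after'
    return 'same'

# Without normalizing, 'P' sorts before 'b' because of its character code.
print(ord('P'), ord('b'))
print('Pineapple' < 'banana')

print(compare_to_banana('Pineapple'))
```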

## String methods

In [4]:
>>> stuff = 'Hello world'
>>> type(stuff)



str

In [5]:
>>> dir(stuff)

['__add__',
 '__class__',
 '__contains__',
 '__delattr__',
 '__dir__',
 '__doc__',
 '__eq__',
 '__format__',
 '__ge__',
 '__getattribute__',
 '__getitem__',
 '__getnewargs__',
 '__gt__',
 '__hash__',
 '__init__',
 '__init_subclass__',
 '__iter__',
 '__le__',
 '__len__',
 '__lt__',
 '__mod__',
 '__mul__',
 '__ne__',
 '__new__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__rmod__',
 '__rmul__',
 '__setattr__',
 '__sizeof__',
 '__str__',
 '__subclasshook__',
 'capitalize',
 'casefold',
 'center',
 'count',
 'encode',
 'endswith',
 'expandtabs',
 'find',
 'format',
 'format_map',
 'index',
 'isalnum',
 'isalpha',
 'isascii',
 'isdecimal',
 'isdigit',
 'isidentifier',
 'islower',
 'isnumeric',
 'isprintable',
 'isspace',
 'istitle',
 'isupper',
 'join',
 'ljust',
 'lower',
 'lstrip',
 'maketrans',
 'partition',
 'replace',
 'rfind',
 'rindex',
 'rjust',
 'rpartition',
 'rsplit',
 'rstrip',
 'split',
 'splitlines',
 'startswith',
 'strip',
 'swapcase',
 'title',
 'translate',
 'upper',
 'zfill']


In [6]:
>>> word = 'poorvi'
>>> new_word = word.upper()
>>> print(new_word)

POORVI


In [7]:
# there is a string method named find that searches for the position of one string within another
>>> word = 'banana'
>>> index = word.find('a')
>>> print(index)

1


In [8]:
# The find method can find substrings as well as characters
>>> word.find('na')

2
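Two further details of find are worth a quick sketch: it accepts an optional second argument giving the index where the search should start, and it returns -1 when nothing is found. The variable names are illustrative.

```python
word = 'banana'

first_a = word.find('a')                # search from the beginning
second_a = word.find('a', first_a + 1)  # resume just past the first match
missing = word.find('z')                # no match anywhere

print(first_a, second_a, missing)
```

Because -1 is also a valid index (the last character), it is safer to test the result before using it to index or slice.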

In [10]:
#task is to remove white space (spaces, tabs, or newlines) 
#from the beginning and end of a string using the strip method:
line = '  Here we go  '
line.strip()

'Here we go'

In [11]:
>>> line = 'Have a nice day'
>>> line.startswith('Have')

True

In [15]:
>>> line.lower()

'have a nice day'

In [12]:
>>> line.startswith('h')

False

In [14]:
>>> line.lower().startswith('h')

True

## Parsing strings

In [None]:
# 'From stephen.marquard@uct.ac.za Sat Jan  5 09:14:16 2008'
# we wanted to pull out only the second half of the address (i.e., uct.ac.za) from each line

In [16]:
>>> data = 'From stephen.marquard@uct.ac.za Sat Jan  5 09:14:16 2008'
>>> atpos = data.find('@')
>>> print(atpos)

21


In [17]:
#finding space:
>>> sppos = data.find(' ',atpos)
>>> print(sppos)

31


In [18]:
>>> host = data[atpos+1:sppos]
>>> print(host)

uct.ac.za
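The three steps above (find the at-sign, find the next space, slice between them) can be gathered into one function. This is a sketch rather than a robust email parser; `extract_host` is a hypothetical name, and returning None when there is no at-sign is an assumption about how a missing address should be handled.

```python
def extract_host(line):
    """Return the text between the first '@' and the following space."""
    atpos = line.find('@')
    if atpos == -1:
        return None              # no address on this line
    sppos = line.find(' ', atpos)
    if sppos == -1:
        sppos = len(line)        # address runs to the end of the line
    return line[atpos + 1:sppos]

data = 'From stephen.marquard@uct.ac.za Sat Jan  5 09:14:16 2008'
print(extract_host(data))
```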


## Format operator
The format operator, %, allows us to construct strings, replacing parts of the strings with the data stored in variables. When applied to integers, % is the modulus operator. But when the first operand is a string, % is the format operator.

For example, the format sequence "%d" means that the second operand should be formatted as an integer (d stands for "decimal"):

In [19]:
>>> camels = 42
>>> '%d' % camels

'42'

In [20]:
>>> camels = 42
>>> 'I have spotted %d camels.' % camels

'I have spotted 42 camels.'

In [21]:
# The following example uses "%d" to format an integer, "%g" to format a floating-point number (don't ask why),
# and "%s" to format a string:
>>> 'In %d years I have spotted %g %s.' % (3, 0.1, 'camels')

'In 3 years I have spotted 0.1 camels.'

In [22]:
>>> '%d' % 'dollars'

TypeError: %d format: a number is required, not str

In [23]:
>>> '%d %d %d' % (1, 2)

TypeError: not enough arguments for format string
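Both errors come from a mismatch between the format sequences and the values supplied: the number of elements in the tuple has to match the number of format sequences, and each value has to fit its sequence. The sketch below shows a matching call next to the two failing ones; the variable names are illustrative, and the exact wording of the error messages can differ between Python versions.

```python
# Three format sequences, three values: this works.
multi = '%d %d %d' % (1, 2, 3)
print(multi)

# Three format sequences, two values.
try:
    '%d %d %d' % (1, 2)
except TypeError as err:
    mismatch = str(err)
    print('TypeError:', mismatch)

# %d needs a number, not a string.
try:
    '%d' % 'dollars'
except TypeError as err:
    wrong_type = str(err)
    print('TypeError:', wrong_type)
```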

## Extra : how to reverse

### Option 1: List Slicing Trick

You can use Python's slicing syntax to create a reversed copy of a string:

In [19]:
>>> 'TURBO'[::-1]

'OBRUT'

### Option 2: reversed() and str.join()

In [16]:
>>> ''.join(reversed('TURBO'))

'OBRUT'

reversed() returns an iterator that yields the characters of the string in reverse order. This character stream needs to be combined into a string again with the str.join() method. This is slower than slicing, but arguably more readable.

In [26]:
>>> mylist = [1, 2, 3, 4, 5]
>>> mylist

[1, 2, 3, 4, 5]

In [27]:
# mylist[start:end:step]
>>> mylist[1:3]

[2, 3]

In [28]:
>>> mylist[::2]

[1, 3, 5]

In [29]:
>>> mylist[::-1]

[5, 4, 3, 2, 1]
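The start:end:step form shown on the list works the same way on strings, which is what makes the reversing trick from Option 1 possible. As a sketch that ties the two together (`is_palindrome` is a hypothetical helper added for illustration):

```python
word = 'TURBO'

middle = word[1:4]         # indices 1, 2, 3
every_other = word[::2]    # indices 0, 2, 4
reversed_word = word[::-1] # step of -1 walks from the end to the start

print(middle, every_other, reversed_word)

# Both reversing options give the same result.
print(reversed_word == ''.join(reversed(word)))

def is_palindrome(text):
    """Return True if text reads the same forward and backward."""
    return text == text[::-1]

print(is_palindrome('level'), is_palindrome(word))
```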