# Strings

We already encountered strings when we talked about the **print** function. More generally, a string is another data type in Python: an ordered sequence of characters, written by enclosing text in single, double, or triple quotes.

Since text and data wrangling is an important application area of Python, the language offers many built-in functions (string methods) for string processing.

In [None]:
String0 = 'Germany has good beer'
String1 = "Germany has good beer"
String2 = '''Germany
has
good beer'''

In [None]:
print(String0 , type(String0))
print(String1, type(String1))
print(String2, type(String2))

String indexing and string slicing work much like they do for lists.

In [None]:
print(String0[4])
print(String0[4:])
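
As a rough sketch of how list-style indexing carries over to strings, the cell below uses negative indices and a step on the same sentence. The variable name `text` is only illustrative. One difference from lists is that strings are immutable, so assigning to an index raises a TypeError.

```python
text = 'Germany has good beer'  # same sentence as String0 above

last_char = text[-1]      # negative indices count from the end
first_word = text[:7]     # slice up to, but not including, index 7
every_other = text[::2]   # a step of 2 takes every second character

print(last_char, first_word, every_other)

# Unlike lists, strings cannot be modified in place.
try:
    text[0] = 'g'
    assignment_failed = False
except TypeError:
    assignment_failed = True
print(assignment_failed)
```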

### Built-in Functions

The **find( )** function returns the index at which the given substring first occurs in the string. If the substring is not found, it returns **-1**. Remember not to confuse the returned -1 with a reverse indexing value!

In [None]:
print(String0.find('od'))
print(String0.find('yt'))
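
The pitfall with the returned -1 can be sketched as follows. This is a minimal illustration with a hypothetical variable `pos`: using the result directly as an index silently returns the last character, so it is safer to test for -1 first.

```python
text = 'Germany has good beer'  # same sentence as String0 above

pos = text.find('yt')   # 'yt' does not occur, so pos is -1
print(pos)

# Pitfall: -1 is also a valid index (the last character).
print(text[pos])        # prints 'r', which has nothing to do with 'yt'

# Safer: test the result explicitly before using it as an index.
if pos != -1:
    print('found at', pos)
else:
    print('not found')

# If only a yes/no answer is needed, the `in` operator is enough.
contains_od = 'od' in text
print(contains_od)
```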

The index value returned is the index of the first character of the match in the string.

In [None]:
print(String0[14])

**find( )** also accepts optional start and stop index values that restrict the search to that part of the string.

In [None]:
print(String0.find('y',1))
print(String0.find('y',6,8))

**capitalize( )** capitalizes the first character of the string and converts all remaining characters to lowercase.

In [None]:
String3 = 'observe the first letter in this sentence.'
print(String3.capitalize())
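
A small sketch of what happens to the rest of the string, using a hypothetical mixed-case sample: besides raising the first character, capitalize() lowers everything after it.

```python
mixed = 'hELLO wORLD'             # hypothetical sample string
capitalized = mixed.capitalize()  # first char upper, the rest lower
print(capitalized)                # prints: Hello world
```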

**center( )** is used to center align the string by specifying the field width.

In [None]:
String0.center(70)

The padding on both sides can also be filled with any other single character instead of spaces.

In [None]:
String0.center(70,'-')

**zfill( )** pads the string with zeros on the left until it reaches the specified field width.

In [None]:
String0.zfill(30)

**expandtabs( )** allows you to change the spacing of the tab character. Each '\t' is replaced by spaces up to the next tab stop, and by default the tab stops are 8 columns apart.

In [None]:
s = 'h\te\tl\tl\to'
print(s)
print(s.expandtabs(1))
print(s.expandtabs())

**index( )** works the same way as the **find( )** function, with the difference that it raises a ValueError when the input is not found in the string (rather than returning -1 as **find( )** does).

In [None]:
print(String0.index('Germany'))
print(String0.index('Germany',0))
print(String0.index('Germany',10,20))  # raises ValueError: 'Germany' does not occur in String0[10:20]
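
When a failed lookup should not stop the program, the ValueError can be caught. The cell below is a minimal sketch with a hypothetical variable `pos`, shown next to the equivalent **find( )** call for comparison.

```python
text = 'Germany has good beer'  # same sentence as String0 above

# index() raises ValueError when the substring is not in the searched range.
try:
    pos = text.index('Germany', 10, 20)
except ValueError:
    pos = None  # illustrative choice for "not found"
print(pos)

# find() reports the same situation with -1 instead of an exception.
print(text.find('Germany', 10, 20))
```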

**endswith( )** checks whether the string ends with the char or substring given as input.

In [None]:
print(String0.endswith('y'))

The start and stop index values can also be specified.

In [None]:
print(String0.endswith('r',0))
print(String0.endswith('G',0,5))

**count( )** counts the non-overlapping occurrences of the given char or substring in the string. The start and stop index can also be specified or left out. (These are optional arguments with default values, which will be dealt with in the functions chapter later.)

In [None]:
print(String0.count('e',0))
print(String0.count('e',5,10))
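
As a brief sketch, count() also accepts substrings longer than one char, and matches are counted without overlap. The variable names below are illustrative.

```python
text = 'Germany has good beer'  # same sentence as String0 above

total_e = text.count('e')    # single character
double_e = text.count('ee')  # substring of two characters
print(total_e, double_e)

# Occurrences are counted without overlap: 'aaaa' holds two 'aa', not three.
non_overlapping = 'aaaa'.count('aa')
print(non_overlapping)
```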

**join( )** inserts the string it is called on between the elements of its input.

In [None]:
'a'.join('*_-')

Here '*_-' is the input string, and the char 'a' is added in between each pair of its elements.

**join( )** function can also be used to convert a list into a string.

In [None]:
a = list(String0)
print(a)
b = ''.join(a)
print(b)

While converting the list into a string, **join( )** can insert any char in between the list elements. The slice [18:] below keeps only the tail of the joined string.

In [None]:
c = '/'.join(a)[18:]
print(c)

**split( )** converts a string back into a list by cutting it at the given separator. Think of it as the opposite of the **join( )** function.

In [None]:
d = c.split('/')
print(d)

In the **split( )** function one can also specify the maximum number of times the string should be split. The returned list then contains at most one more element than the specified number, because each split adds one element.

In [None]:
e = c.split('/',3)
print(e)
print(len(e))
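
A short sketch of how the separator argument changes the result, assuming a hypothetical sentence with a double space: without an argument, split() cuts at any run of white space, while an explicit ' ' produces an empty string between two consecutive spaces.

```python
sentence = 'Germany  has good beer'  # note the double space

default_split = sentence.split()     # cuts at runs of white space
space_split = sentence.split(' ')    # cuts at every single space
print(default_split)
print(space_split)

# maxsplit can also be passed by keyword; the rest stays in one piece.
limited = sentence.split(maxsplit=1)
print(limited)
```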

**lower( )** converts any capital letter to lowercase. This comes in handy for text processing.

In [None]:
print(String0)
print(String0.lower())

**upper( )** converts any lowercase letter to a capital letter.

In [None]:
String0.upper()

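**replace( )** returns a copy of the string in which every occurrence of the first argument is replaced by the second argument. The original string is left unchanged.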

In [None]:
String0.replace('Germany','Belgium')
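
Because strings are immutable, the result of replace() has to be stored if it is needed later. The sketch below uses illustrative variable names and also shows the optional third argument, which limits the number of replacements.

```python
text = 'Germany has good beer'  # same sentence as String0 above

swapped = text.replace('Germany', 'Belgium')
print(swapped)
print(text)  # the original string is unchanged

# An optional count limits how many occurrences are replaced.
first_only = text.replace('e', 'E', 1)
print(first_only)
```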

**strip( )** deletes unwanted characters from the left end and the right end of the string.

In [None]:
f = '    hello      '

If no char is specified, it deletes all the white space present at the left and right ends of the string.

In [None]:
f.strip()

If you specify one or more characters for the **strip( )** function, it deletes those characters (if they are present) at both ends.

In [None]:
f = '   ***----hello---*******     '

In [None]:
f.strip('*')

We asked the function to delete the asterisk, but it was not deleted. This is because there are white space characters at both the left and right ends of the string, and **strip( )** stops at the first character that is not in its argument. The argument is treated as a set of characters, so their order does not matter; every character that should be removed from the ends, including the white space, has to be listed.

In [None]:
print(f.strip(' *'))
print(f.strip(' *-'))
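
A quick check, as a sketch with illustrative variable names, that the argument of strip() behaves like a set of characters: the order in which they are listed makes no difference, and characters in the middle of the string are never touched.

```python
padded = '   ***----hello---*******     '  # same value as f above

one_order = padded.strip(' *-')
other_order = padded.strip('-* ')
print(one_order == other_order)

# Only the ends are stripped; an 'x' inside the string survives.
inner = 'xxhexllxx'.strip('x')  # hypothetical sample string
print(inner)
```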

The **lstrip( )** and **rstrip( )** functions have the same functionality as **strip( )**, with the difference that **lstrip( )** deletes only from the left side and **rstrip( )** only from the right side.

In [None]:
print(f.lstrip(' *'))
print(f.rstrip(' *'))

## Dictionaries

Dictionaries are used somewhat like a small database, because each value is looked up by a user-defined key (often a string) rather than by its position.

Dictionaries are often used in structured data.

To define an empty dictionary, assign { } or dict() to a variable.

In [None]:
d0 = {}
d1 = dict()
print(type(d0), type(d1))

A dictionary works somewhat like a list but has the added advantage that you choose the index yourself (the "key" in Python lingo; internally, Python hashes the key to locate the value).

In [None]:
d0['One'] = 1
d0['OneTwo'] = 12
print(d0)

That is what a dictionary looks like. Now you can access the value 1 through its key 'One'.

In [None]:
print(d0['One'])

Two lists which are related can be merged to form a dictionary.

In [None]:
names = ['One', 'Two', 'Three', 'Four', 'Five']
numbers = [1, 2, 3, 4, 5]

In this case, we can use the **zip( )** function to combine the two lists into one.

In [None]:
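d2 = zip(names,numbers)
print(d2)
print(list(zip(names,numbers)))

In Python 3, **zip( )** returns a lazy iterator, so printing d2 only shows a zip object. Converting a zip to a list shows its contents: the two lists are combined into a single list of (name, number) tuples. A fresh zip is used for the list here because an iterator can be consumed only once and d2 is still needed.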

To convert the above lists into a dictionary, we can use the **dict( )** function.

In [None]:
a1 = dict(d2)
print(a1)
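
One thing to keep in mind, assuming Python 3: a zip object can be consumed only once. The sketch below, with illustrative variable names, shows that a second dict() call on the same zip object yields an empty dictionary.

```python
names = ['One', 'Two', 'Three', 'Four', 'Five']
numbers = [1, 2, 3, 4, 5]

pairs = zip(names, numbers)
first = dict(pairs)   # consumes the iterator
second = dict(pairs)  # nothing is left, so this is empty
print(first)
print(second)

# To inspect the pairs themselves, build a list from a fresh zip.
as_list = list(zip(names, numbers))
print(as_list[0])
```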

### Built-in Functions

**clear( )** removes all entries from the dictionary, leaving it empty.

In [None]:
a1.clear()
print(a1)

A dictionary can also be built using loops (see later for how loops work...).

In [None]:
for i in range(len(names)):
    a1[names[i]] = numbers[i]
print(a1)

**values( )** returns all the values stored in the dictionary. In Python 3 the result is a view object, which can be turned into a list with list( ).

In [None]:
a1.values()

**keys( )** returns all the keys of the dictionary, that is, the indices under which the values were stored.

In [None]:
a1.keys()

**items( )** returns the contents of the dictionary as (key, value) tuples, one tuple per element. These are the same pairs that were obtained when the zip function was used.

In [None]:
a1.items()
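
In Python 3, keys(), values() and items() return views that follow later changes to the dictionary. The cell below is a minimal sketch, using a shortened, hypothetical version of a1 and illustrative variable names. It also shows the common pattern of looping over items().

```python
a1 = {'One': 1, 'Two': 2, 'Three': 3}  # shortened example dictionary

keys_view = a1.keys()
values_view = a1.values()

a1['Four'] = 4  # the views reflect this later change
print(list(keys_view))
print(list(values_view))

# items() yields (key, value) tuples that can be unpacked in a loop.
pairs = []
for name, number in a1.items():
    pairs.append(f'{name}={number}')
print(', '.join(pairs))
```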

**pop( )** removes the element with the given key and returns its value. The returned value can also be assigned to a new variable. Remember that only the value is returned and not the key.

In [None]:
a2 = a1.pop('Four')
print(a1)
print(a2)
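
If the key might be missing, pop() raises a KeyError unless a default value is supplied as a second argument. A minimal sketch, with a hypothetical key 'Six' and illustrative variable names:

```python
a1 = {'One': 1, 'Two': 2, 'Three': 3, 'Four': 4, 'Five': 5}

removed = a1.pop('Four')        # key exists: its value is returned
fallback = a1.pop('Six', None)  # key missing: the default is returned
print(removed, fallback)

# Without a default, a missing key raises KeyError.
try:
    a1.pop('Six')
    raised = False
except KeyError:
    raised = True
print(raised)
```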