## Strings 

The built-in **str** class has many built-in methods for text processing.

Strings can be enclosed in single or double quotes, and string literals inside triple quotes can span multiple lines.

Python has no separate character type, so a single character is simply a string of length 1.

You can access the individual characters in a string with [n], where n ranges from 0 to the length of the string minus 1.

Python also supports negative indices: [-1] is the last character, [-2] is the next to last, and so on.

In [1]:
string1 = 'hello'
for i in range(len(string1)):
    print(i, string1[i])

0 h
1 e
2 l
3 l
4 o


In [2]:
print(string1[-1])
print(string1[-2])

o
l
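
As a small sketch of the index bounds described above (the variable name `word` is only for illustration), the largest valid index is one less than the length, and indexing at the length itself raises an IndexError:

```python
word = 'hello'

# Valid non-negative indices run from 0 to len(word) - 1.
last_index = len(word) - 1
print(word[last_index])          # the same character as word[-1]

# A negative index -k refers to the same character as index len(word) - k.
same_char = word[-2] == word[len(word) - 2]
print(same_char)

# Indexing at len(word) is out of range and raises IndexError.
try:
    word[len(word)]
    out_of_range = False
except IndexError:
    out_of_range = True
print(out_of_range)
```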


### slicing

string3[m:n] is a *slice* of string3 that runs from string3[m] up to and including string3[n-1]; the character at index n is not part of the slice.

In [3]:
string3 = 'The University of Texas at Dallas'
string3[18:23]

'Texas'

You can omit the first or last index of a slice.

In [4]:
print(string3[:4], '\t', string3[27:])

The  	 Dallas


For any string a and index n, a[:n] + a[n:] gives you the whole string back.

In [5]:
string3[:9] + string3[9:]

'The University of Texas at Dallas'
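
A short sketch of a few more slice behaviors, using an illustrative variable `text` with the same value as string3: the split-and-rejoin identity holds for every position, slice bounds past the end are clipped instead of raising an error, and negative indices work in slices as well.

```python
text = 'The University of Texas at Dallas'

# Splitting at any position n and rejoining gives back the original string.
always_whole = all(text[:n] + text[n:] == text for n in range(len(text) + 1))
print(always_whole)

# Slice bounds past the end of the string are clipped, not an error.
tail = text[27:1000]
print(tail)

# Negative indices work in slices too: this takes the last six characters.
last_six = text[-6:]
print(last_six)
```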

If you omit both indices, the slice covers the whole string.

In [6]:
string3_cp = string3[:]
print(string3_cp)

# is it a new object, or do both names refer to the same string?
print(hex(id(string3)), hex(id(string3_cp)))

The University of Texas at Dallas
0x103ecadf0 0x103ecadf0


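Both names refer to the same object. Strings are immutable, so CPython can return the original string for a full slice instead of making a copy.

Assigning a new value to string3 does not modify the existing string. It binds the name string3 to a different string object, while string3_cp still refers to the original.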


In [7]:
string3 = 'changed'
print(hex(id(string3)), hex(id(string3_cp)))

0x103edac00 0x103ecadf0
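
To make the immutability point concrete, here is a minimal sketch (the names `original` and `alias` are illustrative). Item assignment on a string fails, so "changing" a string always means building a new one and rebinding the name:

```python
original = 'hello'
alias = original                 # both names refer to the same string object

# Strings are immutable: item assignment raises TypeError.
try:
    original[0] = 'j'
    mutated = True
except TypeError:
    mutated = False
print(mutated)

# Build a new string and rebind the name instead.
original = 'j' + original[1:]
print(original, alias)           # alias still refers to the old string
print(original is alias)
```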


### concatenate

* use the + operator
* both operands must be strings, so convert other types with str() first

In [8]:
string2 = 'my favorite number is ' + str(3)
string2

'my favorite number is 3'
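
As a sketch of what happens without the conversion (the variable names are illustrative), adding an int directly to a string raises a TypeError:

```python
count = 3

# Mixing str and int with + raises TypeError.
try:
    message = 'my favorite number is ' + count
    needed_conversion = False
except TypeError:
    # Convert explicitly with str() before concatenating.
    message = 'my favorite number is ' + str(count)
    needed_conversion = True
print(message)
```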

### other string operators

* \* repetition
* in (substring test)
* % formatting

In [9]:
print('a' + 'b')
print('a' * 3)
print('a' in string3)
print("Format an int: %d, a float: %2.2f, a string: %s" % (5, 5.6, 'hi'))

ab
aaa
True
Format an int: 5, a float: 5.60, a string: hi


### string methods

There are dozens of built-in string methods. You can read the documentation:

https://docs.python.org/3/library/stdtypes.html#string-methods

Here are some that are commonly used:

* upper() and lower() to change case
* isalpha(), isdigit(), isspace()
* startswith() and endswith()
* strip() to remove whitespace from start and end
* split() to split into a list of strings
* join() to join a list of strings into one string
* find() - return the index of the first occurrence, or -1 if not found
* count() - count non-overlapping occurrences of a substring


### upper(), lower()

These methods return a new string; the original string is unchanged.

In [10]:
string_pp = 'Pied Piper'
print(string_pp.lower(), string_pp.upper())

pied piper PIED PIPER


### isalpha(), isdigit(), isspace()

These methods return a Boolean value.

In [11]:
string4 = 'number = 3'
print(string4[0].isalpha())
print(string4[-1].isdigit())
print(string4[6].isspace())

True
True
True


### startswith() and endswith()

These methods return a Boolean value.

In [12]:
string_hello = 'hello world'
print(string_hello.startswith('hello'))
print(string_hello.endswith('world'))

True
True


### strip()

Returns a string with the whitespace removed from both ends.

In [13]:
spacey = " hello "
not_spacey = spacey.strip()
len(not_spacey)

5
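
A brief sketch of the related methods lstrip() and rstrip(), plus the optional argument to strip() that names the characters to remove. The variable names here are illustrative:

```python
padded = '  hello  '

# lstrip() removes only leading whitespace, rstrip() only trailing.
left_stripped = padded.lstrip()
right_stripped = padded.rstrip()
print(repr(left_stripped))
print(repr(right_stripped))

# With an argument, strip() removes any of the given characters from both ends.
decorated = '***hello***'
undecorated = decorated.strip('*')
print(undecorated)
```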

### split()

Called on a string, returns a list of substrings.

By default it splits on whitespace, but you can specify a delimiter in the optional argument.

In [14]:
long_string = 'this is a lot of text in a string'
tokens = long_string.split()
for token in tokens:
    print(token)

this
is
a
lot
of
text
in
a
string
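
The optional delimiter can be sketched as follows. The sample strings are made up for illustration. Note that the default split collapses runs of whitespace, while an explicit ' ' delimiter does not:

```python
record = 'Ralph,3,3.7'

# Split on a comma instead of whitespace.
fields = record.split(',')
print(fields)

# A second argument limits the number of splits.
first, rest = record.split(',', 1)
print(first, rest)

# Default split() collapses repeated whitespace; split(' ') keeps empty strings.
messy = 'a  b'
default_tokens = messy.split()
space_tokens = messy.split(' ')
print(default_tokens)
print(space_tokens)
```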


### join()

Called on a separator string, takes a list of strings and returns them joined into a single string.

In [15]:
print(''.join(tokens))
print(' '.join(tokens))
print('*'.join(tokens))

thisisalotoftextinastring
this is a lot of text in a string
this*is*a*lot*of*text*in*a*string


### find()

Returns the index of the first occurrence of the substring, or -1 if it is not found.

In [16]:
string_utd = 'The University of Texas at Dallas'
i = string_utd.find('Dallas')
string_utd[i:]

'Dallas'
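
A sketch of the not-found case, using an illustrative variable `school`. Because -1 is also a valid negative index, it is worth checking the result before slicing with it:

```python
school = 'The University of Texas at Dallas'

# 'Austin' does not occur in the string, so find() returns -1.
index = school.find('Austin')
print(index)

# Check for -1 first: school[-1:] would silently return the last character.
if index == -1:
    result = 'not found'
else:
    result = school[index:]
print(result)
```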

### count()

Returns the number of non-overlapping occurrences of the substring as an integer.

In [17]:
count_a = string_utd.count('a')
count_a

4
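
Two details of count() that are easy to miss, shown in a small sketch with illustrative values: matches do not overlap, and the comparison is case-sensitive.

```python
# Occurrences are counted without overlap: 'aa' fits into 'aaaa' twice, not three times.
overlap_count = 'aaaa'.count('aa')
print(overlap_count)

school = 'The University of Texas at Dallas'

# count() is case-sensitive, so lowercase first to count both 't' and 'T'.
lower_t = school.count('t')
any_t = school.lower().count('t')
print(lower_t, any_t)
```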

### Practice

1. given a string, return the first and last characters joined into a new string; if the string is shorter than 2 characters, return the string itself
2. given a string, return the number of vowels 
3. given a string, return a string containing all found vowels in order
4. given a string, return a string containing 'aeiou' if all vowels found, 'ai' if only vowels a and i were found, etc.

## Write to a file

Writing to a file involves 3 steps:
* open the file
* write to the file
* close the file

All 3 are demonstrated below. Note that the write() method does not add a newline, so we have to include it ourselves.

In [18]:
f = open('temp.txt', 'w')
f.write('This is the first line\n')
f.write('This is another line\n')
f.close()

Let's read the file in and print each line to the screen.

f.read() reads the entire file into one string, and .splitlines() splits that string at each newline, dropping the newline characters in the process.

In [19]:
with open('temp.txt','r') as f:
    lines = f.read().splitlines()
for line in lines:
    print(line)

This is the first line
This is another line
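
The same write-then-read cycle can also be sketched with `with` blocks for both steps, which close the file automatically even if an error occurs. The file name `temp_example.txt` and its location in the system temp directory are assumptions made for this example:

```python
import os
import tempfile

# Hypothetical file path for this sketch, placed in the system temp directory.
path = os.path.join(tempfile.gettempdir(), 'temp_example.txt')
lines_to_write = ['This is the first line', 'This is another line']

# The with block closes the file when the block ends.
with open(path, 'w') as out_file:
    for line in lines_to_write:
        out_file.write(line + '\n')      # write() does not add the newline

# Iterating over a file object yields one line at a time, newline included.
with open(path, 'r') as in_file:
    lines_read = [line.rstrip('\n') for line in in_file]
print(lines_read)
```

Iterating line by line avoids loading the whole file into memory at once, which matters for large files.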


## Formatting output

Two common ways to format output are:
* the older way uses the % operator: '%d %s' % (1, 'text')
* the newer way uses the format() method: '{} {}'.format(1, 'text')

They produce the same result, as shown below.

In [20]:
print('%d %s' % (1, 'text'))
print('{} {}'.format(1, 'text'))

1 text
1 text


In [21]:
num = 3
gpa = 3.7
name = 'Ralph'
f = open('temp.txt', 'w')
f.write('Name: {:8} Favorite number is {:d}      GPA is {:.2f}'.format(name, num, gpa))
f.close()

# read back in
with open('temp.txt', 'r') as f:
    lines = f.read().splitlines()
for line in lines:
    print(line)

Name: Ralph    Favorite number is 3      GPA is 3.70
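
Python 3.6 and later also support f-strings, which accept the same format specifications as format() but place the expressions directly inside the braces. A minimal sketch, reusing the values from the cell above:

```python
name = 'Ralph'
gpa = 3.7

# The same format spec works in format() and in an f-string.
with_format = 'Name: {:8} GPA is {:.2f}'.format(name, gpa)
with_fstring = f'Name: {name:8} GPA is {gpa:.2f}'
print(with_fstring)
print(with_format == with_fstring)

# < and > choose left or right alignment within the field width.
left_aligned = f'[{name:<8}]'
right_aligned = f'[{name:>8}]'
print(left_aligned, right_aligned)
```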


Formatting is a lengthy and boring subject. When you need to know details, refer to the Python documentation or [this link](https://pyformat.info/).