## Strings

<img src="https://upload.wikimedia.org/wikipedia/commons/thumb/b/b4/0321_DNA_Macrostructure.jpg/300px-0321_DNA_Macrostructure.jpg" width="33%">

A [*string*](https://en.wikipedia.org/wiki/String_%28computer_science%29) is a sequence of characters. We might commonly call it *text*.

A *string literal* is a sequence of characters written directly in the source code between matching single quotes, double quotes, or triple quotes (which allow the text to span several lines).

Let's look at some things we can do with strings in Python.

### Valid strings

In [None]:
# Valid strings

print('Single quoted: NYU')
print()

print("Double quoted: NYU")
print()

print('''Triple quoted: NYU has established the Center for
Social Media and Politics, which will examine the production,
flow, and impact of social media content in the political sphere,
as well as support research that uses social media data to study politics.

Photo credit: adamkaz/Getty Images
Will Focus on Production, Flow, and Impact of Content—and Methods
to Use Social Media Data to Study Politics''')

someString = '''Triple quoted: NYU has established the Center for
Social Media and Politics, which will examine the production,
flow, and impact of social media content in the political sphere,
as well as support research that uses social media data to study politics.

Photo credit: adamkaz/Getty Images
Will Focus on Production, Flow, and Impact of Content—and Methods
to Use Social Media Data to Study Politics'''
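Because either quote character can delimit a literal, one kind can appear inside the other without escaping. The following is a small sketch of that idea; the variable names are made up for illustration.

```python
# All three quoting styles produce the same kind of object: str.
single = 'NYU'
double = "NYU"
triple = '''NYU'''
print(single == double == triple)   # True

# Use one quote style to embed the other without a backslash.
apostrophe = "It's a string"
quoted = 'She said "hello"'
print(apostrophe)
print(quoted)

# Triple-quoted literals keep their line breaks as newline characters.
two_lines = '''first
second'''
print(two_lines.count('\n'))        # 1
```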


### Strings as a Sequence

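We can "get at" a particular character in a string using the square-bracket `[]` operator. Recall that programmers like to start counting at 0, so the last character of a string of length `n` is at index `n - 1`. Negative indices count backward from the end, with `-1` as the last character.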

In [3]:
# Index positions in "Hello New York!" (15 characters):
#
#   H   e   l   l   o       N   e   w       Y   o   r   k   !
#   0   1   2   3   4   5   6   7   8   9  10  11  12  13  14
# -15 -14 -13 -12 -11 -10  -9  -8  -7  -6  -5  -4  -3  -2  -1

my_string = "Hello New York!"

print('my_string =', my_string)
print('my_string[0] =', my_string[0])
print('my_string[14] =', my_string[14])
print('my_string[-1] =', my_string[-1])
print('my_string[-15] =', my_string[-15])

my_string = Hello New York!
my_string[0] = H
my_string[14] = !
my_string[-1] = !
my_string[-15] = H


What happens if we use an invalid index?

In [4]:
print('my_string[15]=', my_string[15])

IndexError: string index out of range

What will be the result of running the following code?

a) It will print 'H'  
b) It will print 'e'  
c) It will print 'Hello New York!'  
d) We will get an `IndexError` message

In [5]:
print(my_string[-16])

IndexError: string index out of range
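The two failing examples above mark the edges of the valid range: for a string of length `n`, indices from `-n` through `n - 1` work, and `s[-k]` is the same character as `s[n - k]`. Here is a minimal sketch of that relationship. `char_at` is a hypothetical helper written for this illustration, not a built-in.

```python
my_string = "Hello New York!"
n = len(my_string)            # 15

# A negative index -k refers to the same character as n - k.
for k in (1, 5, n):
    assert my_string[-k] == my_string[n - k]

def char_at(s, i):
    """Return s[i], or None when i is out of range (hypothetical helper)."""
    if -len(s) <= i < len(s):
        return s[i]
    return None

print(char_at(my_string, 14))   # !
print(char_at(my_string, 15))   # None
print(char_at(my_string, -16))  # None
```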

### Special Characters

"Special" characters, such as newline and tab, can't be typed directly inside an ordinary string literal. They are written as *escape sequences*: a backslash followed by a short code for the character.

In [None]:
# Non-printing characters

# Some characters perform necessary operations but show up as whitespace in
# the output

# Newline - \n
# Tab -     \t

print("this is the first line\nthis is the second line.\n\t a tabbed\tline.")
print("aaaa\fform feed")
print("abcdefghijklmnopqrstuvwxyz\rcarriage return")
print("abcdef\v\vghijkl\t\123\xF2\a")

#\<newline>   Backslash at the end of a line: both it and the newline are ignored
#\\ Backslash (\)
#\' Single quote (')
#\" Double quote (")
#\a ASCII Bell (BEL)
#\b ASCII Backspace (BS)
#\f ASCII Formfeed (FF)
#\n ASCII Linefeed (LF)
#\r ASCII Carriage Return (CR)
#\t ASCII Horizontal Tab (TAB)
#\v ASCII Vertical Tab (VT)
#\ooo   ASCII character with octal value ooo
#\xhh   Character with hex value hh (exactly two hex digits)
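An escape sequence takes several keystrokes to type but stands for a single character in the string. As a quick sketch, `len()` and `repr()` make this visible, and a raw string literal (prefix `r`) turns escape processing off:

```python
tabbed = "a\tb"
print(len(tabbed))      # 3: 'a', one tab character, 'b'
print(repr(tabbed))     # 'a\tb'  (repr shows the escape instead of the tab)

# In a raw string the backslash is an ordinary character.
raw = r"a\tb"
print(len(raw))         # 4: 'a', backslash, 't', 'b'
print(raw)              # a\tb

# Hex and octal escapes name a character by its code: 0x41 == 0o101 == 65.
print("\x41", "\101")   # A A
```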


### Strings Are Immutable

That means they can't be changed! (But we can create a new string from an old one, as we will see.)

Let's try to change a string:

s = "Fazzlebop"
s[0] = "D"  # raises TypeError: 'str' object does not support item assignment
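The assignment fails, but a new string can be built from pieces of the old one. This is a minimal sketch using concatenation and a slice (slicing is covered later in this section); the name `new_s` is only for illustration.

```python
s = "Fazzlebop"

try:
    s[0] = "D"                 # item assignment is not allowed on str
except TypeError as err:
    print("TypeError:", err)

# Build a new string instead: "D" plus everything after index 0.
new_s = "D" + s[1:]
print(new_s)  # Dazzlebop
print(s)      # Fazzlebop (the original is unchanged)
```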

### Traversing a string

"To traverse" is to pass through.

We can use a `for` loop to traverse a string:

In [7]:
s = "Goodbye mama and papa"
for c in s:
    print(c)

G
o
o
d
b
y
e
 
m
a
m
a
 
a
n
d
 
p
a
p
a
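When the position of each character matters as well, the built-in `enumerate()` yields index and character together. The sketch below also uses a traversal to count vowels; the counting rule (only `aeiou`, either case) is an assumption chosen for the example.

```python
s = "Goodbye mama and papa"

# enumerate() pairs each character with its index.
for i, c in enumerate(s[:7]):
    print(i, c)

# Traversal with an accumulator: count the vowels.
vowel_count = 0
for c in s:
    if c in "aeiouAEIOU":
        vowel_count += 1
print(vowel_count)  # 8
```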


### Transposing a string

Here we use "to transpose" loosely to mean reversing the order of the elements of a sequence.

How can we transpose a string?

We're going to use the Python built-in function `len()`, which returns the length of a sequence.

In [12]:
s = "The transitive nightfall of diamonds"
for i in range(-1, -len(s) - 1, -1):
    print(s[i], end="")

sdnomaid fo llafthgin evitisnart ehT
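The index loop above is one way to reverse a string. As a sketch of some alternatives, the same result can be built by prepending characters, with a slice whose step is -1, or with the built-in `reversed()` joined back into a string:

```python
s = "The transitive nightfall of diamonds"

# Prepend each character to an accumulator string.
reversed_loop = ""
for c in s:
    reversed_loop = c + reversed_loop

# A slice with step -1 walks the string backward.
reversed_slice = s[::-1]

# reversed() yields the characters last to first; join() reassembles them.
reversed_join = "".join(reversed(s))

print(reversed_slice)
print(reversed_loop == reversed_slice == reversed_join)  # True
```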

### String Slicing

In [None]:
my_string = "Gregor Samsa awoke in his bed one morning..."
print(my_string)

print("my_string[6:11] =", my_string[6:11]) # start at 6, stop before 11

print("my_string[6:] =", my_string[6:]) # start at 6, go to the end

print("my_string[:6] =", my_string[:6]) # from the start, stop before 6

# start at 3, stop before the 2nd-to-last character (negative end index)
print("my_string[3:-2] =", my_string[3:-2])

print("my_string[::2] =", my_string[::2]) # every other char

print("my_string[::-1] =", my_string[::-1]) # whole string, reversed

# every other char, going backward from the 2nd-to-last character
print("my_string[-2::-2] =", my_string[-2::-2])

input("Hit enter to continue.")
print()
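A slice has the form `s[start:stop:step]`, and the character at `stop` is never included. Here is a sketch of the same slices on a shorter string (chosen only so the results are easy to check by hand), with the result of each shown as a comment:

```python
name = "Gregor Samsa"
#       G r e g o r   S a m s a
#       0 1 2 3 4 5 6 7 8 9 ...   (last index is 11)

print(name[0:6])     # Gregor
print(name[7:])      # Samsa
print(name[:3])      # Gre
print(name[3:-2])    # gor Sam
print(name[::2])     # Geo as
print(name[::-1])    # asmaS rogerG
print(name[-2::-2])  # sa oeG

# Unlike indexing, slicing never raises IndexError:
# out-of-range bounds are clamped to the ends of the string.
print(name[5:100])   # r Samsa
```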

In [4]:
my_string = "Gregor Samsa awoke in his bed one morning..."

alias_of_string = my_string
print("id of my_string:      ", id(my_string))
print("id of alias_of_string:", id(alias_of_string))

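# A full slice of an immutable string does not have to create a new object.
# CPython returns the original, which is why all three ids match.
copy_of_string = my_string[:]
print("id of copy_of_string: ", id(copy_of_string))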

# Strings are iterable
for char in my_string:
    print(char, end='..')

id of my_string:       140333676901168
id of alias_of_string: 140333676901168
id of copy_of_string:  140333676901168
G..r..e..g..o..r.. ..S..a..m..s..a.. ..a..w..o..k..e.. ..i..n.. ..h..i..s.. ..b..e..d.. ..o..n..e.. ..m..o..r..n..i..n..g...........
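Comparing `id()` values by eye is the same test the `is` operator performs, while `==` compares the characters. A short sketch of the difference follows. Whether the full slice is the very same object is an implementation detail, so that result is printed without a stated expectation.

```python
my_string = "Gregor Samsa awoke in his bed one morning..."
alias_of_string = my_string
copy_of_string = my_string[:]

# An alias is, by definition, the same object.
print(alias_of_string is my_string)   # True

# Equality compares contents, which is what usually matters for strings.
print(copy_of_string == my_string)    # True

# Identity of the slice depends on the interpreter; don't rely on it.
print(copy_of_string is my_string)
```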

### String Methods

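Methods are like functions (which we have already used: `math.sqrt` is a function).

The difference is that a method is called on the object it operates on, using dot notation: `my_string.upper()`. Because strings are immutable, string methods return a new string and leave the original unchanged.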



In [6]:
my_string = 'Python is great!'
new_string = my_string.upper()
print("New: ", new_string)
print("My: ", my_string)

New:  PYTHON IS GREAT!
My:  Python is great!


`find()` locates strings within a string.

An example with a single argument ("t") -- we get the *first* instance:

In [7]:
t_index = my_string.find("t")  
print(t_index)

2


`find()` returns -1 if the argument is not found.

In [8]:
z_index = my_string.find('z')
print(z_index)

-1


We can chain methods:

In [9]:
print(my_string.upper().find('S'))

8


The next example finds the first 'o', starting at index 0:

In [10]:
o_index = my_string.find('o')
print(o_index)

4


This example searches for the next 'o', starting just after the first one. There is no second 'o' in the string, so `find()` returns -1.


In [11]:
print(my_string.find('o', o_index + 1)) 

-1
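Repeating the "search again after the last hit" step in a loop collects every occurrence. This is a minimal sketch; `positions` and `target` are illustrative names.

```python
my_string = 'Python is great!'
target = 't'

positions = []
start = my_string.find(target)          # first occurrence, or -1
while start != -1:
    positions.append(start)
    # Resume the search one position past the previous match.
    start = my_string.find(target, start + 1)

print(positions)  # [2, 14]
```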


### More String Methods

`count()` returns the number of occurrences of a substring in the given string.

In [21]:
quote = "  Mr Leopold Bloom ate with relish the inner organs " \
        + "of beasts and fowls. He liked thick giblet soup, " \
        + "nutty gizzards, a stuffed roast heart, liverslices " \
        + "fried with crustcrumbs, fried hencods' roes...\t"
print(quote.count(" it "))

0


`find()` returns the index of first occurrence of the
substring (if found). If not found, it returns -1.

In [14]:
print(quote.find("it"))  

24


`index()` returns the index of the first occurrence of a substring inside the string
(if found). If the substring is not found, it raises a `ValueError` exception.

In [15]:
print(quote.index("it"))

24
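The two methods differ only when the substring is missing. As a sketch, using a shortened version of the quote and a search term picked because it does not occur:

```python
quote = "Mr Leopold Bloom ate with relish"

# find() signals "not found" with -1 ...
print(quote.find("zebra"))     # -1

# ... while index() raises ValueError, which can be caught.
try:
    quote.index("zebra")
except ValueError as err:
    print("ValueError:", err)

# When only a yes/no answer is needed, the `in` operator is simpler.
print("relish" in quote)       # True
```

Checking the return value of `find()` against -1 matters: used directly as an index, -1 would silently refer to the last character.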


`isalnum()` returns `True` if all characters in the string are
alphanumeric (letters or digits) and the string is not empty. If not, it returns `False`.

In [16]:
print(quote.isalnum())

False


`isalpha()` returns `True` if all characters in the string are alphabetical. If not, it returns `False`.

In [17]:
s = "Letters"
print(s.isalpha())                             

True


`isdigit()` returns `True` if all characters in a string are digits. If not, it returns `False`.

In [19]:
digits = "012-34-56789"
print(digits.isdigit()) 

False
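These tests look at every character, so a single space or hyphen makes them return `False`. Below is a sketch of a few edge cases, including removing the separators with `replace()` before testing (the sample values are made up):

```python
digits = "012-34-56789"
print(digits.isdigit())                     # False: hyphens are not digits
print(digits.replace("-", "").isdigit())    # True

print("abc123".isalnum())    # True
print("abc 123".isalnum())   # False: the space is not alphanumeric
print("abc123".isalpha())    # False: contains digits

# An empty string fails all three tests.
print("".isdigit())          # False
```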


`lower()` converts all uppercase characters in a string into lowercase characters in the return string.

In [20]:
s = quote.lower()
print(s)
print(quote)

  mr leopold bloom ate with relish the inner organs of beasts and fowls. he liked thick giblet soup, nutty gizzards, a stuffed roast heart, liverslices fried with crustcrumbs, fried hencods' roes...

  Mr Leopold Bloom ate with relish the inner organs of beasts and fowls. He liked thick giblet soup, nutty gizzards, a stuffed roast heart, liverslices fried with crustcrumbs, fried hencods' roes...



`strip()` returns a copy of the string passed with both leading and trailing whitespace characters removed.

In [22]:
s = quote.strip()
print(s)
print(quote)

Mr Leopold Bloom ate with relish the inner organs of beasts and fowls. He liked thick giblet soup, nutty gizzards, a stuffed roast heart, liverslices fried with crustcrumbs, fried hencods' roes...
  Mr Leopold Bloom ate with relish the inner organs of beasts and fowls. He liked thick giblet soup, nutty gizzards, a stuffed roast heart, liverslices fried with crustcrumbs, fried hencods' roes...	


`upper()` returns a new string in which all lowercase characters of the original are converted to uppercase. The original string is unchanged.

In [23]:
s = quote.upper()
print(s)
print(quote)

  MR LEOPOLD BLOOM ATE WITH RELISH THE INNER ORGANS OF BEASTS AND FOWLS. HE LIKED THICK GIBLET SOUP, NUTTY GIZZARDS, A STUFFED ROAST HEART, LIVERSLICES FRIED WITH CRUSTCRUMBS, FRIED HENCODS' ROES...	
  Mr Leopold Bloom ate with relish the inner organs of beasts and fowls. He liked thick giblet soup, nutty gizzards, a stuffed roast heart, liverslices fried with crustcrumbs, fried hencods' roes...	
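`strip()` has one-sided relatives, `lstrip()` and `rstrip()`, and all three accept an optional argument naming the characters to remove. In this short sketch, `repr()` is used so the remaining whitespace is visible; the sample strings are illustrative.

```python
padded = "  \tMr Bloom\t  "

print(repr(padded.strip()))    # 'Mr Bloom'
print(repr(padded.lstrip()))   # 'Mr Bloom\t  '
print(repr(padded.rstrip()))   # '  \tMr Bloom'

# With an argument, strip() removes those characters from both ends.
print("...roes...".strip("."))   # roes

# Methods return new strings, so they can be chained.
print(padded.strip().upper())    # MR BLOOM
```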


### String Formatting

The `print()` function is easy to use, but on its own it offers little control over the format of the output (think about how many digits are printed for a float, for example). The string `format()` method can help with this:

**Usage:**
`"string to format".format(data1, data2, ...)`


In [None]:
date_str = "03/10/20"
print("Midterm #{} will be held on {}".format(1, date_str))

In [2]:
# By default, each object is formatted according to its type.
# A replacement field can also include a format specification.

# General structure ([] indicates optional parts):
#   {:[align][min_width][.precision][descriptor]}
import math

print("Total sale: ${:<10.2f}".format(25.667899999999))

Midterm #1 will be held on 03/10/20
Total sale: $25.67     


In [9]:
print ("Student name: '{:^10s}'".format("Peter"))
print ("Class size: {:>12f}".format(50))
print ("Pi is {:<1.12f}".format(math.pi))

Student name: '  Peter   '
Class size:    50.000000
Pi is 3.141592653590


In [4]:
# Alignment: < (left), ^ (center), > (right)
# Descriptors: s (string), d (integer), f (float), e (exponent), % (float as %), b (binary integer)

print("{:<10s} is {:>8.3f} years old".format("Peter", 2**6-8))
print("{:<10s} is {:>8.3f} years old".format("Rebecca", 2**4+5))

Peter      is   56.000 years old
Rebecca    is   21.000 years old
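The same format specifications also work inside f-strings (available in Python 3.6 and later), where the value is written in the braces before the colon. This sketch shows the two spellings producing identical text; the variable names are illustrative.

```python
name = "Peter"
age = 2**6 - 8   # 56

with_format = "{:<10s} is {:>8.3f} years old".format(name, age)
with_fstring = f"{name:<10s} is {age:>8.3f} years old"

print(with_fstring)                 # Peter      is   56.000 years old
print(with_format == with_fstring)  # True
```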


### More Formatting



In [6]:
import math

print (math.pi)
print ("{:<8f}".format(math.pi))    # Note the rounding
print ("{:>8.3f}".format(math.pi))  # Note the rounding

print("{:8.2%}".format(2/3))


3.141592653589793
3.141593
   3.142
  66.67%
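A few more descriptors and options, sketched in the same style. The values are arbitrary, and each result is wrapped in quotes where padding would otherwise be invisible.

```python
import math

print("'{:5d}'".format(42))           # '   42'    integer, width 5
print("{:.2e}".format(12345.678))     # 1.23e+04   exponent notation
print("{:b}".format(10))              # 1010       binary integer
print("{:,}".format(1234567))         # 1,234,567  thousands separator
print("{:08.3f}".format(math.pi))     # 0003.142   zero-padded to width 8
```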
