## Strings

A *string* is a sequence of characters. We might commonly call it *text*.

Let's look at some things we can do with strings in Python.

### Valid strings

In [None]:
# Valid strings

print('Single quoted: NYU')
print()

print("Double quoted: NYU")
print()

print('''Triple quoted: NYU has established the Center for
Social Media and Politics, which will examine the production,
flow, and impact of social media content in the political sphere,
as well as support research that uses social media data to study politics.

Photo credit: adamkaz/Getty Images
Will Focus on Production, Flow, and Impact of Content—and Methods
to Use Social Media Data to Study Politics''')

someString = '''Triple quoted: NYU has established the Center for
Social Media and Politics, which will examine the production,
flow, and impact of social media content in the political sphere,
as well as support research that uses social media data to study politics.

Photo credit: adamkaz/Getty Images
Will Focus on Production, Flow, and Impact of Content—and Methods
to Use Social Media Data to Study Politics'''
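
The three quoting styles above can be compared directly. The following is a minimal sketch; the variable names (`single`, `double`, `apostrophe`, `multi_line`) are illustrative and do not come from the cells above.

```python
# Single and double quotes produce the same kind of object.
single = 'NYU'
double = "NYU"
print(single == double)

# One quote style can contain the other without escaping.
apostrophe = "NYU's campus"
print(apostrophe)

# A triple-quoted string keeps its line breaks as newline characters.
multi_line = '''first line
second line'''
print(multi_line.count('\n'))
line_count = len(multi_line.splitlines())
print(line_count)
```

Choosing double quotes for text that contains an apostrophe (or the reverse) avoids the need for a backslash escape.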


### Strings as a Sequence

In [None]:
# Strings as a sequence

# Index positions for "Hello New York!" (15 characters)
#
# char:       H   e   l   l   o       N   e   w       Y   o   r   k   !
# index:      0   1   2   3   4   5   6   7   8   9  10  11  12  13  14
# negative: -15 -14 -13 -12 -11 -10  -9  -8  -7  -6  -5  -4  -3  -2  -1

my_string = "Hello New York!"

print('my_string=', my_string)
print('my_string[0]=', my_string[0])
print('my_string[14]=', my_string[14])
print('my_string[-1]=', my_string[-1])
print('my_string[-15]=', my_string[-15])

print('my_string[15]=', my_string[15])  # raises IndexError: valid indexes are 0 to 14
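
The relationship between positive and negative indexes, and the out-of-range error, can be checked as follows. This is a small sketch; the names `length`, `last_char`, `same_char`, and `out_of_range` are introduced here for illustration only.

```python
my_string = "Hello New York!"
length = len(my_string)          # 15 characters, so valid indexes are 0..14

# A negative index -k refers to the same character as index length - k.
last_char = my_string[-1]
same_char = my_string[length - 1]
print(last_char, same_char)

# Indexing past the end raises IndexError, which can be caught.
try:
    my_string[length]
    out_of_range = False
except IndexError as err:
    out_of_range = True
    print("IndexError:", err)
```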


### Special Characters

Some characters, such as newline and tab, cannot be typed directly into a string literal. They are written as *escape sequences*: a backslash followed by a character or code that identifies them.

In [None]:
# Non-printing characters

# Some characters perform necessary operations but show up as whitespace in
# the output

# Newline - \n
# Tab -     \t

print("this is the first line\nthis is the second line.\n\t a tabbed\tline.")
print("aaaa\fform feed")
print("abcdefghijklmnopqrstuvwxyz\rcarriage return")
print("abcdef\v\vghijkl\t\123\xF2\a")

# Escape sequences recognized in string literals:
#\newline   Backslash at the end of a line: backslash and newline are ignored
#\\ Backslash (\)
#\' Single quote (')
#\" Double quote (")
#\a ASCII Bell (BEL)
#\b ASCII Backspace (BS)
#\f ASCII Formfeed (FF)
#\n ASCII Linefeed (LF)
#\r ASCII Carriage Return (CR)
#\t ASCII Horizontal Tab (TAB)
#\v ASCII Vertical Tab (VT)
#\ooo   Character with octal value ooo (one to three octal digits)
#\xhh   Character with hex value hh (exactly two hex digits)
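
One way to see what an escape sequence does is to inspect the string rather than print it. The sketch below uses `len()`, `repr()`, and a raw string; raw strings (the `r` prefix) are not covered in the cell above and are included here as a related illustration.

```python
# Each escape sequence is a single character inside the string.
print(len("\n"), len("\t"), len("\\"))

# Octal and hexadecimal escapes name a character by its code.
octal_char = "\123"    # octal 123 is decimal 83
hex_char = "\x53"      # hex 53 is also decimal 83
print(octal_char, hex_char, chr(83))

# repr() shows the escape sequences instead of interpreting them.
text = "a\tb\nc"
print(repr(text))

# A raw string (prefix r) leaves backslashes untouched.
raw_text = r"a\tb"
print(len(raw_text))
```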


### String Indexing and Slicing

In [None]:
my_string = "Gregor Samsa awoke in his bed one morning..."
print(my_string)

print("my_string[6:11] =", my_string[6:11]) # starts at index 6, stops before index 11

print("my_string[6:] =", my_string[6:]) # specifies starting only, goes to end

print("my_string[:6] =", my_string[:6]) # from the start up to, but not including, index 6

# starts at index 3, stops before the second-to-last character
print("my_string[3:-2] =", my_string[3:-2]) 

print("my_string[::2] =", my_string[::2]) # every other char

print("my_string[::-1] =", my_string[::-1]) # backward from end of string

# every other char, going backward, starting from the second-to-last char
print("my_string[-2::-2] =", my_string[-2::-2]) 

input("Hit enter to continue.")
print()

alias_of_string = my_string
print("id of my_string:", id(my_string))
print("id of alias_of_string:", id(alias_of_string))

input("Hit enter to continue.")
print()

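# [:] selects the whole string. Strings are immutable, so CPython typically
# returns the same object here and this id matches the ids printed above.
copy_of_string = my_string[:]
print("id of copy_of_string:", id(copy_of_string))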

# Strings are iterable
for char in my_string:
    print(char, end='..')
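
Two properties of slices are worth checking by hand: the end index is excluded, and slicing past the end does not raise an error. The sketch below also shows that a string cannot be changed in place; the names `first`, `rest`, `tail`, `empty`, and `mutated` are illustrative.

```python
my_string = "Gregor Samsa awoke in his bed one morning..."

# Splitting at any index k and rejoining gives back the original string,
# because [:k] stops just before k and [k:] starts at k.
k = 6
first, rest = my_string[:k], my_string[k:]
print(repr(first), repr(rest))
rejoined = first + rest

# Unlike indexing, slicing past the end does not raise an error.
tail = my_string[40:1000]
empty = my_string[1000:]
print(repr(tail), repr(empty))

# Strings are immutable: item assignment raises TypeError.
try:
    my_string[0] = "g"
    mutated = True
except TypeError:
    mutated = False
print(mutated)
```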


### String Methods

Methods are like functions (we have already used one function, `math.sqrt`), but a method belongs to an object.

A method is called on the object it operates on, using dot notation: `my_string.upper()`.



In [None]:
my_string = 'Python is great!'
new_string = my_string.upper()
print(new_string)
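
The difference between calling a function and calling a method can be sketched like this. The names `length` and `shouted` are introduced for illustration.

```python
my_string = 'Python is great!'

# A function takes the string as an argument.
length = len(my_string)

# A method is called on the string itself with dot notation.
shouted = my_string.upper()

# String methods return a new string; the original is unchanged.
print(length, shouted, my_string)
```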

`find()` locates a substring within a string and returns the index where it starts.

An example with a single argument ("t"):

In [None]:
t_index = my_string.find("t")  
print(t_index)

`find()` returns -1 if the argument is not found.

In [None]:
z_index = my_string.find('z')
print(z_index)

We can chain methods:

In [None]:
print(my_string.upper().find('S'))

The next example finds the first 'o', starting at index 0:

In [None]:
o_index = my_string.find('o')
print(o_index)

This example looks for the next 'o', starting the search just after the first one. `'Python is great!'` contains only one 'o', so the result is -1.


In [None]:
print(my_string.find('o', o_index + 1)) 
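
Repeating the "search again after the last hit" step in a loop collects every occurrence. The function `find_all` below is a hypothetical helper written for this illustration; it is not a built-in string method.

```python
def find_all(text, sub):
    """Return the index of every occurrence of sub in text (hypothetical helper)."""
    positions = []
    start = text.find(sub)
    while start != -1:                     # -1 means no further occurrence
        positions.append(start)
        start = text.find(sub, start + 1)  # resume just after the last hit
    return positions

my_string = 'Python is great!'
print(find_all(my_string, 'o'))
print(find_all(my_string, 't'))
print(find_all(my_string, 'z'))
```

Restarting at `start + 1` rather than after the whole match means overlapping occurrences are also reported.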

### More String Methods

`count()` returns the number of occurrences of a substring in the given string.

In [1]:
quote = "  Mr Leopold Bloom ate with relish the inner organs " \
        + "of beasts and fowls. He liked thick giblet soup, " \
        + "nutty gizzards, a stuffed roast heart, liverslices " \
        + "fried with crustcrumbs, fried hencods' roes...\n"
print(quote.count("it"))

2


`find()` returns the index of first occurrence of the
substring (if found). If not found, it returns -1.

In [None]:
print(quote.find("it"))  

`index()` returns the index of the first occurrence of a substring inside the string
(if found). If the substring is not found, it raises a `ValueError`.

In [None]:
print(quote.index("it"))
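
The different failure behavior of `find()` and `index()` can be seen side by side. This is a minimal sketch using a shorter, made-up string (`phrase`) so that it does not depend on the `quote` variable above.

```python
phrase = "He liked thick giblet soup"

# find() signals a missing substring with -1.
missing = phrase.find("zz")
print(missing)

# index() raises ValueError instead, so the failure has to be handled.
try:
    position = phrase.index("zz")
except ValueError:
    position = None
    print("substring not found")

# The `in` operator is enough when only a yes/no answer is needed.
has_soup = "soup" in phrase
print(has_soup)
```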

`isalnum()` returns `True` if all characters in the string are
alphanumeric (letters or digits). If not, it returns `False`.

In [None]:
print(quote.isalnum())

`isalpha()` returns `True` if all characters in the string are letters. If not, it returns `False`.

In [None]:
print(quote.isalpha())                             

`isdigit()` returns `True` if all characters in a string are digits. If not, it returns `False`.

In [None]:
print(quote.isdigit()) 
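
Because `quote` contains spaces and punctuation, all three tests above return `False`. The sketch below applies the same methods to a few short, made-up strings (`samples`) where the results differ.

```python
samples = ["NYU2020", "Bloom", "1904", "June 16", ""]

results = {}
for text in samples:
    # Store (isalnum, isalpha, isdigit) for each sample string.
    results[text] = (text.isalnum(), text.isalpha(), text.isdigit())
    print(repr(text), results[text])
```

A single space is enough to make all three methods return `False`, and so is an empty string.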

`lower()` returns a copy of the string with all uppercase characters converted to lowercase.

In [2]:
s = quote.lower()
print(s)
print(quote)

  mr leopold bloom ate with relish the inner organs of beasts and fowls. he liked thick giblet soup, nutty gizzards, a stuffed roast heart, liverslices fried with crustcrumbs, fried hencods' roes...

  Mr Leopold Bloom ate with relish the inner organs of beasts and fowls. He liked thick giblet soup, nutty gizzards, a stuffed roast heart, liverslices fried with crustcrumbs, fried hencods' roes...



`strip()` returns a copy of the string with leading and trailing whitespace characters (including newlines) removed.

In [3]:
s = quote.strip()
print(s)
print(quote)

Mr Leopold Bloom ate with relish the inner organs of beasts and fowls. He liked thick giblet soup, nutty gizzards, a stuffed roast heart, liverslices fried with crustcrumbs, fried hencods' roes...
  Mr Leopold Bloom ate with relish the inner organs of beasts and fowls. He liked thick giblet soup, nutty gizzards, a stuffed roast heart, liverslices fried with crustcrumbs, fried hencods' roes...



`upper()` returns a copy of the string with all lowercase characters converted to uppercase.

In [None]:
s = quote.upper()
print(s)
print(quote)
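
Since each of these methods returns a new string, they can be chained. The sketch below uses a short made-up string (`raw`) and also shows `lstrip()` and `rstrip()`, the one-sided variants of `strip()`, which are not covered above.

```python
raw = "  Mr Leopold Bloom\n"

# Chain strip() and lower(): each call works on the result of the previous one.
cleaned = raw.strip().lower()
print(repr(cleaned))

# lstrip() removes leading whitespace only; rstrip() removes trailing only.
left_only = raw.lstrip()
right_only = raw.rstrip()
print(repr(left_only), repr(right_only))

# The original string is never modified.
print(repr(raw))
```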

### String Formatting

The `print()` function is easy to use, but it gives little control over the format of its output (consider how many digits appear when a float is printed). The string `format()` method provides that control:

**Usage:**
`"string to format".format(data1, data2, ...)`


In [None]:
date_str = "03/10/20"
print("Midterm #{} will be held on {}".format(1, date_str))
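
Besides empty `{}` fields, which are filled in order, `format()` also accepts numbered and named fields. The following is a brief sketch; the field names `num` and `when` are made up for this example.

```python
date_str = "03/10/20"

# Empty fields are filled left to right.
auto = "Midterm #{} will be held on {}".format(1, date_str)

# Numbered fields pick arguments by position, in any order.
numbered = "On {1}, midterm #{0} will be held".format(1, date_str)

# Named fields pick keyword arguments.
named = "Midterm #{num} will be held on {when}".format(num=1, when=date_str)

print(auto)
print(numbered)
print(named)
```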

In [2]:
# By default, each object is formatted according to its type.
# A replacement field can also include a format specification.

# General structure ([] marks an optional part; no spaces between parts):
#   {:[align][min_width][.precision][descriptor]}
import math

print("Total sale: ${:<10.2f}".format(25.667899999999))

Midterm #1 will be held on 03/10/20
Total sale: $25.67     


In [9]:
print("Student name: '{:^10s}'".format("Peter"))
print("Class size: {:>12f}".format(50))
print("Pi is {:<1.12f}".format(math.pi))

Student name: '  Peter   '
Class size:    50.000000
Pi is 3.141592653590


In [4]:
# Alignment: < (left), ^ (center), > (right)
# Descriptors: s (string), d (integer), f (float), e (exponent notation),
#              % (float shown as a percentage), b (integer in binary)

print("{:<10s} is {:>8.3f} years old".format("Peter", 2**6-8))
print("{:<10s} is {:>8.3f} years old".format("Rebecca", 2**4+5))

Peter      is   56.000 years old
Rebecca    is   21.000 years old
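
The same format specifications also work inside f-strings. F-strings are not used elsewhere in this section and require Python 3.6 or later, so treat this as an optional alternative sketch rather than a replacement for `format()`.

```python
name = "Peter"
age = 2**6 - 8

# format() method, as in the cell above.
with_format = "{:<10s} is {:>8.3f} years old".format(name, age)

# f-string: the value goes before the colon, the specification after it.
with_fstring = f"{name:<10s} is {age:>8.3f} years old"

print(with_format)
print(with_fstring)

# Centering "Peter" in 10 columns: the extra space goes on the right.
centered = "{:^10s}".format(name)
print(repr(centered))
```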


### More Formatting



In [6]:
import math

print(math.pi)
print("{:<8f}".format(math.pi))    # default precision is 6 digits; note the rounding
print("{:>8.3f}".format(math.pi))  # 3 digits after the point; note the rounding

print("{:8.2%}".format(2/3))


3.141592653589793
3.141593
   3.142
  66.67%
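
The remaining descriptors follow the same pattern. The sketch below tries `d`, `e`, and `b`, plus the `,` thousands separator, which is not mentioned above and is included as a related option. The variable names are illustrative.

```python
as_integer = "{:>6d}".format(42)            # integer, right-aligned in 6 columns
as_exponent = "{:.3e}".format(12345.678)    # exponent notation, 3 digits after the point
as_binary = "{:b}".format(10)               # integer written in base 2
with_commas = "{:,}".format(1234567)        # thousands separator

print(repr(as_integer))
print(as_exponent)
print(as_binary)
print(with_commas)
```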
