**Strings in Python**

We can define a string in Python and assign it to a variable.

Quotation marks indicate that the right-hand side is a string literal.
We can then display the variable's value with print.

In [None]:
x="string plus another set of characters"
print(x)

**Assignment**

Recall how assignment works: the right-hand side is evaluated, and the resulting value is then stored in the variable named on the left-hand side.

In [None]:
x="dog"
print(x)

Observe that print displays the string without the quotation marks. The quotes delimit the literal but are not part of the string.

We can also use single quotes to assign the same value.

In [None]:
y='string dwehgfwhg fwhgfhwgh fgwhg fhwghf '
print(y)
x="string dwehgfwhg fwhgfhwgh fgwhg fhwghf "

In [None]:
x==y

**Length of string**

If we want to know how many characters there are in a string, we can use the **len** function.

We will see this *len* function used for many different Python objects. 

(We say that the function is *overloaded*.)

In [None]:
st="dogs and cats"
len(st)
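As a small illustration of len working on different kinds of objects, the sketch below applies it to a string, a list, and a dictionary. The variable names (word, pets, ages) are made up for this example.

```python
# len() reports the number of items in several built-in types.
word = "dogs and cats"
pets = ["dog", "cat", "bird"]        # a list with three items
ages = {"dog": 3, "cat": 5}          # a dictionary with two keys

print(len(word))   # 13 characters, spaces included
print(len(pets))   # 3 list items
print(len(ages))   # 2 dictionary keys
print(len(""))     # 0: the empty string has no characters
```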

**Selection**

We can refer to a particular character in a string using
square brackets with an index value of 0,1,...,len(st)-1.


In [None]:
st="do you prefer to have a dog or a cat as a pet?"
print(len(st))
print(st[0])
print(st[1])
print(st[len(st)-1])

**Errors**

If we specify an index outside of the valid range, we get an IndexError.

In [None]:
st[len(st)] # one past the last valid index, len(st)-1
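The sketch below shows one way to see the error without stopping the program, and one way to avoid it by checking the index first. It uses try/except, which this section has not introduced, and the index 100 is an arbitrary value chosen for illustration.

```python
st = "do you prefer to have a dog or a cat as a pet?"

# Valid indices run from 0 to len(st) - 1, so len(st) itself is out of range.
try:
    ch = st[len(st)]
except IndexError as err:
    ch = None
    print("IndexError:", err)   # IndexError: string index out of range

# For a non-negative index, checking it against len(st) avoids the exception.
i = 100
if i < len(st):
    print(st[i])
else:
    print("index", i, "is past the end of a string of length", len(st))
```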

**Negative indices**

We can select characters by counting from the end of the string backwards. 
Index -1 refers to the last character, -2 the second to last, and so on.

In [None]:
print(st[-1])
print(st[-2])
print(st[-3])
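To connect negative indices with the usual ones, here is a short check that st[-k] and st[len(st)-k] pick out the same character. The string is redefined so the example runs on its own.

```python
st = "dogs and cats"
n = len(st)

# st[-k] names the same character as st[n - k] for k = 1, ..., n.
print(st[-1], st[n - 1])   # both are the last character, "s"
print(st[-2], st[n - 2])   # both are "t"
print(st[-n], st[0])       # -n reaches the first character, "d"
```

An index below -n is out of range in the same way that an index above n-1 is.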

**Slices/Substrings**

The colon character is used to specify a range of characters in a string. 
If we use i:j then this means the indices are i,i+1,...,j-1 so that i is included and j is not.

In [None]:
print(st)
print(st[0])
print(st[0:5]) # 0,1,2,3,4
print(st[2:4]) # 2,3
print(st[2:])  # 2,3,... end of string
print(st[:3])  # 0,1,2

In [None]:
print(st[-1])
print(st[-2])
print(st[0:-1])
print(st[0:-2])
print(st[-4:-2])
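Two consequences of the "i is included, j is not" rule are sketched below: a slice i:j has j-i characters (when both indices are within the string), and cutting a string at any position k gives two pieces that concatenate back to the original. The values of i, j, and k are arbitrary choices for this example.

```python
st = "dogs and cats"

i, j = 5, 8
piece = st[i:j]                 # characters at indices 5, 6, 7
print(piece)                    # and
print(len(piece) == j - i)      # True: a slice i:j has j - i characters

# Splitting at any position k and gluing the halves back gives the original.
k = 4
print(st[:k] + st[k:] == st)    # True

# Negative indices work in slices too.
print(st[-4:])                  # cats
print(st[:-5])                  # dogs and
```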


**Concatenation**

In [1]:
str1="dog"
str2="cat"
s=str1+str2
print(s)

dogcat


In [2]:
str1="dog"
str2="cat"
str3="bird"
s=str1+str2+str3
print(s)

dogcatbird


**Addition/Assignment Shortcut**

When working with strings, we often want to append the characters of a second string to a first string while keeping the name of the first string.

This can be done using the += operator. 


In [3]:
str1="dogs"
str2="cats"
str1+=str2 # shortcut for str1=str1+str2
print(str1)
str3="fish"
str1+=str3 # shortcut for str1=str1+str3
print(str1)

dogscats
dogscatsfish


In [4]:
x=7
y=5
x+=y # instead of x=x+y
print(x)

x*=y # instead of x=x*y
print(x)

x/=y # instead of x=x/y
print(x)

x-=y # instead of x=x-y
print(x)

12
60
12.0
7.0


**Conversion to a string**

Conversion from one type to another comes up all of the time.
We will often need to *stringify* an object, i.e. to convert it to a string.

In [5]:
x=98.6
print(type(x))
st=str(x) # str is a function
print(type(st))
print(st)

<class 'float'>
<class 'str'>
98.6


These two representations are not the same.

In [6]:
st==x

False

Here's a situation where we would want to stringify.

This fails:

In [7]:
str(98.6)
print(98.6)
print("my temp is " + 98.6)

98.6


TypeError: can only concatenate str (not "float") to str

and this works.

In [8]:
print("my temp is " + str(98.6))

my temp is 98.6


The point is that we can't concatenate a string and a float, but we can concatenate a string with a stringified float.

In [None]:
str1="my temp is "
str2=str(98.6)
print(str2)
str3=str1+str2
print(str3)
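The same idea applies beyond floats. The sketch below stringifies a few other objects and also goes in the reverse direction with float() and int(), which are built-in functions not covered above. The names temp_text, temp, and count are made up for this example.

```python
# str() accepts many kinds of objects, not only floats.
print(str(42))          # 42
print(str(True))        # True
print(str([1, 2, 3]))   # [1, 2, 3]

# A stringified number can be converted back with float() or int().
temp_text = str(98.6)
temp = float(temp_text)
print(temp == 98.6)     # True
print(temp + 1)         # arithmetic works again on the float

count = int("12")
print(count + 1)        # 13
```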

**Warning: Consider this example**

In [9]:
str="mystring"
print(str)

mystring


In [10]:
str(98.6)

TypeError: 'str' object is not callable

**The function str() stopped working!!!**

If you assign a string to the name str, that name no longer refers to the built-in function. To make str() work again, you need to delete the str variable you created.

In [None]:
str="abc"
print(str)
str(98.6) # raises TypeError: str now names the string "abc"
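For a version of this experiment that cannot break the rest of the session, the sketch below reuses the name inside a function, so the built-in is hidden only there. It also shows that the original function stays reachable through the standard builtins module. The function name shadow_demo is made up for this example, and functions are a topic this section has not covered yet.

```python
import builtins

def shadow_demo():
    # Inside this function, the name str is rebound to a string,
    # which hides the built-in function of the same name.
    str = "abc"
    try:
        str(98.6)
    except TypeError as err:
        message = err.args[0]
    # The original function is still available through the builtins module.
    rescued = builtins.str(98.6)
    return message, rescued

message, rescued = shadow_demo()
print(message)   # 'str' object is not callable
print(rescued)   # 98.6

# Outside the function the built-in was never touched.
print(str(98.6))
```

The simplest protection is to avoid built-in names such as str, len, and list when choosing variable names.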

**Determining all local variables**

Python keeps the variables we've created in a dictionary that maps each name to its value. The locals function returns that dictionary, so we can query what's going on.

In [11]:
locals()

{'__name__': '__main__',
 '__doc__': 'Automatically created module for IPython interactive environment',
 '__package__': None,
 '__loader__': None,
 '__spec__': None,
 '__builtin__': <module 'builtins' (built-in)>,
 '__builtins__': <module 'builtins' (built-in)>,
 '_ih': ['',
  'str1="dog"\nstr2="cat"\ns=str1+str2\nprint(s)',
  'str1="dog"\nstr2="cat"\nstr3="bird"\ns=str1+str2+str3\nprint(s)',
  'str1="dogs"\nstr2="cats"\nstr1+=str2 # shortcut for str1=str1+str2\nprint(str1)\nstr3="fish" #shortcute for str1=str1+str3\nstr1+=str3\nprint(str1)',
  'x=7\ny=5\nx+=y # instead of x=x+y\nprint(x)\n\nx*=y # instead of x=x*y\nprint(x)\n\nx/=y # instead of x=x/y\nprint(x)\n\nx-=y # instead of x=x-y\nprint(x)',
  'x=98.6\nprint(type(x))\nst=str(x) # str is a function\nprint(type(st))\nprint(st)',
  'st==x',
  'str(98.6)\nprint(98.6)\nprint("my temp is " + 98.6)',
  'print("my temp is " + str(98.6))',
  'str="mystring"\nprint(str)',
  'str(98.6)',
  'locals()'],
 '_oh': {6: False},
 '_dh': ['E:\\O

We see that str was made into a local variable. Oops. Let's fix that.

In [12]:
del str
str(98.6)

'98.6'
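Reading the whole dictionary by eye is tedious. A lighter approach, sketched below, is to test for a single name. The variable names animal, snapshot, and ours are made up for this example, and the dictionary is copied first so that it does not change while we inspect it.

```python
animal = "dog"

# Copy the dictionary so that it does not change while we look at it.
snapshot = dict(locals())

print("animal" in snapshot)         # True
print(snapshot["animal"])           # dog

# Names that do not start with an underscore are the ones we created
# (plus anything the environment itself added).
ours = sorted(name for name in snapshot if not name.startswith("_"))
print(ours)
```

Running `"str" in locals()` in the same way reports whether the built-in name has been hidden by a variable.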

**replace**

We can replace portions of a string that match some pattern with another string using the *replace* method.

In [13]:
st="my dog ate my homework"
st.replace("dog","cat")

'my cat ate my homework'

In [14]:
st

'my dog ate my homework'

Importantly, this doesn't do anything to the original string.

Strings have a property called **immutability** - they can't be changed.

To save that result we need to do an assignment.

In [15]:
st="my dog ate my homework"
st=st.replace("dog","cat")
print(st)

my cat ate my homework


Note that we typically get a new id, confirming that we did not actually change the object previously referred to by that name.

In [16]:
st="my dog ate my homework"
print(id(st))
st=st.replace("dog","cat")
print(id(st))

2537489539360
2537489541440
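Immutability can also be seen directly, as in the sketch below: assigning to a single character fails, and a second name bound to the original string still sees the old text after the replace. The name alias is made up for this example, and try/except is used only to keep the example running past the error.

```python
st = "my dog ate my homework"
alias = st                      # a second name for the same string object

# Trying to change a character in place raises a TypeError.
try:
    st[3] = "h"
except TypeError as err:
    print("TypeError:", err)    # 'str' object does not support item assignment

# replace() builds a new string; rebinding st leaves the old object alone.
st = st.replace("dog", "cat")
print(st)                       # my cat ate my homework
print(alias)                    # my dog ate my homework
print(st is alias)              # False
```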


**Special characters**

A backslash preceding a character gives that character a special meaning.
In particular:

- \n = new line character
- \t = tab character
- \\\\ = backslash character
- \\" = to put quotes inside a string (the backslash "escapes" the meaning of the ")
- \\' = to put a single quote inside a string

Here's an example of putting literal quotes in a string.

In [19]:
st="He said \" let's go get a cup of coffee sometime.\""
print(st)

He said " let's go get a cup of coffee sometime."


If the apostrophe were not in the sentence, we could use single quotes.

In [20]:
st='He said "let us go get a cup of coffee sometime."'
print(st)

He said "let us go get a cup of coffee sometime."


But the apostrophe in "let's" causes a problem.

In [21]:
st='He said "let's go get a cup of coffee sometime"'
print(st)

SyntaxError: invalid syntax (<ipython-input-21-9f200e6b3c80>, line 1)

But escaping the apostrophe with a backslash eliminates that issue.

In [22]:
st='He said \"let\'s go get a cup of coffee sometime.\"'
print(st)

He said "let's go get a cup of coffee sometime."


The newline character often appears in strings.

In [33]:
st="Roses are red.\nViolets are blue.\nI work with Python.\nWhy shouldn't you?"
print(st)

Roses are red.
Violets are blue.
I work with Python.
Why shouldn't you?
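One point that is easy to miss is that an escape sequence is typed as two characters but stored as one. The sketch below checks this with len and shows the tab and backslash escapes in use. The example strings are made up.

```python
# Each escape sequence is two characters in the source but one in the string.
print(len("\n"))      # 1
print(len("\t"))      # 1
print(len("\\"))      # 1
print(len("\""))      # 1

# A tab between columns and a backslash in a path-like string.
print("name\tpet")
print("folder\\file")   # prints folder\file

st = "line one\nline two"
print(st.count("\n"))   # 1 newline, so print(st) shows two lines
```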


**Other string methods**

There is a host of other methods that can be used on a string. In addition to reading the Python documentation, you can use dir(object) to determine the attributes and methods associated with an object.

Generally, Python objects have methods (as well as attributes, to be discussed later). 

We've seen an example of a method - the replace method for a string.

A **method** associated with an object is a function that is called using the syntax:

    object.method_name()

Using the **dir()** function we can determine all methods associated with an object.
This is demonstrated with a string object **st** that we created above.

In [25]:
st

"Roses are red.\nViolets are blue.\nI work with Python.\nWhy shouldn't you?"

In [24]:
dir(st)

['__add__',
 '__class__',
 '__contains__',
 '__delattr__',
 '__dir__',
 '__doc__',
 '__eq__',
 '__format__',
 '__ge__',
 '__getattribute__',
 '__getitem__',
 '__getnewargs__',
 '__gt__',
 '__hash__',
 '__init__',
 '__init_subclass__',
 '__iter__',
 '__le__',
 '__len__',
 '__lt__',
 '__mod__',
 '__mul__',
 '__ne__',
 '__new__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__rmod__',
 '__rmul__',
 '__setattr__',
 '__sizeof__',
 '__str__',
 '__subclasshook__',
 'capitalize',
 'casefold',
 'center',
 'count',
 'encode',
 'endswith',
 'expandtabs',
 'find',
 'format',
 'format_map',
 'index',
 'isalnum',
 'isalpha',
 'isascii',
 'isdecimal',
 'isdigit',
 'isidentifier',
 'islower',
 'isnumeric',
 'isprintable',
 'isspace',
 'istitle',
 'isupper',
 'join',
 'ljust',
 'lower',
 'lstrip',
 'maketrans',
 'partition',
 'replace',
 'rfind',
 'rindex',
 'rjust',
 'rpartition',
 'rsplit',
 'rstrip',
 'split',
 'splitlines',
 'startswith',
 'strip',
 'swapcase',
 'title',
 'translate',
 'upper',
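Most of the names at the top of that list begin with underscores and are not meant to be called directly. A possible way to hide them, sketched below, is to filter the list. It uses a list comprehension, which has not been introduced yet, and the name public is made up for this example.

```python
st = "Roses are red."

# Keep only the names that do not begin with an underscore.
public = [name for name in dir(st) if not name.startswith("_")]
print(len(public))
print(public[:5])

# A quick membership test answers "does a string have this method?"
print("replace" in public)   # True
print("shout" in public)     # False ("shout" is a made-up name)
```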


**Examples and Getting Help**

In [32]:
st="Rosesarered"
st=st.lower()
print(st)

rosesarered


In [54]:
st="Roses are red.\nViolets are blue.\nI work with Python.\nWhy shouldn't you?"

In [55]:
print(st)

Roses are red.
Violets are blue.
I work with Python.
Why shouldn't you?


In [35]:
st.split("\n")

['Roses are red.',
 'Violets are blue.',
 'I work with Python.',
 "Why shouldn't you?"]
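The split method used in this section has a counterpart, join, which appears in the dir listing but is not demonstrated. The sketch below, with a shortened two-line string, splits on newlines and on whitespace and then reassembles the original. The names lines and words are made up for this example.

```python
st = "Roses are red.\nViolets are blue."

lines = st.split("\n")          # break at each newline
print(lines)                    # ['Roses are red.', 'Violets are blue.']

words = lines[0].split()        # no argument: split on runs of whitespace
print(words)                    # ['Roses', 'are', 'red.']

# join() is the reverse operation: the separator string glues the pieces.
print("\n".join(lines) == st)   # True
```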


In [57]:
print(st.upper())

ROSES ARE RED.
VIOLETS ARE BLUE.
I WORK WITH PYTHON.
WHY SHOULDN'T YOU?


In [58]:
help(st.rstrip)

Help on built-in function rstrip:

rstrip(chars=None, /) method of builtins.str instance
    Return a copy of the string with trailing whitespace removed.
    
    If chars is given and not None, remove characters in chars instead.



In [59]:
st="This is a string with some whitespace chars \n \t "

In [60]:
st.rstrip()

'This is a string with some whitespace chars'
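The help text mentions an optional chars argument, and rstrip has two relatives, lstrip and strip, that trim the left side and both sides. The sketch below tries each of them. repr is used so that the remaining whitespace is visible in the output, and the example strings are made up.

```python
st = "  \t padded text \n "

print(repr(st.rstrip()))   # '  \t padded text'
print(repr(st.lstrip()))   # 'padded text \n '
print(repr(st.strip()))    # 'padded text'

# With an argument, the listed characters are removed instead of whitespace.
print("xxhixx".rstrip("x"))   # xxhi
print("xxhixx".strip("x"))    # hi
```

As with replace, these methods return new strings, so st itself is unchanged unless the result is assigned back to it.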