# 03 Strings, Tuples and Sets

Lesson goals:

  1.  Examine the string class in greater detail.
  2.  Introduce two basic container types: tuples and sets.

# 03.0. Strings

To start understanding the string type, let's use the built-in help system.

In [None]:
help(str)

The help page for string is very long, and it may be easier to keep it open
in a browser window by going to the [online Python
documentation](http://docs.python.org/library/stdtypes.html#sequence-types-str-unicode-list-tuple-bytearray-buffer-xrange)
while we talk about its properties.

At its heart, a string is just a sequence of characters. Basic strings are
defined using single or double quotes.

In [85]:
s = "This is a string."
s2 = 'This is another string that uses single quotes'
print(len(s))

17


The reason for having two types of quotes to define a string is
emphasized in these examples:

In [5]:
s = "Bob's mom called to say hello."
s = 'Bob\'s mom called to say hello.'

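The first line works because the apostrophe sits inside double quotes. The second works only because the backslash escapes the apostrophe. Without the backslash, `s = 'Bob's mom called to say hello.'` is a syntax error: Python reads `'Bob'` as the whole string, and the rest of the line is not valid Python.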
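
As a small sketch of the quoting options, the cell below checks that the double-quoted and escaped forms produce the same string, and shows triple quotes as a third option. The variable names are illustrative only.

```python
double_quoted = "Bob's mom called to say hello."
escaped = 'Bob\'s mom called to say hello.'

# Both spellings create exactly the same string.
same = double_quoted == escaped

# Triple quotes allow both kinds of quote marks without escaping.
mixed = """Bob's mom said "hello"."""

print(same)
print(mixed)
```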



## Working with Strings

Strings are iterable sequences, which means their characters can be visited one at a time, in order. For instance, characters can
be accessed individually or in slices:

In [6]:
s = 'abcdefghijklmnopqrstuvwxyz'
s[0]

'a'

In [16]:
print(s[-1])
print(s[1:4])

z
bcd
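
Because a string is iterable, a `for` loop visits its characters in order, and a slice accepts an optional step. The sketch below uses a shortened alphabet. The names `letters`, `collected`, `every_other` and `reversed_letters` are illustrative and not part of the lesson.

```python
letters = 'abcdefgh'

# A for loop visits the characters one at a time, in order.
collected = []
for ch in letters:
    collected.append(ch)
print(collected)

# Slices take the form [start:stop:step]; the stop index is excluded.
every_other = letters[::2]
reversed_letters = letters[::-1]
print(every_other)
print(reversed_letters)
```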


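They can also be compared with the equality operator (`==`) and the ordering operators (`<`, `>`). Ordering is lexicographic: strings are compared character by character from the left.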

In [17]:
'str1' == 'str2'

False

In [None]:
'str1' == 'str1'

In [28]:
'str11' < 'str2'

True
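
To see why `'str11' < 'str2'` is true, it may help to look at the character codes Python compares. This is a minimal sketch; the variable names are made up for illustration.

```python
# Comparison goes character by character, using each character's code point.
print(ord('1'), ord('2'))

# The first difference is at index 3, where '1' comes before '2'.
first_is_smaller = 'str11' < 'str2'

# Uppercase letters have smaller code points than lowercase ones.
upper_before_lower = 'Zebra' < 'apple'

# sorted() relies on the same ordering.
ordered = sorted(['pear', 'Banana', 'apple'])
print(first_is_smaller, upper_before_lower, ordered)
```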

**Hands on example**

Try each of the following functions on a few strings. What does the
function do?

In [66]:
s = "this is a string"

In [30]:
s.startswith("This")   # the comparison is case-sensitive

False

In [31]:
s.split(" ")

['this', 'is', 'a', 'string']

In [42]:
s.strip() # This won't change every string!

'this is a string'

In [58]:
S = "sThisisastring123s"

In [59]:
S.strip("s")

'Thisisastring123'

In [56]:
padded = "0000000this is string example....wow!!!0000000"   # avoid naming a variable str: it hides the built-in type
print(padded.strip('0'))

this is string example....wow!!!


In [67]:
s.capitalize()

'This is a string'

In [63]:
s.lower()   #.lower() string method

'this is a string'

In [64]:
s.upper()   #.upper() string method

'THIS IS A STRING'

 Investigate what the .count() and .find() string methods do and test them. The .replace() string method could also be interesting to look at.




In [75]:
print(s.count('s'))
print(s)
print(s.replace('s','r'))
print(s)

3
this is a string
thir ir a rtring
this is a string
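
As one possible illustration of `.find()`, and of the fact that `.replace()` leaves the original string untouched, consider the sketch below. The names `first_is`, `missing` and `replaced` are illustrative.

```python
s = "this is a string"

first_is = s.find('is')    # index of the first match (inside "this")
missing = s.find('xyz')    # find() returns -1 when the substring is absent
print(first_is, missing)

# replace() returns a new string; s itself is unchanged.
replaced = s.replace('string', 'sentence')
print(replaced)
print(s)
```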


Strings also support operators, such as `+` for joining two strings together.

In [80]:
title = "Jr."
firstname = "John"
lastname = "Nash"

To concatenate strings, use the `+` operator explicitly. Python joins the strings exactly as given and does not insert spaces for you.

In [77]:
fullname = firstname + lastname
print (fullname)

JohnNash


In [78]:
fullname = firstname + " " + lastname
print (fullname)

John Nash
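
The `+` operator is not the only way to build a string. As a hedged sketch, `str.join` and f-strings (available from Python 3.6) produce the same result. The names `joined` and `formatted` are illustrative.

```python
firstname = "John"
lastname = "Nash"

# join() places the separator between every pair of items.
joined = " ".join([firstname, lastname])

# An f-string substitutes the variables named inside the braces.
formatted = f"{firstname} {lastname}"

print(joined)
print(formatted)
```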


### Exercise

John's father's name is also John. Print his full name, including his title: John Nash, Jr.

In [82]:
fullname = firstname + " " + lastname + ", " + title
print (fullname)

John Nash, Jr.


# Bonus Exercise: Transcribe DNA to RNA
### Motivation:
During transcription, an enzyme called RNA Polymerase reads the DNA sequence and creates a complementary RNA sequence. Furthermore, RNA has the nucleotide uracil (U) instead of thymine (T). 
### Task:
Write a function that mimics transcription. The input argument is a string that contains the letters A, T, G, and C. Create a new string following these rules: 

* Convert A to U

* Convert T to A

* Convert G to C

* Convert C to G

Hint: You can iterate through a string using a for loop, similarly to how you loop through a list.

In [135]:
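def transcribe(seq):
    # Map each DNA base to the RNA base it is transcribed to.
    complement = {'A': 'U', 'T': 'A', 'G': 'C', 'C': 'G'}
    # Convert one character at a time. Chaining .replace() calls fails here:
    # G -> C followed by C -> G turns every G and C into G.
    rna = ''
    for base in seq:
        rna += complement[base]
    return rna

Check your work:

In [137]:
s = 'ATGC'
print(transcribe(s))

UACG


In [138]:
transcribe('ATGC') == 'UACG'

True

In [139]:
transcribe('ATGCAGTCAGTGCAGTCAGT') == 'UACGUCAGUCACGUCAGUCA'

True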
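
As an alternative sketch, `str.maketrans` and `str.translate` apply all four substitutions in a single pass, so no base is converted twice. The names `DNA_TO_RNA` and `transcribe_with_table` are hypothetical and not part of the lesson.

```python
# Translation table: each DNA base maps to the RNA base from the rules above.
DNA_TO_RNA = str.maketrans('ATGC', 'UACG')

def transcribe_with_table(seq):
    """Return the RNA string for a DNA string made of A, T, G and C."""
    return seq.translate(DNA_TO_RNA)

print(transcribe_with_table('ATGC'))
```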

# 03.1. Tuples


Tuples are one of Python's basic container data types. Tuples are **immutable**. Once data is placed into a tuple, the tuple cannot be changed. You define a tuple as follows:

In [120]:
tup = ("red", "white", "blue") 

In [121]:
type(tup)

tuple

In [151]:
res = ("yellow","black","green")
x = tup + res
print(x)

('red', 'white', 'blue', 'yellow', 'black', 'green')
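
Immutability can be seen directly by trying to assign to an element. The sketch below catches the resulting error and shows that concatenation builds a new tuple rather than changing the old one. The names `changed` and `longer` are illustrative.

```python
tup = ("red", "white", "blue")

# Assigning to an index raises TypeError because tuples are immutable.
try:
    tup[0] = "green"
    changed = True
except TypeError as err:
    changed = False
    print("TypeError:", err)

# Concatenation builds a new tuple and leaves the original alone.
longer = tup + ("green",)   # the trailing comma makes a one-item tuple
print(tup)
print(longer)
```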




### Exercise 

Make a tuple of the _remaining colors_ (not red, white, and blue) of the South African flag. Can you figure out a way to combine the two groups of colors into a single tuple?

<img src=za_flag.png>

# 03.2. Sets



Most introductory Python courses do not go over sets this early (or at all), but I've found this data type to be useful. The Python set type is similar to the idea of a mathematical set: it is an unordered collection of unique things. Consider:

In [21]:
fruits = {"apple", "banana", "pear", "banana"}

Since sets contain only unique items, there's only one banana in the set fruits.

In [22]:
print (fruits)

{'pear', 'apple', 'banana'}


You can also add things to sets.

In [23]:
fruits.add('pineapple')
print (fruits)

{'pineapple', 'pear', 'apple', 'banana'}
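
Two everyday uses of sets are membership tests and removing duplicates. This is a minimal sketch with illustrative names. Printed set order can vary, so the result is sorted before printing.

```python
fruits = {"apple", "banana", "pear"}

# Membership tests use the in operator.
has_pear = "pear" in fruits
has_kiwi = "kiwi" in fruits

# Adding an item that is already present changes nothing.
fruits.add("apple")
size_after_add = len(fruits)

# set() removes duplicates from a list (the original order is not kept).
unique_letters = set(["a", "b", "a", "c", "b"])
print(has_pear, has_kiwi, size_after_add, sorted(unique_letters))
```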


You can do things like intersections, unions, etc. on sets just like in math. Here's an example of an intersection of two sets (the common items in both sets).

In [1]:
bowl1 = {"apple", "banana", "pear", "peach"}
bowl2 = {"peach", "watermelon", "orange", "apple"}

In [2]:
bowl1 & bowl2 # intersection

{'apple', 'peach'}

In [4]:
bowl1 | bowl2 # union

{'apple', 'banana', 'orange', 'peach', 'pear', 'watermelon'}

In [7]:
bowl3 = bowl1 | bowl2
print(bowl3)

{'pear', 'peach', 'orange', 'banana', 'apple', 'watermelon'}


In [9]:
bowl1 - bowl2

{'banana', 'pear'}

In [11]:
bowl3 - bowl1

{'orange', 'watermelon'}

In [8]:
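# Same result as bowl3 - bowl1, written as an explicit loop
for fruit in sorted(bowl3):   # sorted() gives a predictable order
    if fruit not in bowl1:
        print(fruit)

orange
watermelon


In [150]:
sorted(bowl3)[0]   # sets cannot be indexed; sort into a list first

'apple'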

In [17]:
bowl4 = {"apple"}

In [18]:
bowl3,bowl4

({'apple', 'banana', 'orange', 'peach', 'pear', 'watermelon'}, {'apple'})

In [19]:
bowl3 - bowl4

{'banana', 'orange', 'peach', 'pear', 'watermelon'}

In [20]:
bowl4 - bowl3

set()
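
The operators used above also have method forms, and sets support a few more operations. The sketch below recreates the two bowls so that it runs on its own. The result names are illustrative.

```python
bowl1 = {"apple", "banana", "pear", "peach"}
bowl2 = {"peach", "watermelon", "orange", "apple"}

# Method forms of the operators.
common = bowl1.intersection(bowl2)        # same as bowl1 & bowl2
everything = bowl1.union(bowl2)           # same as bowl1 | bowl2
only_in_bowl1 = bowl1.difference(bowl2)   # same as bowl1 - bowl2

# Symmetric difference: items that appear in exactly one of the two sets.
in_one_only = bowl1 ^ bowl2

# Subset test: is every item of the left set also in bowl1?
is_subset = {"apple"} <= bowl1

print(sorted(common), sorted(only_in_bowl1), sorted(in_one_only), is_subset)
```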


You can find more information in the help docs, for example with `help(set)`. We won't be returning to sets, but it's good for you to know they exist.