# Strings, Lists, Dictionaries and Sets

* *Strings* are sequences of Unicode characters. Python does not have a character data type, so a single character is simply a string with a length of 1.  Strings are **iterable**, **immutable** and **ordered**.  Duplicates are allowed in strings.

* *Lists* are like arrays in other languages. Python lists can be heterogeneous, which makes them very powerful. For example, this is a perfectly legal list in Python: <br>`["Hello", 0, 3.141]`. Lists are **iterable**, **mutable** and **ordered**.  Duplicates are allowed in lists.

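* *Dictionaries* are collections of key-value pairs. Keys in dictionaries can be any **hashable** type, which in practice means immutable values; for example, strings and numbers can always be keys. A dictionary in Python works much like a dictionary in the real world: you look up a word (the key) to find its definition (the value).  Dictionaries are **iterable**, **mutable** and, for practical purposes, **unordered**: since Python 3.7 they remember the order in which keys were inserted, but they are never sorted and cannot be indexed by position.  Duplicate *keys* are not allowed in dictionaries.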

* *Sets* in Python are equivalent to sets in mathematics. Python sets can be heterogeneous, which makes them very powerful. For example, this is a perfectly legal set in Python: <br>`{"Hello", 0, 3.141}`. Sets are **iterable**, **mutable** and **unordered**.  Duplicates are not allowed in sets (this actually makes them really useful, as we'll see later).

Strings, Lists, Dictionaries and Sets are generally termed *Collection Classes*, meaning they are collections of other data types.
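
The properties listed above can be checked directly in the interpreter. The following is a minimal sketch; the variable names (`text`, `items`, `unique`) are illustrative only and do not appear elsewhere in this notebook.

```python
# Illustrative names only; none of these come from the later examples.
text = "hello"
items = ["Hello", 0, 3.141]

# Lists are mutable: assigning to an index changes the list in place.
items[1] = 42
print(items)  # ['Hello', 42, 3.141]

# Strings are immutable: assigning to an index raises a TypeError.
string_error = None
try:
    text[0] = "J"
except TypeError as err:
    string_error = str(err)
    print("Cannot modify a string:", string_error)

# Sets discard duplicates; strings and lists keep them.
unique = set(text)
print(len(text), len(unique))  # 5 4
```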

*As with any other programming language, there are many ways to approach problems in Python.  Two completely different approaches may be equally valid.  Keep in mind that as you go through these examples you'll often see multiple ways of doing the same thing.  No particular approach is presented here as "the best" or "the only" way to do things.  Find your own unique programming style and embrace it!*

### Strings
Here are examples of some common design patterns using strings. Let's start with finding the length of a string and accessing individual characters.

<br clear="all" />
<img src="../../images/check.png" align="left" />
<br clear="all" />

Why do we access the last character in the string below with `s[length-1]` instead of `s[length]`?

In [None]:
s = input("Enter a string: ")
length = len(s)
print("There are {0:d} characters in your string".format(length))
print("The first character in your string is: \"{0:s}\"".format(s[0]))
print("The last character in your string is: \"{0:s}\"".format(s[length-1]))

Splitting strings in Python is known as *slicing*.  While strings are immutable, you can slice them and put them back together (concatenate) in various ways.  String concatenation in Python is performed with the plus sign (`+`).  Here are some syntax examples for various slicing operations (assume we have a string `S`):

`S[start:end]` From the character at position `start` to the character at position `end`-1.

`S[start:]` From the character at position `start` to the end of the string.

`S[:end]` From the beginning of the string to the character at position `end`-1.

`S[:]` A complete copy of `S`.
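
To make the four forms concrete, here is a small sketch that applies each one to a fixed string. The value `"Python"` and the name `rebuilt` are chosen purely for illustration.

```python
S = "Python"  # positions: P=0, y=1, t=2, h=3, o=4, n=5

print(S[1:4])  # yth    (positions 1, 2 and 3)
print(S[2:])   # thon   (position 2 to the end)
print(S[:2])   # Py     (positions 0 and 1)
print(S[:])    # Python (the whole string)

# Slicing never changes S; concatenation builds a brand-new string.
rebuilt = S[:2] + "-" + S[2:]
print(rebuilt)  # Py-thon
```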

*For brevity, the code sample below has no exception handling.*

<br clear="all" />
<img src="../../images/check.png" align="left" />
<br clear="all" />

What's the difference between division using `/` and division using `//`?

In [None]:
s1 = input("Enter a string: ")
word = input("Enter a word to insert in the middle: ")
middle = len(s1) // 2
part1 = s1[:middle] # everything up to, but not including, index middle
part2 = s1[middle:] # everything from index middle to the end
print("Your new string is: {0:s}".format(part1 + word + part2))

Reversing a string.  I'm using a formatted string here to right-align the prompt in a 26-character field (`{0:>26s}`) so that it lines up with the output message.

In [None]:
s = input("{0:>26s}".format("Enter a string:"))
s2 = ""
for i in range(len(s)-1,-1,-1):
    s2 += s[i]
print("Your string in reverse is: {0:s}".format(s2))
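
As an alternative sketch of the same task: slices accept an optional third value, a step, and a step of -1 walks the string backwards. The sample string below is a stand-in for user input, and `s3` is a name introduced only for this illustration.

```python
s = "stressed"  # stand-in for the text a user would type

# A step of -1 walks the string from the last character to the first.
s2 = s[::-1]
print("Your string in reverse is: {0:s}".format(s2))  # desserts

# reversed() combined with join() is another option with the same result.
s3 = "".join(reversed(s))
print(s3 == s2)  # True
```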

### Lists

Here are examples of some common design patterns using lists. Let's start with iterating through a list.

In [None]:
L = [12,55,33,81]
for item in L:
    print(item)

Next, let's examine how to add, remove and sort items in a list using different techniques.

In [None]:
L = [] # Empty list
L.append(23) # Append 23 to the end
L += [15,55] # We can "concatenate" two lists together
print("L = ",L)
L.insert(1,99) # Insert 99 at index 1.  This adds to the list.
print("L = ",L)
n = L.pop() # Return (and remove) the last item in a list
print("n = ",n)
print("L = ",L)
L.sort() # Sorting is performed in place
print("L sorted = ",L)
L[1] = 8 # Change the value at index 1
print("L = ",L)
L.pop(0) # Return (and remove) the item at index 0.  Ignore the return value
print("L = ",L)
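
Because `sort()` works in place, it can be useful to compare it with the built-in `sorted()` function, which leaves the original list alone. This is a short sketch; the names `ordered` and `result` are illustrative.

```python
L = [23, 99, 15]

# sorted() returns a new list and leaves the original untouched.
ordered = sorted(L)
print("L       =", L)
print("ordered =", ordered)

# sort() reorders the list itself and returns None.
result = L.sort()
print("L       =", L)
print("result  =", result)
```

A common mistake is writing `L = L.sort()`, which replaces the list with `None`.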

You can test if an item is contained in a list using the `in` operator.

*Note: For brevity, the code below does not use exception handling.*

In [None]:
import random

# Create a list with 10 random integers from 1 to 100
L = []
# We can use "_" to mean "I don't need the loop variable, I just want to iterate 10 times".
for _ in range(10):
    L.append(random.randint(1,100))
    
print("L = ",L)
n = int(input("Enter an integer: "))

if n in L:
    print("{0:d} is in L".format(n))
else:
    print("{0:d} is not in L".format(n))

Strings are immutable, but you can convert strings into lists, manipulate individual characters, then turn the list back into a string.

*Spend some time exploring the Python `join()` method for strings.  It's very powerful.*
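
As a starting point for that exploration, here is a brief sketch of `join()` with a few different separators. The lists `words` and `numbers` are made-up sample data.

```python
words = ["red", "green", "blue"]  # illustrative data

# The string you call join() on is placed between the items.
print(", ".join(words))  # red, green, blue
print("-".join("abc"))   # a-b-c  (a string is an iterable of characters)

# join() only accepts strings, so convert other types first.
numbers = [1, 2, 3]
joined = " + ".join(str(n) for n in numbers)
print(joined)  # 1 + 2 + 3

# split() goes the other way: from a string back to a list.
print(joined.split(" + "))  # ['1', '2', '3']
```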

In [None]:
# This program toggles the case of the input string (converts uppercase letters to lowercase
# and lowercase letters to uppercase)

s = input("Enter a string:")
L = list(s)
print(s)
print(L)

for i in range(len(L)):
    if L[i].islower():
        L[i] = L[i].upper()
    else:
        L[i] = L[i].lower()

print(L)
s = "".join(L)
print(s)

The previous example was illustrative, but a bit overkill for the required task.  A more "Pythonic" way to do it would be as shown below :-)

In [None]:
# This program toggles the case of the input string (converts uppercase letters to lowercase
# and lowercase letters to uppercase)

s = input("Enter a string:")
print(s)
print("".join(c.lower() if c.isupper() else c.upper() for c in s))
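
For this particular task, Python strings also offer a built-in `swapcase()` method. The sketch below uses a fixed sample string in place of user input; the names `toggled` and `manual` are illustrative.

```python
s = "Hello, World!"  # stand-in for user input

toggled = s.swapcase()
print(toggled)  # hELLO, wORLD!

# For plain ASCII text this matches the generator-expression version.
manual = "".join(c.lower() if c.isupper() else c.upper() for c in s)
print(toggled == manual)  # True
```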

### Dictionaries

Here are examples of some common design patterns using dictionaries. Let's start with creating a dictionary and adding / accessing *key:value* pairs.  Let's assume we're using strings for both keys and values in our dictionary.

Note the syntax below for adding a *key:value* pair to a dictionary.  It looks a lot like the syntax for accessing items in a list.

In [None]:
# This is how we create an empty dictionary
D = {}
# Add key-value pairs
D["Jan"] = "This is the first month."
D["Aug"] = "This month is very warm."
D["Dec"] = "This month has holidays."
# Note that dictionaries are not sorted.  Since Python 3.7 the key:value pairs print
# in the order they were inserted; older versions made no guarantee about the order.
print(D)
key = input("Enter a month:")
# Dictionaries support the "in" operator when looking for keys
if key in D:
    print(D[key])
else:
    print("Sorry, {0:s} is not in the dictionary".format(key))

Dictionaries are not kept in sorted order, but what if we want to traverse a dictionary in order by sorted keys?  Python allows you to extract the keys from a *dictionary* object using the `keys()` method, which returns a *dict_keys* object. You can then convert the *dict_keys* object into a list, using the `list()` function.  From there you can sort it and iterate across it.

<br clear="all" />
<img src="../../images/check.png" align="left" />
<br clear="all" />

After exploring the code below, try inserting print statements with different variables and re-running it.  What do you get if you print `D.keys()`?  How about if you print `letters`?

In [None]:
import random
import string

# Create a dictionary with 10 random integers as keys and random 8-character strings as values.
letters = string.ascii_lowercase
D = {}
# We can use "_" to mean "I don't need the loop variable, I just want to iterate 10 times".
for _ in range(10):
    key = random.randint(1,100)
    # Make sure to generate unique keys.
    while key in D:
        key = random.randint(1,100)
    value = "".join(random.choice(letters) for _ in range(8))
    D[key] = value
    
# Unordered iterating
print("Unordered iterating:")
for key,value in D.items():
    print(key,value)
    
# Ordered iterating
print("\nOrdered iterating:")
keys = list(D.keys())
keys.sort()
for key in keys:
    print(key,D[key])
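
A shorter variation, sketched here with made-up data: the built-in `sorted()` function accepts a dictionary directly and returns a sorted list of its keys, so the separate `list()` and `sort()` steps can be combined. The name `pairs` is illustrative.

```python
D = {42: "forty-two", 7: "seven", 19: "nineteen"}  # made-up sample data

# Iterating over a dictionary yields its keys, so sorted(D) sorts the keys.
for key in sorted(D):
    print(key, D[key])

# sorted() also works on items(); each item is a (key, value) tuple.
pairs = sorted(D.items())
print(pairs)
```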

A few final thoughts on dictionaries and one last example.

* If you assign a value to a dictionary key, and the key is not in the dictionary, Python will add the new *key:value* pair to the dictionary.  If the key does exist in the dictionary, Python will replace the current value assigned to the given key with the new value.
* You can use the `len()` function on dictionaries to get the number of entries.
* You can't copy a dictionary simply by typing `D1 = D2`.  Dictionaries are objects, and the variables we use for them are references to those objects.  After `D1 = D2`, both D1 and D2 refer to the *same* dictionary.  You can use the `copy()` method for dictionaries instead.  *NOTE: The `copy()` method performs what's known as a "shallow" copy and works as expected when the values in your dictionary are simple, immutable data types.  If your dictionary contains mutable objects (such as lists) as values, then you must perform a "deep" copy using the `deepcopy()` function from the standard-library `copy` module.  This applies to sets and lists as well. See the Additional Resources section below for more information.*
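
The difference between a shallow and a deep copy only shows up when a value is itself mutable. Here is a minimal sketch, assuming a dictionary whose single value is a list; the names `original`, `shallow` and `deep` are illustrative. Note that `deepcopy()` is a function in the standard-library `copy` module.

```python
import copy

original = {"primes": [2, 3, 5]}  # the value is a mutable list

shallow = original.copy()       # new dictionary, but the same inner list
deep = copy.deepcopy(original)  # new dictionary and a new inner list

original["primes"].append(7)

print(shallow["primes"])  # [2, 3, 5, 7]  (shares the list with original)
print(deep["primes"])     # [2, 3, 5]     (fully independent)
```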

In [None]:
D1 = {}
D1[1] = "Number 1"
D1[2] = "Number 2"
D1[3] = "Number 3"

print(D1)
D1[2] = "Number two"
print(D1)

print("The number of entries in D1 =",len(D1))

# Does not make a copy; D1 and D2 refer to the same dictionary, so changing D2 also changes D1
print("\nIncorrect copy")
D2 = D1
D2[1] = "Number one"
print(D1)
print(D2)

# Makes a shallow copy, changing D2 will not affect D1
print("\nCorrect, shallow copy")
D2 = D1.copy()
D2[3] = "Number three"
print(D1)
print(D2)

### Sets

Here are examples of some common design patterns using sets. Let's start with creating a set and adding / accessing values.  Remember that sets are **unordered**.

NOTE: You can't initialize an empty set with this syntax: `S = {}`; that's the syntax for an empty dictionary.  You create an empty set like this: `S = set()`
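
A quick way to confirm this is to ask Python for the type of each object. This is a small sketch with illustrative names.

```python
empty_dict = {}
empty_set = set()

print(type(empty_dict))  # <class 'dict'>
print(type(empty_set))   # <class 'set'>

# Braces do create a set once they contain at least one item.
S = {10}
print(type(S))           # <class 'set'>
```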

In [None]:
S1 = set()

S1.add(10)
S1.add(11)
S1.add(12)
print(S1)

S2 = {13,14,15}
print(S2)

Items in sets are unique (no duplicates).  Adding an item that is already in the set leaves the set unchanged.

<br clear="all" />
<img src="../../images/check.png" align="left" />
<br clear="all" />

What if you try to remove an item that's not in the set?  After exploring the code below, modify it and see what happens.

In [None]:
S1 = {2,4,6,8}
print(S1)
S1.add(6) # Our set already contains 6
print(S1)
S1.remove(2)
print(S1)
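
Once you have tried the experiment above, you may want to compare `remove()` with the related `discard()` method. The sketch below shows one way to handle a missing item; the flag `removed` is a hypothetical name used only for this illustration.

```python
S1 = {4, 6, 8}

# remove() raises a KeyError when the item is missing.
try:
    S1.remove(2)
    removed = True
except KeyError:
    removed = False
    print("2 is not in the set")

# discard() removes the item if present and does nothing otherwise.
S1.discard(2)
S1.discard(4)
print(S1)
```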

Here's a really powerful use-case for sets.  Let's say you have a string and you want to know how many unique characters it contains and what they are.  This is super easy with sets!

In [None]:
strInput = input("Enter a string:")
S = set(strInput) # Convert the string to a set.  It will toss duplicate characters.
print("There are {0:d} unique characters in your string.".format(len(S)))
L = list(S)
L.sort() # Sort works "in place" so we need to create the list then sort it.
print("They are:","".join(L)) # Convert the list of unique characters into a string

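You can iterate across a set with a `for` loop, just as you can with a list.  Because sets are unordered, don't rely on the order in which the items appear.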

In [None]:
import random

# Create a set with random numbers and print it
S = set()
for _ in range(10):
    S.add(random.randint(1,100))
print(S)

# Iterate through the set and print each item
for item in S:
    print(item)

Python offers a robust library of methods to operate on sets.  Here are a few of the most commonly used:

<html>
<head>
<style>
table {
  font-family: arial, sans-serif;
  border-collapse: collapse;
  width: 100%;
}

td, th {
  border: 1px solid #dddddd;
  text-align: left;
  padding: 8px;
}

tr:nth-child(even) {
  background-color: #dddddd;
}
</style>
</head>
<body>

<table align="left">
  <tr>
    <th align="left">Method</th>
    <th align="left">Symbol</th>
    <th align="left">Description</th>
  </tr>
  <tr>
    <td align="left">S.add(n)</td>
    <td align="left">N/A</td>
    <td>Adds n to the set S.</td>
  </tr>
  <tr>
    <td align="left">S.clear()</td>
    <td align="left">N/A</td>
    <td>Removes all the elements from S.</td>
  </tr>
  <tr>
    <td align="left">S1.difference(S2)</td>
    <td align="left">S1 - S2</td>
    <td>Returns a set containing elements in S1 that are not in S2.</td>
  </tr>
  <tr>
    <td align="left">S1.intersection(S2)</td>
    <td align="left">S1 &amp; S2</td>
    <td>Returns a set containing elements common to S1 and S2.</td>
  </tr>
  <tr>
    <td align="left">S1.isdisjoint(S2)</td>
    <td align="left">N/A</td>
    <td>Returns True if S1 and S2 have no intersection; False otherwise.</td>
  </tr>
  <tr>
    <td align="left">S1.issubset(S2)</td>
    <td align="left">S1 &lt;&#61; S2</td>
    <td>Returns True if every element in S1 is in S2; False otherwise.</td>
  </tr>
  <tr>
    <td align="left">S1.issuperset(S2)</td>
    <td align="left">S1 >= S2</td>
    <td>Returns True if every element in S2 is in S1; False otherwise.</td>
  </tr>
  <tr>
    <td align="left">S1.remove(n)</td>
    <td align="left">N/A</td>
    <td>Removes n from S1; raises a KeyError if n is not in S1.</td>
  </tr>
  <tr>
    <td align="left">S1.symmetric_difference(S2)</td>
    <td align="left">S1 ^ S2</td>
    <td>Returns a set with elements in either S1 or S2, but not both.</td>
  </tr>
  <tr>
    <td align="left">S1.union(S2)</td>
    <td align="left">S1 | S2</td>
    <td>Returns a set with all elements that are in S1, in S2, or in both.</td>
  </tr>
</table>

</body>
</html>

In [None]:
S1 = {1,2,3,4,5}
S2 = {3,4,5,6,7}
S3 = {3,4}

print("S1",S1)
print("S2",S2)
print("S3",S3)
print("\nS1 - S2",S1 - S2)
print("S1 & S2",S1 & S2)
print("S3 <= S2",S3 <= S2)
print("S2 >= S1",S2 >= S1)
print("S1 ^ S2",S1 ^ S2)
print("S1 | S2",S1 | S2)
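
The cell above uses the operator symbols from the table. As a complementary sketch, the method forms give the same results, and they have one extra convenience: they accept any iterable, not just another set.

```python
S1 = {1, 2, 3, 4, 5}
S2 = {3, 4, 5, 6, 7}
S3 = {3, 4}

# Each method returns the same result as its operator.
print(S1.difference(S2) == S1 - S2)
print(S1.intersection(S2) == S1 & S2)
print(S3.issubset(S2) == (S3 <= S2))
print(S1.symmetric_difference(S2) == S1 ^ S2)
print(S1.union(S2) == S1 | S2)

# isdisjoint() has no operator form.
print(S3.isdisjoint({8, 9}))  # True: nothing in common

# Unlike the operators, the methods accept any iterable, such as a list.
print(S1.union([6, 7]))
```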

## Additional Resources

[Data Structures (list, dict, tuples, sets, strings)](http://thomas-cokelaer.info/tutorials/python/data_structures.html)

[Python Data Structures – Lists, Tuples, Sets, Dictionaries](https://data-flair.training/blogs/python-data-structures-tutorial/)

[Shallow vs Deep Copying of Python Objects](https://realpython.com/copying-python-objects/)

[Shallow and deep copy operations - Python Documentation](https://docs.python.org/3/library/copy.html)

<hr>

*MIT License*

*Copyright 2019-2020 Peter Nardi*

*Terms of use:*

*Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:*

*The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.*

*THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.*