
# 17 Membership and Sets

If we wanted to check whether a list contains a certain number or string, we could do something like this:

In [15]:
def check_membership(x, mylist: list) -> bool:
  """Checks to see if x is in mylist."""
  for member in mylist:
    if x == member:
      return True
  return False

In [16]:
a = 'banana'
b = 3
fruits = ['avocado', 'banana', 'cherry', 'durian']

print(check_membership(a, fruits))
print(check_membership(b, fruits))

True
False


This function iterates through the given list and checks whether ```x``` is equal to any of its members, returning ```True``` as soon as it finds a match and ```False``` if it reaches the end without one. Fortunately, there's a much easier way to do this:

In [17]:
a in fruits

True

... making it very easy to check if a specific value is in a list.

```<member> in <list>```

... will yield a ```True``` or ```False``` depending on whether the member is in the list or not.
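
As a small sketch of the same idea, ```in``` also has a negated form, ```not in```, and neither needs a hand-written loop. The list ```pantry``` below is a made-up example, not something from the earlier cells.

```python
# Hypothetical example list, used only for illustration.
pantry = ['flour', 'sugar', 'salt']

has_sugar = 'sugar' in pantry          # True: 'sugar' is a member
has_pepper = 'pepper' in pantry        # False: 'pepper' is not a member
needs_pepper = 'pepper' not in pantry  # True: "not in" is the opposite check

print(has_sugar, has_pepper, needs_pepper)
```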

Sometimes, we want a list-like data type that ignores duplicates, and only cares about whether or not a certain number, string, or object is in it. Say for example you are putting a bunch of fruits into a basket; at the end, you want to see what different types of fruit are in it, and not necessarily every single fruit in the basket. This becomes especially important when working with larger lists, where you wouldn't want to sift through 200 individual members to find all the unique values.

In [18]:
import random
big_basket = ['avocado'] * 15
big_basket.extend(['banana'] * 8)
big_basket.extend(['cherry'] * 30)
big_basket.extend(['durian'])
big_basket.extend(['grape'] * 20)
random.shuffle(big_basket)
print(big_basket)

['avocado', 'grape', 'avocado', 'cherry', 'cherry', 'avocado', 'grape', 'cherry', 'cherry', 'grape', 'cherry', 'cherry', 'cherry', 'grape', 'durian', 'cherry', 'grape', 'grape', 'banana', 'avocado', 'grape', 'cherry', 'cherry', 'grape', 'banana', 'banana', 'avocado', 'avocado', 'cherry', 'grape', 'grape', 'cherry', 'banana', 'avocado', 'cherry', 'grape', 'cherry', 'cherry', 'cherry', 'banana', 'grape', 'grape', 'avocado', 'cherry', 'cherry', 'avocado', 'avocado', 'grape', 'cherry', 'grape', 'cherry', 'grape', 'banana', 'cherry', 'grape', 'cherry', 'avocado', 'avocado', 'cherry', 'grape', 'cherry', 'cherry', 'cherry', 'banana', 'cherry', 'cherry', 'grape', 'avocado', 'avocado', 'avocado', 'banana', 'cherry', 'grape', 'cherry']


Can you find the durian? If you were handed the list (and not the above instructions for what to put in it), it would be easy to miss!

For these situations, it becomes helpful to use **sets**. Converting a list to a set is as easy as ```set(<list>)```.

In [19]:
myset = set(big_basket)
print(myset)

{'avocado', 'durian', 'grape', 'banana', 'cherry'}
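
To see the duplicates disappear, one option is to compare lengths before and after the conversion. This sketch uses a small made-up list, ```small_basket```, rather than ```big_basket```, so that it runs on its own.

```python
# Hypothetical basket with repeated fruits.
small_basket = ['cherry', 'grape', 'cherry', 'durian', 'grape', 'cherry']

fruit_types = set(small_basket)

print(len(small_basket))  # 6 items in the list, duplicates included
print(len(fruit_types))   # 3 distinct fruits in the set
```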


When checking for membership in sets, use ```in``` the same way you would use it for lists.

In [20]:
print('durian' in myset)
print('orange' in myset)

True
False


It's important to note that sets are unordered, so indexing (something like ```myset[1]```) does not work on them. You can still iterate through each member of a set with a for loop, but don't rely on the members coming out in any specific order.

In [21]:
for member in myset:
  print(member)

avocado
durian
grape
banana
cherry
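
The sketch below illustrates both points with a made-up set, ```colors```: indexing raises a ```TypeError```, and wrapping the set in ```sorted()``` is one way to loop in a predictable order when that matters.

```python
# Hypothetical set, used only for illustration.
colors = {'red', 'green', 'blue'}

try:
  colors[1]  # sets do not support indexing
except TypeError as error:
  print('Indexing failed:', error)

# sorted() returns a new list, so the order here is alphabetical
# no matter how the set happens to store its members.
for color in sorted(colors):
  print(color)
```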


To add or remove members from a set, use ```<set>.add(<member>)``` or ```<set>.remove(<member>)```.

In [22]:
# 'apple' and 'tomato' aren't in the set yet, so these add new members.
myset.add('apple')
myset.add('tomato')
# Adding a member that's already there doesn't do anything,
# but it doesn't cause an error either.
myset.add('banana')
myset.remove('durian')
print(myset)

{'avocado', 'grape', 'tomato', 'banana', 'cherry', 'apple'}


In [None]:
# However, trying to remove something that isn't in the set will cause an error.
myset.remove('blackberry')
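
A missing member makes ```remove()``` raise a ```KeyError```. As a hedged sketch, two common ways to deal with that are to catch the error, or to use ```<set>.discard(<member>)```, which removes the member if it is present and otherwise does nothing. The set ```berries``` is a made-up example.

```python
# Hypothetical set, used only for illustration.
berries = {'strawberry', 'blueberry'}

try:
  berries.remove('blackberry')
except KeyError:
  print("'blackberry' was not in the set, so remove() raised a KeyError.")

berries.discard('blackberry')  # not in the set: nothing happens, no error
berries.discard('blueberry')   # in the set: removed
print(berries)  # {'strawberry'}
```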

Finally, to convert a set to a list, you can use ```list(<set>)```. Keep in mind that the set doesn't keep track of how many times each member was added, or of the order in which members were added, so that information is missing from the resulting list.

In [23]:
mylist = list(myset)
print(mylist)

['avocado', 'grape', 'tomato', 'banana', 'cherry', 'apple']
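
If the counts do matter, a set is probably the wrong tool. One option from the standard library, not covered in this lesson, is ```collections.Counter```; the sketch below contrasts it with a list-to-set-to-list round trip on a made-up list, ```picked```.

```python
from collections import Counter

# Hypothetical list with repeated members.
picked = ['grape', 'cherry', 'grape', 'grape']

# Round trip through a set: duplicates are gone, and so are the counts.
round_trip = list(set(picked))
print(sorted(round_trip))  # ['cherry', 'grape']

# A Counter remembers how many times each member appeared.
counts = Counter(picked)
print(counts['grape'])   # 3
print(counts['cherry'])  # 1
```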


Notice how when printing the set in earlier cells, it was surrounded by curly braces ('{' and '}'), whereas the list version was surrounded by square brackets. This is Python's way of telling you what type your variable is!
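
The same braces and brackets can be typed directly to create a set or a list, and ```type()``` confirms which one you have. One caveat: empty braces ```{}``` create a dictionary (a different type), so an empty set has to be written as ```set()```. The names below are made up for illustration.

```python
snack_set = {'apple', 'pear'}    # curly braces: a set
snack_list = ['apple', 'pear']   # square brackets: a list

print(type(snack_set))   # <class 'set'>
print(type(snack_list))  # <class 'list'>

empty = set()            # the way to write an empty set
print(type(empty))       # <class 'set'>
print(type({}))          # <class 'dict'>, not a set
```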

# Practice!

Write a function, ```union()```, that takes in two lists and returns a set with all of the unique members of both lists combined.

Write a function, ```intersection()```, that takes in two lists and returns a set with all of the members present in both lists.

In [24]:
### YOUR CODE HERE ###

In [None]:
# Don't modify the contents of this cell!
fibonacci = [1, 1, 2, 3, 5, 8, 13, 21, 34]
parabola = [1, 5, 8, 9, 8, 5, 1]

# Should print {1, 2, 3, 5, 8, 9, 13, 21, 34}
print(union(fibonacci, parabola))

# Should print {1, 5, 8}
print(intersection(fibonacci, parabola))