### CS 125 Assignment

Before you turn this assignment in, make sure everything runs as expected. First, **restart the kernel** (in the menubar, select Kernel$\rightarrow$Restart) and then **run all cells** (in the menubar, select Cell$\rightarrow$Run All).

Make sure you fill in any place that says `YOUR CODE HERE` or "YOUR ANSWER HERE", as well as your name and collaborators below:

In [1]:
NAME = "Leo Wheeler"
COLLABORATORS = ""

---
---

# Lab 10

Objectives:

- Learn about sets in Python
- Gain experience using sets to answer questions

---


## Created by:
- Michael Stobb
- With collaborators: 
    - None
- Lab10

Remember that you are **encouraged** to collaborate on lab activities – just be sure to: 
1. Document your collaborators and sources.
2. Don’t electronically share the code.
3. Understand what you submit. 


---

## Getting Started

The following readings should be done **before** starting the lab assignment.  Be sure you fully understand what each line is doing and what it means before moving on.  If something doesn't make sense, you should:
1. Try different things (write some code, see what happens)
2. Ask your friend in the class
3. Ask the instructor

### Sets in Python

We have been encountering many problems that require us to reason about membership in a collection of data.  Specifically, we wanted to track the unique occurrences of an item in a collection (think Green Eggs & Ham) or the number of letters in a word (ScrabbleWords); sometimes we just want to know whether an item is in the list.  In fact, there is a whole branch of mathematics, known as *Set Theory*, that studies these kinds of problems. We have found that we were able to create solutions for these problems using lists or dictionaries. However, problems that rely on these kinds of operations are common enough that we don’t want to have to recreate functional support for them every time we need them. So, as we have seen before, Python has included a built-in structure to help us manage these kinds of problems: the **set**.

#### 1
Create three sets that contain some letters:

In [2]:
set1 = set("abcd")
set2 = set("cdef")
set3 = set("aabbccdd")

In [3]:
set1

{'a', 'b', 'c', 'd'}

In [4]:
set2

{'c', 'd', 'e', 'f'}

In [5]:
set3

{'a', 'b', 'c', 'd'}

Examine the contents of these sets and make sure you understand what Python created.
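As a minimal sketch of what happened above: `set()` accepts any iterable and keeps one copy of each element, so `set1` and `set3` end up equal. The `words` list is made-up example data, not part of the lab files.

```python
# set() accepts any iterable and keeps only one copy of each element.
set1 = set("abcd")
set3 = set("aabbccdd")

# Duplicates in the input string are discarded, so both sets are equal.
print(set1 == set3)                    # True

# The same works for a list (hypothetical example data).
words = ["ham", "eggs", "ham", "green", "eggs"]
unique_words = set(words)
print(len(words), len(unique_words))   # 5 3
```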

#### 2
When dealing with other collections, we have a couple of attributes that inform what we can do with them.  See if you can figure out how to answer the following questions:
- Are sets mutable?
- Are sets ordered?
- Are sets indexable?
- Are sets iterable?
- Can I determine the length?
- Can I determine if a specific element is in the set?

1. Sets are mutable
2. Sets are not ordered
3. Sets are not indexable
4. Sets are iterable
5. Yes, using the `len()` function
6. Yes, using the `in` operator
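One way to check these answers is to try each property directly. The following is only a small sketch; the variable names are chosen for illustration.

```python
s = set("abcd")

# Mutable: add() changes the set in place.
s.add("e")
has_e = "e" in s          # membership test with `in`
size = len(s)             # number of elements

# Iterable: a loop visits every element, but in no guaranteed order,
# so the letters are sorted here to get a predictable list.
visited = sorted(letter for letter in s)

# Not indexable: s[0] raises TypeError.
try:
    s[0]
    indexable = True
except TypeError:
    indexable = False

print(has_e, size, visited, indexable)
```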

#### 3
Notice that sets are written with `{}`, the same braces used for dictionaries.  Because `{}` by itself creates an empty dictionary, an empty set (which is a very important concept) has to be created explicitly with `set()`:

In [6]:
s = set()

In [7]:
s

set()
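A quick sketch of why the explicit call is needed: empty braces give a dictionary, while `set()` gives a set. The variable names are illustrative.

```python
empty_braces = {}      # this is an empty dictionary, not a set
empty_set = set()      # this is an empty set

print(type(empty_braces))   # <class 'dict'>
print(type(empty_set))      # <class 'set'>
print(len(empty_set))       # 0
```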

#### 4
How can you differentiate sets and dictionaries when they have members?  Which of the following is a dictionary and which is a set?

In [8]:
a = {"abc", "cde", "efh"}
b = {"abc":1, "cde":2, "efh":3}

### `b` is the dictionary because it has keys and values; `a` is the set because it just has values
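One way to confirm this, as a small sketch, is to ask Python for the type of each object and to look a key up in the dictionary.

```python
a = {"abc", "cde", "efh"}
b = {"abc": 1, "cde": 2, "efh": 3}

print(type(a).__name__)   # set
print(type(b).__name__)   # dict

# Only the dictionary maps each key to a value.
print(b["abc"])           # 1

# Converting the dictionary to a set keeps just its keys.
print(set(b) == a)        # True
```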

#### 5
Since we care about membership in sets, how do we add and remove elements from an existing set?  Notice that the set has the ability to **add**, **remove**, and **pop**. Try out these operations to make sure that you understand how they behave.  Most importantly, are they consistent with other collections (lists, dictionaries)?

In [9]:
set5 = {1,2,3,4,4,4,5,6,7,7,8,9,0}

In [10]:
set5.pop()

0

In [11]:
set5

{1, 2, 3, 4, 5, 6, 7, 8, 9}
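The sketch below tries `add`, `remove`, and `pop` on a small set named `nums` (a name chosen for illustration), plus `discard`, a related built-in method not mentioned above. Unlike `list.pop()`, which removes the last item, `set.pop()` removes an arbitrary element, so the code does not assume which one comes back.

```python
nums = {1, 2, 3}

nums.add(4)        # adds an element
nums.add(4)        # adding a duplicate changes nothing
print(nums == {1, 2, 3, 4})   # True

nums.remove(1)     # removes a specific element
try:
    nums.remove(99)           # a missing element raises KeyError
except KeyError:
    print("99 is not in the set")

nums.discard(99)   # discard() ignores missing elements instead of raising

item = nums.pop()  # removes and returns an arbitrary element
print(item in nums)           # False
print(len(nums))              # 2
```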

#### 6
Sets allow us to reason about membership in groups. There are dedicated operations that help facilitate this. Perform the following operations and make sure that you understand the outcomes.

In [12]:
set1 = set("abcd")
set2 = set("cdef")
set3 = set("aabbccdd")

In [13]:
set1.intersection(set2)

{'c', 'd'}

In [14]:
set1.union(set2)

{'a', 'b', 'c', 'd', 'e', 'f'}

In [15]:
set1.difference(set2)

{'a', 'b'}

In [16]:
set1.symmetric_difference(set2)

{'a', 'b', 'e', 'f'}

Do these functions change the original set or return a **new** set?

### Return a new set
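As a sketch of the difference between returning a new set and changing one in place: the operators `&`, `|`, `-`, and `^` are the operator forms of the four methods above and also build new sets, while the `*_update` methods (for example `intersection_update`) modify the set they are called on. The variable names are illustrative.

```python
set1 = set("abcd")
set2 = set("cdef")

# Operator forms of the four methods; each builds a new set.
common = set1 & set2        # intersection
either = set1 | set2        # union
only1 = set1 - set2         # difference
one_side = set1 ^ set2      # symmetric difference

# The originals are untouched.
print(set1 == set("abcd"), set2 == set("cdef"))   # True True

# The *_update methods modify the set in place and return None.
changed = set("abcd")
result = changed.intersection_update(set2)
print(result, changed == {"c", "d"})              # None True
```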

In [17]:
help(set)

Help on class set in module builtins:

class set(object)
 |  set() -> new empty set object
 |  set(iterable) -> new set object
 |  
 |  Build an unordered collection of unique elements.
 |  
 |  Methods defined here:
 |  
 |  __and__(self, value, /)
 |      Return self&value.
 |  
 |  __contains__(...)
 |      x.__contains__(y) <==> y in x.
 |  
 |  __eq__(self, value, /)
 |      Return self==value.
 |  
 |  __ge__(self, value, /)
 |      Return self>=value.
 |  
 |  __getattribute__(self, name, /)
 |      Return getattr(self, name).
 |  
 |  __gt__(self, value, /)
 |      Return self>value.
 |  
 |  __iand__(self, value, /)
 |      Return self&=value.
 |  
 |  __init__(self, /, *args, **kwargs)
 |      Initialize self.  See help(type(self)) for accurate signature.
 |  
 |  __ior__(self, value, /)
 |      Return self|=value.
 |  
 |  __isub__(self, value, /)
 |      Return self-=value.
 |  
 |  __iter__(self, /)
 |      Implement iter(self).
 |  
 |  __ixor__(self, value, /)
 |      Re

In [18]:
help(set.difference)

Help on method_descriptor:

difference(...)
    Return the difference of two or more sets as a new set.
    
    (i.e. all elements that are in this set but not the others.)



In [19]:
help(set.symmetric_difference)

Help on method_descriptor:

symmetric_difference(...)
    Return the symmetric difference of two sets as a new set.
    
    (i.e. all elements that are in exactly one of the sets.)



# Assignment

#### Problem 0

The Venn Diagram below represents the relationship between three sets of data. Each circle represents a defined set of data. Some of the data is shared between these sets.
![venn_diagram.png](attachment:venn_diagram.png)

1. Write Python code to create the three defined sets that are consistent with the diagram.  You might call the sets `red`, `green`, and `blue`.

2. Write Python code that manipulates your original sets to give the following results:
    1. {i,j}
    2. {a,b,c}
    3. {i,j,n,o,p}
    4. {d}
    5. {k,l,m,n,o,p}

In [20]:
red = set('abcdklmij')
blue = set('ijklmnopqrst')
green = set('defghijnop')

In [21]:
red.intersection(blue,green)

{'i', 'j'}

In [22]:
red.difference(blue,green)

{'a', 'b', 'c'}

In [23]:
green.intersection(blue)

{'i', 'j', 'n', 'o', 'p'}

In [24]:
red.intersection(green).difference(blue)

{'d'}

In [25]:
red.intersection(blue).union(green.intersection(blue)).difference(red.intersection(blue,green))

{'k', 'l', 'm', 'n', 'o', 'p'}

#### Problem 1
The data files /srv/data/lab10/cs.txt (on Moodle as cs.txt) and /srv/data/lab10/math.txt (on Moodle as math.txt) contain the names of the computer science majors and math majors. Write some Python code that allows you to determine:
1. All of the math-cs double majors.
2. All of the people majoring in math or cs.
3. All of the people who are strictly cs majors.
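Each of the three questions above corresponds to one set operation. The sketch below uses two tiny made-up rosters in place of cs.txt and math.txt; the names and the variables `cs_majors` and `math_majors` are hypothetical.

```python
# Hypothetical rosters standing in for cs.txt and math.txt.
cs_majors = {"Ada,Lovelace", "Alan,Turing", "Grace,Hopper"}
math_majors = {"Alan,Turing", "Emmy,Noether"}

double_majors = cs_majors & math_majors   # in both sets
any_major = cs_majors | math_majors       # in at least one set
cs_only = cs_majors - math_majors         # in CS but not in math

print(sorted(double_majors))   # ['Alan,Turing']
print(len(any_major))          # 4
print(sorted(cs_only))         # ['Ada,Lovelace', 'Grace,Hopper']
```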

In [26]:
def major_check():
    # Read each roster. split() breaks on any whitespace, so no strip() is
    # needed, and `with` closes each file when its block ends.
    with open('/srv/data/lab10/cs.txt') as cs_file:
        cs_set = set(cs_file.read().split())
    with open('/srv/data/lab10/math.txt') as math_file:
        math_set = set(math_file.read().split())

    both_set = math_set.union(cs_set)               # majoring in math or CS
    double_majors = cs_set.intersection(math_set)   # majoring in both

    major = input('What major are you looking for? (CS, CS and Math, or Double Majors) ').upper()

    if major == 'CS':
        return cs_set
    elif major == 'CS AND MATH':
        return both_set
    elif major == 'DOUBLE MAJORS':
        return double_majors
    else:
        print('That is not a valid input, please re-run the function.')
        return None

In [27]:
major_check()

What major are you looking for? (CS, CS and Math, or Double Majors) cs


{'Abel,Maclead',
 'Adell,Lipkin',
 'Aja,Gehrett',
 'Alaine,Bergesen',
 'Ammie,Corrio',
 'Art,Venere',
 'Arthur,Farrow',
 'Audry,Yaw',
 'Beatriz,Corrington',
 'Beckie,Silvestrini',
 'Bernardine,Rodefer',
 'Brandon,Callaro',
 'Brock,Bolognia',
 'Buddy,Cloney',
 'Cammy,Albares',
 'Caprice,Suell',
 'Carlee,Boulter',
 'Carma,Vanheusen',
 'Catarina,Gleich',
 'Christiane,Eschberger',
 'Claribel,Varriano',
 'Clay,Hoa',
 'Cristy,Lother',
 'Delmy,Ahle',
 'Dick,Wenzinger',
 'Donette,Foller',
 'Eden,Jayson',
 'Elouise,Gwalthney',
 'Elvera,Benimadho',
 'Erick,Ferencz',
 'Felicidad,Poullion',
 'France,Buzick',
 'Gail,Kitty',
 'Gayla,Schnitzler',
 'Gladys,Rim',
 'Glory,Kulzer',
 'Golda,Kaniecki',
 'Gracia,Melnyk',
 'Graciela,Ruta',
 'Gregoria,Pawlowicz',
 'Harrison,Haufler',
 'Herminia,Nicolozakes',
 'Ilene,Eroman',
 'Irma,Wolfgramm',
 'Izetta,Funnell',
 'Jamal,Vanausdal',
 'Jennie,Drymon',
 'Jerry,Dallen',
 'Jina,Briddick',
 'Johnetta,Abdallah',
 'Jolene,Ostolaza',
 'Joseph,Cryer',
 'Josephine,Darak

#### Problem 2
The file /srv/data/lab10/studentYear.txt (on Moodle as studentYear.txt) has the same students (+ a few extra) ranked according to their class year (1 = freshman, 2 = sophomore, 3 = junior, 4 = senior).
1. Find all sophomore-level CS majors.
2. Find all freshmen who are not majoring in math or CS.
3. Find all the senior math and CS majors.
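One possible approach, sketched here with made-up data: group the names into one set per class year, then combine those sets with the major sets. The sketch assumes each line holds a year followed by a name, separated by whitespace. All names and variables below are hypothetical.

```python
# Hypothetical lines in an assumed "year name" layout, plus made-up rosters.
lines = ["1 Ann,Lee", "2 Bo,Chan", "2 Cy,Diaz", "4 Di,Evans", "1 Ed,Fox", ""]
cs_majors = {"Bo,Chan", "Di,Evans"}
math_majors = {"Cy,Diaz", "Di,Evans"}

# Build a dictionary that maps each year to the set of names in that year.
by_year = {}
for line in lines:
    parts = line.split()
    if len(parts) != 2:
        continue                      # skip blank or malformed lines
    year, name = parts
    by_year.setdefault(year, set()).add(name)

either_major = cs_majors | math_majors

soph_cs = by_year["2"] & cs_majors             # sophomores who are CS majors
fresh_other = by_year["1"] - either_major      # freshmen in neither major
senior_majors = by_year["4"] & either_major    # seniors in math or CS

print(sorted(soph_cs))         # ['Bo,Chan']
print(sorted(fresh_other))     # ['Ann,Lee', 'Ed,Fox']
print(sorted(senior_majors))   # ['Di,Evans']
```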

In [28]:
def year_finder():
    # `with` closes each file when its block ends.
    with open('/srv/data/lab10/cs.txt') as cs_file:
        cs_set = set(cs_file.read().split())
    with open('/srv/data/lab10/math.txt') as math_file:
        math_set = set(math_file.read().split())
    with open('/srv/data/lab10/studentYear.txt') as year_file:
        student_year = year_file.readlines()

    both_set = math_set.union(cs_set)
    soph_cs = set()
    fresh_none = set()
    sen_both = set()

    # Each line is expected to hold a class year followed by a name.
    for line in student_year:
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or incomplete lines (e.g. a trailing newline)
        year, name = fields[0], fields[1]
        if year == '2' and name in cs_set:
            soph_cs.add(name)
        elif year == '1' and name not in both_set:
            fresh_none.add(name)
        elif year == '4' and name in both_set:
            sen_both.add(name)

    inquiry = input('What are you looking for? (Soph CS Majors, Freshmen not Math or CS, Senior Math or CS): ').upper()

    if inquiry == 'SOPH CS MAJORS':
        return soph_cs
    elif inquiry == 'FRESHMEN NOT MATH OR CS':
        return fresh_none
    elif inquiry == 'SENIOR MATH OR CS':
        return sen_both
    else:
        print('That is not a valid input, please re-run the function')
        return None


In [29]:
year_finder()

What are you looking for? (Soph CS Majors, Freshmen not Math or CS, Senior Math or CS): soph cs majors


{'Cammy,Albares',
 'Christiane,Eschberger',
 'Clay,Hoa',
 'Elouise,Gwalthney',
 'Elvera,Benimadho',
 'France,Buzick',
 'Gladys,Rim',
 'Irma,Wolfgramm',
 'Jerry,Dallen',
 'Jina,Briddick',
 'Joseph,Cryer',
 'Kattie,Vonasek',
 'Kris,Marrier',
 'Leota,Dilliard',
 'Leota,Ragel',
 'Mollie,Mcdoniel',
 'Noah,Kalafatis',
 'Nu,Mcnease',
 'Raina,Brachle',
 'Roosevelt,Hoffis',
 'Rosio,Cork',
 'Tiffiny,Steffensmeier',
 'Vi,Rentfro',
 'Vincenza,Zepp',
 'Willow,Kusko',
 'Yoko,Fishburne',
 'Yuki,Whobrey'}