
### **What Are Sets in Python?**
A **set** in Python is an **unordered** collection of **unique** items. It’s a powerful data structure when you need to store distinct elements and perform operations like unions, intersections, and differences.

- **Unordered**: Items in a set do not have a specific order.
- **Unique**: Duplicate elements are automatically removed.
- **Mutable**: You can add or remove elements from a set, but the elements themselves must be hashable, which in practice means immutable values such as numbers, strings, and tuples of hashable values.
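
The element rule can be checked directly. The following is a minimal sketch (the variable names are illustrative, not from the lecture): hashable values are accepted, while a list is rejected with a `TypeError`.

```python
# Hashable values (numbers, strings, tuples of hashable values) can be set elements.
valid = {1, "two", (3, 4)}
print(len(valid))  # 3

# A list is mutable and therefore unhashable, so it cannot be a set element.
try:
    invalid = {1, [2, 3]}
except TypeError as err:
    print("TypeError:", err)

# frozenset is the immutable, hashable variant of set, so it can be nested.
nested = {frozenset({1, 2}), frozenset({3})}
print(len(nested))  # 2
```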

---

### **Syntax**
```python
my_set = {1, 2, 3}
```
Or, use the `set()` constructor:
```python
my_set = set([1, 2, 3])
```
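
One detail the two forms above do not show: an empty set has to be created with `set()`, because empty braces create a dictionary. A short sketch, with variable names chosen here for illustration:

```python
empty_set = set()   # an empty set
empty_braces = {}   # NOT a set: empty braces create a dict

print(type(empty_set).__name__)     # set
print(type(empty_braces).__name__)  # dict

# set() accepts any iterable, for example a string.
letters = set("hello")
print(len(letters))  # 4, because 'l' appears twice but is stored once
```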

---

### **Why Use Sets?**
- To eliminate duplicates from a list.
- To perform mathematical operations like union, intersection, and difference.
- To check for membership quickly.

---

### **Sets in Google Colab**
Sets work the same way in Google Colab as in any other Python environment. Because Colab displays the value of the last expression in a cell, you can inspect the result of each set operation as you go, which is helpful for learning.

---

### **Key Operations with Examples**
1. **Creating a Set**
   ```python
   fruits = {"apple", "banana", "cherry"}
   print(fruits)
   ```

2. **Adding Elements**
   ```python
   fruits.add("orange")
   print(fruits)
   ```

3. **Removing Elements**
   ```python
   fruits.remove("banana")  # Raises KeyError if the item is not found
   print(fruits)

   # Use discard() to avoid errors if the item is not found
   fruits.discard("mango")
   print(fruits)
   ```

4. **Checking Membership**
   ```python
   print("apple" in fruits)  # True
   print("grape" in fruits)  # False
   ```

5. **Union of Sets**
   Combines all unique elements from two sets.
   ```python
   set1 = {1, 2, 3}
   set2 = {3, 4, 5}
   print(set1 | set2)  # {1, 2, 3, 4, 5}
   ```

6. **Intersection of Sets**
   Finds common elements between two sets.
   ```python
   print(set1 & set2)  # {3}
   ```

7. **Difference of Sets**
   Finds elements in one set but not in another.
   ```python
   print(set1 - set2)  # {1, 2}
   ```

8. **Symmetric Difference**
   Finds elements in either set but not in both.
   ```python
   print(set1 ^ set2)  # {1, 2, 4, 5}
   ```

9. **Converting a List to a Set to Remove Duplicates**
   ```python
   numbers = [1, 2, 2, 3, 4, 4, 5]
   unique_numbers = set(numbers)
   print(unique_numbers)  # {1, 2, 3, 4, 5}
   ```
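
The operators used above (`|`, `&`, `-`, `^`) each have a named method equivalent. A minimal sketch of the difference, assuming a second operand that is a list rather than a set (`other` is a name introduced here): the methods accept any iterable, while the operators require sets on both sides.

```python
set1 = {1, 2, 3}
other = [3, 4, 5]  # a list, not a set

# The named methods accept any iterable.
print(set1.union(other) == {1, 2, 3, 4, 5})             # True
print(set1.intersection(other) == {3})                  # True
print(set1.difference(other) == {1, 2})                 # True
print(set1.symmetric_difference(other) == {1, 2, 4, 5}) # True

# The operators require both operands to be sets.
try:
    set1 | other
except TypeError as err:
    print("TypeError:", err)
```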

---

### **Practical Example in Colab**
Here’s a hands-on example to try in Google Colab:

```python
# Create two sets
class_A = {"Alice", "Bob", "Charlie"}
class_B = {"Charlie", "David", "Eve"}

# Union: Students in either class
print("Union:", class_A | class_B)

# Intersection: Students in both classes
print("Intersection:", class_A & class_B)

# Difference: Students only in class_A
print("Difference (A - B):", class_A - class_B)

# Symmetric Difference: Students in one class but not both
print("Symmetric Difference:", class_A ^ class_B)

# Check membership
if "Alice" in class_B:
    print("Alice is in Class B")
else:
    print("Alice is NOT in Class B")
```
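
Beyond combining sets, you can also compare them. The sketch below reuses the two class rosters and adds a hypothetical `study_group` set (not part of the original example) to show subset, superset, and disjointness checks.

```python
class_A = {"Alice", "Bob", "Charlie"}
class_B = {"Charlie", "David", "Eve"}
study_group = {"Alice", "Bob"}  # hypothetical group, added for illustration

print(study_group <= class_A)           # True: every member is in class_A
print(study_group.issubset(class_B))    # False
print(class_A >= study_group)           # True: class_A is a superset
print(study_group.isdisjoint(class_B))  # True: no shared students
print(class_A.isdisjoint(class_B))      # False: Charlie is in both
```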

---

### **Why Sets Are Great in Colab**
- Sets are fast for membership testing: a lookup takes roughly constant time on average, regardless of how many elements the set holds.
- They simplify complex operations, such as finding common or unique elements between groups.
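
To see the membership-testing difference for yourself, a rough timing sketch such as the one below can be run in a Colab cell. The collection size and repetition count are arbitrary choices made here, and the measured times will vary by machine, so treat the numbers as indicative only.

```python
import timeit

n = 100_000
as_list = list(range(n))
as_set = set(as_list)
target = n - 1  # worst case for the list: the last element

# A list is scanned element by element; a set uses a hash lookup.
list_time = timeit.timeit(lambda: target in as_list, number=100)
set_time = timeit.timeit(lambda: target in as_set, number=100)

print(f"list: {list_time:.6f} s, set: {set_time:.6f} s")
```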


# Lecture Practice

In [None]:
job_skills = {"Python", "SQL", "R", "Tableau", "Power BI"}
print(job_skills)

{'Power BI', 'SQL', 'R', 'Tableau', 'Python'}


In [None]:
job_skills
# the notebook display may show the elements sorted, but a set has no guaranteed order, so do not rely on it

{'Power BI', 'Python', 'R', 'SQL', 'Tableau'}

In [None]:
job_skills[1]

TypeError: 'set' object is not subscriptable

In [None]:
# sets are not indexed
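
Because a set has no positions, one way to pick an element by position is to convert the set to a sorted list first. A minimal sketch, assuming alphabetical order is what you want (`ordered_skills` is a name introduced here):

```python
job_skills = {"Python", "SQL", "R", "Tableau", "Power BI"}

# sorted() returns a new list, which can be indexed.
ordered_skills = sorted(job_skills)
print(ordered_skills)     # ['Power BI', 'Python', 'R', 'SQL', 'Tableau']
print(ordered_skills[1])  # Python

# Iterating over the set itself also works, but in no particular order.
for skill in job_skills:
    print(skill)
```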

In [None]:
help(set)

Help on class set in module builtins:

class set(object)
 |  set() -> new empty set object
 |  set(iterable) -> new set object
 |  
 |  Build an unordered collection of unique elements.
 |  
 |  Methods defined here:
 |  
 |  __and__(self, value, /)
 |      Return self&value.
 |  
 |  __contains__(...)
 |      x.__contains__(y) <==> y in x.
 |  
 |  __eq__(self, value, /)
 |      Return self==value.
 |  
 |  __ge__(self, value, /)
 |      Return self>=value.
 |  
 |  __getattribute__(self, name, /)
 |      Return getattr(self, name).
 |  
 |  __gt__(self, value, /)
 |      Return self>value.
 |  
 |  __iand__(self, value, /)
 |      Return self&=value.
 |  
 |  __init__(self, /, *args, **kwargs)
 |      Initialize self.  See help(type(self)) for accurate signature.
 |  
 |  __ior__(self, value, /)
 |      Return self|=value.
 |  
 |  __isub__(self, value, /)
 |      Return self-=value.
 |  
 |  __iter__(self, /)
 |      Implement iter(self).
 |  
 |  __ixor__(self, value, /)
 |      Re

In [None]:
job_skills.add('Looker')

In [None]:
job_skills

{'Looker', 'Power BI', 'Python', 'R', 'SQL', 'Tableau'}

In [None]:
job_skills.add("SQL")

In [None]:
job_skills

{'Looker', 'Power BI', 'Python', 'R', 'SQL', 'Tableau'}

In [None]:
# notice that "SQL" is not repeated: sets only contain unique values
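
To add several values at once, `update()` takes any iterable and applies the same uniqueness rule. A short sketch, where "Excel" is an extra value made up for illustration:

```python
job_skills = {"Python", "SQL", "R", "Tableau", "Power BI"}
size_before = len(job_skills)

job_skills.add("SQL")  # already present, so nothing changes
print(len(job_skills) == size_before)  # True

# update() adds every element of an iterable; duplicates are ignored.
job_skills.update(["Looker", "SQL", "Excel"])
print(sorted(job_skills))
# ['Excel', 'Looker', 'Power BI', 'Python', 'R', 'SQL', 'Tableau']
```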

In [None]:
job_skills.pop("Tableau")

TypeError: set.pop() takes no arguments (1 given)

In [None]:
# a set has no index and no order, so pop() takes no argument and removes an arbitrary element
job_skills.pop()

'Power BI'

In [None]:
job_skills
# "Power BI" was the arbitrary element that pop() removed

{'Looker', 'Python', 'R', 'SQL', 'Tableau'}

In [None]:
job_skills.remove("R")

In [None]:
job_skills
# remove() deleted exactly the element that was named: "R"

{'Looker', 'Python', 'SQL', 'Tableau'}
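
The three removal methods behave differently when the element is missing or the set is empty. The following sketch starts from the set as it stands at this point in the practice and shows each case:

```python
job_skills = {"Looker", "Python", "SQL", "Tableau"}

# remove() raises KeyError when the element is absent ("R" was removed already).
try:
    job_skills.remove("R")
except KeyError as err:
    print("KeyError:", err)

# discard() does nothing when the element is absent.
job_skills.discard("R")

# pop() removes and returns an arbitrary element.
removed = job_skills.pop()
print(removed in job_skills)  # False
print(len(job_skills))        # 3

# pop() on an empty set raises KeyError.
empty = set()
try:
    empty.pop()
except KeyError as err:
    print("KeyError:", err)
```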

## Converting lists into sets

In [None]:
skill_list = ["Python", "SQL", "R", "Tableau", "Power BI", "Python", "SQL"]

In [None]:
set(skill_list)
# converting the list to a set removes the duplicates; the order shown is not guaranteed

{'Power BI', 'Python', 'R', 'SQL', 'Tableau'}

In [None]:
list(set(skill_list))
# now we convert the set back into a list after removing the duplicates; the resulting order is arbitrary

['Power BI', 'Python', 'SQL', 'R', 'Tableau']
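
The round trip through a set loses the original order of the list. If the order matters, two common alternatives are sketched below (the names `unique_in_order` and `unique_sorted` are introduced here): `dict.fromkeys()` keeps the first occurrence of each item in its original position, and `sorted()` gives a predictable alphabetical order.

```python
skill_list = ["Python", "SQL", "R", "Tableau", "Power BI", "Python", "SQL"]

# dict keys are unique and keep insertion order (Python 3.7+).
unique_in_order = list(dict.fromkeys(skill_list))
print(unique_in_order)  # ['Python', 'SQL', 'R', 'Tableau', 'Power BI']

# sorted() on the set returns a list in alphabetical order.
unique_sorted = sorted(set(skill_list))
print(unique_sorted)    # ['Power BI', 'Python', 'R', 'SQL', 'Tableau']
```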