# Sets

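- A set is an unordered collection of unique elements. A set literal is written with curly braces, for example `{1, 2, 3}`. Note that `{}` on its own creates an empty dictionary; use `set()` to create an empty set.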

- Sets support mathematical set operations such as union, intersection, difference, and symmetric difference.
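
The short sketch below illustrates two points that are easy to trip over: `{}` creates a dictionary rather than a set, and set elements must be hashable. The variable names are illustrative only.

```python
# {} is an empty dict; set() is an empty set.
empty_braces = {}
empty_set = set()
print(type(empty_braces).__name__)  # dict
print(type(empty_set).__name__)     # set

# Duplicates are dropped when the set is built.
letters = set("banana")
print(sorted(letters))  # ['a', 'b', 'n']

# Elements must be hashable: a tuple works, a list does not.
try:
    bad = {[1, 2], 3}
except TypeError as exc:
    error_message = str(exc)
    print("TypeError:", error_message)
```

This is also why the cells below place a tuple and a `frozenset` inside a set, but never a list or another plain `set`.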

In [1]:
#import builtins

# Creating Sets
print("\nCreating Sets:")

# Creating an empty set
empty_set = set()
print("Empty Set: ", empty_set)

# Creating a set from a list
my_list = [1, 2, 3, 4, 5, 5, 6, 6, 7, 7]
my_l_set = set(my_list)
print("Set from list: ", my_l_set)

# Creating a set with elements
my_set = {1, 2, 3, 4, 5,5}
print("Set with elements: ", my_set)

# Creating a set with different data types
fr_set = frozenset({1,3,4})

my_set = {1, "Hello", (1, 2, 3),fr_set}
print("Set with different data types: ", my_set)


Creating Sets:
Empty Set:  set()
Set from list:  {1, 2, 3, 4, 5, 6, 7}
Set with elements:  {1, 2, 3, 4, 5}
Set with different data types:  {1, (1, 2, 3), frozenset({1, 3, 4}), 'Hello'}


In [2]:
# Accessing Elements
print("\nAccessing Elements:")

# Check if an element exists in a set
print("Is 1 in my_set: ", 1 in my_set)

# Loop through the elements of a set
for element in my_set:
    print(element)


Accessing Elements:
Is 1 in my_set:  True
1
(1, 2, 3)
frozenset({1, 3, 4})
Hello
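
Because a set is unordered, the loop above may print the elements in a different order on another run, and a set cannot be indexed. A minimal sketch, using a hypothetical `small_set`, of what happens when indexing is attempted and how `sorted()` gives a predictable order:

```python
small_set = {3, 1, 2}

# Sets do not support indexing because they have no defined order.
try:
    small_set[0]
except TypeError as exc:
    index_error = str(exc)
    print("TypeError:", index_error)

# sorted() returns a new list in ascending order.
ordered = sorted(small_set)
print(ordered)  # [1, 2, 3]
```

Note that `sorted()` only works when the elements can be compared with each other, so it would fail on a set that mixes integers and strings.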


In [3]:
# Set Operations
print("\nSet Operations:")
# Union
set1 = {1, 2, 3}
set2 = {4, 5, 6}
print("Union: ", set1.union(set2)) #{1,2,3,4,5,6}

# Intersection
set1 = {1, 2, 3}
set2 = {2, 3, 4}
print("Intersection: ", set1.intersection(set2)) #{2,3}

# Difference
set1 = {1, 2, 3}
set2 = {2, 3, 4}
print("Difference: ", set1.difference(set2)) #{1}

# Symmetric Difference
set1 = {1, 2, 3}
set2 = {2, 3, 4}
print("Symmetric Difference: ", set1.symmetric_difference(set2)) #{1,4}


Set Operations:
Union:  {1, 2, 3, 4, 5, 6}
Intersection:  {2, 3}
Difference:  {1}
Symmetric Difference:  {1, 4}


In [4]:
my_set ={1,"hello",(1,1,2)}

# Modifying Sets
print("\nModifying Sets:")
# Add elements to a set
my_set.add(6)
print("After adding 6: ", my_set) #{1,"hello",(1,1,2),6}

# Remove elements from a set
my_set.remove(1)
print("After removing 1: ", my_set) #{"hello",(1,1,2),6}

# Clear all elements from a set
my_set.clear()
print("After clearing: ", my_set)


Modifying Sets:
After adding 6:  {1, 6, (1, 1, 2), 'hello'}
After removing 1:  {6, (1, 1, 2), 'hello'}
After clearing:  set()
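
One detail worth knowing about `remove()`: it raises `KeyError` if the element is not in the set, whereas `discard()` does nothing in that case. The sketch below, with a hypothetical `colors` set, shows both, along with `update()` for adding several elements at once.

```python
colors = {"red", "green"}

# remove() raises KeyError when the element is missing.
try:
    colors.remove("blue")
except KeyError:
    print("remove() raised KeyError")

# discard() silently does nothing when the element is missing.
colors.discard("blue")
print(sorted(colors))  # ['green', 'red']

# update() adds every element of an iterable; duplicates are ignored.
colors.update(["blue", "red"])
print(sorted(colors))  # ['blue', 'green', 'red']
```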


In [5]:
set1 = {1, 2, 3}
set2 = {2, 3, 4}

# len() built-in function
print("Length: ", len(set1))

# copy() method
set3 = set1.copy()
print("Copy s3: ", set3)

print("Set 1 again:", set1)
print("Reprint set3:", set3)

# difference_update() method
set1.difference_update(set2)
print("difference_update: ", set1)
print(set1)
print(set2)

# isdisjoint() method
set1 = {1, 2, 3}
set2 = {4, 5, 6}
print("two set disjoint:", set1.isdisjoint(set2))

Length:  3
Copy s3:  {1, 2, 3}
Set 1 again: {1, 2, 3}
Reprint set3: {1, 2, 3}
difference_update:  {1}
{1}
{2, 3, 4}
two set disjoint: True


## Bitwise operators on sets

Sets overload several operators, most of which are the bitwise operators used on integers:

- `|` performs the union operation,

- `&` performs the intersection operation,

- `-` (the subtraction operator) performs the difference operation,

- and `^` (XOR) performs the symmetric difference operation.
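
The operators are not exact synonyms for the methods shown earlier. As a rough illustration: the methods accept any iterable, the operators require both operands to be sets, and the augmented forms (`|=`, `&=`, and so on) modify the left-hand set in place.

```python
set1 = {1, 2, 3}

# Methods accept any iterable, such as a list.
print(sorted(set1.union([3, 4])))  # [1, 2, 3, 4]

# Operators require both operands to be sets.
try:
    set1 | [3, 4]
except TypeError:
    print("| does not accept a list")

# Augmented operators update the left-hand set in place.
set1 |= {4}
set1 &= {2, 3, 4}
print(sorted(set1))  # [2, 3, 4]
```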

In [6]:
# Creating Sets
set1 = {1, 2, 3}
set2 = {2, 3, 4}

# Union
union = set1 | set2
print("Union: ", union)

# Intersection
intersection = set1 & set2
print("Intersection: ", intersection)

# Difference
difference = set1 - set2
print("Difference: ", difference)

# Symmetric Difference
symmetric_difference = set1 ^ set2
print("Symmetric Difference: ", symmetric_difference)


Union:  {1, 2, 3, 4}
Intersection:  {2, 3}
Difference:  {1}
Symmetric Difference:  {1, 4}


__Practice Problems__

1. Given a list of numbers, create a new set that contains only the unique numbers from the list.

2. Given two sets, A and B, check if A is a subset of B.

3. Given a list of words, create a new set that contains only the words that have a length greater than 3 characters.

4. Given a set of integers, remove all even numbers from the set.

In [7]:
numbers = [1, 2, 3, 4, 5, 3, 2, 1, 7, 8, 9, 8, 7]

# Your code here:
unique_numbers = set(numbers)

print(unique_numbers) # {1, 2, 3, 4, 5, 7, 8, 9}


{1, 2, 3, 4, 5, 7, 8, 9}


In [8]:
A = {1, 2, 3}
B = {1, 2, 3, 4, 5}

# Your code here:
is_subset = A.issubset(B)

print(is_subset) # True


True
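
The comparison operators offer an alternative to `issubset()`. A short sketch of how they behave on the same two sets:

```python
A = {1, 2, 3}
B = {1, 2, 3, 4, 5}

print(A <= B)  # True: A is a subset of B
print(A < B)   # True: A is a proper subset (A != B)
print(A <= A)  # True: every set is a subset of itself
print(A < A)   # False: a set is never a proper subset of itself
print(B >= A)  # True: B is a superset of A
```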


In [9]:
list_word = ['the', 'cat', 'in', 'the', 'hat','elephant']

# Your code here:
long_words = {word for word in list_word if len(word) > 3}

print(long_words) # {'elephant'}

{'elephant'}


In [10]:
numbers = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10}

# Your code here:
odd_numbers = {x for x in numbers if x % 2 != 0}

print(odd_numbers) # {1, 3, 5, 7, 9}


{1, 3, 5, 7, 9}
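
The comprehension above builds a new set and leaves `numbers` unchanged. If the goal is to modify the original set in place, two possible approaches are sketched below. Removing elements while looping directly over the same set typically raises `RuntimeError`, which is why the loop iterates over a copy. The name `more_numbers` is illustrative only.

```python
numbers = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10}

# Iterate over a copy so the original can be modified safely.
for x in numbers.copy():
    if x % 2 == 0:
        numbers.remove(x)
print(sorted(numbers))  # [1, 3, 5, 7, 9]

# Alternative: collect the even numbers first, then remove them in one call.
more_numbers = set(range(1, 11))
more_numbers.difference_update({x for x in more_numbers if x % 2 == 0})
print(sorted(more_numbers))  # [1, 3, 5, 7, 9]
```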


# Code profiling 

Code profiling is the process of measuring the performance of a program and analyzing its behavior. This is done to identify performance bottlenecks, find areas that need optimization, and improve the overall efficiency of the code.

There are several ways to profile code in Python:

- The __timeit__ module: The __timeit__ module is a built-in Python library that allows you to measure the execution time of small bits of Python code. You can use the __timeit__ module to measure the time it takes to run a specific piece of code, and then compare the results to see if the code is running as efficiently as it should be.

- The __cProfile__ module: The __cProfile__ module is another built-in Python library that provides a more in-depth view of the performance of a program. It tracks the time spent in each function and provides information about the number of calls to each function and the amount of time spent in each call. This information can help you identify areas where you need to make performance improvements.

- Third-party profilers: Several third-party tools are also available, such as Pyflame and PyCallGraph, as well as the profiler integrated into the PyCharm IDE. These provide more detailed performance analysis and visualization.

Here's a simple example of how to use the timeit module to profile a piece of code:

```Python
import timeit

def my_function():
    # your code here
    pass

# Run my_function 1000 times and print the total time in seconds
print(timeit.timeit(my_function, number=1000))
```

In [11]:
import timeit

print(timeit.timeit("[i**2 for i in range(10000)]", number=1000))

1.7915010000000002
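
As a further illustration that ties back to sets, the sketch below uses `timeit` to compare membership tests on a list and on a set. It passes a callable instead of a string, and uses `timeit.repeat()` to take the best of several runs. The names `data_list`, `data_set`, and `target` are made up for this example, and the timings will differ from machine to machine.

```python
import timeit

data_list = list(range(10_000))
data_set = set(data_list)
target = 9_999  # worst case for the list: the last element

# timeit accepts a callable as well as a string of code.
list_time = timeit.timeit(lambda: target in data_list, number=1_000)
set_time = timeit.timeit(lambda: target in data_set, number=1_000)
print(f"list membership: {list_time:.6f} s")
print(f"set membership:  {set_time:.6f} s")

# repeat() runs the measurement several times; the minimum is usually
# the least noisy estimate.
best = min(timeit.repeat(lambda: target in data_set, number=1_000, repeat=5))
print(f"best of 5 (set): {best:.6f} s")
```

A list is searched element by element, while a set uses hashing, so the set lookup is normally much faster for large collections.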


In [12]:
import time

start_time = time.time()
for _ in range(1000):
    x = sum([i**2 for i in range(10000)])
end_time = time.time()
# Use a separate name so the time module is not overwritten
elapsed = end_time - start_time
print(elapsed)

1.8433310985565186


## cProfile 

- cProfile is a profiler included in the Python standard library. It is used to measure the performance of Python programs, specifically the execution time of individual functions and the number of function calls. 

- The cProfile module provides a simple way to measure the performance of your Python code, allowing you to identify which parts of your code are taking the most time to execute and potentially optimize those parts for better performance.

Here's an example of how to use the cProfile module to profile a Python script:

```Python
import cProfile
import my_script

cProfile.run("my_script.main()")
```
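
When no separate script is at hand, a function can be profiled directly. The following is a minimal sketch, assuming a made-up workload `slow_squares`; it uses `cProfile.Profile` together with the standard `pstats` module to sort the report by cumulative time and show only the top entries.

```python
import cProfile
import io
import pstats


def slow_squares(n):
    # Hypothetical workload used only for this illustration.
    return sum([i ** 2 for i in range(n)])


profiler = cProfile.Profile()
profiler.enable()
slow_squares(100_000)
profiler.disable()

# Collect the report in a string, sorted by cumulative time,
# and keep only the five most expensive entries.
buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer)
stats.sort_stats("cumulative").print_stats(5)
report = buffer.getvalue()
print(report)
```

Sorting by cumulative time puts the functions that account for the most total running time at the top, which is usually where optimization effort pays off first.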

In [14]:
import cProfile
import my_new_script
cProfile.run("my_new_script.main()")

332833497.5800276
         35 function calls in 0.011 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    0.010    0.010 <string>:1(<module>)
        1    0.000    0.000    0.000    0.000 iostream.py:202(schedule)
        2    0.000    0.000    0.000    0.000 iostream.py:429(_is_master_process)
        2    0.000    0.000    0.000    0.000 iostream.py:448(_schedule_flush)
        2    0.000    0.000    0.000    0.000 iostream.py:518(write)
        1    0.000    0.000    0.000    0.000 iostream.py:90(_event_pipe)
        1    0.000    0.000    0.000    0.000 my_new_script.py:1(maths)
        1    0.000    0.000    0.000    0.000 my_new_script.py:2(<listcomp>)
        1    0.000    0.000    0.010    0.010 my_new_script.py:4(trig)
        1    0.010    0.010    0.010    0.010 my_new_script.py:6(<listcomp>)
        1    0.000    0.000    0.010    0.010 my_new_script.py:8(main)
        1    0.000    0.

# Practice Problems

These problems go beyond this lecture. They are for you to review what you have learned so far. Have fun!

__Problem 1:__ Given a string, write a function that generates a dictionary whose keys are the characters of the string and whose values are the frequencies of those characters in the string.

__Hint:__

```Python
    def string_to_dict(mystr):
       #code here
       return  mydict
```
Expectation:
```Python
    mystr = "How are you doing?"
    string_to_dict(mystr) #{'d': 1, 'a': 1, 'y': 1, 'i': 1, 'n': 1, 'e': 1, ' ': 3, 'r': 1, 'H': 1, '?': 1, 'o': 3, 'u': 1, 'w': 1, 'g': 1}
```
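
One possible approach, offered as a sketch rather than the intended solution, counts the characters with a plain dictionary and checks the result against `collections.Counter` from the standard library:

```python
from collections import Counter


def string_to_dict(mystr):
    # Count each character with a plain dictionary.
    mydict = {}
    for ch in mystr:
        mydict[ch] = mydict.get(ch, 0) + 1
    return mydict


mystr = "How are you doing?"
result = string_to_dict(mystr)
print(result)
# Counter from the standard library gives the same counts.
print(result == dict(Counter(mystr)))  # True
```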

__Problem 2:__ Given a string, extract all the email addresses and phone numbers that it contains.

NOTE:
- An email address needs a domain, such as gmail.com or uidaho.edu. Examples: bob@gmail.com, donal@uidaho.edu, etc.

- A phone number has 10 digits, which can be separated into up to $3$ groups of $3$, $3$, and $4$ digits, respectively. There may be a hyphen between the groups. The first group may be enclosed in parentheses. For example, the following formats are acceptable: (1) xxx-xxx-xxxx, (2) (xxx)-xxx-xxxx, (3) (xxx)xxxxxxx, etc.

__Hint:__

```Python
    def phone_email(mystr):
        #code here
        return phone_list, email_list
```
__Expectation:__

```Python
mystr = """Nick sent me his email yesterday, here it is nick@gmail.com. I ask Tom and Cadie their emails too. However, they didn't give me their email addresses. They gave me their phone numbers instead; (208)-882-3333 and 2888880001"""

phone_list, email_list = phone_email(mystr)
print(phone_list) #["(208)-882-3333", "2888880001"]
print(email_list) #["nick@gmail.com"]
```
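
A possible starting point, sketched with the standard `re` module, is shown below. The two patterns are assumptions made for illustration: they cover the formats listed in the note, but real-world email addresses and phone numbers are more varied. The `sample` string is also made up for this example.

```python
import re

# Assumed patterns for illustration only.
# Phone: optional parentheses around the first group, optional hyphens,
# and no extra digit directly before or after the number.
PHONE_RE = re.compile(r"(?<!\d)\(?\d{3}\)?-?\d{3}-?\d{4}(?!\d)")
# Email: a local part, "@", and a domain with at least one dot.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")


def phone_email(mystr):
    phone_list = PHONE_RE.findall(mystr)
    email_list = EMAIL_RE.findall(mystr)
    return phone_list, email_list


sample = "Mail nick@gmail.com or call (208)-882-3333 or 2888880001."
phone_list, email_list = phone_email(sample)
print(phone_list)  # ['(208)-882-3333', '2888880001']
print(email_list)  # ['nick@gmail.com']
```

The phone pattern treats each parenthesis as optional on its own, so it would also accept an unbalanced form such as `(208-882-3333`; tightening that is left as part of the exercise.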