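## Threads and processes

1) Threads and processes. Threads of one process share memory. Separate processes do not.

2) Python has no trouble running multithreaded scripts. However, threads are efficient only for I/O-bound operations (reads and writes), not for CPU-bound operations (computation): in standard CPython the global interpreter lock (GIL) lets only one thread execute Python bytecode at a time.

3) Splitting work across processes in Python is more complicated because of how Python is implemented. However, libraries that work around this problem exist, such as the standard-library `multiprocessing` module used in the last cell of this section.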
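
The first point can be checked directly. The following is a minimal sketch, not part of the original notebook: `counter`, `increment`, `run_in_thread`, and `run_in_process` are hypothetical names introduced for illustration. It assumes the code is run as a script rather than in a notebook cell, because on platforms that start child processes with the spawn method the child must be able to import `increment`.

```python
import multiprocessing
import threading

# Module-level state: visible to every thread in this process.
counter = {"value": 0}


def increment():
    """Add one to the module-level counter."""
    counter["value"] += 1


def run_in_thread():
    """Run increment() in a second thread and return the parent's counter."""
    worker = threading.Thread(target=increment)
    worker.start()
    worker.join()
    return counter["value"]


def run_in_process():
    """Run increment() in a child process and return the parent's counter."""
    worker = multiprocessing.Process(target=increment)
    worker.start()
    worker.join()
    return counter["value"]


if __name__ == "__main__":
    # The thread modifies the same dictionary the main thread reads.
    print("after thread: ", run_in_thread())
    # The child process modifies only its own copy of the dictionary.
    print("after process:", run_in_process())
```

Both calls report the same value: the thread's increment is visible to the parent, while the increment made in the child process never reaches the parent's copy of `counter`.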

In [34]:
import threading
from queue import Queue
import requests
import bs4
import time

# Serialize print() calls so that output from different threads is not meant to interleave.
print_lock = threading.Lock()

def get_url(current_url):
    """Fetch current_url and print the text of its <title> element."""
    with print_lock:
        print("\nStarting thread {}".format(threading.current_thread().name))
    # A timeout keeps a stalled request from blocking url_queue.join() forever.
    res = requests.get(current_url, timeout=10)
    res.raise_for_status()

    current_page = bs4.BeautifulSoup(res.text,"html.parser")
    current_title = current_page.select('title')[0].getText()

    with print_lock:
        print("{}\n".format(threading.current_thread().name))
        print("{}\n".format(current_url))
        print("{}\n".format(current_title))
        print("\nFinished fetching : {}".format(current_url))

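def process_queue():
    # Worker loop: take URLs off the queue forever. The threads are daemons,
    # so the program can exit even though this loop never returns.
    while True: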
        current_url = url_queue.get()
        get_url(current_url)
        url_queue.task_done()

url_queue = Queue()

url_list = ["https://www.github.com"]*5

# Start five daemon worker threads that consume url_queue.
for _ in range(5):
    t = threading.Thread(target=process_queue)
    t.daemon = True
    t.start()

start = time.time()

for current_url in url_list:
    url_queue.put(current_url)

url_queue.join()

print("Execution time = {0:.5f}".format(time.time() - start))


Starting thread Thread-68
Starting thread Thread-69


Starting thread Thread-70
Starting thread Thread-71
Starting thread Thread-72


Thread-72

https://www.github.com

The world’s leading software development platform · GitHub


Finished fetching : https://www.github.com
Thread-68

https://www.github.com

The world’s leading software development platform · GitHub


Finished fetching : https://www.github.com
Thread-70

https://www.github.com
Thread-71
Thread-69
Execution time = 1.34866



The world’s leading software development platform · GitHub
https://www.github.com
https://www.github.com




Finished fetching : https://www.github.comThe world’s leading software development platform · GitHub
The world’s leading software development platform · GitHub




Finished fetching : https://www.github.com
Finished fetching : https://www.github.com
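
The timing above depends on the network and on the remote server. As a sketch of why threads help with I/O, the example below replaces the HTTP request with `time.sleep`, which releases the GIL while waiting, much as a blocking socket read does. `fake_fetch`, `run_serial`, and `run_threaded` are hypothetical names introduced for illustration, and `ThreadPoolExecutor` from the standard library is swapped in for the hand-written queue workers.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_fetch(delay):
    """Stand-in for a blocking network call: wait, then return the delay."""
    time.sleep(delay)
    return delay


def run_serial(delays):
    """Call fake_fetch one delay at a time; return (results, seconds)."""
    start = time.perf_counter()
    results = [fake_fetch(d) for d in delays]
    return results, time.perf_counter() - start


def run_threaded(delays, workers):
    """Call fake_fetch from a pool of threads; return (results, seconds)."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(fake_fetch, delays))
    return results, time.perf_counter() - start


if __name__ == "__main__":
    delays = [0.2] * 5
    _, serial_time = run_serial(delays)
    _, threaded_time = run_threaded(delays, workers=5)
    print("serial   = {0:.5f}".format(serial_time))
    print("threaded = {0:.5f}".format(threaded_time))
```

With five workers the waits overlap, so the threaded run should take roughly as long as one wait, while the serial run takes the sum of all five.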



In [35]:
import threading
from queue import Queue
import time

list_lock = threading.Lock()

def find_rand(num):
    """Sum all primes up to and including num and append the sum to sum_primes_list."""
    sum_of_primes = 0

    ix = 2

    while ix <= num:
        if is_prime(ix):
            sum_of_primes += ix
        ix += 1

    # Guard the shared list with list_lock, which is otherwise unused in this cell.
    with list_lock:
        sum_primes_list.append(sum_of_primes)

def is_prime(num):
    if num <= 1:
        return False
    elif num <= 3:
        return True
    elif num%2 == 0 or num%3 == 0:
        return False
    i = 5
    while i*i <= num:
        if num%i == 0 or num%(i+2) == 0:
            return False
        i += 6
    return True

def process_queue():
    while True:
        rand_num = min_nums.get()
        find_rand(rand_num)
        min_nums.task_done()

min_nums = Queue()

rand_list = [1000000, 2000000, 3000000]
sum_primes_list = list()

# Two worker threads share the three CPU-bound tasks.
for _ in range(2):
    t = threading.Thread(target=process_queue)
    t.daemon = True
    t.start()

start = time.time()

for rand_num in rand_list:
    min_nums.put(rand_num)

min_nums.join()

end_time = time.time()

sum_primes_list.sort()
print(sum_primes_list)

print("Execution time = {0:.5f}".format(end_time - start))

[37550402023, 142913828922, 312471072265]
Execution time = 22.88750


In [36]:
import threading
from queue import Queue
import time

list_lock = threading.Lock()

def find_rand(num):
    """Sum all primes up to and including num and append the sum to sum_primes_list."""
    sum_of_primes = 0

    ix = 2

    while ix <= num:
        if is_prime(ix):
            sum_of_primes += ix
        ix += 1

    # Guard the shared list with list_lock, which is otherwise unused in this cell.
    with list_lock:
        sum_primes_list.append(sum_of_primes)

def is_prime(num):
    if num <= 1:
        return False
    elif num <= 3:
        return True
    elif num%2 == 0 or num%3 == 0:
        return False
    i = 5
    while i*i <= num:
        if num%i == 0 or num%(i+2) == 0:
            return False
        i += 6
    return True

def process_queue():
    while True:
        rand_num = min_nums.get()
        find_rand(rand_num)
        min_nums.task_done()

min_nums = Queue()

rand_list = [1000000, 2000000, 3000000]
sum_primes_list = list()

# One worker thread: the baseline for the two-thread run in the previous cell.
for _ in range(1):
    t = threading.Thread(target=process_queue)
    t.daemon = True
    t.start()

start = time.time()

for rand_num in rand_list:
    min_nums.put(rand_num)

min_nums.join()

end_time = time.time()

sum_primes_list.sort()
print(sum_primes_list)

print("Execution time = {0:.5f}".format(end_time - start))

[37550402023, 142913828922, 312471072265]
Execution time = 22.87595

With one thread the run takes 22.87595 s, practically the same as the 22.88750 s measured with two threads: for this CPU-bound task the second thread brings no speedup.


In [38]:
from multiprocessing import Pool
import time

def sum_prime(num):
    """Return the sum of all primes up to and including num."""
    sum_of_primes = 0

    ix = 2

    while ix <= num:
        if is_prime(ix):
            sum_of_primes += ix
        ix += 1

    return sum_of_primes

def is_prime(num):
    if num <= 1:
        return False
    elif num <= 3:
        return True
    elif num%2 == 0 or num%3 == 0:
        return False
    i = 5
    while i*i <= num:
        if num%i == 0 or num%(i+2) == 0:
            return False
        i += 6
    return True

if __name__ == '__main__':
    start = time.time()
    # Two worker processes, each with its own interpreter and its own GIL.
    with Pool(2) as p:
        print(p.map(sum_prime, [1000000, 2000000, 3000000]))
    print("Time taken = {0:.5f}".format(time.time() - start))

[37550402023, 142913828922, 312471072265]
Time taken = 15.90869

With two processes the same work finishes in 15.90869 s, about 30% less than either threaded run.


## Homework

Write a script that uses both multithreading and multiprocessing for a task of your own and compare the results.

Come back with questions :).
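
One possible starting point for the comparison is sketched below. It is only an outline under stated assumptions: `count_primes` and `timed_map` are hypothetical helpers, the workload is a toy CPU-bound task, and the standard-library `concurrent.futures` executors stand in for the `threading` and `multiprocessing` code used in the cells above. Run it as a script so that the worker processes can import `count_primes`.

```python
import time
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def count_primes(limit):
    """CPU-bound toy task: count primes below limit by trial division."""
    count = 0
    for n in range(2, limit):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count


def timed_map(executor_cls, func, items, workers=2):
    """Run func over items with the given executor class; return (results, seconds)."""
    start = time.perf_counter()
    with executor_cls(max_workers=workers) as pool:
        results = list(pool.map(func, items))
    return results, time.perf_counter() - start


if __name__ == "__main__":
    limits = [20000, 30000, 40000]
    # Same function, same inputs, same worker count; only the executor differs.
    for executor_cls in (ThreadPoolExecutor, ProcessPoolExecutor):
        results, seconds = timed_map(executor_cls, count_primes, limits)
        print("{}: {} in {:.5f} s".format(executor_cls.__name__, results, seconds))
```

Because both executors expose the same `map` interface, swapping `count_primes` for an I/O-bound function of your own is enough to repeat the comparison for the other kind of workload.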

## Further reading
* http://www.dabeaz.com/python/UnderstandingGIL.pdf
* https://www.ploggingdev.com/2017/01/multiprocessing-and-multithreading-in-python-3/