MULTITHREADING

When to use multithreading?

I/O-bound operations:
    Tasks that spend most of their time waiting for I/O rather than computing.
    Examples: file operations, network requests.

In [1]:
import threading 
import time
# Simulates a task that spends its time waiting (time.sleep stands in for I/O) #

def func(sec):
    print(f'Sleeping for {sec} seconds')
    time.sleep(sec)


In [2]:
# Sequential calls: each one must finish before the next starts (4 + 2 + 1 = about 7 seconds) #
func(4)
func(2)
func(1)

Sleeping for 4 seconds
Sleeping for 2 seconds
Sleeping for 1 seconds


In [4]:
# Using threads # 
t1=threading.Thread(target=func,args=[4])
t2=threading.Thread(target=func,args=[2])
t3=threading.Thread(target=func,args=[1])

# Start the threads; the main thread does not wait for them #
t1.start()
t2.start()
t3.start()

# The cell returned after 0.0023......... second because nothing waits for the threads to finish #

Sleeping for 4 seconds
Sleeping for 2 seconds
Sleeping for 1 seconds
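
To make the "returns almost immediately" behaviour visible, here is a minimal sketch that times start() and join() separately. The helper nap() and the 0.5 second duration are assumptions chosen for illustration; they stand in for func() above.

```python
import threading
import time

def nap(sec):
    # Hypothetical stand-in for func(): just waits
    time.sleep(sec)

t = threading.Thread(target=nap, args=(0.5,))

start = time.perf_counter()
t.start()
# start() only launches the thread; it does not wait for nap() to return
elapsed_after_start = time.perf_counter() - start
alive_after_start = t.is_alive()

t.join()
# join() blocks the calling thread until the worker has finished
elapsed_after_join = time.perf_counter() - start
alive_after_join = t.is_alive()

print(f'after start(): {elapsed_after_start:.4f}s, alive={alive_after_start}')
print(f'after join():  {elapsed_after_join:.4f}s, alive={alive_after_join}')
```

The first measurement should be a tiny fraction of a second while the thread is still alive; the second should be roughly the sleep duration.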


In [5]:
# Using threads # 
t1=threading.Thread(target=func,args=[4])
t2=threading.Thread(target=func,args=[2])
t3=threading.Thread(target=func,args=[1])

# Start the threads, then wait until all of them have finished #
t1.start()
t2.start()
t3.start()

t1.join()
t2.join()
t3.join()

# It took about 4 seconds: the sleeps overlap, so the longest one sets the total #

Sleeping for 4 seconds
Sleeping for 2 seconds
Sleeping for 1 seconds
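
The timings quoted in the comments can be checked directly. The sketch below compares a sequential run with a threaded run using time.perf_counter(). The helper nap() and the shortened durations (0.4, 0.2, 0.1 seconds) are assumptions so that it runs quickly; they are not the values used in the cells above.

```python
import threading
import time

def nap(sec):
    # Hypothetical stand-in for func(): just waits
    time.sleep(sec)

durations = [0.4, 0.2, 0.1]

# Sequential: total time is roughly the sum of the durations
start = time.perf_counter()
for sec in durations:
    nap(sec)
sequential = time.perf_counter() - start

# Threaded: the waits overlap, so total time is roughly the longest duration
threads = [threading.Thread(target=nap, args=(sec,)) for sec in durations]
start = time.perf_counter()
for t in threads:
    t.start()
for t in threads:
    t.join()
threaded = time.perf_counter() - start

print(f'sequential: {sequential:.2f}s')
print(f'threaded:   {threaded:.2f}s')
```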


In [None]:
# Let a thread pool create and manage the threads #
import concurrent.futures as cf

In [10]:
def funct(sec):
    print(f'Sleeping for {sec} seconds')
    time.sleep(sec)
    return 'Done sleeping.'


In [11]:
with cf.ThreadPoolExecutor() as executor:
    f1=executor.submit(funct,1)  # submit() returns a Future representing the execution of the callable #
    f2=executor.submit(funct,2)
    print(f1.result())
    print(f2.result())

Sleeping for 1 seconds
Sleeping for 2 seconds
Done sleeping.
Done sleeping.


In [None]:
with cf.ThreadPoolExecutor() as executor:
    secs=[1,2,3,4,5]
    # One Future per value, built with a list comprehension (executor.map can be used instead) #
    results = [executor.submit(funct,sec) for sec in secs]

    # as_completed yields each Future as soon as it finishes #
    for f in cf.as_completed(results):
        print(f.result())

Sleeping for 1 seconds
Sleeping for 2 seconds
Sleeping for 3 seconds
Sleeping for 4 seconds
Sleeping for 5 seconds
Done sleeping.
Done sleeping.
Done sleeping.
Done sleeping.
Done sleeping.
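
As a rough sketch of the map alternative: executor.map returns results in the order of the inputs, while as_completed yields futures in the order they finish. The helper nap() and the short durations are assumptions for illustration.

```python
import concurrent.futures as cf
import time

def nap(sec):
    # Hypothetical stand-in for funct(): waits, then reports how long it slept
    time.sleep(sec)
    return f'Slept {sec}s'

secs = [0.3, 0.1, 0.2]

# map: results come back in the same order as the inputs
with cf.ThreadPoolExecutor() as executor:
    mapped = list(executor.map(nap, secs))

# submit + as_completed: results come back as each task finishes
with cf.ThreadPoolExecutor() as executor:
    futures = [executor.submit(nap, sec) for sec in secs]
    completed = [f.result() for f in cf.as_completed(futures)]

print('map:         ', mapped)
print('as_completed:', completed)
```

Use map when the output must line up with the input; use as_completed when each result should be handled as soon as it is ready.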


## Web scraping ##
Web scraping is the automated process of extracting data from websites. 
It involves making numerous requests to fetch web pages.
These tasks are I/O bound because they spend most of their time waiting for responses from the server.

In [1]:
import threading
import requests # type: ignore
from bs4 import BeautifulSoup

In [2]:
urls=[
'https://python.langchain.com/docs/introduction/',
'https://python.langchain.com/docs/tutorials/',
'https://python.langchain.com/docs/tutorials/'
]

We will create three threads, one per entry in urls, and send all three requests concurrently.

In [3]:
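def fetch_content(url):
    # Download the page and report how much text it contains #
    # timeout keeps a stalled server from blocking the thread forever #
    response=requests.get(url, timeout=10)
    soup=BeautifulSoup(response.content,'html.parser')
    print(f'Fetched {len(soup.text)} characters from {url}')

threads=[]

# Start one thread per URL so the requests wait on the network at the same time #
for url in urls:
    thread=threading.Thread(target=fetch_content, args=(url, ))
    threads.append(thread)
    thread.start()

# Wait for every thread to finish before continuing #
for thread in threads:
    thread.join()

print('All web pages fetched. ')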

Fetched 9857 characters from https://python.langchain.com/docs/tutorials/
Fetched 9857 characters from https://python.langchain.com/docs/tutorials/
Fetched 12237 characters from https://python.langchain.com/docs/introduction/
All web pages fetched. 
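
The same fan-out can also be written with the ThreadPoolExecutor shown earlier, which makes it easy to collect return values and errors. This is only a sketch under assumptions: fetch_all(), fake_fetch() and the example.com URLs are hypothetical names invented here, and fake_fetch() replaces the real network call so the cell runs offline. Any function that takes a URL and returns a value could be passed instead.

```python
import concurrent.futures as cf
import time

def fetch_all(urls, fetch, max_workers=3):
    """Call fetch(url) for each URL in a thread pool.

    Returns a list of (url, result, error) tuples in completion order.
    """
    outcomes = []
    with cf.ThreadPoolExecutor(max_workers=max_workers) as executor:
        # Remember which URL each Future belongs to
        future_to_url = {executor.submit(fetch, url): url for url in urls}
        for future in cf.as_completed(future_to_url):
            url = future_to_url[future]
            try:
                outcomes.append((url, future.result(), None))
            except Exception as exc:
                # result() re-raises whatever the worker raised
                outcomes.append((url, None, exc))
    return outcomes

def fake_fetch(url):
    # Hypothetical stand-in for a real request: waits, then returns a number
    time.sleep(0.1)
    if 'missing' in url:
        raise ValueError(f'cannot fetch {url}')
    return len(url)

demo_urls = [
    'https://example.com/a',
    'https://example.com/missing',
    'https://example.com/a',
]

outcomes = fetch_all(demo_urls, fake_fetch)
for url, result, error in outcomes:
    print(url, result, error)
```

One failing URL does not stop the others, because each exception is caught when its own Future is read.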
