### Import libraries
The code below imports the libraries needed for web scraping: requests for making HTTP requests and BeautifulSoup for parsing HTML.

In [2]:
import requests
from bs4 import BeautifulSoup

### URL and Request
This cell defines the URL of the website to scrape and uses the requests.get() function to fetch the webpage content.

In [3]:
URL = "https://realpython.github.io/fake-jobs/"
page = requests.get(URL)
#print("\n***Page***")
#print(page.text)
#print("***END***")
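The request above assumes the server answers successfully. As a minimal sketch, the fetch can be wrapped in a small helper that sets a timeout and raises an exception on HTTP error statuses. The helper name `fetch_page` and the 10-second default are illustrative assumptions, not part of the original notebook.

```python
import requests


def fetch_page(url: str, timeout: float = 10.0) -> requests.Response:
    """Fetch a page; raise requests.HTTPError on a 4xx/5xx response.

    `fetch_page` and its default timeout are hypothetical, for illustration only.
    """
    # Without a timeout, requests.get() can wait indefinitely on a stalled server.
    response = requests.get(url, timeout=timeout)
    # raise_for_status() does nothing on success and raises on error statuses.
    response.raise_for_status()
    return response


# Example usage (performs a real network request):
# page = fetch_page("https://realpython.github.io/fake-jobs/")
# print(page.status_code)
```

With this helper, a failed request stops the notebook at the fetch step instead of producing an error page that is then parsed as if it were job listings.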

### Soup Creation
The HTML content of the webpage is parsed using BeautifulSoup, which makes it easier to navigate and extract data from the HTML.

In [4]:
soup = BeautifulSoup(page.content, "html.parser")
#print ("\n***Soup***")
#print(soup)
#print("***END***")
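To see what parsing produces without fetching anything, the sketch below parses a small inline HTML string. The markup is a made-up stand-in for the real page. It also shows that BeautifulSoup accepts both text (like `page.text`) and bytes (like `page.content`).

```python
from bs4 import BeautifulSoup

# A tiny, made-up document standing in for the downloaded page.
html = """
<html>
  <head><title>Fake Jobs</title></head>
  <body>
    <div id="ResultsContainer">
      <h2 class="title is-5">Senior Python Developer</h2>
    </div>
  </body>
</html>
"""

# Parse from a str, as page.text would provide.
soup = BeautifulSoup(html, "html.parser")
page_title = soup.title.string
first_heading = soup.find("h2").text

# Parse from bytes, as page.content would provide.
soup_from_bytes = BeautifulSoup(html.encode("utf-8"), "html.parser")

print(page_title)
print(first_heading)
print(soup_from_bytes.title.string)
```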

### Find Results Container
This step locates the container element on the webpage that holds all the job listings.

#### What is a container element?
The container element is the element that wraps the desired content. It is typically marked with a unique identifier, such as an 'id' attribute assigned by the webpage developer.
To identify it, open the browser's developer tools, inspect the webpage, and hover over elements while checking the highlighted HTML code until you find the element that encloses the target content.

In [5]:
results = soup.find(id="ResultsContainer")
#print ("\n***results***")
#print(results.prettify())
#print("***END***")
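If the id is misspelled or the page layout changes, `find()` returns `None` rather than raising an error, and the failure only surfaces later. The sketch below shows this on a made-up snippet; the id `NoSuchContainer` is hypothetical.

```python
from bs4 import BeautifulSoup

# Made-up markup with a single container element.
html = '<div id="ResultsContainer"><p>job listings go here</p></div>'
soup = BeautifulSoup(html, "html.parser")

results = soup.find(id="ResultsContainer")
missing = soup.find(id="NoSuchContainer")  # hypothetical id that is not on the page

# Check for None before using the result, since find() does not raise on a miss.
if missing is None:
    print("Container not found; check the id in the browser developer tools.")

print(results.name)
```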

### Extract Job Elements
This step finds all the job elements within the results container and iterates over each one to extract specific information: title, company, and location.

#### Difference between find_all and find
1. The find_all() method searches for all occurrences of a specified tag or combination of tags within a given context and returns them as a list (empty if nothing matches).
2. The find() method searches for the first occurrence of a specified tag or combination of tags within a given context and returns that single element (or None if nothing matches).
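The difference between the two methods can be sketched on a small, made-up list of jobs:

```python
from bs4 import BeautifulSoup

# Made-up markup with two matching elements.
html = """
<ul>
  <li class="job">Energy engineer</li>
  <li class="job">Legal executive</li>
</ul>
"""
soup = BeautifulSoup(html, "html.parser")

first_item = soup.find("li")      # a single element: the first match
all_items = soup.find_all("li")   # a list-like collection of every match

no_table = soup.find("table")       # None when nothing matches
no_tables = soup.find_all("table")  # empty collection when nothing matches

print(first_item.text)
print([item.text for item in all_items])
print(no_table, len(no_tables))
```

Because `find()` can return `None`, code that immediately accesses `.text` on its result fails when the element is absent, while a loop over `find_all()` simply runs zero times.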

#### Why is the full class name not needed in find()?
An HTML element can carry several classes at once. When you pass a single class name to the class_ argument of find(), BeautifulSoup matches any element whose class attribute includes that name. Therefore, in the code snippet:
- title_element = job_element.find("h2", class_="title")
- company_element = job_element.find("h3", class_="company")

The class specified ("title" for \<h2> and "company" for \<h3>) is just one of the classes that the element may have. For example, the \<h2> on this page has class="title is-5" and the \<h3> has class="subtitle is-6 company". BeautifulSoup matches elements that have the specified class among other classes, so the full class string does not need to be given.
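The class matching behaviour can be checked on a single heading copied from the output. This is a sketch of how `class_` and CSS selectors treat multi-class elements, not an exhaustive description of the matching rules.

```python
from bs4 import BeautifulSoup

html = '<h2 class="title is-5">Senior Python Developer</h2>'
soup = BeautifulSoup(html, "html.parser")

# Any one of the element's classes is enough for a match.
by_title = soup.find("h2", class_="title")
by_size = soup.find("h2", class_="is-5")

# The exact class string, in the same order as in the HTML, also matches.
by_exact_string = soup.find("h2", class_="title is-5")

# The same classes in a different order do not match as a string.
by_reordered_string = soup.find("h2", class_="is-5 title")

# A CSS selector requires all listed classes, in any order.
by_css = soup.select("h2.is-5.title")

print(by_title is not None, by_size is not None)
print(by_exact_string is not None, by_reordered_string is None)
print(len(by_css))
```

When several classes must all be present, a CSS selector via `select()` is usually the more robust choice, because it does not depend on the order in which the classes appear in the attribute.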


In [7]:
# Print a separator marking the start of the job element output.
print("\n***Job Element***")

# Find every <div> element with the class "card-content" within the results container.
# These elements represent the individual job listings on the webpage.
job_elements = results.find_all("div", class_="card-content")

# Loop over each job element found in the previous step.
for job_element in job_elements:
    # Within each job element, find the <h2> with the class "title" (the job title),
    # the <h3> with the class "company", and the <p> with the class "location".
    title_element = job_element.find("h2", class_="title")
    company_element = job_element.find("h3", class_="company")
    location_element = job_element.find("p", class_="location")
    print(title_element)
    print(company_element)
    print(location_element)
    print()
    print(title_element.text)
    print(company_element.text)
    print(location_element.text)
    print()
    print(title_element.text.strip())
    print(company_element.text.strip())
    print(location_element.text.strip())
    print()
print("***END***")

# .text returns the raw text content of an HTML element, including any leading and trailing whitespace from the HTML source.
# .strip() cleans up that text by removing leading and trailing whitespace, making it more readable and easier to process further.


***Job Element***
<h2 class="title is-5">Senior Python Developer</h2>
<h3 class="subtitle is-6 company">Payne, Roberts and Davis</h3>
<p class="location">
        Stewartbury, AA
      </p>

Senior Python Developer
Payne, Roberts and Davis

        Stewartbury, AA
      

Senior Python Developer
Payne, Roberts and Davis
Stewartbury, AA

<h2 class="title is-5">Energy engineer</h2>
<h3 class="subtitle is-6 company">Vasquez-Davidson</h3>
<p class="location">
        Christopherville, AA
      </p>

Energy engineer
Vasquez-Davidson

        Christopherville, AA
      

Energy engineer
Vasquez-Davidson
Christopherville, AA

<h2 class="title is-5">Legal executive</h2>
<h3 class="subtitle is-6 company">Jackson, Chambers and Levy</h3>
<p class="location">
        Port Ericaburgh, AA
      </p>

Legal executive
Jackson, Chambers and Levy

        Port Ericaburgh, AA
      

Legal executive
Jackson, Chambers and Levy
Port Ericaburgh, AA

<h2 class="title is-5">Fitness centre manager</h2>
<h3 cl
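
Printing each field is useful for exploration, but later processing is easier if the cleaned values are collected into a data structure. The sketch below gathers each listing into a dictionary. It uses `get_text(strip=True)` as an alternative to `.text.strip()`, and a hypothetical helper, `clean_text`, that tolerates missing elements. The second listing deliberately lacks a location to show that case; the markup is a simplified stand-in for the real page.

```python
from bs4 import BeautifulSoup

# Simplified, made-up markup; the second card has no location on purpose.
html = """
<div id="ResultsContainer">
  <div class="card-content">
    <h2 class="title is-5">Senior Python Developer</h2>
    <h3 class="subtitle is-6 company">Payne, Roberts and Davis</h3>
    <p class="location">
        Stewartbury, AA
    </p>
  </div>
  <div class="card-content">
    <h2 class="title is-5">Energy engineer</h2>
    <h3 class="subtitle is-6 company">Vasquez-Davidson</h3>
  </div>
</div>
"""
soup = BeautifulSoup(html, "html.parser")
results = soup.find(id="ResultsContainer")


def clean_text(element):
    """Return the stripped text of an element, or None if the element is missing.

    Hypothetical helper, introduced here for illustration.
    """
    return element.get_text(strip=True) if element is not None else None


jobs = [
    {
        "title": clean_text(card.find("h2", class_="title")),
        "company": clean_text(card.find("h3", class_="company")),
        "location": clean_text(card.find("p", class_="location")),
    }
    for card in results.find_all("div", class_="card-content")
]

for job in jobs:
    print(job)
```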

In [8]:
print("\n***Python Job Element***")

# Find all <h2> elements (the job titles) within the results container whose text contains the word "python".
# The string argument accepts a callable that takes a string as input and returns True if the element should match.
# Here the lambda receives the text content of each <h2> element (the variable text), converts it to lowercase with text.lower(),
# and checks whether the substring "python" is present, which makes the match case-insensitive.
python_jobs = results.find_all("h2", string=lambda text: "python" in text.lower())
print("Number of elements: ", len(python_jobs))
python_job_elements = [
    h2_element.parent.parent.parent for h2_element in python_jobs
]
for job_element in python_job_elements:
    title_element = job_element.find("h2", class_="title")
    company_element = job_element.find("h3", class_="company")
    location_element = job_element.find("p", class_="location")
    print(title_element.text.strip())
    print(company_element.text.strip())
    print(location_element.text.strip())
    print()
print("***END***")


***Python Job Element***
Number of elements:  10
Senior Python Developer
Payne, Roberts and Davis
Stewartbury, AA

Software Engineer (Python)
Garcia PLC
Ericberg, AE

Python Programmer (Entry-Level)
Moss, Duncan and Allen
Port Sara, AE

Python Programmer (Entry-Level)
Cooper and Sons
West Victor, AE

Software Developer (Python)
Adams-Brewer
Brockburgh, AE

Python Developer
Rivera and Sons
East Michaelfort, AA

Back-End Web Developer (Python, Django)
Stewart-Alexander
South Kimberly, AA

Back-End Web Developer (Python, Django)
Jackson, Ali and Mckee
New Elizabethside, AA

Python Programmer (Entry-Level)
Mathews Inc
Robertborough, AP

Software Developer (Python)
Moreno-Rodriguez
Martinezburgh, AE

***END***
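Two parts of the cell above are somewhat fragile. The lambda calls `text.lower()` unconditionally, which can fail if BeautifulSoup passes `None` for a heading that has no single text string, and `.parent.parent.parent` depends on the exact nesting depth. The sketch below shows one possible hardening, using a named filter with a `None` check and `find_parent()` to climb to the enclosing card by class. The function name `mentions_python` and the simplified nesting are assumptions made for illustration.

```python
from bs4 import BeautifulSoup

# Simplified, assumed nesting: card-content > media > media-content > h2.
html = """
<div id="ResultsContainer">
  <div class="card-content">
    <div class="media">
      <div class="media-content">
        <h2 class="title is-5">Senior Python Developer</h2>
        <h3 class="subtitle is-6 company">Payne, Roberts and Davis</h3>
      </div>
    </div>
    <p class="location">Stewartbury, AA</p>
  </div>
  <div class="card-content">
    <div class="media">
      <div class="media-content">
        <h2 class="title is-5"><span>Energy</span> <span>engineer</span></h2>
        <h3 class="subtitle is-6 company">Vasquez-Davidson</h3>
      </div>
    </div>
    <p class="location">Christopherville, AA</p>
  </div>
</div>
"""
soup = BeautifulSoup(html, "html.parser")
results = soup.find(id="ResultsContainer")


def mentions_python(text):
    """Case-insensitive check that tolerates a missing string (hypothetical helper)."""
    return text is not None and "python" in text.lower()


python_titles = results.find_all("h2", string=mentions_python)

# find_parent() walks upward until it reaches the enclosing card,
# however many levels of nesting lie in between.
python_cards = [
    h2.find_parent("div", class_="card-content") for h2 in python_titles
]

for card in python_cards:
    print(card.find("h2", class_="title").get_text(strip=True))
    print(card.find("p", class_="location").get_text(strip=True))
```

For this markup, `find_parent("div", class_="card-content")` reaches the same element as `.parent.parent.parent`, but it keeps working if an extra wrapper element is added around the heading.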
