
### **Let's Start Searching Elements with BeautifulSoup**  
We import **BeautifulSoup** only **once**, at the beginning. The import stays available for the rest of the file.  

This tutorial has **three sections**:  
1. **[Searching for Elements](#-searching-for-elements)**  
2. **[Navigating Through the DOM](#-navigating-through-the-dom)**  
3. **[Extracting Attributes](#-extracting-attributes)**  

Each section includes **a small, self-contained HTML snippet** (not necessarily following full HTML standards) to keep explanations simple and focused.  

### **Structure of Each Section**  
Each section follows this format:  
- **HTML snippet**  
- **Convert to BeautifulSoup object (`soup`)**  
- **Perform operations**  
    - **Explanation** (what the code does)  
    - **Code implementation**  


In [215]:
from bs4 import BeautifulSoup

#### ✅ **Searching for Elements**

In [216]:
html = """
     <html><head><title>BeautifulSoup Example Page</title></head>
     <body>
    <h1>Welcome to My Web Scraping Example Page</h1>
    
    <!-- Paragraphs -->
    <p id="intro">This is an example webpage containing different HTML elements.</p>
    <p class="description">BeautifulSoup can extract text, attributes, and structured data.</p>
    
    </body>
    </html>
 """

In [217]:

soup = BeautifulSoup(html,'html.parser')

### Use of .find('tagname')

Now we will use `soup.find("tagname")`, or `target_area.find("tagname")` to search inside a specific element.
What will you get?
Whether the tag appears once or many times, `find()` returns only the first element with that tag.

In [218]:

title = soup.find('title')
print('title tag: ', title) # <title>BeautifulSoup Example Page</title>

title tag:  <title>BeautifulSoup Example Page</title>
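
The `target_area.find("tagname")` form mentioned above can be sketched as follows. This is a minimal illustration on a shortened copy of the page; the variable names `body`, `first_p`, and `missing` are assumptions made for this example. It also shows that `find()` returns `None` when nothing matches.

```python
from bs4 import BeautifulSoup

html = """
<html><head><title>BeautifulSoup Example Page</title></head>
<body>
<h1>Welcome</h1>
<p id="intro">First paragraph.</p>
</body></html>
"""
soup = BeautifulSoup(html, "html.parser")

# Narrow the search: call find() on a tag instead of on the whole soup.
body = soup.find("body")
first_p = body.find("p")  # first <p> inside <body>
print(first_p)

# find() returns None when no element matches, so check before using the result.
missing = soup.find("table")
print(missing)
```

Checking for `None` before calling methods on the result avoids an `AttributeError` on pages where the tag is absent.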


### Use of .find('tagname',attribute)
If you use `.find('tagname')`, you will get the first element only.
What if you need a specific one? Then you can pass its attributes (if any):
`soup.find('tagname', attribute="value")`

In [219]:


intro = soup.find('p', id="intro")
# find() looks for the first <p> tag whose id attribute has the value "intro"
print('"p" tag with id "intro":', intro)


"p" tag with id "intro": <p id="intro">This is an example webpage containing different HTML elements.</p>

### soup.find_all("tagname") or soup.find_all("tagname",attribute)
Use `find_all()` to get every element that has the given tag, optionally filtered by attributes. It returns a list, even when only one element matches.

In [220]:
all_p = soup.find_all("p")
print(all_p)

[<p id="intro">This is an example webpage containing different HTML elements.</p>, <p class="description">BeautifulSoup can extract text, attributes, and structured data.</p>]
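
The heading also mentions the attribute form, `find_all("tagname", attribute)`. Here is a small sketch of it, using a hypothetical snippet with two `description` paragraphs so the filter has something to narrow down. Note the trailing underscore in `class_`, needed because `class` is a reserved word in Python.

```python
from bs4 import BeautifulSoup

# Hypothetical snippet for illustration: two paragraphs share a class.
html = """
<p id="intro">Intro text.</p>
<p class="description">First description.</p>
<p class="description">Second description.</p>
"""
soup = BeautifulSoup(html, "html.parser")

# Only <p> tags whose class is "description" are returned.
descriptions = soup.find_all("p", class_="description")
for p in descriptions:
    print(p.get_text())

# When nothing matches, find_all() returns an empty list rather than None.
no_match = soup.find_all("table")
print(len(no_match))
```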


### soup.select('tagname')
`select()` takes a CSS selector and, like `find_all()`, returns a list of every matching element.

In [221]:
headings = soup.select("h1")
print(headings)

[<h1>Welcome to My Web Scraping Example Page</h1>]


Sometimes you want one specific element rather than every tag of a kind.
Because `select()` takes CSS selectors, we can target a class with `.` and an id with `#`. Note that the result is still a list, even for a single match.

In [222]:
intro = soup.select('p#intro')
description = soup.select('p.description')
print(intro)
print(description)

[<p id="intro">This is an example webpage containing different HTML elements.</p>]
[<p class="description">BeautifulSoup can extract text, attributes, and structured data.</p>]
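
If the one-element lists above are inconvenient, `select_one()` is worth a look: it takes the same CSS selectors but returns the first match directly, or `None` if there is none. A minimal sketch on a shortened snippet (the `p.missing` selector is made up to show the no-match case):

```python
from bs4 import BeautifulSoup

html = """
<p id="intro">Intro text.</p>
<p class="description">Description text.</p>
"""
soup = BeautifulSoup(html, "html.parser")

# select_one() returns a single Tag instead of a list.
intro = soup.select_one("p#intro")
description = soup.select_one("p.description")
print(intro)
print(description)

# No match gives None, not an empty list.
missing = soup.select_one("p.missing")
print(missing)
```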


#### ✅ **Navigating Through the DOM**

To navigate through the DOM we have the attributes `.children` and `.parent` and the methods `.find_next_sibling()` and `.find_previous_sibling()`.
- `.children` iterates over the nodes nested directly inside an element.
- `.parent` returns the immediately enclosing element.
- `.find_next_sibling()` returns the next element at the same level.
- `.find_previous_sibling()` works the same way in the opposite direction.


In [223]:

html = """ 

     <html>
     <body>
    <ul id="unordered-list">
    <li class="list-item">Item 1</li>
    <li class="list-item">Item 2</li>
    <li class="list-item">Item 3</li>
</ul>
    </body>
    </html>

"""




From this snippet we will pick the second `<li>` element. It has both a previous and a next sibling, which makes those examples easier to show.

In [224]:
soup = BeautifulSoup(html, 'html.parser')
list_item = soup.find_all("li", class_="list-item")[1]
print(list_item)

<li class="list-item">Item 2</li>


Let's get the parent element.
Here the `ul` tag is the parent of `list_item`.
To get it, we use `list_item.parent`,
which returns the whole `ul` tag.

In [225]:
parent = list_item.parent
print("Parent:",parent)  # will print the <ul> tag

Parent: <ul id="unordered-list">
<li class="list-item">Item 1</li>
<li class="list-item">Item 2</li>
<li class="list-item">Item 3</li>
</ul>


Now we know how to get the parent. But what about the children?
To get them, use `parent.children`. It returns an iterator,
which we can loop over or convert to a list. Note that the whitespace between tags shows up as `'\n'` text nodes.

In [226]:
ul_li = list_item.parent
list(ul_li.children)

['\n',
 <li class="list-item">Item 1</li>,
 '\n',
 <li class="list-item">Item 2</li>,
 '\n',
 <li class="list-item">Item 3</li>,
 '\n']
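
The `'\n'` entries in the output above are text nodes for the whitespace between tags. One way to keep only the element children is to filter on `Tag`; the sketch below shows that, plus `find_all(..., recursive=False)` as an alternative that looks only at direct children. The names `items`, `texts`, and `direct_items` are assumptions made for this example.

```python
from bs4 import BeautifulSoup
from bs4.element import Tag

html = """
<ul id="unordered-list">
    <li class="list-item">Item 1</li>
    <li class="list-item">Item 2</li>
    <li class="list-item">Item 3</li>
</ul>
"""
soup = BeautifulSoup(html, "html.parser")
ul = soup.find("ul")

# Keep only Tag children, dropping the whitespace text nodes.
items = [child for child in ul.children if isinstance(child, Tag)]
texts = [li.get_text() for li in items]
print(texts)

# Alternative: search direct children only, without descending further.
direct_items = ul.find_all("li", recursive=False)
print(len(direct_items))
```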

To get the previous sibling of the current element, use `current_child.find_previous_sibling()`. It returns the previous element.
You may also come across the `.previous_sibling` attribute. Unlike the method, it returns the previous node of any kind, so it can give you a `'\n'` or other whitespace text node instead of a tag.

In [227]:
prev_sibling = list_item.find_previous_sibling()
print("Prev Sibling:", prev_sibling)  

Prev Sibling: <li class="list-item">Item 1</li>
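
To make the difference between the `.previous_sibling` attribute and the `find_previous_sibling()` method concrete, here is a small side-by-side sketch on a simplified list (classes dropped for brevity; the variable names are assumptions for this example):

```python
from bs4 import BeautifulSoup

html = """<ul>
<li>Item 1</li>
<li>Item 2</li>
<li>Item 3</li>
</ul>"""
soup = BeautifulSoup(html, "html.parser")
second = soup.find_all("li")[1]

# The attribute returns the adjacent node, here the newline between the tags.
raw_prev = second.previous_sibling
print(repr(raw_prev))

# The method skips text nodes and returns the previous element.
tag_prev = second.find_previous_sibling()
print(tag_prev)

# The first <li> has no previous element, so the method returns None.
first = soup.find("li")
print(first.find_previous_sibling())
```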


To get the next sibling, use `current.find_next_sibling()`. It works the same way as `.find_previous_sibling()`, but in the opposite direction.

In [228]:
print("Next Sibling:", list_item.find_next_sibling())  # prints the next <li>; whitespace text nodes are skipped

Next Sibling: <li class="list-item">Item 3</li>


#### ✅ **Extracting Attributes**
- **`.get("attribute_name")`** → Extracts a specific attribute from an element.  
- **`.text` or `.get_text()`** → Extracts text content inside a tag.  

In [229]:
html = """ 
<html>
<body>
<a id="sample-link" href="https://example.com" >Visit Example</a>
</body>
</html>
"""

In [230]:
soup = BeautifulSoup(html,'html.parser')

Let's first get the `a` tag using `find('a')` and inspect it.

In [231]:
a = soup.find('a')
a

<a href="https://example.com" id="sample-link">Visit Example</a>

To get the value of the `id` attribute, we use `a.get("id")`.

In [232]:
link_id = a.get("id")  # named link_id to avoid shadowing Python's built-in id()
link_id

'sample-link'

In the same way, we can get the `href`.

In [233]:
href = a.get("href")

In [234]:
href

'https://example.com'
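
The section intro also lists `.text` and `.get_text()`, which the cells above do not exercise. The sketch below covers them, along with a few related ways to read attributes. The `target` attribute and the `"_self"` fallback are made up for this example to show what happens when an attribute is missing.

```python
from bs4 import BeautifulSoup

html = '<a id="sample-link" href="https://example.com">Visit Example</a>'
soup = BeautifulSoup(html, "html.parser")
a = soup.find("a")

# Text content inside the tag: .get_text() and .text give the same string.
link_text = a.get_text()
same_text = a.text
print(link_text)

# Dictionary-style access also works, but raises KeyError if the attribute is missing.
href = a["href"]
print(href)

# .get() returns None (or a default you supply) for a missing attribute.
target = a.get("target")
target_default = a.get("target", "_self")
print(target, target_default)

# .attrs holds all attributes as a dictionary.
all_attrs = a.attrs
print(all_attrs)
```

Using `.get()` is usually the safer choice when scraping pages where an attribute may or may not be present.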