# Beautiful Soup 🍲

Beautiful Soup is a Python library for parsing HTML and XML documents, most often used for web scraping. It simplifies navigating and searching the parsed tree, making it easier to extract the information you need. Beautiful Soup does not download pages itself, so this guide pairs it with the requests library. Here's a step-by-step guide to web scraping with Beautiful Soup:

### *▶️* Import packages:

from bs4 import BeautifulSoup<br/>
import requests


### *▶️* Make a request to the webpage:

url = 'https://example.com' <br/>
response = requests.get(url, timeout=10) <br/>
response.raise_for_status()  # stop early on 4xx/5xx responses


### *▶️* Create a Beautiful Soup object:

soup = BeautifulSoup(response.text, 'html.parser')<br/>

Here, 'html.parser' is a parser provided by Python's standard library, but you can also use others like 'lxml' or 'html5lib' depending on your needs.
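As a minimal sketch of what the resulting object gives you, the snippet below parses a small inline HTML string instead of a live response. The markup and variable names are made up for illustration; only `'html.parser'` is used, so nothing beyond Beautiful Soup itself needs to be installed.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for response.text
html = "<html><head><title>Demo page</title></head><body><p>Hello</p></body></html>"

soup = BeautifulSoup(html, "html.parser")

# Attribute access returns the first tag with that name
page_title = soup.title.string       # text inside <title>
first_paragraph = soup.p.get_text()  # text inside the first <p>

print(page_title)
print(first_paragraph)
```

Swapping `"html.parser"` for `"lxml"` or `"html5lib"` should work the same way here, provided the corresponding package is installed.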



### *▶️*  Navigate and extract data:

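Use Beautiful Soup methods to navigate and extract information from the HTML. Some common methods include find(), find_all(), and select(). Note that find() returns None when nothing matches, while find_all() and select() return an empty list. <br/>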

### Find a specific element by tag and class

title = soup.find('h1', class_='title')

### Find all elements with a specific tag

paragraphs = soup.find_all('p')

### Extract text content from an element

for paragraph in paragraphs:
    print(paragraph.text)
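The three steps above can be tried end to end on a small inline document. This is only a sketch: the HTML string and the names `title_text`, `paragraph_texts`, and `missing` are invented for the example.

```python
from bs4 import BeautifulSoup

# Hypothetical page content used only for this example
html = """
<html><body>
  <h1 class="title">Weekly Report</h1>
  <p>First paragraph.</p>
  <p>Second <b>paragraph</b>.</p>
</body></html>
"""

soup = BeautifulSoup(html, "html.parser")

# find() returns the first match, or None if there is none
title = soup.find("h1", class_="title")
title_text = title.get_text(strip=True)

# find_all() returns a list-like collection of every match
paragraphs = soup.find_all("p")
paragraph_texts = [p.get_text() for p in paragraphs]

# A tag that is not in the document yields None rather than an error
missing = soup.find("h2")

print(title_text)
print(paragraph_texts)
print(missing)
```

`get_text()` joins the text of nested tags, which is why the `<b>` inside the second paragraph does not split the result.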

<br/>

You can also use CSS selectors with the select() method. It returns a list of every matching element; use select_one() when you only want the first match.

### Using CSS selector
titles = soup.select('h1.title')
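A short sketch of the difference between `select()` and `select_one()`, again on made-up markup. The selectors and variable names are illustrative only.

```python
from bs4 import BeautifulSoup

# Hypothetical markup for demonstrating CSS selectors
html = (
    '<div id="main"><h1 class="title">Hello</h1>'
    '<ul><li class="item">a</li><li class="item">b</li></ul></div>'
)
soup = BeautifulSoup(html, "html.parser")

# select() always returns a list, even for a single match
titles = soup.select("h1.title")

# select_one() returns the first match, or None when nothing matches
first_title = soup.select_one("h1.title")
nothing = soup.select_one("h2.title")

# Combinators such as ">" (direct child) are supported
items = [li.get_text() for li in soup.select("#main ul > li.item")]

print(len(titles), first_title.get_text(), items, nothing)
```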








### *▶️*  Handle exceptions:

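Web scraping is sensitive to changes in website structure. When an element is missing, find() returns None, so accessing .text on the result raises an AttributeError. Handle that case to avoid crashes.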


try: <br/>
    title = soup.find('h1', class_='title').text <br/>
except AttributeError as e: <br/>
    title = None <br/>
    print(f"An error occurred: {e}")
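An alternative to catching `AttributeError` is to check for `None` explicitly. The helper below is a hypothetical convenience function (`text_or_default` is not part of Beautiful Soup), sketched under the assumption that a missing element should simply fall back to a default value.

```python
from bs4 import BeautifulSoup


def text_or_default(soup, name, default=None, **attrs):
    """Return the stripped text of the first matching tag, or `default`."""
    element = soup.find(name, **attrs)
    if element is None:
        # Nothing matched, so there is no text to read
        return default
    return element.get_text(strip=True)


# Hypothetical markup: it has an <h1 class="title"> but no <h2>
soup = BeautifulSoup('<h1 class="title"> Present </h1>', "html.parser")

found = text_or_default(soup, "h1", class_="title")
fallback = text_or_default(soup, "h2", default="n/a")

print(found, fallback)
```

An explicit check keeps unrelated `AttributeError`s from being silently swallowed, which a broad `try`/`except` around larger blocks of code might do.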

<br/>



### *▶️*  Iterate over multiple pages:

If you need to scrape multiple pages, put the scraping logic inside a loop.


for page_num in range(1, 6): <br/>
    url = f'https://example.com/page/{page_num}' <br/>
    response = requests.get(url) <br/>
    soup = BeautifulSoup(response.text, 'html.parser') <br/>
    # Scraping logic for each page
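One way to keep the loop testable is to pass in the function that fetches each page. The sketch below assumes a hypothetical `scrape_headings` helper and uses an in-memory dictionary in place of real HTTP requests, so it runs without network access.

```python
from bs4 import BeautifulSoup


def scrape_headings(fetch_html, page_numbers):
    """Collect the first <h1> text of each page; `fetch_html` maps a page number to HTML or None."""
    results = {}
    for page_num in page_numbers:
        html = fetch_html(page_num)
        if html is None:
            # Page could not be fetched; skip it
            continue
        soup = BeautifulSoup(html, "html.parser")
        heading = soup.find("h1")
        results[page_num] = heading.get_text(strip=True) if heading else None
    return results


# Fake pages standing in for https://example.com/page/<n>
fake_pages = {
    1: "<h1>Page one</h1>",
    2: "<h1>Page two</h1>",
    3: "<p>No heading here</p>",
}

headings = scrape_headings(fake_pages.get, range(1, 5))
print(headings)
```

Against a real site, `fetch_html` could be something like `lambda n: requests.get(f'https://example.com/page/{n}', timeout=10).text`. Adding a short `time.sleep()` between requests is a common courtesy to the server.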

<br/>

### *▶️*  Advanced Usage:
Handling different types of data:

Beautiful Soup can handle both HTML and XML documents. Adjust the parser accordingly.

<br/>

### For XML
soup = BeautifulSoup(xml_data, 'xml')
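A small sketch of XML parsing follows. The `'xml'` parser relies on the lxml package being installed; when it is missing, Beautiful Soup raises `FeatureNotFound`, which the example catches. The `xml_data` content is invented for illustration.

```python
from bs4 import BeautifulSoup, FeatureNotFound

# Hypothetical XML document
xml_data = """
<catalog>
  <book id="b1"><title>First</title></book>
  <book id="b2"><title>Second</title></book>
</catalog>
"""

try:
    soup = BeautifulSoup(xml_data, "xml")
    # Tag names are case-sensitive in XML, unlike in HTML parsing
    books = {
        book["id"]: book.find("title").get_text()
        for book in soup.find_all("book")
    }
except FeatureNotFound:
    # lxml is not installed, so the 'xml' parser is unavailable
    books = None

print(books)
```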

<br/>

### *▶️*  Navigating the tree:

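Beautiful Soup provides various attributes to navigate the HTML/XML tree, such as parent, children, descendants, and next_sibling. Note that children and descendants are iterators, and that whitespace between tags appears in the tree as text nodes.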


parent_element = soup.find('div') <br/>
children_elements = parent_element.children
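The navigation attributes can be explored on a compact document. In this sketch the markup is deliberately written without whitespace between tags, so that `next_sibling` lands on a tag rather than on a text node; all names are illustrative.

```python
from bs4 import BeautifulSoup, NavigableString

# Hypothetical markup with no whitespace between tags
html = '<div id="box"><p>one</p><p>two</p><span>three</span></div>'
soup = BeautifulSoup(html, "html.parser")

parent_element = soup.find("div")

# .children yields direct children only
child_names = [child.name for child in parent_element.children]

# .next_sibling and .parent move sideways and upwards
first = parent_element.find("p")
sibling_text = first.next_sibling.get_text()
parent_id = first.parent["id"]

# .descendants walks the whole subtree, including text nodes
descendant_texts = [
    str(node)
    for node in parent_element.descendants
    if isinstance(node, NavigableString)
]

print(child_names, sibling_text, parent_id, descendant_texts)
```

With pretty-printed HTML, `find_next_sibling()` is often more convenient than `next_sibling` because it skips the whitespace text nodes.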

<br/>

### *▶️*  Modifying the HTML:

You can modify the HTML content and write it back to a file.


title = soup.find('h1') <br/>
title.string = 'New Title'  # .text is read-only; assign to .string instead <br/>
with open('new_page.html', 'w', encoding='utf-8') as f: <br/>
    f.write(str(soup))
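Beyond replacing text, tags can also gain attributes or be removed entirely. The following is a minimal sketch on made-up markup; it serializes to a string rather than a file so the result is easy to inspect.

```python
from bs4 import BeautifulSoup

# Hypothetical document to modify
html = "<html><body><h1>Old Title</h1><p class='note'>draft text</p></body></html>"
soup = BeautifulSoup(html, "html.parser")

# Replace the text of a tag by assigning to .string
title = soup.find("h1")
title.string = "New Title"

# Attributes behave like dictionary entries
title["class"] = "headline"

# decompose() removes a tag and its contents from the tree
note = soup.find("p", class_="note")
note.decompose()

updated_html = str(soup)
print(updated_html)
```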

<br/>    

### *▶️*  Handling forms:

Beautiful Soup only parses documents; it cannot submit forms. To interact with a form on a website, send the form data yourself with requests.


payload = {'username': 'your_username', 'password': 'your_password'} <br/>
response = requests.post('https://example.com/login', data=payload)
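Beautiful Soup can still help here by reading the form's fields before you submit it, for example to pick up hidden inputs. The sketch below assumes a hypothetical login form with a `csrf_token` field; real sites will use different field names.

```python
from bs4 import BeautifulSoup

# Hypothetical login page markup
html = """
<form action="/login" method="post">
  <input type="hidden" name="csrf_token" value="abc123">
  <input type="text" name="username">
  <input type="password" name="password">
  <input type="submit" value="Log in">
</form>
"""
soup = BeautifulSoup(html, "html.parser")
form = soup.find("form")

# Start from the form's own named fields so hidden values are preserved
payload = {
    field["name"]: field.get("value", "")
    for field in form.find_all("input")
    if field.has_attr("name")
}

# Fill in the credentials
payload.update({"username": "your_username", "password": "your_password"})

action = form.get("action")
method = form.get("method", "get").lower()

print(method, action, payload)
```

The resulting `payload` could then be passed to `requests.post()` with the form's `action` resolved against the page URL, ideally through a `requests.Session()` so that cookies persist between the page load and the submission.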