### Introduction 

When scraping the web, you need an efficient way to extract content from an HTML document. BeautifulSoup, a Python library, supports CSS selectors through its select() method, which locates and returns HTML elements by tag, ID, class, or position in the document tree. The table below summarizes the most common selector patterns.

<table>
  <tr>
    <th>Selector Pattern</th>
    <th>Description</th>
    <th>Example Code</th>
    <th>Example Match</th>
  </tr>
  
  <tr>
    <td><span>soup.select('tag')</span></td>
    <td>Selects all elements with the specified tag (e.g., p, h1, div).</td>
    <td>soup.select('p')</td>
    <td>&lt;p&gt;Hello World&lt;/p&gt;</td>
  </tr>
  
  <tr>
    <td><span>soup.select('#id')</span></td>
    <td>Selects elements by their ID using the # symbol.</td>
    <td>soup.select('#header')</td>
    <td>&lt;div id="header"&gt;Header Content&lt;/div&gt;</td>
  </tr>
  
  <tr>
    <td><span>soup.select('.class')</span></td>
    <td>Selects elements by their class using the . symbol.</td>
    <td>soup.select('.button')</td>
    <td>&lt;button class="button"&gt;Click Me&lt;/button&gt;</td>
  </tr>
  
  <tr>
    <td><span>soup.select('div span')</span></td>
    <td>Selects all &lt;span&gt; elements nested at any depth inside a &lt;div&gt; element.</td>
    <td>soup.select('div span')</td>
    <td>&lt;div&gt;&lt;span&gt;Text&lt;/span&gt;&lt;/div&gt;</td>
  </tr>
  
  <tr>
    <td><span>soup.select('div > span')</span></td>
    <td>Selects only &lt;span&gt; elements that are direct children of &lt;div&gt;.</td>
    <td>soup.select('div > span')</td>
    <td>&lt;div&gt;&lt;span&gt;Direct Text&lt;/span&gt;&lt;/div&gt;</td>
  </tr>
</table>
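
To see the five patterns from the table side by side, here is a minimal sketch run against a small HTML snippet. The markup is made up for illustration and is not taken from any real page.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup, invented for this illustration
html = """
<div id="header">
  <p>Hello World</p>
  <span>Direct Text</span>
  <p><span>Nested Text</span></p>
  <button class="button">Click Me</button>
</div>
"""

soup = BeautifulSoup(html, "html.parser")

by_tag = soup.select("p")               # every <p> element
by_id = soup.select("#header")          # the element with id="header"
by_class = soup.select(".button")       # elements with class="button"
descendants = soup.select("div span")   # <span> at any depth inside a <div>
children = soup.select("div > span")    # <span> directly under a <div>

print(len(by_tag), len(by_id), len(by_class))
print([s.get_text() for s in descendants])
print([s.get_text() for s in children])
```

Note that select() always returns a list, even when only one element matches, so the descendant selector finds both spans while the child selector finds only the one placed directly under the div.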


In [2]:
import requests
from bs4 import BeautifulSoup

In [4]:
# URL of the website to scrape
url = "https://www.biography.com/"
response = requests.get(url)

# Parsing the HTML content using BeautifulSoup
soup = BeautifulSoup(response.text, "html.parser")

# Extract the article titles with a class selector.
# ".css-389x1x" was the class on the title elements when this was written;
# class names like this can change, so replace it with the current one.
titles = soup.select(".css-389x1x")
for entry in titles:
    print(entry.text.strip())

Inside Dolly Parton’s Private Marriage to Carl Dean
Inside Ruby Franke’s Chilling Spiral from Popular “Momfluencer” to Convicted Felon


### Extracting All Links (Anchor Tags)

In [24]:

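from urllib.parse import urljoin

links = soup.select("a")
for link in links:
    text = link.get_text(strip=True)
    href = link.get("href", "#")  # Use '#' if href is missing
    # urljoin resolves relative paths against the page URL and leaves absolute URLs unchanged
    print(f"Link to '{text}' ===> {urljoin(url, href)}")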

Link to 'Search' ===> https://www.biography.com/search
Link to 'Women’s History Month' ===> https://www.biography.com/womens-history
Link to 'History & Culture' ===> https://www.biography.com/history-culture
Link to 'Movies & TV' ===> https://www.biography.com/movies-tv
Link to 'Musicians' ===> https://www.biography.com/musicians
Link to 'Athletes' ===> https://www.biography.com/athletes
Link to 'Artists' ===> https://www.biography.com/artists
Link to 'Power & Politics' ===> https://www.biography.com/political-figures
Link to 'Business' ===> https://www.biography.com/business-leaders
Link to 'Scholars & Educators' ===> https://www.biography.com/scholars-educators
Link to 'Scientists' ===> https://www.biography.com/scientists
Link to 'Activists' ===> https://www.biography.com/activists
Link to 'Notorious Figures' ===> https://www.biography.com/crime
Link to 'BIO Buys' ===> https://www.biography.com/bio-buys
Link to 'Newsletter' ===> https://www.biography.com/email/biograp

### Explanation:

#### Fetching the Webpage:

- `requests.get(url)`: Sends an HTTP GET request to the given URL and returns the server's response.

- `response.text`: Holds the response body, here the page's raw HTML, as a string.
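
A request can fail or return an error page, so it can help to check the response before parsing it. The sketch below is one possible approach, not part of the original notebook: `fetch_html` is a hypothetical helper, the 10-second timeout is an arbitrary choice, and the `get` parameter exists only so the function can be exercised without network access.

```python
import requests


def fetch_html(url, timeout=10, get=requests.get):
    """Return the page's HTML as a string, raising on HTTP errors."""
    # A timeout keeps the call from waiting indefinitely on a slow server
    response = get(url, timeout=timeout)
    # Raises requests.HTTPError for 4xx and 5xx status codes
    response.raise_for_status()
    return response.text
```

Calling `fetch_html("https://www.biography.com/")` would then return the same string as `response.text` in the cell above, or raise an exception if the server responds with an error status.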

#### Parsing HTML with BeautifulSoup:

- `soup = BeautifulSoup(response.text, "html.parser")`: Parses the HTML with Python's built-in `html.parser` and returns a BeautifulSoup object that can be searched and navigated.
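
To illustrate what the parsed object offers, here is a small sketch that parses an inline HTML string instead of a downloaded page. The markup is a made-up example.

```python
from bs4 import BeautifulSoup

# Hypothetical document used in place of response.text
html = "<html><head><title>Sample Page</title></head><body><p>Hello World</p></body></html>"
soup = BeautifulSoup(html, "html.parser")

page_title = soup.title.string               # text inside the <title> tag
first_paragraph = soup.find("p").get_text()  # text of the first <p> element

print(page_title)
print(first_paragraph)
```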

#### Extracting Titles Using a Class Selector:

- `soup.select(".css-389x1x")`: Returns a list of every element with the class `css-389x1x`, which is assumed here to be the class used for article titles.

- `entry.text.strip()`: Extracts the element's text and removes leading and trailing whitespace.

#### Extracting Links Using a Tag Selector:

- `soup.select("a")`: Selects all `<a>` (anchor) elements.

- `link.get_text(strip=True)`: Extracts the link text with surrounding whitespace removed.

- `link.get("href", "#")`: Returns the value of the `href` attribute, or `#` if the attribute is missing.

- The loop prints each link's text alongside its full URL.
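
An `href` value may be a relative path, a complete URL, or absent altogether, so building the full URL needs some care. The sketch below shows one way to handle all three cases with `urljoin` from the standard library. The anchors are made-up examples, and skipping links without an `href` is a design choice for this illustration rather than the notebook's behavior.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup

base_url = "https://www.biography.com/"  # page the links were scraped from

# Hypothetical anchors covering three cases: relative, absolute, and missing href
html = """
<a href="/musicians">Musicians</a>
<a href="https://example.com/about">About</a>
<a>No target</a>
"""
soup = BeautifulSoup(html, "html.parser")

resolved = []
for link in soup.select("a"):
    href = link.get("href")
    if href is None:
        continue  # skip anchors that do not point anywhere
    # Relative paths are resolved against base_url; absolute URLs pass through unchanged
    resolved.append((link.get_text(strip=True), urljoin(base_url, href)))

for text, full_url in resolved:
    print(f"Link to '{text}' ===> {full_url}")
```

Because `urljoin` understands URL structure, it avoids the doubled slashes that plain string concatenation can produce when the base URL ends with `/` and the path begins with one.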