## Turtles All the Way Down: Frames & iFrames

### Step 1: Import Libraries

In [41]:
from bs4 import BeautifulSoup
import pandas as pd  # needed for the DataFrame in Step 4
import requests

### Step 2: Scrape Turtle Family Cards from the Website
***We begin by defining the base URL and building the full URL of the frame that holds the turtle family cards. Using the `requests` library, we send an HTTP GET request to retrieve that page, then parse the HTML with BeautifulSoup and collect every turtle card, each a `div` tag with class `turtle-family-card`. Finally, we prepare a dictionary called `turtle_info` with two empty lists to store the extracted turtle names and descriptions.***

In [52]:
# Base URL
base_url = 'https://www.scrapethissite.com'
# URL with turtle family cards
family_list_url = base_url + '/pages/frames/?frame=i'
# Send request to get the page
response = requests.get(family_list_url)
soup = BeautifulSoup(response.content, 'html.parser')
# Find all turtle family cards
family_cards = soup.find_all('div', class_='turtle-family-card')
# Prepare dictionary to store results
turtle_info = {
    'turtle_name': [],
    'description': []
}
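
The frame URL above is hard-coded. As a rough sketch, the same URL could instead be discovered by reading the `src` attribute of the frame element on the outer page. The `outer_html` string below is a hypothetical stand-in for that page (the real markup may differ), and `find_frame_urls` is an illustrative helper, not part of the original notebook.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup

# Hypothetical stand-in for the outer page's HTML; the real markup may differ.
outer_html = """
<html><body>
  <h1>Turtles All the Way Down</h1>
  <iframe id="iframe" src="/pages/frames/?frame=i"></iframe>
</body></html>
"""


def find_frame_urls(html, page_url):
    """Return absolute URLs for every <iframe> or <frame> that has a src."""
    soup = BeautifulSoup(html, "html.parser")
    frames = soup.find_all(["iframe", "frame"], src=True)
    # urljoin resolves relative src values against the page they came from
    return [urljoin(page_url, tag["src"]) for tag in frames]


frame_urls = find_frame_urls(
    outer_html, "https://www.scrapethissite.com/pages/frames/"
)
print(frame_urls)
```

In a live run, `outer_html` would be replaced by the text of a `requests.get` response for the outer page, and each returned URL would be fetched in turn.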

### Step 3: Loop Through Turtle Cards and Extract Details

***Now that we've located all turtle family cards on the webpage, we'll loop through each card to extract the turtle name and follow the detail link to get the description. For each card:***
***- We grab the `<h3>` tag with class `family-name` to get the turtle name.***
***- We find the `<a>` tag to locate the detail page URL.***
***- We send another request to the detail page and extract the description from the `<p class="lead">` paragraph.***
***- Each turtle name and its description are added to the `turtle_info` dictionary.***


In [56]:
# Loop through each turtle card
for card in family_cards:
    # Get turtle name
    name_tag = card.find('h3', class_='family-name')
    turtle_name = name_tag.text.strip() if name_tag else "Unknown Family"
    # Get link to detail page
    link_tag = card.find('a', href=True)
    if not link_tag:
        description = "No description available"
    else:
        # Follow the link to get description paragraph
        detail_url = base_url + link_tag['href']
        detail_response = requests.get(detail_url)
        detail_soup = BeautifulSoup(detail_response.content, 'html.parser')
        # Get the first <p class="lead"> paragraph
        description_tag = detail_soup.find('p', class_='lead')
        description = description_tag.get_text(strip=True) if description_tag else "No description available"
    # Add to data (runs for every card, including those without a link)
    turtle_info['turtle_name'].append(turtle_name)
    turtle_info['description'].append(description)
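
The per-card logic can be tried offline before any detail pages are requested. The sketch below runs the same selectors against a small inline snippet. The `sample_html` markup, its `href` value, and the `parse_card` helper are assumptions made for illustration; only the tag and class names come from the loop above. It also uses `urljoin` instead of string concatenation, which keeps working if a link is already absolute.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup

# Hypothetical card markup modeled on the selectors used in this notebook.
sample_html = """
<div class="turtle-family-card">
  <h3 class="family-name"> Cheloniidae </h3>
  <a href="/pages/frames/?family=Cheloniidae">Learn more</a>
</div>
<div class="turtle-family-card">
  <h3 class="family-name">Emydidae</h3>
</div>
"""


def parse_card(card, base_url):
    """Return (name, detail_url); detail_url is None when the card has no link."""
    name_tag = card.find("h3", class_="family-name")
    link_tag = card.find("a", href=True)
    name = name_tag.get_text(strip=True) if name_tag else "Unknown Family"
    detail_url = urljoin(base_url, link_tag["href"]) if link_tag else None
    return name, detail_url


soup = BeautifulSoup(sample_html, "html.parser")
cards = soup.find_all("div", class_="turtle-family-card")
parsed = [parse_card(card, "https://www.scrapethissite.com") for card in cards]
print(parsed)
```

The second card has no `<a>` tag, so it exercises the fallback path and still yields one entry.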

### Step 4: Store Turtle Info in a DataFrame
***After collecting the turtle names and descriptions, the `turtle_info` dictionary holds two parallel lists, one for names and one for descriptions. We convert this dictionary into a `pandas` DataFrame, where each list becomes a column.***

In [57]:
df = pd.DataFrame(turtle_info)
df

| | turtle_name | description |
|---|---|---|
| 0 | Podocnemididae | ThePodocnemididaefamily of turtles — more comm... |
| 1 | Carettochelyidae | TheCarettochelyidaefamily of turtles — more co... |
| 2 | Cheloniidae | TheCheloniidaefamily of turtles — more commonl... |
| 3 | Chelydridae | TheChelydridaefamily of turtles — more commonl... |
| 4 | Dermatemydidae | TheDermatemydidaefamily of turtles — more comm... |
| 5 | Dermochelyidae | TheDermochelyidaefamily of turtles — more comm... |
| 6 | Emydidae | TheEmydidaefamily of turtles — more commonly k... |
| 7 | Geoemydidae | TheGeoemydidaefamily of turtles — more commonl... |
| 8 | Kinosternidae | TheKinosternidaefamily of turtles — more commo... |
| 9 | Platysternidae | ThePlatysternidaefamily of turtles — more comm... |
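
The descriptions in this output run words together (`ThePodocnemididaefamily`). A likely cause is that `get_text(strip=True)` strips each text fragment and then joins the fragments with no separator, so a family name wrapped in an inline tag loses its surrounding spaces. The sketch below reproduces the effect on an assumed snippet (the real page markup is not shown in this notebook) and compares two possible fixes.

```python
from bs4 import BeautifulSoup

# Assumed markup: the family name sits inside an inline tag.
html = '<p class="lead">The <strong>Podocnemididae</strong> family of turtles</p>'
tag = BeautifulSoup(html, "html.parser").find("p", class_="lead")

# Each fragment is stripped, then joined with "" -> words fuse together
fused = tag.get_text(strip=True)

# Option 1: join the stripped fragments with a single space
spaced = tag.get_text(" ", strip=True)

# Option 2: take the raw text and collapse all runs of whitespace
normalized = " ".join(tag.get_text().split())

print(fused)
print(spaced)
print(normalized)
```

Option 1 can leave a space before punctuation that directly follows an inline tag; Option 2 avoids that by preserving the original spacing and only collapsing whitespace runs.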
