## 6. Limitations of BeautifulSoup
- When you open a page in your browser, the browser sends a request to the page's server and receives the page content. That is usually some HTML, some CSS, and some JavaScript.
- The key difference between loading a page in your browser and fetching it with `requests` is that the browser executes the JavaScript that comes with the page, while `requests` only returns the raw response. BeautifulSoup parses that raw HTML and never runs scripts, so any content that JavaScript adds after the page loads is missing from the parse tree. In a browser you can sometimes see the initial page content (before the JavaScript runs) for a few moments, and then the JavaScript kicks in.
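The point above can be reproduced without any network access. The sketch below uses a small, made-up HTML string (not one of the pages scraped in this section) in which a script would insert a price paragraph at load time. It uses Python's built-in `html.parser` so that it does not depend on `lxml`.

```python
from bs4 import BeautifulSoup

# Hypothetical page: the price paragraph is created by JavaScript at load time.
html = """
<html>
  <body>
    <div id="content"></div>
    <script>
      document.getElementById("content").innerHTML =
        '<p class="price">Rs.2000</p>';
    </script>
  </body>
</html>
"""

soup = BeautifulSoup(html, "html.parser")

# The parser treats the script body as plain text and never executes it,
# so the <p> element does not exist in the parse tree.
price = soup.find("p", class_="price")
script_text = soup.find("script").string

print(price)                     # None
print("Rs.2000" in script_text)  # True: the data is there, but only as text
```

A browser would run the script and show the price; BeautifulSoup only sees the markup as it arrived.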

In [6]:
import requests
from bs4 import BeautifulSoup
import pandas as pd

**Example 1:**

In [41]:
resp = requests.get("http://quotes.toscrape.com/")
print(resp.text)

<!DOCTYPE html>
<html lang="en">
<head>
	<meta charset="UTF-8">
	<title>Quotes to Scrape</title>
    <link rel="stylesheet" href="/static/bootstrap.min.css">
    <link rel="stylesheet" href="/static/main.css">
</head>
<body>
    <div class="container">
        <div class="row header-box">
            <div class="col-md-8">
                <h1>
                    <a href="/" style="text-decoration: none">Quotes to Scrape</a>
                </h1>
            </div>
            <div class="col-md-4">
                <p>
                
                    <a href="/login">Login</a>
                
                </p>
            </div>
        </div>
    

<div class="row">
    <div class="col-md-8">

    <div class="quote" itemscope itemtype="http://schema.org/CreativeWork">
        <span class="text" itemprop="text">“The world as we have created it is a process of our thinking. It cannot be changed without changing our thinking.”</span>
        <span>by <small class="author" itempr

In [42]:
resp = requests.get("http://quotes.toscrape.com/")
soup = BeautifulSoup(resp.text,'lxml')
quote = soup.find('span', class_='text')
print(quote)

<span class="text" itemprop="text">“The world as we have created it is a process of our thinking. It cannot be changed without changing our thinking.”</span>


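>- http://quotes.toscrape.com/js is the JavaScript version of http://quotes.toscrape.com/, in which the quotes are inserted into the page by JavaScript.
>- `requests` does not run that JavaScript, so the `span` elements are absent from the raw HTML and the following code prints `None`.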

In [43]:
resp = requests.get("http://quotes.toscrape.com/js")
soup = BeautifulSoup(resp.text,'lxml')
quote = soup.find('span', class_='text')
print(quote)

None
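When `find()` returns `None` on a JavaScript-driven page, the data is often still present in the response, embedded in a `<script>` tag. The sketch below assumes the page stores its records as a JSON array assigned to a variable named `data`; that layout, the variable name, and the sample records are assumptions for illustration, so inspect `resp.text` before relying on this pattern for a real page.

```python
import json
import re

from bs4 import BeautifulSoup

# Hypothetical stand-in for a JavaScript-rendered page, not a real response.
html = """
<html><body>
<script>
    var data = [
        {"text": "First sample quote.", "author": {"name": "Author A"}},
        {"text": "Second sample quote.", "author": {"name": "Author B"}}
    ];
    for (var i in data) {
        document.write("<div class='quote'>" + data[i].text + "</div>");
    }
</script>
</body></html>
"""

soup = BeautifulSoup(html, "html.parser")

# The quote <div> elements exist only after the script runs in a browser.
print(soup.find("div", class_="quote"))  # None

# Look for the embedded JSON array in each script and decode it.
quotes = []
for script in soup.find_all("script"):
    match = re.search(r"var data = (\[.*?\]);", script.string or "", re.DOTALL)
    if match:
        quotes = json.loads(match.group(1))
        break

for q in quotes:
    print(q["author"]["name"], "-", q["text"])
```

The regular expression is deliberately simple and would break if a record contained the sequence `];`, so treat it as a starting point rather than a general parser.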


**Example 2:**

In [44]:
resp = requests.get("https://arifpucit.github.io/bss2/")
soup = BeautifulSoup(resp.text,'lxml')
price = soup.find('p', class_='price green')
price

<p class="price green">Rs.2000</p>

>- https://arifpucit.github.io/bss2/js/ is the JavaScript version of https://arifpucit.github.io/bss2/
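>- **`find()` does not raise an error when nothing matches; it returns `None`.**
>- **The following code requests a JavaScript version of the page hosted at https://hamsof.github.io/bss-js-verison/ and prints `None`, because the price element is not present in the raw HTML.**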

In [46]:
resp = requests.get("https://hamsof.github.io/bss-js-verison/")
soup = BeautifulSoup(resp.text,'lxml')
price = soup.find('p', class_='price green')
print(price)

None
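Since a missing element shows up as `None` rather than as an exception, the failure usually appears later, as an `AttributeError` when the code reads `.text` from the result. A small guard makes the missing element explicit. The helper name `extract_text` and both HTML strings below are hypothetical and only illustrate the check.

```python
from typing import Optional

from bs4 import BeautifulSoup


def extract_text(soup: BeautifulSoup, name: str, css_class: str) -> Optional[str]:
    """Return the stripped text of the first matching tag, or None if absent."""
    tag = soup.find(name, class_=css_class)
    return tag.get_text(strip=True) if tag is not None else None


# Hypothetical static page: the element is part of the HTML itself.
static_html = '<p class="price green">Rs.2000</p>'
# Hypothetical JavaScript page: the element exists only inside a script string.
js_html = '<script>document.write(`<p class="price green">Rs.2000</p>`)</script>'

static_price = extract_text(BeautifulSoup(static_html, "html.parser"), "p", "price green")
js_price = extract_text(BeautifulSoup(js_html, "html.parser"), "p", "price green")

print(static_price)  # Rs.2000
print(js_price)      # None

if js_price is None:
    print("Element not found: the page may be rendered by JavaScript")
```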


In [36]:
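resp = requests.get("https://hamsof.github.io/bss-js-verison/")
soup = BeautifulSoup(resp.text, 'lxml')
# The book containers appear only inside a <script> string, so they are not parsed as tags
containers = soup.find_all('div', class_='book_container col-sm-4')
# Display the first <div> of the raw HTML: the content is wrapped in document.write()
soup.div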

<div class="main-container d-flex align-items-start justify-content-between">
<div class="navbar">
<ul class="nav-links">
<div class="link text-center" id="book_title">Books Titles</div>
<li class="link book_type"><a href="index.html">Operating System</a></li>
<li class="link book_type"><a href="SP.html">System Programming</a></li>
<li class="link book_type"><a href="CA.html">Computer Architecture</a></li>
</ul>
</div>
<!-- main content  -->
<script>
                document.write(
                    `
                    <div class="items" style="" id="main_content">
                <div id="book_page_titile">Operating Systems</div>
                <div class="row">
                    <div class="book_container col-sm-4">
                        <img src="images/OS concepts.jpg" alt="" title="The Linux Programming Interface (TLPI) is the definitive guide &#10; to the Linux and UNIX programming interface—the interface&#10; employed by nearly every application that runs on a &#10;Linu

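The output above shows the page content sitting inside a `document.write()` call as a template literal. When that string is static, one workaround is to pull it out of the script text and parse it with a second `BeautifulSoup` pass. The sketch below runs on a shortened, made-up imitation of that markup. It assumes a single backtick-delimited string with no `${...}` interpolation; anything more dynamic needs a tool that actually executes JavaScript.

```python
import re

from bs4 import BeautifulSoup

# Shortened, hypothetical imitation of a page that writes its content via script.
html = """
<div class="main-container">
<script>
    document.write(
        `
        <div class="items" id="main_content">
            <div class="row">
                <div class="book_container col-sm-4">
                    <p class="price green">Rs.2000</p>
                </div>
            </div>
        </div>
        `
    );
</script>
</div>
"""

outer = BeautifulSoup(html, "html.parser")

# First pass: the price is not a tag yet, only text inside the script.
print(outer.find("p", class_="price green"))  # None

# Extract the template literal passed to document.write().
script = outer.find("script")
match = re.search(r"document\.write\(\s*`(.*?)`", script.string, re.DOTALL)

# Second pass: parse the extracted string as HTML in its own right.
inner = BeautifulSoup(match.group(1), "html.parser")
price = inner.find("p", class_="price green")
print(price.get_text())  # Rs.2000
```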
In [23]:
# Parse a local copy of the JavaScript version of the page
with open('/Users/arif/Downloads/simple-book-show-master3/version4(js)/index.html') as fd:
    soup = BeautifulSoup(fd, 'lxml')
print(soup.prettify())

<!DOCTYPE html>
<html lang="en">
 <head>
  <meta charset="utf-8"/>
  <meta content="width=device-width, initial-scale=1" name="viewport"/>
  <title>
   Version 3
  </title>
  <!-- external style sheet -->
  <link href="./index.css" rel="stylesheet"/>
  <!--Bootstrap style sheet-->
  <link crossorigin="anonymous" href="https://cdn.jsdelivr.net/npm/bootstrap@5.2.0-beta1/dist/css/bootstrap.min.css" integrity="sha384-0evHe/X+R7YkIZDRvuzKMRqM+OrBnVFBL6DOitfPri4tjfHxaWutUpFmBp4vmVor" rel="stylesheet"/>
  <!--for icons of tick and cross ans star for in stock-->
  <link href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/4.7.0/css/font-awesome.min.css" rel="stylesheet"/>
 </head>
 <body>
  <header class="header d-flex align-items-center justify-content-between">
   <img alt="arif" class="image-container" src="./images//arif.jpg"/>
   <p>
    <span class="large_text">
     Scraper Site
    </span>
    <span class="small_text">
     Powered By &amp;copy Dr. Arif Butt
     <span>
     </spa