# Web Scraping With Beautiful Soup
### Author : Alphonse Brandon
### Date : 12 March 2022
### Time : 7:12am 

Importing the required modules

In [1]:
from bs4 import BeautifulSoup
import requests

Consider the following HTML document

In [2]:
%%html
<!DOCTYPE html>
<html>
<head>
<title>Page Title</title>
</head>
<body>
<h3><b id='boldest'>Lebron James</b></h3>
<p>Salary: $ 92,000,000</p>
<h3>Stephen Curry</h3>
<p>Salary: $85,000, 000</p>
<h3>Kevin Durant</h3>
<p>Salary: $73,200, 000</p>
</body>
</html>

Alternatively, storing the same HTML document as a string so that Beautiful Soup can parse it

In [4]:
html = "<!DOCTYPE html><html><head><title>Page Title</title></head><body><h3><b id='boldest'>Lebron James</b></h3><p> Salary: $ 92,000,000 </p><h3> Stephen Curry</h3><p> Salary: $85,000, 000 </p><h3> Kevin Durant </h3><p> Salary: $73,200, 000</p></body></html>"

In [5]:
soup = BeautifulSoup(html, "html.parser")

Displaying the HTML string as nested, indented HTML using prettify()

In [6]:
print(soup.prettify())

<!DOCTYPE html>
<html>
 <head>
  <title>
   Page Title
  </title>
 </head>
 <body>
  <h3>
   <b id="boldest">
    Lebron James
   </b>
  </h3>
  <p>
   Salary: $ 92,000,000
  </p>
  <h3>
   Stephen Curry
  </h3>
  <p>
   Salary: $85,000, 000
  </p>
  <h3>
   Kevin Durant
  </h3>
  <p>
   Salary: $73,200, 000
  </p>
 </body>
</html>


### Accessing html tags

In [7]:
tag_object = soup.title
print("Tag object:", tag_object)

Tag object: <title>Page Title</title>


Checking the type of the tag object

In [8]:
print(type(tag_object))

<class 'bs4.element.Tag'>


In [9]:
tag_object = soup.h3
tag_object

<h3><b id="boldest">Lebron James</b></h3>
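
Note that `soup.h3` returns only the first `<h3>` in the document. The following is a minimal sketch of how every `<h3>` could be collected with `find_all`. It assumes the same markup as above, abbreviated into a hypothetical `sample_html` string, and the names `sample_soup`, `first_h3`, `all_h3` and `player_names` are illustrative only.

```python
from bs4 import BeautifulSoup

# Hypothetical, shortened copy of the document used in this notebook
sample_html = (
    "<body><h3><b id='boldest'>Lebron James</b></h3><p> Salary: $ 92,000,000 </p>"
    "<h3> Stephen Curry</h3><p> Salary: $85,000, 000 </p>"
    "<h3> Kevin Durant </h3><p> Salary: $73,200, 000</p></body>"
)
sample_soup = BeautifulSoup(sample_html, "html.parser")

# Attribute-style access stops at the first matching tag
first_h3 = sample_soup.h3

# find_all returns a list-like result containing every matching tag
all_h3 = sample_soup.find_all("h3")

# get_text(strip=True) removes the surrounding whitespace from each name
player_names = [h3.get_text(strip=True) for h3 in all_h3]
print(player_names)
```

Attribute-style access is convenient for unique tags such as `title`. For repeated tags such as `h3`, `find_all` is usually the safer choice.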

### Children, Parents and Siblings

In [10]:
tag_child = soup.b 
tag_child

<b id="boldest">Lebron James</b>

In [11]:
parent_tag = tag_child.parent
parent_tag

<h3><b id="boldest">Lebron James</b></h3>

In [12]:
tag_object.parent

<body><h3><b id="boldest">Lebron James</b></h3><p> Salary: $ 92,000,000 </p><h3> Stephen Curry</h3><p> Salary: $85,000, 000 </p><h3> Kevin Durant </h3><p> Salary: $73,200, 000</p></body>
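
Navigation also works in the other direction, from a parent down to its children. The following is a small sketch under the assumption of a shortened, hypothetical `sample_html` string. It uses the `.children` iterator and the `.contents` list of a tag, and the variable names are illustrative only.

```python
from bs4 import BeautifulSoup
from bs4.element import Tag

# Hypothetical, shortened copy of the document used in this notebook
sample_html = (
    "<body><h3><b id='boldest'>Lebron James</b></h3>"
    "<p> Salary: $ 92,000,000 </p></body>"
)
sample_soup = BeautifulSoup(sample_html, "html.parser")
body = sample_soup.body

# .children iterates over direct children only; keep the Tag objects
child_names = [child.name for child in body.children if isinstance(child, Tag)]
print(child_names)

# .contents gives the same direct children as a list
h3_contents = sample_soup.h3.contents
print(h3_contents)

# Going back up: the parent of <b> is the <h3> that contains it
b_parent_name = sample_soup.b.parent.name
print(b_parent_name)
```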

In [13]:
sibling1 = tag_object.next_sibling
sibling1

<p> Salary: $ 92,000,000 </p>

In [14]:
sibling2 = sibling1.next_sibling
sibling2

<h3> Stephen Curry</h3>
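
The siblings above are tags because the HTML string has no whitespace between its elements. When the markup contains line breaks between tags, as in the `%%html` cell, `next_sibling` returns the whitespace text node first. The following is a minimal sketch of that behaviour, assuming a hypothetical `spaced_html` string, with `find_next_sibling` used to skip straight to the next tag.

```python
from bs4 import BeautifulSoup

# Hypothetical snippet with newlines between the tags
spaced_html = "<body>\n<h3>Kevin Durant</h3>\n<p>Salary: $73,200, 000</p>\n</body>"
spaced_soup = BeautifulSoup(spaced_html, "html.parser")
h3 = spaced_soup.h3

# next_sibling returns the very next node, here the newline text node
raw_sibling = h3.next_sibling
print(repr(raw_sibling))

# find_next_sibling("p") skips text nodes and returns the next <p> tag
tag_sibling = h3.find_next_sibling("p")
print(tag_sibling)
```

If a sibling lookup unexpectedly returns a blank string, whitespace between the tags is the likely cause.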

### HTML Attributes