# BeautifulSoup 4: Navigation

BeautifulSoup is a powerful package, largely because of the many navigation methods it provides. The most commonly used navigation functions are listed below:

- find / findAll
- findNext / findAllNext
- findPrevious / findAllPrevious
- findNextSibling / findNextSiblings
- findParent / findParents
- findChild / findChildren

The functions to the left of the slash (singular form) find and return only the first match. Thus, the outcome is a single tag object, or ```None``` if nothing matches. The functions to the right of the slash (plural form) find and return all matches. Thus, the outcome is a list of tag objects, which is empty if nothing matches. In BeautifulSoup 4 these camelCase names are kept for backward compatibility as aliases of snake_case methods such as ```find_all``` and ```find_next```.
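
The difference between the singular and plural forms can be checked on a small document. The sketch below is illustrative only: the HTML string is a hypothetical fragment modelled on the page used in this tutorial, and the calls use `find` / `find_all`, the BeautifulSoup 4 spellings of **find** / **findAll**.

```python
from bs4 import BeautifulSoup

# Hypothetical fragment modelled on the tutorial page (not fetched from the site)
html = """
<ul class="personal-info">
<li><label>Name</label><span>Hrant Davtyan</span></li>
<li><label>Email</label><span>hdavtyan@aua.am</span></li>
<li><label>Phone</label><span>+374 99 02-06-62</span></li>
</ul>
"""
soup = BeautifulSoup(html, "html.parser")

first_label = soup.find("label")      # singular: one tag (or None)
all_labels = soup.find_all("label")   # plural: a list of tags (possibly empty)
missing = soup.find("table")          # nothing matches

print(type(first_label).__name__)  # Tag
print(first_label.text)            # Name
print(len(all_labels))             # 3
print(missing)                     # None
print(soup.find_all("table"))      # []
```

Because the singular form may return `None`, it is usually worth checking the result before reading `.text` from it.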

In [1]:
import requests
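from bs4 import BeautifulSoup

In [2]:
url = "https://hrantdavtyan.github.io/"

In [3]:
response = requests.get(url)
page = response.text
# name the parser explicitly so the result does not depend on which parsers are installed
soup = BeautifulSoup(page, "html.parser")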

### findNext() - Finding an e-mail through its label

There are many ways to extract an e-mail address from a page. This time, we will find my e-mail by first locating its label (Email) and then moving forward to the e-mail itself using the **findNext()** function.

The steps are as follows:

- Find all labels,
- Choose the label that we are interested in,
- Navigate one tag forward to reach the tag that contains the e-mail,
- Get the text component out of the tag.

In [5]:
# finding all labels
label_tags = soup.findAll('label')
print(label_tags)

[<label>Name</label>, <label>Date of birth</label>, <label>Address</label>, <label>Email</label>, <label>Phone</label>, <label>Website</label>, <label for="name">Your Name</label>, <label for="email">Your Email</label>, <label for="message">Your Message</label>]


In [18]:
# choosing our label of interest
email_label = label_tags[3]
print(email_label)

<label>Email</label>


In [19]:
# navigating one tag forward and getting the text/string of it
email = email_label.findNext().text
print(email)

hdavtyan@aua.am
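
Picking the label by position (`label_tags[3]`) stops working as soon as the page adds or reorders a label. One possible alternative is to select the label by its text. The sketch below runs on a hypothetical HTML fragment modelled on the page above, not on the live site. It assumes the value always sits in a `<span>` after its label, and it uses `find_next`, the BeautifulSoup 4 spelling of **findNext**.

```python
from bs4 import BeautifulSoup

# Hypothetical fragment modelled on the tutorial page (not fetched from the site)
html = """
<ul class="personal-info">
<li><label>Name</label><span>Hrant Davtyan</span></li>
<li><label>Email</label><span>hdavtyan@aua.am</span></li>
<li><label>Phone</label><span>+374 99 02-06-62</span></li>
</ul>
<form><label for="email">Your Email</label></form>
"""
soup = BeautifulSoup(html, "html.parser")

# string="Email" must equal the label text exactly, so "Your Email" is not matched
email_label = soup.find("label", string="Email")

email = None
if email_label is not None:
    # move forward to the first <span> that follows the label
    value_tag = email_label.find_next("span")
    if value_tag is not None:
        email = value_tag.text

print(email)  # hdavtyan@aua.am
```

The two `None` checks keep the snippet from raising an `AttributeError` when the label or its value is missing.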


Similarly, if one uses the **findAllNext()** function, one gets a list of all the tags that follow (with their contents), the very first of which is the e-mail tag we were looking for.

In [20]:
print(email_label.findAllNext())

[<span>hdavtyan@aua.am</span>, <li><label>Phone</label><span>+374 99 02-06-62</span></li>, <label>Phone</label>, <span>+374 99 02-06-62</span>, <li><label>Website</label><span>HrantDavtyan.GitHub.io</span></li>, <label>Website</label>, <span>HrantDavtyan.GitHub.io</span>, <div class="menu">
<ul class="tabs">
<li><a href="#profile" class="tab-profile">Profile</a></li>
<li><a href="#resume" class="tab-resume">Resume</a></li>
<li><a href="#portfolio" class="tab-portfolio">Portfolio</a></li>
<li><a href="#contact" class="tab-contact">Contact</a></li>
</ul>
</div>, <ul class="tabs">
<li><a href="#profile" class="tab-profile">Profile</a></li>
<li><a href="#resume" class="tab-resume">Resume</a></li>
<li><a href="#portfolio" class="tab-portfolio">Portfolio</a></li>
<li><a href="#contact" class="tab-contact">Contact</a></li>
</ul>, <li><a href="#profile" class="tab-profile">Profile</a></li>, <a href="#profile" class="tab-profile">Profile</a>, <li><a href="#resume" class="tab-resume">Resume</a><

The **findPrevious()** and **findAllPrevious()** functions follow the same intuition, but search backward through the document.
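
As a rough illustration of the backward direction, the sketch below starts from the e-mail value and walks back to its label. The HTML string is a hypothetical fragment modelled on the page above, and `find_previous` / `find_all_previous` are the BeautifulSoup 4 spellings of **findPrevious** / **findAllPrevious**.

```python
from bs4 import BeautifulSoup

# Hypothetical fragment modelled on the tutorial page (not fetched from the site)
html = """
<ul class="personal-info">
<li><label>Name</label><span>Hrant Davtyan</span></li>
<li><label>Email</label><span>hdavtyan@aua.am</span></li>
<li><label>Phone</label><span>+374 99 02-06-62</span></li>
</ul>
"""
soup = BeautifulSoup(html, "html.parser")

# start from the <span> that holds the e-mail
email_span = soup.find("span", string="hdavtyan@aua.am")

# nearest <label> that appears before the span in the document
nearest_label = email_span.find_previous("label")
print(nearest_label.text)  # Email

# every <label> before the span, nearest first
earlier_labels = [tag.text for tag in email_span.find_all_previous("label")]
print(earlier_labels)  # ['Email', 'Name']
```

Note that the backward search returns results in reverse document order, so the closest match comes first.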

### findNextSibling() - Finding an e-mail through its label

The same objective can be achieved using the **findNextSibling()** function. The difference is that **findNext()** finds the next tag in document order, whatever its level, while **findNextSibling()** finds the next tag at the same level, i.e. a sibling that shares the same parent, and not a parent or a child. Similarly, **findNextSiblings()** finds all the siblings that follow the current tag.

In [22]:
email_sibling = email_label.findNextSibling().text
print(email_sibling)

hdavtyan@aua.am
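
On the page above both functions return the same tag, so the difference is easy to miss. The sketch below uses a hypothetical variation of the markup in which the label itself contains a nested `<span>`; this variation is an assumption made for illustration and does not appear on the real page. It uses `find_next` / `find_next_sibling`, the BeautifulSoup 4 spellings of **findNext** / **findNextSibling**.

```python
from bs4 import BeautifulSoup

# Hypothetical variation: the label contains a nested <span class="hint">
html = (
    '<li><label>Email <span class="hint">(work)</span></label>'
    '<span>hdavtyan@aua.am</span></li>'
)
soup = BeautifulSoup(html, "html.parser")
label = soup.find("label")

# find_next walks the document in order, so it enters the label first
next_span = label.find_next("span")
print(next_span.text)  # (work)

# find_next_sibling stays on the label's own level
sibling_span = label.find_next_sibling("span")
print(sibling_span.text)  # hdavtyan@aua.am
```

When the value is known to be a sibling of its label, the sibling search is the more robust choice, since it ignores anything nested inside the label.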


The **findPreviousSibling()** and **findPreviousSiblings()** functions follow the same intuition.
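
A short sketch of the backward sibling search, again on a hypothetical fragment modelled on the page above. It starts from the Phone list item and looks at the list items that precede it, using `find_previous_sibling` / `find_previous_siblings`, the BeautifulSoup 4 spellings of the functions named above.

```python
from bs4 import BeautifulSoup

# Hypothetical fragment modelled on the tutorial page (not fetched from the site)
html = """
<ul class="personal-info">
<li><label>Name</label><span>Hrant Davtyan</span></li>
<li><label>Email</label><span>hdavtyan@aua.am</span></li>
<li><label>Phone</label><span>+374 99 02-06-62</span></li>
</ul>
"""
soup = BeautifulSoup(html, "html.parser")

# the <li> that wraps the Phone label
phone_item = soup.find("label", string="Phone").find_parent("li")

# closest preceding <li> on the same level
previous_item = phone_item.find_previous_sibling("li")
print(previous_item.label.text)  # Email

# all preceding <li> siblings, nearest first
earlier_items = [li.label.text for li in phone_item.find_previous_siblings("li")]
print(earlier_items)  # ['Email', 'Name']
```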

### findParent()

The **findParent()** function returns the direct parent of the current tag, together with all of its content. For example, the e-mail is part of a list, which means the direct parent of its label is a ```<li>``` tag.

In [24]:
email_parent = email_label.findParent()
print(email_parent)

<li><label>Email</label><span>hdavtyan@aua.am</span></li>


The **findParents()** function returns a list of all the parents, starting from the direct parent and moving up through ```<html>```. The list ends with the *oldest* parent, which is the object that represents the whole document.

In [25]:
email_parents = email_label.findParents()
print(email_parents)

[<li><label>Email</label><span>hdavtyan@aua.am</span></li>, <ul class="personal-info">
<li><label>Name</label><span>Hrant Davtyan</span></li>
<li><label>Date of birth</label><span>June 16, 1992</span></li>
<li><label>Address</label><span>Yerevan 0033, Armenia</span></li>
<li><label>Email</label><span>hdavtyan@aua.am</span></li>
<li><label>Phone</label><span>+374 99 02-06-62</span></li>
<li><label>Website</label><span>HrantDavtyan.GitHub.io</span></li>
</ul>, <div id="profile">
<!-- About section -->
<div class="about">
<div class="photo-inner"><img src="images/photo.jpg" height="186" width="153" /></div>
<h1>HRANT DAVTYAN</h1>
<h3>Business Analyst &amp; Data Scientist</h3>
<p>I am a Data Enthusiast, teaching Business Analytics and providing consultancy on Statistics, Economics and IT. Feel free to take a look around my webpage.</p>
</div>
<!-- /About section -->
<!-- Personal info section -->
<ul class="personal-info">
<li><label>Name</label><span>Hrant Davtyan</span></li>
<li><label>D

In [26]:
email_parents[-1]

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en">
<head>
<title>Hrant Davtyan</title>
<meta http-equiv="content-type" content="text/html; charset=utf-8" />
<meta http-equiv="X-UA-Compatible" content="IE=edge,chrome=1" />
<meta name="viewport" content="width=device-width, initial-scale=1, maximum-scale=1" />
<link rel="stylesheet" type="text/css" href="css/reset.css" />
<link rel="stylesheet" type="text/css" href="css/style.css" />
<link rel="stylesheet" type="text/css" href="css/fancybox.css" />
<link rel="stylesheet" type="text/css" href="http://fonts.googleapis.com/css?family=Open+Sans:400,600,300,800,700,400italic|PT+Serif:400,400italic" />
<script type="text/javascript" src="js/jquery.min.js"></script>
<script type="text/javascript" src="js/jquery.easytabs.min.js"></script>
<script type="text/javascript" src="js/respond.min.js"></script>
<script type="text/javascript" src="js/j
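
Printing whole parent tags quickly becomes unreadable, because each parent repeats the content of everything below it. A lighter way to inspect the chain is to print only the tag names. The sketch below does this on a hypothetical, much smaller document modelled on the page above, using `find_parent` / `find_parents`, the BeautifulSoup 4 spellings of **findParent** / **findParents**.

```python
from bs4 import BeautifulSoup

# Hypothetical, reduced document modelled on the tutorial page
html = (
    '<html><body><div id="profile"><ul class="personal-info">'
    '<li><label>Email</label><span>hdavtyan@aua.am</span></li>'
    '</ul></div></body></html>'
)
soup = BeautifulSoup(html, "html.parser")
email_label = soup.find("label", string="Email")

# names of all parents, from the direct parent up to the document object
parent_names = [parent.name for parent in email_label.find_parents()]
print(parent_names)  # ['li', 'ul', 'div', 'body', 'html', '[document]']

# a tag name narrows the search to the closest matching ancestor
profile_div = email_label.find_parent("div")
print(profile_div["id"])  # profile
```

The last entry, `[document]`, is the name BeautifulSoup gives to the object that wraps the whole parsed page, which is why the last element of the list prints the entire document.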

### findChild()

The **findChild()** function follows the same intuition in the opposite direction: it returns the first tag nested inside the current tag. For our e-mail label it returns nothing (```None```), because the label contains only text and no child tags.

In [27]:
email_child = email_label.findChild()
print(email_child)

None


The **findChildren()** function follows the same intuition as the **findParents()** function: it returns a list of all the tags nested inside the current tag.
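
Since the e-mail label has no child tags, a parent tag makes a more telling example. The sketch below is an illustration on a hypothetical fragment modelled on the page above. It relies on `find` and `find_all`, which **findChild** and **findChildren** are aliases of in BeautifulSoup 4; passing `True` as the name matches any tag, and `recursive=False` restricts the search to direct children.

```python
from bs4 import BeautifulSoup

# Hypothetical fragment modelled on the tutorial page (not fetched from the site)
html = """
<ul class="personal-info">
<li><label>Name</label><span>Hrant Davtyan</span></li>
<li><label>Email</label><span>hdavtyan@aua.am</span></li>
<li><label>Phone</label><span>+374 99 02-06-62</span></li>
</ul>
"""
soup = BeautifulSoup(html, "html.parser")
info_list = soup.find("ul")
email_item = soup.find("label", string="Email").find_parent("li")

# first tag nested inside the Email list item
first_child = email_item.find(True)
print(first_child)  # <label>Email</label>

# all tags nested inside the list item
nested_names = [tag.name for tag in email_item.find_all(True)]
print(nested_names)  # ['label', 'span']

# the default search covers every level; recursive=False keeps direct children only
print(len(info_list.find_all(True)))  # 9
direct_names = [tag.name for tag in info_list.find_all(True, recursive=False)]
print(direct_names)  # ['li', 'li', 'li']
```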

A note: all this time, the **.text** attribute was used to get the text content. For a tag that contains only text, the **.string** attribute provides the same output, yet there is a difference between them: **.text** returns a plain unicode string, while **.string** returns a navigable string (a ```NavigableString``` object), from which one can navigate backward/forward using the BeautifulSoup navigation functions discussed above. In addition, **.string** is ```None``` for a tag that has more than one child, whereas **.text** joins the text of all its children.
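
The difference between the two attributes can be seen in a short sketch. The HTML string is a hypothetical fragment modelled on the page above, and `find_next` / `find_parent` are the BeautifulSoup 4 spellings of **findNext** / **findParent**.

```python
from bs4 import BeautifulSoup

# Hypothetical fragment modelled on the tutorial page (not fetched from the site)
html = "<li><label>Email</label><span>hdavtyan@aua.am</span></li>"
soup = BeautifulSoup(html, "html.parser")

email_label = soup.find("label")
email_item = email_label.find_parent("li")

# same characters, different types
print(type(email_label.text).__name__)    # str
print(type(email_label.string).__name__)  # NavigableString

# a NavigableString is part of the tree, so navigation can start from it
print(email_label.string.find_next("span").text)  # hdavtyan@aua.am

# with more than one child, .string gives up while .text concatenates
print(email_item.string)  # None
print(email_item.text)    # Emailhdavtyan@aua.am
```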