# Navigating the tree

Here's the "Three sisters" HTML document again:

In [3]:
html_doc = """
<!DOCTYPE html>
<html>
<head>
	<title>The Dormouse's Story</title>
</head>
<body>
	<p class="title"><b>The Dormouse's Story</b></p>

	<p class="story">Once upon a time there were three little sisters; and their names were <a href="http://example.com/elsie" class="sister" id="link1">Elsie</a>, <a href="http://example.com/lacie" class="sister" id="link2">Lacie</a> and <a href="http://example.com/tillie" class="sister" id="link3">Tillie</a> and they lived at the bottom of a well.</p>
</body>
</html>"""

from bs4 import BeautifulSoup
soup = BeautifulSoup(html_doc, 'html.parser')

I'll use this as an example to show you how to move from one part of a document to another.

## Going down

Tags may contain strings and other tags. These elements are the tag's *children*. Beautiful Soup provides a lot of different attributes for navigating and iterating over a tag's children.

Note that Beautiful Soup strings don't support any of these attributes, because a string can't have children.

### Navigating using tag names

The simplest way to navigate the parse tree is to say the name of the tag you want. If you want the `<head>` tag, just say `soup.head`.

In [4]:
soup.head

<head>
<title>The Dormouse's Story</title>
</head>

In [5]:
soup.title

<title>The Dormouse's Story</title>

You can use this trick again and again to zoom in on a certain part of the parse tree. This code gets the first `<b>` tag beneath the `<body>` tag:

In [6]:
soup.body.b

<b>The Dormouse's Story</b>
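
One thing worth knowing when chaining tag names like this: if no tag with that name exists, the attribute lookup gives back `None` rather than raising an error. The sketch below uses a shortened stand-in for the document above, and `<table>` is just a tag name chosen because it does not appear in it.

```python
from bs4 import BeautifulSoup

# Shortened stand-in for the "Three sisters" document.
html = "<html><body><p class='title'><b>The Dormouse's Story</b></p></body></html>"
soup = BeautifulSoup(html, "html.parser")

bold = soup.body.b         # first <b> anywhere beneath <body>
missing = soup.body.table  # there is no <table>, so this is None

print(bold)
print(missing)

# Check for None before going one level deeper.
table = soup.body.table
caption = table.caption if table is not None else None
```

Without the check, `soup.body.table.caption` would fail with an `AttributeError`, because `None` has no `caption` attribute.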

Using a tag name as an attribute will give you only the *first* tag by that name:

In [7]:
soup.a

<a class="sister" href="http://example.com/elsie" id="link1">Elsie</a>

If you need to get *all* the `<a>` tags, or anything more complicated than the first tag with a certain name, you'll need to use one of the methods described in Searching the tree, such as `find_all()`:

In [8]:
soup.find_all('a')

[<a class="sister" href="http://example.com/elsie" id="link1">Elsie</a>,
 <a class="sister" href="http://example.com/lacie" id="link2">Lacie</a>,
 <a class="sister" href="http://example.com/tillie" id="link3">Tillie</a>]
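
To make the relationship between the two styles concrete, here is a small sketch showing that the tag-name shortcut returns the same element as the first item of the `find_all()` result. It uses only the `story` paragraph from the document above; the variable names are my own.

```python
from bs4 import BeautifulSoup

# Just the paragraph with the three links.
html = """<p class="story">
<a href="http://example.com/elsie" class="sister" id="link1">Elsie</a>,
<a href="http://example.com/lacie" class="sister" id="link2">Lacie</a> and
<a href="http://example.com/tillie" class="sister" id="link3">Tillie</a>
</p>"""
soup = BeautifulSoup(html, "html.parser")

first = soup.a                  # shortcut: first <a> only
all_links = soup.find_all("a")  # list of every <a>

# The shortcut and the first search result are the same tree node.
same_node = first is all_links[0]

# Collect one attribute from every match.
ids = [a["id"] for a in all_links]
print(same_node, ids)
```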

### `.contents` and `.children`

A tag's children are available in a list called `.contents`:

In [9]:
head_tag = soup.head
head_tag

<head>
<title>The Dormouse's Story</title>
</head>

In [10]:
head_tag.contents

['\n', <title>The Dormouse's Story</title>, '\n']

In [11]:
title_tag = head_tag.contents[1]
title_tag

<title>The Dormouse's Story</title>

In [13]:
title_tag.contents

["The Dormouse's Story"]

In [18]:
first_child = head_tag.contents[0]
first_child

'\n'

In [19]:
first_child.contents

AttributeError: 'NavigableString' object has no attribute 'contents'
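
The `.children` attribute from the heading gives you the same elements as `.contents`, but as an iterator instead of a list. A minimal sketch, using a standalone copy of the `<head>` markup so that it runs on its own; filtering with `isinstance(..., Tag)` is one way (not the only one) to skip the whitespace strings:

```python
from bs4 import BeautifulSoup, Tag

# Standalone copy of the <head> element, including its newlines.
soup = BeautifulSoup(
    "<head>\n<title>The Dormouse's Story</title>\n</head>", "html.parser"
)
head_tag = soup.head

# Iterate over the direct children without building a list first.
for child in head_tag.children:
    print(repr(child))

# Materialize the iterator to compare it with .contents.
children_list = list(head_tag.children)

# Keep only the child tags, dropping the '\n' strings.
child_tags = [c for c in head_tag.children if isinstance(c, Tag)]
```

Here `children_list` holds the same three elements as `head_tag.contents`, while `child_tags` holds only the `<title>` tag.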

### `.descendants`

The `.contents` and `.children` attributes only consider a tag's *direct* children. For instance, the `<head>` tag has a single direct child tag, the `<title>` tag (the `'\n'` strings on either side of it are just whitespace):

In [20]:
head_tag.contents

['\n', <title>The Dormouse's Story</title>, '\n']

But the `<title>` tag itself has a child: the string "The Dormouse's Story". There's a sense in which that string is also a child of the `<head>` tag. The `.descendants` attribute lets you iterate over *all* of a tag's children, recursively: its direct children, the children of its direct children, and so on:

In [21]:
for child in head_tag.descendants:
    print(child)



<title>The Dormouse's Story</title>
The Dormouse's Story




Setting aside the whitespace strings, the `<head>` tag has only one child, but it has two descendants: the `<title>` tag and the `<title>` tag's child.
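
The gap between children and descendants is easy to check by counting. The sketch below assumes a compact document with no whitespace between tags (my own simplification, so the counts are not padded by `'\n'` strings) and compares the two attributes on both the `<head>` tag and the `BeautifulSoup` object itself.

```python
from bs4 import BeautifulSoup

# Compact markup: no whitespace strings between the tags.
html = (
    "<html><head><title>The Dormouse's Story</title></head>"
    "<body><p><b>Hi</b></p></body></html>"
)
soup = BeautifulSoup(html, "html.parser")
head_tag = soup.head

# <head>: one direct child (<title>), two descendants (<title> and its string).
head_children = len(list(head_tag.children))
head_descendants = len(list(head_tag.descendants))

# Whole document: one direct child (<html>), but every nested tag and
# string counts as a descendant.
doc_children = len(list(soup.children))
doc_descendants = len(list(soup.descendants))

print(head_children, head_descendants)
print(doc_children, doc_descendants)
```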