# Tag's Children

We saw in an earlier lesson that tags may contain other tags and strings. These elements are known as the tag’s children. To keep the following examples simple, we will use a smaller HTML file named `sample2.html`. Let's print it to see what it looks like:

In [1]:
# Import BeautifulSoup
from bs4 import BeautifulSoup

# Open the HTML file and print the whole document
with open('./sample2.html') as f:
    print(BeautifulSoup(f, 'lxml').prettify())

<!DOCTYPE html>
<html lang="en-US">
 <head>
  <title>
   AI For Trading
  </title>
  <meta charset="utf-8"/>
  <link href="./teststyle.css" rel="stylesheet"/>
  <style>
   .h2style {background-color: tomato;color: white;padding: 10px;}
  </style>
 </head>
</html>


As we can see, the `<html>` tag contains some child tags. For example, the `<head>` tag is a direct child of the `<html>` tag. Similarly, the `<title>` tag is a direct child of the `<head>` tag. We also see that the `<title>` tag itself has a child, namely the string `'AI For Trading'`. BeautifulSoup provides several attributes for navigating a tag’s children. We already saw that we can access child tags as if they were attributes of the parent tag. For example, we can access the string `'AI For Trading'` from a BeautifulSoup object named `soup` by using:

```python
soup.head.title.get_text()
```
as shown in the code below:

In [2]:
# Import BeautifulSoup
from bs4 import BeautifulSoup

# Open the HTML file and print the text in the title tag within the head tag
with open('./sample2.html') as f:
    print(BeautifulSoup(f, 'lxml').head.title.get_text())

AI For Trading


We can view a tag's children by using the `.contents` attribute of the Tag object. The `.contents` attribute returns a list with all the tag's direct children. By counting the number of elements in this list we can see how many children a parent tag has. In the code below we print the children of the `<head>` tag and we also print the number of children the `<head>` tag has:

In [3]:
# Import BeautifulSoup
from bs4 import BeautifulSoup

# Open the HTML file and create a BeautifulSoup Object
with open('./sample2.html') as f:
    page_content = BeautifulSoup(f, 'lxml')

# Access the head tag
page_head = page_content.head

# Print the children of the head tag
print(page_head.contents)

# Print the number of children of the head tag
print('\nThe <head> tag contains {} children'.format(len(page_head.contents)))

[<title>AI For Trading</title>, <meta charset="utf-8"/>, <link href="./teststyle.css" rel="stylesheet"/>, <style>.h2style {background-color: tomato;color: white;padding: 10px;}</style>]

The <head> tag contains 4 children
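
The `.contents` list can hold strings as well as tags. As a minimal sketch, assuming a hypothetical inline snippet (not `sample2.html`) and Python's built-in `html.parser` rather than `lxml` (which may treat whitespace differently), the code below shows line breaks between tags appearing as string children, and one way to keep only the Tag objects:

```python
from bs4 import BeautifulSoup
from bs4.element import Tag

# Hypothetical snippet (not sample2.html) with a line break after each tag
html = "<head>\n<title>AI For Trading</title>\n<meta charset='utf-8'/>\n</head>"

# Assumption: the built-in 'html.parser' is used instead of 'lxml'
head = BeautifulSoup(html, 'html.parser').head

# Print every child, including the whitespace-only strings between the tags
print(head.contents)

# Keep only the children that are Tag objects
tag_children = [child for child in head.contents if isinstance(child, Tag)]

print('{} children in total, {} of them are tags'.format(len(head.contents), len(tag_children)))
```

If a count from `len(tag.contents)` looks larger than expected, whitespace strings like these are a likely cause.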


# TODO: Get the Children of the `<title>` Tag

In the cell below, print the contents and the number of children of the `<title>` tag in the `sample2.html` file. Start by opening the `sample2.html` file and passing the open filehandle to the BeautifulSoup constructor using the `lxml` parser. Save the BeautifulSoup object returned by the constructor in a variable called `page_content`. Then access the `<title>` tag and save the tag object in a variable called `page_title`. Then use the `.contents` attribute to print the contents and the number of children of the `<title>` tag.

In [None]:
# Import BeautifulSoup


# Open the HTML file and create a BeautifulSoup Object

page_content = 

# Access the title tag
page_title = 

# Print the children of the title tag


# Print the number of children of the title tag


Note that strings do not have children because they can’t contain anything.
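
To illustrate the point about strings, here is a minimal sketch that assumes a hypothetical inline snippet (not `sample2.html`) and the built-in `html.parser`. It checks whether each child of a `<p>` tag exposes a `.contents` attribute:

```python
from bs4 import BeautifulSoup

# Hypothetical snippet (not sample2.html); 'html.parser' is an assumption
soup = BeautifulSoup("<p>Hello <b>world</b></p>", 'html.parser')

# The <p> tag has two children: a string followed by a <b> tag
first_child, second_child = soup.p.contents

# For each child, print its type and whether it has a .contents attribute
for child in (first_child, second_child):
    print(type(child).__name__, hasattr(child, 'contents'))
```

The string child has no `.contents` attribute, so trying to read it directly would raise an `AttributeError`; the `<b>` tag does have one.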

Instead of getting a tag's children as a list, we can also get an iterator that we can loop over by using the `.children` attribute. In the code below we create a loop to iterate over the `<head>` tag's children:

In [5]:
# Import BeautifulSoup
from bs4 import BeautifulSoup

# Open the HTML file and create a BeautifulSoup Object
with open('./sample2.html') as f:
    page_content = BeautifulSoup(f, 'lxml')

# Print the children of the head tag
for child in page_content.head.children:
    print(child)

<title>AI For Trading</title>
<meta charset="utf-8"/>
<link href="./teststyle.css" rel="stylesheet"/>
<style>.h2style {background-color: tomato;color: white;padding: 10px;}</style>
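
The relationship between `.children` and `.contents` can be sketched as follows, assuming a hypothetical inline snippet (not `sample2.html`) and the built-in `html.parser`. The iterator yields the same direct children, in the same order, as the list:

```python
from bs4 import BeautifulSoup

# Hypothetical snippet (not sample2.html); 'html.parser' is an assumption
html = "<head><title>AI For Trading</title><meta charset='utf-8'/></head>"
head = BeautifulSoup(html, 'html.parser').head

# .children is an iterator rather than a list
is_list = isinstance(head.children, list)
print(is_list)

# Collecting the iterator into a list gives the same items as .contents
same_items = list(head.children) == head.contents
print(same_items)
```

Use `.contents` when you need indexing or `len()`, and `.children` when you only need to loop once.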


# TODO: Loop Through the Children of the `<title>` Tag

In the cell below, print the children of the `<title>` tag in the `sample2.html` file. Start by opening the `sample2.html` file and passing the open filehandle to the BeautifulSoup constructor using the `lxml` parser. Save the BeautifulSoup object returned by the constructor in a variable called `page_content`. Then create a loop that prints the children of the `<title>` tag using the `.children` attribute.

In [None]:
# Import BeautifulSoup


# Open the HTML file and create a BeautifulSoup Object

page_content = 

# Print the children of the title tag


# The Recursive Argument

If we call the `.find_all()` method on a Tag object, as in `tag.find_all()`, it will search all the tag's children, its children’s children, and so on. However, there will be times when we only want BeautifulSoup to search a tag's direct children. To do this, we can pass the `recursive=False` argument to the `.find_all()` method. Let's see an example.

Let's start by printing our `sample2.html` file again to see what its structure looks like:

In [7]:
# Import BeautifulSoup
from bs4 import BeautifulSoup

# Open the HTML file and print the whole document
with open('./sample2.html') as f:
    print(BeautifulSoup(f, 'lxml').prettify())

<!DOCTYPE html>
<html lang="en-US">
 <head>
  <title>
   AI For Trading
  </title>
  <meta charset="utf-8"/>
  <link href="./teststyle.css" rel="stylesheet"/>
  <style>
   .h2style {background-color: tomato;color: white;padding: 10px;}
  </style>
 </head>
</html>


We can see that the `<head>` tag is directly beneath the `<html>` tag and that the `<title>` tag is directly beneath the `<head>` tag. Even though the `<title>` tag is beneath the `<html>` tag, it’s **not** directly beneath it, because the `<head>` tag is in between them. 

Now, if we search the `<html>` tag for the `<title>` tag, using the `.find_all()` method, BeautifulSoup will find it because it is searching in all of the descendants of the `<html>` tag:

In [8]:
# Import BeautifulSoup
from bs4 import BeautifulSoup

# Open the HTML file and create a BeautifulSoup Object
with open('./sample2.html') as f:
    page_content = BeautifulSoup(f, 'lxml')
    
# Search the html tag for the title tag
for tag in page_content.html.find_all('title'):
    print(tag)

<title>AI For Trading</title>


We can see that we get a match.

Now, let's restrict our search to only look at the `<html>` tag’s direct children, by using the `recursive=False` argument:

In [9]:
# Import BeautifulSoup
from bs4 import BeautifulSoup

# Open the HTML file and create a BeautifulSoup Object
with open('./sample2.html') as f:
    page_content = BeautifulSoup(f, 'lxml')

# Search the html tag's direct children for the title tag
for tag in page_content.html.find_all('title', recursive = False):
    print(tag)

This time nothing is printed. We don't get any matches because the `<title>` tag is **not** a direct child of the `<html>` tag.
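
The two searches can also be compared side by side. This is a minimal sketch, assuming a hypothetical inline snippet (not `sample2.html`) and the built-in `html.parser`, in which one `<p>` tag is a direct child of a `<div>` and another is nested one level deeper:

```python
from bs4 import BeautifulSoup

# Hypothetical snippet (not sample2.html); 'html.parser' is an assumption
html = "<div><p>direct</p><section><p>nested</p></section></div>"
div = BeautifulSoup(html, 'html.parser').div

# Default search: looks through all descendants of the <div> tag
all_matches = div.find_all('p')

# Restricted search: looks only at the <div> tag's direct children
direct_matches = div.find_all('p', recursive=False)

print('All descendants:', [p.get_text() for p in all_matches])
print('Direct children:', [p.get_text() for p in direct_matches])
```

Only the first `<p>` tag appears in the restricted search, because the second one sits inside the `<section>` tag.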

# TODO: Search For The `<head>` Tag

In the cell below, search for the `<head>` tag only in the direct children of the `<html>` tag in the `sample2.html` file. Start by opening the `sample2.html` file and passing the open filehandle to the BeautifulSoup constructor using the `lxml` parser. Save the BeautifulSoup object returned by the constructor in a variable called `page_content`. Then search the `<html>` tag's direct children for the `<head>` tag using the `recursive=False` argument. Print each result using the `.prettify()` method.

In [None]:
# Import BeautifulSoup


# Open the HTML file and create a BeautifulSoup Object

page_content = 
    
# Search the html tag's direct children for the head tag    


# Solution

[Solution notebook](children_tags_solution.ipynb)