# NavigableString Objects in BeautifulSoup

In this notebook, we will explore **NavigableString** objects in BeautifulSoup. A `NavigableString` holds a piece of text found inside an HTML tag. We will parse HTML elements, extract their string contents, and replace them.

In [None]:
# Check Python version to ensure compatibility
import sys
print(sys.version)

## Import BeautifulSoup
We import the `BeautifulSoup` class from the `bs4` package to parse HTML documents.

In [None]:
from bs4 import BeautifulSoup

## Creating a Tag and NavigableString
We will create an HTML `<h1>` element and parse it. Then we'll examine the tag object and extract its string.

In [None]:
soup_object = BeautifulSoup('<h1 attribute_1 = "Heading Level 1">Future trends for IoT in 2018</h1>', 'html.parser')

# Access the H1 tag
tag = soup_object.h1

# Check the type of the tag
type(tag)

In [None]:
# Verify the tag name
tag.name

In [None]:
# Extract the string inside the tag
tag.string
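`tag.string` only returns text when the tag has a single child (or a single chain of nested children). The following is a minimal sketch of that behavior on a small made-up snippet; the variable name `mixed_soup` and the HTML are illustrative assumptions, not part of the notebook's data.

```python
from bs4 import BeautifulSoup

# Hypothetical snippet: <p> contains a string AND a <b> tag
mixed_soup = BeautifulSoup('<p>Hello <b>world</b></p>', 'html.parser')

p_string = mixed_soup.p.string    # None, because <p> has two children
b_string = mixed_soup.b.string    # 'world', because <b> has exactly one child
all_text = mixed_soup.p.get_text()  # concatenates every string under <p>

print(p_string)
print(b_string)
print(all_text)
```

When `.string` is `None`, `get_text()` or the `.strings` generator is usually the more convenient way to reach the text.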

In [None]:
# Check the type of the string
type(tag.string)
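The type reported here is a subclass of Python's built-in `str`, so ordinary string methods work on it, and it additionally keeps a reference to its place in the parse tree. A small sketch, using an illustrative heading and hypothetical variable names (`demo_soup`, `text`, `plain`):

```python
from bs4 import BeautifulSoup
from bs4.element import NavigableString

demo_soup = BeautifulSoup('<h1>Future trends</h1>', 'html.parser')
text = demo_soup.h1.string

is_navigable = isinstance(text, NavigableString)
is_str = isinstance(text, str)      # NavigableString subclasses str
parent_name = text.parent.name      # the string knows which tag contains it

# str() gives a plain string that no longer references the parse tree
plain = str(text)

print(is_navigable, is_str, parent_name)
print(text.upper())
print(type(plain).__name__)
```

Converting with `str()` is worth considering when the text is kept around after parsing, since a `NavigableString` holds on to the tree it came from.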

### Working with NavigableString
We can store the string in a variable and swap it for new text in the parse tree using `replace_with`.

In [None]:
our_navigatable_string = tag.string
our_navigatable_string

In [None]:
# Replace the NavigableString in the tree with new text
our_navigatable_string.replace_with('NaN')
tag.string
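It may help to see what happens to the original string object after the replacement. The sketch below repeats the operation on a separate, made-up heading (the names `demo_soup`, `old_string`, and `returned` are hypothetical) and checks that `replace_with` hands back the string it removed, which is then detached from the tree.

```python
from bs4 import BeautifulSoup

demo_soup = BeautifulSoup('<h1>Old heading</h1>', 'html.parser')
demo_tag = demo_soup.h1
old_string = demo_tag.string

# replace_with inserts the new text and returns the element it removed
returned = old_string.replace_with('New heading')

print(demo_tag)                        # <h1>New heading</h1>
print(returned is old_string)          # True
print(old_string.parent)               # None: the old string left the tree
print(type(demo_tag.string).__name__)  # the new text is wrapped as a NavigableString
```

The variable holding the old string still has its original text, so it no longer reflects what the tag contains; read `tag.string` again to get the current value.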

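## Utilizing NavigableString Objects in a Larger HTML Document
We can loop through all the text in an HTML document using `stripped_strings`, which yields each string with leading and trailing whitespace removed and skips strings that contain only whitespace.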

In [None]:
our_html_document = '''
<html><head><title>IoT Articles</title></head>
<body>
<p class='title'><b>2018 Trends: Best New IoT Device Ideas for Data Scientists and Engineers</b></p>
<p class='description'>It’s almost 2018 and IoT is on the cusp of an explosive expansion...</p>
<h1>Future Trends for IoT in 2018</h1>
<p>New IoT device ideas won’t do you much good unless you at least know the basic technology trends...</p>
</body></html>
'''

# Parse the HTML document
our_soup_object = BeautifulSoup(our_html_document, 'html.parser')

In [None]:
# Loop through all string objects in the document
for string in our_soup_object.stripped_strings:
    print(repr(string))
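To see what `stripped_strings` filters out, it can be compared with the plain `.strings` generator. This is a minimal sketch on a tiny made-up fragment (`demo_html` and the other names are illustrative assumptions):

```python
from bs4 import BeautifulSoup

# Hypothetical fragment with newlines and padding between tags
demo_html = "<div>\n  <p> first </p>\n  <p>second</p>\n</div>"
demo_soup = BeautifulSoup(demo_html, 'html.parser')

raw_strings = list(demo_soup.strings)             # includes whitespace-only strings
clean_strings = list(demo_soup.stripped_strings)  # stripped, blanks skipped

for s in raw_strings:
    print(repr(s))
print(clean_strings)
```

The raw list contains the newline and indentation strings that sit between tags, which is why `stripped_strings` is usually the better fit for printing or collecting readable text.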

### Accessing Links and Their Parent Tags
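We can access the first link (`<a>` tag) and inspect its parent or string. The sample document above contains no `<a>` tag, so `our_soup_object.a` returns `None` here; the cells below check for that before reading any attribute, since `None.parent` would raise an `AttributeError`.

In [None]:
# Access the first link in the HTML (None if the document has no <a> tag)
first_link = our_soup_object.a
print(first_link)

In [None]:
# Access the parent of the link, if a link was found
first_link.parent if first_link is not None else None

In [None]:
# Access the string inside the link, if a link was found
first_link.string if first_link is not None else None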
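Because the sample document above has no `<a>` element, the following sketch uses a small hypothetical snippet to show what these lookups return when a link is present. The HTML, the placeholder `example.com` URL, and the names `link_soup` and `example_link` are assumptions made for illustration only.

```python
from bs4 import BeautifulSoup

# Hypothetical snippet; the URL is a placeholder, not taken from the notebook
link_html = '<p class="description">Read <a href="https://example.com/iot">IoT Articles</a> today.</p>'
link_soup = BeautifulSoup(link_html, 'html.parser')

example_link = link_soup.a            # first <a> tag in the document
link_text = example_link.string       # NavigableString inside the link
link_parent = example_link.parent     # the enclosing <p> tag

print(example_link)
print(repr(link_text))
print(link_parent.name)

# Looking up a tag that does not occur in the document yields None
missing = link_soup.h2
print(missing)
```

Navigation works in both directions: `link_text.parent` leads back to the `<a>` tag, and that tag's `.parent` leads to the paragraph that contains it.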