In [2]:
# Practice identifying specific data using Chrome Developer Tools (also known as DevTools).

# DevTools lets developers inspect the structure of any webpage, and it includes a search function as well.
# Both help make sense of the tags and components that hold the data Robin is looking for.

# Let's visit one of the websites Robin plans to use, take a peek at its structure, then practice finding
# different components.

In [None]:
# The NASA news website

# Robin wants to extract the most recently published article's title and summary.
# Let's find the HTML components in the page so we can help her with that.


In [None]:
# Open the website, then open DevTools by right-clicking anywhere on the page
# and clicking "Inspect" in the pop-up menu.

In [None]:
# After clicking "Inspect," a new panel should open under (or beside) the webpage.
# This panel is docked to the page you're viewing, so the two stay attached, but it has a different job.


In [None]:
# What we're currently looking at is how this news site is assembled.

# The <html lang="en" …> line should look familiar, as should the <head> and <body> tags, but what is all of
# this other stuff? And what about the stuff inside the familiar tags?

# Remember how we spoke about containers?
# For example, the <body> tag is a container for every visual component of a webpage, such as headers and paragraphs.
# Inside that <body> tag are other containers, nested much like nesting dolls.
# In the case of this website (and most websites), these other containers inside the body are <div> tags.
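
In [None]:
# The nesting described above can be sketched in code. This is a minimal illustration, not the page's real
# markup: SAMPLE_HTML is a hypothetical, heavily simplified page, and Python's built-in html.parser module
# stands in for whatever scraping library is used later. It prints each tag indented by how deeply it is nested.

```python
from html.parser import HTMLParser

# Hypothetical, heavily simplified page; the real site has far more markup.
SAMPLE_HTML = """
<html lang="en">
  <head><title>News</title></head>
  <body>
    <div>
      <div>
        <p>Hello</p>
      </div>
    </div>
  </body>
</html>
"""


class OutlinePrinter(HTMLParser):
    """Collect each opening tag together with its nesting depth."""

    def __init__(self):
        super().__init__()
        self.depth = 0
        self.outline = []  # list of (depth, tag) pairs

    def handle_starttag(self, tag, attrs):
        self.outline.append((self.depth, tag))
        self.depth += 1  # everything until the matching end tag is one level deeper

    def handle_endtag(self, tag):
        self.depth -= 1


printer = OutlinePrinter()
printer.feed(SAMPLE_HTML)
for depth, tag in printer.outline:
    print("  " * depth + f"<{tag}>")
```

# The indented printout mirrors the collapsible tree in the DevTools Elements panel: each <div> sits one level
# inside the container that holds it.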

In [None]:
# There is a lot of custom code included in this website, so instead of scrolling through all of it to find a
# certain element, we will search for it.

# In DevTools, press Ctrl+F (Windows/Linux) or Command+F (Mac) to bring up the search function.

# Type "gallery_header" into the search bar, then press Enter.
# Make sure the line <header class="gallery_header"> is selected, then hover over it with your mouse pointer.

# This will highlight the header section of the page: the title and its container element.
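
In [None]:
# The same search can be approximated in code. This is a rough sketch under stated assumptions: SAMPLE_HTML and
# the "site_header" class are hypothetical, only the "gallery_header" class name comes from the page, and the
# ClassFinder helper is written here for illustration using Python's built-in html.parser module.

```python
from html.parser import HTMLParser

# Hypothetical, simplified markup; only "gallery_header" is taken from the page.
SAMPLE_HTML = """
<body>
  <header class="site_header"><h1>Site name</h1></header>
  <section>
    <header class="gallery_header"><h2>News</h2></header>
    <ul><li>First article</li></ul>
  </section>
</body>
"""


class ClassFinder(HTMLParser):
    """Record the tag name of every element that carries a given class."""

    def __init__(self, wanted_class):
        super().__init__()
        self.wanted_class = wanted_class
        self.matches = []

    def handle_starttag(self, tag, attrs):
        # A class attribute may hold several space-separated names.
        classes = (dict(attrs).get("class") or "").split()
        if self.wanted_class in classes:
            self.matches.append(tag)


finder = ClassFinder("gallery_header")
finder.feed(SAMPLE_HTML)
print(finder.matches)
```

# Only the second <header> is reported, even though the sample contains two <header> elements.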



In [None]:
# Note: the live page's code may no longer match what is in the reading. The idea is the same, though: the
# articles are nested underneath the gallery_header element, and hovering over the lines beneath the header
# code highlights each article in turn as you move down.

In [3]:
# This is a great way to pinpoint where on the website we want our web scraping code to pull data from.
# We can't just tell the code to grab a div or a header, though, because there could be many of these on the
# website when we only want one. This is where the class and id attributes come into play.
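
In [None]:
# To see why a bare tag name is not enough, here is a small sketch that tallies tags, classes, and ids.
# Everything in SAMPLE_HTML is hypothetical, and the AttributeTally helper exists only for this illustration;
# it uses Python's built-in html.parser module rather than a scraping library.

```python
from collections import Counter
from html.parser import HTMLParser

# Hypothetical markup with several <div> elements that differ only by attribute.
SAMPLE_HTML = """
<body>
  <div id="header">Site header</div>
  <div class="list_text">Top article</div>
  <div class="footer_links">About</div>
  <div class="footer_links">Contact</div>
</body>
"""


class AttributeTally(HTMLParser):
    """Count how often each tag name, class name, and id occurs."""

    def __init__(self):
        super().__init__()
        self.tags = Counter()
        self.classes = Counter()
        self.ids = Counter()

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        self.tags[tag] += 1
        for name in (attr_map.get("class") or "").split():
            self.classes[name] += 1
        if attr_map.get("id"):
            self.ids[attr_map["id"]] += 1


tally = AttributeTally()
tally.feed(SAMPLE_HTML)
print("div elements:", tally.tags["div"])
print("elements with class list_text:", tally.classes["list_text"])
print("elements with id header:", tally.ids["header"])
```

# Asking for "a div" is ambiguous in this sample, while asking for the class or the id narrows the choice.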


In [4]:
# HTML Classes and IDs

# HTML code can get bloated and confusing, so developers label specific containers to keep them distinguishable.
# With so much nested inside the HTML, it can be really difficult to find what we're looking for.

# How are developers able to distinguish one <div> from another?
# By adding attributes, such as class and id, that identify each container or element.

# That's another reason to practice using DevTools: we can use it to search for these attributes.



In [None]:
# How exactly do they work?

# Think of it like a litter of puppies. They all look pretty similar, so it is hard to tell them apart at a
# glance, even though each has its own quirks.
# By adding a different color collar to each puppy, we can now tell them apart just by looking.
# HTML class and id attributes are like those collars: an id is meant to be unique on a page, while a class can
# be shared by a group of similar elements.



In [5]:
# Robin knows that she will want to pull the top article's title and summary sentence.

# How do we identify those components, though? Let's look at our DevTools again.

# This time, let's drill further down into the nested components. We want to find the element that highlights
# only the top article on the page.

# Hovering over the first <li> element with a class of "slide" highlights the top article on the page.

In [None]:
# The section we're aiming for (the article title and text) is nested further in, and there are quite a few steps
# we'll need to take to get there.

# First, click the drop-down arrow on the <li class="slide"> element (if it isn't already open).

# From there, we're directed to another element: a div with the class of "image_and_description_container".

# Click that drop-down arrow as well. Within that, we have another element, <div class="list_text">.

# Maneuvering around these nested elements is called "drilling down," and it's a skill you'll encounter and employ
# fairly often as you continue to work with HTML.
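
In [None]:
# The drill-down path above can be followed in code as well. This is a sketch under assumptions: the three class
# names come from the steps above, but SAMPLE_HTML, the "list_image" class, the inner <div> layout, and the
# article text are all hypothetical, and the DrillDown helper is written only for this illustration with
# Python's built-in html.parser module.

```python
from html.parser import HTMLParser

# Hypothetical markup shaped like the structure described above.
SAMPLE_HTML = """
<ul>
  <li class="slide">
    <div class="image_and_description_container">
      <div class="list_image">[image]</div>
      <div class="list_text">
        <div>Hypothetical Article Title</div>
        <div>Hypothetical one-sentence summary.</div>
      </div>
    </div>
  </li>
  <li class="slide">
    <div class="image_and_description_container">
      <div class="list_text">
        <div>Older Article</div>
        <div>Older summary.</div>
      </div>
    </div>
  </li>
</ul>
"""

# Each step is (tag name, class name), listed from the outermost container inward.
PATH = [
    ("li", "slide"),
    ("div", "image_and_description_container"),
    ("div", "list_text"),
]


class DrillDown(HTMLParser):
    """Collect the text inside every element reached by following PATH."""

    def __init__(self, path):
        super().__init__()
        self.path = path
        self.stack = []            # (tag, classes) for each currently open element
        self.results = []          # one list of text fragments per matched container
        self.capture_depth = None  # stack depth of the container being captured

    def _path_matches(self):
        # The path steps must appear, in order, among the open elements.
        step = 0
        for tag, classes in self.stack:
            if step < len(self.path):
                want_tag, want_class = self.path[step]
                if tag == want_tag and want_class in classes:
                    step += 1
        return step == len(self.path)

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        self.stack.append((tag, classes))
        if self.capture_depth is None and self._path_matches():
            self.capture_depth = len(self.stack)
            self.results.append([])

    def handle_endtag(self, tag):
        # Stop capturing once the matched container itself closes.
        if self.capture_depth is not None and len(self.stack) == self.capture_depth:
            self.capture_depth = None
        if self.stack:
            self.stack.pop()

    def handle_data(self, data):
        text = data.strip()
        if self.capture_depth is not None and text:
            self.results[-1].append(text)


driller = DrillDown(PATH)
driller.feed(SAMPLE_HTML)
title, summary = driller.results[0]  # the first match is the top article
print(title)
print(summary)
```

# Each entry in PATH corresponds to one drop-down arrow clicked in DevTools, and taking the first result mirrors
# choosing the first <li> with a class of "slide".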

In [None]:
# <div class="list_text">

# This final container holds the information Robin will want: the article title and summary.
# We'll follow this same DevTools process with each additional webpage we want to scrape:
# visit the page, identify the data, then sift through the HTML code to pinpoint its location on the webpage.

# That process is slow, though. Let's condense the steps above.

# Go ahead and close your DevTools window, then take another look at the webpage.
# Locate the first article's title and summary, and right-click the space below them.
# This time, click "Inspect" from the pop-up menu.

# The DevTools window automatically opens again, but this time the highlighted section is already close to the
# element you want to view, if it isn't already selected.

# You can tell by mousing over the highlighted element: doing so simultaneously highlights the corresponding
# location on the webpage.



In [None]:
# Mobile Device Preview

# DevTools also comes with a feature that lets us view webpages as they would appear on a phone or tablet.
# Not only that, but there are specific device models we can use to test the page.

# Let's look at DevTools again, this time at the device icon in the top-left corner of the DevTools window.

# This button toggles the device selector. When clicked, the webpage we're viewing automatically adjusts to the
# height and width of a responsive mobile screen.

# When in mobile mode, there is a drop-down menu at the top left of the screen; this menu provides a selection of
# devices to view the site with.