# HTML, CSS and JavaScript

Three technologies are commonly used to create webpages:

* HTML - used to create the content of the page.
* CSS - used to style the content.
* JavaScript - used to add interactivity and logic to pages, and to fetch data.


The browser's rendering engine takes the HTML and CSS and renders them into a webpage. Any JavaScript on the page is also executed, which may change the layout of the page or fetch additional data.




For a reference of all HTML elements, see the [MDN HTML elements reference](https://developer.mozilla.org/en-US/docs/Web/HTML/Element).

# HTML 
HTML (HyperText Markup Language) holds the content of a web page. It consists of HTML elements, which are written using tags: names surrounded by angle brackets (`<>`). Tags usually come in pairs, an opening tag and a closing tag, for example:

```html
<tagname>content goes here...</tagname>
```


Below is an example of what HTML looks like. We can use the IPython HTML cell magic (`%%html`) to render the HTML in the notebook and see how it looks.

In [1]:
%%html
<body>
    <h1>This is a heading</h1>
    <h2>This is a heading 2</h2>
    <p>Normal text usually going in paragraph tags</p>
    <ul> <!-- Unordered (bulleted) list: use the ul tag together with li tags to create one. -->
        <li>Item 1</li>
        <li><strong>We can bold text by putting it inside strong tags</strong></li>
        <li> <a href="https://developer.mozilla.org/en-US/docs/Web/HTML/Element">Link to MDN Docs</a> </li>
    </ul>
</body>

HTML elements can also have attributes, which attach additional information to an element. For example, the `<a>` tag has an `href` attribute that contains the link's destination.

```html
<a href="https://www.javascript.com/">JavaScript</a>
```
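When scraping, attributes such as `href` are often exactly what we want to extract. As a rough sketch of how attributes can be read as name/value pairs, the snippet below uses Python's standard-library `html.parser`. The `LinkCollector` class name is made up for this illustration and is not part of any library.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href attribute of every <a> tag (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs, e.g. [("href", "https://...")]
        if tag == "a":
            href = dict(attrs).get("href")
            if href is not None:
                self.links.append(href)


collector = LinkCollector()
collector.feed('<a href="https://www.javascript.com/">JavaScript</a>')
print(collector.links)
```

In practice a dedicated scraping library would usually do this work, but the idea is the same: find the element, then look up the attribute by name.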

Another common tag is the `<div>` tag. Web developers use `<div>` tags to divide up pages and to apply styles easily to many elements at once. `<div>` tags often carry a `class` attribute, which can be targeted with CSS.


```html
<div class="red">
    <p>Some text that I want red</p>
</div>
```

A `<style>` tag can be used to contain CSS, which lets us apply styling to the HTML page.

Class attributes can be reused to apply the same styles to many elements. This is in contrast to `id` attributes, each of which should be used on a single element. Notice that in CSS, classes start with a dot, e.g. `.classname`, whereas IDs start with a hash, e.g. `#myid`.

In [2]:
%%html
<body>
    <style>
        .red {
            color: red;
        }

        .hidden {
            display: none;
        }

        #an_id {
            color: blue;
        }
    
    </style>
    <h2 class="red hidden">This text will be hidden. Try removing the hidden class to see what happens.</h2>
    <div class="red">
        <p> This text will be red because it is surrounded by a div that has a class of red. </p>
        <p id="an_id">The styling from IDs has higher priority over classes, hence me being blue. IDs should be unique.</p>
    </div>
    <img src="http://www.catster.com/wp-content/uploads/2017/08/A-fluffy-cat-looking-funny-surprised-or-concerned.jpg" alt="">
</body>
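
The example above notes that styling from an ID takes priority over styling from a class. Browsers decide this by comparing the specificity of the selectors. The sketch below is a simplification that assumes the selector contains only tag names, classes, IDs and descendant spaces; the `specificity` function is a made-up helper, not part of any library.

```python
import re


def specificity(selector):
    """Return (ids, classes, elements) for a simple selector.

    Simplifying assumption: only tag names, .classes, #ids and
    descendant spaces are handled (no attributes or pseudo-classes).
    """
    ids = len(re.findall(r"#[\w-]+", selector))
    classes = len(re.findall(r"\.[\w-]+", selector))
    # A tag name either starts the selector or follows whitespace.
    elements = len(re.findall(r"(?:^|\s)[a-zA-Z][\w-]*", selector))
    return (ids, classes, elements)


# Tuples compare left to right, so any ID outranks any number of classes.
print(specificity("#an_id"))
print(specificity(".red"))
print(specificity("#an_id") > specificity(".red"))
```

Because `#an_id` scores higher than `.red`, the paragraph with that ID is blue even though it sits inside a red `div`.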

An HTML document forms a tree structure, so when HTML is described you may hear words such as parent, child, sibling and ancestor nodes.
- A child node is anything that is contained directly within an element, for example: the p tags within the div are children of the div.
- Siblings are elements that share the same parent, for example: the two p tags above are siblings.
- A parent node is the node directly above an element.
- An ancestor is any node above an element in the tree: its parent, its parent's parent, and so on.
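
To make the tree relationships concrete, here is a minimal sketch that records the parent of each element while parsing. It uses Python's standard-library `html.parser`, the `ParentTracker` class is a hypothetical helper written for this illustration, and it assumes well-formed HTML in which every tag is closed in order.

```python
from html.parser import HTMLParser


class ParentTracker(HTMLParser):
    """Record each element together with its parent (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.stack = []    # currently open elements, outermost first
        self.parents = []  # (tag, parent tag or None) in document order

    def handle_starttag(self, tag, attrs):
        parent = self.stack[-1] if self.stack else None
        self.parents.append((tag, parent))
        self.stack.append(tag)

    def handle_endtag(self, tag):
        # Assumes every opened tag is closed in the right order.
        if self.stack and self.stack[-1] == tag:
            self.stack.pop()


tracker = ParentTracker()
tracker.feed('<div class="red"><p>First</p><p id="an_id">Second</p></div>')
for tag, parent in tracker.parents:
    print(f"{tag} -> parent: {parent}")
```

Both `p` elements report `div` as their parent, which is what makes them siblings.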

Below is one final example: a table in HTML.

In [3]:
%%html
<table style="width:100%">
  <tr>   
    <th>Firstname</th> <!-- Note the 'th' tag: it marks a header cell, which is rendered bold by default. -->
    <th>Lastname</th> 
    <th>Age</th>
  </tr>
  <tr>
    <td>Jill</td>
    <td>Smith</td> 
    <td>50</td>
  </tr>
  <tr>
    <td>Eve</td>
    <td>Jackson</td> 
    <td>94</td>
  </tr>
</table>

| Firstname | Lastname | Age |
| :- | :- | :- |
| Jill | Smith | 50 |
| Eve | Jackson | 94 |
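
Tables are a common scraping target, so it helps to see how the `tr`, `th` and `td` tags map onto rows and cells. The following is only a sketch using Python's standard-library `html.parser`; `TableRows` is a hypothetical helper, and it assumes every cell sits inside a `tr` and that tables are not nested.

```python
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Collect the text of each table cell, grouped by row (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self.in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.rows.append([])       # start a new row
        elif tag in ("th", "td"):
            self.in_cell = True
            self.rows[-1].append("")   # start a new, empty cell

    def handle_endtag(self, tag):
        if tag in ("th", "td"):
            self.in_cell = False

    def handle_data(self, data):
        # Only keep text that appears inside a header or data cell.
        if self.in_cell:
            self.rows[-1][-1] += data.strip()


parser = TableRows()
parser.feed(
    "<table>"
    "<tr><th>Firstname</th><th>Lastname</th><th>Age</th></tr>"
    "<tr><td>Jill</td><td>Smith</td><td>50</td></tr>"
    "</table>"
)
print(parser.rows)
```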


#### Exercise

Create a list of your five favourite hobbies and colour the first item on the list red.

# CSS

CSS (Cascading Style Sheets) lets web developers control the design and layout of a page. HTML contains the content of a page, and CSS describes how that content is laid out and styled.

## Styling with CSS

Below is an HTML example without any styling.

In [4]:
%%html
<body>
    <div>
        <header>
            <h1>Welcome to my page</h1>
        </header>
        <section>
            <div>
                <h3>Navigate my page</h3>
                <ul>
                    <li><a href="/index.html">Home</a></li>
                    <li><a href="/about.html">About</a></li>
                </ul>
            </div>
            <div>
                <h2>Text about me</h2>
                <p>Hello, this is my webpage</p>
            </div>
        </section>
    </div>
</body>

Let's add some attributes to the HTML and use CSS to style it.

In [6]:
%%html
<body>
<style>
body {
    background-color: #9B9B9B;
}

#container {
    width: 800px;
    margin: auto;
    background-color: black;
}

.heading {
    text-align: center;
    padding-top: 20px;
}

section {
    display: flex;
}

#navigation {
    width: 30%;
    padding-left: 30px;
}

#content {
    width: 60%;
}

.link {
    color: red;
}


</style>


    <div id="container">
        <header>
            <h1 class="heading">Welcome to my page</h1>
        </header>
        <section>
            <div id="navigation">
                <h3>Navigate my page</h3>
                <ul>
                    <li><a href="/index.html" class="link">Home</a></li>
                    <li><a href="/about.html" class="link">About</a></li>
                </ul>
            </div>
            <div id="content">
                <h2>Text about me</h2>
                <p>Hello, this is my webpage.
                </p>
            </div>
        </section>
    </div>
</body>

We can see how the layout of the page has changed.

## CSS Syntax


The syntax of CSS is quite simple. A rule consists of a selector, which specifies which HTML elements to target, followed by one or more declarations. Each declaration is a property and a value separated by a colon.

![Css selector](https://cdn.mos.cms.futurecdn.net/c3cf2e9ff8f8adfe64311cc5f4c1f82a-650-80.png)

For the purpose of web scraping, what we really care about is how CSS selectors work, because they give us an effective way to specify which HTML elements we want.

For a more detailed explanation of CSS, see the [MDN introduction to CSS](https://developer.mozilla.org/en-US/docs/Learn/CSS/Introduction_to_CSS).

## CSS Selectors

CSS selectors are used to specify which HTML elements we wish to select.

### Element Selectors 

The most basic CSS selector is simply the name of the element that you would like to select. In the example below, the CSS selector selects all `p` tags and colours them red.

```css
p {
    color: red;
}
```

### Descendant Selectors


Descendant selectors can be used to select elements within other elements. In the example below, the CSS selects all `li` elements contained within a `ul`. Remember that `li` means list item and `ul` is an unordered list.

```css
ul li {
    border-bottom: 1px gray solid;
}
```

### Classes


Classes start with a dot **`.`** and can be used to select elements that have that class as an attribute.

```css
.spaghetti {
    background-color: green;
}
```
    
### IDs

IDs start with a hash (`#`) and are used to apply a CSS style to a single, unique element.

```css
#fusilli {
    background-color: yellow;
}
```

### Chained Selectors

When using classes we can chain selectors together. For example, the selector below selects all elements that have both the `spaghetti` and `penne` classes.

```css
.spaghetti.penne {
    background: blue;
}
```
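
The matching rule behind a chained class selector can be sketched in a few lines: an element matches only if its `class` attribute contains every class in the chain, in any order. The `matches_chained_classes` function below is a hypothetical helper written for illustration, not how a browser is actually implemented.

```python
def matches_chained_classes(class_attribute, required_classes):
    """Return True if the class attribute contains every required class."""
    # The class attribute is a whitespace-separated list of class names.
    element_classes = set(class_attribute.split())
    return set(required_classes) <= element_classes


chain = ["spaghetti", "penne"]  # corresponds to the selector .spaghetti.penne
print(matches_chained_classes("spaghetti penne", chain))        # True
print(matches_chained_classes("spaghetti", chain))              # False
print(matches_chained_classes("penne large spaghetti", chain))  # True
```

Extra classes on the element do not prevent a match; a missing one does.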

There are many more ways to write CSS selectors. To learn them, I suggest you use a [cheatsheet](https://www.w3schools.com/cssref/css_selectors.asp).


#### Exercise

Create a table using HTML like below (note the text color of the headers):

| <span style="color:red">Name</span> | <span style="color:red">Age</span> | <span style="color:red">Grade</span> |
| :- | :- | :- |
| Ace | 21 | A |
| Ben | 20 | B |
| Caleb | 18 | B |

# Chrome Developer Tools

Before we can scrape a webpage, we need to understand its structure. For this, we'll use the Chrome Developer Tools to inspect it. Most browsers have developer tools, but Chrome's are among the best. They let us inspect an HTML page easily so that we can find the HTML elements we would like to extract information from.

# JavaScript

JavaScript (JS) is one of the most popular programming languages and is the language of the web. You need JS for almost anything dynamic in your web browser. Closing a popup by clicking a button? Logging into an app? Displaying content endlessly on your Facebook feed? All of this is achieved with JavaScript. JavaScript support is built right into all the major web browsers, and each browser has a JS console.

For a brief overview of the language, see [JavaScript in Y minutes](https://learnxinyminutes.com/docs/javascript/). Don't worry about learning any JavaScript if you're not fully comfortable with Python yet.

Nevertheless, below are some of the aspects of JavaScript you may find useful in web scraping. The first is being able to select particular HTML elements. For this, we can reuse our CSS selectors.

```javascript
// To select by ID, prefix the ID you want with '#'.
document.querySelector('#id')

// To select by class, prefix the class you want with '.'.
document.querySelector('.class')

// To select by two classes, write them together with NO SPACE in between.
// Only elements that have both classes will match.
document.querySelector('.class1.class2')

// And no, an element can't have two IDs. You don't have two ID cards, right?

// To select by tag name, just write the name of the tag as the selector.
// querySelector returns only the first match; querySelectorAll returns ALL matches.
// This example selects ALL of the <a> elements on the page.
document.querySelectorAll('a')

// To select several different kinds of element at once, separate the selectors with a comma (,).
// This example selects every element that has class="class", or id="id", or is an <a> tag.
document.querySelectorAll('.class,#id,a')
```

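After we've selected an element, we'll usually want to extract information from it.

- We can access an attribute of the element with bracket notation, much like a dictionary lookup in Python (`link['href']` is equivalent to `link.href`).
- The text of a link can be accessed using `.text`; for other kinds of element, use `.textContent`.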

```javascript
const link = document.querySelector('#linkICareAbout')
const href = link['href']
const text = link.text
```

Clicking on an element is also quite simple:

```javascript
// Select a button with the ID next-page-button
const button = document.querySelector('#next-page-button')
button.click()
```

Slightly less often, we may want to submit a form. To do this we fill out the input fields of the form and then click the submit button. This would look something like:

```javascript
// Select an input field with the ID full-name
const fullname = document.querySelector('#full-name')
// Set the value of the input field
fullname.value = 'Dom Morgan'

const email = document.querySelector('#email')
email.value = 'somewhere@email.com'

// Finally, submit the form by clicking on the submit button
const submitButton = document.querySelector('#submit-button')
submitButton.click()
```

## Extra Practice - CSS Selectors

You can practice CSS selectors with these games:

* [CSS Diner](https://flukeout.github.io/)
* [CSS Leveler](http://toolness.github.io/css-selector-game/)

Alternatively, you could write a simple HTML page and try to style it. For a good cheatsheet on CSS selectors and XPath, see [here](http://www.cheetyr.com/css-selectors).