# SOEN287 Web Programming
## HTML: HyperText Markup Language
### Author: Denis Rinfret 


## Hello World Example
~~~~html
<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <title>Hello SOEN287!</title>
</head>
<body>
    <h1>Hello SOEN287!</h1>
</body>
</html>
~~~~

### Output: `hello.html`
<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <title>Hello SOEN287!</title>
</head>
<body>
    <h1>Hello SOEN287!</h1>
</body>
</html>

- An HTML document is a file containing *text* in a specific format
- The file name extension is usually `.html`, or sometimes `.htm`
- An HTML file can be edited with *any text editor*
- Usually, a *web browser* is used to view an HTML document
- HTML documents, along with *Cascading Style Sheets (CSS)*, *JavaScript* and other technologies, are used to build web sites

## Introduction

- A good source of information, used throughout this course among other resources, is W3Schools: [https://www.w3schools.com/html/html_intro.asp](https://www.w3schools.com/html/html_intro.asp)
[![W3Schools](https://www.w3schools.com/images/w3schools200x60.gif)](https://www.w3schools.com/html/html_intro.asp)
- Some sections, like this one, are based in part on the W3Schools tutorials and reference documents, while other sections are completely independent of W3Schools

### What is HTML?

HTML is the standard markup language for creating Web pages.
- HTML stands for *HyperText Markup Language*
- HTML describes the *structure* of Web pages using *markup*
- HTML *elements* are the building blocks of HTML pages
- HTML elements are represented by *tags*
- HTML tags label pieces of content such as *heading*, *paragraph*, *table*, and so on
- *Browsers* do not display the HTML tags, but use them to *render* the content of the page

![W3Schools](https://www.w3schools.com/images/w3schools80x15.gif)

### HTML Versions

Since the early days of the web, there have been many versions of HTML:

| Version | Year|
|---------|-----|
| HTML | 1991 |
| HTML 2.0 | 1995 |
| HTML 3.2 | 1997 |
| HTML 4.01 | 1999 |
| XHTML | 2000 |
| HTML5 | 2014 |

![W3Schools](https://www.w3schools.com/images/w3schools80x15.gif)

### A Simple HTML Document
~~~~html
<!DOCTYPE html>
<html>
<head>
    <title>Page Title</title>
</head>
<body>
<h1>My First Heading</h1>
<p>My first paragraph.</p>
</body>
</html>
~~~~

![W3Schools](https://www.w3schools.com/images/w3schools80x15.gif)

### Output: `intro.html`
<!DOCTYPE html>
<html>
<head>
    <title>Page Title</title>
</head>
<body>

<h1>My First Heading</h1>
<p>My first paragraph.</p>

</body>
</html>

### Example Explained

- The `<!DOCTYPE html>` declaration defines this document to be *HTML5*
- The `<html>` element is the *root* element of an HTML page
- The `<head>` element contains *meta* information about the document
- The `<title>` element specifies a *title* for the document
- The `<body>` element contains the *visible page content*
- The `<h1>` element defines a *top-level heading*, which browsers render large by default
- The `<p>` element defines a *paragraph*

![W3Schools](https://www.w3schools.com/images/w3schools80x15.gif)
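
The nesting of these elements can be made visible with a few lines of code. The following is a minimal sketch, assuming Python's standard `html.parser` module; the class name `OutlinePrinter` is made up for illustration, and this simple tokenizer does far less than a real browser.

```python
from html.parser import HTMLParser


class OutlinePrinter(HTMLParser):
    """Record each start tag, indented by its nesting depth (illustrative helper)."""

    def __init__(self):
        super().__init__()
        self.depth = 0
        self.outline = []

    def handle_starttag(self, tag, attrs):
        # Indent by two spaces per nesting level, then go one level deeper.
        self.outline.append("  " * self.depth + tag)
        self.depth += 1

    def handle_endtag(self, tag):
        # An end tag closes the current element, so go back up one level.
        self.depth -= 1


DOCUMENT = """<!DOCTYPE html>
<html>
<head>
    <title>Page Title</title>
</head>
<body>
<h1>My First Heading</h1>
<p>My first paragraph.</p>
</body>
</html>"""

printer = OutlinePrinter()
printer.feed(DOCUMENT)
printer.close()
print("\n".join(printer.outline))
```

The printed outline shows `html` at the root, `head` and `body` one level below it, and `title`, `h1` and `p` one level deeper. The `<!DOCTYPE html>` declaration is not an element, so it does not appear.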

### HTML Tags

- HTML tags are element names surrounded by angle brackets:
~~~~html
<tagname>content goes here...</tagname>
~~~~

- HTML tags normally come in pairs like `<p>` and `</p>`
- The first tag in a pair is the *start tag*, the second tag is the *end tag*
- The end tag is written like the start tag, but with a *forward slash* inserted before the tag name

![W3Schools](https://www.w3schools.com/images/w3schools80x15.gif)

### Do Not Forget the End Tag
Some HTML elements will display correctly, even if you forget the end tag:

Example:
~~~~html
<html>
<body>

<p>This is a paragraph
<p>This is a paragraph

</body>
</html>
~~~~


- The previous example works in all browsers, because the end tag of the `<p>` element is considered optional.

- **Never rely on this. It might produce unexpected results and/or errors if you forget the end tag**.

![W3Schools](https://www.w3schools.com/images/w3schools80x15.gif)
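
To see that the end tags really are missing from the source, and that it is the browser that fills them in, one can count start and end tags. This is only a sketch under assumptions: it uses Python's standard `html.parser` module, the `TagCounter` class is a hypothetical helper, and it is not a validator.

```python
from collections import Counter
from html.parser import HTMLParser


class TagCounter(HTMLParser):
    """Collect the names of all start tags and end tags (illustrative helper)."""

    def __init__(self):
        super().__init__()
        self.start_tags = []
        self.end_tags = []

    def handle_starttag(self, tag, attrs):
        self.start_tags.append(tag)

    def handle_endtag(self, tag):
        self.end_tags.append(tag)


SOURCE = (
    "<html>\n<body>\n\n"
    "<p>This is a paragraph\n"
    "<p>This is a paragraph\n\n"
    "</body>\n</html>"
)

counter = TagCounter()
counter.feed(SOURCE)
counter.close()

# Start tags that have no matching end tag in the source text.
unclosed = Counter(counter.start_tags) - Counter(counter.end_tags)
print(dict(unclosed))  # {'p': 2}
```

Both `<p>` elements are left open in the source; a browser closes them implicitly, which is exactly the behaviour you should not depend on.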

### Empty HTML Elements
- HTML elements with no content are called empty elements.

- `<br>` is an empty element without a closing tag

- Example:
~~~~html
<p>This is a <br> paragraph with a line break.</p>
~~~~

- Empty elements can be *closed* in the opening tag like this: `<br />`.

- HTML5 does not require empty elements to be closed, but if you want stricter validation, or if you need to make your document readable by XML parsers, you must close all HTML elements properly.

![W3Schools](https://www.w3schools.com/images/w3schools80x15.gif)
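
The point about XML parsers can be checked directly. The sketch below assumes Python's standard `xml.etree.ElementTree` module as the XML parser; the function name `is_well_formed_xml` is an illustrative assumption.

```python
import xml.etree.ElementTree as ET


def is_well_formed_xml(markup):
    """Return True if an XML parser accepts the markup, False otherwise."""
    try:
        ET.fromstring(markup)
    except ET.ParseError:
        return False
    return True


unclosed = "<p>This is a <br> paragraph with a line break.</p>"
closed = "<p>This is a <br /> paragraph with a line break.</p>"

print(is_well_formed_xml(unclosed))  # False: <br> is never closed
print(is_well_formed_xml(closed))    # True: <br /> closes itself
```

A browser renders both versions the same way, but only the second one can be read by an XML parser.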

### Use Lowercase Tags

- HTML tags are not case sensitive: `<P>` means the same as `<p>`.

- The HTML5 standard does not require lowercase tags, but W3C *recommends lowercase in HTML*, and demands lowercase for stricter document types like XHTML.

![W3Schools](https://www.w3schools.com/images/w3schools80x15.gif)
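
Case insensitivity is easy to observe with an HTML parser. As a small sketch, assuming Python's standard `html.parser` module (which reports tag names converted to lowercase) and a hypothetical `TagNames` helper class:

```python
from html.parser import HTMLParser


class TagNames(HTMLParser):
    """Collect the name of every start tag as reported by the parser."""

    def __init__(self):
        super().__init__()
        self.names = []

    def handle_starttag(self, tag, attrs):
        self.names.append(tag)


names = TagNames()
names.feed("<P>Upper case tag</P><p>Lower case tag</p>")
names.close()
print(names.names)  # ['p', 'p']
```

Both paragraphs are reported as `p`, whichever case was used in the source.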

## Editors

- Use any text editor to edit HTML documents, such as
    - SublimeText
    - Brackets
    - Notepad++
    - Visual Studio Code
    - and many others...

- Because we will do server-side programming in Python later in the course, using the PyCharm IDE from JetBrains, it's a good idea to start using PyCharm immediately.

- Save your files with the `.html` extension, preferably using the *UTF-8* encoding.
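
Most editors let you choose the encoding when saving. To illustrate what UTF-8 encoding means for the bytes on disk, here is a minimal sketch in Python; the page content (with the accented word "Montréal") and the temporary folder are assumptions made for the example.

```python
import tempfile
from pathlib import Path

# Hypothetical page containing one non-ASCII character (é).
PAGE = """<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <title>Hello Montréal!</title>
</head>
<body>
    <h1>Hello Montréal!</h1>
</body>
</html>
"""

# Work in a temporary directory so the sketch does not touch real files.
with tempfile.TemporaryDirectory() as folder:
    path = Path(folder) / "hello.html"
    path.write_text(PAGE, encoding="utf-8")      # save as UTF-8
    raw_bytes = path.read_bytes()                # the bytes actually stored
    round_trip = path.read_text(encoding="utf-8")

print(b"\xc3\xa9" in raw_bytes)  # True: é is stored as the two bytes C3 A9
print(round_trip == PAGE)        # True: reading back as UTF-8 restores the text
```

The `<meta charset="UTF-8">` line tells the browser to decode the file the same way it was saved.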

## PyCharm

- Download PyCharm from [JetBrains.com](https://www.jetbrains.com/pycharm/download/)
- Choose the *Professional* version.
- Apply for a *Student* license [here](https://www.jetbrains.com/student/)
- Use your *university email address* to apply
- Or you can use your GitHub account if you already have a [GitHub Student Developer Pack](https://education.github.com/pack)
- If you are already using another JetBrains IDE, such as IntelliJ IDEA, you can install the Python plugin, which will enable most of the PyCharm functionality

## Examples
1. `profile1.html`
1. `profile2.html`
1. `links.html`
1. `lists.html`
1. `table1.html`
1. `table2.html`

### `pre` Element

The `p` element doesn't preserve whitespace, but the `pre` element does. Browsers also typically render `pre` text in a monospaced font.

~~~html
<p>Whitespaces are       not 
    preserved
in a paragraph, but are preserved 
            in a pre element.</p>
~~~

#### Output:
<p>Whitespaces are       not 
    preserved
in a paragraph, but are preserved 
            in a pre element.</p>

~~~html
<pre>Whitespaces are       not 
    preserved
in a paragraph, but are preserved 
            in a pre element.</pre>
~~~

#### Output:
<pre>Whitespaces are       not 
    preserved
in a paragraph, but are preserved 
            in a pre element.</pre>
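
What the browser does to the paragraph text can be approximated in one line of Python. This is a rough sketch of the default whitespace collapsing, not the browser's exact algorithm; the variable names are illustrative.

```python
# The same text as in the two examples above, including its extra spaces
# and line breaks.
TEXT = (
    "Whitespaces are       not \n"
    "    preserved\n"
    "in a paragraph, but are preserved \n"
    "            in a pre element."
)

# Approximation of a <p>: every run of spaces and newlines becomes one space.
collapsed = " ".join(TEXT.split())

# A <pre> keeps the text as written, so nothing needs to be done.
preserved = TEXT

print(collapsed)
print(preserved)
```

The first `print` shows a single line with single spaces, like the rendered paragraph; the second keeps the original layout, like the rendered `pre` element.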

### Code
`code` is similar to `pre`, but doesn't preserve whitespace by default and can have different styles applied to it. To preserve whitespace in `code`, wrap it in a `pre` element or apply CSS rules that preserve whitespace.
~~~html
<pre>
<code>
hello        =     "Hello, SOEN287!"
print(hello)
</code>
</pre>
~~~

#### Output:
<pre>
<code>
hello        =     "Hello, SOEN287!"
print(hello)
</code>
</pre>

### Entities
To display a *less than* character (`<`), either in a code section or elsewhere, use the entity `&lt;`. Written literally, the character could be mistaken for the start of a tag.
~~~html
<pre>
<code>
x = 5
if x &lt; 10:
    print(x)
</code>
</pre>
~~~

#### Output:
<pre>
<code>
x = 5
if x &lt; 10:
    print(x)
</code>
</pre>
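
When HTML is generated by a program, this replacement is usually done automatically rather than by hand. As a minimal sketch, assuming Python's standard `html` module (the variable names are illustrative):

```python
import html

source_line = "if x < 10:"

# Replace the characters that have a special meaning in HTML with entities.
escaped = html.escape(source_line)
print(escaped)  # if x &lt; 10:

# The ampersand, greater-than sign and quotes are escaped as well.
print(html.escape('a & b > "c"'))  # a &amp; b &gt; &quot;c&quot;
```

The escaped text can be placed safely between `<code>` and `</code>`.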

### Other Symbols

There are too many symbols to list them all. Examples:
- `&euro;` &euro; 
- `&alpha;` &alpha;
- `&copy;` &copy;
- `&hearts;` &hearts;
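
The mapping from entity to character can also be looked up in code. A small sketch, assuming Python's standard `html` module; the list of names simply repeats the examples above:

```python
import html

# Decode each named entity from the list above into its character.
decoded = {}
for name in ("euro", "alpha", "copy", "hearts"):
    entity = f"&{name};"
    decoded[entity] = html.unescape(entity)
    print(entity, decoded[entity])

# Numeric character references work too: 8364 is the code point of the euro sign.
print(html.unescape("&#8364;"))  # €
```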

## URL - Uniform Resource Locator

- A URL is another word for a web address.
- A URL can be composed of words (w3schools.com), or an Internet Protocol (IP) address (192.68.20.50).
- Most people enter the name when surfing, because names are easier to remember than numbers.
- Web browsers request pages from web servers by using a URL.
- A Uniform Resource Locator (URL) is used to address a document (or other data) on the web.

![W3Schools](https://www.w3schools.com/images/w3schools80x15.gif)

A web address like https://www.w3schools.com/html/default.asp follows these syntax rules:

> **scheme://prefix.domain:port/path/filename**

- **scheme**: defines the type of Internet service (most common is http or https)
- **prefix**: defines a domain prefix (commonly www, by convention)
- **domain**: defines the Internet domain name (like w3schools.com)
- **port**: defines the port number at the host (default is 80 for http and 443 for https)
- **path**: defines a path at the server (If omitted: the root directory of the site)
- **filename**: defines the name of a document or resource

![W3Schools](https://www.w3schools.com/images/w3schools80x15.gif)
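
The parts listed above can be extracted programmatically. The following is a sketch, assuming Python's standard `urllib.parse` module; the split into prefix and domain at the first dot, and the `DEFAULT_PORTS` table, are simplifying assumptions made for this example and are not part of the library.

```python
import posixpath
from urllib.parse import urlsplit

# Assumption for this sketch: default ports for the two most common schemes.
DEFAULT_PORTS = {"http": 80, "https": 443}

parts = urlsplit("https://www.w3schools.com/html/default.asp")

scheme = parts.scheme                  # 'https'
host = parts.hostname                  # 'www.w3schools.com'
prefix, _, domain = host.partition(".")  # naive split: 'www' and 'w3schools.com'

# No port is written in the URL, so fall back to the scheme's default.
port = parts.port or DEFAULT_PORTS[scheme]

# Separate the directory path from the file name.
path, filename = posixpath.split(parts.path)  # '/html' and 'default.asp'

print(scheme, prefix, domain, port, path, filename)
```

Splitting the host at the first dot only works for simple names like this one; it is shown here to match the syntax rule above, not as a general technique.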

## HTML Validator
- Valid HTML code should render correctly in all modern browsers
- Invalid HTML code might render correctly if you are lucky, or possibly render into something unreadable
- Validate your HTML code using the [W3C Validator](https://validator.w3.org/nu/)
- Errors are problematic. You should get rid of all errors
- Warnings should not create any rendering issues, but it is better to get rid of warnings if possible

## Readings and Resources

1. [W3Schools HTML5 Tutorial](https://www.w3schools.com/html/default.asp)
    - HTML sections *HOME* to *Lists*
    - Skip (for now) sections *Styles*, *Colors* and *CSS*, and also the parts in other sections referring to these 3 sections
    - *Cascading Style Sheets (CSS)* will be covered later
    - Sections *Computercode* to *Charset*
    - HTML *Forms* will be covered later
    - HTML5 sections *Intro*, *New Elements*, *Semantics* and *Style Guide*
2. [W3Schools HTML Reference](https://www.w3schools.com/tags/default.asp)
    - Don't worry, you will not have to memorize all these HTML tags, attributes, etc.
    - But it's a good resource to use to discover new tags or to refresh your memory