***An introduction to TCP/IP networks***

The Internet protocol suite, often referred to as TCP/IP, is a set of protocols
designed to work together to provide end-to-end transmission of messages
across interconnected networks.

**IP addresses**<br>
Let's start with something you're likely to be familiar with: IP addresses. They
typically look something like this:

203.0.113.12

An IPv4 address like this one is actually a single 32-bit number, but it is usually
written as four decimal numbers separated by dots, as in the preceding example. The
numbers are sometimes called octets or bytes because each one represents 8 bits of
the 32-bit number. As such, each octet can only take values from 0 to 255, so valid
IP addresses range from 0.0.0.0 to 255.255.255.255. This way of writing IP addresses
is called dot-decimal notation.
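To make the link between the dotted form and the underlying 32-bit number concrete, here is a minimal sketch using Python's standard `ipaddress` module. The variable names are only illustrative.

```python
import ipaddress

text = "203.0.113.12"

# Let the standard library parse the dot-decimal form into a 32-bit integer.
address = ipaddress.IPv4Address(text)
as_int = int(address)

# Do the same by hand: each octet fills 8 bits, most significant octet first.
octets = [int(part) for part in text.split(".")]
packed = (octets[0] << 24) | (octets[1] << 16) | (octets[2] << 8) | octets[3]

# Converting the integer back gives the dot-decimal form again.
round_trip = str(ipaddress.IPv4Address(as_int))

print(as_int, packed, round_trip)
```

Both approaches should produce the same integer, and converting that integer back yields the original dotted string.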

If we run a command that lists the network configuration (ip addr on Linux, ipconfig
on Windows), we can see the IP addresses that are assigned to our device's network
interfaces. On Linux, the interfaces will have names, such as eth0; on Windows they
will have phrases, such as Ethernet adapter Local Area Connection.
On Linux, the command is run as follows:


$ ip addr

**Packets** <br>
Many protocols, including the principal protocols in the Internet protocol suite,
employ a technique called packetization to help manage data while it's being
transmitted across a network.
When a packetizing protocol is given some data to transmit, it breaks it up into small
units (sequences of bytes, typically a few thousand bytes long) and then prefixes
each unit with some protocol-specific information. The prefix is called a header, and
the prefix and data together form a packet. The data within a packet is often called
its payload.
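The idea can be sketched in a few lines of Python. This is a toy illustration, not any real protocol: the header layout (a 4-byte sequence number followed by a 2-byte payload length) and the function names `packetize` and `reassemble` are assumptions made up for this example.

```python
import struct

# Hypothetical header: 4-byte sequence number, 2-byte payload length,
# both in network (big-endian) byte order.
HEADER = struct.Struct("!IH")

def packetize(data, max_payload=1000):
    """Split data into payloads and prefix each one with a header."""
    packets = []
    for seq, start in enumerate(range(0, len(data), max_payload)):
        payload = data[start:start + max_payload]
        packets.append(HEADER.pack(seq, len(payload)) + payload)
    return packets

def reassemble(packets):
    """Rebuild the original data, even if packets arrive out of order."""
    chunks = {}
    for packet in packets:
        seq, length = HEADER.unpack_from(packet)
        chunks[seq] = packet[HEADER.size:HEADER.size + length]
    return b"".join(chunks[seq] for seq in sorted(chunks))

data = bytes(range(256)) * 10          # 2560 bytes of sample data
packets = packetize(data)
restored = reassemble(list(reversed(packets)))
print(len(packets), restored == data)
```

Because each header carries a sequence number, the receiver in this sketch can put the payloads back in order regardless of the order in which the packets arrive.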

# Requests with urllib

In [6]:
from urllib.request import urlopen

In [7]:
response = urlopen('http://www.debian.org')


In [8]:
response

<http.client.HTTPResponse at 0x7fe855b8adc0>

In [9]:
response.readline()

b'<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">\n'

We use the urllib.request.urlopen() function to send a request and
receive a response for the resource at http://www.debian.org, in this case an
HTML page. We then print out the first line of the HTML we receive.

# Response objects

In [10]:
response.url

'https://www.debian.org/'

In [13]:
response.read(500)

b'e>\n  <link rel="author" href="mailto:webmaster@debian.org">\n  <meta name="Description" content="Debian is an operating system and a distribution of Free Software. It is maintained and updated through the work of many users who volunteer their time and effort.">\n  <meta name="Generator" content="WML 2.12.2">\n  <meta name="Modified" content="2020-09-10 23:27:16">\n  <meta name="viewport" content="width=device-width">\n  <meta name="mobileoptimized" content="300">\n  <meta name="HandheldFriendly" cont'

### read(500) returns the next 500 bytes of the response; reading continues from where the previous read stopped
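The response object behaves like a file: each read consumes part of the body, so a later read picks up where the previous one left off. The sketch below shows this without needing internet access by starting a throwaway HTTP server on the local machine. The handler class `HelloHandler` and the page it serves are hypothetical and exist only for this illustration.

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.request import urlopen

class HelloHandler(BaseHTTPRequestHandler):
    """Hypothetical handler that serves one tiny HTML page."""

    def do_GET(self):
        body = b"<!DOCTYPE html>\n<html><body>hello</body></html>\n"
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet

# Port 0 asks the operating system for any free port.
server = HTTPServer(("127.0.0.1", 0), HelloHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()

url = "http://127.0.0.1:{}/".format(server.server_port)
with urlopen(url) as response:
    status = response.status
    first_line = response.readline()   # consumes the first line only
    rest = response.read()             # continues after that line

server.shutdown()
server.server_close()
print(status, first_line, rest)
```

Using `urlopen()` in a `with` block also closes the connection once the body has been read.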


# Status codes

HTTP responses tell us whether anything unexpected happened to our request through
status codes. We can read the status code of a response by using its status attribute.

In [14]:
response.status

200

Status codes are grouped into ranges by their first digit:
* 100: Informational
* 200: Success
* 300: Redirection
* 400: Client error
* 500: Server error
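Since the range is determined by the first digit of the code, a small helper can map any status code to its range. This is a minimal sketch; the function name `status_class` is made up for this example.

```python
def status_class(code):
    """Return the name of the range that an HTTP status code falls into."""
    classes = {
        1: "Informational",
        2: "Success",
        3: "Redirection",
        4: "Client error",
        5: "Server error",
    }
    if not 100 <= code <= 599:
        raise ValueError("not a valid HTTP status code: {}".format(code))
    # Integer division by 100 keeps only the first digit.
    return classes[code // 100]

for code in (200, 301, 404, 503):
    print(code, status_class(code))
```

In the session above, it could be called as `status_class(response.status)`.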

# Know your cookies

In [22]:
from http.cookiejar import CookieJar

In [23]:
cookie_jar = CookieJar()

In [24]:
from urllib.request import build_opener, HTTPCookieProcessor

In [25]:
opener = build_opener(HTTPCookieProcessor(cookie_jar))

In [26]:
opener.open('http://www.github.com')

<http.client.HTTPResponse at 0x7fe8555da6a0>

In [27]:
len(cookie_jar)

3

In [28]:
cookies = list(cookie_jar)

In [29]:
cookies

[Cookie(version=0, name='_octo', value='GH1.1.1466440542.1600110891', port=None, port_specified=False, domain='.github.com', domain_specified=True, domain_initial_dot=False, path='/', path_specified=True, secure=True, expires=1631646891, discard=False, comment=None, comment_url=None, rest={'SameSite': 'Lax'}, rfc2109=False),
 Cookie(version=0, name='logged_in', value='no', port=None, port_specified=False, domain='.github.com', domain_specified=True, domain_initial_dot=False, path='/', path_specified=True, secure=True, expires=1631646891, discard=False, comment=None, comment_url=None, rest={'HttpOnly': None, 'SameSite': 'Lax'}, rfc2109=False),
 Cookie(version=0, name='_gh_sess', value='pMM7NUJVgYPWTltAgysy8jRIhpKs9tbBTWWL4A34QLon6SjTQbpwyj0lNfLN8z4DF%2FtloeIfjTL0g9vxX2sNjdlI2jPaEDJ2EZwsnnfLoEuhafFN1YSkKM4LHfWbC7G709dmCztrKTpq6ONbPZj2CpgLBwqSrVvKV32CxO6efGbvdHfPQY49CWS1WUuUuodb7sMhekk%2FvDDkDjkZPmar2Hm4b9OzrrTa%2BwAk8AFKCgmNCKsUV8%2BvZYXJVHc34fJp8ee6ikbTyCn6UGcJEhGTKg%3D%3D--Jp2gAh6ggdSi

In [30]:
cookies[0].name

'_octo'

In [32]:
cookies[0].domain

'.github.com'

In [33]:
cookies[0].path

'/'

In [34]:
import datetime

In [35]:
datetime.datetime.fromtimestamp(cookies[0].expires)

datetime.datetime(2021, 9, 15, 1, 14, 51)

So, our cookie will expire on 15th of September, 2021. The expiry date is the point
in time until which the server would like the client to hold on to the cookie. Once the
expiry date has passed, the client can throw the cookie away and the server will
send a new one with the next request. Of course, there's nothing to stop a client
from immediately throwing the cookie away, though on some sites this may break
functionality that depends on the cookie.
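As a sketch of how a client might check the expiry itself, the helper below converts the `expires` value to a time-zone-aware UTC datetime and compares it with the current time. The function name `cookie_expiry` is hypothetical. Note that the plain `fromtimestamp()` call used above returns the time in the machine's local time zone, so the UTC value can differ from it by several hours.

```python
from datetime import datetime, timezone

def cookie_expiry(expires, now=None):
    """Return (expiry as an aware UTC datetime, whether it has passed).

    A cookie's expires attribute is None for session cookies, which have
    no fixed expiry date; in that case (None, False) is returned.
    """
    if expires is None:
        return None, False
    if now is None:
        now = datetime.now(timezone.utc)
    expires_at = datetime.fromtimestamp(expires, tz=timezone.utc)
    return expires_at, now >= expires_at

# 1631646891 is the expires value of the _octo cookie shown above.
expires_at, expired = cookie_expiry(1631646891)
print(expires_at.isoformat(), expired)
```

With a cookie from the jar, this could be called as `cookie_expiry(cookies[0].expires)`.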