# Transferring Data on the Internet

In [17]:
import requests
import urllib.request
import socket

### TCP/IP Protocols

On the Internet, messages are transferred from one machine to another using the Internet Protocol (IP), which specifies how to transfer packets of data among different networks to allow global Internet communication. 

Each packet contains a header containing the destination IP address, along with other information. All packets are forwarded throughout the network toward the destination using simple routing rules on a best-effort basis.
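To make the idea of a packet header concrete, the sketch below decodes the fixed 20-byte portion of an IPv4 header using only the standard library. The helper name `parse_ipv4_header` and the hand-built example header (including its source address and zeroed checksum) are assumptions made for illustration; it is not how an operating system actually processes packets.

```python
import ipaddress
import struct

# Fixed 20-byte IPv4 header layout (no options), in network byte order.
IPV4_HEADER = struct.Struct("!BBHHHBBH4s4s")


def parse_ipv4_header(raw):
    """Decode the fixed part of an IPv4 header into a dictionary."""
    (version_ihl, _tos, total_length, _ident, _flags_frag,
     ttl, protocol, _checksum, src, dst) = IPV4_HEADER.unpack(raw[:20])
    return {
        "version": version_ihl >> 4,                 # high 4 bits
        "header_bytes": (version_ihl & 0x0F) * 4,    # low 4 bits, in 32-bit words
        "total_length": total_length,
        "ttl": ttl,
        "protocol": protocol,                        # 6 means TCP
        "source": str(ipaddress.IPv4Address(src)),
        "destination": str(ipaddress.IPv4Address(dst)),
    }


# Hypothetical header built by hand: the source address is made up
# and the checksum is left at 0 rather than computed.
example = IPV4_HEADER.pack(
    0x45, 0, 40, 1, 0, 64, 6, 0,
    ipaddress.IPv4Address("192.0.2.10").packed,
    ipaddress.IPv4Address("199.232.37.164").packed,
)
print(parse_ipv4_header(example))
```

Routers along the path read the destination field of each packet to decide where to forward it next.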

In [None]:
!curl -v https://httpbin.org/html

This design imposes constraints on communication. An IPv4 packet can be at most 65,535 bytes long, and a standard IPv6 packet can carry at most 65,535 bytes of payload. Larger data values must be split among multiple packets.
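As a rough illustration of that splitting, the sketch below cuts a byte string into pieces no larger than the limit quoted above. The function name `split_into_chunks` is hypothetical, and the sketch ignores the space a real header would take up; in practice, link-level limits are usually much smaller than this maximum.

```python
MAX_PACKET_BYTES = 65_535  # size limit quoted in the text


def split_into_chunks(data, max_payload=MAX_PACKET_BYTES):
    """Split data into consecutive pieces of at most max_payload bytes."""
    return [data[i:i + max_payload] for i in range(0, len(data), max_payload)]


message = b"x" * 150_000
chunks = split_into_chunks(message)
print([len(chunk) for chunk in chunks])  # [65535, 65535, 18930]
```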

The Transmission Control Protocol (TCP) is an abstraction built on top of IP that provides reliable, ordered transmission of arbitrarily large byte streams. TCP provides this guarantee by correctly ordering packets transferred by IP, removing duplicates, and requesting retransmission of lost packets.

TCP breaks a stream of data into segments, each of which includes a portion of the data preceded by a header that contains sequence and state information to support reliable, ordered transmission. Some TCP segments do not include data at all, but instead establish or terminate a connection between two computers.
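The way sequence information supports ordering and duplicate removal can be sketched in a few lines. The `Segment` class and `reassemble` function below are hypothetical simplifications: each segment carries only a byte offset and its data, whereas real TCP headers hold much more state, and retransmission is only signalled here by raising an error.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    seq: int      # byte offset of this segment's data within the stream
    data: bytes


def reassemble(segments):
    """Rebuild the byte stream from segments that may arrive out of
    order or more than once; raise ValueError if bytes are missing."""
    stream = bytearray()
    for seg in sorted(segments, key=lambda s: s.seq):
        if seg.seq + len(seg.data) <= len(stream):
            continue  # duplicate: these bytes were already received
        if seg.seq > len(stream):
            raise ValueError(
                f"missing bytes at offset {len(stream)}; retransmission needed"
            )
        # Append only the part of the segment that is new.
        stream.extend(seg.data[len(stream) - seg.seq:])
    return bytes(stream)


arrived = [Segment(6, b"world"), Segment(0, b"hello "), Segment(6, b"world")]
print(reassemble(arrived))  # b'hello world'
```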

In [16]:
socket.gethostbyname('www.nytimes.com')

'199.232.37.164'

To connect to a host by name, such as www.nytimes.com, a client first requests the Internet Protocol (IP) address of the computer located at that name from a Domain Name System (DNS) server. DNS provides the service of mapping domain names to IP addresses, which are numerical identifiers of machines on the Internet. Python can make such a request directly using the socket module, as in the cell above.
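A name can map to several addresses, including IPv6 ones, which `socket.gethostbyname` does not report. The sketch below uses `socket.getaddrinfo` to collect them all. The helper name `resolve_all` is an assumption, and `localhost` is used only so the example does not depend on network access; substitute any hostname, such as `www.nytimes.com`.

```python
import socket


def resolve_all(hostname):
    """Return the sorted, de-duplicated IP addresses for a hostname."""
    infos = socket.getaddrinfo(hostname, None, proto=socket.IPPROTO_TCP)
    # Each entry is (family, type, proto, canonname, sockaddr);
    # the address string is the first element of sockaddr.
    return sorted({info[4][0] for info in infos})


try:
    print(resolve_all("localhost"))
except socket.gaierror as err:
    print(f"lookup failed: {err}")
```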

In [18]:
response = urllib.request.urlopen('http://www.nytimes.com').read()
response[:100]

b'<!DOCTYPE html>\n<html lang="en" xmlns:og="http://opengraphprotocol.org/schema/">\n  <head>\n    <title'

### HTTP Protocol 

##### GET Request

In [2]:
url = "https://httpbin.org/html"

response = requests.get(url)
response

<Response [200]>

In [3]:
request = response.request
for key in request.headers:
    print(f"{key}: {request.headers[key]}")

User-Agent: python-requests/2.21.0
Accept-Encoding: gzip, deflate
Accept: */*
Connection: keep-alive


In [4]:
for key in response.headers:
    print(f"{key}: {response.headers[key]}")

Access-Control-Allow-Credentials: true
Access-Control-Allow-Origin: *
Content-Encoding: gzip
Content-Type: text/html; charset=utf-8
Date: Thu, 12 Dec 2019 19:22:28 GMT
Referrer-Policy: no-referrer-when-downgrade
Server: nginx
X-Content-Type-Options: nosniff
X-Frame-Options: DENY
X-XSS-Protection: 1; mode=block
Content-Length: 1936
Connection: keep-alive


##### POST Request

In [6]:
post_response = requests.post(url, data = {'name': 'DS-GA 1007 Student'})

In [7]:
post_response

<Response [405]>
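The status code 405 stands for "Method Not Allowed", which suggests that this URL does not accept POST requests. As a small aside, the standard library's `http.HTTPStatus` can translate numeric codes into their standard phrases; the helper name `describe_status` below is hypothetical.

```python
from http import HTTPStatus


def describe_status(code):
    """Return a status code together with its standard reason phrase."""
    status = HTTPStatus(code)
    return f"{status.value} {status.phrase}"


print(describe_status(405))  # 405 Method Not Allowed
print(describe_status(200))  # 200 OK
```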

In [8]:
post_response = requests.post("https://httpbin.org/post", data = {'name': 'DS-GA 1007 Student'})

In [9]:
post_response

<Response [200]>

In [10]:
post_response.text

'{\n  "args": {}, \n  "data": "", \n  "files": {}, \n  "form": {\n    "name": "DS-GA 1007 Student"\n  }, \n  "headers": {\n    "Accept": "*/*", \n    "Accept-Encoding": "gzip, deflate", \n    "Content-Length": "23", \n    "Content-Type": "application/x-www-form-urlencoded", \n    "Host": "httpbin.org", \n    "User-Agent": "python-requests/2.21.0"\n  }, \n  "json": null, \n  "origin": "216.165.95.142, 216.165.95.142", \n  "url": "https://httpbin.org/post"\n}\n'
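The `Content-Type` and `Content-Length` values echoed in the response above hint at how the form data was sent. The sketch below reproduces that encoding with the standard library, on the assumption that the request body was an ordinary URL-encoded form; it does not contact the server.

```python
from urllib.parse import parse_qs, urlencode

form = {"name": "DS-GA 1007 Student"}

# application/x-www-form-urlencoded: key=value pairs, spaces become '+'.
body = urlencode(form)
print(body)                 # name=DS-GA+1007+Student
print(len(body.encode()))   # 23

# Decoding the body recovers the original field (values come back as lists).
print(parse_qs(body))       # {'name': ['DS-GA 1007 Student']}
```

The encoded body is 23 bytes long, which agrees with the `Content-Length` of 23 shown in the echoed headers.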