# Network Sockets and Requests

Python scripts and best practices for the protocols that web browsers use to retrieve documents and that web applications use to interact with Application Programming Interfaces (APIs).

## Content:
- Sockets in Python
- An HTTP Request
- Using URL Parameters
- Access JSON content
- Authentication web access with requests
- Using Timeout with requests

## Sockets in Python

Python has built-in support for TCP sockets through the `socket` module.

In [3]:
import socket

In [4]:
mysock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
# connect to the host name and port number (80 is the default HTTP port)
mysock.connect(('data.pr4e.org', 80))
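
As a side note, the create-then-connect steps can be combined with `socket.create_connection()`, and a `with` block closes the socket automatically. The sketch below is only an illustration: to stay runnable without internet access it talks to a hypothetical local listener on `127.0.0.1` instead of `data.pr4e.org`, and the names `server`, `client` and `received` are assumptions made for this example.

```python
import socket

# Hypothetical local server standing in for a remote host;
# port 0 asks the operating system to pick any free port.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))
server.listen(1)
address = server.getsockname()

# create_connection() creates the socket and connects in a single call.
with socket.create_connection(address, timeout=5) as client:
    conn, _ = server.accept()  # server side of the same connection
    with conn:
        client.sendall(b"hello")    # bytes go out through the client socket
        received = conn.recv(512)   # and arrive on the server side

server.close()
print(received)
```

Both sockets opened inside the `with` blocks are closed when the blocks end, so no explicit `close()` call is needed for them.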

We have made the connection at the Transport Layer; now let's communicate at the Application Layer.

## An HTTP Request

We have made a connection to a port. Now we send a GET request and then read the data that comes back:

In [5]:
# build the request; the blank line (\r\n twice) marks the end of the request
cmd = 'GET http://data.pr4e.org/romeo.txt HTTP/1.0\r\n\r\n'.encode()
# sockets send bytes, so .encode() converts the string to UTF-8 bytes
# send request
mysock.send(cmd)

47
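
The number printed above is the count of bytes that were sent. The request above puts the full URL in the request line; a form that is seen more often sends only the path and names the host in a `Host` header. The helper below is a minimal sketch of that variant. `build_get_request` is a hypothetical name introduced for this example, not part of any library.

```python
def build_get_request(host, path):
    """Return the bytes of a minimal HTTP/1.0 GET request (hypothetical helper)."""
    lines = [
        f"GET {path} HTTP/1.0",  # request line: method, path, protocol version
        f"Host: {host}",         # header naming the target host
        "",                      # the two empty strings produce the final
        "",                      # \r\n\r\n that ends the header section
    ]
    return "\r\n".join(lines).encode()


request = build_get_request("data.pr4e.org", "/romeo.txt")
print(request)
```

The resulting bytes could be passed to `mysock.send()` in place of `cmd`.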

The host now sends data back, and we can use a while loop to print what we receive:

In [6]:
while True:
    data = mysock.recv(512)  # receive up to 512 bytes
    # an empty result means the server closed the connection:
    if len(data) < 1:
        break
    print(data.decode())
# close the connection
mysock.close()

HTTP/1.1 200 OK
Date: Thu, 09 Jan 2020 15:02:20 GMT
Server: Apache/2.4.18 (Ubuntu)
Last-Modified: Sat, 13 May 2017 11:22:22 GMT
ETag: "a7-54f6609245537"
Accept-Ranges: bytes
Content-Length: 167
Cache-Control: max-age=0, no-cache, no-store, must-revalidate
Pragma: no-cache
Expires: Wed, 11 Jan 1984 05:00:00 GMT
Connection: close
Content-Type: text/plain

But soft what light through yonder window breaks
It is the east and Juliet is the sun
Arise fair sun and kill the envious moon
Who is already s
ick and pale with grief



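The response headers (metadata) come first, and after the blank line we can see the content of the text. The word "sick" is split across two lines because a 512-byte chunk ended there and `print` adds a newline after each chunk.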
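
One way to work with such a response is to collect all chunks first and split headers from body at the first blank line. The sketch below assumes a shortened sample response (the `chunks` list stands in for successive `recv()` results and is not the full output shown above); the variable names are illustrative.

```python
# Sample chunks standing in for successive recv() results.
# Note that the first chunk ends in the middle of a header name.
chunks = [
    b"HTTP/1.1 200 OK\r\nConnection: close\r\nContent-Ty",
    b"pe: text/plain\r\n\r\nBut soft what light through yonder window breaks\n",
]

# Join the raw bytes before decoding, so chunk boundaries do not matter.
raw = b"".join(chunks)

# Headers and body are separated by the first blank line (\r\n\r\n).
head, _, body = raw.partition(b"\r\n\r\n")

status_line, *header_lines = head.decode().split("\r\n")
status_code = int(status_line.split()[1])
headers = dict(line.split(": ", 1) for line in header_lines)

print(status_code)
print(headers["Content-Type"])
print(body.decode(), end="")
```

Decoding only after joining also avoids errors when a multi-byte UTF-8 character happens to straddle two chunks.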

## Using URL Parameters

I'll load the `requests` module here, which can be installed with `pip install requests`.

In the examples I'll use the host https://httpbin.org and iterate over multiple parameter values, passing them as a dictionary:

In [7]:
import requests

In [24]:
for i in range(0,3):
    for j in range(0,3):
        payload = {'page':i, 'count':j}
        resp = requests.get('https://httpbin.org/get', params=payload)
        print(resp.url) 

https://httpbin.org/get?page=0&count=0
https://httpbin.org/get?page=0&count=1
https://httpbin.org/get?page=0&count=2
https://httpbin.org/get?page=1&count=0
https://httpbin.org/get?page=1&count=1
https://httpbin.org/get?page=1&count=2
https://httpbin.org/get?page=2&count=0
https://httpbin.org/get?page=2&count=1
https://httpbin.org/get?page=2&count=2


I only printed the URLs here, but with `.text` (decoded string) or `.content` (raw bytes) instead of `.url` we can access the data returned for each combination of parameters.

In [25]:
payload = {'page':0, 'count':1}
resp = requests.get('https://httpbin.org/get', params=payload)
print(resp.text)

{
  "args": {
    "count": "1", 
    "page": "0"
  }, 
  "headers": {
    "Accept": "*/*", 
    "Accept-Encoding": "gzip, deflate", 
    "Host": "httpbin.org", 
    "User-Agent": "python-requests/2.22.0"
  }, 
  "origin": "95.244.8.21, 95.244.8.21", 
  "url": "https://httpbin.org/get?page=0&count=1"
}
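
To see how a dictionary of parameters turns into a query string, the same encoding can be reproduced with the standard library. This is only a sketch with `urllib.parse`; it illustrates the resulting format and is not a description of how `requests` works internally.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

payload = {"page": 0, "count": 1}

# urlencode() joins key=value pairs with '&' and escapes special characters.
url = "https://httpbin.org/get?" + urlencode(payload)
print(url)

# Going the other way: parse the query string back into a dictionary.
# Values come back as lists of strings, because a key may appear more than once.
query = parse_qs(urlsplit(url).query)
print(query)
```

This also explains why the `args` values in the response above are strings (`"0"`, `"1"`) even though integers were passed in.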



## Access JSON content

JSON content can be accessed as Python dictionaries.

In [46]:
resp = requests.get('https://httpbin.org/json')
print(resp.text)

{
  "slideshow": {
    "author": "Yours Truly", 
    "date": "date of publication", 
    "slides": [
      {
        "title": "Wake up to WonderWidgets!", 
        "type": "all"
      }, 
      {
        "items": [
          "Why <em>WonderWidgets</em> are great", 
          "Who <em>buys</em> WonderWidgets"
        ], 
        "title": "Overview", 
        "type": "all"
      }
    ], 
    "title": "Sample Slide Show"
  }
}



The response can be parsed with `.json()` into a dictionary and assigned to a variable:

In [44]:
dat_dict = resp.json()
print(dat_dict['slideshow']) #Access the 'slideshow' content

{'author': 'Yours Truly', 'date': 'date of publication', 'slides': [{'title': 'Wake up to WonderWidgets!', 'type': 'all'}, {'items': ['Why <em>WonderWidgets</em> are great', 'Who <em>buys</em> WonderWidgets'], 'title': 'Overview', 'type': 'all'}], 'title': 'Sample Slide Show'}
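
Nested values are reached by chaining dictionary keys and list indexes. The sketch below uses the standard `json` module on a shortened copy of the slideshow document, so it runs without a network connection; `text`, `titles` and `item_counts` are names chosen for this example.

```python
import json

# Shortened copy of the JSON document shown above.
text = """
{
  "slideshow": {
    "title": "Sample Slide Show",
    "slides": [
      {"title": "Wake up to WonderWidgets!", "type": "all"},
      {"title": "Overview", "type": "all",
       "items": ["Why <em>WonderWidgets</em> are great",
                 "Who <em>buys</em> WonderWidgets"]}
    ]
  }
}
"""

data = json.loads(text)              # JSON objects become dicts, arrays become lists
slides = data["slideshow"]["slides"]

titles = [slide["title"] for slide in slides]
# Not every slide has "items", so .get() supplies an empty list instead of raising KeyError.
item_counts = [len(slide.get("items", [])) for slide in slides]

print(titles)
print(item_counts)
```

The same indexing works on the dictionary returned by `resp.json()`.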


## Authentication web access with requests

We can use the `requests` module to access web pages protected by HTTP Basic authentication:

<img src="auth_example.png">

The user id is **admin** and the password is **12345**. First, let's see what happens when we supply a wrong password:

In [48]:
resp = requests.get('https://httpbin.org/basic-auth/admin/12345', 
                    auth=('admin','1234567'))
print(resp)

<Response [401]>


We get a 401 (Unauthorized) response code because we supplied the wrong password.

In [49]:
resp = requests.get('https://httpbin.org/basic-auth/admin/12345', 
                    auth=('admin','12345'))
print(resp)

<Response [200]>


With the correct password the request succeeds with status 200 (OK).

In [50]:
print(resp.text)

{
  "authenticated": true, 
  "user": "admin"
}



We are inside.
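
With Basic authentication, the user id and password travel in an `Authorization` header as a Base64-encoded `user:password` string. The sketch below builds that header by hand to show what is sent; `basic_auth_header` is a hypothetical helper written for this example, not a function of `requests`.

```python
import base64


def basic_auth_header(user, password):
    """Build an HTTP Basic Authorization header (hypothetical helper)."""
    credentials = f"{user}:{password}".encode()
    token = base64.b64encode(credentials).decode()
    return {"Authorization": f"Basic {token}"}


header = basic_auth_header("admin", "12345")
print(header)

# Base64 is an encoding, not encryption: anyone can reverse it.
token = header["Authorization"].split(" ", 1)[1]
print(base64.b64decode(token).decode())
```

Because the credentials are trivially decoded, Basic authentication should only be used over HTTPS, as in the examples above.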

## Using Timeout with requests

It is good practice to pass a `timeout` parameter when accessing web pages: by default `requests` waits indefinitely, so the program can hang forever when a page gives no response.

In [53]:
# simulate a 1-second response time with a 3-second timeout
resp = requests.get('https://httpbin.org/delay/1', 
                    timeout=3)
print(resp)

<Response [200]>


Next I'll simulate a slow response to trigger a read timeout error and its traceback:

In [55]:
# simulate a 6-second response time with a 3-second timeout
resp = requests.get('https://httpbin.org/delay/6', 
                    timeout=3)
print(resp)

ReadTimeout: HTTPSConnectionPool(host='httpbin.org', port=443): Read timed out. (read timeout=3)
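
In a script, a timeout is usually caught rather than left to end the program with a traceback. With `requests` that would mean wrapping the call in `try`/`except requests.exceptions.Timeout`. The sketch below shows the same pattern at the socket level so it can run offline: it assumes a hypothetical local listener that accepts the connection but never answers, and a short 0.5-second timeout chosen for this example.

```python
import socket

# A local listener that never sends anything, standing in for a server
# that is too slow to answer.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
listener.listen(1)
address = listener.getsockname()

timed_out = False
client = socket.create_connection(address, timeout=0.5)
try:
    client.recv(512)             # waits for data, gives up after 0.5 seconds
except socket.timeout:
    timed_out = True
    print("Read timed out, giving up on this server")
finally:
    # Clean up whether or not the read succeeded.
    client.close()
    listener.close()
```

Catching the exception lets the program log the failure, retry, or move on to the next URL.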