## Deep Dive into httplib2 in Python

##### httplib was renamed to http.client in Python 3. httplib2 is a third-party library that is more comprehensive than http.client, so we use httplib2 here.

### Installing and importing httplib2

In [1]:
import httplib2
from pprint import pprint

These are some of the status codes we can get back when sending a request to a website:
* status : 200 --> success (OK)
* status : 404 --> resource not found
* status : 405 --> method not allowed
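
As a quick cross-check of the list above, the standard library's http.HTTPStatus enum maps numeric codes to their standard reason phrases. This is only a minimal sketch; the helper name describe_status is made up for illustration and is not part of httplib2.

```python
from http import HTTPStatus


def describe_status(code):
    """Return 'code phrase' for a numeric HTTP status code (hypothetical helper)."""
    status = HTTPStatus(code)
    return f"{status.value} {status.phrase}"


# The three codes listed above.
for code in (200, 404, 405):
    print(describe_status(code))
```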

#### https://httpbin.org is an HTTP request and response service designed for testing methods such as GET, POST and PUT.

In [2]:
bin_url = 'https://httpbin.org/'

In [3]:
http = httplib2.Http()

http.request returns a tuple of two values:
* the response object, which holds the header (metadata) information
* the content, which holds the actual data (here, HTML)

In [4]:
resp, data = http.request(bin_url)

#### The first value returned by http.request is the response with the header details

In [5]:
pprint(resp)

{'access-control-allow-credentials': 'true',
 'access-control-allow-origin': '*',
 'connection': 'keep-alive',
 'content-length': '9593',
 'content-location': 'https://httpbin.org/',
 'content-type': 'text/html; charset=utf-8',
 'date': 'Mon, 22 Jan 2024 18:43:09 GMT',
 'server': 'gunicorn/19.9.0',
 'status': '200'}


In [6]:
type(resp), len(resp)

(httplib2.Response, 9)

In [7]:
resp.status, resp.reason, resp.version

(200, 'OK', 11)

In [8]:
resp.previous

#### http.request returns the data as bytes; in the output below this is indicated by the b prefix

In [9]:
pprint(data)

(b'<!DOCTYPE html>\n<html lang="en">\n\n<head>\n    <meta charset="UTF-8">\n'
 b'    <title>httpbin.org</title>\n    <link href="https://fonts.googleapis.'
 b'com/css?family=Open+Sans:400,700|Source+Code+Pro:300,600|Titillium+Web:400,6'
 b'00,700"\n        rel="stylesheet">\n    <link rel="stylesheet" type="text/'
 b'css" href="/flasgger_static/swagger-ui.css">\n    <link rel="icon" type="'
 b'image/png" href="/static/favicon.ico" sizes="64x64 32x32 16x16" />\n    <'
 b'style>\n        html {\n            box-sizing: border-box;\n            ov'
 b'erflow: -moz-scrollbars-vertical;\n            overflow-y: scroll;\n      '
 b'  }\n\n        *,\n        *:before,\n        *:after {\n            box-'
 b'sizing: inherit;\n        }\n\n        body {\n            margin: 0;\n  '
 b'          background: #fafafa;\n        }\n    </style>\n</head>\n\n<body'
 b'>\n    <a href="https://github.com/requests/httpbin" class="github-corner'
 b'" aria-label="View source on Github">\n        <svg wi

In [10]:
type(data), len(data)

(bytes, 9593)

#### The content-type header of the response above shows that the body is encoded as UTF-8, so we decode it with that encoding.

In [11]:
html = data.decode("UTF-8")

type(html)

str
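
Rather than hard-coding the encoding, the charset can be read from the content-type header. The sketch below assumes the response behaves like a dictionary of lower-cased header names, as the pprint output above suggests. The names charset_of and decode_body are hypothetical helpers, and falling back to UTF-8 is an assumption, not something httplib2 does for us.

```python
from email.message import Message


def charset_of(content_type, default="utf-8"):
    """Extract the charset parameter from a Content-Type header value."""
    if not content_type:
        return default
    msg = Message()
    msg["content-type"] = content_type
    # get_content_charset() returns the charset in lower case, or None.
    return msg.get_content_charset() or default


def decode_body(resp, data):
    """Decode bytes using the charset named in the response headers."""
    return data.decode(charset_of(resp.get("content-type")))


# Sample header values; a plain dict stands in for the response object.
print(charset_of("text/html; charset=utf-8"))
print(decode_body({"content-type": "text/html; charset=ISO-8859-1"}, b"caf\xe9"))
```

With a real response this would be called as decode_body(resp, data).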

#### Here we display the decoded data, which is now a str, so the leading b prefix is gone

In [12]:
pprint(html)

('<!DOCTYPE html>\n'
 '<html lang="en">\n'
 '\n'
 '<head>\n'
 '    <meta charset="UTF-8">\n'
 '    <title>httpbin.org</title>\n'
 '    <link '
 'href="https://fonts.googleapis.com/css?family=Open+Sans:400,700|Source+Code+Pro:300,600|Titillium+Web:400,600,700"\n'
 '        rel="stylesheet">\n'
 '    <link rel="stylesheet" type="text/css" '
 'href="/flasgger_static/swagger-ui.css">\n'
 '    <link rel="icon" type="image/png" href="/static/favicon.ico" '
 'sizes="64x64 32x32 16x16" />\n'
 '    <style>\n'
 '        html {\n'
 '            box-sizing: border-box;\n'
 '            overflow: -moz-scrollbars-vertical;\n'
 '            overflow-y: scroll;\n'
 '        }\n'
 '\n'
 '        *,\n'
 '        *:before,\n'
 '        *:after {\n'
 '            box-sizing: inherit;\n'
 '        }\n'
 '\n'
 '        body {\n'
 '            margin: 0;\n'
 '            background: #fafafa;\n'
 '        }\n'
 '    </style>\n'
 '</head>\n'
 '\n'
 '<body>\n'
 '    <a href="https://github.com/requests/httpbin"

In [13]:
resp, data = http.request('http://google.com')

pprint(resp)

{'-content-encoding': 'gzip',
 'cache-control': 'private, max-age=0',
 'content-length': '20662',
 'content-location': 'http://www.google.com/',
 'content-security-policy-report-only': "object-src 'none';base-uri "
                                        "'self';script-src "
                                        "'nonce-i35JMGs3RksL_v4KNyQH0g' "
                                        "'strict-dynamic' 'report-sample' "
                                        "'unsafe-eval' 'unsafe-inline' https: "
                                        'http:;report-uri '
                                        'https://csp.withgoogle.com/csp/gws/other-hp',
 'content-type': 'text/html; charset=ISO-8859-1',
 'date': 'Mon, 22 Jan 2024 18:43:47 GMT',
 'expires': '-1',
 'p3p': 'CP="This is not a P3P policy! See g.co/p3phelp for more info."',
 'server': 'gws',
 'set-cookie': '1P_JAR=2024-01-22-18; expires=Wed, 21-Feb-2024 18:43:47 GMT; '
               'path=/; domain=.google.com; Secure, '
            

In [14]:
type(data)

bytes

In [15]:
data = data.decode('ISO-8859-1')

pprint(data)

('<!doctype html><html dir="rtl" itemscope="" '
 'itemtype="http://schema.org/WebPage" lang="ar"><head><meta '
 'content="text/html; charset=UTF-8" http-equiv="Content-Type"><meta '
 'content="/images/branding/googleg/1x/googleg_standard_color_128dp.png" '
 'itemprop="image"><title>Google</title><script '
 'nonce="i35JMGs3RksL_v4KNyQH0g">(function(){var '
 "_g={kEI:'Y7euZcHZIKWDhbIPp6-LyAU',kEXPI:'0,18167,780063,3,567234,207,4804,1132070,870537,327163,303262,77528,16115,19398,9286,23792,283,12036,2815,14765,4998,17075,38444,2872,2891,4140,7614,606,30668,19391,10631,2614,3783,9708,230,20583,4,28691,30926,27041,6633,7596,1,42160,2,39755,5679,1020,31123,4568,6255,17114,6307,1252,33064,2,2,1,10956,13670,2006,8155,8861,14490,20506,7,1922,9779,42459,3141,17058,68056,5122,3030,15816,1804,13806,33276,1635,9708,19568,474,12516,5224232,2,366,586,482,5992853,678,2806666,7475465,20554301,2375,43887,3,1603,3,262,3,234,3,2121276,2585,22636438,392913,12799,8408,4503,12162,10,3154,9859,4427,1225,9352,

### OPTIONS Method

#### An OPTIONS request shows which operations are allowed on the resource, such as GET, POST, PUT, DELETE and OPTIONS. It does not return any page data.

In [16]:
resp, data = http.request(bin_url, method = 'OPTIONS')

pprint(resp)

{'access-control-allow-credentials': 'true',
 'access-control-allow-methods': 'GET, POST, PUT, DELETE, PATCH, OPTIONS',
 'access-control-allow-origin': '*',
 'access-control-max-age': '3600',
 'allow': 'OPTIONS, HEAD, GET',
 'connection': 'keep-alive',
 'content-length': '0',
 'content-location': 'https://httpbin.org/',
 'content-type': 'text/html; charset=utf-8',
 'date': 'Mon, 22 Jan 2024 18:43:58 GMT',
 'server': 'gunicorn/19.9.0',
 'status': '200'}
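
The allow header in the output above is a comma-separated string. A minimal sketch for turning it into a set of method names follows; allowed_methods is a hypothetical helper, and a plain dict stands in for the response object.

```python
def allowed_methods(resp):
    """Return the set of methods listed in the 'allow' header (empty if absent)."""
    raw = resp.get("allow", "")
    return {m.strip().upper() for m in raw.split(",") if m.strip()}


# Header value copied from the OPTIONS output above.
sample = {"allow": "OPTIONS, HEAD, GET", "status": "200"}
print(sorted(allowed_methods(sample)))
print("POST" in allowed_methods(sample))
```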


### HEAD Method

#### To get only the header information of a webpage, send a HEAD request. It does not return any page data.

In [17]:
resp, data = http.request(bin_url, method = 'HEAD')

pprint(resp)

{'access-control-allow-credentials': 'true',
 'access-control-allow-origin': '*',
 'connection': 'keep-alive',
 'content-length': '9593',
 'content-location': 'https://httpbin.org/',
 'content-type': 'text/html; charset=utf-8',
 'date': 'Mon, 22 Jan 2024 18:44:03 GMT',
 'server': 'gunicorn/19.9.0',
 'status': '200'}
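
One use of HEAD is to check how large a page is before downloading it, since the output above still reports a content-length. The sketch below assumes an object with the same request(url, method=...) call used in this notebook; remote_size is a hypothetical helper name.

```python
def remote_size(http, url):
    """Return the content-length reported for url via HEAD, or None if absent."""
    resp, data = http.request(url, method="HEAD")
    length = resp.get("content-length")
    return int(length) if length is not None else None


# Usage in this notebook (needs network access, so it is left commented out):
# print(remote_size(http, bin_url))
```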


### GET Method

In [18]:
resp, data = http.request('https://httpbin.org/get', method = 'GET')

In [19]:
pprint(resp)

{'access-control-allow-credentials': 'true',
 'access-control-allow-origin': '*',
 'connection': 'keep-alive',
 'content-length': '292',
 'content-location': 'https://httpbin.org/get',
 'content-type': 'application/json',
 'date': 'Mon, 22 Jan 2024 18:44:06 GMT',
 'server': 'gunicorn/19.9.0',
 'status': '200'}


### POST Method

#### A POST request does not replace the data at the URL; it creates a new child of the existing resource. We use POST when we do not know the exact URL where the data will be stored.

#### http.request takes the request body as a string (or bytes), so here the data is written as a JSON-formatted string.

In [20]:
post_data = '{"name": "Alice", "college": "Harvard"}'
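
Instead of typing the JSON string by hand, it can be built from a dictionary with the standard library's json.dumps. This is a small sketch; the variable names payload and post_body are illustrative only.

```python
import json

# The same data as above, written as a Python dictionary.
payload = {"name": "Alice", "college": "Harvard"}

# json.dumps serializes the dictionary to a JSON string suitable for the body.
post_body = json.dumps(payload)
print(post_body)
print(len(post_body))
```

Building the string this way avoids quoting mistakes when the data contains nested values or special characters.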

#### We write the data to the website with POST. http.request provides a method parameter for choosing the HTTP method (the default is GET).

In [21]:
resp, data = http.request('https://httpbin.org/post', 
                          method = 'POST', 
                          body = post_data,
                          headers = {'content-type':'application/json'})

pprint(resp)

{'access-control-allow-credentials': 'true',
 'access-control-allow-origin': '*',
 'connection': 'keep-alive',
 'content-length': '521',
 'content-type': 'application/json',
 'date': 'Mon, 22 Jan 2024 18:44:15 GMT',
 'server': 'gunicorn/19.9.0',
 'status': '200'}


In [22]:
pprint(data.decode('UTF-8'))

('{\n'
 '  "args": {}, \n'
 '  "data": "{\\"name\\": \\"Alice\\", \\"college\\": \\"Harvard\\"}", \n'
 '  "files": {}, \n'
 '  "form": {}, \n'
 '  "headers": {\n'
 '    "Accept-Encoding": "gzip, deflate", \n'
 '    "Content-Length": "39", \n'
 '    "Content-Type": "application/json", \n'
 '    "Host": "httpbin.org", \n'
 '    "User-Agent": "Python-httplib2/0.22.0 (gzip)", \n'
 '    "X-Amzn-Trace-Id": "Root=1-65aeb77f-4d35d0d17b4d9bee07f66f78"\n'
 '  }, \n'
 '  "json": {\n'
 '    "college": "Harvard", \n'
 '    "name": "Alice"\n'
 '  }, \n'
 '  "origin": "45.245.78.252", \n'
 '  "url": "https://httpbin.org/post"\n'
 '}\n')
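
The decoded text above is itself JSON, so it can be parsed to read individual fields. The sketch below pulls out the json field that httpbin echoes back. The sample bytes are an abbreviated stand-in for the real response, and echoed_json is a hypothetical helper.

```python
import json


def echoed_json(data):
    """Parse an httpbin-style echo response and return its 'json' field."""
    return json.loads(data.decode("utf-8"))["json"]


# Abbreviated stand-in for the response bytes (most fields omitted).
sample = b'{"json": {"college": "Harvard", "name": "Alice"}, "url": "https://httpbin.org/post"}'
print(echoed_json(sample))
```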


### PUT Method

#### A PUT request replaces the data at the URL. We use PUT when we know the exact URL where the data will be written.
#### Here we write the data to the website using the PUT method.

In [23]:
resp, data = http.request('https://httpbin.org/put',  
                          method = 'PUT', 
                          body = post_data, 
                          headers = {'content-type':'application/json'})

pprint(resp)

{'access-control-allow-credentials': 'true',
 'access-control-allow-origin': '*',
 'connection': 'keep-alive',
 'content-length': '520',
 'content-type': 'application/json',
 'date': 'Mon, 22 Jan 2024 18:44:21 GMT',
 'server': 'gunicorn/19.9.0',
 'status': '200'}
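
The POST and PUT cells differ only in the URL and the method, so the shared part can be wrapped in one function. This is a sketch under the assumption that http is an object with the same request(url, method=..., body=..., headers=...) call used above; send_json is a hypothetical name.

```python
import json


def send_json(http, url, payload, method="POST"):
    """Serialize payload as JSON and send it with the given HTTP method."""
    body = json.dumps(payload)
    headers = {"content-type": "application/json"}
    return http.request(url, method=method, body=body, headers=headers)


# Usage in this notebook (needs network access, so it is left commented out):
# resp, data = send_json(http, 'https://httpbin.org/put',
#                        {"name": "Alice", "college": "Harvard"}, method='PUT')
```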


In [24]:
http.follow_redirects

True

In [25]:
http.follow_all_redirects

False

In [26]:
resp, data = http.request('https://httpbin.org/absolute-redirect/1',  
                          method = 'GET')

pprint(resp)

{'access-control-allow-credentials': 'true',
 'access-control-allow-origin': '*',
 'connection': 'keep-alive',
 'content-length': '291',
 'content-location': 'http://httpbin.org/get',
 'content-type': 'application/json',
 'date': 'Mon, 22 Jan 2024 18:44:25 GMT',
 'server': 'gunicorn/19.9.0',
 'status': '200'}


In [27]:
resp.previous

{'date': 'Mon, 22 Jan 2024 18:44:24 GMT',
 'content-type': 'text/html; charset=utf-8',
 'content-length': '251',
 'connection': 'keep-alive',
 'server': 'gunicorn/19.9.0',
 'location': 'http://httpbin.org/get',
 'access-control-allow-origin': '*',
 'access-control-allow-credentials': 'true',
 'status': '302',
 'content-location': 'https://httpbin.org/absolute-redirect/1'}
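
When redirects are followed, each response points back to the one before it through previous, as the output above shows. The sketch below walks that chain from the first hop to the final response. FakeResponse is a stand-in class invented for the demo, and redirect_chain and chain_statuses are hypothetical helpers.

```python
def redirect_chain(resp):
    """Return the responses from the first hop to the final one via .previous."""
    chain = []
    current = resp
    while current is not None:
        chain.append(current)
        current = getattr(current, "previous", None)
    chain.reverse()
    return chain


def chain_statuses(resp):
    """Return the status of every hop, oldest first."""
    return [r["status"] for r in redirect_chain(resp)]


class FakeResponse(dict):
    """Stand-in for a response: a dict of headers plus a .previous link."""
    previous = None


first = FakeResponse(status="302", location="http://httpbin.org/get")
final = FakeResponse(status="200")
final.previous = first
print(chain_statuses(final))  # ['302', '200']
```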

In [28]:
http.follow_redirects = False

In [29]:
resp, data = http.request('https://httpbin.org/absolute-redirect/1',  
                          method = 'GET')

pprint(resp)

{'access-control-allow-credentials': 'true',
 'access-control-allow-origin': '*',
 'connection': 'keep-alive',
 'content-length': '251',
 'content-type': 'text/html; charset=utf-8',
 'date': 'Mon, 22 Jan 2024 18:44:28 GMT',
 'location': 'http://httpbin.org/get',
 'server': 'gunicorn/19.9.0',
 'status': '302'}
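
With follow_redirects turned off, the 302 response above only tells us where to go next through its location header. A minimal sketch for resolving that target by hand follows; next_location is a hypothetical helper, the set of redirect codes is an assumption about which statuses to treat as redirects, and a plain dict stands in for the response.

```python
from urllib.parse import urljoin

# Status codes treated as redirects in this sketch (an assumption).
REDIRECT_STATUSES = {301, 302, 303, 307, 308}


def next_location(resp, request_url):
    """Return the absolute redirect target, or None if resp is not a redirect."""
    if int(resp["status"]) not in REDIRECT_STATUSES:
        return None
    location = resp.get("location")
    if location is None:
        return None
    # urljoin also handles relative targets such as '/get'.
    return urljoin(request_url, location)


# Values copied from the output above.
sample = {"status": "302", "location": "http://httpbin.org/get"}
print(next_location(sample, "https://httpbin.org/absolute-redirect/1"))
```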


In [30]:
resp, data = http.request('https://httpbin.org/redirect-to?url=https://google.com&status_code=200',  
                          method = 'GET')

pprint(resp)

{'access-control-allow-credentials': 'true',
 'access-control-allow-origin': '*',
 'connection': 'keep-alive',
 'content-length': '0',
 'content-type': 'text/html; charset=utf-8',
 'date': 'Mon, 22 Jan 2024 18:44:29 GMT',
 'location': 'https://google.com',
 'server': 'gunicorn/19.9.0',
 'status': '302'}


In [31]:
http.follow_redirects = True

In [32]:
resp, data = http.request('https://httpbin.org/redirect-to?url=https://google.com',  
                          method = 'GET')

pprint(resp)

{'-content-encoding': 'gzip',
 'alt-svc': 'h3=":443"; ma=2592000,h3-29=":443"; ma=2592000',
 'cache-control': 'private, max-age=0',
 'content-length': '20674',
 'content-location': 'https://www.google.com/',
 'content-security-policy-report-only': "object-src 'none';base-uri "
                                        "'self';script-src "
                                        "'nonce-GHPtmZRWWjiKoqLnimVSGg' "
                                        "'strict-dynamic' 'report-sample' "
                                        "'unsafe-eval' 'unsafe-inline' https: "
                                        'http:;report-uri '
                                        'https://csp.withgoogle.com/csp/gws/other-hp',
 'content-type': 'text/html; charset=ISO-8859-1',
 'date': 'Mon, 22 Jan 2024 18:44:32 GMT',
 'expires': '-1',
 'p3p': 'CP="This is not a P3P policy! See g.co/p3phelp for more info."',
 'server': 'gws',
 'set-cookie': '1P_JAR=2024-01-22-18; expires=Wed, 21-Feb-2024 18:44:32 GMT; '
   