### Use the urllib Module
- ‘urllib’ is not a single stand-alone module; it is a package whose submodules (urllib.request, urllib.error, urllib.parse, and urllib.robotparser) handle different aspects of working with URLs.
- The submodule we’ll use to open and read URLs is urllib.request.
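
As a quick check of the package layout described above, the sketch below imports the four submodules that ship with the standard library and confirms that `urllib` itself is a package. Nothing here touches the network; the `submodules` dictionary is only an illustrative way to list them.

```python
import urllib
import urllib.error
import urllib.parse
import urllib.request
import urllib.robotparser

# A package (unlike a plain module) exposes a __path__ attribute
print(hasattr(urllib, "__path__"))

# Each submodule covers one aspect of URL handling
submodules = {
    "urllib.request": urllib.request.urlopen,                  # opening and reading URLs
    "urllib.error": urllib.error.URLError,                     # exceptions raised by urllib.request
    "urllib.parse": urllib.parse.urlparse,                     # splitting and building URLs
    "urllib.robotparser": urllib.robotparser.RobotFileParser,  # reading robots.txt files
}
for name, member in submodules.items():
    print(name, "->", member.__name__)
```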

In [3]:
# importing the module 
import urllib


### Fetching URLs with urllib.request


In [4]:
# Import the request submodule
import urllib.request

# URL to fetch
url = "http://google.com"

# Open the URL; urlopen() returns a response object
response = urllib.request.urlopen(url)


### Reading Contents

In [5]:
# Read the response data.
# read() returns bytes, so we call decode() to convert them into a string.
html = response.read().decode()
print(html)

<!doctype html><html itemscope="" itemtype="http://schema.org/WebPage" lang="en-PK"><head><meta content="text/html; charset=UTF-8" http-equiv="Content-Type"><meta content="/logos/doodles/2025/new-years-day-2025-6753651837110593.2-law.gif" itemprop="image"><meta content="New Year's Day 2025" property="twitter:title"><meta content="New Year's Day 2025 #GoogleDoodle" property="twitter:description"><meta content="New Year's Day 2025 #GoogleDoodle" property="og:description"><meta content="summary_large_image" property="twitter:card"><meta content="@GoogleDoodles" property="twitter:site"><meta content="https://www.google.com/logos/doodles/2025/new-years-day-2025-6753651837110593-2xa.gif" property="twitter:image"><meta content="https://www.google.com/logos/doodles/2025/new-years-day-2025-6753651837110593-2xa.gif" property="og:image"><meta content="1150" property="og:image:width"><meta content="460" property="og:image:height"><meta content="https://www.google.com/logos/doodles/2025/new-years-d
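
A slightly more defensive version of the read step closes the connection with a `with` block and takes the character set from the response headers instead of relying on the default of `decode()`. This is only a sketch: the helper name `fetch_text` is hypothetical, and the `data:` URL is a stand-in chosen so the example runs without network access.

```python
import urllib.request

def fetch_text(url, fallback_encoding="utf-8"):
    """Fetch a URL and return its body as a string (hypothetical helper)."""
    # The with block closes the response even if read() raises
    with urllib.request.urlopen(url) as response:
        raw = response.read()  # bytes
        # Use the charset declared in the Content-Type header, if any
        charset = response.headers.get_content_charset() or fallback_encoding
    return raw.decode(charset)

# A data: URL keeps the example offline; an http(s) URL is passed the same way
text = fetch_text("data:text/plain;charset=utf-8,Hello%20urllib")
print(text)
```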

### Sending HTTP Requests: GET and POST

In [6]:
# A GET request

# import json

# Open the URL and read the response
response = urllib.request.urlopen('https://jsonplaceholder.typicode.com/posts/1')
data = response.read().decode()

# Parse the JSON data (optional)
# parsed_data = json.loads(data)

print(data)


{
  "userId": 1,
  "id": 1,
  "title": "sunt aut facere repellat provident occaecati excepturi optio reprehenderit",
  "body": "quia et suscipit\nsuscipit recusandae consequuntur expedita et cum\nreprehenderit molestiae ut ut quas totam\nnostrum rerum est autem sunt rem eveniet architecto"
}
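
Because the body printed above is JSON text, it can be turned into a Python dictionary with `json.loads()`. The sketch below uses a shortened copy of that response as a literal string (the `body` field is left out) so it runs without a network call; with a live request you would pass the decoded response text instead.

```python
import json

# Shortened copy of the response body shown above
data = """{
  "userId": 1,
  "id": 1,
  "title": "sunt aut facere repellat provident occaecati excepturi optio reprehenderit"
}"""

parsed_data = json.loads(data)  # str -> dict

print(type(parsed_data).__name__)
print(parsed_data["id"])
print(parsed_data["title"])
```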


In [2]:
# A POST request

import urllib.parse
import urllib.request

# URL-encode the form fields and convert the result to bytes
data = urllib.parse.urlencode({'id':'1002','name':'New data'}).encode()

# Make a POST request to JSONPlaceholder (a free fake online REST API for testing and prototyping).
# Passing data= makes urlopen() send a POST instead of a GET.
response = urllib.request.urlopen("http://jsonplaceholder.typicode.com/posts", data=data)

# Read and decode the response
response_data = response.read().decode()
print("Response Status: ", response.status)
print("Response Data: ", response_data)


Response Status:  201
Response Data:  {
  "id": 101,
  "name": "New data"
}
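
Many APIs expect a JSON body rather than form-encoded data. One way to do that with `urllib` is to build a `Request` with JSON bytes and a `Content-Type` header. The sketch below only constructs and inspects the request; the actual send is left commented out so nothing goes over the network. The payload mirrors the form data used above, and the choice of header value is an assumption about what a typical JSON API accepts.

```python
import json
import urllib.parse
import urllib.request

payload = {"id": "1002", "name": "New data"}

# Form encoding, as in the POST cell above
form_body = urllib.parse.urlencode(payload).encode()
print(form_body)

# JSON encoding of the same payload
json_body = json.dumps(payload).encode("utf-8")

req = urllib.request.Request(
    "https://jsonplaceholder.typicode.com/posts",
    data=json_body,
    headers={"Content-Type": "application/json; charset=utf-8"},
    method="POST",
)

print(req.get_method())
print(req.data)

# Uncomment to send the request:
# with urllib.request.urlopen(req) as response:
#     print(response.status, response.read().decode())
```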


### Digging Deeper into Python's urllib Module

In [11]:
import http.cookiejar
import urllib.request

# Create a cookie jar
jar = http.cookiejar.CookieJar()

# Create a URL opener that stores cookies in our jar
opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))

# Use the opener to fetch a URL
response = opener.open("https://jsonplaceholder.typicode.com/posts/1")

# Print any cookies the server set
for cookie in jar:
    print(cookie)

### Error Handling

In [13]:
import urllib.request
from urllib.error import HTTPError, URLError

try:
    response = urllib.request.urlopen('https://google.com')
except HTTPError as e:
    # HTTPError is a subclass of URLError and also has a 'reason',
    # so it must be caught first or this branch is never reached
    print("The server couldn't fulfil the request.")
    print("Error code: ", e.code)
except URLError as e:
    print("We failed to reach a server.")
    print("Reason: ", e.reason)
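
`HTTPError` is a subclass of `URLError`, so when both are caught the more specific `except HTTPError` clause should come first. The sketch below wraps that ordering in a hypothetical helper, `describe_fetch`, and exercises it offline: a `data:` URL stands in for a successful request, and a made-up URL scheme triggers a `URLError` without contacting any server.

```python
import urllib.request
from urllib.error import HTTPError, URLError

def describe_fetch(url, timeout=10):
    """Return a short description of what happened (hypothetical helper)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return f"OK: read {len(response.read())} bytes"
    except HTTPError as e:
        # The server answered, but with an error status such as 404 or 500
        return f"HTTP error: {e.code}"
    except URLError as e:
        # No usable answer: DNS failure, refused connection, unknown scheme, ...
        return f"Failed to reach a server: {e.reason}"

print(issubclass(HTTPError, URLError))
print(describe_fetch("data:text/plain,hello"))
print(describe_fetch("nosuchscheme://example"))
```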

### Adding Headers to Your Request
- When working with web APIs, you often need to include additional headers in your request. With ‘urllib’, you pass them to a Request object through its headers argument:

In [4]:
# Import Request and urlopen
from urllib.request import Request, urlopen

# Define the URL
url = 'http://facebook.com'

# Create a Request object with a custom User-Agent header
req = Request(url, headers={'User-Agent':'Mozilla/5.0'})

# Pass the Request object to urlopen and read the body (bytes)
response = urlopen(req).read()

# Decode the bytes before printing; iterating over a bytes object
# directly yields one integer per byte instead of text
print(response.decode('utf-8'))

<!DOCTYPE html>
<html lang="ur" id="facebook" class="no_js">
<head><meta charset="utf-8" /><meta name="referrer" content="default" id="meta_referrer" /><script nonce="aOpFXV32">function envFlush(a){function b(b){for(var c in a)b[c]=a[c]}window.requireLazy?window.requireLazy(["Env"]
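
To see which headers a request will carry, and which headers came back, you can inspect the `Request` and response objects directly. The sketch below is offline: it builds a request for a placeholder URL without sending it, and uses a `data:` URL as a stand-in response. The `Accept` header and the `example.com` address are illustrative additions.

```python
from urllib.request import Request, urlopen

req = Request("http://example.com", headers={"User-Agent": "Mozilla/5.0"})

# Headers can also be added after construction
req.add_header("Accept", "text/html")

# urllib normalizes header names with str.capitalize(), e.g. 'User-agent'
for name, value in req.header_items():
    print(name, ":", value)

# Response headers are available on the response object
with urlopen("data:text/html;charset=utf-8,hi") as response:
    response_headers = dict(response.headers.items())

for name, value in response_headers.items():
    print(name, ":", value)
```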

### URL Parsing
* ‘urllib’ also provides utilities for URL parsing. These come in handy when dealing with complex URLs.

In [6]:
from urllib.parse import urlparse

url = 'https://jsonplaceholder.typicode.com/posts/1'

# Split the URL into its components
url_components = urlparse(url)

print(url_components.scheme) # https
print(url_components.netloc) # jsonplaceholder.typicode.com
print(url_components.path)   # /posts/1
print(url_components.query)  # empty string: this URL has no query part

https
jsonplaceholder.typicode.com
/posts/1
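
The URL above has no query string, so `query` prints as an empty line. To show the remaining pieces, the sketch below parses a made-up URL with a query and a fragment (the `example.com` address and its parameters are illustrative only), turns the query into a dictionary with `parse_qs`, and rebuilds a modified URL.

```python
from urllib.parse import parse_qs, urlencode, urljoin, urlparse

url = "https://example.com/api/v1/data?id=1001&tag=a&tag=b#top"
parts = urlparse(url)

print(parts.scheme)    # https
print(parts.netloc)    # example.com
print(parts.path)      # /api/v1/data
print(parts.query)     # id=1001&tag=a&tag=b
print(parts.fragment)  # top

# parse_qs maps each key to a list, since keys may repeat
params = parse_qs(parts.query)
print(params)          # {'id': ['1001'], 'tag': ['a', 'b']}

# Change one parameter and rebuild the URL
params["id"] = ["2002"]
new_url = parts._replace(query=urlencode(params, doseq=True)).geturl()
print(new_url)

# Resolve a relative path against the original URL
joined = urljoin(url, "../v2/data")
print(joined)
```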



In [9]:
from urllib.request import urlopen, Request

url = "https://facebook.com"
# headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.3'}

try:
    request = Request(url)
    response = urlopen(request)
    html = response.read()
    html_content = html.decode('utf-8')
    print(html_content)
except Exception as e:
    print("An error occurred:", e)


<!DOCTYPE html>
<html lang="ur" id="facebook" class="no_js">
<head><meta charset="utf-8" /><meta name="referrer" content="default" id="meta_referrer" /><script nonce="BUofRrms">function envFlush(a){function b(b){for(var c in a)b[c]=a[c]}window.requireLazy?window.requireLazy(["Env"],b):(window.Env=window.Env||{},b(window.Env))}envFlush({"useTrustedTypes":false,"isTrustedTypesReportOnly":false,"ajaxpipe_token":"AXhmUT81-_iAdwZMbj0","stack_trace_limit":30,"timesliceBufferSize":5000,"show_invariant_decoder":false,"compat_iframe_token":"AUXAiDHChewUIHWKWJwoc0K_9u4","isCQuick":false,"brsid":"7454923681511825327"});</script><script nonce="BUofRrms">window.openDatabase&&(window.openDatabase=function(){throw new Error()});</script><script nonce="BUofRrms">_btldr={};</script><script nonce="BUofRrms">function parentIsNotHeadNorBody(a){return a.parentElement!==document.body&&a.parentElement!==document.head}function isTagSupported(a){return a.nodeName==="SCRIPT"||a.nodeName==="LINK"&&((a=getNodeDat