The headers dictionary you provided adds extra information to your HTTP request. Here it carries two fields: Cookie, which is often used to maintain session information, and User-Agent, which identifies the browser or other client making the request.

## Cookie:

The Cookie field in the headers is used to send previously stored information (cookies) back to the server. Cookies are often used to maintain session state or store user-specific information.
Without proper session information (handled by cookies), some websites may not recognize your requests as part of an authenticated session, leading to issues with accessing protected resources or performing actions that require authentication.
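
To make the session mechanism concrete, the sketch below splits a Cookie header string into its individual name/value pairs with Python's standard-library http.cookies module. The cookie string is a shortened, made-up placeholder (not the real values from this notebook), and parse_cookie_header is a hypothetical helper name.

```python
from http.cookies import SimpleCookie


def parse_cookie_header(cookie_header):
    """Split a Cookie header string into a {name: value} dictionary."""
    jar = SimpleCookie()
    jar.load(cookie_header)
    return {name: morsel.value for name, morsel in jar.items()}


# Made-up placeholder values; a real header would be copied from the browser.
example_cookie = "PHPSESSID=abc123; theme=dark"
cookies = parse_cookie_header(example_cookie)
print(cookies)
```

On a PHP site, PHPSESSID is the default name of the cookie that ties requests to a server-side session. requests also accepts a dictionary like this through its cookies= argument, as an alternative to writing the raw Cookie header by hand.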

## User-Agent:

The User-Agent field identifies the type of browser or user agent making the request. Some websites may serve different content or apply different behavior based on the detected user agent.
Sending a browser-like User-Agent makes the request look as if it comes from that browser, which reduces the chance that the site blocks it or serves different content than a real browser would receive.
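
To see what the server would receive with and without a custom User-Agent, the sketch below prepares two requests without sending them and prints the header each one would carry. It relies on requests.Session.prepare_request, so no network access is needed. The example.com URL is a placeholder, and the browser string is the one used in this notebook's request.

```python
import requests

BROWSER_UA = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/118.0.0.0 Safari/537.36"
)
PLACEHOLDER_URL = "https://example.com/"  # placeholder; nothing is actually sent

session = requests.Session()

# Prepared with the library's default headers only
default_request = session.prepare_request(requests.Request("GET", PLACEHOLDER_URL))

# Prepared with a browser-like User-Agent overriding the default
custom_request = session.prepare_request(
    requests.Request("GET", PLACEHOLDER_URL, headers={"User-Agent": BROWSER_UA})
)

print("default:", default_request.headers["User-Agent"])
print("custom: ", custom_request.headers["User-Agent"])
```

The default value starts with python-requests/, which makes scripted traffic easy for a site to tell apart from a browser.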

In summary, these headers let the server recognize your request as part of an authenticated session and identify the type of client (browser) making it. This can be crucial for websites that require authentication or tailor their responses to the client's characteristics. Without these headers, the server might treat your request differently or fail to associate it with a valid session.

In [84]:
import requests

url = "https://cmci.dti.gov.ph/rankings-data.php?unit=1st%20to%202nd%20Class%20Municipalities"

headers = {
    "Cookie": "PHPSESSID=efcfnio3gnhiuq5dg0h5kn6ngd; 42205=1699759017502-513594619; _ga=GA1.1.108069233.1699759020; Lyp1CWKh=AxlSzcGLAQAAhWcardUZgXeYyFFkp8FGW2gwSZt8dMnkLrEAj3az1q_Hty_GAbS-P6WuctQxwH8AAEB3AAAAAA==; TS01e0ca52=01ba3f5e9682f52132a7caed69ce5d2f85a23d39b79bf080f19d7af1cd2efbe36046eff2f1796b0f1b71c6a018e39722b51dabdf5e; OClmoOot=A-VNiMGLAQAAWKNYkFcQe0sydi9bFTxmsHfVB-ZvQpr1ote12KeqWyVDlRiMAbS-P6WuctQxwH8AAEB3AAAAAA|1|1|0089bd88909c9a03e9ee6e24ea0650382dbf5df0; _ga_W4V0QFCBR4=GS1.1.1699766699.3.0.1699766699.0.0.0; 422003=oBUWdYAUAjybn03Bmf3oBsSmzcqsvoRLAPILYVkJlmZM0Hir0xHy20C/ION0e5lH/A08nWfco1o6Lo4GioLQRYXr8gFpl6i6giqSmFhH1isTs4+lWZwyyX78tgIiiu3gBZmxo+2pLBQgTDRubY+aQpdaKyXz2gITK97OOkS23KHYBEvT",
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/118.0.0.0 Safari/537.36",
}

# A timeout keeps the call from hanging indefinitely if the server does not respond
response = requests.get(url, headers=headers, timeout=30)

# Check if the request was successful (status code 200)
if response.status_code == 200:
    # Print or do something with the response content
    print(response.text)
else:
    print(f"Failed to retrieve the page. Status code: {response.status_code}")


<!DOCTYPE html>
<html lang="en-US" class="no-js">
<head><script src="/common.js?matcher"></script><script src="/common.js?single"></script>
	<meta charset="UTF-8">
	<meta name="viewport" content="width=device-width, initial-scale=1">
  <title>Rankings Data - Cities and Municipalities Competitive Index</title>
  <meta content="width=device-width, initial-scale=1.0" name="viewport">
  <meta content="" name="keywords">
  <meta content="" name="description">
	
	<meta name="robots" content="index, follow" />
	<meta name="googlebot" content="index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1" />
	<meta name="bingbot" content="index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1" />
	<link rel="canonical" href="https://dict.gov.ph/learn-ict/ict-trainings/web-design/" />
	<meta property="og:locale" content="en_US" />
	<meta property="og:type" content="article" />
	<meta property="og:title" content="Cities and Municipalities Competitiveness Inde

Try removing the Cookie or User-Agent header to see how the response changes.
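
One way to run that experiment systematically is to generate every variant of the headers dictionary first and then request each one in turn. The sketch below only builds the variants; header_variants is a hypothetical helper, and the header values are short placeholders rather than the real ones.

```python
def header_variants(headers):
    """Yield (label, headers) pairs: all headers, each one removed in turn, then none."""
    yield "all headers", dict(headers)
    for name in headers:
        yield f"without {name}", {k: v for k, v in headers.items() if k != name}
    yield "no headers", {}


# Placeholder values; substitute the real headers dictionary from the cell above.
example_headers = {
    "Cookie": "PHPSESSID=example",
    "User-Agent": "Mozilla/5.0 (example)",
}

variants = list(header_variants(example_headers))
for label, variant in variants:
    print(label, "->", sorted(variant))
```

Each variant can then be passed to requests.get(url, headers=variant), comparing response.status_code and len(response.text) across the runs.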

In [85]:
from bs4 import BeautifulSoup

In [86]:
# Parse the HTML and collect every <tbody> element on the page
soup = BeautifulSoup(response.content, "html.parser")
tbody = soup.find_all("tbody")

# 2023 RANKINGS OF 1ST TO 2ND CLASS MUNICIPALITIES

In [87]:
len(tbody)

2

In [88]:
column_names = [z.text for z in soup.find_all("th")]

In [89]:
column_names[0:5]

['Rank', 'Score', 'LGU', 'Province', 'Region']

In [90]:
# Each pillar header spans two data columns, so expand it into a _rank and a _score name
rank_score = [
    f"{name}_{suffix}"
    for name in column_names[5:10]
    for suffix in ("rank", "score")
]
rank_score

['Economic Dynamism_rank',
 'Economic Dynamism_score',
 'Government Efficiency_rank',
 'Government Efficiency_score',
 'Infrastructure_rank',
 'Infrastructure_score',
 'Resiliency_rank',
 'Resiliency_score',
 'Innovation_rank',
 'Innovation_score']

In [91]:
final_names = column_names[0:5]+rank_score
print(len(final_names))
final_names

15


['Rank',
 'Score',
 'LGU',
 'Province',
 'Region',
 'Economic Dynamism_rank',
 'Economic Dynamism_score',
 'Government Efficiency_rank',
 'Government Efficiency_score',
 'Infrastructure_rank',
 'Infrastructure_score',
 'Resiliency_rank',
 'Resiliency_score',
 'Innovation_rank',
 'Innovation_score']

In [79]:
import pandas as pd

# Build one list of cell texts per table row in the first <tbody>
mun_list = []
for row in tbody[0].find_all("tr"):
    row_list = [cell.text for cell in row.find_all("td")]
    mun_list.append(row_list)

# Name the columns while constructing the DataFrame
mun_df = pd.DataFrame(mun_list, columns=final_names)
mun_df

,Rank,Score,LGU,Province,Region,Economic Dynamism_rank,Economic Dynamism_score,Government Efficiency_rank,Government Efficiency_score,Infrastructure_rank,Infrastructure_score,Resiliency_rank,Resiliency_score,Innovation_rank,Innovation_score
0,1st,47.2492,Cainta,Rizal,REGION IV-A (CALABARZON),1,10.4160,13,10.7624,3,5.1031,28,12.1040,11,8.8637
1,2nd,45.0574,Taytay (RL),Rizal,REGION IV-A (CALABARZON),2,8.2441,34,10.2335,2,5.3807,23,12.1592,9,9.0399
2,3rd,44.0800,Baliwag,Bulacan,REGION III (Central Luzon),11,6.4308,22,10.4303,5,4.8799,8,12.8725,6,9.4665
3,4th,43.7654,Carmona,Cavite,REGION IV-A (CALABARZON),3,7.6384,3,11.2303,26,3.7397,83,11.6230,3,9.5340
4,5th,43.3141,San Mateo (RL),Rizal,REGION IV-A (CALABARZON),25,5.0986,113,9.6128,7,4.7735,1,14.3093,4,9.5199
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
506,506th,17.0635,Kabugao,Apayao,CAR - Cordillera Administrative Region,494,1.7508,488,5.8737,498,0.6329,494,6.7891,463,2.0170
507,507th,16.3494,Maguing,Lanao Del Sur,BARMM - Bangsamoro Autonomous Region in Muslim...,498,1.3293,502,4.5967,474,1.8367,492,8.5527,506,0.0340
508,508th,16.0160,Tuburan (BA),Basilan,BARMM - Bangsamoro Autonomous Region in Muslim...,503,0.4697,497,4.8976,501,0.1859,442,10.4598,509,0.0030
509,509th,14.1143,Parang (SU),Sulu,BARMM - Bangsamoro Autonomous Region in Muslim...,499,1.1682,492,5.5530,467,1.9248,496,4.4533,487,1.0150
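
The row-extraction loop above can be tried offline on a small hand-written table before pointing it at the live page. In this sketch the HTML, names, and numbers are invented for illustration; only the BeautifulSoup calls mirror the notebook.

```python
from bs4 import BeautifulSoup

# Hypothetical two-row table; names and numbers are made up for illustration.
SAMPLE_HTML = """
<table>
  <thead><tr><th>Rank</th><th>LGU</th><th>Score</th></tr></thead>
  <tbody>
    <tr><td>1st</td><td>Alpha</td><td>47.25</td></tr>
    <tr><td>2nd</td><td>Beta</td><td>45.06</td></tr>
  </tbody>
</table>
"""

sample_soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# Column names come from the <th> cells
sample_columns = [th.get_text(strip=True) for th in sample_soup.find_all("th")]

# One list of cell texts per <tr> inside the first <tbody>
rows = [
    [td.get_text(strip=True) for td in tr.find_all("td")]
    for tr in sample_soup.find("tbody").find_all("tr")
]

# Sanity check: every row has exactly one cell per column name
assert all(len(row) == len(sample_columns) for row in rows)

print(sample_columns)
print(rows)
```

The same length check is useful on the real page, since a mismatch between the number of names and the number of cells makes the DataFrame column assignment fail.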
