## Finding the Hidden API in KLSE Screener

Created by [tanyongsheng.net](https://tanyongsheng.net)

----

### Introduction
Modern websites often load dynamic data through background requests to hidden (undocumented) APIs. This pattern is common in pagination, search, and other parts of a page that update without a full reload. In this notebook, we reverse-engineer one such API and extract the data directly from it.

### Target Website
[KLSE Screener](https://www.klsescreener.com/v2/news)

<img src="../assets/static/klsescreener-news-page.png" width=500px>

In [1]:
%pip install pandas
%pip install tqdm
%pip install requests
%pip install lxml

Note: you may need to restart the kernel to use updated packages.


In [8]:
import pandas
import json
from tqdm import tqdm
import requests
import time
from lxml import etree

headers = {
  'accept': 'application/json, text/javascript, */*; q=0.01',
  'accept-language': 'en-US,en;q=0.9,zh-CN;q=0.8,zh;q=0.7',
  'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/106.0.0.0 Safari/537.36',
  'x-requested-with': 'XMLHttpRequest',
}
params = {}
session = requests.Session()

def fetch_data(session, url, headers, params):
    # Send a GET request through the shared session and fail fast on HTTP errors
    response = session.get(url, headers=headers, params=params)
    response.raise_for_status()
    return response

def scrape_news(max_page_num):
    data_list = []
    params = {}  # local to each run, so a rerun never starts with a stale cursor
    for page_num in tqdm(range(1, max_page_num + 1)):
        url = f"https://www.klsescreener.com/v2/news/index/{page_num}"
        response = fetch_data(session, url, headers, params)
        data = response.json()

        # pass this page's "until" cursor to the next request
        params["until"] = data["paging"]["until"]

        data_list.append(data)
    return data_list

news_data = scrape_news(20)


100%|██████████| 20/20 [00:06<00:00,  3.33it/s]
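
The loop above is a form of cursor-based pagination: each response carries a `paging.until` value that is sent back with the next request. The sketch below isolates that pattern so it can be run without touching the network. `paginate`, `fake_fetch`, and `FAKE_FEED` are hypothetical names introduced for illustration, and the stop-on-empty-page rule is an assumption, not documented behaviour of the site.

```python
from typing import Callable, Optional

def paginate(fetch_page: Callable[[Optional[int]], dict], max_pages: int) -> list:
    """Collect up to max_pages payloads, feeding each page's cursor into the next call."""
    pages = []
    until = None  # the first request carries no cursor
    for _ in range(max_pages):
        payload = fetch_page(until)
        if not payload.get("data"):
            break  # assumption: an empty page means there is nothing older to fetch
        pages.append(payload)
        until = payload["paging"]["until"]
    return pages

# Hypothetical stand-in for the real endpoint: items are timestamps, newest first.
FAKE_FEED = [100, 90, 80, 70, 60, 50]

def fake_fetch(until: Optional[int]) -> dict:
    # Return the two newest items that are strictly older than the cursor.
    older = [t for t in FAKE_FEED if until is None or t < until][:2]
    return {"data": older, "paging": {"until": older[-1] if older else until}}

pages = paginate(fake_fetch, max_pages=10)
print([p["data"] for p in pages])
```

To try the same pattern against the real site, `fake_fetch` could be swapped for a function that calls `fetch_data` with `params={"until": until}`.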


In [23]:
# One row per page; the nested "paging" dict is flattened into a "paging.until" column
df = pandas.DataFrame(news_data)
pandas.json_normalize(df.to_dict(orient="records"))

Unnamed: 0,data,count,html,paging.until
0,"[{'News': {'id': '1287167', 'created': '2024-0...",20,"<div class=""item figure flex-block"">\n ...",1709381760
1,"[{'News': {'id': '1287123', 'created': '2024-0...",20,"<div class=""item figure flex-block"">\n ...",1709368533
2,"[{'News': {'id': '1287083', 'created': '2024-0...",20,"<div class=""item figure flex-block"">\n ...",1709342522
3,"[{'News': {'id': '1287031', 'created': '2024-0...",20,"<div class=""item figure flex-block"">\n ...",1709308800
4,"[{'News': {'id': '1286895', 'created': '2024-0...",20,"<div class=""item figure flex-block"">\n ...",1709296726
5,"[{'News': {'id': '1286809', 'created': '2024-0...",20,"<div class=""item figure flex-block"">\n ...",1709294335
6,"[{'News': {'id': '1286701', 'created': '2024-0...",20,"<div class=""item figure flex-block"">\n ...",1709290706
7,"[{'News': {'id': '1286661', 'created': '2024-0...",20,"<div class=""item figure flex-block"">\n ...",1709287860
8,"[{'News': {'id': '1286891', 'created': '2024-0...",20,"<div class=""item figure flex-block"">\n ...",1709285046
9,"[{'News': {'id': '1286843', 'created': '2024-0...",20,"<div class=""item figure flex-block"">\n ...",1709283008
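
The `paging.until` values look like Unix timestamps in seconds. That is an assumption based on their magnitude and on how they line up with the article dates, not something the site documents. Under that assumption, a minimal sketch for turning a cursor into a readable date is shown below; `cursor_to_datetime` and `MYT` are hypothetical names.

```python
from datetime import datetime, timedelta, timezone

# Assumption: "until" is a Unix timestamp in seconds.
MYT = timezone(timedelta(hours=8), name="MYT")  # Malaysia Time, UTC+8

def cursor_to_datetime(until: int, tz: timezone = timezone.utc) -> datetime:
    """Interpret a paging cursor as seconds since the Unix epoch."""
    return datetime.fromtimestamp(until, tz=tz)

first_cursor = cursor_to_datetime(1709381760)
first_cursor_local = cursor_to_datetime(1709381760, tz=MYT)
print(first_cursor.isoformat())        # 2024-03-02T12:16:00+00:00
print(first_cursor_local.isoformat())  # 2024-03-02T20:16:00+08:00
```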


In [30]:
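# Explode each page's "data" list into one row per article,
# carrying the page-level fields along as metadata columns
df = pandas.DataFrame(news_data)
df_new = pandas.json_normalize(
    df.to_dict(orient="records"),
    record_path=["data"],
    meta=["count", "html", "paging"],
)
df_new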

Unnamed: 0,News.id,News.created,News.modified,News.category,News.publisher_id,News.title,News.summary,News.date_post,News.thumbnail_url,News.author,News.date_added,News.img_url,0.content,Publisher.id,Publisher.name,Publisher.url,count,html,paging
0,1287167,2024-03-03 00:30:52,2024-03-03 00:30:52,国际财经,64,抢搭AI电动车热潮 台积电全球招募8000人,（台北3日讯）台积电搭上AI、电动车等热潮，全力大扩张，宣布要在台湾大手笔征才约1700人，...,2024-03-03 00:25:16,//images.weserv.nl/?url=//www.enanyang.my/site...,,2024-03-03 00:25:16,https://www.enanyang.my/sites/default/files/st...,<div><div><div><div><div><div><div><img loadin...,64,Nanyang,http://www.enanyang.my/财经/国际财经/,20,"<div class=""item figure flex-block"">\n ...",{'until': 1709381760}
1,1287165,2024-03-03 00:30:51,2024-03-03 00:30:51,国际财经,64,创业容易传业难 大华银行愁继承人,彭博评论打造一个庞大商业帝国是一回事，能否在家族内传承则是另一回事。在95岁的黄祖耀月初去世...,2024-03-03 00:22:17,//images.weserv.nl/?url=//www.enanyang.my/site...,,2024-03-03 00:22:17,https://www.enanyang.my/sites/default/files/st...,<div><div><div><div><div><div><div><img loadin...,64,Nanyang,http://www.enanyang.my/财经/国际财经/,20,"<div class=""item figure flex-block"">\n ...",{'until': 1709381760}
2,1287163,2024-03-03 00:30:21,2024-03-03 00:30:21,财经新闻,12,培养更多技职毕业生 农业部重点发展4产业,莫哈末沙布（右二）参观农业专家博览会各个展位。（沙登2日讯）农业及粮食安全部长拿督斯里莫哈末...,2024-03-03 00:22:17,//images.weserv.nl/?url=//www.enanyang.my/site...,,2024-03-03 00:22:17,https://www.enanyang.my/sites/default/files/st...,<div><div><div><div><div><div><div><img loadin...,12,Nanyang,http://www.enanyang.my/财经/财经新闻,20,"<div class=""item figure flex-block"">\n ...",{'until': 1709381760}
3,1287161,2024-03-03 00:30:20,2024-03-03 00:30:20,财经新闻,12,加强与大马各领域合作 乌克兰致力重建经济,（吉隆坡2日讯）乌克兰准备加强与大马在各领域的交流并扩大合作，以致力于重建在过去两年因俄罗斯...,2024-03-03 00:22:17,//images.weserv.nl/?url=//www.enanyang.my/site...,,2024-03-03 00:22:17,https://www.enanyang.my/sites/default/files/st...,<div><div><div><p>（吉隆坡2日讯）乌克兰准备加强与大马在各领域的交流并扩大...,12,Nanyang,http://www.enanyang.my/财经/财经新闻,20,"<div class=""item figure flex-block"">\n ...",{'until': 1709381760}
4,1287159,2024-03-03 00:30:19,2024-03-03 00:30:19,财经新闻,12,农业部拟稻米行动计划 供3月20日会议审议,（吉隆坡2日讯）农业及粮食安全部将与稻米及白米行业利益相关者，就稻谷收购价、供应因素、进口白...,2024-03-03 00:22:17,//images.weserv.nl/?url=//www.enanyang.my/site...,,2024-03-03 00:22:17,https://www.enanyang.my/sites/default/files/st...,<div><div><div><p>（吉隆坡2日讯）农业及粮食安全部将与稻米及白米行业利益相...,12,Nanyang,http://www.enanyang.my/财经/财经新闻,20,"<div class=""item figure flex-block"">\n ...",{'until': 1709381760}
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
395,1286169,2024-02-29 21:51:06,2024-03-01 11:34:32,Edge,52,Syariah Criminal Offences Enactment in Perak s...,The Syariah Criminal Offences Enactment in Per...,2024-02-29 21:29:53,https://assets.theedgemarkets.com/Saarani_shar...,,2024-02-29 21:29:53,https://assets.theedgemarkets.com/Saarani_shar...,<div class=newsTextDataWrapInner><p>IPOH (Feb ...,52,TheEdge,http://www.theedgemarkets.com/categories/malaysia,20,"<div class=""item figure flex-block"">\n ...",{'until': 1709211906}
396,1286165,2024-02-29 21:32:02,2024-03-01 11:30:29,新闻,52,DRB-Hicom posts RM26.5m net profit in 4Q amid ...,DRB-Hicom Bhd posted a net profit of RM26.47 m...,2024-02-29 21:26:04,https://assets.theedgemarkets.com/drb-hicom-2_...,,2024-02-29 21:26:04,https://assets.theedgemarkets.com/drb-hicom-2_...,<div class=newsTextDataWrapInner><p>KUALA LUMP...,52,TheEdge,http://www.theedgemarkets.com/categories/malaysia,20,"<div class=""item figure flex-block"">\n ...",{'until': 1709211906}
397,1286163,2024-02-29 21:32:01,2024-03-01 11:04:42,,52,June 13 hearing for govt's application to stri...,The High Court has fixed June 13 to hear the g...,2024-02-29 21:15:00,https://assets.theedgemarkets.com/wayhta_theed...,,2024-02-29 21:15:00,https://assets.theedgemarkets.com/wayhta_theed...,<div class=newsTextDataWrapInner><p>KUALA LUMP...,52,TheEdge,http://www.theedgemarkets.com/categories/malaysia,20,"<div class=""item figure flex-block"">\n ...",{'until': 1709211906}
398,1286161,2024-02-29 21:18:54,2024-03-03 02:46:13,新闻,53,成本和燃油价格高涨 亚航长程末季净利暴跌82%,（吉隆坡29日讯）尽管营业额大涨，但维护与检修成本增加、员工成本高涨，以及燃料成本上涨，重创...,2024-02-29 21:07:23,https://assets.theedgemarkets.com/AirAsia-X_fu...,,2024-02-29 21:07:23,https://assets.theedgemarkets.com/AirAsia-X_fu...,<div class=newsTextDataWrapInner><p>（吉隆坡29日讯）尽...,53,TheEdge,http://www.theedgemarkets.com/categories/news,20,"<div class=""item figure flex-block"">\n ...",{'until': 1709211906}
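
In the table above, the `paging` column still holds a dict such as `{'until': 1709381760}`. One possible variation, sketched here on made-up sample data shaped like the API response, is to give `meta` a nested path so that the cursor lands in its own scalar column. The sample ids and titles are placeholders, not real articles.

```python
import pandas

# Made-up pages that mimic the shape of the API response
sample_pages = [
    {
        "data": [
            {"News": {"id": "1", "title": "a"}},
            {"News": {"id": "2", "title": "b"}},
        ],
        "count": 2,
        "paging": {"until": 1709381760},
    },
    {
        "data": [{"News": {"id": "3", "title": "c"}}],
        "count": 1,
        "paging": {"until": 1709368533},
    },
]

# A nested meta path (["paging", "until"]) yields a flat "paging.until" column
flat = pandas.json_normalize(
    sample_pages,
    record_path=["data"],
    meta=["count", ["paging", "until"]],
)
print(list(flat.columns))
```

Applied to `news_data`, the same call would replace the dict-valued `paging` column with a plain `paging.until` column.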


## Additional info: How to find the hidden API endpoint with Chrome Developer Tools

1. Open the website: [https://www.klsescreener.com/v2/news](https://www.klsescreener.com/v2/news) in your Chrome browser and open the Chrome Developer Tools by pressing Ctrl + Shift + I on your keyboard.

2. Go to the "Network" tab in DevTools and select "Fetch/XHR".

3. Scroll down so the page loads the next batch of news, which triggers the AJAX request to the API, as shown below:

<br>
<img src="../assets/static/klsescreener-news-page-finding-API.jpg" width=500px alt="KLSE Screener news page with the DevTools Network tab showing the API request">
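
Once a request shows up in the Network tab, its URL can be copied and split into the pieces that `requests` expects. The sketch below assumes the copied URL has the same shape as the one built in the scraping code above, that is, a `/v2/news/index/{page}` path plus an `until` query parameter. `copied_url` is an illustrative value, not a capture from the site.

```python
from urllib.parse import parse_qs, urlsplit, urlunsplit

# Illustrative URL in the shape the scraping code sends (an assumption)
copied_url = "https://www.klsescreener.com/v2/news/index/2?until=1709381760"

parts = urlsplit(copied_url)

# Endpoint without the query string, suitable for the `url` argument
base_url = urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))

# Query string as a flat dict, suitable for the `params` argument
query_params = {key: values[0] for key, values in parse_qs(parts.query).items()}

print(base_url)
print(query_params)
```

Keeping the query parameters in a dict, rather than hard-coding them in the URL string, makes it easier to change the cursor between requests.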



## Computing Environment

In [1]:
%load_ext watermark

%watermark

# print out pypi packages used
%watermark --iversions

# date
%watermark -u -n -t -z

Last updated: 2024-03-03T11:46:04.300811+08:00

Python implementation: CPython
Python version       : 3.9.16
IPython version      : 8.8.0

Compiler    : MSC v.1916 64 bit (AMD64)
OS          : Windows
Release     : 10
Machine     : AMD64
Processor   : Intel64 Family 6 Model 142 Stepping 12, GenuineIntel
CPU cores   : 8
Architecture: 64bit


Last updated: Sun Mar 03 2024 11:46:04 Malay Peninsula Standard Time

