# Session 1 - Fetching Data

## Downloading files

We can use `urlretrieve` from the `urllib.request` module to download a file.

For example, the `urlretrieve` call below downloads the CPTTM logo image from the CPTTM website.

In [None]:
from datetime import datetime

datetime.now()

datetime.datetime(2023, 8, 18, 19, 2, 52, 202466)

In [None]:
from urllib.request import urlretrieve

# The remote file is a PNG, so save it under a matching .png file name.
urlretrieve('https://www.cpttm.org.mo/images/cpttm-logo.png', 'cpttm.png')

('cpttm.png', <http.client.HTTPMessage at 0x1816a6c1d08>)

In [None]:
'''Download charts from the AAStocks server for the given stock numbers.'''

from urllib.request import urlretrieve

stock_numbers = ['0001','0005','0011','0700','3333','0002','0012']

for stock_number in stock_numbers:
    url = f"http://charts.aastocks.com/servlet/Charts?fontsize=12&15MinDelay=T&lang=1&titlestyle=1&vol=1&Indicator=1&indpara1=10&indpara2=20&indpara3=50&indpara4=100&indpara5=150&subChart1=2&ref1para1=14&ref1para2=0&ref1para3=0&subChart2=3&ref2para1=12&ref2para2=26&ref2para3=9&subChart3=12&ref3para1=0&ref3para2=0&ref3para3=0&scheme=3&com=100&chartwidth=660&chartheight=855&stockid=00{stock_number}.HK&period=6&type=1&logoStyle=1"
    urlretrieve(url, f'C:\\Users\\cm540-08-2023-c\\Desktop\\{stock_number}-chart.gif')
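
The long chart URL above may be easier to maintain if the query string is built from a dictionary. The sketch below uses `urllib.parse.urlencode` for that. `build_chart_url` is a hypothetical helper name, and only a handful of the query parameters from the original URL are included for brevity, so the server may render a different chart than the full URL does.

```python
from urllib.parse import urlencode

CHART_ENDPOINT = "http://charts.aastocks.com/servlet/Charts"

def build_chart_url(stock_number):
    """Return a chart URL for one stock (hypothetical helper, subset of parameters)."""
    params = {
        "fontsize": 12,
        "lang": 1,
        "chartwidth": 660,
        "chartheight": 855,
        # The original URL prefixes the four-digit code with "00".
        "stockid": f"00{stock_number}.HK",
        "period": 6,
        "type": 1,
    }
    return f"{CHART_ENDPOINT}?{urlencode(params)}"

print(build_chart_url("0700"))
```

The resulting string can be passed to `urlretrieve` in the same way as the hand-written URL.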

# Fetching XML

In [None]:
pip install untangle

Note: you may need to restart the kernel to use updated packages.


## Example: SMG.gov.mo

xml.smg.gov.mo

In [None]:
import untangle
import datetime

obj = untangle.parse('https://xml.smg.gov.mo/c_actual_brief.xml')

# obj.ActualWeatherBrief.Custom.Temperature[0].Value.cdata

temperature = obj.ActualWeatherBrief.Custom.Temperature.Value.cdata
humidity = obj.ActualWeatherBrief.Custom.Humidity.Value.cdata

print("現時澳門氣溫 " + temperature + " 度，濕度 " + humidity + "%。")



'29'

Running the code above may raise an error, depending on how many `Temperature` entries SMG.gov.mo returns.

If there is only one `Temperature` entry, it can be accessed directly. If there is more than one, it becomes a list. We can tell the two cases apart by checking `type(target) == list`.

In [None]:
type('hello')==type("")

True

In [None]:
import untangle
import datetime

obj = untangle.parse('https://xml.smg.gov.mo/c_actual_brief.xml')

humidity = obj.ActualWeatherBrief.Custom.Humidity.Value.cdata

if type(obj.ActualWeatherBrief.Custom.Temperature) == list:
    temperature = obj.ActualWeatherBrief.Custom.Temperature[0].Value.cdata
else:
    temperature = obj.ActualWeatherBrief.Custom.Temperature.Value.cdata


print("現時澳門氣溫 " + temperature + " 度，濕度 " + humidity + "%。")



現時澳門氣溫 29 度，濕度 84%。
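
As an alternative to the type check, the standard library's `xml.etree.ElementTree` can be used instead of untangle: its `findall` always returns a list, whether there are zero, one, or many matches. This is a swapped-in technique, not what the cell above does, and the XML below is a made-up sample that only mirrors the element names used above.

```python
import xml.etree.ElementTree as ET

# Made-up sample that mirrors the element names used above; not real SMG data.
SAMPLE_XML = """
<ActualWeatherBrief>
  <Custom>
    <Temperature><Value>29</Value></Temperature>
    <Temperature><Value>30</Value></Temperature>
    <Humidity><Value>84</Value></Humidity>
  </Custom>
</ActualWeatherBrief>
"""

root = ET.fromstring(SAMPLE_XML)

# findall() returns a list for zero, one, or many matches.
temperatures = [t.findtext("Value") for t in root.findall("./Custom/Temperature")]
humidity = root.findtext("./Custom/Humidity/Value")

print(temperatures[0], humidity)
```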


## Example: Monthly gross gaming revenue

http://www.dicj.gov.mo/web/cn/information/DadosEstat_mensal/index.html

In [None]:
import untangle
import datetime

year = datetime.date.today().year

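# RECORD is a zero-based list and we want the previous month,
# so subtract 2 from the current (one-based) month number.
month = datetime.date.today().month - 2

if month < 0:
    # In January, the previous month is December of last year.
    year = year - 1
    month = 11  # December; the list begins at 0.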

url = f"http://www.dicj.gov.mo/web/cn/information/DadosEstat_mensal/{year}/report_cn.xml?id=8"

data = untangle.parse(url)

month_data = data.STATISTICS.REPORT.DATA.RECORD[month]

net_income = month_data.DATA[1].cdata
last_net_income = month_data.DATA[2].cdata
change_rate = month_data.DATA[3].cdata
acc_net_income = month_data.DATA[4].cdata
acc_last_net_income = month_data.DATA[5].cdata
acc_change_rate = month_data.DATA[6].cdata

print(f"{year} 年 {month+1} 月份 毛收入 {net_income} ({year-1}:{last_net_income}), {change_rate}")
print(f"{year} 年 {month+1} 月份 累計毛收入 {acc_net_income} ({year-1}:{acc_last_net_income}), {acc_change_rate}")

2023 年 7 月份 毛收入 16,662 (2022:398), 4082.9%
2023 年 7 月份 累計毛收入 96,798 (2022:26,668), 263.0%
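
The values read from `cdata` are strings such as `96,798` and `263.0%`, so they need converting before any arithmetic. A minimal sketch, with `to_number` as a hypothetical helper and the sample strings copied from the accumulated figures in the output above:

```python
def to_number(text):
    """Convert strings like '16,662' or '4082.9%' to a float (hypothetical helper)."""
    cleaned = text.strip().replace(",", "").rstrip("%")
    return float(cleaned)

acc_net_income = to_number("96,798")
acc_last_net_income = to_number("26,668")

# Recompute the year-on-year change from the two accumulated figures.
change = (acc_net_income - acc_last_net_income) / acc_last_net_income * 100
print(f"{change:.1f}%")
```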


## Monthly gross gaming revenue for the past 12 months

In [None]:
from datetime import date
from dateutil.relativedelta import relativedelta
import untangle

def fetch_and_print_dicj_year_month(year, month):
    url = f"http://www.dicj.gov.mo/web/cn/information/DadosEstat_mensal/{year}/report_cn.xml"

    data = untangle.parse(url)

    month_data = data.STATISTICS.REPORT.DATA.RECORD[month-1]

    net_income = month_data.DATA[1].cdata
    last_net_income = month_data.DATA[2].cdata
    change_rate = month_data.DATA[3].cdata
    acc_net_income = month_data.DATA[4].cdata
    acc_last_net_income = month_data.DATA[5].cdata
    acc_change_rate = month_data.DATA[6].cdata

    print(f"{year} 年 {month}  月份 毛收入\t {net_income} \t ({year-1}:{last_net_income}), {change_rate}")
#     print(f"{year} 年 {month} 累計毛收入\t {acc_net_income}\t ({year-1}:{acc_last_net_income}), {acc_change_rate}")

# range(1, 12) covers the previous 11 months; use range(1, 13) for a full 12.
for i in range(1, 12):
    target_date = date.today() - relativedelta(months=i)
    fetch_and_print_dicj_year_month(target_date.year, target_date.month)

2023 年 7  月份 毛收入	 16,662 	 (2022:398), 4082.9%
2023 年 6  月份 毛收入	 15,207 	 (2022:2,477), 513.9%
2023 年 5  月份 毛收入	 15,565 	 (2022:3,341), 365.9%
2023 年 4  月份 毛收入	 14,722 	 (2022:2,677), 449.9%
2023 年 3  月份 毛收入	 12,738 	 (2022:3,672), 246.9%
2023 年 2  月份 毛收入	 10,324 	 (2022:7,759), 33.1%
2023 年 1  月份 毛收入	 11,580 	 (2022:6,344), 82.5%
2022 年 12  月份 毛收入	 3,482 	 (2021:7,962), -56.3%
2022 年 11  月份 毛收入	 2,999 	 (2021:6,749), -55.6%
2022 年 10  月份 毛收入	 3,899 	 (2021:4,365), -10.7%
2022 年 9  月份 毛收入	 2,962 	 (2021:5,879), -49.6%
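
The loop above relies on `dateutil`, and `range(1, 12)` produces 11 iterations rather than 12. The sketch below shows one way to list the previous months using only the standard library. `previous_months` is a hypothetical helper, and the date passed in is just a fixed example so the result is reproducible.

```python
from datetime import date

def previous_months(today, count):
    """Return (year, month) pairs for the `count` months before `today`, newest first."""
    pairs = []
    year, month = today.year, today.month
    for _ in range(count):
        month -= 1
        if month == 0:  # stepped back past January
            year -= 1
            month = 12
        pairs.append((year, month))
    return pairs

months = previous_months(date(2023, 8, 18), 12)
print(months[0], months[-1], len(months))
```

Each pair could then be passed to `fetch_and_print_dicj_year_month(year, month)`.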


## JSON file

https://data.gov.mo/Detail?id=a7059c6d-880b-4abd-8bef-45a9498442ba

In [None]:
import json

with open('rest.json', encoding='utf-8') as f:
    data = json.load(f)

print(data)


[{'nameZh': '亞堅奴前地休憩區', 'namePt': 'Zona de Lazer do Largo do Aquino', 'nameEn': 'Leisure Area in Largo do Aquino', 'location': '22.192494,113.536783', 'openHourZh': '全日', 'openHourPt': 'aberto 24 horas', 'openHourEn': 'Whole Day', 'photo': 'https://www.iam.gov.mo/showFile.ashx?p=iamweb/facilities/637351834014571.jpg', 'tempClose': False}, {'nameZh': '亞馬喇前地 澳門亞馬喇前地', 'namePt': 'Praça de Ferreira do Amaral Praça de Ferreira do Amaral, Macau', 'nameEn': 'Praça de Ferreira do Amaral Praça de Ferreira do Amaral, Macau', 'location': '22.188361,113.543497', 'openHourZh': '全日', 'openHourPt': 'aberto 24 horas', 'openHourEn': 'Whole Day', 'photo': 'https://www.iam.gov.mo/showFile.ashx?p=iamweb/facilities/637342124555015.jpg', 'tempClose': False}, {'nameZh': '何林圍休憩區', 'namePt': 'Zona de Lazer do Pátio de Além-Bosque', 'nameEn': 'Leisure Area in Pátio de Além-Bosque', 'location': '22.199711,113.538474', 'openHourZh': '全日', 'openHourPt': 'Dia inteiro', 'openHourEn': 'Whole day', 'photo': 'https://w
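
Once loaded, the data is an ordinary list of dictionaries, so individual fields can be read by key. The sketch below works on a shortened, inline excerpt in the same shape as the records printed above rather than on `rest.json` itself. Treating `location` as a "latitude,longitude" pair is an assumption based on the values shown.

```python
import json

# Shortened excerpt in the same shape as the records printed above.
SAMPLE = '''
[
  {"nameZh": "亞堅奴前地休憩區", "nameEn": "Leisure Area in Largo do Aquino",
   "location": "22.192494,113.536783", "tempClose": false}
]
'''

records = json.loads(SAMPLE)

for record in records:
    # "location" is a single comma-separated string, so split it into two floats.
    latitude, longitude = (float(part) for part in record["location"].split(","))
    status = "temporarily closed" if record["tempClose"] else "open"
    print(f'{record["nameEn"]} ({latitude}, {longitude}): {status}')
```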

# Exercise

Macau government data platform  

https://data.gov.mo/Detail?id=b4fa6540-4295-4bf6-81d3-3377dd013fa1

Try to list all the bars in Macau in the following format:  

[Traditional Chinese name] [English name] - [Traditional Chinese address]  
[Contact phone number]  
[Website]  

In [None]:
import requests
import untangle

headers = {
    'Authorization': 'APPCODE 09d43a591fba407fb862412970667de4'
}
response = requests.get('https://dst.apigateway.data.gov.mo/dst_bars', headers=headers)
response.encoding = 'utf-8'

with open('lab6.xml','w',encoding='utf-8') as f:
    f.write(response.text)

data = untangle.parse('lab6.xml')
print(data.mgto.bars[0].com_name_en.cdata)

### Request

A web request is a message sent by a client, such as a web browser, to a server in order to retrieve a web page or other resource. Web requests are sent using the Hypertext Transfer Protocol (HTTP), the standard protocol for transmitting data on the World Wide Web.  
Normally, browsing to a page issues a "GET" request, and submitting a form usually issues a "POST" request.

The most common types of request methods are GET and POST but there are many others, including HEAD, PUT, DELETE, CONNECT, and OPTIONS. GET and POST are widely supported while support for other methods is sometimes limited but expanding.
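
To make the difference between GET and POST concrete, the sketch below builds one request of each kind with the standard library's `urllib.request.Request` and inspects it without sending anything. The `example.com` URLs and the form field are placeholders, not part of the course material.

```python
from urllib.parse import urlencode
from urllib.request import Request

# example.com is a placeholder; nothing is sent over the network here.
get_request = Request("https://example.com/search?" + urlencode({"q": "macau"}))

# Supplying a body (form data as bytes) makes urllib default to POST.
form_data = urlencode({"name": "alice"}).encode("utf-8")
post_request = Request("https://example.com/submit", data=form_data)

print(get_request.get_method(), get_request.full_url)
print(post_request.get_method(), post_request.data)
```

A GET request carries its parameters in the URL, while a POST request carries them in the request body.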

In [None]:
import requests

res = requests.get('https://www.google.com')

print(f'Response Code: {res.status_code} \n')
print(res.text)

Response Code: 200 

<!doctype html><html itemscope="" itemtype="http://schema.org/WebPage" lang="zh-TW"><head><meta content="text/html; charset=UTF-8" http-equiv="Content-Type"><meta content="/images/branding/googleg/1x/googleg_standard_color_128dp.png" itemprop="image"><title>Google</title><script nonce="REjZK3XO63h8BLRYdZ4ABg">(function(){var _g={kEI:'6BvjZNj2JJnm2roP-oetuAQ',kEXPI:'0,1359409,6059,206,4804,2316,383,246,5,1129120,1197716,380775,16114,19398,9286,22430,1362,284,12034,17581,4998,17075,38444,2872,2891,4140,4208,3406,606,30668,30021,16336,20583,4,3832,42126,13659,4437,22565,6672,7596,1,39047,2,3105,2,39761,5679,1021,31122,4567,6259,23418,1249,25138,7929,2,2,1,24626,2006,8155,7381,1479,14490,873,19634,7,1922,9779,12415,30044,20198,928,19209,14,82,20206,8377,18988,253,297,4825,3030,6110,5041,4665,1804,13356,1004,12047,6690,2172,5252,6561,1635,7948,5739,1299,11713,1991,5777,146,12742,3,8,3690,440,1271,2413,4234,5206087,997,2,242,5994820,2803117,3311,141,795,1973

Below is a list of Hypertext Transfer Protocol (HTTP) response status codes. Status codes are issued by a server in response to a client's request.   
The most common ones are:   

200 OK
- Standard response for successful HTTP requests. The actual response will depend on the request method used. In a GET request, the response will contain an entity corresponding to the requested resource. In a POST request, the response will contain an entity describing or containing the result of the action.

301 Moved Permanently
- This and all future requests should be directed to the given URI.

400 Bad Request
- The server cannot or will not process the request due to an apparent client error (e.g., malformed request syntax, size too large, invalid request message framing, or deceptive request routing).  

401 Unauthorized
- Similar to 403 Forbidden, but specifically for use when authentication is required and has failed or has not yet been provided. The response must include a WWW-Authenticate header field containing a challenge applicable to the requested resource (see Basic access authentication and Digest access authentication). Semantically, 401 means "unauthorised": the user does not have valid authentication credentials for the target resource. Some sites incorrectly issue HTTP 401 when an IP address is banned from the website and that specific address is refused permission to access it.

402 Payment Required
- Reserved for future use. The original intention was that this code might be used as part of some form of digital cash or micropayment scheme, as proposed, for example, by GNU Taler, but that has not yet happened, and this code is not widely used. Google Developers API uses this status if a particular developer has exceeded the daily limit on requests. Sipgate uses this code if an account does not have sufficient funds to start a call. Shopify uses this code when the store has not paid their fees and is temporarily disabled. Stripe uses this code for failed payments where parameters were correct, for example blocked fraudulent payments.

403 Forbidden
- The request contained valid data and was understood by the server, but the server is refusing action. This may be due to the user not having the necessary permissions for a resource or needing an account of some sort, or attempting a prohibited action (e.g. creating a duplicate record where only one is allowed). This code is also typically used if the request provided authentication by answering the WWW-Authenticate header field challenge, but the server did not accept that authentication. The request should not be repeated.

404 Not Found
- The requested resource could not be found but may be available in the future. Subsequent requests by the client are permissible.
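
Python's standard library already knows these codes through `http.HTTPStatus`, which can be handy when logging responses. In the sketch below, `describe` is a hypothetical helper that pairs each code with its standard phrase and a rough family label.

```python
from http import HTTPStatus

def describe(code):
    """Return a short label such as '404 Not Found (client error)'."""
    status = HTTPStatus(code)
    if 200 <= code < 300:
        family = "success"
    elif 300 <= code < 400:
        family = "redirection"
    elif 400 <= code < 500:
        family = "client error"
    else:
        family = "other"
    return f"{code} {status.phrase} ({family})"

for code in (200, 301, 400, 401, 402, 403, 404):
    print(describe(code))
```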

### Beautiful Soup  
Beautiful Soup is a Python library for parsing structured data. It allows you to interact with HTML in a similar way to how you interact with a web page using developer tools.

In [None]:
from bs4 import BeautifulSoup
import requests

res = requests.get('https://example.com/')
soup = BeautifulSoup(res.text, 'html.parser')

print(soup)
print(type(soup))  # It prints like the raw HTML, but Beautiful Soup has wrapped the whole response in a new object.

print(soup.find('h1').text)  # With Beautiful Soup, you can easily extract an element from the HTML.


<class 'bs4.BeautifulSoup'>
Example Domain


### Scraping the news headline

In [None]:
from bs4 import BeautifulSoup
import requests

res = requests.get("https://www.gcs.gov.mo/news/list/zh-hant/news/")
soup = BeautifulSoup(res.text, 'html.parser')

# soup.find(tag_name, {'attr. name': 'attr. value'}) returns the first matching element.
# soup.find_all(...) takes the same arguments and returns every match.
for c in soup.find_all("div", {"class": "cell captionSize subject"}):
    print(c.text.strip())


加蓋“立法會大樓開放日”紀念郵戳
氣象局1月親子活動即日起接受公眾報名
2023年第四季度《殘疾僱員工作收入補貼計劃》  1月2日起接受申請
澳門理工大學2024/2025學年學士學位課程現正接受報名
立法會開放日即將舉行
市政署2024年度各類准照續期申請 可到“政府綜合服務大樓” 或 “離島政府綜合服務中心”辦理
【除夕17.5萬創新高】2023旅客2,823萬回復疫情前7成
澳門理工大學學者餐景研究助澳門世界旅遊休閒中心建設
配合四橋工程  明外港航道封航
新年快樂
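
One detail worth knowing about the class lookup above: a multi-word class string passed to `find_all` only matches that exact attribute value, while a CSS selector passed to `select` matches elements that carry all the listed classes in any order. The sketch below illustrates this on hypothetical markup, with only the class names copied from the example above.

```python
from bs4 import BeautifulSoup

# Hypothetical markup; the class names are copied from the example above.
html = """
<div class="cell captionSize subject">Headline A</div>
<div class="subject cell captionSize">Headline B</div>
<div class="cell captionSize">Not a headline</div>
"""
soup = BeautifulSoup(html, "html.parser")

# A multi-word class string only matches that exact attribute value.
exact = [d.text for d in soup.find_all("div", {"class": "cell captionSize subject"})]

# A CSS selector matches elements carrying all three classes, in any order.
by_selector = [d.text for d in soup.select("div.cell.captionSize.subject")]

print(exact)
print(by_selector)
```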


### Scraping the news headline (mobile website)

In [None]:
res = requests.get("https://www.gcs.gov.mo/news/list/zh-hant/news/")
soup = BeautifulSoup(res.text)

for s in soup.find_all('h5'):
    print(s.text.strip())

加蓋“立法會大樓開放日”紀念郵戳
氣象局1月親子活動即日起接受公眾報名
2023年第四季度《殘疾僱員工作收入補貼計劃》  1月2日起接受申請
澳門理工大學2024/2025學年學士學位課程現正接受報名
立法會開放日即將舉行
市政署2024年度各類准照續期申請 可到“政府綜合服務大樓” 或 “離島政府綜合服務中心”辦理
【除夕17.5萬創新高】2023旅客2,823萬回復疫情前7成
澳門理工大學學者餐景研究助澳門世界旅遊休閒中心建設
配合四橋工程  明外港航道封航
新年快樂


### Also scrape the content?

In [None]:
res = requests.get("https://www.gcs.gov.mo/news/list/zh-hant/news/")
soup = BeautifulSoup(res.text)

link = soup.find('a',{'class':'baseInfo container grid-x grid-margin-y'})

# new_link = 'https://www.gcs.gov.mo/news/list/zh-hant/news/' + link['href']
new_link = 'https://www.gcs.gov.mo/news/' + link['href'].replace('../../../','')

res_content = requests.get(new_link)
soup_content = BeautifulSoup(res_content.text)

content = soup_content.find('div',{'class':'cell baseContent baseSize text-justify content NEWS'})
print(content.text.strip())

為進一步推動澳門作為“創意城巿美食之都”，培養本地餐飲廚藝人才，勞工事務局和澳娛綜合度假股份有限公司合作推出“澳娛綜合「升」級廚藝事業發展計劃＂，以“先入職、後培訓”方式，協助符合條件和有意從事餐飲廚藝事業發展的澳門居民就業，計劃由8月22日至29日接受申請。
提供20個職缺　先入職後培訓　快速拓展職業生涯
計劃提供西式廚藝助理人員職缺20個，獲錄取參加計劃者，將接受為期12個月的一系列培訓，內容包括專業技能及知識、職業素養、食品衛生常識，以及職安健知識等。完成培訓後將獲安排參加“酒店及飲食業職安卡＂及“職業技能等級認定─西式烹調師（初級）＂等考試，助參加者做好裝備，快速拓展職業生涯。
網上報名　額滿即止　參加講座　了解詳情
如欲了解計劃章程的澳門居民可登入勞工局網站內相關專頁：https://www.dsal.gov.mo/zh_tw/standard/sjm_foodandbeverage_trainingprogramme.html，有意參加者可於8月22日至29日期間網上遞交申請，名額有限，額滿即止。
申請期結束後，澳娛綜合將分批通知符合條件的申請人參加講座，詳細說明計劃的各項培訓內容及職涯發展階梯，講座及面試日期為9月1日，講座後隨即安排面試，請參加者關注並保持電話暢通，以便接收澳娛綜合進一步通知。
如有查詢，可於辦公時間內致電勞工局就業廳蔡小姐電話：（83999822）或澳娛綜合人才招募熱線代表葉小姐（電話：82970980）。
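
Instead of stripping `../../../` by hand, the relative link can be resolved against the list page URL with `urllib.parse.urljoin`. A minimal sketch follows. The `href` value is a hypothetical example in the same relative style, not a real article link.

```python
from urllib.parse import urljoin

list_page = "https://www.gcs.gov.mo/news/list/zh-hant/news/"

# Hypothetical href in the same "../../../" style as the links on the list page.
href = "../../../detail/zh-hant/EXAMPLE-ID"

new_link = urljoin(list_page, href)
print(new_link)
```

This keeps working if the site changes how many `../` segments it emits.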


In [None]:
res = requests.get("https://www.gcs.gov.mo/news/list/zh-hant/news/")
soup = BeautifulSoup(res.text)

links = soup.find_all('a',{'class':'baseInfo container grid-x grid-margin-y'})

for a in links[:5]:
    new_link = 'https://www.gcs.gov.mo/news/' + a['href'].replace('../../../','')
    res_content = requests.get(new_link)
    soup_content = BeautifulSoup(res_content.text)
    print(soup_content.find('h5', {'class': 'cell captionSize'}).text.strip() + '\n')
    content = soup_content.find('div',{'class':'cell baseContent baseSize text-justify content NEWS'})
    print(content.text.strip())
    print('\n-----------------------------------------------------\n')

加蓋“立法會大樓開放日”紀念郵戳

為配合立法會大樓開放日之舉行，郵電局特於二零二四年一月六日上午十時至下午六時，在立法會大樓前地設置臨時櫃台，為市民提供加蓋“立法會大樓開放日”紀念郵戳服務。
屆時將發售此活動之貼票連蓋銷紀念封，每個售價為澳門元八元，同時亦出售各類澳門集郵品。歡迎市民參觀選購。

-----------------------------------------------------

氣象局1月親子活動即日起接受公眾報名

為持續推動多元化的氣象科普教育，地球物理氣象局將於1月舉辦 “氣象FUN識親子遊” 及 “遊氣象 樂探索” 兩項親子活動。活動旨在帶領參加者開啓一趟探索氣象科學樂趣，感受氣象科學魅力的親子旅程，並藉有趣互動的氣象之旅，加深參加者對本澳氣象服務及運作的瞭解。兩項活動由即日（2日）起至1月6日接受網上報名，參加者於活動結束後可獲精美紀念品乙份。
其中， “氣象FUN識親子遊” 將於1月27日（星期六）舉辦，對象為4至8歲兒童及其家長，活動內容包括欣賞氣象劇場、體驗懸掛風球及擔任小小天氣預報員模擬講解天氣情況等； “遊氣象 樂探索” 將於1月20日（星期六）舉辦，對象為9至12歲的孩童及其家長，活動內容包括體驗氣象探測、進行互動遊戲及氣象工作坊等。此外，兩項活動均會實地參觀預報預警的核心地帶及前往百年氣象站認識氣象設施。
兩項親子活動的報名皆以家庭為單位，每組家庭可登記2至5人，費用全免，若報名人數超過名額上限，將採用電腦抽籤方式決定，並以短訊通知錄取人士。家長如欲瞭解更多活動詳情，可掃描活動宣傳圖上的二維碼或瀏覽氣象局網頁https://www.smg.gov.mo/zh/news-detail/518。

-----------------------------------------------------

2023年第四季度《殘疾僱員工作收入補貼計劃》 1月2日起接受申請

為支援殘疾人士就業，讓殘疾僱員獲得最基本的工資權益，特區政府於2020年11月1日推出《殘疾僱員工作收入補貼計劃》行政法規。根據規定，每年1月、4月、7月及10月接受前一季度的收入補貼申請，符合資格的殘疾僱員由 1 月2日起至1月31日可向勞工事務局申請2023年第四季的工作收入補貼。
法規生效至今，勞工事務局已接收12個季度的申請，累計收到243

## Fetching Macao Daily news

In [None]:
from bs4 import BeautifulSoup
import requests
import datetime

today = datetime.date.today()
year = today.year
month = today.month
day = today.day

month = str(month).zfill(2)
day = str(day).zfill(2)

res = requests.get(f"http://www.macaodaily.com/html/{year}-{month}/{day}/node_1.htm")
res.encoding = "utf-8"
soup = BeautifulSoup(res.text, 'html.parser')  # Built-in parser; 'lxml' also works if it is installed.
all_article = soup.find('div', {'id': 'all_article_list'})
links = all_article.find_all('a')
for link in links:
    print(link.text)


A01：澳聞
新橋區或寸步難行
西灣湖街一段將封閉
歐陽瑜：新中圖建停車場否未定

A02：澳聞
賀倡泛珠參與深合區建設
賀晤閩省長冀共建深合區
瓊澳握新機遇產業融合發展
賀：桂澳拓更大旅遊市場
A區更新教育用地規劃草圖
朝九晚九辦證延至月底
當局設展推領事保護協助

A03：澳聞
離島醫院料12月首階段運作
教局促教育界關心學生健康
教育基金巡視非高教學校
琴岸長隆無人駕駛專線開通
司警籲防範藉裝修行騙

A04：澳聞
七成公僕重收益人際關係
公總青委冀關注代際差異
李振宇倡弘揚公僕精神
公務工會專場培訓提升服務
食環學會：多重把關防日本核食
議員質詢口岸保安查車合法性

A05：澳聞
警揭四假外僱一落網
兩男外僱涉偷頭盔法辦
新聞特搜
兩男涉入屋盜竊被羈押
本地漢墮投資騙局失百四萬
四女墮刷單騙局共失六萬二
男子祼聊遭勒索無損失
七旬漢玉石店賒帳走數被捕
男子被揭涉醉駕法辦
海關截逾三百走私美妝產品
終院審結一公司上訴四商標註冊
男子信用卡被盜用報警
十四幼童染腸病毒

A06：澳聞
書市揭幕設“一加四”專檯
激發閱讀
閱讀關懷計劃捐書四萬五冊
土生葡人家常菜新著發售
阮毓明書畫展昨書市揭幕
抗戰地圖展呈古地圖文物

A07：澳聞
耆英交流團訪莞惠增家國情
工聯北綜義工體驗漁家樂
科企訪粵澳工商聯促合作
司警扶康會宣講防電騙
家協辦毛冷親織興趣班
金華領導訪蘇浙滬同鄉會
代購食品宜留意風險
經民聯訪肇探資源互補
創意花燈紮作進階班報名
兩的士會月底考察星馬
好家園籲學生理性用網
民眾建澳促關注三盞燈樹木
議員倡增語言治療學位課程

A08：澳聞
滬台澳青年師生研習營閉營
利瑪竇與京小學線上交流
理大暑期葡語教師培訓受歡迎
高等教育國際研討會徵論文
石排灣公校展師生作品
城大訪葡高校深化合作
理大訪港恒大促合作
聖中五校赴穗誦讀比賽交流
菜農幼師STEAM教學校本培訓
勞幼畢業禮溫馨感人
聖德蘭逾七十小學生畢業
福校勉畢業生勇往直前

A09：澳聞
伊寧兩民俗街區彰民族融洽
（新聞小語）強執行力 理順掘路
公僕團訪伊犁林則徐紀念館
今多雲酷熱
持續酷熱慎防中暑
市署議事亭前地補植樟樹
寶馬XM新車發佈

A10：經濟
就業改善 次季消費者信心回升
（一家之言）建城市品牌  營好客氛圍
地產業冀寬樓市限制促成交
三至五月住宅樓價指數升0.1%
專家：發展大模型 需
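
The zero-padding done with `zfill` above can also be expressed with date format codes inside an f-string. The sketch below builds the same URL pattern for a fixed example date. `front_page_url` is a hypothetical helper name.

```python
from datetime import date

def front_page_url(day):
    """Build the index URL for a given date, zero-padding month and day."""
    return f"http://www.macaodaily.com/html/{day:%Y-%m}/{day:%d}/node_1.htm"

print(front_page_url(date(2023, 8, 5)))
```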

## Scrape Table on Static Website
  
table: A table element in the HTML  
thead: Header of the Table  
tbody: The main part of the Table  
tr: Table row  
td: Table Data (Cell)  


In [None]:
url = 'https://www.bankofchina.com/mo/en/fimarkets/fm1/201707/t20170718_9799977.html'
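
The cell above only stores the target URL. As a rough sketch of how the table tags listed earlier map to code, the example below parses a small hypothetical table with Beautiful Soup. The column names and values are invented for illustration and are not taken from the Bank of China page, whose actual structure may differ.

```python
from bs4 import BeautifulSoup

# Hypothetical table; the real page's columns and values are not reproduced here.
html = """
<table>
  <thead><tr><th>Currency</th><th>Buy</th><th>Sell</th></tr></thead>
  <tbody>
    <tr><td>AAA</td><td>1.10</td><td>1.20</td></tr>
    <tr><td>BBB</td><td>2.10</td><td>2.20</td></tr>
  </tbody>
</table>
"""
soup = BeautifulSoup(html, "html.parser")
table = soup.find("table")

# Header cells live in <thead>; each <tr> in <tbody> is one data row.
headers = [th.text.strip() for th in table.find("thead").find_all("th")]
rows = [
    dict(zip(headers, (td.text.strip() for td in tr.find_all("td"))))
    for tr in table.find("tbody").find_all("tr")
]

for row in rows:
    print(row)
```

For a live page, `html` would be replaced by `requests.get(url).text`, and the `thead`/`tbody` lookups may need adjusting if the page omits those tags.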

### Use on API

`requests` is also very useful for accessing APIs.

https://data.gov.mo/Detail?id=ac55c2f1-780a-4dc8-875f-851b2203b706

In [None]:
my_headers = {
    'Authorization':'APPCODE 09d43a591fba407fb862412970667de4'
}
res = requests.get('https://dsat.apigateway.data.gov.mo/car_park_detail', headers=my_headers)
res.encoding = 'utf-8'
print(res.text)

<CarPark title="停車場資料">
  <Car_park_info CP_ID="7065" NameC="污水處理站停車場" LocationC="黑沙環新填海區之污水處理站旁" CarParkEntryC="出口及入口均設於勞動節大馬路" ContactNo="2876 3564" NameP="Auto-Silo da ETAR" LocationP="Junto da Estação de Tratamento de Águas Residuais do Aterro da Areia Preta" CarParkEntryP="A entrada e saída efectua-se pela Avenida 1.º de Maio" X_coords="22.207970" Y_coords="113.558418" height="---       " DSCC_X="21955.79143204315" DSCC_Y="19640.23708476258" Lcar_price_C="MOP$6(日間)##MOP$3(夜間)" Hcar_price_C="MOP$8(日間)##MOP$4(夜間)##(註1)## ##MOP$10(日間)##MOP$5(夜間)##(註2)" moto_price_C="-" remark_price_C="日間：上午八時至下午八時前##夜間：下午八時至翌日上午八時前##註1：小型公共汽車及長度不超過7 米和總重量不超過7 公噸的重型車輛。## 註2：公共汽車及其他重型車輛。 " Lcar_price_P="MOP$6 (Diurno)##MOP$3 (Nocturno)" Hcar_price_P="MOP$8 (Diurno)##MOP$4 (Nocturno)##(Nota 1)## ##MOP$10 (Diurno) ##MOP$5 (Nocturno)##(Nota 2)" moto_price_P="-" remark_price_P="Diurno：Das 8,00 horas até antes das 20,00 horas.##Nocturno：Das 20,00 horas até antes das 8,00 horas do dia seguintes.## ##Nota 1: 
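
The response above stores its data in XML attributes rather than in element text. The sketch below reads a few of them with the standard library's `xml.etree.ElementTree`, using a shortened sample with attributes copied from the response. Treating `##` as a separator between entries within a value is an assumption based on how the prices appear.

```python
import xml.etree.ElementTree as ET

# Shortened sample: a few attributes copied from the response above.
SAMPLE_XML = """
<CarPark title="停車場資料">
  <Car_park_info CP_ID="7065" NameC="污水處理站停車場" ContactNo="2876 3564"
                 Lcar_price_C="MOP$6(日間)##MOP$3(夜間)" />
</CarPark>
"""

root = ET.fromstring(SAMPLE_XML)

for park in root.findall("Car_park_info"):
    # Values are read with .get(); "##" appears to separate entries within a value.
    prices = park.get("Lcar_price_C").split("##")
    print(park.get("CP_ID"), park.get("NameC"), prices)
```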

# Exercise

Let's take the news website below as our target:  
  
https://www.galaxyentertainment.com/zh-hant/media/press-releases?year=2023

Try to scrape the headline and the link of each news item published AFTER June 2023.

Use the following format, for example:  

Headline - link  
銀河娛樂集團2023第十一屆姚基金慈善賽暨Hive5體育短片節 正式在中國澳門啓動 - https://www.galaxyentertainment.com/zh-hant/media/press-releases/1084/20230807
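
As a hint for the date filter, the example link ends in what looks like a `YYYYMMDD` date. The sketch below parses that segment. It assumes every press-release link follows the same pattern, which is inferred from this single example, and `release_date` is a hypothetical helper.

```python
from datetime import date, datetime

def release_date(link):
    """Parse the trailing YYYYMMDD segment of a press-release link (assumed format)."""
    last_segment = link.rstrip("/").rsplit("/", 1)[-1]
    return datetime.strptime(last_segment, "%Y%m%d").date()

link = "https://www.galaxyentertainment.com/zh-hant/media/press-releases/1084/20230807"

# "After June 2023" is read here as any date later than 30 June 2023.
is_after_june = release_date(link) > date(2023, 6, 30)
print(release_date(link), is_after_june)
```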