[Study] PChome product information - crawler #11
Comments
Taking https://24h.pchome.com.tw/prod/DHAEDE-1900FFWUE as an example:
import requests

headers = {"user-agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/108.0.0.0 Safari/537.36"}
product_id = "DHAEDE-1900FFWUE"  # product ID, taken from the URL
url = f"https://ecapi-cdn.pchome.com.tw/ecshop/prodapi/v2/prod?id={product_id}&fields=Seq,Id,Name,Price,SaleStatus"
# headers must be passed by keyword; the second positional argument of requests.get is params
result1 = requests.get(url, headers=headers)
# unescape "\/" first, then decode the \uXXXX escapes into Chinese characters
print(result1.text.replace("\\/", "/").encode('utf-8').decode('unicode_escape'))
url2 = f"https://ecapi-cdn.pchome.com.tw/ecshop/prodapi/v2/prod/button&id={product_id}&fields=Seq,Id,Name,Price,Qty,ButtonType,SaleStatus,isPrimeOnly,SpecialQty,Device"
result2 = requests.get(url2, headers=headers)
print(result2.text.replace("\\/", "/").encode('utf-8').decode('unicode_escape'))
result1
{
  "DHAEDE-1900FFWUE-000": {
    "Seq": 33716849,
    "Id": "DHAEDE-1900FFWUE-000",
    "Name": "ACER Swift 5 SF514-55T-54WK 綠(i5-1135G7/8G/512G PCIe/W11/FHD/14)",  # product name
    "Price": {
      "M": 33900,  # list price
      "P": 24900,  # sale price
      "Low": null,
      "Prime": ""
    }
  }
}
result2
{
  "Seq": 33716849,
  "Id": "DHAEDE-1900FFWUE-000",
  "Price": { "M": 33900, "P": 24900, "Prime": "", "Low": null },
  "Qty": 20,
  "ButtonType": "ForSale",
  "SaleStatus": 1,  # listing status; 0 means delisted
  "isPrimeOnly": 0,
  "SpecialQty": 0,
  "Device": []
}
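The two responses above could be merged into a single record. The following is only a minimal sketch under stated assumptions, not the project's implementation: `summarize_product` and its output keys are hypothetical names, the raw bodies are assumed to look like the samples in this comment (with `\/` and `\uXXXX` escapes), and `SaleStatus == 1` is assumed to mean "listed". It parses with `json.loads`, which decodes both kinds of escape on its own, so the manual `unicode_escape` step would not be needed.

```python
import json


def summarize_product(prod_text: str, button_text: str) -> dict:
    """Merge the raw `prod` and `button` response bodies into one flat record."""
    prod = json.loads(prod_text)      # json.loads decodes "\/" and "\uXXXX" itself
    button = json.loads(button_text)
    if isinstance(button, list):      # assumption: tolerate a one-element list as well
        button = button[0]
    # The prod response is keyed by the full product ID (e.g. "...-000").
    full_id, info = next(iter(prod.items()))
    price = info.get("Price", {})
    return {
        "id": full_id,
        "name": info.get("Name"),
        "list_price": price.get("M"),
        "sale_price": price.get("P"),
        "qty": button.get("Qty"),
        "on_sale": button.get("SaleStatus") == 1,  # assumption: 1 means listed
    }


# Sample bodies modelled on the responses above (assumed raw, still-escaped form).
prod_text = (
    r'{"DHAEDE-1900FFWUE-000": {"Seq": 33716849, "Id": "DHAEDE-1900FFWUE-000", '
    r'"Name": "ACER Swift 5 SF514-55T-54WK \u7da0(i5-1135G7\/8G\/512G PCIe\/W11\/FHD\/14)", '
    r'"Price": {"M": 33900, "P": 24900, "Low": null, "Prime": ""}}}'
)
button_text = (
    '{"Seq": 33716849, "Id": "DHAEDE-1900FFWUE-000", '
    '"Price": {"M": 33900, "P": 24900, "Prime": "", "Low": null}, '
    '"Qty": 20, "ButtonType": "ForSale", "SaleStatus": 1, '
    '"isPrimeOnly": 0, "SpecialQty": 0, "Device": []}'
)

record = summarize_product(prod_text, button_text)
print(record)
```

In real use the two strings would be `result1.text` and `result2.text`; calling `result1.json()` directly would work just as well.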
|
So, as a first step, we need a PChome URL parser that extracts the product ID from the URL. The URL may come from the following kinds of sources:
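A parser along these lines could be sketched as follows. This is a hypothetical sketch: the name `extract_product_id` and the ID pattern are assumptions inferred only from the IDs that appear in this thread, not from a documented format. It searches free text rather than parsing a clean URL, so the input may carry extra text around the link (for example a product title on the line before it).

```python
import re
from typing import Optional

# Assumed ID shape, inferred only from IDs seen in this thread:
# two groups of uppercase letters/digits joined by "-", plus an optional "-NNN" suffix.
_PROD_URL_RE = re.compile(
    r"24h\.pchome\.com\.tw/prod/([A-Z0-9]+-[A-Z0-9]+(?:-\d{3})?)"
)


def extract_product_id(text: str) -> Optional[str]:
    """Return the product ID of the first PChome product URL in `text`, or None."""
    match = _PROD_URL_RE.search(text)
    return match.group(1) if match else None


web_url = "https://24h.pchome.com.tw/prod/DHAEDE-1900FFWUE"
shared_text = "Some product title\nhttps://24h.pchome.com.tw/prod/DHAEDE-1900FFWUE"

print(extract_product_id(web_url))                 # DHAEDE-1900FFWUE
print(extract_product_id(shared_text))             # DHAEDE-1900FFWUE
print(extract_product_id("https://example.com/"))  # None
```

Returning `None` for unrecognised input lets the caller decide how to report an unsupported link instead of passing a wrong ID on to the API.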
|
I'm not very familiar with Python.
import requests

url = "https://ecapi-cdn.pchome.com.tw/ecshop/prodapi/v2/prod?id=DYAT1K-A900FLZ63-000&fields=Seq,Id,Name,Price,SaleStatus"
result1 = requests.get(url)
res = result1.json()  # parse the JSON body into a dict
print(res)
response
{
  "DYAT1K-A900FLZ63-000": {
    "Seq": 34094219,
    "Id": "DYAT1K-A900FLZ63-000",
    "Name": "Google Pixel 7 Pro (12G/256G) 曜石黑",
    "Price": {
      "M": 0,
      "P": 28990,
      "Low": "None",
      "Prime": ""
    }
  }
}
I'll add more later.
|
I'll find time to scan the product list and see which other fields could serve as the listing status. As for the PChome URL parser, both kinds of string work.
from urllib.parse import urlparse

PCHome_web_url = "https://24h.pchome.com.tw/prod/DYAJIB-1900BZ121"
# Text shared from the app: a product title line followed by the URL
PCHome_app_url = """Apple Watch SE GPS, 44mm Silver Aluminium Case with Abyss Blue Sport Band
https://24h.pchome.com.tw/prod/DYAJIB-1900BZ121"""

def pchomeUrlParser(url: str) -> str:
    ''' extract productID from url '''
    parts = urlparse(url)  # parse the argument, not a global variable
    directories = parts.path.strip('/').split('/')
    productID = directories[-1]  # the last path segment is the product ID
    return productID

print(pchomeUrlParser(PCHome_web_url))
print(pchomeUrlParser(PCHome_app_url))
# output:
# DYAJIB-1900BZ121
# DYAJIB-1900BZ121
|
Thanks to @t1ina2003 and @zhihdd for the study results. I will close this study ticket next and open a separate feature ticket for the implementation.
|
Closed
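As a user
I want to be able to bookmark products from more e-commerce platforms
So that I have more chances to buy products at a low price
Background
Currently only momo shop is supported. We need to find out whether PChome also offers a way to get a product's status with plain HTTP requests alone, without JavaScript rendering.
Definition of Done
The following information can be obtained from a PChome product page URL: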