[WARN] [get_bawu_postlogs] 'NoneType' object has no attribute 'find_all'. args=('forum name',) kwargs={'search_value': 'username', 'search_type': 0, 'pn': 1} #194

Closed
Voevodsky opened this issue Apr 17, 2024 · 5 comments
Labels
bug Something isn't working

Comments

@Voevodsky

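get_bawu_postlogs seems to have a problem when search_type is set to 0 (search by post author): it fails with the NoneType warning every time. (The same search in the web moderation backend returns results normally, so the result is definitely not empty.)

However, when I set search_type to 1 (search by operator), results come back normally.

I also noticed that the postlog objects returned by get_bawu_postlogs do not seem to include the post author, and I suspect that is related to why searching by post author returns nothing.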

@Starry-OvO
Owner

Everything works on my end. Could you capture the traffic and check whether the returned content contains the tbody HTML node?
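The check suggested here can be done offline on a captured response body. Below is a minimal sketch using only the standard library; `TagFinder`, `has_tag`, and the two sample HTML strings are hypothetical and not part of the library under discussion. The warning in the title would be consistent with a lookup for `tbody` coming back as None, but that is an assumption about the cause rather than something the thread confirms.

```python
from html.parser import HTMLParser


class TagFinder(HTMLParser):
    """Records whether a given tag appears anywhere in the parsed HTML."""

    def __init__(self, tag: str) -> None:
        super().__init__()
        self.tag = tag
        self.found = False

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes tag names in lowercase.
        if tag == self.tag:
            self.found = True


def has_tag(html: str, tag: str = "tbody") -> bool:
    """Return True if `html` contains at least one opening `tag`."""
    finder = TagFinder(tag)
    finder.feed(html)
    return finder.found


# Hypothetical response bodies, for illustration only.
with_rows = "<table><tbody><tr><td>log entry</td></tr></tbody></table>"
without_rows = "<div>no results</div>"

print(has_tag(with_rows))     # True
print(has_tag(without_rows))  # False
```

If the captured body has no `tbody`, any code that looks the node up and immediately iterates over its children needs a None check first.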

@Starry-OvO Starry-OvO added the bug Something isn't working label Apr 18, 2024
@Voevodsky
Author

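Right. I traced through the code execution, and this looks like a URL-encoding problem: if I search for a user ID made up only of ASCII letters and digits, results come back fine, but searching for a Chinese name fails.

For example, when searching for the username "萌客天行", the correct link in the moderation backend contains svalue=%25E8%2590%258C%25E5%25AE%25A2%25E5%25A4%25A9%25E8%25A1%258C, which appears to be a string that has been URL-encoded twice.
However, if I set the search_value of get_bawu_postlogs directly to "萌客天行", the URL actually requested contains svalue=%E8%90%8C%E5%AE%A2%E5%A4%A9%E8%A1%8C, which is the result of encoding once, and this svalue returns no results.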

So I tried adding the following to my code:
from urllib import parse
query_string = '萌客天行'
result = parse.quote(query_string)
and then passing search_value=result:
bawu_info = await client.get_bawu_postlogs("forum name", search_value=result, search_type=0, pn=1)

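Here result = %E8%90%8C%E5%AE%A2%E5%A4%A9%E8%A1%8C (encoded once). Once result is passed into get_bawu_postlogs it gets encoded again, so the svalue in the URL becomes %25E8%2590%258C%25E5%25AE%25A2%25E5%25A4%25A9%25E8%25A1%258C, which matches the link I see in the moderation backend.

Sure enough, results now come back normally.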
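The single versus double encoding described in this comment can be reproduced with the standard library alone. This is a small sketch; the variable names `once` and `twice` are only for illustration.

```python
from urllib.parse import quote

name = "萌客天行"

# First pass: the UTF-8 bytes of the name are percent-encoded.
once = quote(name)

# Second pass: every '%' from the first pass is itself escaped as '%25'.
twice = quote(once)

print(once)   # %E8%90%8C%E5%AE%A2%E5%A4%A9%E8%A1%8C
print(twice)  # %25E8%2590%258C%25E5%25AE%25A2%25E5%25A4%25A9%25E8%25A1%258C
```

The second string is the svalue seen in the moderation backend link, and the first is what a single encoding pass produces.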

@Voevodsky
Author

Depending on search_type, the function requests a URL in one of these two formats:
https://tieba.baidu.com/bawu2/platform/listPostLog?word="XXX"&pn=1&ie=utf-8&svalue=XXX&stype=op_uname (search_type = 1)

https://tieba.baidu.com/bawu2/platform/listPostLog?word="XXX"&pn=1&ie=utf-8&svalue=xxx&stype=post_uname (search_type = 0)

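When searching for a Chinese name, I pasted each of these two URLs into the browser by hand. Strangely, the first link returns results while the second does not. My guess is that this is why search_type = 0 returns nothing.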
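One way to act on this observation is to pre-encode svalue only for the post_uname search, so that the normal query-string encoding yields the double-encoded form. The sketch below is built purely from the URL formats and behaviour reported in this thread. `build_postlog_url` and `STYPE_BY_SEARCH_TYPE` are hypothetical names, the quotes around the word parameter shown above are left out, and none of this is claimed to be what the library does or how the eventual fix works.

```python
from urllib.parse import quote, urlencode

BASE_URL = "https://tieba.baidu.com/bawu2/platform/listPostLog"

# Mapping taken from the two URL formats quoted in this thread.
STYPE_BY_SEARCH_TYPE = {0: "post_uname", 1: "op_uname"}


def build_postlog_url(fname: str, search_value: str, search_type: int, pn: int = 1) -> str:
    """Hypothetical helper: build the listPostLog URL for a search."""
    svalue = search_value
    if search_type == 0:
        # Assumption from the thread: the post_uname search only matches
        # when svalue ends up percent-encoded twice, so encode once here
        # and let urlencode apply the second pass.
        svalue = quote(search_value)
    params = {
        "word": fname,
        "pn": pn,
        "ie": "utf-8",
        "svalue": svalue,
        "stype": STYPE_BY_SEARCH_TYPE[search_type],
    }
    return f"{BASE_URL}?{urlencode(params)}"


url_post = build_postlog_url("example", "萌客天行", search_type=0)
url_op = build_postlog_url("example", "萌客天行", search_type=1)
print(url_post)
print(url_op)
```

ASCII-only search values pass through `quote` unchanged, which fits the report that plain alphanumeric user IDs worked without any workaround.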

@n0099

n0099 commented Apr 18, 2024

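This appears to be a string that has been URL-encoded twice

A classic pile of %25 produced by escaping the % again; in essence, leaning toothpick syndrome: https://en.wikipedia.org/wiki/Leaning_toothpick_syndrome
It is on par with the many old endpoints that, unless ie=utf-8 is specified in the query string, URL-decode the value as gb2312/gbk rather than utf8.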
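The charset mismatch mentioned here is easy to demonstrate locally. The sketch below percent-encodes a name as UTF-8 and then decodes it twice, once as UTF-8 and once as GBK. It only illustrates the general failure mode and makes no claim about how any particular Baidu endpoint behaves.

```python
from urllib.parse import quote, unquote

name = "萌客天行"

# quote() percent-encodes the UTF-8 bytes of the string by default.
encoded = quote(name)

# Decoding with the matching charset round-trips cleanly.
as_utf8 = unquote(encoded, encoding="utf-8")

# Decoding the same bytes as GBK yields different characters;
# errors="replace" keeps byte sequences that are invalid in GBK
# from raising an exception.
as_gbk = unquote(encoded, encoding="gbk", errors="replace")

print(as_utf8 == name)  # True
print(as_gbk == name)   # False
```

A server that assumes GBK when the client sent UTF-8 would end up searching for the garbled string, which is why the ie parameter matters.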

@Starry-OvO
Owner

This should be fixed now. Yet another case of programming against a mountain of legacy mess: f51e1bd

3 participants