1. The author link has changed from `/author` to `/search`.
    def author_filter(self, a_element):
        a_href = a_element.attrib['href']
        return '/search' in a_href
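2. The `div` tag wrapping the description is incomplete, so the extracted description includes content that is not part of the description. Trim everything after `</div>`.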
    book['description'] = ''
    if len(summary_element):
        summary = etree.tostring(summary_element[-1], encoding="utf8").decode("utf8").strip()
        # Keep everything up to and including the first closing </div>
        book['description'] = summary[0:(summary.index("</div>") + 6)]
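The slice above raises `ValueError` when the serialized markup contains no `</div>` at all, because `str.index` fails on a missing substring. A minimal sketch of a more defensive variant follows; the helper name `trim_after_first_div_close` is hypothetical and not part of the project.

```python
def trim_after_first_div_close(summary: str) -> str:
    """Return `summary` cut right after its first closing </div>.

    Hypothetical helper: if no </div> is present, the input is
    returned unchanged instead of raising ValueError.
    """
    closing = "</div>"
    end = summary.find(closing)  # -1 when the tag is missing
    if end == -1:
        return summary
    return summary[:end + len(closing)]
```

Note that this still cuts at the first `</div>`, so a description containing a nested `div` would be truncated at the nested element's closing tag.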
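The first problem is real, but it is not caused by a change in Douban's page structure. More likely, some authors have no matching entry in Douban's author database, so their links point to search instead.
I will not change the second one for now. `div` tags are common inside descriptions, so simply cutting them off will not work (it would be best to check several more examples first).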
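On the first problem, you are right that the cause is authors who are not mapped. This case is very common though, especially for foreign authors, so it may be worth supporting both cases. Other links, such as series links starting with `/series` and producer links starting with `/producer`, should not be affected.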
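Supporting both cases could look roughly like the sketch below. It assumes, as in the snippet from the issue, that the filter receives an lxml-style element exposing `attrib`; the constant `AUTHOR_LINK_MARKERS` and the helper `is_author_href` are hypothetical names introduced only for illustration.

```python
# Path markers treated as author links: '/author' is the mapped case,
# '/search' is the fallback reported in this thread for unmapped authors.
AUTHOR_LINK_MARKERS = ('/author', '/search')


def is_author_href(href: str) -> bool:
    """True if the href contains any of the author link markers."""
    return any(marker in href for marker in AUTHOR_LINK_MARKERS)


def author_filter(a_element) -> bool:
    """Accept an <a> element whose href looks like an author link."""
    # .get() avoids a KeyError for anchors without an href attribute.
    return is_author_href(a_element.attrib.get('href', ''))
```

With made-up hrefs, `is_author_href('/author/123/')` and `is_author_href('/search/Some%20Name')` are both true, while `/series/42` and `/producer/7` are rejected. Because this is a substring check, any other link that happens to contain `/search` would also match.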
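On the second problem, I looked at roughly twenty books. The children of the description's `<div class="intro">` are all `p` tags, and so far I have not seen any `div` tags among them.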
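To check more samples than can be inspected by hand, a small test like the following could flag descriptions where cutting at the first `</div>` would be unsafe. This is only a sketch under the assumption that the input is the serialized markup starting at the intro `div`; the function name `first_div_close_is_safe` is hypothetical.

```python
import re


def first_div_close_is_safe(fragment: str) -> bool:
    """Report whether trimming at the first </div> keeps the whole intro.

    If a second <div opens before the first </div>, that closing tag
    belongs to a nested div and trimming there would cut the intro short.
    """
    end = fragment.find("</div>")
    if end == -1:
        return False  # no closing tag, nothing to trim at
    head = fragment[:end]
    # Count opening div tags before the first closing tag.
    return len(re.findall(r"<div\b", head)) <= 1
```

Running this over a batch of saved pages would show how often a nested `div` actually occurs before deciding whether the trim is acceptable.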