update zhihu.com.txt, update user-agent #578

Merged
merged 1 commit into from Dec 19, 2018

@xdtianyu
Contributor

xdtianyu commented Dec 12, 2018

No description provided.

@lizyn

lizyn commented Dec 16, 2018

The page elements on zhihu.com have changed, so elements such as author and date need to be extracted a different way. For some pages, even extraction of the content body fails with the current config. You can test it with https://www.zhihu.com/question/20637942 and see whether it works.

I have modified the config file accordingly and tested it on my local machine. In that case, should I open a new pull request or post the changes here?

@fivefilters fivefilters merged commit 3b2e7ad into fivefilters:master Dec 19, 2018
@fivefilters

Owner

fivefilters commented Dec 19, 2018

Hi @lizyn, you can send us a pull request here. It's possible to do it directly via GitHub's web interface if you haven't cloned the repository.

Kdecherf added a commit to Kdecherf/ftr-site-config that referenced this pull request Mar 8, 2019
@TotiUY

TotiUY commented Mar 15, 2019

There seems to be a problem fetching zhihu.com or zhuanlan.zhihu.com now. Is it possible to update the file?
I tested several times with https://f43.me and confirmed that nothing can be grabbed from Zhihu.

Thank you.

4 participants