added newspaper to web crawling #987

Closed
jjanczyszyn wants to merge 1 commit into vinta:master from jjanczyszyn:web-crawling

@jjanczyszyn

What is this Python project?

A library that makes it easy to crawl for and scrape articles.

What's the difference between this Python project and similar ones?

  • Extremely easy to use
  • Multi-threaded article download framework
  • News URL identification
  • Text extraction from HTML
  • Top image extraction from HTML
  • All image extraction from HTML
  • Keyword extraction from text
  • Summary extraction from text
  • Author extraction from text
  • Google trending terms extraction
  • Works in 10+ languages (English, Chinese, German, Arabic, ...)
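
To make two of the listed features concrete (text extraction and top image extraction from HTML), here is a minimal standard-library sketch. It is not newspaper's API or implementation: the names `ArticleSketch` and `extract` are hypothetical, and treating `<p>` elements as the article body and the `og:image` meta tag as the top image are simplifying assumptions.

```python
from html.parser import HTMLParser


class ArticleSketch(HTMLParser):
    """Hypothetical helper: collect <p> text and the og:image URL."""

    def __init__(self):
        super().__init__()
        self.paragraphs = []   # cleaned text of each <p>, in document order
        self.top_image = None  # first og:image URL seen, if any
        self._in_p = False     # True while inside a <p> element
        self._buf = []         # text chunks of the current paragraph

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "p":
            self._in_p = True
            self._buf = []
        elif tag == "meta" and attrs.get("property") == "og:image":
            # Assumption: the Open Graph image stands in for the "top image".
            if self.top_image is None:
                self.top_image = attrs.get("content")

    def handle_endtag(self, tag):
        if tag == "p" and self._in_p:
            # Collapse runs of whitespace and skip empty paragraphs.
            text = " ".join("".join(self._buf).split())
            if text:
                self.paragraphs.append(text)
            self._in_p = False

    def handle_data(self, data):
        if self._in_p:
            self._buf.append(data)


def extract(html):
    """Return the paragraph text and top image URL found in an HTML string."""
    parser = ArticleSketch()
    parser.feed(html)
    parser.close()
    return {
        "text": "\n\n".join(parser.paragraphs),
        "top_image": parser.top_image,
    }
```

Calling `extract(page_html)` on a downloaded page returns a dict with `text` and `top_image` keys. A real extractor such as newspaper has to cope with navigation, ads, and pages without Open Graph tags, which this sketch ignores.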

Anyone who agrees with this pull request can vote for it by adding a 👍 to it; the maintainer will usually merge it once the votes reach 20.

@wang736838506

Hello

@vinta force-pushed the master branch 2 times, most recently from 23abd09 to 40cd98b, June 6, 2019 19:55
@stale

stale bot commented Oct 30, 2019

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

stale bot added the stale label Oct 30, 2019
stale bot closed this Nov 6, 2019