edsu/wplinks

wplinks

wplinks provides a generator function called extlinks that lets you iterate through the links from Wikipedia articles to a particular website, or a portion of a website. It also provides a links function, which lets you iterate through the other Wikipedia URLs that are linked from a given Wikipedia URL.

So for example, to see which Wikipedia articles point at interviews on The Paris Review website:

from wplinks import extlinks 

for src, target in extlinks('http://www.theparisreview.org/interviews'):
    print src, target

By default you get links for the English Wikipedia, but if you'd like results for the French Wikipedia instead, use the lang parameter:

from wplinks import extlinks

for src, target in extlinks('http://www.theparisreview.org/interviews', lang='fr'):
    print src, target
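
Since extlinks yields (src, target) pairs, as in the examples above, the results are easy to post-process. The following is a minimal sketch in Python 3 rather than part of wplinks: group_by_article is a hypothetical helper name, and it accepts any iterable of pairs, so it does not depend on how extlinks is implemented.

```python
from collections import defaultdict


def group_by_article(pairs):
    """Map each source article to the sorted list of target URLs it links to.

    Hypothetical helper, not part of wplinks; pairs is any iterable of
    (src, target) tuples, such as the ones extlinks yields.
    """
    grouped = defaultdict(set)
    for src, target in pairs:
        # A set drops duplicate (src, target) pairs.
        grouped[src].add(target)
    return {src: sorted(targets) for src, targets in grouped.items()}
```

You could then call group_by_article(extlinks('http://www.theparisreview.org/interviews')) to see, per article, every interview it points at.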

If you'd like to see which other Wikipedia articles a given Wikipedia article links to, use the links function. For example, let's say you want to see which articles the James Joyce article points to:


from wplinks import links

for url in links('http://en.wikipedia.org/wiki/James_Joyce'):
    print url

Why?

wplinks used to be somewhat involved, since it scraped the External links search page. It became quite a bit simpler once I discovered the exturlusage API call. You might prefer to make this API call yourself and page through the results, rather than including wplinks as a dependency. But I left wplinks here in case you'd rather not.
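
If you do want to make the call yourself, here is a rough sketch of paging through list=exturlusage results with the Python 3 standard library. It is not the wplinks implementation: the function names, the fetch parameter, and the User-Agent string are placeholders, and it assumes the API's JSON responses carry a continue object while more results remain. Note that euquery takes the URL without its protocol, and that this sketch yields article titles rather than article URLs.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def fetch_json(api_url, params):
    """GET api_url with the given query parameters and decode the JSON body."""
    request = Request(
        api_url + "?" + urlencode(params),
        # Placeholder User-Agent; replace it with one that identifies you.
        headers={"User-Agent": "exturlusage-sketch/0.1 (example)"},
    )
    with urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


def exturlusage(query, lang="en", fetch=fetch_json):
    """Yield (article_title, target_url) pairs for external links matching query.

    query is the URL without its protocol, for example
    "www.theparisreview.org/interviews". fetch is injectable so the paging
    logic can be exercised without network access.
    """
    api_url = "https://%s.wikipedia.org/w/api.php" % lang
    params = {
        "action": "query",
        "list": "exturlusage",
        "euquery": query,
        "euprop": "title|url",
        "eulimit": "500",
        "format": "json",
    }
    while True:
        data = fetch(api_url, params)
        for hit in data.get("query", {}).get("exturlusage", []):
            yield hit["title"], hit["url"]
        if "continue" not in data:
            break
        # Copy the continuation values into the next request to get the next page.
        params = dict(params, **data["continue"])
```

Something like `for title, url in exturlusage('www.theparisreview.org/interviews', lang='fr'): print(title, url)` would then walk every page of results.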

License

  • CC0
