Fast links parser for Python & Humans

##FastLinks

The missing simple link parser for Python & humans

Use this component if you want to extract HTTP links from content very quickly.

###Overview

Imagine you have this HTML content:

    <LINK REL="SHORTCUT ICON" HREF="favicon.ico" />

    src='/clickme.php?id=10&amp;stats=23d'

    URL="http://www.testsite.com/verygood.html"

    href='www.testsite.com/hello placentas/word.htm'
    href='../test.html'

And all you want is a list of normal-looking, absolute links from it.

###You can do it now!!

Just call:

    links = get_links(content, 'http://www.testsite.com/')

Isn't that trolololowesome?!

Output:

    [1] http://www.testsite.com/test.html
    [2] http://www.testsite.com/hello placentas/word.htm
    [3] http://www.testsite.com/favicon.ico
    [4] http://www.testsite.com/verygood.html
    [5] http://www.testsite.com/clickme.php?id=10&stats=23d
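
For readers curious how this kind of normalization can work, here is a rough sketch built only on the standard library. It is not the FastLinks implementation: the function name `sketch_get_links`, the regular expression, and the rule that a bare `www.` host gets an `http://` prefix are all assumptions made for illustration, inferred from the sample input and output above.

```python
import html
import re
from urllib.parse import urljoin

# Assumed pattern (not taken from fastlinks.py): an href=, src= or url=
# attribute followed by a single- or double-quoted value, any letter case.
_ATTR_RE = re.compile(
    r"""\b(?:href|src|url)\s*=\s*(["'])(.*?)\1""",
    re.IGNORECASE | re.DOTALL,
)


def sketch_get_links(content, base_url):
    """Return absolute links found in content, in order of first appearance.

    Illustrative only; the real get_links may behave differently.
    """
    links = []
    seen = set()
    for match in _ATTR_RE.finditer(content):
        # Decode HTML entities such as &amp; inside the attribute value.
        raw = html.unescape(match.group(2).strip())
        if not raw:
            continue
        # Assumption: a value starting with "www." is a host without a scheme.
        if raw.lower().startswith("www."):
            raw = "http://" + raw
        # Resolve relative paths ("favicon.ico", "../test.html") against the base.
        absolute = urljoin(base_url, raw)
        if absolute not in seen:
            seen.add(absolute)
            links.append(absolute)
    return links


if __name__ == "__main__":
    sample = "href='../test.html' src='/clickme.php?id=10&amp;stats=23d'"
    for number, link in enumerate(sketch_get_links(sample, "http://www.testsite.com/"), 1):
        print(f"[{number}] {link}")
```

The sketch returns links in document order and drops duplicates, so the ordering may differ from the output listed above.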

Please feel free to improve it if you like :)


You can also try CustomStringParser, which offers more power for data mining.