Skip to content

quang-megatron/linkpreview

 
 

Repository files navigation

linkpreview

Build Status Coverage Status pypi

Get link preview in python

Gathering data from:

  1. OpenGraph meta tags
  2. TwitterCard meta tags
  3. Microdata meta tags
  4. JSON-LD meta tags
  5. HTML Generic tags (h1, p, img)
  6. URL readable parts

Install

pip install linkpreview

Usage

Basic

from previewlink_helper import link_preview

url = "http://localhost"
content = """
<!DOCTYPE html>
<html>
  <head>
    <meta charset="utf-8">
    <meta name="viewport" content="width=device-width">
    <!-- ... --->
    <title>a title</title>
  </head>
  <body>
  <!-- ... --->
  </body>
</html>
"""
preview = link_preview(url, content)
print("title:", preview.title)
print("description:", preview.description)
print("image:", preview.image)
print("force_title:", preview.force_title)
print("absolute_image:", preview.absolute_image)
print("site_name:", preview.site_name)
print("favicon:", preview.favicon)
print("absolute_favicon:", preview.absolute_favicon)

Automatic fetch link content

from previewlink_helper import link_preview

preview = link_preview("http://github.com/")
print("title:", preview.title)
print("description:", preview.description)
print("image:", preview.image)
print("force_title:", preview.force_title)
print("absolute_image:", preview.absolute_image)
print("site_name:", preview.site_name)
print("favicon:", preview.favicon)
print("absolute_favicon:", preview.absolute_favicon)

lxml as XML parser

Very recommended for better performance.

Install the lxml and use it like this:

from previewlink_helper import link_preview

preview = link_preview("http://github.com/", parser="lxml")
print("title:", preview.title)
print("description:", preview.description)
print("image:", preview.image)
print("force_title:", preview.force_title)
print("absolute_image:", preview.absolute_image)
print("site_name:", preview.site_name)
print("favicon:", preview.favicon)
print("absolute_favicon:", preview.absolute_favicon)

Advanced

from previewlink_helper import Link, LinkPreview, LinkGrabber

url = "http://github.com"
grabber = LinkGrabber(
    initial_timeout=20,
    maxsize=1048576,
    receive_timeout=10,
    chunk_size=1024,
)
content, url = grabber.get_content(url)
link = Link(url, content)
preview = LinkPreview(link, parser="lxml")
print("title:", preview.title)
print("description:", preview.description)
print("image:", preview.image)
print("force_title:", preview.force_title)
print("absolute_image:", preview.absolute_image)
print("site_name:", preview.site_name)
print("favicon:", preview.favicon)
print("absolute_favicon:", preview.absolute_favicon)

About

Get link preview in python

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages

  • Python 71.0%
  • HTML 28.7%
  • Shell 0.3%