🔗 📦 A single endpoint which generates metadata previews of any given URL
Branch: master
Clone or download
Fetching latest commit…
Cannot retrieve the latest commit at this time.
Permalink
Type Name Latest commit message Commit time
Failed to load latest commit information.
endpoint
img
tests
.DS_Store
.env
.gcloudignore
.gitignore
Pipfile
Pipfile.lock
README.md
fetch.py
main.py
meta.py
requirements.txt
setup.cfg
setup.py

README.md

Linkbox

Python Flask BeautifulSoup Google Cloud Functions GitHub Last Commit GitHub Issues GitHub Stars GitHub Forks

Linkbox is a single endpoint which accepts a ?url= parameter and returns best-guess metadata for the target site. A successful request parses HTML of the target page and discerns which data is best suited to provided a preview of said page. The resulting JSON can be used to format embedded HTML previews, thereby creating a better visual experience as well as countless SEO benefits.

Link Preview

Usage

Public access coming soon!

Request:

curl https://linkbox.link?url=https://www.theatlantic.com/magazine/archive/2018/05/barbara-ehrenreich-natural-causes/556859/

Response:

{
  "author": "Victoria Sweet",
  "feeds": ["https://www.theatlantic.com/feed/all/",
    "https://www.theatlantic.com/feed/best-of/"
  ],
  "image": "https://cdn.theatlantic.com/assets/media/img/2018/04/_BarbaraEhrenreich_FINAL_RVB/facebook.png?1523295067",
  "postOrigin": "https://hackersandslackers.com/lynx-roundup-april-22nd",
  "publishDate": "2018-04-10T08:00:00+00:00",
  "summary": "I went to medical school, at least in part, to get to know death and perhaps to make my peace with it. One day—usually when you’re young, though sometimes later—the thought hits you: You really are going to die. Meanwhile, I watched as what had been called “medical care”—that is, treating the sick—turned into “health care,” keeping people healthy, at an ever-rising cost.",
  "tags": ["control",
    "cancer cells",
    "delays death",
    "ever-rising cost",
    "spiritual epiphanies",
    "strong family history",
    "firsthand experience",
    "fast-growing literature",
    "regular physical exams",
    "Natural Causes",
    "finally unevadable death",
    "new science",
    "immune system—and",
    "immediate health-care costs",
    "chronicling cultural shifts",
    "congenial new home",
    "popular delusions",
    "white blood cell"
  ],
  "title": "Your Body Is a Teeming Battleground",
  "url": "https://www.theatlantic.com/magazine/archive/2018/05/barbara-ehrenreich-natural-causes/556859/",
  "videos": []
}

Functionality in Development

There are several major features which remain in development:

  • Returning Embed HTML: Instead of simply returning JSON containing metadata per request, users will be able to opt in to instead receiving a text response of HTML for an embedded widget.
  • Customized Responses: Some content providers (such as Medium) are intentionally resistant to scrapers. Exceptions for such sources will be handled on a case-by-case basis to ensure meaningful data is returned.
  • Content Awareness: Depending on the content of the link, a different embed will be returned to best display said content.
  • Direct Database Writes: Integration with a site's content database to ensure HTML is hard-embedded to the page as opposed to a client-side script.