merle

A command-line tool for extracting metadata from a URL.

Uses the newspaper3k package and some custom HTML parsing to extract attributes.
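
The custom HTML parsing step is not shown in this README. As a rough illustration of what extracting a title and description from raw HTML can look like, here is a minimal sketch using only the standard library. The names MetaExtractor and extract_meta are hypothetical and are not merle's actual API.

```python
from html.parser import HTMLParser


class MetaExtractor(HTMLParser):
    """Collect the <title> text and <meta> name/property -> content pairs."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.meta = {}
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta":
            # Plain meta tags use "name"; Open Graph tags use "property".
            key = attrs.get("name") or attrs.get("property")
            if key and attrs.get("content") is not None:
                self.meta[key.lower()] = attrs["content"]

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def extract_meta(html):
    """Return a dict with the page title and description (hypothetical helper)."""
    parser = MetaExtractor()
    parser.feed(html)
    return {
        "title": parser.title.strip(),
        # Fall back to the Open Graph description when no plain one exists.
        "description": parser.meta.get("description")
        or parser.meta.get("og:description"),
    }
```

Feeding the function a fetched page body returns a small dictionary, which maps naturally onto the title and description fields in the output shown under "Result".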

Current state of usage:

$ pip install merle
$ merle http://www.newyorker.com/magazine/1988/11/07/counting-votes

Result:

slug: counting-votes-new-yorker-www-newyorker-com
fetched_at: 2016-06-16 11:51:14.707606
url: http://www.newyorker.com/magazine/1988/11/07/counting-votes
title: Counting Votes - The New Yorker
description: |
  Counting Votes - The New Yorker
published_at: 1988-11-07
authors:
  - Richard Brody
  - Gilad Edelman
  - Josephine Livingstone
  - Jason Adam Katzenstein
  - Farid Farid
  - Louis Menand
  - Margaret Talbot
  - Sarah Hutto
word_count: 20691
excerpt: |
  During the past quarter of a century, with hardly anyone noticing, the inner workings of democracy have been computerized. All our elections, from mayor to President, are counted locally, in about ten thousand five hundred political jurisdictions, and gradually, since 1964, different kinds of computer-based voting systems have been installed in town after town, city after city, county after county. ...

Todo:

Create a Document class so that dates can be parsed from URLs
[?] Integrate newspaper egg
title = list(extract_element('title', f).values())[0]
FetchedResource to Dictionary
image url fetcher
Turn into CLI
Make source list in data/publishers.txt
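
The first todo item mentions parsing dates from URLs. One possible approach, sketched here under the assumption that the date appears as a /YYYY/MM/DD/ path segment (as in the two URLs in this README), is a regular expression plus validation. The function name date_from_url is hypothetical, and the planned Document class is not defined here.

```python
import re
from datetime import date

# Assumption: the date is embedded as /YYYY/MM/DD in the URL path.
DATE_IN_PATH = re.compile(r"/(\d{4})/(\d{1,2})/(\d{1,2})(?:/|$)")


def date_from_url(url):
    """Return a datetime.date found in the URL path, or None if absent."""
    match = DATE_IN_PATH.search(url)
    if match is None:
        return None
    year, month, day = (int(part) for part in match.groups())
    try:
        return date(year, month, day)
    except ValueError:
        # Digits matched the pattern but do not form a real calendar date.
        return None
```

For the New Yorker URL used above, this would yield the same date as the published_at field in the sample output. URLs without a date segment return None, so a caller could fall back to whatever the page itself reports.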

Broken URLs

http://www.stanforddaily.com/2013/09/17/transfer-student-experience-offers-rewards-challenges/

