
kootenpv/requests_viewer

requests_viewer

The idea is that requests_viewer quickly shows us information about our requests objects.

It opens an HTML page with information about the request.

import requests
from requests_viewer.web import view_request
view_request(requests.get("https://xkcd.com/"))

# or

from requests_viewer import main
main("https://xkcd.com/") # considers different mime types

Main features

  • The HTML page is shown the way the crawler sees it
    • Can extract the domain and hot-link resources so that the page looks almost indistinguishable from the original
  • Contains other handy functions for showing lxml tree nodes
  • Can visually show diffs between two HTML pages / trees
from requests_viewer.web import view_diff_tree, get_tree
url1 = "http://xkcd.com/"
url2 = "http://xkcd.com/1/"
tree1 = get_tree(url1)  # build the tree directly from a URL
tree2 = get_tree(url2)  # could instead use `make_tree` if you already have the HTML
view_diff_tree(tree1, tree2)

This opens a page that shows the differences between the two trees visually.
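
For a rough feel of what a visual HTML diff involves, here is a standalone sketch built on the standard library's difflib.HtmlDiff. This is a swapped-in technique for illustration only: it compares the raw lines of two HTML strings and is not the method requests_viewer uses on lxml trees. The name `html_diff_page` and the sample pages are made up for the example.

```python
import difflib


def html_diff_page(html1: str, html2: str) -> str:
    """Return a standalone HTML page with a side-by-side line diff.

    Hypothetical helper, not part of requests_viewer.
    """
    differ = difflib.HtmlDiff(wrapcolumn=80)
    return differ.make_file(
        html1.splitlines(),
        html2.splitlines(),
        fromdesc="page 1",
        todesc="page 2",
    )


# Two tiny sample pages that differ in a single line.
page_a = "<html>\n<body>\n<p>Hello</p>\n</body>\n</html>"
page_b = "<html>\n<body>\n<p>Hello, world</p>\n</body>\n</html>"
diff_page = html_diff_page(page_a, page_b)
```

The returned string could be written to a file and opened in a browser. A tree-based diff like the one above would be able to ignore formatting differences that a line-based diff reports.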

Installation

pip install requests_viewer
pip3 install requests_viewer

Note that for the fancier features you should install the extras:

pip install requests_viewer[fancy]
pip3 install requests_viewer[fancy]

This will install lxml, bs4 and tldextract.
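
To check whether the optional dependencies are present in the current environment, something like the following sketch may help. It only uses the standard library. The helper name `missing_fancy_modules` is hypothetical, and the module names are taken from the list above.

```python
from importlib.util import find_spec

# Import names of the optional dependencies listed above.
FANCY_MODULES = ("lxml", "bs4", "tldextract")


def missing_fancy_modules(modules=FANCY_MODULES):
    """Return the names of the modules that cannot be found for import."""
    return [name for name in modules if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_fancy_modules()
    if missing:
        print("Missing optional modules:", ", ".join(missing))
    else:
        print("All optional modules are available.")
```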

Types it can currently show:

  • text/html
  • image/*
  • application/json
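
Choosing a view per type comes down to looking at the Content-Type header of the response. The following is a minimal sketch of such a dispatch, assuming the header value is available as a string (for example `r.headers.get("Content-Type", "")` on a requests response). The function name `pick_viewer` and its return values are hypothetical and do not reflect how `main` is implemented.

```python
def pick_viewer(content_type: str) -> str:
    """Map a Content-Type header value to one of the supported kinds."""
    # Drop parameters such as "; charset=utf-8" and normalise the case.
    mime = content_type.split(";", 1)[0].strip().lower()
    if mime == "text/html":
        return "html"
    if mime.startswith("image/"):
        return "image"
    if mime == "application/json":
        return "json"
    return "unsupported"
```

Stripping the parameters first matters because servers usually send values like `text/html; charset=UTF-8` rather than the bare mime type.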

Usability

Some example web.py functions:

def slugify(value):
def view_request(r, domain=None):
def view_html(x):
def view_node(node, attach_head=False, question_contains=None):
def view_tree(node):
def view_diff_tree(tree1, tree2, url, diff_method):
def view_diff_html(html1, html2, url, diff_method):
def view_diff(html1, html2, tree1, tree2, url, diff_method):
def make_parent_line(node, attach_head=False, question_contains=None):
def extract_domain(url):
def make_tree(html, domain=None):
def get_tree(url, domain=None):
def get_local_tree(url, domain=None):
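
Several of these functions take a `domain` argument, which relates to the hot-linking feature: relative links in the fetched HTML have to point back at the original site for the page to render properly from a local file. The sketch below shows one way this could be done with the standard library only. The names `base_of` and `absolutize_links` are hypothetical, the regular expression is a simplification that only handles quoted `href` and `src` attributes, and none of this is taken from the package's own code.

```python
import re
from urllib.parse import urljoin, urlsplit

# Matches href="..." or src='...' (quoted attribute values only).
_LINK_ATTR = re.compile(
    r"(?P<attr>\b(?:href|src))=(?P<quote>[\"'])(?P<link>.*?)(?P=quote)",
    re.IGNORECASE,
)


def base_of(url: str) -> str:
    """Return scheme and host of a URL, e.g. 'https://xkcd.com/'."""
    parts = urlsplit(url)
    return f"{parts.scheme}://{parts.netloc}/"


def absolutize_links(html: str, page_url: str) -> str:
    """Rewrite relative href/src values so they point at the original site."""

    def fix(match):
        attr, quote, link = match.group("attr", "quote", "link")
        # urljoin leaves absolute URLs untouched and resolves relative ones.
        return f"{attr}={quote}{urljoin(page_url, link)}{quote}"

    return _LINK_ATTR.sub(fix, html)
```

With the fancy extras installed, an HTML parser such as lxml would be a more robust choice than a regular expression for this kind of rewriting.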

Contribute

This package is very small at the moment. I very much encourage you to contribute:

  • Most likely we will want to show the headers at the top of the page (HTML)
  • Make the encoding an argument (instead of fixed utf8)
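
As a starting point for the encoding item, here is a minimal sketch of what a viewer with a configurable encoding could look like. It is not the package's current code: `save_html` and `view_html_with_encoding` are hypothetical names, and writing to a temporary file before opening the browser is an assumption about the general approach.

```python
import tempfile
import webbrowser
from pathlib import Path


def save_html(html: str, encoding: str = "utf8") -> Path:
    """Write html to a temporary .html file using the given encoding."""
    with tempfile.NamedTemporaryFile(
        "w", suffix=".html", delete=False, encoding=encoding
    ) as f:
        f.write(html)
    return Path(f.name)


def view_html_with_encoding(html: str, encoding: str = "utf8") -> Path:
    """Save the HTML with the chosen encoding and open it in the browser."""
    path = save_html(html, encoding=encoding)
    webbrowser.open(path.as_uri())
    return path
```

Keeping `utf8` as the default would preserve the existing behaviour for callers that do not pass the new argument.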

Note that I use yapf with max-line=100 to avoid any styling discussion.

About

View requests objects with style
