PyExposeHtml is a Python library that helps you scrape web pages. Given a URL, it reports information about the page, such as the number of CSS and JS files, HTML elements, META elements, forms, and onclick events.
Version 0.0.1:
- First version
Use the package manager pip to install PyExposeHtml:
pip install PyExposeHtml

After installing the package, import it:
from py_expose_html import expose

Create an instance of ExposeHtmlDocument. The constructor takes a URL, which is the page that will be scraped:
expose_doc = ExposeHtmlDocument('https://some-random-url')

Return the total number of CSS files referenced in the HTML page:
total_css = expose_doc.count_css()

Return the total number of JS files referenced in the HTML page:
total_js = expose_doc.count_js()

Return the total number of HTML elements:
total_html_elements = expose_doc.count_total_elements()

Return the total number of META elements:
total_meta = expose_doc.count_meta_elements()

Return all the JS content:
total_js_content = expose_doc.get_js_content()

Return all the CSS content:
total_css_content = expose_doc.get_css_content()

Return the total number of onclick events across all elements in the HTML page:
total_onclick_events = expose_doc.count_onclick_events()

Return the total number of forms in the HTML page:
total_forms = expose_doc.count_forms_elements()

Return the action and HTTP method of the form:
form_info = expose_doc.get_form_info()

Return the size of the page in KB:
page_size = expose_doc.get_page_size()

Return the onclick values:
onclick_values = expose_doc.get_onclick_values()

Return True if the page has an AJAX call, False otherwise:
has_ajax_call = expose_doc.has_ajax_call()

Return a JSON report with the amount of information found:
report_json = expose_doc.generate_report()

Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.
Please make sure to update tests as appropriate.
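
To make the counts described above more concrete, here is a minimal sketch of how similar numbers could be computed with only the Python standard library. It is not PyExposeHtml's implementation and does not call the package: `PageStatsParser`, `summarize`, `SAMPLE_HTML`, and the report keys are hypothetical names chosen for illustration. It parses an inline HTML string instead of fetching a URL, and it assumes that a "CSS file" means a `<link rel="stylesheet">` tag and a "JS file" means a `<script>` tag with a `src` attribute.

```python
import json
from html.parser import HTMLParser


class PageStatsParser(HTMLParser):
    """Hypothetical helper that tallies a few page features while parsing."""

    def __init__(self):
        super().__init__()
        self.total_elements = 0
        self.css_files = 0
        self.js_files = 0
        self.meta_elements = 0
        self.onclick_values = []
        self.forms = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names before calling this.
        attrs = dict(attrs)
        self.total_elements += 1
        rel = (attrs.get("rel") or "").lower().split()
        if tag == "link" and "stylesheet" in rel:
            self.css_files += 1  # assumption: external stylesheet = CSS file
        elif tag == "script" and attrs.get("src"):
            self.js_files += 1  # assumption: script with src = JS file
        elif tag == "meta":
            self.meta_elements += 1
        elif tag == "form":
            self.forms.append({
                "action": attrs.get("action") or "",
                # HTML forms default to GET when no method is given.
                "method": (attrs.get("method") or "get").upper(),
            })
        if attrs.get("onclick"):
            self.onclick_values.append(attrs["onclick"])


def summarize(html):
    """Parse an HTML string and return a dict of counts (hypothetical layout)."""
    parser = PageStatsParser()
    parser.feed(html)
    parser.close()
    return {
        "css_files": parser.css_files,
        "js_files": parser.js_files,
        "total_elements": parser.total_elements,
        "meta_elements": parser.meta_elements,
        "forms": parser.forms,
        "onclick_values": parser.onclick_values,
        "page_size_kb": len(html.encode("utf-8")) / 1024,
    }


SAMPLE_HTML = """<html><head>
<meta charset="utf-8">
<link rel="stylesheet" href="site.css">
<script src="app.js"></script>
</head><body>
<form action="/login" method="post"><button onclick="send()">Go</button></form>
</body></html>"""

report = summarize(SAMPLE_HTML)
print(json.dumps(report, indent=2))
```

The sketch counts start tags only, so inline `<style>` and `<script>` blocks add to the element total but not to the CSS or JS file counts. The library may define these counts differently.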