
Maintained by rawandahmed698: a library that reads a YAML file of XPath or CSS selectors and uses them to extract data from HTML pages


rawandahmad698/selectorlib

 
 


selectorlib


A library that reads a YAML file of XPath or CSS selectors and uses them to extract data from HTML pages.

Example

>>> from selectorlib import Extractor
>>> yaml_string = """
    title:
        css: "h1"
        type: Text
    link:
        css: "h2 a"
        type: Link
    """
>>> extractor = Extractor.from_yaml_string(yaml_string)
>>> html = """
    <h1>Title</h1>
    <h2>Usage
        <a class="headerlink" href="http://test">¶</a>
    </h2>
    """
>>> extractor.extract(html)
{'title': 'Title', 'link': 'http://test'}
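
To make the idea behind the example concrete, here is a minimal standard-library sketch of selector-driven extraction. It is not selectorlib's implementation: the `FIELDS` dict and the `extract` helper are hypothetical names invented for illustration, the spec is a plain dict rather than YAML (the standard library has no YAML parser), and `xml.etree.ElementTree` supports only a limited XPath subset and requires well-formed markup.

```python
import xml.etree.ElementTree as ET

# Hypothetical field spec mirroring the YAML above, written as a dict.
# The "xpath" values use the limited XPath subset ElementTree supports.
FIELDS = {
    "title": {"xpath": ".//h1", "type": "Text"},
    "link": {"xpath": ".//h2/a", "type": "Link"},
}


def extract(html, fields):
    """Return {field_name: value} for a well-formed HTML fragment."""
    # Wrap the fragment in a single root element so ElementTree can parse it.
    root = ET.fromstring(f"<root>{html}</root>")
    result = {}
    for name, spec in fields.items():
        node = root.find(spec["xpath"])  # first match only
        if node is None:
            result[name] = None
        elif spec["type"] == "Link":
            result[name] = node.get("href")
        else:
            # "Text": concatenate all text inside the element.
            result[name] = "".join(node.itertext()).strip()
    return result


html = """
<h1>Title</h1>
<h2>Usage
    <a class="headerlink" href="http://test">¶</a>
</h2>
"""
print(extract(html, FIELDS))
```

The sketch keeps the selectors in data rather than in code, which is the same separation the YAML file provides: changing what gets extracted means editing the spec, not the extraction loop.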
