ppURI

A pyparsing-based URI parser/scanner library.

Install using pip or your tool of choice, e.g.

pip install ppuri
poetry add ppuri

Usage

Parsing

Either import ppuri.uri and use its parse function, which matches and parses against all supported URI schemes, e.g.

from ppuri import uri
info = uri.parse("https://www.example.com:443/a.path?q=aparam#afragment")
print(info)

prints

{
  "authority": { "address": "www.example.com", "port": "443" },
  "fragment": "afragment",
  "parameters": [{ "name": "q", "value": "aparam" }],
  "path": "/a.path",
  "scheme": "https",
  "uri": "https://www.example.com:443/a.path?q=aparam#afragment"
}
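
The `parameters` entry is a list of name/value dictionaries rather than a mapping, so looking up a single query parameter takes a small helper. The sketch below works on a literal copy of the output shown above, so it runs without ppuri installed; `params_to_dict` is a hypothetical helper and not part of ppuri.

```python
# Literal copy of the dictionary printed above.
info = {
    "authority": {"address": "www.example.com", "port": "443"},
    "fragment": "afragment",
    "parameters": [{"name": "q", "value": "aparam"}],
    "path": "/a.path",
    "scheme": "https",
    "uri": "https://www.example.com:443/a.path?q=aparam#afragment",
}


def params_to_dict(parameters):
    """Collect a list of {"name", "value"} entries into {name: [values]}.

    Hypothetical helper. Values are kept in lists because a query
    string may repeat the same name.
    """
    result = {}
    for param in parameters:
        result.setdefault(param["name"], []).append(param["value"])
    return result


query = params_to_dict(info["parameters"])
print(query["q"])                 # ['aparam']
print(info["authority"]["port"])  # 443 (a string, not an int)
```

Note that the port is returned as a string in the output above, so convert it with `int()` if a number is needed.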

Or import a specific scheme's parse function and use that to parse, e.g.

from ppuri.scheme import http
info = http.parse("https://www.example.com:443/a.path?q=aparam#afragment")

Scanning

To scan text for URIs, use the scan method.

Supported schemes

ppURI currently supports the following schemes:

  • http(s)
  • urn
  • data
  • file
  • mailto
  • about
  • aaa
  • coap
  • crid

Http(s)

uri.parse() on an HTTP or HTTPS URL returns a dictionary of the form

{
  "scheme": "http or https",
  "authority": {
    "address": "hostname or ipv4 address or ipv6 address",
    "port": "port number",
    "username": "user name if provided",
    "password": "password if provided"
  },
  "path": "path if provided",
  "parameters": [
    // list of parameters if provided
    {
      "name": "parameter name",
      "value": "parameter value or None if not provided"
    }
  ],
  "fragment": "fragment if provided",
  "uri": "The full URI"
}
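
Because each component is returned separately, the dictionary can be turned back into a URL string. The following is a minimal sketch, assuming that optional keys are either missing or `None` when not provided (the description above does not say which). It applies no percent-encoding and does not add brackets around IPv6 addresses. `build_http_uri` is a hypothetical helper, not part of ppuri.

```python
def build_http_uri(info):
    """Reassemble a URL from a dictionary of the form shown above."""
    authority = info["authority"]

    # host[:port]
    netloc = authority["address"]
    if authority.get("port"):
        netloc += ":" + str(authority["port"])

    # optional user[:password]@ prefix
    if authority.get("username"):
        userinfo = authority["username"]
        if authority.get("password"):
            userinfo += ":" + authority["password"]
        netloc = userinfo + "@" + netloc

    uri = info["scheme"] + "://" + netloc + (info.get("path") or "")

    # A parameter whose value is None is written as a bare name.
    pairs = []
    for param in info.get("parameters") or []:
        if param.get("value") is None:
            pairs.append(param["name"])
        else:
            pairs.append(param["name"] + "=" + param["value"])
    if pairs:
        uri += "?" + "&".join(pairs)

    if info.get("fragment"):
        uri += "#" + info["fragment"]
    return uri


example = {
    "scheme": "https",
    "authority": {"address": "www.example.com", "port": "443"},
    "path": "/a.path",
    "parameters": [{"name": "q", "value": "aparam"}],
    "fragment": "afragment",
}
print(build_http_uri(example))
# https://www.example.com:443/a.path?q=aparam#afragment
```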

Urn

uri.parse() on a URN returns a dictionary of the form

{
  "scheme": "urn",
  "nid": "Namespace Identifier",
  "nss": "Namespace Specific String",
  "uri": "The full URI"
}

MailTo

uri.parse() on a mailto URI returns a dictionary of the form

{
  "scheme": "mailto",
  "addresses": [
    "List of email addresses"
  ],
  "parameters": [
    // list of parameters if provided
    {
        "name": "bcc",
        "value": "dave@example.com"
    }
  ],
  "uri": "The full URI"
}

Data

uri.parse() on a data URI returns a dictionary of the form

{
  "scheme": "data",
  "type": "Mime type",
  "subtype": "Mime Subtype",
  "encoding": "base64 if specified",
  "data": "The actual data",
  "uri": "The full URI"
}
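
To get at the payload bytes, the `encoding` field decides how `data` should be decoded. The sketch below assumes that `data` holds the payload exactly as it appeared in the URI and that `encoding` is missing or `None` when base64 was not specified; neither is confirmed by the description above. `decode_data_payload` is a hypothetical helper, not part of ppuri.

```python
import base64
from urllib.parse import unquote_to_bytes


def decode_data_payload(info):
    """Return the payload of a parsed data URI as bytes."""
    if info.get("encoding") == "base64":
        return base64.b64decode(info["data"])
    # Without base64, data URIs carry percent-encoded text.
    return unquote_to_bytes(info["data"])


# Hand-written dictionaries in the documented shape.
b64_info = {
    "scheme": "data",
    "type": "text",
    "subtype": "plain",
    "encoding": "base64",
    "data": "SGVsbG8=",
}
plain_info = {
    "scheme": "data",
    "type": "text",
    "subtype": "plain",
    "data": "Hello%2C%20World",
}

print(decode_data_payload(b64_info))    # b'Hello'
print(decode_data_payload(plain_info))  # b'Hello, World'
```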

File

uri.parse() on a file URI returns a dictionary of the form

{
  "scheme": "file",
  "path": "The /file/path",
  "uri": "The full URI"
}
