HTMLExtract
Introduced in t-ui beta 6.6
This feature lets you extract text from HTML pages and display it inside t-ui.
XPath is a language used to find particular nodes and tags in HTML/XML documents. It's very easy to understand, and very powerful.
- Tutorial: w3schools
- Examples: w3schools
- Tester: freeformatter.com
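As a rough illustration of what an XPath expression selects, here is a minimal Python sketch built on the standard library's `xml.etree.ElementTree`, which supports only a subset of XPath. The sample page, the `select_links` helper and the `example.com` URLs are assumptions made for this example; this is not how t-ui evaluates expressions internally.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed sample page (real-world HTML is often not valid XML).
PAGE = """
<html>
  <body>
    <a class="foo" href="https://example.com/one">First</a>
    <a class="bar" href="https://example.com/two">Second</a>
    <a class="foo" href="https://example.com/three">Third</a>
  </body>
</html>
"""


def select_links(markup, css_class):
    """Return (text, href) for every <a> whose class attribute equals css_class."""
    root = ET.fromstring(markup)
    # ElementTree expects a relative path, so //a[...] is written as .//a[...]
    nodes = root.findall(f".//a[@class='{css_class}']")
    return [(node.text, node.get("href")) for node in nodes]


if __name__ == "__main__":
    for text, href in select_links(PAGE, "foo"):
        print(text, href)
```

Every node returned by a single expression is of the same kind (here, `<a>` elements), which is why one format can be applied to all of them.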
JSONPath offers the same features as XPath, but it works with JSON.
- Tutorial: http://goessner.net/
- Tester: jsonpath.com
This JSONPath
$.[bid,ask,last]
won't work unless you quote each property name, like this:
$.['bid','ask','last']
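To make the quoting rule concrete, the sketch below mimics it in plain Python for a top-level selection of several properties. The `TICKER` payload and the `select_union` helper are hypothetical, and the regular expressions only imitate the rule described above; they are not the parser used by t-ui or by any JSONPath library.

```python
import json
import re

# Hypothetical ticker payload; the field names mirror the expression above.
TICKER = json.loads('{"bid": 101.5, "ask": 102.0, "last": 101.8, "volume": 42}')

UNION = re.compile(r"^\$\.?\[(.*)\]$")
QUOTED_KEY = re.compile(r"'([^']*)'")


def select_union(document, expression):
    """Evaluate a top-level union such as $.['bid','ask','last'] on a dict."""
    match = UNION.match(expression.strip())
    if match is None:
        raise ValueError(f"not a top-level union expression: {expression!r}")
    keys = QUOTED_KEY.findall(match.group(1))
    if not keys:
        # Bare names, as in $.[bid,ask,last], are rejected here.
        raise ValueError("property names inside [...] must be single-quoted")
    return [document[key] for key in keys if key in document]


if __name__ == "__main__":
    print(select_union(TICKER, "$.['bid','ask','last']"))
```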
Values:
- %n -> newline
- %t -> tag name
- %t(attributeName) -> the value of the attribute attributeName of the matched node
- %a(format)(separator) -> prints every attribute of the matched nodes
  - %an -> attribute name
  - %av -> attribute value
- %v -> tag value
- #[URL] -> link
- #rrggbb[text] -> color the text
- #[replaceThis/with][replaceAlsoThis/withThis]... -> replace the text in front of the group
Note that a replace group applies only to the word or group of words that immediately follows it. For instance, #[replace/with]%t%v will affect only %t.
Moreover, color and links aren't allowed inside the replace group. You'll have to put them outside.
The / symbol is the value of optional_value_separator.
Matched node:
<a href="https://github.com/Andre1299/TUI-ConsoleLauncher/subscription" class="myClass" role="button">This is a link</a>
Format:
#[%t(href)]
Output:
https://github.com/Andre1299/TUI-ConsoleLauncher/subscription
Format:
%t -> %v%n%a(%an = %av)(%n)
Output:
a -> This is a link
href = https://github.com/Andre1299/TUI-ConsoleLauncher/subscription
class = myClass
role = button
Format:
#[a/linkTag]%t -> #[is/is not][link/plain text]%v%n%a(%an = %av)(%n)
Output:
linkTag -> This is not a plain text
href = https://github.com/Andre1299/TUI-ConsoleLauncher/subscription
class = myClass
role = button
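The % placeholders can be read as a small template language. The following Python sketch is one possible interpretation of them and is not t-ui's implementation: the `render` function and the `TOKEN` pattern are assumptions for illustration, and the # groups (links, colors, replacements) are deliberately left out. It reproduces the output of the second example above.

```python
import re
import xml.etree.ElementTree as ET

# The matched node from the examples above.
NODE = ET.fromstring(
    '<a href="https://github.com/Andre1299/TUI-ConsoleLauncher/subscription" '
    'class="myClass" role="button">This is a link</a>'
)

# Alternatives are ordered so that the longer placeholders win over plain %t.
TOKEN = re.compile(r"%a\((.*?)\)\((.*?)\)|%t\((.*?)\)|%t|%v|%n")


def render(fmt, node):
    """Expand the % placeholders of fmt against one matched node."""

    def expand(match):
        token = match.group(0)
        if token.startswith("%a("):
            # %a(format)(separator): apply format to every attribute, then join.
            attr_format, separator = match.group(1), match.group(2)
            parts = [
                attr_format.replace("%an", name).replace("%av", value)
                for name, value in node.attrib.items()
            ]
            return separator.replace("%n", "\n").join(parts)
        if token.startswith("%t("):
            # %t(attributeName): value of a single attribute.
            return node.get(match.group(3), "")
        return {"%t": node.tag, "%v": node.text or "", "%n": "\n"}[token]

    return TOKEN.sub(expand, fmt)


if __name__ == "__main__":
    print(render("%t -> %v%n%a(%an = %av)(%n)", NODE))
```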
You can select any number of nodes, but all of them will be of the same kind. Decide carefully what kind of nodes you need.
htmlextract -add [json OR xpath] [ID] [expression]
For instance:
htmlextract -add xpath 1 //a[@class="foo"]
htmlextract -add format [ID] [expression]
For instance:
htmlextract -add format 5 #[%t(href)]
htmlextract -query [ID] [optional: Format ID] [webpage]
For instance:
htmlextract -query 1 5 https://website.com/page.html
Note that [Format ID] is optional: if you omit it, t-ui will use the value of htmlextract_default_format instead.
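The optional argument can be pictured with a short Python sketch. It assumes that IDs are integers (as in the examples above) and that the number of arguments alone decides whether a format ID was given; the `Query` type and the `parse_query` helper are hypothetical and do not describe t-ui's actual parser.

```python
from typing import NamedTuple, Optional


class Query(NamedTuple):
    path_id: int
    format_id: Optional[int]  # None means: fall back to htmlextract_default_format
    url: str


def parse_query(command: str) -> Query:
    """Split an 'htmlextract -query ...' command into its ID(s) and webpage."""
    tokens = command.split()
    if tokens[:2] != ["htmlextract", "-query"]:
        raise ValueError("expected: htmlextract -query [ID] [Format ID] [webpage]")
    args = tokens[2:]
    if len(args) == 2:
        # Format ID omitted.
        return Query(int(args[0]), None, args[1])
    if len(args) == 3:
        return Query(int(args[0]), int(args[1]), args[2])
    raise ValueError("expected two or three arguments after -query")


if __name__ == "__main__":
    print(parse_query("htmlextract -query 1 5 https://website.com/page.html"))
    print(parse_query("htmlextract -query 1 https://website.com/page.html"))
```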
Francesco Andreuzzi, Italy, andreuzzi.francesco@gmail.com