New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] Add Element class for manipulating DOM elements in Lua #471

Closed
wants to merge 4 commits into
base: master
from

Conversation

Projects
None yet
4 participants
@mike1808
Contributor

mike1808 commented Jun 22, 2016

some test lua code

function main(splash)
  assert(splash:go('http://localhost:8000/'))

  local element = splash:select('button')
  element:mouse_click()
  assert(splash:wait(0.0001))

  local title = splash:select('#title')

  return {title:get_styles()["display"], title:node_property('nodeName')}
end

for the following html

<!DOCTYPE html>
<html>
<head>
   <title>Index 1</title>
</head>
<body>
<button type="button" onclick="document.getElementById('title').style.display = 'block'">Click me</button>
<h1 id="title" style="display: none">Index 1</h1>
</body>
</html>
Show outdated Hide outdated splash/browser_tab.py
Show outdated Hide outdated splash/browser_tab.py
Show outdated Hide outdated splash/html_element.py
@mike1808

This comment has been minimized.

Show comment
Hide comment
@mike1808

mike1808 Jul 7, 2016

Contributor

@Youwotma @immerrr I've changed the method of getting boundary rectangle for png and jpeg methods to getBoundingClientRect. Now those methods return None if element is not visible.

Also I moved all used Casper.JS methods to separate file and added MIT license to them.

@Youwotma check please the node_method both in Lua and Python. I have 2 different implementation for Python API and Lua.

Contributor

mike1808 commented Jul 7, 2016

@Youwotma @immerrr I've changed the method of getting boundary rectangle for png and jpeg methods to getBoundingClientRect. Now those methods return None if element is not visible.

Also I moved all used Casper.JS methods to separate file and added MIT license to them.

@Youwotma check please the node_method both in Lua and Python. I have 2 different implementation for Python API and Lua.

Show outdated Hide outdated splash/qtrender_lua.py
@mike1808

This comment has been minimized.

Show comment
Hide comment
@mike1808

mike1808 Jul 30, 2016

Contributor

@immerrr @kmike I've added docs and test.

Contributor

mike1808 commented Jul 30, 2016

@immerrr @kmike I've added docs and test.

Element Object
==============
Element objects are instanced by :ref:`splash-select`;

This comment has been minimized.

@kmike

kmike Aug 8, 2016

Member

instantiated, created

@kmike

kmike Aug 8, 2016

Member

instantiated, created

.. note::
If the specified method returns another HTML element it will be represented as a table with
``type`` property which equals to ``'node'``

This comment has been minimized.

@kmike

kmike Aug 8, 2016

Member

why not Element instance?

@kmike

kmike Aug 8, 2016

Member

why not Element instance?

This comment has been minimized.

@mike1808

mike1808 Aug 8, 2016

Contributor

It's already fixed in the other PR. I am going to refactor this PR.

@mike1808

mike1808 Aug 8, 2016

Contributor

It's already fixed in the other PR. I am going to refactor this PR.

This comment has been minimized.

@kmike

kmike Aug 8, 2016

Member

👍

@kmike

kmike Aug 8, 2016

Member

👍

Trigger mouse click event on the top-left corner of the element.
**Signature:** ``ok, reason = element:mouse_click()``

This comment has been minimized.

@kmike

kmike Aug 8, 2016

Member

It'd be nice to support relative coordinates here, defaulting to 0,0

@kmike

kmike Aug 8, 2016

Member

It'd be nice to support relative coordinates here, defaulting to 0,0

This comment has been minimized.

@mike1808

mike1808 Aug 13, 2016

Contributor

Ok
What if that coordinate exceed the element width or height? Just ignore it and click on the other element or throw an error?

@mike1808

mike1808 Aug 13, 2016

Contributor

Ok
What if that coordinate exceed the element width or height? Just ignore it and click on the other element or throw an error?

This comment has been minimized.

@kmike

kmike Aug 14, 2016

Member

I'd say click anyways, without throwing an error.

@kmike

kmike Aug 14, 2016

Member

I'd say click anyways, without throwing an error.

Return a screenshot of the element in PNG format
**Signature:** ``ok, shot = element:png{width=nil, height=nil, scale_method='raster'}``

This comment has been minimized.

@kmike

kmike Aug 8, 2016

Member

I used a similar function in a project, and it was really nice to be able to take a screenshot of an area which is slightly larger than element itself. What do you think about adding 'pad' argument - a number of pixels to add to the screenshot around the element? Maybe it can also be a table with {top, left, bottm, right} padding widths. I know, more features.. :)

In implementation it'd be nice to have padding function separate and public, so that people can use it e.g. with get_bounds.

@kmike

kmike Aug 8, 2016

Member

I used a similar function in a project, and it was really nice to be able to take a screenshot of an area which is slightly larger than element itself. What do you think about adding 'pad' argument - a number of pixels to add to the screenshot around the element? Maybe it can also be a table with {top, left, bottm, right} padding widths. I know, more features.. :)

In implementation it'd be nice to have padding function separate and public, so that people can use it e.g. with get_bounds.

This comment has been minimized.

@immerrr

immerrr Aug 8, 2016

Contributor

What about making the "padding" relative coordinates, i.e. positive values mean normal html padding, negative values - pad the image "inside" of the element.

On a sidenote, i don't think it's necessary to keep "width" and "height" parameters with the same, quite obscure meaning as in the api. We might want to seize the opportunity to think of better names or better ways to describe the use cases for these parameters, not necessarily within this project.

@immerrr

immerrr Aug 8, 2016

Contributor

What about making the "padding" relative coordinates, i.e. positive values mean normal html padding, negative values - pad the image "inside" of the element.

On a sidenote, i don't think it's necessary to keep "width" and "height" parameters with the same, quite obscure meaning as in the api. We might want to seize the opportunity to think of better names or better ways to describe the use cases for these parameters, not necessarily within this project.

This comment has been minimized.

@kmike

kmike Aug 8, 2016

Member

What about making the "padding" relative coordinates, i.e. positive values mean normal html padding, negative values - pad the image "inside" of the element.

yeah, that's what I meant, but haven't expressed clearly!

On a sidenote, i don't think it's necessary to keep "width" and "height" parameters with the same, quite obscure meaning as in the api. We might want to seize the opportunity to think of better names or better ways to describe the use cases for these parameters, not necessarily within this project.

What do you have in mind? It is nice to be able to rescale element when rendering; do you have a better API proposal?

@kmike

kmike Aug 8, 2016

Member

What about making the "padding" relative coordinates, i.e. positive values mean normal html padding, negative values - pad the image "inside" of the element.

yeah, that's what I meant, but haven't expressed clearly!

On a sidenote, i don't think it's necessary to keep "width" and "height" parameters with the same, quite obscure meaning as in the api. We might want to seize the opportunity to think of better names or better ways to describe the use cases for these parameters, not necessarily within this project.

What do you have in mind? It is nice to be able to rescale element when rendering; do you have a better API proposal?

This comment has been minimized.

@mike1808

mike1808 Aug 13, 2016

Contributor

Nice idea about the padding. What do think, what should we in the following cases:

  • for positive pads, if the padded image is going out of the web page bounds should we throw error or just fill with some color (maybe user-specified)
  • for negative pads, if the padding image has no width or height at all: throw error or return an empty image?
@mike1808

mike1808 Aug 13, 2016

Contributor

Nice idea about the padding. What do think, what should we in the following cases:

  • for positive pads, if the padded image is going out of the web page bounds should we throw error or just fill with some color (maybe user-specified)
  • for negative pads, if the padding image has no width or height at all: throw error or return an empty image?

This comment has been minimized.

@mike1808

mike1808 Aug 13, 2016

Contributor

What do you have in mind? It is nice to be able to rescale element when rendering; do you have a better API proposal?

What if we rename the size to rescale_to? Will it make easier to understand the meaning of that argument?

@mike1808

mike1808 Aug 13, 2016

Contributor

What do you have in mind? It is nice to be able to rescale element when rendering; do you have a better API proposal?

What if we rename the size to rescale_to? Will it make easier to understand the meaning of that argument?

This comment has been minimized.

@kmike

kmike Aug 14, 2016

Member

for positive pads, if the padded image is going out of the web page bounds should we throw error or just fill with some color (maybe user-specified)

+1 to fill with a color. I recall there was some filling in existing rendering methods; imho we should do the same here (/cc @immerrr)

for negative pads, if the padding image has no width or height at all: throw error or return an empty image?

my vote is to return an empty image.

What if we rename the size to rescale_to? Will it make easier to understand the meaning of that argument?

there is no 'size' argument, there are width and height. User can pass wither both of them, or only one of them. Also, in splash:png and splash:jpeg they work differently: width causes page to be rescaled, while height causes page to be truncated at the bottom. So rescale_to is not the same as width/height.

@kmike

kmike Aug 14, 2016

Member

for positive pads, if the padded image is going out of the web page bounds should we throw error or just fill with some color (maybe user-specified)

+1 to fill with a color. I recall there was some filling in existing rendering methods; imho we should do the same here (/cc @immerrr)

for negative pads, if the padding image has no width or height at all: throw error or return an empty image?

my vote is to return an empty image.

What if we rename the size to rescale_to? Will it make easier to understand the meaning of that argument?

there is no 'size' argument, there are width and height. User can pass wither both of them, or only one of them. Also, in splash:png and splash:jpeg they work differently: width causes page to be rescaled, while height causes page to be truncated at the bottom. So rescale_to is not the same as width/height.

This methods do the following:
* checks whether the selected element is editable

This comment has been minimized.

@kmike

kmike Aug 8, 2016

Member

Why is this check needed?

@kmike

kmike Aug 8, 2016

Member

Why is this check needed?

This comment has been minimized.

@kmike

kmike Aug 8, 2016

Member

E.g. element may become editable only after a click, or it may listen to keyboard events without being editable

@kmike

kmike Aug 8, 2016

Member

E.g. element may become editable only after a click, or it may listen to keyboard events without being editable

This comment has been minimized.

@mike1808

mike1808 Aug 13, 2016

Contributor

The overall idea of "sending keys to element" is so obvious. What does it mean to send keys to the element?

The original idea was taken from casperj's "setField" method which do almost the same:

  • checks whether the element is editable
  • focuses on it
  • send key events

However, in Splash we have more control over the web page (e.g. we can simulate a "real" mouse click, w/o using JS). So, I think you point is good. :)

@mike1808

mike1808 Aug 13, 2016

Contributor

The overall idea of "sending keys to element" is so obvious. What does it mean to send keys to the element?

The original idea was taken from casperj's "setField" method which do almost the same:

  • checks whether the element is editable
  • focuses on it
  • send key events

However, in Splash we have more control over the web page (e.g. we can simulate a "real" mouse click, w/o using JS). So, I think you point is good. :)

* selector - valid CSS selector
**Returns:** a :ref:`splash-element`.

This comment has been minimized.

@kmike

kmike Aug 8, 2016

Member

Async flag is missing

@kmike

kmike Aug 8, 2016

Member

Async flag is missing

This comment has been minimized.

@mike1808

mike1808 Aug 13, 2016

Contributor

Should I write some small guide why this feature is useful?

@mike1808

mike1808 Aug 13, 2016

Contributor

Should I write some small guide why this feature is useful?

This comment has been minimized.

@kmike

kmike Aug 14, 2016

Member

Yes, and more examples could be nice!

@kmike

kmike Aug 14, 2016

Member

Yes, and more examples could be nice!

@@ -104,6 +106,11 @@ def _init_webpage(self, verbosity, network_manager, splash_proxy_factory, render
self.web_view.resize(
QSize(*map(int, defaults.VIEWPORT_SIZE.split('x'))))
def _init_elements_storage(self):

This comment has been minimized.

@kmike

kmike Aug 8, 2016

Member

It'd be nice to have this work lazily, when a first HtmlElement instance is created. This way Splash won't be changing global scope unless required.

@kmike

kmike Aug 8, 2016

Member

It'd be nice to have this work lazily, when a first HtmlElement instance is created. This way Splash won't be changing global scope unless required.

This comment has been minimized.

@mike1808

mike1808 Aug 13, 2016

Contributor

Good point.

@mike1808

mike1808 Aug 13, 2016

Contributor

Good point.

@pyqtSlot(name="getId", result=str)
def get_id(self):
return str(uuid.uuid1())

This comment has been minimized.

@kmike

kmike Aug 8, 2016

Member

It seems I'm missing something - why is there two uuid.uuid1() calls, one to set self.name, and another in get_id()?

@kmike

kmike Aug 8, 2016

Member

It seems I'm missing something - why is there two uuid.uuid1() calls, one to set self.name, and another in get_id()?

This comment has been minimized.

@mike1808

mike1808 Aug 13, 2016

Contributor

get_id is used in JS for retrieving a unique ID for the element. However, it's a good idea to call the same function in the constructor.

@mike1808

mike1808 Aug 13, 2016

Contributor

get_id is used in JS for retrieving a unique ID for the element. However, it's a good idea to call the same function in the constructor.

This comment has been minimized.

@kmike

kmike Aug 14, 2016

Member

Ah, so self.name is an id of elements storage object, and get_id generates an id of an element inside this storage? This makes sense. Could you please add docstrings to make it more clear?

@kmike

kmike Aug 14, 2016

Member

Ah, so self.name is an id of elements storage object, and get_id generates an id of an element inside this storage? This makes sense. Could you please add docstrings to make it more clear?

@@ -0,0 +1,114 @@
# Copyright (c) 2011-2015 Nicolas Perriault

This comment has been minimized.

@kmike

kmike Aug 8, 2016

Member

could you please add a comment with a link to the casperjs file with relevant source code?

@kmike

kmike Aug 8, 2016

Member

could you please add a comment with a link to the casperjs file with relevant source code?

else if (o instanceof ClientRect) {
return {
top: o.top, left: o.left, bottom: o.bottom, right: o.right,
width: o.width, height: o.height,

This comment has been minimized.

@kmike

kmike Aug 8, 2016

Member

👍

I wonder if it is possible to assign a non-float value to a ClientRect attribute in webkit used in Splash, bypassing this sanitizing function

@kmike

kmike Aug 8, 2016

Member

👍

I wonder if it is possible to assign a non-float value to a ClientRect attribute in webkit used in Splash, bypassing this sanitizing function

This comment has been minimized.

@mike1808

mike1808 Aug 13, 2016

Contributor

Why do you want to assign a value to the client rect? It won't affect on anything.

@mike1808

mike1808 Aug 13, 2016

Contributor

Why do you want to assign a value to the client rect? It won't affect on anything.

This comment has been minimized.

@kmike

kmike Aug 14, 2016

Member

Someone may want to assign a value to client rect in order to bypass sanitanization.

@kmike

kmike Aug 14, 2016

Member

Someone may want to assign a value to client rect in order to bypass sanitanization.

@@ -80,7 +80,7 @@ def get_last(self):
def command(table_argument=False, sets_callback=False,
decode_arguments=True):
decode_arguments=True, flag=False):

This comment has been minimized.

@kmike

kmike Aug 8, 2016

Member

could you please use a longer name for flag variable? e.g. errors_as_flags or something like this (I'm not exactly sure about its meaning)?

@kmike

kmike Aug 8, 2016

Member

could you please use a longer name for flag variable? e.g. errors_as_flags or something like this (I'm not exactly sure about its meaning)?

local p = splash:select('p')
local form = splash:select('form#login')
local username = splash:select('input[name="username"]')
local password = splash:select('input[name="password"]')

This comment has been minimized.

@kmike

kmike Aug 8, 2016

Member

From this example it looks like element:select() is a good idea - I'd better write form:select(...) here, as ideally the lookup should be limited to form#login.

@kmike

kmike Aug 8, 2016

Member

From this example it looks like element:select() is a good idea - I'd better write form:select(...) here, as ideally the lookup should be limited to form#login.

This comment has been minimized.

@mike1808

mike1808 Aug 13, 2016

Contributor

In my PR #490 you can do "form.node:querySelector()"

@mike1808

mike1808 Aug 13, 2016

Contributor

In my PR #490 you can do "form.node:querySelector()"

This comment has been minimized.

@kmike

kmike Aug 14, 2016

Member

hm, by the way, is there a difference between node:querySelector(css) and :select(css)? Can we make :select(css) a shortcut for .node:querySelector(css)? Are there extra features :select() provides?

@kmike

kmike Aug 14, 2016

Member

hm, by the way, is there a difference between node:querySelector(css) and :select(css)? Can we make :select(css) a shortcut for .node:querySelector(css)? Are there extra features :select() provides?

local p = splash:select('p')
local ok, parent = p:node_property('parentNode')
local ok, remove = parent:node_method('remove')

This comment has been minimized.

@kmike

kmike Aug 8, 2016

Member

splash:select('p'):node_property('parentNone'):node_method('remove')() is so much nicer when you're writing quick scraping scripts :) I wonder if we should sacrifice ok, result consistency and start returning nil objects or raising errors

@kmike

kmike Aug 8, 2016

Member

splash:select('p'):node_property('parentNone'):node_method('remove')() is so much nicer when you're writing quick scraping scripts :) I wonder if we should sacrifice ok, result consistency and start returning nil objects or raising errors

This comment has been minimized.

@immerrr

immerrr Aug 8, 2016

Contributor

There's a branch that would allow splash:select('p').parentNode:remove(), that raises errors if something goes wrong, and I have high hopes for it. +1 from me.

Also, while writing it made me wonder if we want to have .parentNode for attributes and :remove for methods.

@immerrr

immerrr Aug 8, 2016

Contributor

There's a branch that would allow splash:select('p').parentNode:remove(), that raises errors if something goes wrong, and I have high hopes for it. +1 from me.

Also, while writing it made me wonder if we want to have .parentNode for attributes and :remove for methods.

@@ -89,7 +89,7 @@ def get_alive():
'SplashCookieJar', 'OneShotCallbackProxy',
'_ExposedRequest', '_ExposedBoundRequest',
'_ExposedResponse', '_ExposedBoundResponse',
'_ExposedTimer',
'_ExposedTimer', '_ExposedElement',

This comment has been minimized.

@kmike

kmike Aug 8, 2016

Member

could you please add ElementsStorage here?

@kmike

kmike Aug 8, 2016

Member

could you please add ElementsStorage here?

@mike1808

This comment has been minimized.

Show comment
Hide comment
@mike1808

mike1808 Aug 13, 2016

Contributor

@immerrr @kmike I've moved all the work to #490 because I did quite a much refactoring there. I'm going to fix all your notes in that branch. And if it is OK for you I'm going to close this PR.

Contributor

mike1808 commented Aug 13, 2016

@immerrr @kmike I've moved all the work to #490 because I did quite a much refactoring there. I'm going to fix all your notes in that branch. And if it is OK for you I'm going to close this PR.

@kmike

This comment has been minimized.

Show comment
Hide comment
@kmike

kmike Aug 13, 2016

Member

Makes sense 👍

Member

kmike commented Aug 13, 2016

Makes sense 👍

@kmike kmike closed this Aug 13, 2016

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment