-
Notifications
You must be signed in to change notification settings - Fork 41
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add repetition of steps in different browsers without element ids. #41
Conversation
@marco-c please review it. |
collect.py
Outdated
else: | ||
elem_id = elem_attributes['id'] | ||
elem = driver.find_element_by_id(elem_id) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is still only using the ID. It means we will find many more possible elements in the first browser, but then won't be able to reproduce the same steps in the second browser.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@marco-c We are able to reproduce same steps in second browser as in line 80 I do check if the id
is present in elem_attributes
. In most cases it is not found so the code enters the if
condition and finds the element on the basis of attributes. Images in #15 demonstrate how the code reproduces the same steps for the second browser.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh you're right, sorry I hadn't noticed.
I would refactor this a bit so that the first branch is taken when we are looking for an element to choose, and the second branch when we already know the element we want to use (either with the attributes or the ID).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@marco-c I didn't understand what do you mean by refactoring it? Should I create two functions?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I mean something like this:
if elem.attributes is None:
CODE THAT CHOOSES AN ELEMENT FROM THE PAGE
else:
FIND AN ALREADY CHOSEN ELEMENT BY ID OR USING ALL THE ATTRIBUTES
So the first branch implements the "choose random element" part, the second branch implements the "I already know what element I want to use, let's find it on the page" part.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@marco-c I made the required change. Please check.
I'm thinking that to avoid confusion, maybe we can actually only have two possible formats: And we have to convert the already existing data to this format. |
ced0fca
to
b435e05
Compare
collect.py
Outdated
if elem_id == '': | ||
continue | ||
# Get all the attributes of the child. | ||
child_attributes = driver.execute_script('var elem_attribute = {}; for (i = 0; i < arguments[0].attributes.length; i++) { elem_attribute[arguments[0].attributes[i].name] = arguments[0].attributes[i].value }; return elem_attribute;', child) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you make this script more readable by using a multiline string? Like in wait_loaded
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be great if we could use the Selenium API to do the same, but for now it's fine.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@marco-c I will try to do it with Selenium API.
collect.py
Outdated
for child in children: | ||
|
||
# Get all the attributes of the child. | ||
child_attributes = driver.execute_script('var elem_attribute = {}; for (i = 0; i < arguments[0].attributes.length; i++) { elem_attribute[arguments[0].attributes[i].name] = arguments[0].attributes[i].value }; return elem_attribute;', child) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since this is repeated, it should be in a separate function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
collect.py
Outdated
|
||
# If the element is not displayed or is disabled, the user can't interact with it. Skip | ||
# non-displayed/disabled elements, since we're trying to mimic a real user. | ||
if not child.is_displayed() or not child.is_enabled(): | ||
continue | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't remove this newline
collect.py
Outdated
# Get all the attributes of the child. | ||
child_attributes = driver.execute_script('var elem_attribute = {}; for (i = 0; i < arguments[0].attributes.length; i++) { elem_attribute[arguments[0].attributes[i].name] = arguments[0].attributes[i].value }; return elem_attribute;', child) | ||
|
||
# If the element is not displayed or is disabled, the user can't interact with it. Skip |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can ignore this part, as we already know that the element was displayed and not enabled in the first browser.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@marco-c Because of two browsers can display in different way it might be that one browser displays the element while other doesn't. Should I remove it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If that happens, it's a compatibility issue and we want to detect it, so it's useful to take a screenshot even if the element is not displayed in the second browser.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh ok! I get it.
collect.py
Outdated
@@ -197,6 +220,10 @@ def run_tests(firefox_driver, chrome_driver): | |||
not os.path.exists('data/%d_chrome.png' % bug['id']): | |||
sequence = run_test(bug, 'firefox', firefox_driver) | |||
run_test(bug, 'chrome', chrome_driver, sequence) | |||
|
|||
with open("data/" + str(bug['id']) + ".txt", 'w') as f: | |||
f.write(json.dumps(sequence)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's write an object of the sequence per line, instead of a full json object. This way it's easier to see diffs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok!
print(' - Using %s' % elem_id) | ||
print(' - Using %s' % elem_attributes) | ||
image_file = str(bug['id']) + '_' + str(i) + '_' + browser | ||
screenshot(driver, 'data/%s.png' % (image_file)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since this is breaking compatibility with the old way of doing things, before merging this we will also need a script to convert the current data to the new format.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
Updated collect.py
collect.py
Outdated
def do_something(driver, elem_id=None): | ||
def get_all_attributes(driver, child): | ||
child_attributes = driver.execute_script(""" | ||
var elem_attribute = {}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use let
instead of var
.
collect.py
Outdated
child_attributes = driver.execute_script(""" | ||
var elem_attribute = {}; | ||
|
||
for (i = 0; i < arguments[0].attributes.length; i++) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use let i
collect.py
Outdated
@@ -197,6 +235,11 @@ def run_tests(firefox_driver, chrome_driver): | |||
not os.path.exists('data/%d_chrome.png' % bug['id']): | |||
sequence = run_test(bug, 'firefox', firefox_driver) | |||
run_test(bug, 'chrome', chrome_driver, sequence) | |||
|
|||
with open("data/" + str(bug['id']) + ".txt", 'w') as f: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: Use %
instead of +
as it is more readable
'data/%d.txt' % bug['id']
collect.py
Outdated
children = buttons + links + inputs | ||
|
||
for child in children: | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: no newline here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is still here 😄
collect.py
Outdated
@@ -92,12 +107,9 @@ def do_something(driver, elem_id=None): | |||
random.shuffle(children) | |||
|
|||
for child in children: | |||
elem_id = child.get_attribute('id') | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: no newline here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is still here 😄
rename_images.py
Outdated
@@ -0,0 +1,19 @@ | |||
from os import listdir, rename |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The script should also generate the .txt
files accompanying the images.
I suggest you use split('_')
to get the components from the name (if there are two components, it's an image in the format WEBCOMPAT-ID_BROWSER.png and you can skip it; if there are more than two components, it's an image in the format WEBCOMPAT-ID_ELEMENT-ID_SEQUENCE-NUM_BROWSER.png and you have to rename it and put in the .txt
file in order of SEQUENCE-NUM). You can use this https://github.com/marco-c/autowebcompat/blob/master/data_inconsistencies.py#L12 as inspiration
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done!
collect.py
Outdated
for child in children: | ||
elem_id = child.get_attribute('id') | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: remove newline here
collect.py
Outdated
inputs = body.find_elements_by_tag_name('input') | ||
children = buttons + links + inputs | ||
for child in children: | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: remove newline here
collect.py
Outdated
# If the element is not displayed or is disabled, the user can't interact with it. Skip | ||
# non-displayed/disabled elements, since we're trying to mimic a real user. | ||
if not child.is_displayed() or not child.is_enabled(): | ||
continue |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As discussed, this part should be removed.
rename_images.py
Outdated
parts = os.path.splitext(f)[0].split('_') | ||
if len(parts) <= 2: | ||
continue | ||
if parts[0] not in image_info.keys(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Assign names to the parts before this line, otherwise the rest of the code is hard to read.
rename_images.py
Outdated
|
||
for key, attributes in image_info.items(): | ||
with open("./data/%s.txt" % key, "w") as text_file: | ||
attributes = sorted(attributes.items()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't this going to sort by element ID rather than sequence number?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@marco-c No actually because attributes
is a dictionary
of { seq_no : elem_id }
I changed the variables for better understanding.
collect.py
Outdated
|
||
for (let i = 0; i < arguments[0].attributes.length; i++) { | ||
elem_attribute[arguments[0].attributes[i].name] = arguments[0].attributes[i].value; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: This is not aligned correctly.
collect.py
Outdated
} | ||
|
||
return elem_attribute; | ||
""", child) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: align this with child_attributes
.
collect.py
Outdated
@@ -90,14 +104,10 @@ def do_something(driver, elem_id=None): | |||
children = buttons + links + inputs | |||
|
|||
random.shuffle(children) | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: don't remove newline here 😄
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@marco-c Is there any way to know where to put newline and where not? 😆
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- Don't remove newlines unless they are related to your changes;
- No newlines after a if, for, def, etc.;
- Newline to separate two logic blocks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@marco-c I will make sure nothing related to newline comes in my future PRs
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't worry, they are minor annoyances.
collect.py
Outdated
links = body.find_elements_by_tag_name('a') | ||
inputs = body.find_elements_by_tag_name('input') | ||
children = buttons + links + inputs | ||
for child in children: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: newline before this for
rename_images.py
Outdated
seq_no_and_elem_id = sorted(seq_no_and_elem_id.items()) | ||
for value in seq_no_and_elem_id: | ||
sequence_no = value[0] | ||
elem_id = value[1][0] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is going to use only the first part of the element ID (e.g. for an_element_id
, this variable would be an
). I would replace line 12 with '_'.join(parts[1:-2])
and this line with value[1]
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@marco-c Done!
914efd8
to
5d4c0bc
Compare
rename_images.py
Outdated
for value in seq_no_and_elem_id: | ||
sequence_no = value[0] | ||
elem_id = value[1] | ||
text_file.write("%s %s\n" % (sequence_no, elem_id)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that here you're writing text, but in collect.py
you are writing json objects.
The format of the files should be the same.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@marco-c For format to be same I made {"id" : "elem_id"}
per line according to sequence number in rename_images.py
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be better to do json.dumps({'id': elem_id})
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It doesn't matter though, the end result is the same.
Can you run the script on your machine to rename the files? Then zip the resulting data
directory and upload it on Dropbox or some other service. Before merging this PR, I will have to update get_dependencies.py
to use the new data.zip file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@marco-c I will try to upload it today. Got a better Internet connection 😄
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@marco-c
https://www.dropbox.com/s/rsdq581uteu3mtc/data.zip?dl=0
this is the link with the renamed files.
fa61e6e
to
3ff0cc0
Compare
3ff0cc0
to
9e5f074
Compare
@marco-c I just realized the |
We should update The rename_images.py script should also update the labels files. |
I've done these changes myself. The only thing missing now is rewriting the label files with the new image names. Once that's done, I will replace the current data.zip file with the one you uploaded, then you can remove the rename_images.py script from the PR and we can finally merge it! |
@marco-c I made the required changes. (removed |
Thanks for the hard work! I'm verifying the new data.zip file you've uploaded and then I'll merge this. |
Please provide your feedback on this pull request here. Privacy statement: We don't store any personal information such as your email address or name. We ask for GitHub authentication as an anonymous identifier to account for duplicate feedback entries and to see people specific preferences. |
Solves #15