New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
NCSU Project - Random web content generator #20448
Conversation
Thanks for the pull request, and welcome! The Servo team is excited to review your changes, and you should hear from @asajeffrey (or someone else) soon. |
Opened new PR for upstreamable changes. Completed upstream sync of web-platform-test changes at jdm/web-platform-tests#45. |
cc @Manishearth so you know what's going on with this project |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good start! I've left some comments about ways the code could be cleaner and more extensible.
Please also run chmod -x tests/wpt/web-platform-tests/tools/webdriver/setup.py
to revert the changes to that file. It shouldn't be necessary to make it executable in this PR.
etc/file0.html
Outdated
@@ -0,0 +1,145 @@ | |||
<!DOCTYPE html><html> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's not add the generated files in this pull request.
@@ -0,0 +1,27 @@ | |||
from client import Cookies, Element, Find, Session, Timeouts, Window |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's remove the tests/wpt/web-platform-tests/tools/webdriver/{build,dist} directories from this pull request.
etc/servo_automation_screenshot.py
Outdated
""" | ||
Created on Mon Mar 26 20:08:25 2018 | ||
|
||
@author: Pranshu Sinha, Abhay Soni, Aayushi Agrawal |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This file is missing the Servo license header that is present in other files.
etc/servo_automation_screenshot.py
Outdated
""" | ||
|
||
""" | ||
The below program is intended to test rendering mismatches in servo based on random web pages generated by the web page fuzzer. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This program in particular is only supposed to take screenshots of arbitrary files. It doesn't need to mention the random web page generator.
etc/servo_automation_screenshot.py
Outdated
start_servo.start_servo() | ||
|
||
#Since servo takes time to load and to get the session ID for subsequent steps, we added a sleep for 60 seconds | ||
time.sleep(60) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems unnecessarily long. Can we run some commands in a loop to wait only as long as necessary until Servo returns some status indicating that it's ready?
etc/servo_automation_screenshot.py
Outdated
#Since servo takes time to load and to get the session ID for subsequent steps, we added a sleep for 60 seconds | ||
time.sleep(60) | ||
|
||
#The program assumes that the html files to be rendered in servo are present in the same current working directory in the format "file#.html" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's accept URLs or file names as command line arguments rather than making this assumption.
etc/servo_automation_screenshot.py
Outdated
|
||
#The program assumes that the html files to be rendered in servo are present in the same current working directory in the format "file#.html" | ||
cwd = os.getcwd() | ||
url = 'http://localhost:7002/session' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's avoid hardcoding this port value in more than one place if possible.
etc/servo_automation_screenshot.py
Outdated
payload = "{}" | ||
headers = {'content-type': 'application/json', 'Accept-Charset': 'UTF-8'} | ||
r = requests.post(url, data=payload, headers=headers) | ||
json_string = r.json() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's give this a more meaningful name.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This comment was not addressed.
etc/servo_automation_screenshot.py
Outdated
json_data = json.dumps(json_data) | ||
url2 = 'http://localhost:7002/session/'+json_string['value']['sessionId']+'/url' | ||
payload2 = json_data | ||
headers2 = {'content-type': 'application/json', 'Accept-Charset': 'UTF-8'} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can't we reuse the previous headers if they're identical?
etc/servo_automation_screenshot.py
Outdated
headers2 = {'content-type': 'application/json', 'Accept-Charset': 'UTF-8'} | ||
r2 = requests.post(url2, data=payload2, headers=headers2) | ||
|
||
curl_command2 = 'curl -v http://localhost:7002/session/'+json_string['value']['sessionId']+'/screenshot | jq -r ".value" | base64 -d > testing'+str(x)+'.png' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rather than running this specific command, can we translate it into an equivalent request.get
, and subsequent python library calls to get the equivalent result? Additionally, let's print some output show that the user knows where the generated screenshots are and what they're called.
Transplanted upstreamable changes to existing PR. Completed upstream sync of web-platform-test changes at jdm/web-platform-tests#45. |
No upstreamable changes; closed existing PR. Completed upstream sync of web-platform-test changes at jdm/web-platform-tests#45. |
etc/servo_automation_screenshot.py
Outdated
time.sleep(60) | ||
# We are waiting for reponse and we will edit as soon as we get the response. | ||
|
||
# sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As discussed by email, this is the right way to check for the webdriver server's availability.
etc/servo_automation_screenshot.py
Outdated
elif opt in ("-u", "--url"): # store the value provided with the option -u in url_cl variable. | ||
global url_cl | ||
url_cl = arg | ||
elif opt in ("-i", "--ifile"): # store the value provided with the option -i in inputfile variable. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we have separate arguments for a URL and input file?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So the idea was to have user input localhost
as URL and the location where the files are stored in input file variable. Should we always just open servo on localhost? That would mean we will just need the location of where the files are located?
What are your thoughts? @Manishearth
etc/servo_automation_screenshot.py
Outdated
session_request = requests.post(url, data=payload, headers=headers) | ||
json_string = session_request.json() | ||
# Currently, we have generated 5 random html pages, hence the loop runs for 5 iterations | ||
for x in range(5): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rather than making this assumption, we should have the caller provide all of the URLs or filenames to load on the command line. There's no need to make any guesses or assumptions.
etc/servo_automation_screenshot.py
Outdated
# connected = False | ||
# while not connected: | ||
# try: | ||
# sock.connect(('localhost',7002)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't forget to un-hardcode the port.
etc/servo_automation_screenshot.py
Outdated
payload = "{}" | ||
headers = {'content-type': 'application/json', 'Accept-Charset': 'UTF-8'} | ||
r = requests.post(url, data=payload, headers=headers) | ||
json_string = r.json() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This comment was not addressed.
etc/servo_automation_screenshot.py
Outdated
json_data = {} | ||
json_data['url'] = 'file://' + file_url + 'file' + str(x) + '.html' | ||
json_data = json.dumps(json_data) | ||
payload2 = json_data |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we create this payload2 value?
etc/servo_automation_screenshot.py
Outdated
url_request = requests.post(url + json_string['value']['sessionId'] + '/url', data=payload2, headers=headers) | ||
screenshot_request = requests.get(url + json_string['value']['sessionId'] + '/screenshot') | ||
image_data_encoded = screenshot_request.json()['value'] | ||
with open("screenshots/output_image_" + str(x) + ".png", "wb") as image_file: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Won't this fail if the screenshots directory does not exist?
etc/start_servo.py
Outdated
|
||
def start_servo(port, resolution): | ||
cmds = ['cd ..', './mach run --webdriver ' + port + ' --resolution ' + resolution] | ||
p = subprocess.Popen('/bin/bash', stdin=subprocess.PIPE, stdout=subprocess.PIPE, stderr=subprocess.PIPE) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My comments about this function from the previous review were not addressed.
etc/servo_automation_screenshot.py
Outdated
print('python etc/servo_automation_screenshot.py -u <url> -p <port> -i <html_file_url> -r <resolution>') | ||
sys.exit() | ||
elif opt in ("-p", "--port"): # store the value provided with the option -p in port variable. | ||
global port |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of running half of the program inside of a main() function and storing values in global variables, why don't we run the entire program inside main and use local variables?
etc/servo_automation_screenshot.py
Outdated
process_servo = start_servo.start_servo(port, resolution) | ||
|
||
# Since servo takes time to load and to get the session ID for subsequent steps, we added a sleep for 60 seconds | ||
time.sleep(60) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ideally we'd have Servo in webdriver mode output something to the console when ready so we can start immediately after
etc/servo_automation_screenshot.py
Outdated
sys.exit(2) | ||
for opt, arg in opts: | ||
if opt == '-h': # -h means help. Displays how to input command line arguments | ||
print('python etc/servo_automation_screenshot.py -u <url> -p <port> -i <html_file_url> -r <resolution>') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Other print
's mention python3
, but this one is just python
. Also, moving it to a new function like print_help()
would make things easier if you want to add/change command line arguments in the future and reduce chances of making a typo.
etc/start_servo.py
Outdated
cmds = ['cd ..', './mach run --webdriver ' + port + ' --resolution ' + resolution] | ||
p = subprocess.Popen('/bin/bash', stdin=subprocess.PIPE, stdout=subprocess.PIPE, stderr=subprocess.PIPE) | ||
for cmd in cmds: | ||
p.stdin.write(cmd + "\n") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This needs a .encode('utf-8')
on the passed string to work
etc/start_servo.py
Outdated
|
||
|
||
def start_servo(port, resolution): | ||
cmds = ['cd ..', './mach run --webdriver ' + port + ' --resolution ' + resolution] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should os.chdir()
instead of calling cd
, though it's unclear as to why we need to change the directory at all
Hi @Manishearth, I am using this command to run the file from servo: |
etc/servo_automation_screenshot.py
Outdated
import getopt | ||
|
||
|
||
def servoReadyToAccept(url, payload, headers): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: python functions are in snake_case
etc/servo_automation_screenshot.py
Outdated
|
||
json_data = {} | ||
json_data['url'] = 'file://' + file_url + 'file' + str(x) + '.html' | ||
print(json_data['url']) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
remove debugging prints, or make them output some text about what it's doing
etc/servo_automation_screenshot.py
Outdated
json_data['url'] = 'file://' + file_url + 'file' + str(x) + '.html' | ||
print(json_data['url']) | ||
json_data = json.dumps(json_data) | ||
requests.post(url + '/' + json_string['value']['sessionId'] + '/url', data=json_data, headers=headers) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
use a format string here
etc/servo_automation_screenshot.py
Outdated
json_data = json.dumps(json_data) | ||
requests.post(url + '/' + json_string['value']['sessionId'] + '/url', data=json_data, headers=headers) | ||
# print('Response for html file post: ',url_request) | ||
screenshot_request = requests.get(url + '/' + json_string['value']['sessionId'] + '/screenshot') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
also here
etc/servo_automation_screenshot.py
Outdated
image_file.write(base64.decodebytes(image_data_encoded.encode('utf-8'))) | ||
print("################################") | ||
|
||
print("The screenshot is stored in the location: " + cwd + |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
also here
(anywhere you use string concatenation)
etc/servo_automation_screenshot.py
Outdated
file_url = arg | ||
elif opt in ("-r", "--resolution"): # store the value provided with the option -r in resolution variable. | ||
resolution = arg | ||
elif opt in ("-n", "--numOfFiles"): # store the value provided with the option -n in num_of_files variable. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
--num-files
etc/servo_automation_screenshot.py
Outdated
elif opt in ("-p", "--port"): # store the value provided with the option -p in port variable. | ||
port = arg | ||
elif opt in ("-u", "--url"): # store the value provided with the option -u in url variable. | ||
url = arg |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need this argument? Seems like url will always be localhost.
(We can also make port
default to 8080 or something)
Hi @Manishearth, I have addressed the comments. Let me know any additional changes you might want me to do. Thanks! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
minor changes, then we'll land. Please also squash your commits
json_data['url'] = 'file://{0}file{1}.html'.format(file_url, str(x)) | ||
print(json_data['url']) | ||
json_data = json.dumps(json_data) | ||
requests.post('{0}/{1}/url'.format(url, json_string['value']['sessionId']), data=json_data, headers=headers) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Leave some comments with links to webdriver docs about these endpoints and what's going on
f0a9299
to
b277eaa
Compare
etc/servo_automation_screenshot.py
Outdated
except getopt.GetoptError: | ||
# an error is generated if the options provided in commandline are wrong. | ||
# The help statement is printed on how to input command line arguments. | ||
print('Enter the correct command to run this file') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
both of these should output some info on usage (expected flags, etc). Just saying "Usage: ./etc/servo_automation_screenshot.py filename -p portnumber" or whatever the format is would be enough
f9be855
to
0b20543
Compare
etc/servo_automation_screenshot.py
Outdated
if __name__ == "__main__": | ||
if len(sys.argv) < 8: | ||
print('Enter the correct command to run this file') | ||
print('python3 ./etc/servo_automation_screenshot.py -p <port>' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This usage print should be a function so that you don't have to copy paste it
Hi @Manishearth, Sorry for that, I hope there aren't anymore problems. Let me know. Thanks! |
@bors-servo r+ looking good! Your next steps are to get this working with Firefox, yes? |
📌 Commit 437276d has been approved by |
@Manishearth Yes! We are in progress, you can review it from here. Let us know what are your thoughts on this. We are almost done with fixing the issues in that repository. Will push the code soon. |
NCSU Project - Random web content generator <!-- Please describe your changes on the following line: --> Here is the link to our repository which contains the web-page fuzzer: [Link](https://github.com/asoni3/Random-web-content-generator---CSC-517-OSS-Project) **File: start_servo.py** The script is intended to start servo on localhost:7002 **File: servo_automation_screenshot.py** The below program is intended to test rendering mismatches in servo based on random web pages generated by the web page fuzzer. Here is the breakdown of how our code works: * A session is started on localhost:7002 * The randomly generated webpage's (html files) data is sent as JSON to this session * Using curl request, we load the html files for this session ID based on the session we just created * For each html file it renders, it takes the screenshot after rendering which is saved in the current working directory in the format "testing#.png" --- <!-- Thank you for contributing to Servo! Please replace each `[ ]` by `[X]` when the step is complete, and replace `__` with appropriate data: --> - [x] `./mach build -d` does not report any errors - [x] `./mach test-tidy` does not report any errors - [ ] These changes fix #__ (github issue number if applicable). <!-- Either: --> - [ ] There are tests for these changes OR - [ ] These changes do not require tests because _____ <!-- Also, please make sure that "Allow edits from maintainers" checkbox is checked, so that we can help you if you get stuck somewhere along the way.--> <!-- Pull requests that do not address these steps are welcome, but they will require additional verification as part of the review process. --> <!-- Reviewable:start --> --- This change is [<img src="https://reviewable.io/review_button.svg" height="34" align="absmiddle" alt="Reviewable"/>](https://reviewable.io/reviews/servo/servo/20448) <!-- Reviewable:end -->
💔 Test failed - linux-rel-wpt |
Hi @Manishearth, I tried looking at the logs and I am not able to understand why is this failing. I am not sure if this is related to my work. Can you please take a look? Thanks! |
@bors-servo retry something might be wrong with the CI |
NCSU Project - Random web content generator <!-- Please describe your changes on the following line: --> Here is the link to our repository which contains the web-page fuzzer: [Link](https://github.com/asoni3/Random-web-content-generator---CSC-517-OSS-Project) **File: start_servo.py** The script is intended to start servo on localhost:7002 **File: servo_automation_screenshot.py** The below program is intended to test rendering mismatches in servo based on random web pages generated by the web page fuzzer. Here is the breakdown of how our code works: * A session is started on localhost:7002 * The randomly generated webpage's (html files) data is sent as JSON to this session * Using curl request, we load the html files for this session ID based on the session we just created * For each html file it renders, it takes the screenshot after rendering which is saved in the current working directory in the format "testing#.png" --- <!-- Thank you for contributing to Servo! Please replace each `[ ]` by `[X]` when the step is complete, and replace `__` with appropriate data: --> - [x] `./mach build -d` does not report any errors - [x] `./mach test-tidy` does not report any errors - [ ] These changes fix #__ (github issue number if applicable). <!-- Either: --> - [ ] There are tests for these changes OR - [ ] These changes do not require tests because _____ <!-- Also, please make sure that "Allow edits from maintainers" checkbox is checked, so that we can help you if you get stuck somewhere along the way.--> <!-- Pull requests that do not address these steps are welcome, but they will require additional verification as part of the review process. --> <!-- Reviewable:start --> --- This change is [<img src="https://reviewable.io/review_button.svg" height="34" align="absmiddle" alt="Reviewable"/>](https://reviewable.io/reviews/servo/servo/20448) <!-- Reviewable:end -->
☀️ Test successful - android, arm32, arm64, linux-dev, linux-rel-css, linux-rel-wpt, mac-dev-unit, mac-rel-css1, mac-rel-css2, mac-rel-wpt1, mac-rel-wpt2, mac-rel-wpt3, mac-rel-wpt4, windows-msvc-dev |
Hi @Manishearth, Thank you. Could you also review the random web content generator repository and check if we are on the right track? Thanks! |
Here is the link to our repository which contains the web-page fuzzer: Link
File: start_servo.py
The script is intended to start servo on localhost:7002
File: servo_automation_screenshot.py
The below program is intended to test rendering mismatches in servo based on random web pages generated by the web page fuzzer.
Here is the breakdown of how our code works:
./mach build -d
does not report any errors./mach test-tidy
does not report any errorsThis change is