Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

NCSU Project - Random web content generator #20448

Merged
merged 1 commit into from Apr 27, 2018
Merged

Conversation

asoni3
Copy link
Contributor

@asoni3 asoni3 commented Mar 27, 2018

Here is the link to our repository which contains the web-page fuzzer: Link
File: start_servo.py
The script is intended to start servo on localhost:7002
File: servo_automation_screenshot.py
The below program is intended to test rendering mismatches in servo based on random web pages generated by the web page fuzzer.

Here is the breakdown of how our code works:

  • A session is started on localhost:7002
  • The randomly generated webpage's (html files) data is sent as JSON to this session
  • Using curl request, we load the html files for this session ID based on the session we just created
  • For each html file it renders, it takes the screenshot after rendering which is saved in the current working directory in the format "testing#.png"

  • ./mach build -d does not report any errors
  • ./mach test-tidy does not report any errors
  • These changes fix #__ (github issue number if applicable).
  • There are tests for these changes OR
  • These changes do not require tests because _____

This change is Reviewable

@highfive
Copy link

Thanks for the pull request, and welcome! The Servo team is excited to review your changes, and you should hear from @asajeffrey (or someone else) soon.

@highfive highfive added the S-awaiting-review There is new code that needs to be reviewed. label Mar 27, 2018
@servo-wpt-sync
Copy link
Collaborator

Opened new PR for upstreamable changes.

Completed upstream sync of web-platform-test changes at jdm/web-platform-tests#45.

@jdm jdm assigned jdm and unassigned asajeffrey Mar 27, 2018
@jdm
Copy link
Member

jdm commented Mar 28, 2018

cc @Manishearth so you know what's going on with this project

Copy link
Member

@jdm jdm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good start! I've left some comments about ways the code could be cleaner and more extensible.

Please also run chmod -x tests/wpt/web-platform-tests/tools/webdriver/setup.py to revert the changes to that file. It shouldn't be necessary to make it executable in this PR.

etc/file0.html Outdated
@@ -0,0 +1,145 @@
<!DOCTYPE html><html>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's not add the generated files in this pull request.

@@ -0,0 +1,27 @@
from client import Cookies, Element, Find, Session, Timeouts, Window
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's remove the tests/wpt/web-platform-tests/tools/webdriver/{build,dist} directories from this pull request.

"""
Created on Mon Mar 26 20:08:25 2018

@author: Pranshu Sinha, Abhay Soni, Aayushi Agrawal
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This file is missing the Servo license header that is present in other files.

"""

"""
The below program is intended to test rendering mismatches in servo based on random web pages generated by the web page fuzzer.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This program in particular is only supposed to take screenshots of arbitrary files. It doesn't need to mention the random web page generator.

start_servo.start_servo()

#Since servo takes time to load and to get the session ID for subsequent steps, we added a sleep for 60 seconds
time.sleep(60)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems unnecessarily long. Can we run some commands in a loop to wait only as long as necessary until Servo returns some status indicating that it's ready?

#Since servo takes time to load and to get the session ID for subsequent steps, we added a sleep for 60 seconds
time.sleep(60)

#The program assumes that the html files to be rendered in servo are present in the same current working directory in the format "file#.html"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's accept URLs or file names as command line arguments rather than making this assumption.


#The program assumes that the html files to be rendered in servo are present in the same current working directory in the format "file#.html"
cwd = os.getcwd()
url = 'http://localhost:7002/session'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's avoid hardcoding this port value in more than one place if possible.

payload = "{}"
headers = {'content-type': 'application/json', 'Accept-Charset': 'UTF-8'}
r = requests.post(url, data=payload, headers=headers)
json_string = r.json()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's give this a more meaningful name.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment was not addressed.

json_data = json.dumps(json_data)
url2 = 'http://localhost:7002/session/'+json_string['value']['sessionId']+'/url'
payload2 = json_data
headers2 = {'content-type': 'application/json', 'Accept-Charset': 'UTF-8'}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can't we reuse the previous headers if they're identical?

headers2 = {'content-type': 'application/json', 'Accept-Charset': 'UTF-8'}
r2 = requests.post(url2, data=payload2, headers=headers2)

curl_command2 = 'curl -v http://localhost:7002/session/'+json_string['value']['sessionId']+'/screenshot | jq -r ".value" | base64 -d > testing'+str(x)+'.png'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rather than running this specific command, can we translate it into an equivalent request.get, and subsequent python library calls to get the equivalent result? Additionally, let's print some output show that the user knows where the generated screenshots are and what they're called.

@jdm jdm added S-needs-code-changes Changes have not yet been made that were requested by a reviewer. and removed S-awaiting-review There is new code that needs to be reviewed. labels Mar 28, 2018
@highfive highfive added S-awaiting-review There is new code that needs to be reviewed. and removed S-needs-code-changes Changes have not yet been made that were requested by a reviewer. labels Apr 3, 2018
@servo-wpt-sync
Copy link
Collaborator

Transplanted upstreamable changes to existing PR.

Completed upstream sync of web-platform-test changes at jdm/web-platform-tests#45.

@servo-wpt-sync
Copy link
Collaborator

No upstreamable changes; closed existing PR.

Completed upstream sync of web-platform-test changes at jdm/web-platform-tests#45.

time.sleep(60)
# We are waiting for reponse and we will edit as soon as we get the response.

# sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As discussed by email, this is the right way to check for the webdriver server's availability.

elif opt in ("-u", "--url"): # store the value provided with the option -u in url_cl variable.
global url_cl
url_cl = arg
elif opt in ("-i", "--ifile"): # store the value provided with the option -i in inputfile variable.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we have separate arguments for a URL and input file?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So the idea was to have user input localhost as URL and the location where the files are stored in input file variable. Should we always just open servo on localhost? That would mean we will just need the location of where the files are located?

What are your thoughts? @Manishearth

session_request = requests.post(url, data=payload, headers=headers)
json_string = session_request.json()
# Currently, we have generated 5 random html pages, hence the loop runs for 5 iterations
for x in range(5):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rather than making this assumption, we should have the caller provide all of the URLs or filenames to load on the command line. There's no need to make any guesses or assumptions.

# connected = False
# while not connected:
# try:
# sock.connect(('localhost',7002))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't forget to un-hardcode the port.

payload = "{}"
headers = {'content-type': 'application/json', 'Accept-Charset': 'UTF-8'}
r = requests.post(url, data=payload, headers=headers)
json_string = r.json()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment was not addressed.

json_data = {}
json_data['url'] = 'file://' + file_url + 'file' + str(x) + '.html'
json_data = json.dumps(json_data)
payload2 = json_data
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we create this payload2 value?

url_request = requests.post(url + json_string['value']['sessionId'] + '/url', data=payload2, headers=headers)
screenshot_request = requests.get(url + json_string['value']['sessionId'] + '/screenshot')
image_data_encoded = screenshot_request.json()['value']
with open("screenshots/output_image_" + str(x) + ".png", "wb") as image_file:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Won't this fail if the screenshots directory does not exist?


def start_servo(port, resolution):
cmds = ['cd ..', './mach run --webdriver ' + port + ' --resolution ' + resolution]
p = subprocess.Popen('/bin/bash', stdin=subprocess.PIPE, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My comments about this function from the previous review were not addressed.

print('python etc/servo_automation_screenshot.py -u <url> -p <port> -i <html_file_url> -r <resolution>')
sys.exit()
elif opt in ("-p", "--port"): # store the value provided with the option -p in port variable.
global port
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of running half of the program inside of a main() function and storing values in global variables, why don't we run the entire program inside main and use local variables?

@jdm jdm added S-needs-code-changes Changes have not yet been made that were requested by a reviewer. and removed S-awaiting-review There is new code that needs to be reviewed. labels Apr 3, 2018
@jdm jdm assigned Manishearth and unassigned jdm Apr 3, 2018
process_servo = start_servo.start_servo(port, resolution)

# Since servo takes time to load and to get the session ID for subsequent steps, we added a sleep for 60 seconds
time.sleep(60)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ideally we'd have Servo in webdriver mode output something to the console when ready so we can start immediately after

sys.exit(2)
for opt, arg in opts:
if opt == '-h': # -h means help. Displays how to input command line arguments
print('python etc/servo_automation_screenshot.py -u <url> -p <port> -i <html_file_url> -r <resolution>')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Other print's mention python3, but this one is just python. Also, moving it to a new function like print_help() would make things easier if you want to add/change command line arguments in the future and reduce chances of making a typo.

cmds = ['cd ..', './mach run --webdriver ' + port + ' --resolution ' + resolution]
p = subprocess.Popen('/bin/bash', stdin=subprocess.PIPE, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
for cmd in cmds:
p.stdin.write(cmd + "\n")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs a .encode('utf-8') on the passed string to work



def start_servo(port, resolution):
cmds = ['cd ..', './mach run --webdriver ' + port + ' --resolution ' + resolution]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should os.chdir() instead of calling cd, though it's unclear as to why we need to change the directory at all

@highfive highfive added S-awaiting-review There is new code that needs to be reviewed. and removed S-needs-code-changes Changes have not yet been made that were requested by a reviewer. labels Apr 24, 2018
@asoni3
Copy link
Contributor Author

asoni3 commented Apr 24, 2018

Hi @Manishearth,
I have tried to address most of the comments, let me know if I forgot something. There is one comment that I have responded to, as soon as I get feedback, I will try to address it. The issues on the Random Web Content Generator are yet to be addressed. Let me know if you need additional changes.

I am using this command to run the file from servo:
python3 ./etc/servo_automation_screenshot.py -u localhost -p 7002 -i /path/to/folder/containing/files -r 1200x800 -n 5
Thanks!

import getopt


def servoReadyToAccept(url, payload, headers):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: python functions are in snake_case


json_data = {}
json_data['url'] = 'file://' + file_url + 'file' + str(x) + '.html'
print(json_data['url'])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove debugging prints, or make them output some text about what it's doing

json_data['url'] = 'file://' + file_url + 'file' + str(x) + '.html'
print(json_data['url'])
json_data = json.dumps(json_data)
requests.post(url + '/' + json_string['value']['sessionId'] + '/url', data=json_data, headers=headers)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use a format string here

json_data = json.dumps(json_data)
requests.post(url + '/' + json_string['value']['sessionId'] + '/url', data=json_data, headers=headers)
# print('Response for html file post: ',url_request)
screenshot_request = requests.get(url + '/' + json_string['value']['sessionId'] + '/screenshot')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also here

image_file.write(base64.decodebytes(image_data_encoded.encode('utf-8')))
print("################################")

print("The screenshot is stored in the location: " + cwd +
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also here

(anywhere you use string concatenation)

file_url = arg
elif opt in ("-r", "--resolution"): # store the value provided with the option -r in resolution variable.
resolution = arg
elif opt in ("-n", "--numOfFiles"): # store the value provided with the option -n in num_of_files variable.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

--num-files

elif opt in ("-p", "--port"): # store the value provided with the option -p in port variable.
port = arg
elif opt in ("-u", "--url"): # store the value provided with the option -u in url variable.
url = arg
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need this argument? Seems like url will always be localhost.

(We can also make port default to 8080 or something)

@asoni3
Copy link
Contributor Author

asoni3 commented Apr 25, 2018

Hi @Manishearth,

I have addressed the comments. Let me know any additional changes you might want me to do.

Thanks!

Copy link
Member

@Manishearth Manishearth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor changes, then we'll land. Please also squash your commits

json_data['url'] = 'file://{0}file{1}.html'.format(file_url, str(x))
print(json_data['url'])
json_data = json.dumps(json_data)
requests.post('{0}/{1}/url'.format(url, json_string['value']['sessionId']), data=json_data, headers=headers)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Leave some comments with links to webdriver docs about these endpoints and what's going on

@asoni3 asoni3 force-pushed the master branch 2 times, most recently from f0a9299 to b277eaa Compare April 27, 2018 04:28
except getopt.GetoptError:
# an error is generated if the options provided in commandline are wrong.
# The help statement is printed on how to input command line arguments.
print('Enter the correct command to run this file')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

both of these should output some info on usage (expected flags, etc). Just saying "Usage: ./etc/servo_automation_screenshot.py filename -p portnumber" or whatever the format is would be enough

@asoni3 asoni3 force-pushed the master branch 2 times, most recently from f9be855 to 0b20543 Compare April 27, 2018 16:55
if __name__ == "__main__":
if len(sys.argv) < 8:
print('Enter the correct command to run this file')
print('python3 ./etc/servo_automation_screenshot.py -p <port>'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This usage print should be a function so that you don't have to copy paste it

@asoni3
Copy link
Contributor Author

asoni3 commented Apr 27, 2018

Hi @Manishearth,

Sorry for that, I hope there aren't anymore problems. Let me know.

Thanks!

@Manishearth
Copy link
Member

@bors-servo r+

looking good!

Your next steps are to get this working with Firefox, yes?

@bors-servo
Copy link
Contributor

📌 Commit 437276d has been approved by Manishearth

@highfive highfive added S-awaiting-merge The PR is in the process of compiling and running tests on the automated CI. and removed S-awaiting-review There is new code that needs to be reviewed. labels Apr 27, 2018
@asoni3
Copy link
Contributor Author

asoni3 commented Apr 27, 2018

@Manishearth Yes! We are in progress, you can review it from here. Let us know what are your thoughts on this. We are almost done with fixing the issues in that repository. Will push the code soon.
Thanks!

@bors-servo
Copy link
Contributor

⌛ Testing commit 437276d with merge a3fc009...

bors-servo pushed a commit that referenced this pull request Apr 27, 2018
NCSU Project - Random web content generator

<!-- Please describe your changes on the following line: -->
Here is the link to our repository which contains the web-page fuzzer: [Link](https://github.com/asoni3/Random-web-content-generator---CSC-517-OSS-Project)
**File: start_servo.py**
The script is intended to start servo on localhost:7002
**File: servo_automation_screenshot.py**
The below program is intended to test rendering mismatches in servo based on random web pages generated by the web page fuzzer.

Here is the breakdown of how our code works:
*   A session is started on localhost:7002
*   The randomly generated webpage's (html files) data is sent as JSON to this session
*   Using curl request, we load the html files for this session ID based on the session we just created
*   For each html file it renders, it takes the screenshot after rendering which is saved in the current working directory in the format "testing#.png"

---
<!-- Thank you for contributing to Servo! Please replace each `[ ]` by `[X]` when the step is complete, and replace `__` with appropriate data: -->
- [x] `./mach build -d` does not report any errors
- [x] `./mach test-tidy` does not report any errors
- [ ] These changes fix #__ (github issue number if applicable).

<!-- Either: -->
- [ ] There are tests for these changes OR
- [ ] These changes do not require tests because _____

<!-- Also, please make sure that "Allow edits from maintainers" checkbox is checked, so that we can help you if you get stuck somewhere along the way.-->

<!-- Pull requests that do not address these steps are welcome, but they will require additional verification as part of the review process. -->

<!-- Reviewable:start -->
---
This change is [<img src="https://reviewable.io/review_button.svg" height="34" align="absmiddle" alt="Reviewable"/>](https://reviewable.io/reviews/servo/servo/20448)
<!-- Reviewable:end -->
@bors-servo
Copy link
Contributor

💔 Test failed - linux-rel-wpt

@highfive highfive added S-tests-failed The changes caused existing tests to fail. and removed S-awaiting-merge The PR is in the process of compiling and running tests on the automated CI. labels Apr 27, 2018
@asoni3
Copy link
Contributor Author

asoni3 commented Apr 27, 2018

Hi @Manishearth,

I tried looking at the logs and I am not able to understand why is this failing. I am not sure if this is related to my work. Can you please take a look?

Thanks!

@Manishearth
Copy link
Member

@bors-servo retry

something might be wrong with the CI

@bors-servo
Copy link
Contributor

⌛ Testing commit 437276d with merge 625634a...

bors-servo pushed a commit that referenced this pull request Apr 27, 2018
NCSU Project - Random web content generator

<!-- Please describe your changes on the following line: -->
Here is the link to our repository which contains the web-page fuzzer: [Link](https://github.com/asoni3/Random-web-content-generator---CSC-517-OSS-Project)
**File: start_servo.py**
The script is intended to start servo on localhost:7002
**File: servo_automation_screenshot.py**
The below program is intended to test rendering mismatches in servo based on random web pages generated by the web page fuzzer.

Here is the breakdown of how our code works:
*   A session is started on localhost:7002
*   The randomly generated webpage's (html files) data is sent as JSON to this session
*   Using curl request, we load the html files for this session ID based on the session we just created
*   For each html file it renders, it takes the screenshot after rendering which is saved in the current working directory in the format "testing#.png"

---
<!-- Thank you for contributing to Servo! Please replace each `[ ]` by `[X]` when the step is complete, and replace `__` with appropriate data: -->
- [x] `./mach build -d` does not report any errors
- [x] `./mach test-tidy` does not report any errors
- [ ] These changes fix #__ (github issue number if applicable).

<!-- Either: -->
- [ ] There are tests for these changes OR
- [ ] These changes do not require tests because _____

<!-- Also, please make sure that "Allow edits from maintainers" checkbox is checked, so that we can help you if you get stuck somewhere along the way.-->

<!-- Pull requests that do not address these steps are welcome, but they will require additional verification as part of the review process. -->

<!-- Reviewable:start -->
---
This change is [<img src="https://reviewable.io/review_button.svg" height="34" align="absmiddle" alt="Reviewable"/>](https://reviewable.io/reviews/servo/servo/20448)
<!-- Reviewable:end -->
@highfive highfive added S-awaiting-merge The PR is in the process of compiling and running tests on the automated CI. and removed S-tests-failed The changes caused existing tests to fail. labels Apr 27, 2018
@bors-servo
Copy link
Contributor

☀️ Test successful - android, arm32, arm64, linux-dev, linux-rel-css, linux-rel-wpt, mac-dev-unit, mac-rel-css1, mac-rel-css2, mac-rel-wpt1, mac-rel-wpt2, mac-rel-wpt3, mac-rel-wpt4, windows-msvc-dev
Approved by: Manishearth
Pushing 625634a to master...

@bors-servo bors-servo merged commit 437276d into servo:master Apr 27, 2018
@highfive highfive removed the S-awaiting-merge The PR is in the process of compiling and running tests on the automated CI. label Apr 27, 2018
@asoni3
Copy link
Contributor Author

asoni3 commented Apr 28, 2018

Hi @Manishearth,

Thank you. Could you also review the random web content generator repository and check if we are on the right track?

Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

8 participants