Implemented issue #93 - "Manually Specify Paths" #197

bobbajs · 2013-03-27T04:45:36Z

EECE310 L2A2 Group, implementing Issue #93

Crawljax can now crawl additional, manually specified URLs: call alsoCrawl() on CrawljaxConfigurationBuilder for each one.

A short description of the implementation:
Instead of a single member variable holding the seed URL, the configuration now holds an ArrayList of URLs. Each call to alsoCrawl() adds the URL specified by the user to this ArrayList.
WorkQueue is no longer final, since another URL has to be crawled once the seed URL is done.
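To make the description above concrete, here is a minimal sketch of the idea, not the code from this pull request. The class name `SeedUrlConfigSketch` and the `getUrls()` getter are hypothetical; only `alsoCrawl()`, the list of URLs, the String and URL overloads, and the not-null check come from the description and the commit messages.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URL;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Objects;

// Hypothetical stand-in for the configuration builder; not Crawljax's actual class.
final class SeedUrlConfigSketch {

    // The seed URL is stored first; alsoCrawl() appends further URLs.
    private final List<URL> urls = new ArrayList<>();

    SeedUrlConfigSketch(String seedUrl) {
        urls.add(parse(seedUrl));
    }

    // Accepts an already constructed URL and rejects null.
    SeedUrlConfigSketch alsoCrawl(URL url) {
        urls.add(Objects.requireNonNull(url, "url must not be null"));
        return this;
    }

    // Convenience overload that accepts the URL as a string.
    SeedUrlConfigSketch alsoCrawl(String url) {
        Objects.requireNonNull(url, "url must not be null");
        return alsoCrawl(parse(url));
    }

    // Read-only snapshot of the URLs, in the order they should be crawled.
    List<URL> getUrls() {
        return Collections.unmodifiableList(new ArrayList<>(urls));
    }

    private static URL parse(String url) {
        try {
            return URI.create(url).toURL();
        } catch (MalformedURLException | IllegalArgumentException e) {
            throw new IllegalArgumentException("Invalid URL: " + url, e);
        }
    }
}
```

A caller would chain the calls, for example `new SeedUrlConfigSketch("http://example.com").alsoCrawl("http://example.org")`, and the crawler would then work through `getUrls()` one entry at a time.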

Should there be any questions / concerns, please let us know.


alexnederlof · 2013-03-27T12:48:07Z

There's one conceptual problem with this solution: you won't see the links between two sites. For example, I specify to crawl my.blog.com, which links to my.othersite.com and vice versa, and I want the crawler to show the links between the two. With the solution provided, it will crawl the first site and then the second, but it will never cross over from one to the other, which prevents me from inspecting the relation between the two.
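The concern can be illustrated with a toy model. This is a simplified sketch under loud assumptions: pages are plain strings, the link graph is hard-coded, and the crawler only follows links to hosts it was told about. None of the names below (`CrossSiteCrawlSketch`, `crawlSeparately`, `crawlTogether`) exist in Crawljax, and Crawljax's real state-based crawling is more involved.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Toy model of the cross-site concern; hypothetical, not Crawljax code.
final class CrossSiteCrawlSketch {

    // Hard-coded link graph standing in for real pages (made-up data).
    static final Map<String, List<String>> LINKS = Map.of(
            "my.blog.com/", List.of("my.blog.com/post", "my.othersite.com/"),
            "my.blog.com/post", List.of(),
            "my.othersite.com/", List.of("my.blog.com/"));

    // Everything before the first '/' is treated as the host.
    static String host(String page) {
        return page.substring(0, page.indexOf('/'));
    }

    // Breadth-first crawl that only follows links into allowedHosts.
    // Returns the followed links as "from -> to" strings.
    static Set<String> crawl(List<String> seeds, Set<String> allowedHosts) {
        Set<String> visited = new LinkedHashSet<>(seeds);
        Set<String> edges = new LinkedHashSet<>();
        Deque<String> frontier = new ArrayDeque<>(seeds);
        while (!frontier.isEmpty()) {
            String page = frontier.poll();
            for (String target : LINKS.getOrDefault(page, List.of())) {
                if (!allowedHosts.contains(host(target))) {
                    continue; // link leaves the allowed sites, so it is dropped
                }
                edges.add(page + " -> " + target);
                if (visited.add(target)) {
                    frontier.add(target);
                }
            }
        }
        return edges;
    }

    // One independent crawl per seed, each confined to that seed's host.
    static Set<String> crawlSeparately(List<String> seeds) {
        Set<String> edges = new LinkedHashSet<>();
        for (String seed : seeds) {
            edges.addAll(crawl(List.of(seed), Set.of(host(seed))));
        }
        return edges;
    }

    // A single crawl over all seeds that may follow links between their hosts.
    static Set<String> crawlTogether(List<String> seeds) {
        Set<String> hosts = new LinkedHashSet<>();
        for (String seed : seeds) {
            hosts.add(host(seed));
        }
        return crawl(seeds, hosts);
    }
}
```

In this model, `crawlSeparately` on the two seeds records only the link inside my.blog.com, while `crawlTogether` also records the links between my.blog.com and my.othersite.com in both directions.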

bobbajs and others added 16 commits March 26, 2013 12:43

CrawljaxConfiguration has URL lists but no getters yet

5a9bb46

Added simpleExample Added ArrayList<URL> urls to CrawljaxConfiguration

Crawljax aware of additional URL added via alsoCrawl(), but cannot open it yet

d9170a3

Ported my code from the old code, but it has Firefox Driver error

43ff4ce

Is it because of new feature implemented? or maybe I forgot to port some code...

Enhancement implemented.

c9f795b

crawljax can crawl different URL by calling alsoCrawl(). Needs refactoring.

User can now crawl additional sites via URL

6e4b097

Before only strings were accepted for crawling additional sites now URLs can be entered too.

Refactoring

7a18d1f

This file doesnt get merged

dbfbc4e

Refactoring

73edff4

Removing print statements that were used for debugging purposes. Removing unused functions

Added Not Null check to alsoCrawl(url)

727c1c7

Adding Junit test for adding multiple urls

b60ee95

Checks that it is possible to build the CrawljaxController after adding a second url as a url and as a string.

Removed unneeded comment

60e5037

Removing excess spaces

ce83d46

Removed more excess spaces

13c22b7

Readded final to WebDriver browser

7b0c63e

Removing url, we are using URLs now

44b7544

Merge pull request #1 from texasinstrument/diana-new

7e65902

Merging diana-new branch with master

alexnederlof closed this Mar 27, 2013
