Extract absolute paths from JavaScript strings and convert them to absolute URLs in Web::Spider #32

Closed
postmodern opened this issue May 1, 2023 · 2 comments

postmodern commented May 1, 2023

Use `every_javascript_path_string` to extract absolute paths from JavaScript strings, then merge each path with the page's URL to produce an absolute URL.
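A minimal sketch of the merge step, assuming plain Ruby and the standard `uri` library. The helper name `absolute_path_to_url` and the example URLs are hypothetical and are not part of Web::Spider; the sketch only shows how an absolute path resolves against the URL of the page it was found on.

```ruby
require 'uri'

# Hypothetical helper (not part of Web::Spider): resolves an absolute path
# found in a JavaScript string against the URL of the page it came from.
def absolute_path_to_url(page_url, path)
  # URI.join keeps the scheme and host of page_url and replaces its path
  # (and query) with the given absolute path.
  URI.join(page_url.to_s, path)
end

url = absolute_path_to_url('https://example.com/blog/post.html', '/api/v1/users')
puts url
# prints "https://example.com/api/v1/users"
```

Because the path is absolute, only the scheme, host, and port of the page URL survive the merge.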

postmodern added the enhancement (New feature or request) and feature (New Feature) labels on May 1, 2023
postmodern added this to the 0.1.0 milestone on May 1, 2023
postmodern self-assigned this on May 1, 2023
postmodern changed the title from "Extract relative/absolute paths from JavaScript strings and convert them to absolute URLs in Web::Spider" to "Extract absolute paths from JavaScript strings and convert them to absolute URLs in Web::Spider" on May 1, 2023
@postmodern

Currently, the `every_javascript_*_string` methods do not also yield the page, or the page's URL, that the string was found on. We will need to check the block's arity and optionally yield the page or page URL for additional context.
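One way the arity check could look, as a rough sketch. The helper `yield_string_with_context` and the stand-in page object are hypothetical; the actual `every_javascript_*_string` methods may be structured differently.

```ruby
# Hypothetical helper: yields only the string to one-parameter blocks, and
# the string plus the page to blocks that declare two parameters.
def yield_string_with_context(string, page, &block)
  if block.arity == 2
    block.call(string, page)
  else
    block.call(string)
  end
end

# Stand-in for a spidered page; only a `url` attribute is assumed here.
page = Struct.new(:url).new('https://example.com/index.html')

# Existing one-parameter blocks keep working unchanged.
yield_string_with_context('/api/v1/users', page) do |string|
  puts string
end

# Blocks that ask for a second parameter also receive the page.
yield_string_with_context('/api/v1/users', page) do |string, found_on|
  puts "#{string} (found on #{found_on.url})"
end
```

Checking `arity` keeps the change backwards compatible: callers that only care about the string never see the extra argument.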

@postmodern
Copy link
Member Author

We should limit this to absolute paths, since a relative path pattern can match any word (e.g. `foo` is a valid relative path).
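To illustrate the false-positive problem, here is a small sketch with two deliberately simplified patterns. Both predicates and their regular expressions are hypothetical and are not the ones Web::Spider uses.

```ruby
# Hypothetical, simplified check: one or more "/segment" parts.
def absolute_path?(string)
  string.match?(%r{\A(?:/[^/\s]+)+/?\z})
end

# Hypothetical, simplified check: a segment optionally followed by
# more "/segment" parts, with no leading slash required.
def relative_path?(string)
  string.match?(%r{\A[^/\s]+(?:/[^/\s]+)*/?\z})
end

%w[/api/v1/users foo submit].each do |string|
  puts "#{string.inspect}: absolute=#{absolute_path?(string)} relative=#{relative_path?(string)}"
end
```

Under these patterns, ordinary words such as `foo` or `submit` pass the relative check, so nearly every single-word JavaScript string would be reported as a path. Requiring the leading `/` filters them out.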
