This repository has been archived by the owner on Dec 17, 2021. It is now read-only.

remove need for a11y.py to run alongside inspect.py #109

Closed
gbinal opened this issue Feb 5, 2017 · 3 comments

Comments

@gbinal
Member

gbinal commented Feb 5, 2017

In the current workflow, in step 5, I run the following command: `docker-compose run scan domains.csv --scan=inspect,a11y --debug`. However, if I am using a domains.csv that has been derived straight from recent DAP results, there's no need to run the inspect command (which adds a decent bit of time to the scan). It would be faster if I could just run the a11y scan without the inspect scan: `docker-compose run scan domains.csv --scan=a11y --debug`.

This does not work though, as it seems that the a11y scan depends on the inspect scan having already been run. You can see the error message below.

It would be handy to be able to run the a11y scan without it needing the inspect scan cache results.


[youthrules.gov][a11y]
Traceback (most recent call last):

  File "./scan", line 120, in process_scan
    rows = list(scanner.scan(domain, options))

  File "/home/scanner/scanners/a11y.py", line 197, in scan
    inspect_data = get_from_inspect_cache(domain)

  File "/home/scanner/scanners/a11y.py", line 23, in get_from_inspect_cache
    inspect_raw = open(inspect_cache).read()

FileNotFoundError: [Errno 2] No such file or directory: './cache/inspect/youthrules.gov.json'
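
The traceback above shows the cache lookup failing on a bare `open()` call. As a minimal sketch of one way the lookup could tolerate a missing file, the helper below returns `None` instead of raising. The name `load_inspect_cache` and its `cache_dir` parameter are hypothetical, not the actual `get_from_inspect_cache` in a11y.py; only the `./cache/inspect/<domain>.json` path layout comes from the error message.

```python
import json
import os

# Path layout taken from the traceback; everything else here is illustrative.
INSPECT_CACHE_DIR = "./cache/inspect"


def load_inspect_cache(domain, cache_dir=INSPECT_CACHE_DIR):
    """Return cached inspect data for `domain`, or None if no cache file exists.

    Hypothetical helper, not the real get_from_inspect_cache in a11y.py.
    """
    path = os.path.join(cache_dir, domain + ".json")
    try:
        with open(path) as f:
            return json.load(f)
    except FileNotFoundError:
        # No inspect results for this domain; the caller decides what to do.
        return None
```

A caller could then choose to scan the domain anyway or skip it when the result is `None`. Which of those is right depends on how a11y.py uses the inspect data, which this thread does not show.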
@gbinal
Member Author

gbinal commented Feb 5, 2017

Having thought this through, it's only a moderate inconvenience to run the inspect scan by itself first and then run the a11y scan by itself, so maybe this isn't worth worrying about.

The time I was trying to save came from having to include inspect every time I reran a11y. But once the inspect cache folder is populated, I can probably just run a11y by itself.
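
A rough sketch of how one might check that the inspect cache is populated before running a11y by itself. The function name is hypothetical, and the `./cache/inspect/<domain>.json` layout is inferred from the error message in the first comment.

```python
import os


def domains_missing_inspect_cache(domains, cache_dir="./cache/inspect"):
    """Return the domains that have no <domain>.json file in the inspect cache.

    Hypothetical helper; the cache layout is assumed from the traceback.
    """
    return [
        domain
        for domain in domains
        if not os.path.isfile(os.path.join(cache_dir, domain + ".json"))
    ]
```

If the returned list is empty, every domain has cached inspect results and an a11y-only run should not hit the `FileNotFoundError` shown earlier.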

@konklone
Contributor

konklone commented Feb 6, 2017

> However, if I am using a domains.csv that has been derived straight from recent DAP results, there's no need to run the inspect command

Is that because you think that any hostname appearing in DAP results is a live website? Unfortunately, that's not the case. When I did my Dec 2016 domain analysis, I found that of the 5,483 domains in DAP data, only 3,987 of them responded live to HTTP calls.

@gbinal
Member Author

gbinal commented Feb 17, 2017

Sorry for the confusion. I was referring to the other DAP results, that directly feed that section of Pulse. Thanks though for the reminder.

I'm actually going to go ahead and close this issue, as I've found that the simple modification to my workflow that I mentioned above addresses the problem sufficiently.

gbinal closed this as completed Feb 17, 2017