This repository has been archived by the owner on Dec 17, 2021. It is now read-only.

a11y scanner freezes on certain domains #110

Open
gbinal opened this issue Feb 7, 2017 · 7 comments

@gbinal
Member

gbinal commented Feb 7, 2017

The following domains break the a11y scan such that I have to stop it, remove the domain, and restart the scan from the beginning.

Two problems result:

  • We don't get any a11y scan results for these domains.
  • Having to restart the scan significantly adds to the time and effort that goes into it.

afadvantage.gov
ama.gov
banknet.gov
biomassboard.gov
broadband.gov
dea.gov
disasterhousing.gov
export.gov
flightschoolcandidates.gov
grantsolutions.gov
gsaadvantage.gov
gsaauctions.gov
hrsa.gov
hydrogen.gov
idmanagement.gov
invasivespecies.gov
myfdicinsurance.gov
nationalbank.gov
nationalbanknet.gov
nationalhousing.gov
nationalhousinglocator.gov
nhl.gov
nls.gov
onhir.gov
pay.gov
realestatesales.gov
safetyact.gov
sciencebase.gov
segurosocial.gov
selectusa.gov
stopfakes.gov
tvaoig.gov
usdebitcard.gov
@konklone
Contributor

konklone commented Feb 8, 2017

Could you paste the command line results of one of the errors?

In general, domain-scan scanners should fail gracefully, in that they print out an error or note in the saved data that it's an error, but it should never crash the scanner itself.
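
As a rough illustration of "fail gracefully", a wrapper along these lines would turn an exception raised for one domain into an error row instead of letting it end the whole run. This is only a sketch: `safe_scan`, `scan_function`, and the row layout are hypothetical names for illustration, not domain-scan's actual API, and a wrapper like this catches exceptions only, not a scan that hangs without ever returning.

```python
import logging


def safe_scan(scan_function, domain):
    """Run one scanner on one domain; never let an exception escape.

    `scan_function` is a hypothetical callable that takes a domain name
    and returns that domain's scan data.
    """
    try:
        return {"domain": domain, "error": None, "data": scan_function(domain)}
    except Exception as exc:
        # Deliberately broad: one bad domain should not stop the batch.
        logging.warning("[%s] scan failed: %s", domain, exc)
        return {"domain": domain, "error": str(exc), "data": None}
```

The caller can then write the row to the saved data either way and move on to the next domain.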

@gbinal
Member Author

gbinal commented Mar 2, 2017

Note that some of these failures (possibly all) happen because the domains use meta-redirects (here's a partial list of domains that do so).
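
For reference, a meta-redirect is a `<meta http-equiv="refresh" content="0; url=...">` tag in the page's HTML rather than an HTTP 3xx response. A minimal sketch of spotting one with only the Python standard library might look like the following; `find_meta_refresh` is a hypothetical helper for illustration, not part of domain-scan, and it assumes the page HTML has already been fetched.

```python
from html.parser import HTMLParser


class MetaRefreshFinder(HTMLParser):
    """Record the target URL of the first <meta http-equiv="refresh"> tag."""

    def __init__(self):
        super().__init__()
        self.target = None

    def handle_starttag(self, tag, attrs):
        if tag != "meta" or self.target is not None:
            return
        attributes = dict(attrs)
        if (attributes.get("http-equiv") or "").lower() != "refresh":
            return
        # The content attribute looks like "0; url=https://example.gov/".
        _, _, rest = (attributes.get("content") or "").partition(";")
        rest = rest.strip()
        if rest.lower().startswith("url="):
            rest = rest[4:]
        rest = rest.strip("'\" ")
        if rest:
            self.target = rest


def find_meta_refresh(html):
    """Return the meta-refresh target URL, or None if the page has none."""
    finder = MetaRefreshFinder()
    finder.feed(html)
    return finder.target
```

A pre-check like this could be used to flag or skip such domains before handing them to the a11y scanner.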

@gbinal
Member Author

gbinal commented Mar 21, 2017

#114 is taking a crack at this.

@gbinal
Member Author

gbinal commented Mar 23, 2017

Alas, it's still happening. Here's what was in the terminal after I let afadvantage.gov run for 9 hours and finally had to skip it with Ctrl+C.

[afadvantage.gov][a11y]
the_domain_is_cached: False
the_cache_is_not_forced: True
	Not cached.
[afadvantage.gov][a11y]
^CTraceback (most recent call last):
  File "./scan", line 178, in <module>
    run(options)
  File "./scan", line 84, in run
    scan_domains(scans, domains)
  File "./scan", line 142, in scan_domains
    executor.map(process_scan, tasks)
  File "/usr/lib/python3.4/concurrent/futures/_base.py", line 574, in __exit__
    self.shutdown(wait=True)
  File "/usr/lib/python3.4/concurrent/futures/thread.py", line 131, in shutdown
    t.join()
  File "/usr/lib/python3.4/threading.py", line 1060, in join
    self._wait_for_tstate_lock()
  File "/usr/lib/python3.4/threading.py", line 1076, in _wait_for_tstate_lock
    elif lock.acquire(block, timeout):
KeyboardInterrupt
Writing to cache: afadvantage.gov
Writing data for afadvantage.gov

Here's the same but for banknet.gov:

[bankhelp.gov][a11y]
the_domain_is_cached: False
the_cache_is_not_forced: True
	Not cached.
[bankhelp.gov][a11y]
Writing to cache: bankhelp.gov
Writing data for bankhelp.gov
[banknet.gov][a11y]
the_domain_is_cached: False
the_cache_is_not_forced: True
	Not cached.
[banknet.gov][a11y]
^CError in atexit._run_exitfuncs:
Traceback (most recent call last):
  File "/usr/lib/python3.4/concurrent/futures/thread.py", line 38, in _python_exit
    t.join()
  File "/usr/lib/python3.4/threading.py", line 1060, in join
    self._wait_for_tstate_lock()
  File "/usr/lib/python3.4/threading.py", line 1076, in _wait_for_tstate_lock
    elif lock.acquire(block, timeout):
KeyboardInterrupt
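
Both tracebacks show the main thread blocked in `t.join()`, which means it was waiting on a worker thread that never returned. If the a11y scan shells out to an external command (an assumption; the thread does not say how the scan is run), one way to guarantee that the worker comes back is to put a timeout on that command. A minimal sketch, with `run_with_timeout` as a hypothetical helper name:

```python
import subprocess
import sys


def run_with_timeout(command, timeout_seconds):
    """Run an external command and always return, even if it hangs."""
    try:
        # check_output kills the child process and raises TimeoutExpired
        # if it is still running after timeout_seconds.
        output = subprocess.check_output(
            command, stderr=subprocess.STDOUT, timeout=timeout_seconds
        )
        return {"status": "ok", "output": output.decode("utf-8", errors="replace")}
    except subprocess.TimeoutExpired:
        return {"status": "timeout", "output": ""}
    except (subprocess.CalledProcessError, OSError) as exc:
        return {"status": "error", "output": str(exc)}


if __name__ == "__main__":
    # Stand-in for a hung scan: a child process that sleeps for 30 seconds.
    slow = [sys.executable, "-c", "import time; time.sleep(30)"]
    print(run_with_timeout(slow, timeout_seconds=1)["status"])
```

The `timeout` argument to `subprocess.check_output` has been available since Python 3.3, so this would also work on the Python 3.4 shown in the tracebacks.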

@gbinal
Member Author

gbinal commented Apr 3, 2017

A number of these are no longer factors b/c the DAP exclusion list removes them at an earlier stage. These still remain though:

afadvantage.gov
banknet.gov
biomassboard.gov
dea.gov
export.gov
flightschoolcandidates.gov
grantsolutions.gov
gsaadvantage.gov
gsaauctions.gov
hrsa.gov
hydrogen.gov
idmanagement.gov
nationalbank.gov
pay.gov
realestatesales.gov
safetyact.gov
sciencebase.gov
selectusa.gov
stopfakes.gov
usdebitcard.gov

@konklone konklone added the bug label Nov 20, 2017
@konklone konklone self-assigned this Nov 20, 2017
@konklone konklone changed the title from "Address domains that break the a11y scan and require a restart" to "a11y scanner freezes on certain domains" Nov 20, 2017
@konklone
Contributor

@gbinal @micahsaul - I'm seeing some hangs during a11y scans too, though it's not necessarily the same as this list. Do you still see issues with these domains?

@micahsaul
Contributor

Augh! No, I hadn't been, I'll take a look this week.
