Error in getemails module #59

Closed
PSNAppz opened this issue Feb 2, 2018 · 11 comments

Member

PSNAppz commented Feb 2, 2018

```
python3 torBot.py -u https://www.rapidtables.com/web/html/mailto.html -m

Traceback (most recent call last):
  File "torBot.py", line 186, in <module>
    main()
  File "torBot.py", line 163, in main
    emails = getemails.getMails(html_content)
  File "/home/psn/Documents/TorBoT/modules/getemails.py", line 30, in getMails
    emails.append(email_addr[1])
IndexError: list index out of range
```
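
The traceback shows `email_addr[1]` being read from a list that has fewer than two elements for at least one link on the page. The body of `getMails` is not shown in this thread, so the following is only a minimal sketch of one way to pull addresses out of `mailto:` links without indexing into a split result. The names `MailtoCollector` and `extract_mailto_addresses` are hypothetical, and the sketch uses the standard-library `html.parser` rather than whatever parser TorBot uses.

```python
from html.parser import HTMLParser
from urllib.parse import unquote


class MailtoCollector(HTMLParser):
    """Collect addresses from <a href="mailto:..."> links (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.emails = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        # An <a> tag may have no href, or an href with no value.
        href = dict(attrs).get("href") or ""
        if not href.lower().startswith("mailto:"):
            return
        # Drop the scheme and any "?subject=..." query part.
        recipients = href[len("mailto:"):].split("?", 1)[0]
        # A mailto link may list several comma-separated recipients.
        for addr in recipients.split(","):
            addr = unquote(addr).strip()
            # Skip empty results such as href="mailto:" and skip duplicates.
            if addr and addr not in self.emails:
                self.emails.append(addr)


def extract_mailto_addresses(html):
    """Return the mailto addresses found in an HTML string (empty list if none)."""
    parser = MailtoCollector()
    parser.feed(html or "")
    parser.close()
    return parser.emails
```

Because every address goes through the `if addr` check, a link such as `href="mailto:"` is skipped rather than raising an exception.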


waisuan commented Feb 5, 2018

Hello. Submitted a pull request for this one. Let me know what you think! Thanks.

@PSNAppz PSNAppz added this to the TorBot V1.2 milestone Feb 6, 2018
Member Author

PSNAppz commented Feb 12, 2018

Traceback (most recent call last):
  File "torBot.py", line 186, in <module>
    main(conn=True)
  File "torBot.py", line 162, in main
    emails = getemails.getMails(html_content)
  File "/home/psn/Documents/TorBoT/modules/getemails.py", line 24, in getMails
    links = soup.find_all('a')
AttributeError: 'NoneType' object has no attribute 'find_all'

Need a fix ASAP.
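
This second traceback means `getMails` received `None` instead of a parsed page, so `soup.find_all` failed before any link was inspected. Why the page came back as `None` is not shown in this thread. A minimal sketch of a guard, assuming `soup` is either `None` or a BeautifulSoup-style object whose `find_all("a")` returns tags with a `.get("href")` method (the function name `mailto_hrefs` is hypothetical):

```python
def mailto_hrefs(soup):
    """Return the href values of mailto links, or [] when no page was parsed."""
    # Assumption: a failed fetch or parse is passed in as None.
    if soup is None:
        return []
    hrefs = []
    for link in soup.find_all("a"):
        href = link.get("href") or ""  # <a> tags without an href yield None
        if href.lower().startswith("mailto:"):
            hrefs.append(href)
    return hrefs
```

Returning an empty list keeps the caller's "0 found" path working, although it also hides the failed fetch, so reporting the failure separately may be preferable.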


waisuan commented Feb 13, 2018

Hmm. This doesn't seem to be related to my changes. I'll take a look.


waisuan commented Feb 14, 2018

Do you have a way of replicating the error, @PSNAppz? I'm not getting anything on my side.

Member Author

PSNAppz commented Feb 14, 2018

@waisuan Try this one:
python3 torBot.py -u https://www.helloaddress.com/ -m
The page has one mail ID in it, but it's returning 0 found.
On the other hand, this one works perfectly:
python3 torBot.py -u https://www.asianpaints.com/contact-us.html -m


waisuan commented Feb 15, 2018

I'm not getting any errors for either link...

image

Member Author

PSNAppz commented Feb 15, 2018

See, it's returning 0. It should return 1.


waisuan commented Feb 18, 2018

@PSNAppz I did a bit of debugging, and I suspect this is not a bug in TorBot. The site you cited as an example is (probably) blocking visits from Tor nodes.

You get an "Access Denied" error when accessing the page over Tor, but the page loads fine in any non-Tor browser:
image

For more info on this: https://tor.stackexchange.com/questions/6646/why-do-i-get-access-denied-at-certain-web-sites-when-i-didnt-last-year


waisuan commented Feb 18, 2018

You can confirm this by turning https://github.com/DedSecInside/TorBoT/blob/dev/torBot.py#L108 off and running the crawler. You should be able to see the "mail" then.
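
Since a blocked request and a page with no addresses both end up as "0 found", it may help to tell the two apart in the output. The sketch below is an illustration only and is not part of TorBot. The function names, the 403 status check, and the "access denied" text marker are assumptions based on the error described above, and real block pages vary from site to site.

```python
# Assumption: a block shows up as HTTP 403 or as an "Access Denied" page body.
BLOCK_STATUS_CODES = {403}
BLOCK_MARKERS = ("access denied",)


def looks_blocked(status_code, body):
    """Heuristically decide whether a response is a block page."""
    if status_code in BLOCK_STATUS_CODES:
        return True
    text = (body or "").lower()
    return any(marker in text for marker in BLOCK_MARKERS)


def describe_mail_result(status_code, body, emails):
    """Build a report line that separates 'blocked' from 'no emails on page'."""
    if emails:
        return f"{len(emails)} email(s) found"
    if looks_blocked(status_code, body):
        return "0 emails found (the response looks like a block page)"
    return "0 emails found"
```

A heuristic like this can misfire on a normal page that happens to contain the marker text, so it is better suited to a warning than to a hard failure.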

@KingAkeem
Member

Has this issue been fixed?

Member Author

PSNAppz commented Mar 14, 2018

@waisuan Ok. @KingAkeem Seems like a problem which cannot be fixed on our side.

@PSNAppz PSNAppz closed this as completed Mar 14, 2018