[BUG] Erroneous output to terminal when using --extract-links #114

Closed
Greenwolf opened this issue Nov 5, 2020 · 9 comments · Fixed by #117

@Greenwolf

Is your feature request related to a problem? Please describe.
When using --extract-links, it would be nice to have an option that only grabs links from the original domain. I'm also not sure whether it starts directory busting on the other domains it extracts; the output is unclear.

Describe the solution you'd like
A flag to limit the scope of the tool would be great. It would also help if the README clarified whether the tool starts busting new domains when --extract-links is used.

P.S. - Absolutely loving the tool! I think you've got a real edge on gobuster & ffuf with this one 👍. I've been sharing it with all my colleagues! You've done some really great work on this!

@Greenwolf Greenwolf added the enhancement (New feature or request) label Nov 5, 2020
@epi052
Owner

epi052 commented Nov 5, 2020

Hi @Greenwolf,

Thanks for the request and the kind words! I'm really glad you're enjoying it and getting some use out of it.

> When using --extract-links, it would be nice to have an option which only grabbed links from the original domain

The current logic is as follows when --extract-links is used:

  • parse response body
  • find absolute and relative links
  • if absolute
    • does the domain/IP match the original target's domain/IP? yes: make the request or bust the directory, as appropriate; no: skip
  • if relative
    • append the relative path to the current target and make request/bust
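
The decision tree above can be sketched roughly as follows. This is a minimal illustration using only the standard library, not feroxbuster's actual code: `LinkAction`, `host_of`, and `classify_link` are hypothetical names, and the host extraction is deliberately naive (it assumes `scheme://host[:port]/...` and does not handle userinfo, IPv6 literals, default ports, or protocol-relative `//host/...` links).

```rust
/// What to do with a link pulled out of a response body (hypothetical type).
#[derive(Debug, PartialEq)]
enum LinkAction {
    /// Request (or recurse into) this URL.
    Follow(String),
    /// Absolute link pointing at a different host: ignore it.
    Skip,
}

/// Naive helper (an assumption for this sketch): returns the `host[:port]`
/// part of an absolute `scheme://host/...` URL, or `None` if there is no "://".
fn host_of(url: &str) -> Option<&str> {
    let (_, rest) = url.split_once("://")?;
    let end = rest
        .find(|c: char| c == '/' || c == '?' || c == '#')
        .unwrap_or(rest.len());
    let host = &rest[..end];
    if host.is_empty() {
        None
    } else {
        Some(host)
    }
}

/// Applies the logic from the list above to a single extracted link.
fn classify_link(current_target: &str, link: &str) -> LinkAction {
    match host_of(link) {
        // Absolute link: follow it only when the host matches the target's host.
        Some(link_host) => {
            let same_host = host_of(current_target)
                .map_or(false, |target_host| target_host.eq_ignore_ascii_case(link_host));
            if same_host {
                LinkAction::Follow(link.to_string())
            } else {
                LinkAction::Skip
            }
        }
        // Relative link: append the path to the current target.
        None => LinkAction::Follow(format!(
            "{}/{}",
            current_target.trim_end_matches('/'),
            link.trim_start_matches('/')
        )),
    }
}
```

With this sketch, an absolute link to `http://sub.domainB.org/info` found while scanning `https://original.domainA.org/` is classified as `Skip`, while the relative link `img/X.png` is resolved against the current target and followed.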

I'd love to know if you're seeing requests off the primary target domain, as that's definitely not intended. Can you let me know what you've observed and whether or not the description above meets the intent of this feature request?

@Greenwolf
Author

Hi @epi052, I ran it on domain A, and it seemed to start making requests on domain B. Am I misreading the output?

I've checked the proxy logs and it doesn't actually seem to be making the requests, but it's cluttering the console output with all the out-of-scope items. Is that intentional?

200      27133 https://original.domainA.org/img/X.png
200      14950 https://original.domainA.org/img/Y.png
200       4510 https://original.domainA.org/img/Z.png
ERR    716.988 Error while making request: error sending request for url (http://sub.domainB.org/300x700_X.html/X.php): error trying to connect: dns error: failed to lookup address information: nodename nor servname provided, or not known
[#######>------------] - 11m   148814/373534  207/s   https://original.domainA.org
[>-------------------] - 9m      1932/373534  3/s     http://domainC.com/
[>-------------------] - 3m      3954/373534  21/s    http://sub.domainB.org/
[>-------------------] - 3m      4014/373534  21/s    http://sub.domainB.org/2055.php
[>-------------------] - 3m      3889/373534  20/s    http://sub.domainB.org/IM
[>-------------------] - 3m      3994/373534  21/s    http://sub.domainB.org/info
[>-------------------] - 3m      3966/373534  21/s    http://sub.domainB.org/fixed

@epi052
Owner

epi052 commented Nov 6, 2020

Just to make sure I understand correctly:

When run with --proxy no requests are actually made to any off-target domain, however, console output shows that directories on other domains are being busted.

Do you ever see any of the off-target domain lines in the 'upper' output area, i.e. not just the progress bar? I'm guessing if they're not in the proxy logs, they're not in that output either.

@Greenwolf
Author

Yes, that is correct. But I actually got thousands of lines of off-target domain output in the console. The command I used was this:

./feroxbuster -u https://original.domainA.org/ --extract-links --depth 2 --wordlist ./content-discovery/content_discovery_all.txt

@epi052
Owner

epi052 commented Nov 6, 2020

Good deal. Definitely sounds like it needs some attention. I'm wrapping up 1.5.0 now and should be able to check this out over the weekend.

You've already narrowed down the possible location of the problem significantly, thank you!

I'm switching this to a bug for now.

@epi052 epi052 added the bug (Something isn't working) label and removed the enhancement (New feature or request) label Nov 6, 2020
@epi052 epi052 self-assigned this Nov 7, 2020
@epi052 epi052 changed the title from "[FEATURE REQUEST] Option to Limit --extract-links by domain" to "[BUG] Erroneous output to terminal when using --extract-links" Nov 7, 2020
@epi052
Owner

epi052 commented Nov 7, 2020

@Greenwolf good morning!

I'm trying to replicate what you're seeing. If you're able, could you confirm that some of the domains you saw requested are included below?

@epi052
Owner

epi052 commented Nov 7, 2020

probably some more

http:assistenza.oliviero.it/ajax  
http:dreambox.de/board            
http:fixelcloud.com               
http:jxshop.ir/json               
http:krasivaya662.jimdo.com/http:krasivaya662.jimdo.com/http:krasivaya662.jimdo.com              
http:localhost                    
http:pad.appbako.com/jikanawari   
http:pad.appbako.com/kaiseki      
http:pad.appbako.com/zatsudan     
http:pegueraeu.tumblr.com         
http:puradsifm.net:9994           
http:stm20.srvstm.com:23110       
http:studiokeya.com               
http:techblog.dahmus.org          
http:thg.ne.jp                    
http:www.domprazdnika.ru          
http:www.grozingerlaw.com         
http:0matome.com
http:0matome.com
http:1000mg.jp
http:1000mg.sblo.jp
http:16bit.blog.jp
http:18mn.blog89.fc2.com
http:2ch.anything-navi.net
http:2ch.logpo.jp
http:2ch-mi.net
http:2ch-mma.com
http:2ch-mma.com
http:2d.news-edge.com
http:acopy.blog55.fc2.com
http:ad-feed.com
http:afo-news.com
http:afo-news.com
http:afo-news.com
http:akb48mato.com
http:akb48m.com
http:aki680.dtiblog.com
http:akunaki2.blog.fc2.com
http:ameblo.jp
http:animalch.net
http:antch.net
http:antenasu.net
http:antennabank.com
http:antennabank.com
http:antenna-ga.com
http:antenow.com
http:aqua2ch.net
http:aresoku.blog42.fc2.com
http:asugaru.blog77.fc2.com
http:avzyoyuumatome.jp
http:axia-hakusan.com
http:besttrendnews.net
http:besttrendnews.net
http:blog-livedorr.com
http:bokuteki.com
http:buhidoh.net
http:carp.nanj-antenna.net
http:chaos2ch.com
http:daimajin.net
http:digi-6.com
http:dividendlife.net
http:dng65.com
http:doujinch.com
http:doumori-app.com
http:douzingame.com
http:dq-antena.com
http:dqmsl-antenna.com
http:dqmsl-dq.antenna-chan.info
http:dqmsl.site
http:ebitsu.net
http:edde.blog75.fc2.com
http:egone.org
http:equal-love.club
http:eroch8.com
http:erodaioh.blog8.fc2.com
http:erohop.dtiblog.com
http:ero-kawa.com
http:ero-kawa.com
http:ero-kawa.com
http:eromanga-kingdom.com
http:eromon.info
http:ero-nuki.net
http:erosnoteiri.com
http:erotube.org
http:esite100.com
http:fc23.blog63.fc2.com
http:fesoku.net
http:gallife.blog89.fc2.com
http:gameblogrank.com
http:gehasoku.com
http:geitsubo.com
http:gookc.blog.fc2.com
http:gorirarara.dtiblog.com
http:hamusoku.com
http:hana.kachoufugetsu.info
http:headline.mtfj.net
http:high-oku.com
http:hilite000.blog.fc2.com
http:hima-game.com
http:h-nijisoku.net
http:hoshi-dq.co
http:ichliebefussball.net
http:idol-blog.com
http:iphonech.info
http:iryujon.blog.fc2.com
http:ituki88.com
http:jin115.com
http:jplol.blog.fc2.com
http:jyouhouya3.net
http:kachimuka-matome.com
http:kaigai-antena.com
http:kankore.44ant.biz
http:kaoru-office.biz
http:karapaia.com
http:katuru.com
http:kaze.kachoufugetsu.info
http:kb24lal.blog9.fc2.com
http:keiba.blog.jp
http:ken-ch.vqpv.biz
http:kijosoku.com
http:kikonboti.com
http:kisslog2.com
http:kizitora.jp
http:kojimedia.me
http:konowaro.net
http:konowaro.net
http:konowaro.net
http:konowaro.net
http:ks4402.blog94.fc2.com
http:kyousoku.net
http:marumie55.com
http:matomenomori.net
http:matometatta-news.net
http:minkch.com
http:minnanonx.com
http:mix2ch.blog.fc2.com
http:moeimg.net
http:moerank.com
http:moero25.blog.fc2.com
http:momo96ch.com
http:mushitori.blog.fc2.com
http:nanjdragons.com
http:nbama.blog.fc2.com
http:nekomemo.com
http:nekowan.com
http:netatama.net
http:news109.com
http:news-choice.net
http:news-choice.net
http:news-choice.net
http:news-choice.net
http:news-choice.net
http:newser.cc
http:newsnow-2ch.com
http:newsnow-2ch.com
http:newsnow-2ch.com
http:newsnow-2ch.com
http:newsoku.jp
http:news-three-stars.net
http:nextneo.blog.fc2.com
http:niconico.boy.jp
http:nikkanerog.com
http:ninshinda.com
http:nmb48matome.jp
http:nocky.blog.fc2.com
http:occugaku.com
http:onesoku.com
http:ooiotakara.com
http:pakan.blog91.fc2.com
http:panpilog.com
http:pazudora-ken.com
http:picosoft.blog.fc2.com
http:pinkomen.blog.fc2.com
http:pretty77.blog9.fc2.com
http:ps3dominater.com
http:railgun-antenna-x.info
http:ranks1.apserver.net
http:rd.app-heaven.net
http:saionji.net
http:sbrmsg.blog.fc2.com
http:sexy4you.dtiblog.com
http:sexytvcap.com
http:shock-tv.com
http:shuuya.blog114.fc2.com
http:sketan.com
http:sociatenna.com
http:sousharu.blog.fc2.com
http:sow.blog.jp
http:taiken.blog24.fc2.com
http:timtmb.com
http:titimark.blog2.fc2.com
http:tokka1147.com
http:tossoku.net
http:toyop.net
http:tuma.dtiblog.com
http:turbo-bee.com
http:uhouho2ch.com
http:uhouho2ch.com
http:uhouho2ch.com
http:usepocket.com
http:vippers.jp
http:wapuwapu.com
http:waranew.net
http:webnew.net
http:webnew.net
http:webnew.net
http:webnew.net
http:webnew.net
http:worldfn.net
http:wtube.blog89.fc2.com
http:www.antena-2ch.net
http:www.appbank.net
http:www.boku-vipper.com
http:www.dousyoko.net
http:www.dql0.com
http:www.elog-ch.net
http:www.erokiwami.com
http:www.eropad.com
http:www.gurum.biz
http:www.hiroiro.com
http:www.mangajunky.net
http:www.matomech.com
http:www.nukistream.com
http:www.pinkape.net
http:www.vsnp.net
http:xn--gdk4cy65r.xyz
http:xxeronetxx.info
http:yonimo.net
http:yunyunyun.net

@epi052
Owner

epi052 commented Nov 7, 2020

Here's the update on this one. The wordlist you used from jhaddix contains entries like the ones I showed above. Normally, a word from the wordlist is joined to the target using reqwest::Url::join. When that function is called with a fully formed URL as the 'word', it actually overwrites the base URL.

Example:

Url("http://localhost").join("http:yunyunyun.net")
=> Url("http:yunyunyun.net")

So, the urls from the wordlist were the reason those requests were being shown. I tested with and without --extract-links and got the same result both times.

I added logic that issues a warning when a URL is found in the wordlist and stops processing that word before any request is made.
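
For readers who want a feel for that kind of guard, here is a minimal standard-library sketch. It is not the code merged into feroxbuster: `has_scheme` and `usable_words` are hypothetical names, and the check assumes that any word starting with an RFC 3986-style scheme followed by `:` (for example `http:`) should be treated as a URL and skipped.

```rust
/// Returns true when `word` starts with a URI scheme followed by ':'
/// (ALPHA followed by ALPHA / DIGIT / '+' / '-' / '.', per RFC 3986).
/// Hypothetical helper for this sketch.
fn has_scheme(word: &str) -> bool {
    match word.split_once(':') {
        Some((scheme, _)) => {
            let mut chars = scheme.chars();
            match chars.next() {
                Some(first) if first.is_ascii_alphabetic() => {
                    chars.all(|c| c.is_ascii_alphanumeric() || matches!(c, '+' | '-' | '.'))
                }
                _ => false,
            }
        }
        None => false,
    }
}

/// Keeps only the words that are safe to join onto a target URL,
/// printing a warning to stderr for every entry that looks like a URL.
fn usable_words<'a>(wordlist: &[&'a str]) -> Vec<&'a str> {
    let mut kept = Vec::new();
    for &word in wordlist {
        if has_scheme(word) {
            eprintln!("warning: skipping wordlist entry that looks like a URL: {}", word);
            continue;
        }
        kept.push(word);
    }
    kept
}
```

Given the entries `admin`, `http:localhost`, and `info`, this sketch warns about `http:localhost` and keeps the other two. A stricter variant could match only `http:` and `https:` if the wordlist contains legitimate words with a colon in them.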

@Greenwolf
Author

Sounds great, thank you @epi052. Sorry for the late reply, but yes I was seeing: 'http:techblog.dahmus.org'. Thank you for looking at this and for making a great project even better! 😊
