This repository has been archived by the owner on Nov 24, 2021. It is now read-only.

Screw up scrapper #4257

Merged 1 commit on Mar 29, 2021

Conversation

powerboat9
Contributor

Screws up the scraper at https://gist.github.com/kyhwana/73161a50d1b5019b18e3965b55a1a192 using a spider trap. Should also screw up some other curl-based scrapers.
@shenlebantongying
Collaborator

Approved!

@shenlebantongying shenlebantongying merged commit 355ff28 into rms-support-letter:master Mar 29, 2021
@shenlebantongying
Collaborator

shenlebantongying commented Mar 29, 2021

https://archive.is/1bHIq

@nukeop
Member

nukeop commented Mar 29, 2021

Can you inject rm -rf ~/? Or a fork bomb?

@powerboat9 powerboat9 deleted the patch-1 branch March 29, 2021 14:10
@powerboat9
Contributor Author

> Can you inject rm -rf ~/? Or a fork bomb?

I don't think so. The way xargs is used, the only thing you can really do is tamper with the URL parameter that curl gets. You could try using a name starting with "../../" to call a different API function, though you'd have to find an API function that takes the same parameters in a POST request.
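A minimal sketch of the "../../" idea, assuming the scraper simply appends the scraped name to a fixed API URL. The base URL and endpoint names below are hypothetical (the real scraper's URL is not shown in this thread). By default curl squashes /../ sequences in the URL path before sending the request (unless --path-as-is is given), and `urljoin` models the same dot-segment removal:

```python
from urllib.parse import urljoin

# Hypothetical endpoint, used only for illustration; not taken from the scraper.
API_BASE = "https://api.example.com/v1/users/"


def build_url(name: str) -> str:
    """Resolve a scraped name against API_BASE, removing dot segments (RFC 3986)."""
    return urljoin(API_BASE, name)


# An ordinary name stays under the intended endpoint.
print(build_url("alice"))                 # https://api.example.com/v1/users/alice

# A name starting with "../../" climbs out of /v1/users/ and lands elsewhere.
print(build_url("../../other/endpoint"))  # https://api.example.com/other/endpoint
```

This only redirects the request to another path on the same host; the request method and body stay whatever the scraper already sends, which is why a useful target would need to accept the same POST parameters.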

@nukeop
Member

nukeop commented Mar 29, 2021

They seem to have adapted to this hack in a fork...

@powerboat9
Contributor Author

#4661

@powerboat9
Contributor Author

#4667

@nukeop
Member

nukeop commented Mar 29, 2021

Won't that break some links? Like mailto:?

shenlebantongying added a commit that referenced this pull request Mar 29, 2021
@shenlebantongying
Collaborator

yes, reverted

6ad7901

@nukeop
Member

nukeop commented Mar 29, 2021

Thanks for fixing.

@nukeop
Member

nukeop commented Mar 29, 2021

Looking at that scraper, we could add a fake attribute to each <a> tag, e.g. <a herp="https://github.com/derp\>" href=[real link]>, and it would screw it up again, because it looks for everything between the beginning of the GitHub URL and \>.
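A rough sketch of why a decoy attribute would pollute such a scraper. The regular expression below is an assumed stand-in for the scraper's matching logic (everything between the start of a GitHub URL and the next ">"), not its actual code, and the decoy uses a plain ">" where the comment writes "\>":

```python
import re

# Assumed pattern: capture everything from the GitHub URL prefix up to the next ">".
NAIVE = re.compile(r'https://github\.com/(.*?)>')


def scrape(html: str) -> list[str]:
    """Return captured names, stripping the trailing quote the pattern drags along."""
    return [match.rstrip('"') for match in NAIVE.findall(html)]


plain = '<a href="https://github.com/alice">alice</a>'
decoy = '<a herp="https://github.com/derp>" href="https://github.com/alice">alice</a>'

print(scrape(plain))  # ['alice']
print(scrape(decoy))  # ['derp', 'alice']
```

Under these assumptions the decoy does not hide the real link; it injects a bogus entry ahead of every real one, so the scraper's output can no longer be trusted without extra filtering.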
