How to avoid fake account flooding? How to rate limit against spam? #1311

Open
worldpeaceenginelabs opened this issue Feb 7, 2023 · 9 comments


@worldpeaceenginelabs

I've been fiddling with authentication, spam, flooding, scripts, bots...
Maybe you can see the solution to the problem at the end?

I want to keep the relays as Mark thinks of them, so I will NOT touch anything on the relay!

Gun apps can only control 1. the data, 2. the encryption, and 3. the node path.

The rest is up to the relay logic (someone else's stock Gun relay: open, can't be modified) and the sender's/receiver's logic (our client-side app: modifiable).


PROBLEM 1
VSCode + Gun dependency + 100x .get(couchsurfing).get(freeCouchOnMap).put(data) (hacker/spammer/troll)
(fills the map with sofas that do not exist)

=>

SOLUTION (I think this solves problem 1 completely?)
Our app users subscribe to content exclusively in user spaces: user.get(couchsurfing).get(freeCouchOnMap).on(data) (does this line subscribe to the node freeCouchOnMap in all user spaces? 😅)
(of course we can still use public space for other scenarios within the same app)

So our app only works within user space, which we control client-side through "common sense limiting"/"rate limiting" per user.
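A minimal sketch of what such client-side limiting could look like, assuming each incoming post carries the author's public key and a timestamp. `createRateLimiter` and its parameters are hypothetical names, not part of Gun, and a timestamp supplied by the sender can be forged, so a real app would probably prefer the time the post was first seen locally.

```javascript
// Hypothetical helper (not a Gun API): accept at most `maxPosts` posts
// per author within a sliding time window.
const DAY_MS = 24 * 60 * 60 * 1000;

function createRateLimiter(maxPosts = 1, windowMs = DAY_MS) {
  const accepted = new Map(); // author public key -> timestamps of accepted posts

  return function accept(authorPub, timestamp) {
    // Keep only the accepted posts that are still inside the window.
    const recent = (accepted.get(authorPub) || []).filter(
      (t) => timestamp - t < windowMs
    );
    const ok = recent.length < maxPosts;
    if (ok) recent.push(timestamp);
    accepted.set(authorPub, recent);
    return ok;
  };
}

const accept = createRateLimiter(1, DAY_MS);
console.log(accept('alice-pub', 0));      // true: first post of the day
console.log(accept('alice-pub', 1000));   // false: second post within the window
console.log(accept('bob-pub', 1000));     // true: a different author
console.log(accept('alice-pub', DAY_MS)); // true: the window has passed
```

The filter runs in every reading client, so it only hides excess posts from display; it does not stop them from being written or relayed.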

How the app processes the data is the only thing we can control at all, because the relay just does whatever it is told, whether .get(), user.get(), user.create or user.auth (Gun wants this openness, which is good).


PROBLEM 2 (stuck here)
VSCode + Gun dependency + 1000 authenticated users each posting only ONCE (all belonging to one entity, via manpower or a botnet). This also fills the map with sofas that do not exist.

Automated script: 1000x user.create + just 1x user.get(couchsurfing).get(freeCouchOnMap).put(data) per user account, which gets around my "common sense limiting"/"rate limiting" of 1 post per user per day.

Someone could execute user.create a thousand times offline and then sync with our relay.

EFFECT:
We can NOT limit the rate of user.create in a malicious app, or the syncing of thousands of malicious user accounts through the relay.

=>

POSSIBLE SOLUTIONS (stuck)

I think that against 1000 real humans any strategy fails; that is just bad luck!

But we can certainly use something like this, client-side:
"X copies of the same post within time period Y by different users => hide the posts and hand them to moderation", maybe even combined with XXX detection?
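The rule above could be sketched roughly as follows. This is only an illustration under assumptions: `createDuplicateDetector`, the normalization step and the thresholds are hypothetical, and exact-match comparison after normalization is the simplest possible notion of "the same post".

```javascript
// Hypothetical sketch (not a Gun API): flag content that more than
// `maxAuthors` different users posted within `windowMs`.
function createDuplicateDetector(maxAuthors = 3, windowMs = 60 * 60 * 1000) {
  const seen = new Map(); // normalized text -> Map(author public key -> timestamp)

  // Treat posts as "the same" if they match after trimming, lowercasing
  // and collapsing whitespace.
  const normalize = (text) => text.trim().toLowerCase().replace(/\s+/g, ' ');

  return function shouldHide(authorPub, text, timestamp) {
    const key = normalize(text);
    const authors = seen.get(key) || new Map();
    authors.set(authorPub, timestamp);
    // Forget authors whose post has fallen out of the window.
    for (const [pub, t] of authors) {
      if (timestamp - t >= windowMs) authors.delete(pub);
    }
    seen.set(key, authors);
    return authors.size > maxAuthors;
  };
}

const shouldHide = createDuplicateDetector(2, 60000);
console.log(shouldHide('u1', 'Free couch!', 0));   // false: one author
console.log(shouldHide('u2', 'free  couch!', 10)); // false: two authors
console.log(shouldHide('u3', 'FREE COUCH!', 20));  // true: three authors, limit is two
```

A spammer who varies each post slightly slips past exact matching, which is where the fuzzier pattern recognition mentioned in this comment would come in, with the false-positive risk it describes.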

btw: Moderation does not have to be a centralized entity. Moderation decisions of this kind could be put to a poll of all users, for instance "Please choose: is this spam?"
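As a rough illustration of such a poll, assuming every vote is tied to a voter's public key: the function name `tallySpamVotes` and both thresholds are made up for this sketch.

```javascript
// Hypothetical sketch: decide whether to hide a post from user votes.
// votes: array of { voterPub, isSpam }; only the latest vote per voter counts.
function tallySpamVotes(votes, minVotes = 5, spamRatio = 0.6) {
  const latest = new Map();
  for (const { voterPub, isSpam } of votes) {
    latest.set(voterPub, Boolean(isSpam));
  }
  const total = latest.size;
  let spam = 0;
  for (const vote of latest.values()) {
    if (vote) spam++;
  }
  // Hide only with enough voters and a clear majority calling it spam.
  const hidden = total >= minVotes && spam / total >= spamRatio;
  return { total, spam, hidden };
}

const result = tallySpamVotes([
  { voterPub: 'v1', isSpam: true },
  { voterPub: 'v2', isSpam: true },
  { voterPub: 'v3', isSpam: true },
  { voterPub: 'v4', isSpam: true },
  { voterPub: 'v5', isSpam: false },
  { voterPub: 'v1', isSpam: true }, // repeat vote by v1, counted once
]);
console.log(result); // { total: 5, spam: 4, hidden: true }
```

The poll inherits Problem 2: whoever can create 1000 accounts can also cast 1000 votes, so the votes would need the same protection as the posts.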

Pattern recognition theoretically helps against this, but we know how annoying it is when legitimate content is blocked simply because it is "similar" (fb).


STILL A PROBLEM
As far as I know, bots and scripts are still a problem for Gun for now!
One script attacks from one IP; a botnet runs multiple scripts from multiple IPs.
IPs could be scanned for suspicious behaviour, but syncing 1000 user.create calls followed by 1000x .put() does not look suspicious to DNS services, nor to a relay for now 🤷

Author

worldpeaceenginelabs commented Feb 7, 2023

I've been wrapping my head around human verification and such for hours.

But I may have come to a simple solution that doesn't involve third-party solutions or dependencies at all.

All my past attempts to make sure the relay does not get flooded with fake or bot accounts always ended like this:

More user effort, a bit more secure, but eh... and
"can easily be circumvented by VSCode + Gun dependency + 1000x offline user.create, then sync with a normal desktop relay, then sync relay-to-relay."

So the simplest way to fold all my attempts into one mechanism is:


UPDATED!!! Organic distribution speed of user accounts

client-to-relay sync of a new keypair

  • When a newly created Gun app user, hacker or bot account connects to a Gun relay that does not yet know its new keypair, the relay only accepts and distributes the keypair if the user is also logged in.
  • We could check whether the new keypair (user) can write to its own user space, which means that the user is logged in with that keypair. This can't be faked except with stolen credentials.

For relay-to-relay sync of keypairs, we could establish the same mechanism.

  • Before a relay shares a keypair with a new relay, they BOTH check whether the new keypair (user) can write to its user space, which means the user is logged in with that keypair.
  • This way two valid relays both validate, and even a hacked relay that wants to sync with a valid relay can't get a keypair accepted without the valid relay checking for itself. (Relay-to-relay validation needs canWriteToUserSpace? = true on both sides.)
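The "can this keypair write to its own user space?" check amounts to proving control of the private key. A minimal sketch of that idea as a challenge-response, using Node's built-in `crypto` module with Ed25519 keys as a stand-in: Gun's SEA has its own key format and signing functions, so none of the function names below are Gun APIs, and the whole flow is an assumption about how a relay-side check might work.

```javascript
const crypto = require('crypto');

// Relay side (hypothetical): issue a fresh random challenge per connection.
function createChallenge() {
  return crypto.randomBytes(32);
}

// Client side (hypothetical): only the holder of the private key can sign it.
function signChallenge(challenge, privateKey) {
  return crypto.sign(null, challenge, privateKey); // algorithm is null for Ed25519
}

// Relay side (hypothetical): accept the keypair only if the signature checks out.
function provesControl(challenge, signature, publicKey) {
  return crypto.verify(null, challenge, publicKey, signature);
}

const { publicKey, privateKey } = crypto.generateKeyPairSync('ed25519');
const challenge = createChallenge();
const signature = signChallenge(challenge, privateKey);
console.log(provesControl(challenge, signature, publicKey)); // true
```

A script that generated the keypair holds the private key too and passes this check just as a human user does, so on its own the check only filters out keypairs whose owner never comes online.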

At first glance this slows down the distribution of user accounts (and so the ability to log in to Gun from everywhere), but think of it as an organic distribution speed of user accounts.

  • Why should relays synchronize user accounts among each other that are not used anyway?
  • While the user is online, the keypair gets stored and synced on the relays as usual.
  • The keypairs and the data already on the relays get distributed to clients as usual, whether the user is online or not ;) (two birds with one stone)

This measure would not interfere with public space requests, nor with local-first scenarios (Gun has no multi-user support yet anyway, and multiple incognito browser sessions would still work).


No user effort necessary, just a simple script on the relay. It doesn't affect the existing functions; I had them in mind!

Any thoughts?


PS: Maybe this can be combined with a proof of work, or a captcha, to protect us from a fake database or virtual machines.

I really dislike captchas, but keep in mind that a captcha requested by the relay could further ensure that it is one human on one device.

I think human verification is a good fit for Gun relays in general: we want full openness and resource sharing, but we want the relay users to be legit humans; we don't want to waste resources on botnets or script hackers/trolls.


UPDATED!!! : Maybe we could recreate the Friendly Captcha mechanism with SEA.work?

Friendly Captcha is a proof-of-work based solution in which the user’s device does all the work.

Friendly Captcha generates a unique crypto puzzle for each visitor. As soon as the user starts filling a form it starts getting solved automatically. Solving it will usually take a few seconds. By the time the user is ready to submit, the puzzle is probably already solved.
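To make the proof-of-work idea concrete, here is a minimal hashcash-style sketch using Node's built-in `crypto` module rather than SEA.work. It is not Friendly Captcha's actual algorithm; the function names, the `challenge:nonce` format and the difficulty of 12 bits are all assumptions chosen for illustration.

```javascript
const { createHash, randomBytes } = require('crypto');

// True if the first `bits` bits of the hash (a Buffer) are all zero.
function hasLeadingZeroBits(hash, bits) {
  let remaining = bits;
  for (const byte of hash) {
    if (remaining <= 0) return true;
    if (remaining >= 8) {
      if (byte !== 0) return false;
      remaining -= 8;
    } else {
      return (byte >> (8 - remaining)) === 0;
    }
  }
  return remaining <= 0;
}

// Cheap for the verifier: a single hash.
function verify(challenge, nonce, bits) {
  const hash = createHash('sha256').update(challenge + ':' + nonce).digest();
  return hasLeadingZeroBits(hash, bits);
}

// Expensive for the solver: about 2^bits hashes on average.
function solve(challenge, bits) {
  let nonce = 0;
  while (!verify(challenge, nonce, bits)) nonce++;
  return nonce;
}

const challenge = randomBytes(16).toString('hex'); // unique puzzle per visitor
const nonce = solve(challenge, 12);
console.log(verify(challenge, nonce, 12)); // true
```

Each extra bit of difficulty doubles the expected work for the solver while verification stays at one hash. Proof of work raises the cost per account or per post; it does not tell a human from a script.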

image

amark changed the title from "Security Issue - Strong vulnerability to VSCode with Gun dependency, scripts and bots." to "How to rate limit?" on Feb 9, 2023
Owner

amark commented Feb 9, 2023

Performance (speed) of being able to post thousands of messages is not a security issue, GUN is intentionally fast. Tho nor is this GUN-specific, any database or filesystem being able to write thousands of messages is not a security vulnerability.

In fact, GUN being fast is what allows it to handle larger traffic while reducing likelihood of DDoS etc. these are GOOD things not BAD things.

Relays already check signature verification on data - they are, after all, just another peer. So seems like your question or concern here is already fine, there isn't an issue.

Remember, peers don't store/backup data unless they query it. So don't worry about someone else generating data, if you are not interested in it, you won't be storing it. A relay might but most relays are on ephemeral clouds so they wipe storage. If you want some data saved long-term, you should make sure your peer (relay or not) is subscribed to it.

GitHub issues are meant for bug reports, preferably not questions/discussions. Please close?

Author

worldpeaceenginelabs commented Feb 9, 2023

@amark
Hi Mark.

> GitHub issues are meant for bug reports, preferably not questions/discussions.

The original title was "Security issue - ..." plus a proposal for how to rate limit against it.
That is how I understand the GitHub Issues section to work.
And if you can do 1000x ..., that's concerning to me!

> A relay might but most relays are on ephemeral clouds so they wipe storage. If you want some data saved long-term, you should make sure your peer (relay or not) is subscribed to it.

I understand this, but long-term storage is not even my concern. My strategy will be a mix of local storage, subscription, distribution logic and median relay lifetime (before it eventually gets wiped).

> Remember, peers don't store/backup data unless they query it. So don't worry about someone else generating data, if you are not interested in it, you won't be storing it.

I am really concerned about this scenario, because I am close to production.
Peers will query the content address user.get(couchsurfing).get(freeCouchOnMap). This is where I store the couch posts that populate the map with couches!

So a simple script like the following renders 1000 couches from 1000 fake users on my map, or not?

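const Gun = require('gun');
require('gun/sea'); // SEA provides the user (account) API

const gun = Gun(['http://yourrelay.com/gun']);
const data = { couch: 'fake' }; // placeholder payload

(async () => {
  for (let i = 0; i < 1000; i++) { // repeat the loop 1000 times
    const user = gun.user();
    const username = 'fake-user-' + i;
    const password = 'fake-password-' + i;
    await new Promise((resolve) => user.create(username, password, resolve)); // create user
    await new Promise((resolve) => user.auth(username, password, resolve)); // log user in
    user.get('couchsurfing').get('freeCouchOnMap').put(data); // drop data
    await new Promise((resolve) => setTimeout(resolve, 10000)); // pause for 10 seconds
    user.leave(); // log user out
    console.log('user logged out');
  }
})();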

Or do you guys do something totally different with content addressing than I do?

If I am just wrong and it works differently from what I describe, of course we can close this issue!
But at least I want to know...

Author

worldpeaceenginelabs commented Feb 9, 2023

> Performance (speed) of being able to post thousands of messages is not a security issue, GUN is intentionally fast. Tho nor is this GUN-specific, any database or filesystem being able to write thousands of messages is not a security vulnerability.
> In fact, GUN being fast is what allows it to handle larger traffic while reducing likelihood of DDoS etc. these are GOOD things not BAD things.

I am not criticising Gun in any way. My proposal is not aimed at general data transfer and distribution.

My proposal focuses only on the distribution of the pub/priv keys, to prevent fake account flooding.

worldpeaceenginelabs changed the title from "How to rate limit?" to "How to avoid fake account flooding? How to rate limit against spam?" on Feb 9, 2023
@michaelbrown169

This is a concern I have had since starting to work on my own P2P project using gun.js. Interestingly, I came up with the same solution (proof of work). I was not aware of how the syncing was done (I'm still learning gun.js) with the relays but I see the way Mark describes it as being a viable solution in the general case. However, in this specific case any ability to spam that can then force a memory "wipe" can prevent communication and is a security flaw (even if it is only occasional). On the surface this seems like a serious problem that should be addressed. If it is not then a good (easy to understand) explanation should be made available in the gun.js docs.

Author

worldpeaceenginelabs commented Feb 9, 2023

@michaelbrown169

One problem is the script hacker, the other is the bot or botnet. Both can be hindered with proof of work and some pub/priv key logic as described above. It stops the bot, it stops the script hacker, and it keeps us free of dead accounts.

btw: The proof-of-work puzzle could be run client-side before any send action, without being annoying.

Author

worldpeaceenginelabs commented Feb 11, 2023

@amark Hi Mark,

This is again a proposal; I just tend to phrase things as questions to get the reader thinking 😅 Sorryyyy

What if you implemented a SEA.work routine in user.create in the Gun lib itself? I mean that Gun's user.create would run a built-in proof of work for user creation, which can't be circumvented because it is in user.create itself.

The same could go for user.auth, either with just SEA.work or even with the self-solving crypto puzzle, but integrated into user.auth directly, so it can't be circumvented.

What about proof of work in .put, maybe? (a thought while writing this)

You know that I am still a beginner, so would that work? Can it be circumvented? If so, would it still deter 99%?
(I could imagine that someone manually using the scripts from your repo would still get around it, but anyone who just runs npm install gun would be stopped!)
Bots and scripts dealt with? Client-side rate limiting of each user does the rest?


This proposal is the proof-of-work part from above, shorter and more mature.
It's a stand-alone solution, but could be combined with the "organic distribution speed of user accounts" concept from above.

Contributor

draeder commented Feb 12, 2023

> This is a concern I have had since starting to work on my own P2P project using gun.js. Interestingly, I came up with the same solution (proof of work). I was not aware of how the syncing was done (I'm still learning gun.js) with the relays but I see the way Mark describes it as being a viable solution in the general case. However, in this specific case any ability to spam that can then force a memory "wipe" can prevent communication and is a security flaw (even if it is only occasional). On the surface this seems like a serious problem that should be addressed. If it is not then a good (easy to understand) explanation should be made available in the gun.js docs.

This is not a problem when using user space. Only the user can mutate its own graph unless the user gives permission to another user to mutate its graph. If you're only using public space in gun, this can appear to be a problem, but it's not. It is by design.

Author

worldpeaceenginelabs commented Feb 20, 2023

@amark

Hi.

I identified the three files that I would suggest updating with a self-solving crypto puzzle, which is a proof of work and a human verification in one:
https://github.com/amark/gun/blob/master/sea/create.js (maybe not, because create calls pair anyway)
https://github.com/amark/gun/blob/master/sea/pair.js
https://github.com/amark/gun/blob/master/sea/auth.js

In my opinion you could insert the crypto puzzle right at the beginning of each of these three files,
with await and, I think, a promise? The original code, just as it is now, would simply be executed AFTER the promise resolves. What do you think? Would that work?
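The "run the original code only after the promise" idea can be sketched as a generic wrapper. Everything here is hypothetical: `gateBehind`, `solvePuzzle` and `createPair` are stand-ins for illustration and do not reflect how Gun's sea/create.js, sea/pair.js or sea/auth.js are actually structured.

```javascript
// Hypothetical wrapper: run `action` only after the async `check` has resolved.
function gateBehind(check, action) {
  return async function gated(...args) {
    await check(); // if the check rejects, the action is never called
    return action(...args);
  };
}

// Stand-in puzzle: resolves after a short delay, like a solved proof of work.
const solvePuzzle = () => new Promise((resolve) => setTimeout(resolve, 10));

// Stand-in for the original function body (for example, creating a key pair).
const createPair = (alias) => ({ alias, createdAt: Date.now() });

const gatedCreatePair = gateBehind(solvePuzzle, createPair);
gatedCreatePair('alice').then((pair) => console.log(pair.alias)); // logs "alice" once the puzzle has resolved
```

A gate like this lives in the client's own copy of the library, so anyone can call the ungated function or remove the check. To be enforceable, some other peer would have to verify the solved puzzle.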

The const puzzle needs to be harder though... and it does not use SEA.work in the way intended!
I would rather encrypt the solution and then decrypt it, which resolves the promise.

image

I found a harder puzzle:

image

No. 2 performs the following operations in a random order:

  • Reverse the string
  • Shuffle the characters of the string
  • Convert the string to uppercase or lowercase
  • Replace some characters with symbols or numbers

This will result in a puzzle that is much harder to solve than a simple string sort, as it requires multiple steps to reverse and transform the string.

The same code as text:

const Gun = require('gun');
const SEA = require('gun/sea');

// Generate a complex puzzle
const operations = [
  (str) => str.split('').reverse().join(''),
  (str) => str.split('').sort(() => Math.random() - 0.5).join(''),
  (str) => Math.random() < 0.5 ? str.toUpperCase() : str.toLowerCase(),
  (str) => {
    const replacements = {
      'a': '4',
      'e': '3',
      'i': '1',
      'o': '0'
    };
    return str.split('').map(c => replacements[c] || c).join('');
  }
];

let puzzleComplexity = 4; // Number of operations to apply
let puzzleComplexityRemaining = puzzleComplexity;
let puzzle = Math.random().toString(36).substring(2, 22);

while (puzzleComplexityRemaining > 0) {
  const randomOpIndex = Math.floor(Math.random() * operations.length);
  const operation = operations[randomOpIndex];
  puzzle = operation(puzzle);
  puzzleComplexityRemaining--;
}

// Solve the puzzle
const solution = puzzle.split('').sort().join('');

// Hash the puzzle with Gun's SEA library. Note: SEA.work derives a hash
// (PBKDF2 by default) using the passphrase as salt; it does not encrypt.
const passphrase = Math.random().toString(36).substring(2, 8) + '-' +
                   Math.random().toString(36).substring(2, 8) + '-' +
                   Math.random().toString(36).substring(2, 8);
SEA.work(puzzle, passphrase, (hashedPuzzle) => {
  console.log('Hashed puzzle:', hashedPuzzle);

  // Store the hashed puzzle and the solution in the Gun database
  Gun().get('puzzle').put({
    puzzle: hashedPuzzle,
    solution: solution
  });
});
