Yay spammers arrived 🎉 #9

Open
etemiz opened this issue Jan 17, 2023 · 22 comments

Comments

@etemiz

etemiz commented Jan 17, 2023

Need to think seriously about spam.

Basic rate limiting, but also allowing bursts of events/messages. A longer time window with a bigger limit can achieve this.

Instead of 3/sec maybe 20/min.
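
A minimal sketch of the longer-window idea, assuming a sliding window keyed by IP or pubkey. The class name, the 20-per-60-seconds defaults, and the in-memory storage are illustrative assumptions, not anything the relay implements:

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most `limit` events per `window` seconds for each key."""

    def __init__(self, limit=20, window=60.0):
        self.limit = limit
        self.window = window
        self.seen = defaultdict(deque)  # key -> timestamps of accepted events

    def allow(self, key, now=None):
        now = time.monotonic() if now is None else now
        stamps = self.seen[key]
        # Drop timestamps that have fallen out of the window.
        while stamps and now - stamps[0] >= self.window:
            stamps.popleft()
        if len(stamps) >= self.limit:
            return False
        stamps.append(now)
        return True
```

With these defaults a client can send 20 events back to back (a burst), but the 21st is refused until the oldest one is a minute old.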

If nobody is following the guy, he should get more penalties, or more rate limiting. If the pubkey is new, more rate limiting. It quickly turns into a statistics problem that bloats the relay, but these are some of the easiest things to implement, I think.

Another suggestion is incremental PoW: the relay requires more and more PoW from an IPv4 address once it finds that address is spamming. IPv6 is harder to control, I think, because addresses are cheaper. I don't know if there is a NIP for this. When a relay rejects an event, the client does more work to reach the required PoW and resubmits it.
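
A rough sketch of how escalating PoW could work on the relay side. The difficulty measure is the NIP-13 one (leading zero bits of the event id); the `EscalatingPow` class, the 4-bit step, and the cap of 28 are made-up values for illustration only:

```python
from collections import defaultdict


def pow_difficulty(event_id_hex):
    """Count leading zero bits in a hex event id (the NIP-13 difficulty)."""
    bits = 0
    for ch in event_id_hex:
        nibble = int(ch, 16)
        if nibble == 0:
            bits += 4
            continue
        # 4 - bit_length is the number of leading zeros inside this nibble.
        bits += 4 - nibble.bit_length()
        break
    return bits


class EscalatingPow:
    """Hypothetical policy: each spam strike raises the PoW required from an IP."""

    def __init__(self, base=0, step=4, cap=28):
        self.base, self.step, self.cap = base, step, cap
        self.strikes = defaultdict(int)  # ip -> number of spam strikes

    def required(self, ip):
        return min(self.base + self.step * self.strikes[ip], self.cap)

    def flag_spam(self, ip):
        self.strikes[ip] += 1

    def accepts(self, ip, event_id_hex):
        return pow_difficulty(event_id_hex) >= self.required(ip)
```

How the relay tells the client which difficulty it now expects is left out here; that negotiation is the open question.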

Similar messages can also be slowed down even if they come from different IPs. The current spammer constantly sends 'similar' good morning messages. I don't think it is using different IPs, though; this is more for future-proofing.
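
One way to sketch "similar message" detection, independent of IP, is to normalize the content and compare it against a short memory of recent messages. The 0.85 threshold, the memory size, and the class name are assumptions chosen for illustration; `difflib` is slow at scale, so a real relay would likely want something cheaper:

```python
import re
from collections import deque
from difflib import SequenceMatcher


def normalize(text):
    """Lowercase, replace anything but a-z, 0-9 and spaces, collapse whitespace."""
    text = re.sub(r"[^a-z0-9 ]+", " ", text.lower())
    return " ".join(text.split())


class SimilarityThrottle:
    """Flag content that closely resembles something seen recently."""

    def __init__(self, threshold=0.85, memory=200):
        self.threshold = threshold
        self.recent = deque(maxlen=memory)  # normalized recent messages

    def is_repeat(self, content):
        norm = normalize(content)
        hit = any(
            SequenceMatcher(None, norm, old).ratio() >= self.threshold
            for old in self.recent
        )
        self.recent.append(norm)
        return hit
```

A message flagged as a repeat could be delayed or counted against a stricter rate limit rather than dropped outright.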

@etemiz
Author

etemiz commented Jan 17, 2023

Maybe spam detection can be a separate process/thread/app that talks to the main relay, to ease the load.

@hoytech
Owner

hoytech commented Jan 17, 2023

I have a list of possible spam mitigations like IP and/or pubkey-based rate limits. I'm not sure what would be the most effective. Looking at the number of followers of a possible spammer is an interesting idea! You'd probably want some kind of web-of-trust approach to avoid the spammer creating a bunch of fake followers.
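
A small sketch of the web-of-trust idea: only count followers that are reachable from a set of seed pubkeys the operator already trusts. The `follows` mapping (pubkey to the set of pubkeys it follows, as could be built from contact lists), the seed set, and the 2-hop limit are all assumptions for illustration:

```python
def trusted_set(follows, seeds, max_hops=2):
    """Pubkeys reachable from `seeds` within `max_hops` follow edges."""
    trusted = set(seeds)
    frontier = set(seeds)
    for _ in range(max_hops):
        frontier = {
            followed
            for pk in frontier
            for followed in follows.get(pk, ())
        } - trusted
        trusted |= frontier
    return trusted


def trusted_follower_count(follows, seeds, target, max_hops=2):
    """Count followers of `target` that sit inside the trusted set."""
    trusted = trusted_set(follows, seeds, max_hops)
    return sum(1 for pk in trusted if target in follows.get(pk, ()))
```

Fake accounts that only follow each other never enter the trusted set, so they add nothing to the spammer's count.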

You're right, the GM bot could be filtered out with simple content rules. I think ultimately we'll have to take cues from email, like Bayesian content filtering, IP-based blacklists, whitelisting indicators (SPF/DKIM), etc.
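
For a feel of what Bayesian content filtering looks like, here is a toy naive Bayes classifier with add-one smoothing. It is a sketch only: the class name, tokenizer, and labels are assumptions, and a usable filter would need real training data and persistence:

```python
import math
import re
from collections import Counter


class NaiveBayesSpamFilter:
    """Toy two-class naive Bayes over word counts."""

    def __init__(self):
        self.words = {"spam": Counter(), "ham": Counter()}
        self.docs = {"spam": 0, "ham": 0}

    @staticmethod
    def tokens(text):
        return re.findall(r"[a-z0-9']+", text.lower())

    def train(self, text, label):
        self.words[label].update(self.tokens(text))
        self.docs[label] += 1

    def _log_score(self, toks, label, vocab_size):
        total = sum(self.words[label].values())
        # Smoothed class prior, then smoothed per-word likelihoods.
        score = math.log((self.docs[label] + 1) / (sum(self.docs.values()) + 2))
        for tok in toks:
            score += math.log((self.words[label][tok] + 1) / (total + vocab_size))
        return score

    def spam_probability(self, text):
        toks = self.tokens(text)
        vocab = len(set(self.words["spam"]) | set(self.words["ham"])) or 1
        s = self._log_score(toks, "spam", vocab)
        h = self._log_score(toks, "ham", vocab)
        # Clamp the difference so math.exp cannot overflow on long texts.
        diff = max(-700.0, min(700.0, h - s))
        return 1.0 / (1.0 + math.exp(diff))
```

Trained on a handful of GM-bot messages and normal notes, `spam_probability` returns a value between 0 and 1 that a relay could compare against a threshold of its choosing.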

I think the relay will need some kind of built-in rate limits to prevent high-rate spam from ever getting recorded/rebroadcast. I do like your idea of a dedicated "filtering" relay process. You could put it in front of any nostr relay and it would enforce whatever kinds of policies you want.

@etemiz
Author

etemiz commented Jan 17, 2023

I guess NIP-13 needs to be improved for client-relay negotiation about the required PoW.

Yes, that web-of-trust could be something like a Python machine learning process that detects spam. It would coordinate with the relay through a separate mechanism. I was thinking of a background process that looks at past events and comes up with rules for the relay to follow.

@etemiz
Author

etemiz commented Jan 19, 2023

I tried to add this to NIP-20. fiatjaf said to just use the existing return mechanism.
nostr-protocol/nips#178

@etemiz
Author

etemiz commented Jan 26, 2023

IPv4-based rate limiting would work for a while. Maybe 20 events/min, to allow for a user liking all the posts in a thread? Queries can be a lot higher per minute, like 240/min? Just throwing out numbers; I have no idea what real usage looks like.

@etemiz
Author

etemiz commented Jan 28, 2023

In note1k0c0yssl4h0x6vyrfcmuw83mgjl53qv2ja8ajvd68wgw9xsy8ulqumju2v @jb55 talks about how clients submit 'reports' and how other clients can use them.
I guess relays can use that data too.

@etemiz
Author

etemiz commented Jan 30, 2023

https://github.com/nostr-protocol/nips/blob/reporting/56.md
This is not on the master branch, though.

@jb55
Contributor

jb55 commented Feb 5, 2023

I'm behind Cloudflare, where's the best place to pull the CF-Connecting-IP header and use that for the source IP for logs and IP blocking?
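
As a general illustration (not strfry code, and not an answer about where this belongs in the relay), the usual pattern is to trust CF-Connecting-IP only when the TCP peer is a known proxy, and to fall back to the peer address otherwise. The function name is hypothetical and the proxy range used in testing is a documentation placeholder, not a real Cloudflare range:

```python
import ipaddress


def client_ip(peer_ip, headers, trusted_proxies):
    """Return the address to use for logging and blocking.

    `trusted_proxies` is a list of ipaddress networks. The header is only
    honoured when the direct peer is inside one of them, since anyone
    connecting directly could otherwise forge it.
    """
    peer = ipaddress.ip_address(peer_ip)
    if not any(peer in net for net in trusted_proxies):
        return str(peer)
    lowered = {k.lower(): v for k, v in headers.items()}
    forwarded = lowered.get("cf-connecting-ip", "").strip()
    try:
        return str(ipaddress.ip_address(forwarded))
    except ValueError:
        # Header missing or malformed: fall back to the peer address.
        return str(peer)
```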

@jb55
Contributor

jb55 commented Feb 16, 2023

spammers are slowly filling up my hard drive space, any way to rate limit them?

@jb55
Contributor

jb55 commented Feb 16, 2023

or even a way to drop lots of data at once? how do you even query this database. it was pretty easy with sqlite but now I'm lost here :P

@hoytech
Owner

hoytech commented Feb 16, 2023

You are using the version from the master branch, right? This version has a known issue where hard drive usage grows excessively over time. I'm almost ready to release an official "0.1" branch, but for now I think the beta branch is pretty stable. @etemiz has been running it on wss://nos.lol for a few days now. Unfortunately, you'll have to rebuild the DB: strfry export > dump.jsonl, then mv strfry-db/data.mdb data.mdb.old, then strfry import < dump.jsonl.

For dropping data, the beta branch also has a new command, strfry delete. You can pass it a nostr filter, plus an optional --age param, and it will delete all matching events. For example, here's how you could delete all events older than 1 day (86400 seconds) that have kind=1: strfry delete --age 86400 --filter '{"kinds":[1]}'

The lack of a custom query language is one of the downsides of using a key-value store like LMDB. What kind of queries would be most helpful? For basic things there's a strfry scan that will print out all events (in jsonl format) that match a particular nostr filter, for example, all events by pubkeys starting with 00: strfry scan '{"authors":["00"]}'
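
To make the filter semantics in the two examples above concrete, here is a simplified sketch of matching on `kinds`, author prefixes, and an age cutoff. It illustrates the idea only and is not how strfry implements it; in particular, treating an event exactly at the cutoff as old enough is an assumption:

```python
def matches(event, flt):
    """Simplified nostr filter match: kinds, author prefixes, since/until."""
    if "kinds" in flt and event["kind"] not in flt["kinds"]:
        return False
    if "authors" in flt and not any(
        event["pubkey"].startswith(prefix) for prefix in flt["authors"]
    ):
        return False
    if "since" in flt and event["created_at"] < flt["since"]:
        return False
    if "until" in flt and event["created_at"] > flt["until"]:
        return False
    return True


def select_for_delete(events, flt, age, now):
    """Events matching `flt` that are at least `age` seconds old at time `now`."""
    cutoff = now - age  # inclusive boundary is an assumption
    return [e for e in events if e["created_at"] <= cutoff and matches(e, flt)]
```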

@hoytech
Owner

hoytech commented Feb 16, 2023

Regarding rate limiting, there is a plugin architecture in the beta branch that we're working on, described here: https://github.com/hoytech/strfry/blob/beta/docs/plugins.md

The trick is being able to programmatically determine what is spam and what isn't!
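
A sketch of what a rate-limiting plugin might look like, assuming the line-delimited JSON protocol as I understand it from that document: one request object per line on stdin carrying an `event` field, and one reply per line on stdout with `id`, `action` (`accept` or `reject`), and `msg`. Those field names should be checked against the doc for the branch you run; the per-pubkey limit of 20 per minute is an arbitrary choice:

```python
import json
import sys
import time
from collections import defaultdict, deque

LIMIT = 20      # events allowed per pubkey per window (arbitrary)
WINDOW = 60.0   # window length in seconds (arbitrary)
_recent = defaultdict(deque)  # pubkey -> timestamps of accepted events


def decide(request, now):
    """Build the reply for one plugin request."""
    event = request["event"]
    stamps = _recent[event["pubkey"]]
    while stamps and now - stamps[0] >= WINDOW:
        stamps.popleft()
    if len(stamps) >= LIMIT:
        return {"id": event["id"], "action": "reject",
                "msg": "rate-limited: slow down"}
    stamps.append(now)
    return {"id": event["id"], "action": "accept", "msg": ""}


def main(stdin=sys.stdin, stdout=sys.stdout):
    """Read requests line by line and answer each one immediately."""
    for line in stdin:
        if not line.strip():
            continue
        reply = decide(json.loads(line), time.monotonic())
        stdout.write(json.dumps(reply) + "\n")
        stdout.flush()  # the relay is waiting on each reply
```

To run it as a plugin script, call `main()` at the bottom of the file. Rate limiting is the easy part; deciding what counts as spam still has to be plugged into `decide`.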

@etemiz
Author

etemiz commented Feb 16, 2023

My Python plugins are effective to a degree. I'm thinking of using kind 1984 reports as soon as I can find the most discerning and incorruptible users, who will never make mistakes (I need to meditate a lot for this, hehe).
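
A minimal sketch of the report idea, assuming report events are kind 1984 and name the reported pubkey in a `p` tag. The `trusted_reporters` set and the threshold of two distinct reporters are hypothetical choices, and this is not the actual plugin code:

```python
from collections import defaultdict


def reported_pubkeys(events, trusted_reporters, min_reports=2):
    """Pubkeys reported by at least `min_reports` distinct trusted reporters."""
    reporters_by_target = defaultdict(set)
    for ev in events:
        if ev.get("kind") != 1984 or ev.get("pubkey") not in trusted_reporters:
            continue
        for tag in ev.get("tags", []):
            if len(tag) >= 2 and tag[0] == "p":
                reporters_by_target[tag[1]].add(ev["pubkey"])
    return {
        target
        for target, reporters in reporters_by_target.items()
        if len(reporters) >= min_reports
    }
```

Counting distinct reporters rather than raw reports means one trusted user cannot trigger a block alone by reporting repeatedly.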

@etemiz
Author

etemiz commented Feb 16, 2023

Sent you an email @jb55

@jb55
Contributor

jb55 commented Feb 17, 2023

will try the master branch. I'll just grep out the spam on reimport
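
Filtering the dump before reimport could also be done with a few lines of Python instead of grep, which makes it easier to match on the parsed `content` field rather than the raw line. The pattern below is a hypothetical stand-in for whatever the spam actually looks like:

```python
import json
import re

# Hypothetical pattern: kind-1 notes that start with "gm" or "good morning".
SPAM_PATTERN = re.compile(r"^\s*(gm|good morning)\b", re.IGNORECASE)


def filter_dump(lines, pattern=SPAM_PATTERN):
    """Yield JSONL lines from an export, skipping kind-1 events that match."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        event = json.loads(line)
        if event.get("kind") == 1 and pattern.search(event.get("content", "")):
            continue
        yield line
```

Feeding the lines of dump.jsonl through `filter_dump` and writing the result to a new file gives a cleaned dump to pass to strfry import.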

@hoytech
Owner

hoytech commented Feb 17, 2023

@jb55 - The master branch is still the older "stable" one. The beta branch has most of the bug fixes and improvements. Let me know if you encounter any issues on reimport!

@jb55
Contributor

jb55 commented Feb 17, 2023

@hoytech can't seem to build beta:

golpe/external/rasgueadb/rasgueadb-generate golpe.yaml build
duplicate tableId:  in CompressionDictionary at golpe/external/rasgueadb/rasgueadb-generate line 39.
make: *** [golpe/rules.mk:49: build/defaultDb.h] Error 255

@hoytech
Owner

hoytech commented Feb 17, 2023

Ahh sorry, I should write better docs for this. I think you need to update submodules:

git submodule update
make setup-golpe

You'll also need to install a new package: apt install -y libzstd-dev

@jb55
Contributor

jb55 commented Feb 17, 2023

I've done all those things but still get that error =/

@jb55
Contributor

jb55 commented Feb 17, 2023

even after cleaning

@hoytech
Owner

hoytech commented Feb 17, 2023

I believe that error is being thrown because the sub-sub-module inside golpe/external/rasgueadb/ is out of date. I would've expected make setup-golpe to update that.

Can you try cloning a fresh copy of the repo and building from there?

git clone https://github.com/hoytech/strfry.git
cd strfry
git checkout beta
git submodule update --init
make setup-golpe
make -j4

There must be some reason it's failing to update that submodule, but I'm not sure what it would be.

@jb55
Contributor

jb55 commented Feb 17, 2023

weird ya fresh checkout worked
