Channel blacklist #2

Open
FireEater64 opened this Issue Jan 3, 2016 · 11 comments

Owner

FireEater64 commented Jan 3, 2016

Twitch streamers should be able to dismiss the scraper from their channels.

BillieJackFu commented Jan 3, 2016

I understand programming. Shouldn't you be able to add an if/then: on ban, /part the channel?

Owner

FireEater64 commented Jan 3, 2016

In theory, yes - the scraper can listen for a specific command and add the channel to a blacklist (probably a file, so it persists between restarts). Ideally, though, I only want channel mods/owners to be able to dismiss the scraper, so it might be a little more involved. I'll look into it as soon as I get a chance :)
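
For illustration only (this is not the scraper's actual code), a file-backed blacklist along those lines could look like the minimal Python sketch below. The class name `ChannelBlacklist`, the file name, and the one-channel-per-line format are all assumptions made for the example.

```python
from pathlib import Path


class ChannelBlacklist:
    """Set of channel names persisted one per line in a plain-text file.

    Hypothetical helper: the name and file format are assumptions,
    not taken from the scraper's source.
    """

    def __init__(self, path):
        self.path = Path(path)
        self.channels = set()
        # Reload any previously blacklisted channels so they survive restarts.
        if self.path.exists():
            for line in self.path.read_text(encoding="utf-8").splitlines():
                name = self._normalise(line)
                if name:
                    self.channels.add(name)

    @staticmethod
    def _normalise(channel):
        # Treat "#Channel" and "channel" as the same entry.
        return channel.strip().lstrip("#").lower()

    def add(self, channel):
        """Blacklist a channel and rewrite the file immediately."""
        name = self._normalise(channel)
        if not name:
            return
        self.channels.add(name)
        self.path.write_text("\n".join(sorted(self.channels)) + "\n", encoding="utf-8")

    def __contains__(self, channel):
        return self._normalise(channel) in self.channels
```

The scraper would check `channel in blacklist` before joining, and call `blacklist.add(channel)` whenever it is dismissed.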

BillieJackFu commented Jan 3, 2016

Only broadcasters and mods can ban someone. However, you won't know you're banned unless you chat. So maybe have your scraper send "Hello"; then you will receive a message "You are permanently banned from talking in ".

Owner

FireEater64 commented Jan 3, 2016

@BillieJackFu Good point about only mods/owners having permission to ban. I believe Twitch IRC sends a NOTICE message when you're banned from a channel - I should be able to simply intercept that.

BillieJackFu commented Jan 3, 2016

I'm a graphic designer, but my grad project for IT was programming; I understand it, but haven't touched any code since 2007. I leave that to the true geeks.

Owner

FireEater64 commented Jan 3, 2016

@BillieJackFu Well, I can't draw to save my life - so it goes both ways! Appreciate the contribution regardless 😃

FireEater64 added a commit that referenced this issue Jan 4, 2016

Added temporary workaround for #2.
Manual blacklist creation now possible.

@FireEater64 FireEater64 self-assigned this Jan 4, 2016

theinfinitelurker commented Jan 4, 2016

Probably not the appropriate place to ask this, but, what do you run something like this on? How many channels does it connect to at once?

Owner

FireEater64 commented Jan 4, 2016

@theinfinitelurker I'm guessing you mean with regard to hardware? I'll write up a post on this at some point, but at the moment (anecdotally) I'm running an instance of the scraper connected to some 20,000 Twitch channels, dumping chat messages into a single ElasticSearch instance on the same machine (an HP Microserver G7), with the following specs:

  • AMD Turion II Neo N54L (2.2 GHz)
  • 4GB RAM
  • 1TB WD Red 5400 rpm HDD
  • Debian 8.2

This box processes around 15 million chat messages/day (which works out to ~2.5GB on disk) - and the scraper consumes between 5% and 10% of available CPU resources. The box is a little underpowered (especially in the RAM department) for some of the larger ElasticSearch queries - but seems to handle ingesting the data without too much difficulty.
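
To put those figures in perspective, here is a rough back-of-the-envelope calculation in Python. It assumes that ~2.5GB is the on-disk footprint of one day's messages, that "GB" means 10^9 bytes, and that traffic is spread evenly over the day and across channels (which real chat traffic certainly is not), so treat the results as averages only.

```python
SECONDS_PER_DAY = 24 * 60 * 60

messages_per_day = 15_000_000   # figure quoted above
bytes_per_day = 2.5e9           # assumption: ~2.5GB per day, decimal gigabytes
channels = 20_000               # figure quoted above

messages_per_second = messages_per_day / SECONDS_PER_DAY
bytes_per_message = bytes_per_day / messages_per_day
messages_per_channel_per_day = messages_per_day / channels

print(f"average ingest rate: {messages_per_second:.0f} messages/second")    # ~174
print(f"average stored size: {bytes_per_message:.0f} bytes/message")        # ~167
print(f"average per channel: {messages_per_channel_per_day:.0f} messages/day")  # 750
```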

Hopefully that helps, if you have any further questions - feel free to reach out to me on Twitter 😄

Owner

FireEater64 commented Jan 5, 2016

As per the Twitch IRC documentation, we only receive NOTICE messages if we request the 'commands' capability, as follows:

< CAP REQ :twitch.tv/commands
> :tmi.twitch.tv CAP * ACK :twitch.tv/commands

We then receive notification of a ban through a 'CLEARCHAT' message:

05-01-2016 23:15:07.764+0000 [DEBUG] Received: :tmi.twitch.tv CLEARCHAT #fire_eater64 :throwaway4132

In theory, we can check the target of the CLEARCHAT command, and remove ourselves from the chat if we've been banned. We'll never know if we're unbanned, however.
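
As a rough sketch of that check (not the scraper's actual implementation), the Python below parses the untagged CLEARCHAT form shown in the log line above and reports the channel when the target is our own nick. The function names `banned_channel` and `part_command` are hypothetical. One caveat, as far as I'm aware: Twitch also sends CLEARCHAT when a user is timed out, so this sketch cannot tell a timeout from a permanent ban.

```python
import re

# Capability request quoted above; it has to be sent before CLEARCHAT arrives.
CAP_REQUEST = "CAP REQ :twitch.tv/commands"

# Matches the untagged form seen in the log:
#   :tmi.twitch.tv CLEARCHAT #channel :username
# The trailing username is optional because a CLEARCHAT without a target
# clears the whole chat rather than a single user.
CLEARCHAT_RE = re.compile(
    r"^:tmi\.twitch\.tv CLEARCHAT #(?P<channel>\S+)(?: :(?P<user>\S+))?$"
)


def banned_channel(line, own_nick):
    """Return the channel name if this CLEARCHAT targets own_nick, else None."""
    match = CLEARCHAT_RE.match(line.strip())
    if match is None:
        return None
    user = match.group("user")
    if user is None or user.lower() != own_nick.lower():
        return None
    return match.group("channel")


def part_command(channel):
    """Build the standard IRC command for leaving a channel."""
    return f"PART #{channel}"
```

On a match, the scraper would send `part_command(channel)` and record the channel so it is not rejoined later.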

jaybyrrd commented Feb 7, 2016

FireEater64, it is easy to verify who is a mod or not. Simply hit the endpoint: http://tmi.twitch.tv/group/user/CHANNELNAME/chatters

And listen for a specific command. It is NOT HARD TO DO and should take all of 5-10 minutes to figure out. I think it is a safe bet that if we send a command to blacklist our own channel that we don't want your bot back.

Owner

FireEater64 commented Feb 7, 2016

@jaybyrrd Thanks for the suggestions.

"Simply hit the endpoint: http://tmi.twitch.tv/group/user/CHANNELNAME/chatters"

This is certainly possible; however, the Twitch API only returns a list of currently online moderators, which prevents us from caching them (otherwise a moderator who has just logged in would not be able to request removal). If we hit the Twitch API every time we receive the '!removechatscraper' command, then we open ourselves up to abuse (chat spamming '!removechatscraper'). I'm not trying to suggest that this isn't possible, simply that it's more complicated than you make it out to be.

"And listen for a specific command."

Again, entirely possible (with the caveat around identifying moderators given above). However, my problem with this (which is similar to the solution UniqBot uses) is that it requires channel operators to know a specific command (something like '!removechatscraper'). The nice part about listening for 'BAN' messages is that everyone already knows how to ban, and with a few changes the scraper could be made to behave as expected when banned (leave the channel, and never come back). In my mind, introducing a custom keyword has no advantages over the current system (channel owners contacting me manually) - channel owners would still have to find my Twitter/Blog. I'm more than happy to consider pull requests that add that functionality - I just don't believe it's the right solution.
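
For anyone considering such a pull request, one possible way to blunt the spam concern is a per-channel cooldown on moderator lookups. The Python sketch below is only an illustration under stated assumptions: `RemovalCommandHandler` is a made-up name, `fetch_moderators` is a hypothetical callable standing in for whatever queries the chatters endpoint, and treating the sender as the broadcaster when their name equals the channel name is an assumption about how Twitch names channels.

```python
import time


class RemovalCommandHandler:
    """Decide whether a '!removechatscraper' sender may dismiss the scraper.

    Hypothetical sketch: fetch_moderators(channel) is assumed to return the
    moderators currently online in that channel.
    """

    def __init__(self, fetch_moderators, cooldown_seconds=60.0, clock=time.monotonic):
        self.fetch_moderators = fetch_moderators
        self.cooldown_seconds = cooldown_seconds
        self.clock = clock
        self.last_lookup = {}  # channel -> time of the last moderator lookup

    def should_leave(self, channel, sender):
        sender = sender.lower()
        # Assumption: the broadcaster's login equals the channel name,
        # so the owner never needs an API lookup.
        if sender == channel.lower():
            return True
        now = self.clock()
        last = self.last_lookup.get(channel)
        if last is not None and now - last < self.cooldown_seconds:
            # A lookup was made recently; ignore the command rather than
            # hitting the API again for every spammed message.
            return False
        self.last_lookup[channel] = now
        moderators = {name.lower() for name in self.fetch_moderators(channel)}
        return sender in moderators
```

The trade-off is that a genuine moderator who issues the command during the cooldown window is ignored and has to retry, so spammers can still delay removal even though they can no longer trigger unlimited API calls.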
