Creator dislike submission #392

Open
aryavsaigal opened this issue Dec 31, 2021 · 23 comments
Labels
enhancement New feature or request

Comments

@aryavsaigal
Collaborator

aryavsaigal commented Dec 31, 2021

FOR THE CREATORS
They install a browser extension, which gives them a unique key. They upload an unlisted video with the unique key as its title, paste the video URL into the extension, and click a button. Afterwards, when they visit studio.youtube.com (?), a notice appears saying the extension is getting the dislike data, then another saying it sent the dislike data, or another saying there was an error (insert HTTP code here).

WHAT THE EXTENSION DOES
On install, the extension calls an API endpoint (GET /creators), which gives it a unique key.
After getting the video URL, it sends the video ID to the server (POST /creators), which returns an API key that is saved securely.
Every time studio.youtube.com is loaded, it reads the dislikes and sends them (POST /votes/submit) together with the API key (for auth).

WHAT THE API ENDPOINTS DO
(mind you, I'm not good at cryptography, so try to get the basic idea from here)
For GET /creators:
Generates a unique key and stores it somewhere.
For POST /creators:
Fetches the video's title and the uploader's channel ID. If the title matches a unique key that was generated earlier, generate an API key, attach the channel ID to it, and delete the unique key (or mark it as used if deleting is expensive).
Return the API key.
For POST /votes/submit:
The dislike data can only be incremented.
Since the authorisation tells us which channel sent the data, we check whether the video IDs belong to that channel.
Update the data and send some cool HTTP code.
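
The three endpoints above could be sketched roughly as follows. This is only an illustration of the idea, not a proposed implementation: the in-memory dictionaries, the function names, the status codes, and the `lookup_video` callback (a stand-in for whatever YouTube API call the server would make) are all assumptions.

```python
import secrets
from typing import Callable, Optional

# In-memory stand-ins for the server's storage (hypothetical names).
pending_keys: set = set()   # unique keys handed out by GET /creators
api_keys: dict = {}         # API key -> channel ID
dislikes: dict = {}         # video ID -> last accepted dislike count


def get_creators() -> str:
    """GET /creators: generate a unique key and remember it."""
    key = secrets.token_urlsafe(16)
    pending_keys.add(key)
    return key


def post_creators(video_id: str, lookup_video: Callable[[str], dict]) -> Optional[str]:
    """POST /creators: swap a unique key found in the video title for an API key.

    lookup_video is a placeholder for a YouTube API call; it is assumed to
    return a dict with "title" and "channel_id" entries.
    """
    video = lookup_video(video_id)
    title = video["title"].strip()
    if title not in pending_keys:
        return None                    # the title is not a key we issued
    pending_keys.discard(title)        # each unique key works only once
    api_key = secrets.token_urlsafe(32)
    api_keys[api_key] = video["channel_id"]
    return api_key


def post_votes_submit(api_key: str, video_id: str, count: int,
                      lookup_video: Callable[[str], dict]) -> int:
    """POST /votes/submit: return an HTTP-style status code."""
    channel_id = api_keys.get(api_key)
    if channel_id is None:
        return 401                     # unknown API key
    if lookup_video(video_id)["channel_id"] != channel_id:
        return 403                     # video belongs to another channel
    if count < dislikes.get(video_id, 0):
        return 409                     # counts may only go up
    dislikes[video_id] = count
    return 200
```

Note that nothing here checks whether the submitted count is true; it only checks who is submitting and that the number never goes down.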

WHAT THE SERVER NEEDS TO DO

  1. It needs to go over all the dislikes the creators submitted at a set interval and find out whether
    a) the count is being updated regularly; if not, attach some level of warning, which the RYD extension will show for that video.
    b) the dislikes are increasing in a believable way (compared to how the video's views and likes are increasing).
    If this does not hold, we do a manual check (maybe) and blacklist the creator if they're tampering with the dislikes.

  2. The server needs to store a mapping from video IDs to channel IDs so we don't have to keep using the YouTube API to verify that the videos a creator sent really belong to their channel.
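
A minimal sketch of the periodic check in point 1, assuming the server keeps a list of timestamped snapshots per video. The `Snapshot` fields, the one-week staleness window, and the "no more new dislikes than new views" rule are made-up placeholders; a rule like this only catches obviously inflated numbers, not a creator who under-reports.

```python
from dataclasses import dataclass


@dataclass
class Snapshot:
    timestamp: float   # seconds since the epoch
    views: int
    likes: int
    dislikes: int


def audit(history: list, now: float,
          max_age: float = 7 * 24 * 3600,
          max_dislikes_per_view: float = 1.0) -> list:
    """Return warning flags for one video's submission history (oldest first)."""
    flags = []
    # (a) Not updated regularly: the newest snapshot is too old or missing.
    if not history or now - history[-1].timestamp > max_age:
        flags.append("stale")
    # (b) Unbelievable growth: more new dislikes than new views allow.
    for prev, cur in zip(history, history[1:]):
        new_views = max(cur.views - prev.views, 0)
        new_dislikes = cur.dislikes - prev.dislikes
        if new_dislikes > max_dislikes_per_view * new_views:
            flags.append("implausible")
            break
    return flags
```

Flagged videos would then go to the manual check mentioned above rather than being blacklisted automatically.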

OTHER STUFF
Since the API does not allow dislike counts to be reduced, if there has been a significant dislike reduction on a video (for whatever reason), the content creator can contact us and give us an API key temporarily, and we manually check and update the count.
To build trust, we can upload the script that does this to GitHub or somewhere and run it on a voice call.
On the voice call we can show the checksum of the script and compare it to the checksum of the script on GitHub.
This will happen rarely, so it shouldn't be an issue.
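
The checksum comparison could look something like this, assuming SHA-256 is the checksum (the post does not name one) and that the published digest is a hex string copied from GitHub. The function names are hypothetical.

```python
import hashlib
import hmac


def sha256_hex(data: bytes) -> str:
    """Hex SHA-256 digest of the script's bytes."""
    return hashlib.sha256(data).hexdigest()


def matches_published(script_bytes: bytes, published_hex: str) -> bool:
    """Compare the local script against the digest published alongside the source."""
    expected = published_hex.strip().lower()
    return hmac.compare_digest(sha256_hex(script_bytes), expected)
```

On the call, both sides would run `sha256_hex` on their copy of the script and read out the result; any edit to the script changes the digest.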


WHAT THE RYD EXTENSION CAN DO
For point 2 under WHAT THE SERVER NEEDS TO DO, the RYD extension can read the channel ID for the video the user is watching (the channel ID should be present in the page) and send it to the server. If multiple IPs report the same channel ID, add it to the database.
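
One way to read "multiple IPs report the same channel ID" is a simple quorum: accept a video-to-channel mapping once enough distinct IPs agree on it. The threshold of 3 and all names below are assumptions for illustration.

```python
from collections import defaultdict

MIN_REPORTERS = 3  # assumed threshold; the post only says "multiple"

# video ID -> channel ID -> set of IPs that reported this pairing
reports = defaultdict(lambda: defaultdict(set))
# Accepted mappings (the "database" in this sketch)
video_to_channel = {}


def report(video_id: str, channel_id: str, ip: str) -> bool:
    """Record one report; return True once the video has an accepted mapping."""
    reporters = reports[video_id][channel_id]
    reporters.add(ip)  # repeat reports from the same IP count once
    if len(reporters) >= MIN_REPORTERS:
        # The first channel ID to reach the quorum wins.
        video_to_channel.setdefault(video_id, channel_id)
    return video_id in video_to_channel
```

Counting distinct IPs only raises the cost of lying; someone with enough addresses could still push a wrong mapping through.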

WHAT THE USERS CAN DO
Report skewed dislike counts. If multiple people report the same video, a manual check is done? (Or a more resource-heavy but more accurate check than the one the backend runs at set intervals.)

thanks to @jrwr for originally suggesting this

@aryavsaigal aryavsaigal added the enhancement New feature or request label Dec 31, 2021
@cyrildtm
Contributor

Just want to point out: if I understand the whole process correctly, there is a vulnerability that anyone can use. After the initial authorization and authentication (uploading an unlisted video with the public unique key), I can replace the entire YouTube website with a local private host, effectively running a MITM attack on myself. Or rather, I can replace just the dislike number for each video, while everything else is still fetched in real time from the official server.

@aryavsaigal
Collaborator Author

Just want to point out: if I understand the whole process correctly, there is a vulnerability that anyone can use. After the initial authorization and authentication (uploading an unlisted video with the public unique key), I can replace the entire YouTube website with a local private host, effectively running a MITM attack on myself. Or rather, I can replace just the dislike number for each video, while everything else is still fetched in real time from the official server.

That is indeed true. This should help prevent it to some degree:

It needs to go over all the dislikes the creators submitted at a set interval and find out whether
a) the count is being updated regularly; if not, attach some level of warning, which the RYD extension will show for that video.
b) the dislikes are increasing in a believable way (compared to how the video's views and likes are increasing).
If this does not hold, we do a manual check (maybe) and blacklist the creator if they're tampering with the dislikes.

The dislike data can only be incremented

@jetbalsa
Contributor

jetbalsa commented Dec 31, 2021

So, this is written a little incorrectly. The basic flow is this:

  • Creator contacts us, we give them a special API key to use with our API (A password if you will)
  • Creator installs an extension or enables a setting in the existing one
  • Extension will then auto-scan the creator's logged-in YouTube account for their videos and submit the dislike counts (maybe every time they visit their Studio page)
  • It then pushes this data to us, we only allow the data under some conditions:
    • The number can be only increased.
    • The Video has to belong to a preregistered channel and is confirmed by the backend with a backend API call to youtube (To ensure that they can only update their own videos)
  • The backend saves this data with some metadata like date submitted and such.

The risks, with their outcomes and mitigations, are as follows:

  • The creator only updates the number once
    Since we are storing metadata about this, we can run reports and see when this is happening. we can report this to the end user or a moderation team to deal with it.
  • The creator only submits fake numbers to show a slow increase of dislikes, completely faking the system.
    While this can happen, there isn't a ton you can do about it. You can do some fancy backend detection for when this is happening and report on it. Overall, there isn't a /ton/ of motivation for this. You could run reports for creators and see patterns, or have users submit something that shows that it's off, or use some fancy math to detect some kind of delta between the dislikes from users with the extension and the submitted count.

With my suggestions, most of the work is taken off us and most of the gathering logic moves to the creator's browser. With that, we can show creators the exact data we are gathering and gain their trust!

@cyrildtm
Contributor

cyrildtm commented Dec 31, 2021

Well, that's the point of hosting my own fake site: I can write an automatic routine that keeps some or all videos' real like and dislike counters in an internal database and performs a "beautify" function before reporting to the user (the dishonest creator).

The function must be monotonic, so it meets the plugin's requirement that it always and only increases. But for example instead of increasing by 100 I can only report 1.
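
To make the point concrete, here is a toy version of such a function together with the "only increases" rule it has to satisfy. The 1-in-100 factor comes from the example above; the function names and sample numbers are made up.

```python
def beautify(real_count: int, factor: int = 100) -> int:
    """Under-report: one dislike for every `factor` real ones (still monotonic)."""
    return real_count // factor


def only_increases(reported: list) -> bool:
    """The server-side rule: each submitted count is at least the previous one."""
    return all(later >= earlier for earlier, later in zip(reported, reported[1:]))


real = [0, 150, 900, 4200, 10000]
reported = [beautify(n) for n in real]   # [0, 1, 9, 42, 100]
print(only_increases(reported))          # True
```

The reported series passes the check even though it is off by a factor of 100, so the increase-only rule by itself says nothing about accuracy.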

Then I can spawn a thousand new channels uploading idk very authentic bot videos and do the same thing with the plugin, and finally I can provide this discrepancy as misleading evidence that "this plugin is inaccurate for a thousand small content creators". Sure we can provide our own plugin user data and point out that they are way off, but public trust is lost.

Also the plugin's user population is way smaller than youtube users. It's not sophisticated math to work out a sweet spot that the fake dislike count is bigger than the plugin user reports but still smaller than the real one hidden by youtube. As long as there is a significant margin then it's profitable.

@jetbalsa
Contributor

So we still have the human element here. Are we going to allow someone to register that many channels under their name, or even in bulk? Sure, this matters once the userbase grows to a few thousand channels, but most creators only have a handful. A validation workflow for a channel would be required to get a key, and this alone would stop most attempts to game the system. Don't forget that users can still report a channel to us; we can then hand-check what is going on and maybe require manual reporting (screenshots, video, something) to prove it's OK.

Your issue arises if the whole thing is automated, but even just having a ticketing system in discord would slow this right down.

Don't forget, we have ban hammers and can deal with creators. Also don't forget that the numbers we are using are for the end users, so we could detect when the count from the part of the userbase that disliked a video and the submitted count are /way/ off. Alongside that, awful videos that get reported too often can be hand-checked and investigated.

@cyrildtm
Contributor

Okay maybe this can work, but it really needs a lot of data from a lot of plugin users.

Check the plugin user's usual voting behavior and build a profile. It covers how well the user's likes and dislikes line up with others'. From here we can build trust in this particular plugin user. Given that all users (with and without the plugin) dislike a video at a certain rate, how likely is this particular plugin user to dislike or like it? This requires historical data, as we need to know about the videos that the plugin user disliked prior to the API shutdown. Then, conversely, we can project how likely a certain video is to be disliked by the general audience, given the plugin users' dislike count.

This entire thing is based on probability and even with straightforward Bayesian formulas I still don't trust it. I can argue that the model is unreliable: I treat all channels equally, and the behavior of a user is assumed to be the same across different channels, assumed not to change over time, and assumed to be equally active over all times. It's not true at all.

As for the human factor: having witnessed social media attacks and misinformation all over the place nowadays, I don't think anything is impossible. Since this plugin is only a small player, public trust is easy to destroy. I can split one hundred thousand currency units among a thousand people and ask them to do the same thing, each with their real identity but the right to hide it from the unofficial you and the chance to cover it with a fake. Once this plugin's method is proven vulnerable, public interest will move on, and not even the basic function we have as of now will be trusted or used. That's how social media works, isn't it?

@cyrildtm
Contributor

Also don't forget that the numbers we are using are for the end users, so we could detect when the count from the part of the userbase that disliked a video and the submitted count are /way/ off.

Your data is always way off. This plugin has no more than two million users on the Chrome extension store. Billions of people watch YouTube every day, according to the Internet. Your delta is 99.9%. What you want to match is your projection and the real dislike count. Currently linear extrapolation is being used (or at least claimed), and it's pretty accurate. All I want to do (and need to do) is show that your projection is off, and this can be done by providing massive amounts of fabricated creator data.
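
For readers unfamiliar with the term, one possible form of linear extrapolation is sketched below: scale the dislikes seen among extension users by the ratio of public likes to extension-user likes. This is an assumption for illustration only, not a statement of the formula the extension actually uses, and the names are hypothetical.

```python
from typing import Optional


def extrapolate_dislikes(ext_likes: int, ext_dislikes: int,
                         public_likes: int) -> Optional[int]:
    """Project total dislikes from the extension users' like/dislike ratio."""
    if ext_likes == 0:
        return None  # no basis for scaling
    return round(ext_dislikes * public_likes / ext_likes)


print(extrapolate_dislikes(ext_likes=200, ext_dislikes=10, public_likes=100_000))  # 5000
```

The projection is only as good as the assumption that extension users vote like everyone else, which is why a large body of fabricated creator data that disagrees with it would be damaging.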

Once this extension gains momentum, you can no longer keep anything away from automation, meaning you can't do authentication on Discord. Then you will need routines and algorithms that cost money and time. Is that something a FOSS project can achieve?

@sy-b
Contributor

sy-b commented Dec 31, 2021

@jrwr @DARKDRAGON532
Your proposed methods can stop bot attackers to a good degree,
but how are they going to deter rogue creators?
@cyrildtm's claim is still valid for that.

Also the reason to give the backend the access to creator's dislike (using OAuth & API) is to prevent forging of data.


The method assumes a bunch of things but doesn't describe them specifically

eg.

  • Creator contacts us, we give them a special API key to use with our API (A password if you will)

    You probably assumed that there'll be a team for this

  • The number can be only increased.

    Problem is that dislikes & likes can decrease.

  • Since we are storing metadata about this, we can run reports and see when this is happening. we can report this to the end user or a moderation team to deal with it.

    Again, this requires a team

  • While this can happen, there isn't a ton you can do about it.

    Can't say until we experiment with a lot of options

  • Overall, there isn't a /ton/ of motivation for this

    I doubt that

  • You can do some fancy backend detection for when this is happening and report on it.

    You could run reports for creators and see patterns, or have users submit something that shows that it's off, or use some fancy math to detect some kind of delta between the dislikes from users with the extension and the submitted count.

    You completely assumed that this one will work somehow.
    You didn't describe what & how?

@cyrildtm
Contributor

Also the reason to give the backend the access to creator's dislike (using OAuth & API) is to prevent forging of data.

Agreed, and thank you. Polling official data directly is both the right and the easy way. But then there's a trust issue between creators and developers. Currently OAuth gives you access to a lot of data of business interest other than dislike count, but at least it's manageable and there may be a solution**

** jk we can't ask youtube to make a separate permission just to see dislike counter can we lol

@coldcanuk

coldcanuk commented Dec 31, 2021 via email

@SyntaxBlitz

Currently OAuth gives you access to a lot of data of business interest other than dislike count, but at least it's manageable and there may be a solution

In the Discord we discussed a way to let creators fetch from the API (using code on their own computer, meaning they don't have to trust RYD with the scope) in a way that's (mostly) verifiable, so that creators can't spoof their counts. link here

@cyrildtm
Contributor

@coldcanuk yeah and this pet project is no fun any more-

@coldcanuk

coldcanuk commented Dec 31, 2021 via email

@cyrildtm
Contributor

cyrildtm commented Dec 31, 2021

IMHO Ryd is onto something. I think that something is worth it. What is that "it" and how do you go from pet project to billable product ? These are big questions .....

I can try to answer, it's basically a reinvention of PageRank, and the whole reason is Google said they're not gonna do it anymore two months ago.

Well the c&d is expected but not a real thing ... Your extension is telling ppl what it's doing

Not any more if Github gets a C&D.

You can justify billing creators assuming you are delivering more than a repackaged pie chart.

As a casual person who learns from the street, I am aware of PR companies offering consultant services on how to improve your channel.

@hrichiksite

Wow, many people (even including me), had a similar idea but as @Anarios said, #396 (comment)

Depending on creators to give the dislike count is not that viable, as they have other work to do, and some bad actors can just MITM themselves to fool the system. I think OAuth is the only way that is 1) fully true, 2) verifiable, 3) official, and 4) better than doing something client-side, as nobody much likes to install stuff. Also, this isn't even solving Linus's problem: as he said in the video, the extension can still take other stuff. Solving the issue Linus raised is not possible, so he has to trust what's running on the server. My try at this problem is https://github.com/OpenDislikeAPI/Code, an API (which I plan to give RYD unlimited access to, for the good) that would just get the details from YouTube and cache them so others don't have to :) It would still need creators to sign in with Google to connect their accounts.

@cyrildtm
Contributor

cyrildtm commented Jan 2, 2022

@hrichiksite I looked at your project statement, and it sounds like a promising compromise solution. I have a few questions:

  • (Verifiability) How do you prove that the code running on your side is what you will post openly?
  • (Weak link) How do you/we/the creators trust Redis or any other open-source data store for archiving purposes? Sure, dislike counts alone are barely valuable, and I'm assuming you are being honest. My concern is when the "trusted" repository gets stolen and our data is lost.
  • (Scale) You mentioned "...but I thought it might be an overkill to store just a JSON object containing just two to three items" If you mean each video will contain a couple of items like time, ID(s) and dislike counts, then it's fine. Otherwise I think this may be an oversight. Creators have hundreds even thousands of videos in their channel(s), so it's more than a couple of JSON items for each uploader.

I would really like to see Linus host a proxy API and archiving site with his own overkill Internet and storage facility (this may not be overkill in such a time of zombie apocalypse after all). But this would be a cannon shell dropping on his feet.

@hrichiksite

@cyrildtm Hi, I am answering your questions so you can understand better now:

  1. Well, there is currently no method to prove it. All I can do is let a creator like Linus audit my server, as I said in the Trust part of Integration with OpenDislikeAPI #401.
  2. Redis is a key-value store database software that I will host on my server. You can read more about it here https://en.wikipedia.org/wiki/Redis
  3. I said that about storing the dislike count in Redis, as a JSON object like
{
  "dislikes": Number,
  "somethingElseIWouldNeedToStore": DataType
}
  4. A cannon shell dropping on his feet would break them if he was wearing socks and sandals. I had that thought too: make Linus host the code and nobody will have any doubt. But it's really a hassle to host and maintain code. Also, his vault is made for his videos, not this stuff. Though he can just build a server anytime.

Also, you can comment on this issue #401 for keeping everything in place.

Have a good day :)

@aryavsaigal
Collaborator Author

aryavsaigal commented Jan 2, 2022

the extension can still take other stuff.

it's gonna be open source if we take this approach

@TorutheRedFox

as for MITM attacks with studio.youtube.com, you can just add SSL certificate verification

@cyrildtm
Contributor

cyrildtm commented Jan 10, 2022

as for MITM attacks with studio.youtube.com, you can just add SSL certificate verification

Elaborate?

Besides, I can always MITM and modify any verification process that runs locally. That's how paid software was cracked in the past.

@TorutheRedFox

yeah you're right
as for the cert verification, you can just verify that the certificate it's signed by is the one you expect it to be
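
A rough sketch of what such a check could look like, written as a standalone Python script rather than extension code. It pins the SHA-256 fingerprint of the server's own (leaf) certificate, which is a simplification of "the certificate it's signed by", and the expected fingerprint would have to come from somewhere trusted. Function names are hypothetical.

```python
import hashlib
import hmac
import socket
import ssl


def fingerprint(der_cert: bytes) -> str:
    """SHA-256 fingerprint (hex) of a DER-encoded certificate."""
    return hashlib.sha256(der_cert).hexdigest()


def fetch_leaf_cert(host: str, port: int = 443, timeout: float = 5.0) -> bytes:
    """Connect with normal TLS validation and return the server's certificate."""
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert(binary_form=True)


def is_pinned(der_cert: bytes, expected_hex: str) -> bool:
    """True if the certificate matches the fingerprint we expect."""
    return hmac.compare_digest(fingerprint(der_cert), expected_hex.lower())
```

As noted above, a check like this runs on the creator's own machine, so a dishonest creator can simply patch it out; it only helps against third parties.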

@KraXen72

KraXen72 commented Mar 7, 2023

Is there any progress on this? I'm aware it is a complex issue, but this was first requested 2 years ago, in 2021, and it would be really useful, because at least some transparent creators could make YouTube a better place by providing their real dislike counts.

@ranazee

ranazee commented Mar 7, 2023

solve this issue
