Update about page #1348

Closed
Hans5958 opened this issue Apr 25, 2022 · 21 comments · Fixed by #1383
Labels
enhancement Improvements and new feature requests

Comments

@Hans5958
Member

Most specifically, the contributors. Just putting this here for the record.

@Hans5958 Hans5958 added the enhancement Improvements and new feature requests label Apr 25, 2022
@Hans5958 Hans5958 changed the title Update about page when the time comes Update about page Apr 28, 2022
@Hans5958
Member Author

I was thinking of putting real names for the contributors if they want that (like Stefano instead of TC-something). I'm also thinking of linking multiple accounts outside of Reddit.

@Codixer Codixer linked a pull request May 9, 2022 that will close this issue
@Hans5958
Member Author

Sadly, this won't be addressed in #1315.

@Hans5958 Hans5958 removed a link to a pull request May 10, 2022
@AnonymousRandomPerson
Contributor

Reposting my previously stated objections to a contributors field:

Credits are the kind of thing that needs to be all or nothing, and going for all credits is infeasible. The previous submitted_by field was completely messed up because of manual edit requests, dupe removals and merging entries, and direct GitHub submissions. No way anyone is going back through all of those and assigning credit, and thus they will be permanently messed up.

It was suggested that the submissions field be edited only through the crawler, but this is also a flawed solution. This solution does not address any of the concerns I put forth, and it is unfair to only credit users who submitted in a particular way after a certain date (i.e., when the remaster released).

@mxdanger
Member

mxdanger commented May 16, 2022

I think he is referring to the contributors to the development of the website itself, and not the users who submitted entries.

@AnonymousRandomPerson
Contributor

@mxdanger According to discussion on Discord, this is for the contributors field.

@Hans5958
Member Author

Hans5958 commented May 16, 2022

This is for all kinds of credits, meaning acknowledgements for code contributions, entry contributions, other kinds of acknowledgements, and so on.

For entry contributions in particular, the about page says:

> The 2022 Atlas would not have been possible without the help of our Reddit contributors. This section will be updated with all of the contributor's usernames.

I think this is the game plan for later, especially since the old Atlas has it. This is why I argue for keeping the contributors list/field. You can argue that we should move it to another text file, which is fine, but it should at least be kept somewhere safe.

Also, I stated on Discord that incomplete credits are fine. It's better than nothing, after all. I have a project with a credits page for all contributors, whether they filed a bug report or wrote a huge chunk of code. Anyone is welcome to be included on the credits page, but there will inevitably be some we haven't noticed, some we forgot, and some who just don't want their name on it.

Also, about the point that crediting only those who contributed in a particular way is unfair: you can extend that argument to "so, only the Place Atlas team is mentioned, huh?" or "so, only the project manager (Stefano) is mentioned, eh?" or "so, only this certain group who think they contributed enough to the Atlas, but not to the entries, can be on the about page, huh?" (I hope you understand what I mean; there's a language barrier). That would happen regardless, so if that's how you look at it, we should either do a census of all the contributors (which is hard) or remove all the credits (which would be disrespectful). Neither option is good, so "incomplete = unfair" shouldn't worry us too much.

If you ask me, it's still good to give credit where it is due, even if it would be difficult to have all the credits, just like how it's difficult to catalog all the art on the Atlas.

@AnonymousRandomPerson
Contributor

All of my comments up to this point were targeted towards entry contributions. The topic of code contributions is different, and I have no objections to listing code contributions however small.

The Atlas entries and listing contributors are so fundamentally different that they cannot be equated. The latter deals with people on a personal level and is inherently more touchy. The Atlas entries are the main focus of the project, and their nature means they will reach a satisfactory level of completion, while the credits are full of inaccuracies and will never get anywhere near completion. Combined with credits being personal, that level of inaccuracy in the credits means including them is unacceptable.

I also dislike the original Atlas having a contributor list, for the same reasons, and it's not a good model to follow. Case in point, I contributed to the original Atlas and my name is not in that list. Because user-submitted content leads to a massive list, no user will care about looking through that entire list other than to find their own name, and they are simply going to be disappointed if their own name is not there.

A feasible alternative for crediting entry contributions would be to keep the blanket term, "our Reddit contributors" (and Discord, and GitHub). We can list code contributions individually, but I see no benefit to listing individual contributors for entries.

@Hans5958
Member Author

I can see why, but this is where we have opposing viewpoints, so let's agree to disagree. To reiterate, I think all contributions are equally "useful", and incomplete credits are fine (as long as we are open to credit additions; the problem of motivation is a separate argument). I don't even think it being "personal" is a problem (it's not like other kinds of credit can't be personal).

I think we should ask others about this, or hold a vote.

@AnonymousRandomPerson
Contributor

AnonymousRandomPerson commented May 17, 2022

The problem of motivation to keep track of credit additions is a major issue here that cannot be ignored. As the main person maintaining the atlas entries, keeping track of credit as I merge or rewrite submissions is a major burden that I'm not keen on taking on. It also causes confusion for anyone else submitting via GitHub, and I do not want to increase the workload on contributors by sending their PRs back to fix contributors. If nobody wants to maintain this, there is no point implementing it or having a vote on it.

I already stated my points to Stefano, and he agrees with my position. https://discord.com/channels/960791635342524496/960814065901502546/974724325590573087

@Codixer
Member

Codixer commented May 17, 2022

Almost all posts on the subreddit have a unique ID. What if we take all of those that have the "Approved" tag and derive the contributors from them, regardless of whether their entry was later removed?

@Codixer
Member

Codixer commented May 17, 2022

These contributors would be placed on the about page, just like in the initial r/place atlas.

@AnonymousRandomPerson
Contributor

I previously attempted to crawl the entirety of the subreddit posts with praw and psaw, but the API isn't built for that and misses a large number of submissions. If someone wants to hack away at doing that comprehensively, that could work.

Stepping back a bit, my main reservation is having to maintain contributors on individual entries. If that is abandoned in favor of a simple list on the about page, I wouldn't be nearly as opposed to that. It'd still be a massive wall of text that probably nobody would read, but it'd be more maintainable.

@Hans5958
Member Author

Hans5958 commented May 17, 2022

To clarify, I was not talking about making contributors on each of the infoboxes. I was talking about the mentions on the about page, hence I wanted to discuss it here.

@AnonymousRandomPerson
Contributor

In that case, I don't have an issue with credits on the about page if you can figure out how to crawl the entirety of the subreddit per @Codixer's suggestion.

@Codixer
Member

Codixer commented May 17, 2022

> I previously attempted to crawl the entirety of the subreddit posts with praw and psaw, but the API isn't built for that and misses a large number of submissions. If someone wants to hack away at doing that comprehensively, that could work.
>
> Stepping back a bit, my main reservation is having to maintain contributors on individual entries. If that is abandoned in favor of a simple list on the about page, I wouldn't be nearly as opposed to that. It'd still be a massive wall of text that probably nobody would read, but it'd be more maintainable.

I mean, we have all the unique IDs from a certain point onward, so we don't have to get all posts. We just need to get the owner of each post through its ID.
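
A minimal sketch of the ID-to-owner idea, assuming the stored IDs are bare Reddit submission IDs, one per line. Reddit addresses submissions by "fullname", which is the ID prefixed with `t3_`, and lookup calls such as praw's `Reddit.info(fullnames=...)` take fullnames in batches (up to 100 per request, as far as is known here). The helper names `to_fullnames` and `batches` are hypothetical, and the network call itself is left out.

```python
from typing import Iterable, Iterator, List


def to_fullnames(post_ids: Iterable[str]) -> List[str]:
    """Turn bare submission IDs into Reddit fullnames, skipping blanks and duplicates."""
    seen = set()
    fullnames = []
    for raw in post_ids:
        post_id = raw.strip()
        # Accept IDs that already carry the "t3_" submission prefix.
        if post_id.startswith("t3_"):
            post_id = post_id[3:]
        if not post_id or post_id in seen:
            continue
        seen.add(post_id)
        fullnames.append("t3_" + post_id)
    return fullnames


def batches(items: List[str], size: int = 100) -> Iterator[List[str]]:
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


# Example with made-up IDs; real input would be the lines of the stored ID file.
example = to_fullnames(["abc", " def\n", "", "abc", "t3_ghi"])
example_batches = list(batches(example, size=2))
```

Each batch could then be handed to something like `reddit.info(fullnames=batch)`, reading `submission.author` from every result. Posts from deleted accounts have no author, so those would need to be skipped.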

@AnonymousRandomPerson
Contributor

If you mean read-ids.txt, that is missing hundreds of posts from before IDs were stored by the Reddit crawl, along with all manual edit requests.

@Codixer
Member

Codixer commented May 17, 2022

Then those just won't be pulled. I'm only talking about the ID field in atlas.json.

@AnonymousRandomPerson
Contributor

The atlas.json ID field is more incomplete than read-ids.txt due to dupe removals and merged entries, while also missing the edit requests and early submissions. Figuring out how to crawl the Reddit submissions directly would account for these edge cases.

@Hans5958
Member Author

I would just go and scrape https://www.reddit.com/r/placeAtlas2/?f=flair_name%3A%22Processed%20Entry%22 if possible, though I don't know whether that is possible given that praw won't work here. Maybe I need to try Node.js.

@AnonymousRandomPerson
Contributor

AnonymousRandomPerson commented May 18, 2022

praw has difficulties because the Reddit API only returns about 200 submissions from a query, regardless of whether you set the limit higher. These limits also apply when accessing Reddit web, so simply scraping the website with something like curl doesn't work either. I had more success with psaw (wrapper that combines praw and PushShift), though it wasn't perfect.

For context, I wanted to fetch every Reddit entry ever submitted (including removed entries) to be able to search through them now that the remaster is live. I was able to use psaw to get most of the entries, but it had some threading difficulties that missed some entries. I resorted to mopping up a couple of the missed entries with read-ids.txt, and after all that there were still a few missed entries. Usage is similar to praw if you'd like to play around with it; here's my branch.

If there's a better solution with a Node library then go for it. I haven't looked into libraries there myself.

@Hans5958
Member Author

Hans5958 commented Jul 3, 2022

Hmm, I didn't know psaw existed. I may have to try that instead of messing with praw in this environment, if I have time to do the scrape.

I'm fairly sure the Reddit API has pagination, like the after field that you can pass with a request. I'm not very familiar with the API, so this is just my guess.
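
Reddit listings do carry such a cursor: each response is a `Listing` object whose `data.after` holds the fullname to pass on the next request, and it is null when there are no more pages. Below is a minimal sketch of following that cursor. The HTTP call is abstracted behind a hypothetical `fetch_page` callable so the example runs offline, and `iter_listing`, `max_pages`, and the fake pages are illustrative assumptions. Following the cursor does not lift the limits described earlier in the thread: once the listing is exhausted, `after` simply comes back empty.

```python
from typing import Callable, Dict, Iterator, Optional


def iter_listing(fetch_page: Callable[[Optional[str]], Dict],
                 max_pages: int = 50) -> Iterator[Dict]:
    """Yield each post's data dict from a Reddit-style listing, following `after`."""
    after = None
    for _ in range(max_pages):
        listing = fetch_page(after)
        data = listing.get("data", {})
        for child in data.get("children", []):
            yield child["data"]
        # A missing or null cursor means the listing has no further pages.
        after = data.get("after")
        if not after:
            break


# Fake two-page listing standing in for real API responses.
PAGES = {
    None: {"data": {"children": [{"data": {"id": "a1", "author": "alice"}}],
                    "after": "t3_a1"}},
    "t3_a1": {"data": {"children": [{"data": {"id": "b2", "author": "bob"}}],
                       "after": None}},
}

posts = list(iter_listing(lambda after: PAGES[after]))
authors = [post["author"] for post in posts]
```

In real use, `fetch_page` would issue the request with `after` as a query parameter and return the parsed JSON. The `max_pages` guard is only there so a misbehaving cursor cannot loop forever.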

A small update: I have tried using Pushshift on read-ids.txt (#1383) and using the submitted_by field (#1385). For now this should be enough, but there is room for improvement, namely the scrape.
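
A minimal sketch of how two such sources might be combined into a single about-page list, assuming each yields plain Reddit usernames. The function name, the placeholder set, and the sample names are hypothetical and are not taken from #1383 or #1385.

```python
from typing import Dict, Iterable, List

# Values treated as "no usable author" (assumed placeholders for deleted or removed accounts).
PLACEHOLDERS = {"", "[deleted]", "[removed]"}


def merge_contributors(*sources: Iterable[str]) -> List[str]:
    """Combine username lists, dropping placeholders and case-insensitive duplicates."""
    by_key: Dict[str, str] = {}
    for source in sources:
        for name in source:
            cleaned = name.strip()
            # Tolerate a leading "/u/" or "u/" prefix.
            if cleaned.startswith("/u/"):
                cleaned = cleaned[3:]
            elif cleaned.startswith("u/"):
                cleaned = cleaned[2:]
            key = cleaned.lower()
            if key in PLACEHOLDERS:
                continue
            # Keep the first spelling seen for each username.
            by_key.setdefault(key, cleaned)
    return sorted(by_key.values(), key=str.lower)


# Made-up sample data standing in for the Pushshift results and submitted_by values.
from_pushshift = ["Alice", "bob", "[deleted]"]
from_submitted_by = ["alice", "u/Carol", " "]
contributors = merge_contributors(from_pushshift, from_submitted_by)
```

Usernames are compared case-insensitively here because Reddit treats them that way. A list built like this stays incomplete for the early submissions and manual edit requests mentioned earlier in the thread.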


4 participants