Populate genre, subgenre & theme meta for all games #366

Closed
nikuda opened this issue Oct 10, 2016 · 12 comments


@nikuda
Collaborator

nikuda commented Oct 10, 2016

Volunteers needed.

It mostly involves going to www.giantbomb.com/games, searching for the game and then noting down the genre and theme entries from the info box on the right side.

[Screenshot: 2016-10-10, 11:12:08 AM]

If GiantBomb doesn't have a record of the game, use the game's Wikipedia description to pick a fitting genre from the schema.yaml file, which lists the genres, subgenres and themes. The list of genres should stay more or less the same; the list of subgenres is the one that should be expanded if need be.

The meta format looks like this:

- name: Game name
  meta:
    genre: [Role-Playing]
    subgenre: [Roguelike]
    theme: [Sci-Fi, Horror]
  clones: ...
  ...
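
A minimal sketch of how an entry in this format could be checked before opening a PR. It assumes the entry has already been parsed into a Python dict (for example by a YAML loader); the function name `check_meta` and the small `ALLOWED` sets are hypothetical stand-ins, not the actual contents of schema.yaml.

```python
# Hypothetical allowed values; the real lists live in schema.yaml.
ALLOWED = {
    "genre": {"Role-Playing", "Sports", "Strategy"},
    "subgenre": {"Roguelike", "Golf"},
    "theme": {"Sci-Fi", "Horror", "Fantasy"},
}


def check_meta(entry):
    """Return a list of problems found in one game entry's meta block."""
    problems = []
    meta = entry.get("meta", {})
    for key, values in meta.items():
        if key not in ALLOWED:
            problems.append(f"unknown meta key: {key}")
            continue
        # Every meta item is expected to be a list, even with one value.
        if not isinstance(values, list):
            problems.append(f"{key} must be a list")
            continue
        for value in values:
            if value not in ALLOWED[key]:
                problems.append(f"{key}: unknown value {value!r}")
    return problems


# The example entry from above, as it would look once parsed.
game = {
    "name": "Game name",
    "meta": {
        "genre": ["Role-Playing"],
        "subgenre": ["Roguelike"],
        "theme": ["Sci-Fi", "Horror"],
    },
}
print(check_meta(game))
```

An empty list means no problems were found; a check like this could catch typos that are otherwise hard to spot in a large file.
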
@piranha
Collaborator

piranha commented Oct 10, 2016

Maybe with a script? It could make errors which are hard to notice, though...

@tukkek
Contributor

tukkek commented Oct 11, 2016

I'm up for doing the genre gathering as I mentioned when opening #353. I guess I would just change the games.yaml with the data?

Also, your list of genres and sub-genres sounds pretty arbitrary. Real-time strategy and strategy make no sense as two separate top-level categories. There isn't a "turn based strategy" or "third person shooter" anywhere. "Beat ''Em Up" and "Shoot ''Em Up" aren't really input friendly at all. I'd like to know beforehand if you're OK with me doing a complete overhaul on these, based around what I come up with during the gathering, as opposed to the predefined, arbitrary list we have now.

@tukkek
Contributor

tukkek commented Oct 11, 2016

Also it's not clear to me if a game can have more than 1 genre and/or more than 1 sub-genre.

EDIT: I see now the 3 meta items seem to be arrays. Would I be correct in assuming all of these could receive multiple values?

@tukkek
Contributor

tukkek commented Oct 11, 2016

Finally, just to make it clear: I'm volunteering to gather genre (and subgenre when needed) for every single one of the games currently on the list. It's a lot of work, so let me know if anybody is already working on it, as far as you know, because it will certainly take a lot of browsing and copy-pasting YAML lines :P Really wouldn't want to see the time put into it get wasted.

@cxong
Member

cxong commented Oct 11, 2016

I think it's better to treat all genres equally, i.e. no subgenre hierarchy. Taxonomy is a hard problem and evolves all the time for games, so rather than try to solve that, we can just tag with whatever genres are most relevant.

For example, Mario used to be described under the very broad genre of "action game" but today we'd probably just say it's a platformer. Also, technically roguelikes are a subgenre of RPG but I think most people don't think of roguelikes when they say "RPG".

@nikuda
Collaborator Author

nikuda commented Oct 11, 2016

@cxong I agree, though subgenres as I've set them up here are mostly used as a visibility thing.

For example, it's really useful with the Sports genre. We don't want all the Sports subgenres in the genre cloud, but we still want to keep that information:

[
    "Baseball",
    "Basketball",
    "Billiards",
    "Bowling",
    "Boxing",
    "Cricket",
    "Fishing",
    "Fitness",
    "Football",
    "Golf",
    "Hockey",
    "Skateboarding",
    "Snowboarding/Skiing",
    "Soccer",
    "Surfing",
    "Tennis",
    "Track & Field",
    "Wrestling"
]
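
A rough sketch of that visibility idea: the cloud is built from `genre` only, while `subgenre` stays on the entry and can still be used for filtering. The sample entries and the names `genre_cloud` and `with_subgenre` are made up for illustration and are not taken from the site's code.

```python
from collections import Counter

# Made-up entries in the meta format discussed above.
games = [
    {"name": "A", "meta": {"genre": ["Sports"], "subgenre": ["Golf"]}},
    {"name": "B", "meta": {"genre": ["Sports"], "subgenre": ["Tennis"]}},
    {"name": "C", "meta": {"genre": ["Role-Playing"], "subgenre": ["Roguelike"]}},
]


def genre_cloud(entries):
    """Count games per top-level genre; subgenres are deliberately ignored."""
    cloud = Counter()
    for entry in entries:
        cloud.update(entry.get("meta", {}).get("genre", []))
    return cloud


def with_subgenre(entries, subgenre):
    """Names of games carrying a given subgenre, which the cloud does not show."""
    return [
        entry["name"]
        for entry in entries
        if subgenre in entry.get("meta", {}).get("subgenre", [])
    ]


print(genre_cloud(games))
print(with_subgenre(games, "Golf"))
```
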

@nikuda
Collaborator Author

nikuda commented Oct 11, 2016

> It's a lot of work, so let me know if anybody is already working on it, as far as you know, because it will certainly take a lot of browsing and copy-pasting YAML lines

Just open up a PR when you've done a certain amount. Don't wait months to do the whole thing.

> Would I be correct in assuming all of these could receive multiple values?

Yes

@piranha
Collaborator

piranha commented Oct 11, 2016

@tukkek maybe it makes sense to wait a bit for me to split up games.yaml into many files :-)

@tukkek
Contributor

tukkek commented Oct 14, 2016

@piranha good thing I was stalling then! I have been trying to set aside a good chunk of time to do it in one sitting but it's been hard this week. Once I manage to do so, though, it should be done in a single day (unless it takes me much more time than I am expecting, which is a good bunch of hours). Can you tell me which bug I should be following to know when it's OK to go?

About genres/subgenres, I like the idea of having both, but it's not only a bit more work to get right, it will also pollute the tag cloud. Having a huge "action" genre with hundreds of games that encompasses everything from platformers to beat-em-ups, shoot-em-ups, FPS, TPS, roguelites, action-rpgs, etc. is not really valuable to anyone. I think having genres as single "tags" is better, which would also support "parent categories" in the future. Anyways, you guys haven't mentioned if you're all OK with me redefining the categories as I go along. Let me know, please.

@piranha
Collaborator

piranha commented Oct 15, 2016

It should be ok right now! Also, there is no need to do it in one sitting - I'd imagine in this case it'll be a somewhat tedious task. :)

As for redefinition - I'm ok with that, since anything is good if it's done with good intentions and thoughtfulness. :)

@tukkek
Contributor

tukkek commented Oct 16, 2016

Thanks! I want to find a good chunk of time exactly because it's tedious - get it over with in one go, you know?

I just wanted to make sure about the redefinition because, with such a big amount of data, I'd hate for my effort to go to waste or something like that. As long as no one else here has something against it, that's what I'll do then, to the best of my ability. As I said before, I think it makes more sense to define a taxonomy based on real data from the games anyway, instead of trying to come up with something beforehand and making the games fit this arbitrary categorization.

As @cxong suggests, I'm thinking about doing it with categories only (no subcategory), but I'll check here again before starting the work (probably next week) to see if any further discussion on the subject comes up.

@tukkek
Contributor

tukkek commented Jan 4, 2017

Finally got around to doing this. Now every game has one or more genres associated with it. I didn't have to actually download any clone/remake because most of that information could be found by a Google search, a lookup on the clone pages or watching short YouTube videos of the game in question. I'm pretty confident in the current state of things, but a few errors could have slipped through since I'm not familiar with most of these games and just copy-pasted what I found from other sources.

I do have to say that I didn't put as much effort into completeness as I could have: for example, I didn't differentiate strategy and tactical games (tagging them either TBS or RTS). I didn't add "action" to all shooter/platform/arcade games either, and I didn't use "flight" as a genre (looking back I think I should have; instead I marked many flight games as simulation). In the same way, I used "simulation" instead of "sandbox" in a few titles. Anyway, I think the current result is accurate and these cases can be improved on a game-by-game basis moving ahead, where and if necessary.

The reason for deliberately opting for vagueness over factual precision is that I didn't think the list would benefit a lot from having 1 MOBA game and 1 educational game, for example. The same goes for dividing strategy and tactics into different subgroups, etc. If anyone thinks it would be better to have more tags, even if the content becomes less organized, then be my guest and change whatever you like - but I think that having tags with 10 or fewer games isn't really helpful at the moment.

In that sense you can see I also made some arbitrary decisions regarding some genres (like roguelike, horror and rhythm) because I thought they stood out enough on their own even without many examples in the list. For example: merging survival horror games, even just 2 of them, into the "action" genre would be a disservice to the list.

You'll also notice that I didn't see the need to use the subgenre tag either. It seemed unnecessary to me while doing this work and also hard to input if I had to distinguish "strategy" from TBS/RTS on a case-by-case basis. It's a lot of extra work without any real benefit: strategy fans can check TBS first and then RTS. I also didn't use the "theme" tags, but I left what was already there unchanged.

My main goal was to have each game tagged with only one genre, but that clearly isn't possible with so many genre-defying games out there. Here's a rundown of the categories so far (note that the % is relative to the total number of games, not of genre tags, since there can be multiple tags per game):

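| Genre | Count | Percent |
| --- | --- | --- |
| arcade | 100 | 21% |
| RTS | 51 | 11% |
| puzzle | 51 | 11% |
| simulation | 49 | 10% |
| platform | 48 | 10% |
| RPG | 43 | 9% |
| action | 43 | 9% |
| FPS | 38 | 8% |
| adventure | 35 | 7% |
| TBS | 32 | 7% |
| shmup | 25 | 5% |
| racing | 21 | 4% |
| TPS | 8 | 2% |
| MMORPG | 6 | 1% |
| sports | 6 | 1% |
| fighting | 5 | 1% |
| roguelike | 5 | 1% |
| rhythm | 4 | 1% |
| horror | 2 | 0.42% |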
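
For reference, a small sketch of how a rundown like this can be computed so that the percentage is relative to the number of games rather than the number of tags. The sample data and the name `genre_rundown` are invented for illustration; this is not necessarily how the numbers above were produced.

```python
from collections import Counter

# Invented sample: four games carrying five genre tags in total.
games = [
    {"name": "A", "meta": {"genre": ["arcade"]}},
    {"name": "B", "meta": {"genre": ["arcade", "puzzle"]}},
    {"name": "C", "meta": {"genre": ["RTS"]}},
    {"name": "D", "meta": {"genre": ["arcade"]}},
]


def genre_rundown(entries):
    """Return (genre, count, percent of games) rows, most common first."""
    counts = Counter()
    for entry in entries:
        # set() guards against a genre listed twice on the same game.
        counts.update(set(entry.get("meta", {}).get("genre", [])))
    total_games = len(entries)
    return [
        (genre, n, 100 * n / total_games)
        for genre, n in counts.most_common()
    ]


for genre, n, pct in genre_rundown(games):
    print(f"{genre} {n} {pct:.0f}%")
```

Because a game can carry several tags, the percentages computed this way can add up to more than 100%.
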

I'll be creating a pull request soon after writing this.


5 participants