Translated Tags should be smarter (split by delimiters) #2915

Closed
SD-DAken opened this issue Mar 3, 2017 · 3 comments

Comments

@SD-DAken

SD-DAken commented Mar 3, 2017

If you try to upload e.g. http://www.pixiv.net/member_illust.php?mode=medium&illust_id=48562628 the only suggested translated tags are:

深海棲艦 => shinkaisei-kan
艦これ => kantai_collection
艦隊これくしょん => kantai_collection

It would be nice if it also suggested:

伊19/陸奥/多摩/島風/イ級/エラー娘/空母棲姫/川内 => i-19_(kantai_collection) mutsu_(kantai_collection) tama_(kantai_collection) shimakaze_(kantai_collection) i-class_destroyer error_musume aircraft_carrier_hime sendai_(kantai_collection)

In other words: check whether the tag as a whole has a translation and, if so, use it; if not, split the tag on its delimiters (/ seems to be common, but there might be others) and check whether the parts have matches.

@BrokenEagle
Collaborator

BrokenEagle commented Mar 3, 2017

What about Ranma 1/2 (らんま1/2), Fate/Stay Night (Fate/staynight), every other Fate/Whatever incarnation, .Hack// (.hack//), etc.? In essence, what about tags that themselves contain "delimiters"?

@SD-DAken
Author

SD-DAken commented Mar 4, 2017

That's why I said to check if the tag as a whole matches first and only if that fails split by delimiters.

The current behavior is:
艦これ => kantai_collection
らんま1/2 => ranma_1/2
伊19/陸奥 => no matches
らんま1/2/早乙女乱馬 => no matches

The adjusted behavior would be:
艦これ => kantai_collection
らんま1/2 => ranma_1/2
伊19/陸奥 => i-19_(kantai_collection) mutsu_(kantai_collection)
らんま1/2/早乙女乱馬 => saotome_ranma (doesn't match ranma_1/2 because it is split between 1 and 2)

This would be an improvement over the current results and everything that currently works will still work; only some edge cases that currently don't work will continue to fail.

Or in pseudo-code something along the lines of

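for each (tag in tags) {
   translated = get_translated_tag(tag)
   if (translated != "") {
      // the whole tag has a translation: use it as-is
      translated_tags += translated
   } else {
      // no match for the whole tag: split on the delimiter and try each part
      parts = tag.split("/")
      for each (part in parts) {
         translated_part = get_translated_tag(part)
         if (translated_part != "") {
            translated_tags += translated_part
         }
      }
   }
}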
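
For illustration only, the pseudo-code above could be sketched in Python roughly as follows. The `TRANSLATIONS` dictionary and the `translate_tags` name are hypothetical stand-ins, not the site's actual translation lookup, and "/" is assumed to be the only delimiter.

```python
# Hypothetical lookup table standing in for the site's real translation data.
TRANSLATIONS = {
    "艦これ": "kantai_collection",
    "らんま1/2": "ranma_1/2",
    "伊19": "i-19_(kantai_collection)",
    "陸奥": "mutsu_(kantai_collection)",
    "早乙女乱馬": "saotome_ranma",
}


def translate_tags(tags, delimiter="/"):
    """Translate each tag as a whole; fall back to its delimiter-separated parts."""
    translated = []
    for tag in tags:
        whole = TRANSLATIONS.get(tag)
        if whole is not None:
            # The whole tag matches, so it is never split.
            translated.append(whole)
            continue
        # No whole-tag match: try every part on its own, skipping parts without a match.
        for part in tag.split(delimiter):
            match = TRANSLATIONS.get(part)
            if match is not None:
                translated.append(match)
    return translated
```

With this table, `translate_tags(["らんま1/2"])` keeps the whole-tag match, while `translate_tags(["らんま1/2/早乙女乱馬"])` only finds `saotome_ranma`, which is the edge case described above.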

@r888888888
Collaborator

The problem is there's no way to intelligently parse out which slashes are part of a tag and which are part of a collection. So cases like fate/stay night and ranma 1/2 will not work unless we try out every permutation. I think this feature can be done for the general case however.
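
As a rough sketch of what handling the slash-containing cases might involve: instead of literally trying every grouping, one cheaper option is a greedy longest-match over adjacent pieces, rejoining them with the delimiter before each lookup. This is a swapped-in heuristic, not something proposed in the thread, and the function name and the `translations` dictionary are hypothetical.

```python
def translate_with_regrouping(tag, translations, delimiter="/"):
    """Greedy longest-match: rejoin adjacent pieces and look up the longest run first."""
    pieces = tag.split(delimiter)
    result = []
    i = 0
    while i < len(pieces):
        # Try the longest run of pieces starting at i, then progressively shorter ones.
        for j in range(len(pieces), i, -1):
            candidate = delimiter.join(pieces[i:j])
            if candidate in translations:
                result.append(translations[candidate])
                i = j  # continue after the matched run
                break
        else:
            # Nothing starting at piece i matched; skip this piece.
            i += 1
    return result
```

Under these assumptions, a table containing `らんま1/2` and `早乙女乱馬` would translate `らんま1/2/早乙女乱馬` to both tags. Being greedy, it can still pick a wrong grouping when a longer run happens to match an unrelated tag.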
