So far I've found form feeds in Bulbasaur's English Pokemon Red dex entry, and Stantler's English Pokemon Gold entry contains a control sequence I hadn't seen before, 0xC2AD, right where the text splits the word "reality." I don't know any of the other languages in the data, so I can't tell whether I should blindly replace these characters with spaces or just remove them. Is there a rule of thumb for how to parse these characters? Are there other character sequences I should be stripping out that I just haven't found yet, given that there are 802 Pokemon in two dozen games and a dozen languages?
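For context, here is a minimal Python sketch of the kind of cleanup in question. The bytes 0xC2 0xAD are the UTF-8 encoding of U+00AD (soft hyphen), which would explain why it shows up in the middle of a word. The rest is assumption: that a soft hyphen and any whitespace after it can simply be dropped, and that form feeds and newlines can become a single space. That second assumption may well be wrong for languages written without spaces. The function names and the sample strings are made up for illustration and are not the real dex text.

```python
import re
import unicodedata

SOFT_HYPHEN = "\u00ad"  # encoded in UTF-8 as the bytes 0xC2 0xAD


def clean_flavor_text(text):
    """Hypothetical cleanup: rejoin soft-hyphenated words, flatten whitespace."""
    # Assumption: a soft hyphen marks an optional break inside a word, so
    # drop it together with any whitespace that follows to rejoin the word.
    text = re.sub(SOFT_HYPHEN + r"\s*", "", text)
    # Assumption: form feeds, newlines, and other whitespace runs can become
    # one space. This may not suit languages written without spaces.
    text = re.sub(r"\s+", " ", text)
    return text.strip()


def find_control_chars(text):
    """Report control (Cc) and format (Cf) characters found in the text."""
    found = {}
    for ch in text:
        if unicodedata.category(ch) in ("Cc", "Cf"):
            # Control characters have no Unicode name, hence the default.
            found["U+%04X" % ord(ch)] = unicodedata.name(ch, "<unnamed>")
    return found


# Made-up samples that imitate the two cases described above.
print(clean_flavor_text("seed was\nplanted.\fThe plant"))
# prints: seed was planted. The plant
print(clean_flavor_text("distorts re\u00ad\nality"))
# prints: distorts reality
print(find_control_chars("a\fb\u00adc"))
# prints: {'U+000C': '<unnamed>', 'U+00AD': 'SOFT HYPHEN'}
```

Running something like `find_control_chars` over every entry would at least list which unusual characters exist in the data before deciding how to handle each one.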