Player names in Baseball Reference stats contain mis-encoded non-ASCII characters #393
Comments
Hi, these are "tildes" (accent marks) in Spanish words. I will try to replicate your example and apply my workaround.
Hi @AndrewsOR, this has now been merged into master. Feel free to close the issue.
This issue can be closed since the solution was merged. @schorrm
Thank you @BrayanMnz!
The FanGraphs functions pitching_stats() and batting_stats() appear to convert names from that site such as Ronald Acuña Jr. and José Abreu to Ronald Acuna Jr., Jose Abreu, etc., which are recognizable if not entirely correct.

On the other hand, the Baseball Reference functions batting_stats_bref() and pitching_stats_bref() return what seems like mis-converted HTML encodings of those names, resulting in lower readability, although the names on the site itself appear correct.

For example:
prints:
My current workaround is to use playerid_reverse_lookup to bridge to FanGraphs names and use those instead. (I like to use the Baseball Reference batting stats because of how it labels players who played for multiple teams/leagues in a given season, providing both team names instead of "---".)

I love pybaseball ... thank you!
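
For anyone hitting the same problem, here is a minimal sketch of cleaning such names after the fact. It assumes the damage is UTF-8 text that was decoded as Latin-1, which is a common cause of this symptom; the exact garbled output is not shown in this issue, so the sample string is hypothetical. The helpers repair_mojibake and strip_accents are illustrative names, not part of pybaseball, and this is not necessarily how the merged fix works.

```python
import unicodedata


def repair_mojibake(name: str) -> str:
    """Undo UTF-8 text that was mistakenly decoded as Latin-1 (assumed cause)."""
    try:
        return name.encode("latin-1").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # Not this kind of damage, or already correct: leave the name unchanged.
        return name


def strip_accents(name: str) -> str:
    """Drop combining marks so the name is plain ASCII-like, e.g. for matching."""
    decomposed = unicodedata.normalize("NFKD", name)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


garbled = "JosÃ© Abreu"  # hypothetical sample of a mis-decoded name
fixed = repair_mojibake(garbled)
print(fixed)                 # name with the accent restored
print(strip_accents(fixed))  # accent-free form, similar to the FanGraphs output
```

On a pandas DataFrame this could be applied with something like df["Name"].map(repair_mojibake), assuming the name column is called "Name". Names that are already correct pass through unchanged, so the function is safe to apply to a whole column.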