Infobox inconsistent attribute names. #127
Comments
I can fix this pretty quick... |
The houses are not the only issue. There are several inconsistencies in the infobox of the characters. For example "Titles" != "Aliases" or "Book(s)" != "books" != "Books". I am actually reimplementing the whole scraper, because the regex stuff from theo is not very maintainable and is slow... Sorry. |
Again @Adiolis do what you can and feel free to delegate to someone. Family is more important, always. |
Yeah. I know. |
But guys, there are more problems than just the houses. Feel free to fix those. |
How about using toLowerCase() (http://www.w3schools.com/jsref/jsref_tolowercase.asp) on all the field names to normalize these things somewhat? |
Yep. Extra fixes for "Titles" != "Aliases" and so on are still necessary. |
Correct. You should have an array of synonyms for a given field. |
Or you can machine learn what goes where :D I think the static approach is easier :D |
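A minimal sketch of the static approach discussed above: lowercase each infobox label, then map it through a synonym table to one canonical field name. Everything here is hypothetical and not taken from the project's scraper: the names `FIELD_SYNONYMS`, `normalizeLabel`, `canonicalField` and `normalizeInfobox` are made up for illustration, and the table only contains the examples mentioned in this thread. Treating "Titles" and "Aliases" as the same field is an assumption that the maintainers would need to confirm.

```javascript
// Hypothetical synonym table: canonical field name -> infobox labels that map to it.
// Entries come from the examples in this thread; a real table would need more.
const FIELD_SYNONYMS = {
  house: ["house", "royal house"],
  titles: ["titles", "aliases"], // assumption: these two are meant to be merged
  books: ["books", "book(s)"],
};

// Build a reverse lookup once: normalized label -> canonical field name.
const LABEL_TO_FIELD = new Map();
for (const [field, labels] of Object.entries(FIELD_SYNONYMS)) {
  for (const label of labels) {
    LABEL_TO_FIELD.set(label, field);
  }
}

// Trim, lowercase and collapse whitespace so "Books", "books " and "BOOKS" compare equal.
function normalizeLabel(label) {
  return label.trim().toLowerCase().replace(/\s+/g, " ");
}

// Return the canonical field name, or the normalized label itself if it is unknown.
function canonicalField(label) {
  const key = normalizeLabel(label);
  return LABEL_TO_FIELD.has(key) ? LABEL_TO_FIELD.get(key) : key;
}

// Rewrite the keys of a scraped infobox object to canonical field names.
// If two labels map to the same field, the later one overwrites the earlier one.
function normalizeInfobox(infobox) {
  const result = {};
  for (const [label, value] of Object.entries(infobox)) {
    result[canonicalField(label)] = value;
  }
  return result;
}
```

With this table, `canonicalField("Royal House")` returns `"house"` and `canonicalField("Book(s)")` returns `"books"`, so an infobox that uses "royal house" as its label would still yield a house affiliation. Unknown labels pass through lowercased, which keeps the scraper from silently dropping fields that are not in the table yet.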
Any volunteers? 😆 |
check out
http://awoiaf.westeros.org/index.php/Stannis_Baratheon
The scraper did not pick up a House affiliation for Stannis because the label in the infobox is "royal house" and not "house". The scraper needs to be reconfigured to handle this.