Identify "Corpus" by IETF language tag rather than just language #504

duramato · 2016-12-13T23:44:22Z

My suggestion is to identify the "Corpus" by IETF language tag rather than just by language. What are the benefits? It would allow different dialects to be kept independent, as I might want to "teach" the bot just one of them.

Why am I bringing this idea up?

For example, the corpus for the Portuguese language, from a look through it, seems to have mostly Brazilian Portuguese (pt-BR) strings with some Portuguese (pt or pt-PT) strings here and there. Using that corpus makes the bot a bit of a bilingual freak, if I'm allowed to call it that :P

The same goes for the English corpus, which I'd say (not totally sure, but judging from some expressions) is English (en or en-GB) with United States English (en-US) mixed in.

Chinese is another language with a lot of dialects (but this one is an unknown to me, as I have zero knowledge of the language).
The story goes on...
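
To make the idea concrete, here is a minimal sketch of what selecting a corpus by tag could look like. Everything in it is an assumption for illustration: the `CORPORA` registry, the sample strings, and the `normalize_tag` and `select_corpus` helpers are hypothetical and are not part of ChatterBot or chatterbot-corpus. It also only handles simple `language-REGION` tags, not script subtags such as `zh-Hant`.

```python
# Hypothetical registry keyed by IETF language tag.
# This is NOT ChatterBot's actual corpus layout; the strings are placeholders.
CORPORA = {
    "pt-BR": ["Oi, tudo bem?"],
    "pt-PT": ["Olá, está tudo bem?"],
    "en-US": ["What's your favorite color?"],
    "en-GB": ["What's your favourite colour?"],
}


def normalize_tag(tag):
    """Normalize a simple language[-region] tag, e.g. 'PT_br' -> 'pt-BR'."""
    parts = tag.replace("_", "-").split("-")
    language = parts[0].lower()
    if len(parts) > 1:
        return f"{language}-{parts[1].upper()}"
    return language


def select_corpus(tag):
    """Return the statements for an exact tag such as 'pt-BR'.

    If there is no exact match (for example a bare language tag like 'pt'),
    merge every registered dialect whose tag starts with 'tag-'.
    """
    tag = normalize_tag(tag)
    if tag in CORPORA:
        return list(CORPORA[tag])
    prefix = tag + "-"
    merged = []
    for key, statements in CORPORA.items():
        if key.startswith(prefix):
            merged.extend(statements)
    return merged
```

With something like this, `select_corpus("pt-BR")` would train on the Brazilian strings only, while `select_corpus("pt")` would keep today's mixed behaviour for anyone who wants it.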

gunthercox · 2016-12-13T23:49:11Z

This is a good suggestion. Making changes to use IETF language tags sounds like it would really benefit ChatterBot.

gunthercox added the enhancement label Dec 14, 2016

gunthercox changed the title ~~Identify "Corpus" by IETF language tag rather than just language.~~ Identify "Corpus" by IETF language tag rather than just language Jan 21, 2017

gunthercox mentioned this issue Jun 25, 2017

Identify "Corpus" by IETF language tag rather than just language gunthercox/chatterbot-corpus#24

Closed

gunthercox closed this as completed Jun 25, 2017
