Improve logic to redact question texts #13
To give enough context for the answer, the redaction algorithm does not redact context words. However, this sometimes leads to nothing at all being redacted. For example, for the question "an Earth year is about 365.26 years long", the correct answer "365.26" is shortened to "36526", and we then fail to redact that token in the question text because the two strings no longer match.
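A minimal sketch of this failure mode, assuming the answer is normalized by stripping punctuation before it is compared with the question tokens. The names `normalize`, `redact_naive`, and `redact_normalized` are hypothetical and are not taken from this repository; the second function shows one possible way to avoid the mismatch, not necessarily what the patch does.

```python
import re


def normalize(token: str) -> str:
    """Hypothetical normalizer: lowercase and drop non-alphanumeric characters."""
    return re.sub(r"[^0-9a-z]", "", token.lower())


def redact_naive(question: str, answer: str, mask: str = "___") -> str:
    """Buggy variant: normalized answer tokens are compared with raw question tokens."""
    targets = {normalize(t) for t in answer.split()}
    return " ".join(mask if tok in targets else tok for tok in question.split())


def redact_normalized(question: str, answer: str, mask: str = "___") -> str:
    """Normalize both sides before comparing, so "365.26" and "36526" match."""
    targets = {normalize(t) for t in answer.split()} - {""}
    return " ".join(
        mask if normalize(tok) in targets else tok for tok in question.split()
    )


question = "an Earth year is about 365.26 years long"
print(redact_naive(question, "365.26"))       # the number is left in place
print(redact_normalized(question, "365.26"))  # the number is masked
```

Comparing normalized tokens on both sides keeps the context words visible while still hiding the answer itself.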
This problem was reported, for example, in #9. After applying this patch, the situation described in that issue is fixed.
This patch also fixes another failure mode related to context words. We may have a question like "the atmosphere is made up of ?% CO2" where one of the answers is "52%". That's too easy! After this change, the correct answer in this example is redacted to "52", which makes the question harder to answer.
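The second case can be sketched as follows, under the assumption that a symbol such as "%" is dropped from the answer whenever the question text already shows it. `strip_context_symbols` and its `symbols` parameter are hypothetical names used only for illustration; the actual patch may handle this differently.

```python
def strip_context_symbols(answer: str, question: str, symbols: str = "%$°") -> str:
    """Drop leading/trailing symbols from the answer if the question already shows them."""
    # Only symbols that actually occur in the question count as shared context.
    shared = "".join(s for s in symbols if s in question)
    return answer.strip(shared) if shared else answer


print(strip_context_symbols("52%", "the atmosphere is made up of ?% CO2"))
print(strip_context_symbols("52%", "how much of the atmosphere is CO2"))
```

In the first call the "%" is removed because the question already contains it; in the second call the answer is left unchanged.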