The issue is that some filenames sit inside URLs (because of WARC scanning). Where sf thinks a name is a URL, it strips the characters following a "?", because in a URL that's the query string. E.g. it is trying to get the name out of a string like "http://www.mysite.com/file.pdf?user=richard".
But in your case, where the "?" is legitimately part of a regular file name, this breaks extension matching.
I'll have a think about how to re-jig this bit of the code to fix it.
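
To make the behaviour concrete, here's a rough Python sketch. It is not sf's actual code, and the function names and the "what?.txt" file name are made up for illustration. The first function cuts at the "?" unconditionally, which is the effect you're seeing; the second only drops the query string when the name has an http(s) scheme and a host, which is one possible direction for a fix rather than a decision.

```python
import posixpath
from urllib.parse import urlsplit


def ext_strip_always(name: str) -> str:
    """Hypothetical: treat every name as a URL and cut at the first '?'."""
    base = name.split("?", 1)[0]
    return posixpath.splitext(base)[1]


def ext_scheme_aware(name: str) -> str:
    """Hypothetical: only drop the query string for names that look like http(s) URLs."""
    parts = urlsplit(name)
    if parts.scheme in ("http", "https") and parts.netloc:
        # A real URL: keep just the path, discarding '?query' and '#fragment'.
        name = parts.path
    return posixpath.splitext(name)[1]


url = "http://www.mysite.com/file.pdf?user=richard"
plain = "what?.txt"  # made-up regular file name containing a '?'

print(ext_strip_always(url), ext_scheme_aware(url))
print(repr(ext_strip_always(plain)), repr(ext_scheme_aware(plain)))
```

With the unconditional version, the plain file name loses its extension entirely, so nothing is left to match on. The scheme-aware version keeps ".txt" for the plain name and still gets ".pdf" from the URL.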