Parsing <TEXT> fails #62
Comments
Hello, please make sure to only parse documents that follow the XBRL or iXBRL specification.
Ah, this is just a different error indicating a non-iXBRL file. I could add a pre-check in the lxml implementation that would filter this out.
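A minimal sketch of what such a pre-check could look like, assuming it is enough to test whether the document declares one of the Inline XBRL namespaces before handing it to the parser. The helper name `looks_like_ixbrl` is hypothetical and not part of the library; the actual pre-check in the lxml implementation may work differently.

```python
# Hypothetical helper, not part of the library's API.
# Namespace URIs used by Inline XBRL documents.
IXBRL_NAMESPACES = (
    "http://www.xbrl.org/2013/inlineXBRL",  # Inline XBRL 1.1
    "http://www.xbrl.org/2008/inlineXBRL",  # Inline XBRL 1.0
)


def looks_like_ixbrl(text: str) -> bool:
    """Return True if the document text declares an Inline XBRL namespace.

    This is only a cheap plausibility check: it does not validate the
    document against the iXBRL specification.
    """
    return any(namespace in text for namespace in IXBRL_NAMESPACES)
```

A caller could skip or reject documents for which this returns `False` instead of letting the XML parser fail with a low-level error.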
I am not entirely sure what you mean by the following statement:
For an iXBRL instance document to be valid, it must comply with the iXBRL specification. This includes many validation rules.
But yes, you are right that it would be nice if the parser could check whether a document contains valid XBRL tagging.
Parsing of
https://www.sec.gov/Archives/edgar/data/0001634379/000156459020053234/mtcr-10q_20200930.htm
causes an exception in XbrlParser(cache).parse_instance(url)
The error message is: not well-formed (invalid token): line 7, column 2. Other filings from the same company are most likely affected as well.
SEC's response:
Please look at the contents of the link. You will see that like every other one of the millions of HTML documents on the EDGAR site, the first six lines are document metadata in SGML, that a browser ignores. They look like this:
trace
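One possible workaround on the caller's side, sketched under the assumption that the real markup begins at the first XML declaration, DOCTYPE, or `<html>` tag and that everything before it is the leading metadata the SEC describes. The function name `strip_leading_metadata` is hypothetical, and this is not how the library itself handles such files.

```python
import re

# Matches the first XML declaration, DOCTYPE, or opening <html> tag.
# Assumption: the actual document starts at one of these.
_MARKUP_START = re.compile(r"<\?xml|<!DOCTYPE|<html\b", re.IGNORECASE)


def strip_leading_metadata(text: str) -> str:
    """Drop everything before the first recognizable markup start.

    If no markup start is found, the text is returned unchanged so the
    parser can report its own error.
    """
    match = _MARKUP_START.search(text)
    return text[match.start():] if match else text
```

The cleaned text would still need to be a valid iXBRL document for parsing to succeed; this only removes the leading lines that the XML parser rejects. Any trailing non-XML lines at the end of the file are not handled here.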