v5.31.1
Fixed
- Schedule 13D/13G silently dropped CUSIPs with the new `<issuerCusips>` wrapper: SEC began wrapping `<issuerCusipNumber>` inside an `<issuerCusips>` container element on some Schedule 13D/13G filings (e.g. CIK 1906837 13D, CIK 1425851 13G). The parser's BS4 `recursive=False` lookup at the top level only matched the flat layout, so `subject_company.cusip` came back as `''` whenever the wrapper was present. Parsing now falls back to a recursive lookup when the flat probe misses, handling both wire formats. (#802, PR #803 by @HristoRaykov)
- Schedule 13D/13G event-date attribute name mismatch: `Schedule13D` exposed the triggering-event date as `date_of_event` while `Schedule13G` exposed it as `event_date`, breaking duck-typing across a mixed list of 13D/13G filings and forcing callers to use `getattr`/`hasattr`. Both classes now accept either name; the underlying attribute is unchanged, so existing code keeps working. (#804, PR #805 by @0ywfe)
- Spurious `DocumentTooLargeError` from `StreamingParser` on legitimate documents: The streaming HTML parser accumulated `len(etree.tostring(elem))` on every lxml `iterparse` `end` event. Because `tostring` serializes the full subtree and `end` fires for every closing tag, nested elements were counted multiple times, so large nested HTML could trip `max_document_size` even though the source document was under the limit. The per-event accumulator is also redundant: `HTMLParser._parse` already validates `len(html.encode("utf-8"))` against `max_document_size` before invoking streaming mode. The accumulator and its state are removed; size is now checked once at the top of `StreamingParser.parse()` and the same encoded bytes are reused for `iterparse`. (#806 by @kevinchiu)
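The CUSIP fix (#802) comes down to "probe direct children first, then fall back to a descendant search". The sketch below illustrates that pattern with the standard library's `xml.etree.ElementTree` rather than BeautifulSoup, so it is not the library's actual parser code. The `<header>` root element, the sample CUSIP values, and the `extract_cusip` helper are hypothetical, and real filings may carry XML namespaces that this sketch ignores.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified payloads for the two wire formats.
FLAT = "<header><issuerCusipNumber>123456789</issuerCusipNumber></header>"
WRAPPED = (
    "<header><issuerCusips>"
    "<issuerCusipNumber>987654321</issuerCusipNumber>"
    "</issuerCusips></header>"
)


def extract_cusip(xml_text: str) -> str:
    """Return the first CUSIP found, or '' when none is present."""
    root = ET.fromstring(xml_text)
    # Flat probe: a bare tag name only matches direct children of root.
    node = root.find("issuerCusipNumber")
    if node is None:
        # Fallback: './/' searches all descendants, so the wrapped
        # layout is found as well.
        node = root.find(".//issuerCusipNumber")
    if node is None:
        return ""
    return (node.text or "").strip()


flat_cusip = extract_cusip(FLAT)
wrapped_cusip = extract_cusip(WRAPPED)
```

Probing the flat layout first keeps the behaviour for existing filings unchanged; the recursive search only runs when that probe misses.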
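For the event-date mismatch (#804), one common way to let two classes accept either attribute name is a read-only property alias. The sketch below assumes that approach; the release notes do not say how the aliasing is implemented, and `Filing13D` and `Filing13G` are hypothetical stand-ins, not the library's `Schedule13D` and `Schedule13G` classes.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Filing13D:  # hypothetical stand-in
    date_of_event: date  # original attribute name, unchanged

    @property
    def event_date(self) -> date:
        # Alias so callers can use the 13G-style name.
        return self.date_of_event


@dataclass
class Filing13G:  # hypothetical stand-in
    event_date: date  # original attribute name, unchanged

    @property
    def date_of_event(self) -> date:
        # Alias so callers can use the 13D-style name.
        return self.event_date


filings = [
    Filing13D(date_of_event=date(2024, 3, 1)),
    Filing13G(event_date=date(2024, 1, 15)),
]

# A mixed list can now be read through either name without getattr/hasattr.
by_event_date = [f.event_date for f in filings]
by_date_of_event = [f.date_of_event for f in filings]
```

Because each class keeps its original field and only gains an alias, code written against the old names continues to work.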
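The over-counting behind the spurious `DocumentTooLargeError` (#806) can be reproduced in a few lines. This is a minimal sketch using the standard library's `xml.etree.ElementTree` in place of lxml, with a synthetic nested document; the `accumulated_size` helper and the 1000-byte limit are assumptions for illustration, not the library's code or defaults.

```python
import io
import xml.etree.ElementTree as ET


def accumulated_size(data: bytes) -> int:
    """Mimic the removed accumulator: add the serialized size on every 'end' event."""
    total = 0
    for _event, elem in ET.iterparse(io.BytesIO(data), events=("end",)):
        # tostring() serializes the whole subtree, so each element's bytes
        # are counted again for every ancestor that closes after it.
        total += len(ET.tostring(elem))
    return total


# Synthetic document: 50 nested <div> elements around 100 bytes of text.
depth = 50
doc = ("<div>" * depth + "x" * 100 + "</div>" * depth).encode("utf-8")

limit = 1000  # hypothetical max_document_size
actual = len(doc)               # what a single up-front check measures
counted = accumulated_size(doc)  # what the per-event accumulator measured

# The up-front check passes, while the accumulator would have exceeded the limit.
passes_single_check = actual <= limit
accumulator_trips = counted > limit
```

The gap between `counted` and `actual` grows with nesting depth, which is why deeply nested but otherwise modest documents were the ones rejected.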
Full Changelog: v5.31.0...v5.31.1