Add PDF keycard parsing functionality #8946
Open
mohammadalfaiyazbitgo wants to merge 4 commits into
…d sdk
- add parseKeycard.ts with pure string parsing logic (parseKeycardFromLines, buildLinesFromPDFNodes, KeycardEntry, PDFTextNode types)
- add extractKeycardFromPDF.ts with pdfjs-dist based PDF extraction (browser-only; consumer must configure GlobalWorkerOptions.workerSrc)
- add pdfjs-dist ^5.0.0 dependency to @bitgo/key-card
- export new functions from module index
- add mocha unit tests for parseKeycardFromLines covering all Part N edge cases
- add pdf parse demo UI to web-demo KeyCard page with file upload, result table, and worker configuration for webpack
WCN-19
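For readers unfamiliar with turning positioned PDF text into lines, here is a minimal sketch of the general idea behind a function like buildLinesFromPDFNodes. It is not the PR's implementation: the `TextNodeSketch` shape, the `buildLinesSketch` name, the y-tolerance value, and the choice to join fragments with a single space are all assumptions made for illustration.

```typescript
// Hypothetical node shape; the PR's actual PDFTextNode fields may differ.
interface TextNodeSketch {
  text: string;
  x: number; // horizontal position on the page
  y: number; // vertical position (PDF origin is bottom-left, so larger y is higher)
}

// Group nodes whose y positions lie within `tolerance` of the first node of a line,
// order lines top-to-bottom and nodes left-to-right, then join each line's text.
function buildLinesSketch(nodes: TextNodeSketch[], tolerance = 2): string[] {
  const sorted = [...nodes].sort((a, b) => b.y - a.y || a.x - b.x);
  const lines: { y: number; nodes: TextNodeSketch[] }[] = [];

  for (const node of sorted) {
    const current = lines[lines.length - 1];
    if (current && Math.abs(current.y - node.y) <= tolerance) {
      current.nodes.push(node); // same visual line
    } else {
      lines.push({ y: node.y, nodes: [node] }); // start a new line
    }
  }

  return lines.map((line) =>
    line.nodes
      .sort((a, b) => a.x - b.x) // restore reading order within the line
      .map((n) => n.text)
      .join(' ') // assumption: fragments are separated by one space
      .trim()
  );
}
```

Keeping this step free of any PDF library calls is what allows the string parsing to be unit tested in Node.js with plain arrays.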
pdfjs-dist@5.7.x requires node >=22.13.0 || >=24, which is incompatible with the node 20.x CI runner. downgrade to ^4.0.0 which supports node 18+. update web-demo worker path from .mjs to .js to match v4 build output. fix prettier formatting in web-demo KeyCard component. WCN-19
pdfjs-dist initializes browser-only globals (DOMMatrix) at module load time. using a static top-level import caused all tests in @bitgo/key-card to crash in node.js environments. switching to a dynamic import() inside extractKeycardEntriesFromPDF defers loading to call time (browser only). WCN-19
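The deferred-loading pattern described in this commit can be sketched roughly as follows. The importer is passed in as a parameter so the sketch runs without pdfjs-dist installed; in the package it would presumably be `() => import('pdfjs-dist')`. The `PdfModuleSketch` type and the `makeLazyExtractor` name are hypothetical and only model the one call used here.

```typescript
// Hypothetical minimal view of the PDF library; only the call used below is modelled.
type PdfModuleSketch = { getDocument: (src: { data: Uint8Array }) => unknown };

// Returns an extractor that loads the library on first use instead of at module load time.
// Importing this file therefore never touches browser-only globals such as DOMMatrix.
function makeLazyExtractor(importer: () => Promise<PdfModuleSketch>) {
  let loading: Promise<PdfModuleSketch> | undefined;

  return async function extract(data: Uint8Array): Promise<unknown> {
    if (!loading) {
      loading = importer(); // first call triggers the import; later calls reuse it
    }
    const pdfjs = await loading;
    return pdfjs.getDocument({ data });
  };
}

// Assumed real-world wiring (not executed here):
//   const extract = makeLazyExtractor(() => import('pdfjs-dist'));
```

With this shape, test files that import the package but never call the extractor do not load the PDF library at all.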
…nsions pdfjs-dist v4 only ships .mjs worker files (no .js variants). the previous commit incorrectly changed the worker path from .mjs to .js. reverted to pdf.worker.min.mjs and added .mjs to webpack resolve extensions in both dev and prod configs so webpack can locate the file. WCN-19
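As an illustration of the webpack change described in this commit, a config fragment might look like the sketch below. Only `.mjs` is taken from the commit message; the other extensions in the list and the `webpackConfigSketch` name are assumptions, and the worker wiring in the trailing comment is one common approach rather than necessarily the one used in web-demo.

```typescript
// Partial webpack configuration object; only the field relevant to this commit is shown.
const webpackConfigSketch = {
  resolve: {
    // Adding '.mjs' lets webpack resolve requests to .mjs files such as the pdf.js worker.
    // The other entries are placeholders for whatever the real config already lists.
    extensions: ['.ts', '.tsx', '.js', '.mjs'],
  },
};

// Assumed worker wiring in the consuming app (browser only, not executed here):
//   pdfjs.GlobalWorkerOptions.workerSrc = new URL(
//     'pdfjs-dist/build/pdf.worker.min.mjs',
//     import.meta.url
//   ).toString();
```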
2d5100a to
45ec5bf
Description
This PR adds functionality to extract and parse BitGo keycard data from PDF files. It includes:
- New parsing module (`parseKeycard.ts`): Core logic to parse keycard sections (A–D) from text lines extracted from PDFs, with robust handling of:
- PDF extraction module (`extractKeycardFromPDF.ts`): Wrapper around pdfjs-dist to extract text nodes from PDF files and convert them to keycard entries
- Web demo integration: Added file upload UI in the KeyCard component to allow users to upload a keycard PDF and view extracted sections in a table
- Dependencies: Added `pdfjs-dist` (^5.0.0) to both `@bitgo/key-card` and `web-demo` packages
Type of change
How Has This Been Tested?
Added comprehensive unit tests in `parseKeycardFromLines.test.ts` covering:
All tests verify correct reconstruction of encrypted wallet password JSON and other section values.
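To make "reconstruction of encrypted wallet password JSON" concrete, here is a rough sketch of one way a JSON value wrapped across several extracted lines could be reassembled. It is not taken from the PR: the `joinWrappedJson` name is hypothetical, and the approach assumes the JSON contains no significant whitespace at wrap points, which seems reasonable for base64-style ciphertext fields.

```typescript
// Starting at the first line that begins with '{', append lines until the
// accumulated text parses as JSON. Returns undefined if no complete value is found.
function joinWrappedJson(lines: string[]): string | undefined {
  const start = lines.findIndex((line) => line.trim().startsWith('{'));
  if (start === -1) return undefined;

  let candidate = '';
  for (let i = start; i < lines.length; i++) {
    candidate += lines[i].trim(); // assumption: wrap points carry no meaningful spaces
    try {
      JSON.parse(candidate);
      return candidate; // the value is complete once it parses
    } catch {
      // still incomplete; keep appending following lines
    }
  }
  return undefined;
}
```

A test in this style would feed in the wrapped lines and compare the joined string against the original JSON.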
Checklist:
https://claude.ai/code/session_01Pj4SsrmSnoBNj8h6zX1vaz