Fine-tuning a Model on Your Own Data tutorial throws error #2881

erenaldis · 2022-07-26T03:54:16Z

TypeError: forward() got an unexpected keyword argument 'passage_start_t'
On further investigation, several other keys in "batch", such as "start_of_word", trigger the same error.
It seems to be attempting to train a DistilBERT-base-like model rather than BertForQuestionAnswering.

TypeError Traceback (most recent call last)
in ()
5 # reader.train(data_dir=data_dir, train_filename="dev-v2.0-test.json", use_gpu=True, n_epochs=1, save_dir="my_model")
6 # data_dir = "PATH/TO_YOUR/TRAIN_DATA"
----> 7 reader.train(data_dir=data_dir, train_filename="answers.json", use_gpu=True, n_epochs=1, save_dir="my_model")

3 frames
/usr/local/lib/python3.7/dist-packages/haystack/nodes/reader/farm.py in train(self, data_dir, train_filename, dev_filename, test_filename, use_gpu, devices, batch_size, n_epochs, learning_rate, max_seq_len, warmup_proportion, dev_split, evaluate_every, save_dir, num_processes, use_amp, checkpoint_root_dir, checkpoint_every, checkpoints_to_keep, caching, cache_path)
419 checkpoints_to_keep=checkpoints_to_keep,
420 caching=caching,
--> 421 cache_path=cache_path,
422 )
423

/usr/local/lib/python3.7/dist-packages/haystack/nodes/reader/farm.py in _training_procedure(self, data_dir, train_filename, dev_filename, test_filename, use_gpu, devices, batch_size, n_epochs, learning_rate, max_seq_len, warmup_proportion, dev_split, evaluate_every, save_dir, num_processes, use_amp, checkpoint_root_dir, checkpoint_every, checkpoints_to_keep, teacher_model, teacher_batch_size, caching, cache_path, distillation_loss_weight, distillation_loss, temperature, tinybert, processor)
325
326 # 5. Let it grow!
--> 327 self.inferencer.model = trainer.train()
328 self.save(Path(save_dir))
329

/usr/local/lib/python3.7/dist-packages/haystack/modeling/training/base.py in train(self)
289 batch = {key: batch[key].to(self.device) for key in batch}
290
--> 291 loss = self.compute_loss(batch, step)
292
293 # Perform evaluation

/usr/local/lib/python3.7/dist-packages/haystack/modeling/training/base.py in compute_loss(self, batch, step)
373 def compute_loss(self, batch: dict, step: int) -> torch.Tensor:
374 # Forward & backward pass through model
--> 375 logits = self.model.forward(**batch)
376 per_sample_loss = self.model.logits_to_loss(logits=logits, global_step=self.global_step, **batch)
377 return self.backward_propagate(per_sample_loss, step)
TypeError: forward() got an unexpected keyword argument 'passage_start_t'

sjrl · 2022-07-26T07:40:13Z

Hi @erenaldis, could you share the complete code that produced this error? That will make it easier for us to help debug the problem.

Additionally, could you provide us with a few example entries of data from your answers.json file?

ZanSara · 2022-07-26T09:48:07Z

Hey @erenaldis, we need a bit more information from you to be able to help. So:

Are you running https://github.com/deepset-ai/haystack/blob/master/tutorials/Tutorial2_Finetune_a_model_on_your_data.ipynb without changes, or did you modify it?
- If you didn't modify it, can you provide the link where you found the code?
- If you modified it, can you share the code directly?
Are you running it with the same data provided by the tutorial, or on your data?
- If you're using your data, can you share a sample of it?
Are you executing it locally or on Colab?
- If you're running locally, which version of Haystack are you using?
- If you're on Colab, is it a GPU environment?

erenaldis · 2022-07-26T22:16:02Z

Thank you for your reply @ZanSara @sjrl
I am running https://github.com/deepset-ai/haystack/blob/master/tutorials/Tutorial2_Finetune_a_model_on_your_data.ipynb with the following change to point to my own data:

reader = FARMReader(model_name_or_path="distilbert-base-uncased-distilled-squad", use_gpu=True)
data_dir = "data"
reader.train(data_dir=data_dir, train_filename="answers.json", use_gpu=True, n_epochs=1, save_dir="my_model")

Here is a sample of it:

{ "data": [ { "paragraphs": [ { "qas": [ { "question": "Who are the borrowers?", "id": 424471, "answers": [ { "answer_id": 497727, "document_id": 844029, "question_id": 424471, "text": " AIR INDUSTRIES MACHINING, CORP., a New York corporation (“AIM”), NASSAU TOOL WORKS, INC., a New York corporation (“NTW”), THE STERLING ENGINEERING CORPORATION, a Connecticut corporation", "answer_start": 123, "answer_end": 309, "answer_category": null } ], "is_impossible": false }, { "question": "Who are the lenders?", "id": 424472, "answers": [ { "answer_id": 497728, "document_id": 844029, "question_id": 424472, "text": "WEBSTER BANK, NATIONAL ASSOCIATION", "answer_start": 623, "answer_end": 657, "answer_category": null } ], "is_impossible": false }, { "question": "Who are the guarantors?", "id": 424476, "answers": [ { "answer_id": 497730, "document_id": 844029, "question_id": 424476, "text": "AIR INDUSTRIES GROUP, a Nevada corporation (together with its successors and permitted assigns, “Parent”), and AIR REALTY GROUP, LLC, a Connecticut limited liability company", "answer_start": 391, "answer_end": 564, "answer_category": null } ], "is_impossible": false } ], "context": "This FOURTH Amendment TO LOAN AND SECURITY AGREEMENT (the “Amendment”), is dated May 17, 2022, and is made by and among (a) AIR INDUSTRIES MACHINING, CORP., a New York corporation (“AIM”), NASSAU TOOL WORKS, INC., a New York corporation (“NTW”), THE STERLING ENGINEERING CORPORATION, a Connecticut corporation (“Engineering”, and together with AIM and NTW, collectively the “Borrower”), (b) AIR INDUSTRIES GROUP, a Nevada corporation (together with its successors and permitted assigns, “Parent”), and AIR REALTY GROUP, LLC, a Connecticut limited liability company (“Realty”, and together with Parent, the “Guarantor”) and WEBSTER BANK, NATIONAL ASSOCIATION, a national banking association (successor by merger to Sterling National Bank), (together with its successors and permitted assigns, the “Lender”).", "document_id": 844029 } ] }, { "paragraphs": [ { "qas": [ { "question": "Who are the borrowers?", "id": 424471, "answers": [ { "answer_id": 497720, "document_id": 844022, "question_id": 424471, "text": "RAINMAKER SYSTEMS. INC", "answer_start": 82, "answer_end": 104, "answer_category": null } ], "is_impossible": false }, { "question": "Who are the lenders?", "id": 424472, "answers": [ { "answer_id": 497721, "document_id": 844022, "question_id": 424472, "text": "BRIDGE BANK, National Association", "answer_start": 122, "answer_end": 155, "answer_category": null } ], "is_impossible": false } ], "context": "THIS BUSINESS LOAN AGREEMENT dated February 2, 2005, is made and executed between RAINMAKER SYSTEMS. INC (“Borrower”) and BRIDGE BANK, National Association (“Lender”) on the following terms and conditions. Borrower has received prior commercial loans from Lender or has applied to Lender for a commercial loan or loans or other financial accommodations, including those which may be described on any exhibit or schedule attached to this Agreement (“Loan”). Borrower understands and agrees that: (A) in granting, renewing, or extending any Loan, Lender is relying upon Borrower’s representations, warranties, and agreements as set forth in this Agreement; (B) the granting, renewing, or extending of any Loan by Lender at all times shall be subject to Lender’s sole judgment and discretion; and (C) all such Loans shall be and remain subject to the terms and conditions of this Agreement.", "document_id": 844022 } ] }, { "paragraphs": [ { "qas": [ { "question": "Who are the borrowers?", "id": 424471, "answers": [ { "answer_id": 497722, "document_id": 844024, "question_id": 424471, "text": "AeroVironment, Inc.", "answer_start": 0, "answer_end": 19, "answer_category": null } ], "is_impossible": false } ], "context": "AeroVironment, Inc., a Delaware corporation (the “Borrower” or “you”), has advised Bank of America, N.A. (through itself or one of its designated affiliates or branch offices, “Bank of America”), BofA Securities, Inc. (or any of its designated affiliates, “BofA Securities”), JPMorgan Chase Bank, N.A. (“JPM”) and U.S. Bank National Association (“U.S. Bank”; U.S. Bank, together with Bank of America, BofA Securities and JPM, the “Commitment Parties,” “we” or “us”) that you intend to acquire (the “Acquisition”), directly or indirectly, all of the outstanding equity interests of Arcturus UAV, Inc., a California corporation (the “Target”) pursuant to that certain Stock Purchase Agreement, dated as of the date hereof (together with all schedules, exhibits and annexes thereto, the “Acquisition Agreement”), among the Target, the persons or entities identified therein as Sellers (collectively, the “Sellers”), D’Milo Hallerberg, solely in his capacity as the representative of the Sellers, and you. You have further advised us that, in connection with the foregoing, you intend to consummate the transactions described in the transaction description attached hereto as Exhibit A (the “Transaction Description”). Capitalized terms used but not defined herein shall have the meanings assigned to them in the Transaction Description, the Summary of Terms (as defined below) or the Conditions Annex (as defined below), as applicable.", "document_id": 844024 } ] }, { "paragraphs": [ { "qas": [ { "question": "Who are the joint book runners?", "id": 424474, "answers": [ { "answer_id": 497726, "document_id": 844026, "question_id": 424474, "text": "BofA Securities, JPM and U.S. Bank", "answer_start": 0, "answer_end": 34, "answer_category": null } ], "is_impossible": false }, { "question": "Who are the joint lead arrangers?", "id": 424475, "answers": [ { "answer_id": 497725, "document_id": 844026, "question_id": 424475, "text": "BofA Securities, JPM and U.S. Bank", "answer_start": 0, "answer_end": 34, "answer_category": null } ], "is_impossible": false } ], "context": "BofA Securities, JPM and U.S. Bank are pleased to advise you of their willingness, as joint lead arrangers and joint bookrunners (in such capacities, the “Joint Lead Arrangers” and each a “Joint Lead Arranger”) for the Facilities, to use their commercially reasonable efforts to form a syndicate of financial institutions (including Bank of America, JPM and U.S. Bank) (collectively, the “Lenders”) acceptable to you for the Facilities. It is understood and agreed that BofA Securities shall have the “left” placement in any and all marketing materials or other documentation used in connection with the Facilities and shall hold the leading role and responsibilities conventionally associated with such “left” placement, including sole selling role in respect of the Facilities. No additional agents, co-agents or arrangers will be appointed without your and our prior written approval.", "document_id": 844026 } ] }, { "paragraphs": [ { "qas": [ { "question": "Who are the administrative agents?", "id": 424473, "answers": [ { "answer_id": 497723, "document_id": 844025, "question_id": 424473, "text": "Bank of America", "answer_start": 38, "answer_end": 53, "answer_category": null } ], "is_impossible": false } ], "context": "In connection with the foregoing, (a) Bank of America is pleased to offer to be the sole administrative agent (in such capacity, the “Administrative Agent”) for the Facilities and Bank of America is pleased to offer its several and not joint commitment to lend $100 million of the Facilities (to be allocated ratably between the Term Loan Facility and the Revolving Facility), upon and subject to the terms and conditions set forth in this commitment letter, in the Transaction Description, in the Summary of Terms and Conditions attached as Exhibit B hereto (the “Summary of Terms”) and in the conditions annex attached hereto as Exhibit C (the “Conditions Annex”; this commitment letter, the Transaction Description, the Summary of Terms and the Conditions Annex, collectively, this “Commitment Letter”), (b) JPM is pleased to offer its several and not joint commitment to lend $100 million of the Facilities upon and subject to the terms and conditions of this Commitment Letter (to be allocated ratably between the Term Loan Facility and the Revolving Facility) and (c) U.S. Bank is pleased to offer its several and not joint commitment to lend $100 million of the Facilities upon and subject to the terms and conditions of this Commitment Letter (to be allocated ratably between the Term Loan Facility and the Revolving Facility).", "document_id": 844025 } ] } ] }

I am running on Colab with a GPU and the high-RAM runtime.
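For anyone preparing a similar SQuAD-style file, a quick offset check can rule out annotation problems before training. The sketch below is only an illustration and is unrelated to the TypeError itself: the helper name `find_misaligned_answers` is hypothetical, the first entry reuses one answer from the sample above, and the second entry is a deliberately shifted offset added for demonstration.

```python
def find_misaligned_answers(squad):
    """Return (question id, expected text, text found at the offset) per mismatch."""
    problems = []
    for doc in squad["data"]:
        for para in doc["paragraphs"]:
            context = para["context"]
            for qa in para["qas"]:
                for ans in qa["answers"]:
                    start = ans["answer_start"]
                    # Slice the context at the annotated offset and compare.
                    found = context[start:start + len(ans["text"])]
                    if found != ans["text"]:
                        problems.append((qa["id"], ans["text"], found))
    return problems


# Shortened excerpt of one context from the sample; the second question
# (id -1) is a made-up entry with an offset that is off by one.
sample = {
    "data": [{
        "paragraphs": [{
            "context": ("THIS BUSINESS LOAN AGREEMENT dated February 2, 2005, "
                        "is made and executed between RAINMAKER SYSTEMS. INC"),
            "qas": [
                {"id": 424471, "question": "Who are the borrowers?",
                 "answers": [{"text": "RAINMAKER SYSTEMS. INC", "answer_start": 82}]},
                {"id": -1, "question": "Hypothetical misaligned entry",
                 "answers": [{"text": "RAINMAKER SYSTEMS. INC", "answer_start": 83}]},
            ],
        }]
    }]
}

problems = find_misaligned_answers(sample)
for question_id, expected, found in problems:
    print(f"question {question_id}: expected {expected!r}, found {found!r}")
```

An empty result only means that each answer text matches the context at its stated offset; it says nothing about whether training will succeed.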

ZanSara · 2022-07-27T07:48:39Z

Thank you! This is a real bug in training, and we're actively working on it (#2886). The fix might be out today or tomorrow, so stay tuned 😊
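For readers who land here from a search, the failure mode in the traceback can be reproduced without Haystack. The sketch below is an illustration only, not the code from #2886: the toy `forward()` and its parameter names are assumptions, and `call_with_accepted_kwargs` is a hypothetical helper. It shows why unpacking a whole batch with `**batch` fails as soon as the batch carries keys that the function does not declare, and one generic way to pass only the declared ones.

```python
import inspect


def forward(input_ids, padding_mask, segment_ids=None):
    """Toy stand-in for a model's forward(); it declares three inputs only."""
    return {"n_inputs": len(input_ids)}


# Hypothetical batch: three model inputs plus two bookkeeping fields named
# after the keys mentioned in the error report.
batch = {
    "input_ids": [101, 2040, 102],
    "padding_mask": [1, 1, 1],
    "segment_ids": [0, 0, 0],
    "passage_start_t": 0,
    "start_of_word": [1, 1, 1],
}


def call_with_accepted_kwargs(fn, kwargs):
    """Call fn with only the keyword arguments its signature declares."""
    accepted = inspect.signature(fn).parameters
    return fn(**{k: v for k, v in kwargs.items() if k in accepted})


# Unpacking the full batch raises the same kind of TypeError as in the report.
try:
    forward(**batch)
    error_message = None
except TypeError as exc:
    error_message = str(exc)

# Passing only the declared arguments succeeds.
result = call_with_accepted_kwargs(forward, batch)
print(error_message)
print(result)
```

Filtering by signature is just one option. Judging by its title, the linked PR instead names each parameter explicitly at the call site, which keeps the call readable and fails loudly if an expected key is missing.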

sjrl added the topic:reader label Jul 26, 2022

This was referenced Jul 26, 2022

DPR training is broken #2885

Closed

Explicitly specify all parameters to forward call #2886

Merged

ZanSara mentioned this issue Sep 26, 2022

Run GPU tutorials nightly deepset-ai/haystack-tutorials#33

Closed

sjrl mentioned this issue Jul 28, 2022

Tutorial 2 in colab Fails #2902

Closed


sjrl added the type:bug and topic:tutorials labels Jul 28, 2022

ZanSara closed this as completed in #2886 Jul 28, 2022
