Closed
Issue: KeyError 'seed' When Running Snippet to Concepts Generation Script
Description
When attempting to run the 'Snippet to concepts generation' script, I encounter a KeyError related to a missing 'seed' key in the input data.
Error Output
  File "/share/home/starcoder2-self-align/src/star_align/self_ossinstruct.py", line 514, in <module>
    asyncio.run(main())
  File "/share/home/anaconda3/envs/starcoder-generate/lib/python3.10/asyncio/runners.py", line 44, in run
    return loop.run_until_complete(main)
  File "/share/home/anaconda3/envs/starcoder-generate/lib/python3.10/asyncio/base_events.py", line 649, in run_until_complete
    return future.result()
  File "/share/home/starcoder2-self-align/src/star_align/self_ossinstruct.py", line 411, in main
    kwargs = build_kwargs(args.instruct_mode, example)
  File "/share/home/starcoder2-self-align/src/star_align/self_ossinstruct.py", line 307, in build_kwargs
    kwargs["snippet"] = example["seed"]
KeyError: 'seed'
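The last frame of the traceback is a plain dictionary lookup on each input example, so one way to narrow this down is to check which records in the seed file actually carry a 'seed' key. The following is a minimal diagnostic sketch, not part of the repository: the helper name find_records_missing_key and the sample records (including the "content" and "id" fields) are hypothetical and only illustrate the check.

```python
import json
from typing import Iterable, List, Tuple

# The key that build_kwargs reads, according to the traceback above.
REQUIRED_KEY = "seed"


def find_records_missing_key(
    lines: Iterable[str], key: str = REQUIRED_KEY
) -> List[Tuple[int, List[str]]]:
    """Return (line_number, available_keys) for each JSON record lacking `key`."""
    problems = []
    for line_number, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # skip blank lines in the JSONL input
        record = json.loads(line)
        if key not in record:
            problems.append((line_number, sorted(record)))
    return problems


# Hypothetical sample records, used only to demonstrate the check.
sample_lines = [
    '{"seed": "def add(a, b):\\n    return a + b"}',
    '{"content": "def sub(a, b):\\n    return a - b", "id": 1}',
]
report = find_records_missing_key(sample_lines)
print(report)
```

To run the same check against the real file, pass an open file handle for seed.jsonl instead of sample_lines; the keys it reports show which fields the data actually provides.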
Command Used
python src/star_align/self_ossinstruct.py \
--instruct_mode "S->C" \
--seed_data_files seed.jsonl \
--max_new_data 50000 \
--tag concept_gen \
--temperature 0.7 \
--seed_code_start_index 0 \
--model bigcode/starcoder2-15b \
--num_fewshots 8 \
--num_batched_requests 32 \
--num_sample_per_request 1
Source of Seed Data
The seed data file is sourced from the following dataset:
bigcode/python-stack-v1-functions-filtered-sc2
Steps to Reproduce
- Set up the environment using the provided command.
- Run the script as shown in the command section.
Actual Behavior
The script fails with a KeyError indicating that the 'seed' key is missing from the input data examples.
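If the records store the code snippet under a different field, a possible workaround is to copy that field into 'seed' before running the script. The sketch below assumes this; it is not a confirmed fix. The helper names add_seed_key and convert_lines are hypothetical, and the source field name "content" is an assumption that should be replaced with whatever field the data actually uses.

```python
import json
from typing import Any, Dict, Iterable, Iterator


def add_seed_key(record: Dict[str, Any], source_key: str) -> Dict[str, Any]:
    """Return a copy of `record` with 'seed' populated from `source_key`."""
    if "seed" in record:
        return dict(record)  # already in the expected shape
    if source_key not in record:
        raise KeyError(f"record has neither 'seed' nor {source_key!r}")
    updated = dict(record)
    updated["seed"] = record[source_key]
    return updated


def convert_lines(lines: Iterable[str], source_key: str) -> Iterator[str]:
    """Yield JSONL lines whose records all contain a 'seed' key."""
    for line in lines:
        if line.strip():
            yield json.dumps(add_seed_key(json.loads(line), source_key))


# "content" is an assumed field name, used here only for illustration.
converted = list(convert_lines(['{"content": "def f(): pass"}'], source_key="content"))
print(converted[0])
```

Writing the converted lines to a new file and passing that file to --seed_data_files leaves the original data untouched.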
Additional Information
- Environment: Conda environment with Python 3.10