Description
System Info
OS Version:
Distributor ID: Ubuntu
Description: Ubuntu 20.04.3 LTS
Release: 20.04
Codename: focal
8 A100 GPUs
Using the latest text-generation-inference Docker image.
I fine-tuned a Flan-T5-Small model and saved the checkpoint to a local directory. The checkpoint is stored in my data2 volume, and I run the following command:
docker run --gpus all --shm-size 1g -p 8080:80 -v $volume:/data2 ghcr.io/huggingface/text-generation-inference:0.9 --model-id /data2/checkpoint-20 --num-shard $num_shard
But I run into an error while the weights are being converted, as shown below.
Information
- Docker
- The CLI directly
Tasks
- An officially supported command
- My own modifications
Reproduction
Run the docker command above.
I get this error:
2023-07-12T05:45:31.707548Z INFO text_generation_launcher: Args { model_id: "/data2/checkpoint-20", revision: None, sharded: None, num_shard: Some(2), quantize: None, dtype: None, trust_remote_code: false, max_concurrent_requests: 128, max_best_of: 2, max_stop_sequences: 4, max_input_length: 1024, max_total_tokens: 2048, waiting_served_ratio: 1.2, max_batch_prefill_tokens: 4096, max_batch_total_tokens: 16000, max_waiting_tokens: 20, hostname: "0341f92fe465", port: 80, shard_uds_path: "/tmp/text-generation-server", master_addr: "localhost", master_port: 29500, huggingface_hub_cache: Some("/data"), weights_cache_override: None, disable_custom_kernels: false, json_output: false, otlp_endpoint: None, cors_allow_origin: [], watermark_gamma: None, watermark_delta: None, ngrok: false, ngrok_authtoken: None, ngrok_domain: None, ngrok_username: None, ngrok_password: None, env: false }
2023-07-12T05:45:31.707602Z INFO text_generation_launcher: Sharding model on 2 processes
2023-07-12T05:45:31.707781Z INFO text_generation_launcher: Starting download process.
2023-07-12T05:45:33.261253Z WARN download: text_generation_launcher: No safetensors weights found for model /data2/checkpoint-20 at revision None. Converting PyTorch weights to safetensors.
2023-07-12T05:45:33.711218Z ERROR text_generation_launcher: Download encountered an error: Traceback (most recent call last):
File "/opt/conda/bin/text-generation-server", line 8, in <module>
sys.exit(app())
File "/opt/conda/lib/python3.9/site-packages/text_generation_server/cli.py", line 164, in download_weights
utils.convert_files(local_pt_files, local_st_files)
File "/opt/conda/lib/python3.9/site-packages/text_generation_server/utils/convert.py", line 53, in convert_files
convert_file(pt_file, sf_file)
File "/opt/conda/lib/python3.9/site-packages/text_generation_server/utils/convert.py", line 21, in convert_file
if "state_dict" in loaded:
TypeError: argument of type 'Seq2SeqTrainingArguments' is not iterable
Error: DownloadError
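The TypeError in the traceback can be reproduced without the server. The sketch below is illustrative only: `FakeTrainingArguments` is a hypothetical stand-in for whatever object was unpickled, and `has_state_dict_key` is a hypothetical guarded variant of the check, not code from text-generation-inference.

```python
class FakeTrainingArguments:
    """Hypothetical stand-in for an unpickled Seq2SeqTrainingArguments object."""

    output_dir = "checkpoint-20"


def reproduces_type_error(loaded) -> bool:
    """Return True if the unguarded membership test raises TypeError."""
    try:
        "state_dict" in loaded  # same expression as convert.py line 21 in the log
    except TypeError:
        return True
    return False


def has_state_dict_key(loaded) -> bool:
    """Guarded variant: only test membership when the object is a dict."""
    return isinstance(loaded, dict) and "state_dict" in loaded
```

A plain object with no `__contains__` or `__iter__` triggers the error, while a dict does not. This suggests the file being converted holds a pickled arguments object rather than model weights.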
Expected behavior
I would expect the local model to load the same way models from the Hugging Face Hub do. I'd appreciate any help!
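One possible workaround, assuming the failure comes from a Trainer bookkeeping file such as training_args.bin sitting next to the weights in the checkpoint directory (this is not confirmed by the log above), is to copy only the model files into a clean directory and point --model-id at that directory. The function name `stage_model_dir` and the list of excluded file names are assumptions made for this sketch.

```python
import shutil
from pathlib import Path

# Assumed names of Trainer bookkeeping files that are not model weights.
NON_WEIGHT_FILES = {"training_args.bin", "optimizer.pt", "scheduler.pt", "trainer_state.json"}


def stage_model_dir(checkpoint_dir: str, staged_dir: str) -> list[str]:
    """Copy everything except Trainer bookkeeping files into staged_dir.

    Returns the sorted names of the files that were copied.
    """
    src = Path(checkpoint_dir)
    dst = Path(staged_dir)
    dst.mkdir(parents=True, exist_ok=True)

    copied = []
    for path in sorted(src.iterdir()):
        # Skip subdirectories and files that only hold training state.
        if not path.is_file():
            continue
        if path.name in NON_WEIGHT_FILES or path.name.startswith("rng_state"):
            continue
        shutil.copy2(path, dst / path.name)
        copied.append(path.name)
    return copied
```

For example, `stage_model_dir("checkpoint-20", "checkpoint-20-clean")` would leave the weights, config, and tokenizer files in checkpoint-20-clean, which could then be mounted in place of the original checkpoint directory.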