
Can't load local flan-small models due to weight conversion failure  #589

@arnavsinghvi11

Description

System Info

OS Version:
Distributor ID: Ubuntu
Description: Ubuntu 20.04.3 LTS
Release: 20.04
Codename: focal

8 A100 GPUs

Using the latest text-generation-inference Docker image.

I've run fine-tuning on a Flan-T5-Small model and saved the checkpoint in my local directory. I've stored this local model checkpoint in my data2 volume and run the command as follows:
docker run --gpus all --shm-size 1g -p 8080:80 -v $volume:/data2 ghcr.io/huggingface/text-generation-inference:0.9 --model-id /data2/checkpoint-20 --num-shard $num_shard

But I run into errors while converting the weights, as shown below.

Information

  • Docker
  • The CLI directly

Tasks

  • An officially supported command
  • My own modifications

Reproduction

Run the docker command above.

I get this error now:

2023-07-12T05:45:31.707548Z INFO text_generation_launcher: Args { model_id: "/data2/checkpoint-20", revision: None, sharded: None, num_shard: Some(2), quantize: None, dtype: None, trust_remote_code: false, max_concurrent_requests: 128, max_best_of: 2, max_stop_sequences: 4, max_input_length: 1024, max_total_tokens: 2048, waiting_served_ratio: 1.2, max_batch_prefill_tokens: 4096, max_batch_total_tokens: 16000, max_waiting_tokens: 20, hostname: "0341f92fe465", port: 80, shard_uds_path: "/tmp/text-generation-server", master_addr: "localhost", master_port: 29500, huggingface_hub_cache: Some("/data"), weights_cache_override: None, disable_custom_kernels: false, json_output: false, otlp_endpoint: None, cors_allow_origin: [], watermark_gamma: None, watermark_delta: None, ngrok: false, ngrok_authtoken: None, ngrok_domain: None, ngrok_username: None, ngrok_password: None, env: false }
2023-07-12T05:45:31.707602Z INFO text_generation_launcher: Sharding model on 2 processes
2023-07-12T05:45:31.707781Z INFO text_generation_launcher: Starting download process.
2023-07-12T05:45:33.261253Z WARN download: text_generation_launcher: No safetensors weights found for model /data2/checkpoint-20 at revision None. Converting PyTorch weights to safetensors.

2023-07-12T05:45:33.711218Z ERROR text_generation_launcher: Download encountered an error: Traceback (most recent call last):

File "/opt/conda/bin/text-generation-server", line 8, in <module>
sys.exit(app())

File "/opt/conda/lib/python3.9/site-packages/text_generation_server/cli.py", line 164, in download_weights
utils.convert_files(local_pt_files, local_st_files)

File "/opt/conda/lib/python3.9/site-packages/text_generation_server/utils/convert.py", line 53, in convert_files
convert_file(pt_file, sf_file)

File "/opt/conda/lib/python3.9/site-packages/text_generation_server/utils/convert.py", line 21, in convert_file
if "state_dict" in loaded:

TypeError: argument of type 'Seq2SeqTrainingArguments' is not iterable

Error: DownloadError
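From the traceback, the converter appears to torch.load every .bin file in the checkpoint directory. A Trainer checkpoint also contains training_args.bin, which deserializes to a Seq2SeqTrainingArguments object rather than a state dict, so the membership test `"state_dict" in loaded` raises. A minimal stand-alone sketch of that failure mode (using a stand-in class, not the real transformers object):

```python
class TrainingArgsLike:
    """Stand-in for Seq2SeqTrainingArguments: a plain object, not a dict."""

# torch.load("training_args.bin") yields an object like this, and the
# converter's membership test then fails exactly as in the log above.
loaded = TrainingArgsLike()
try:
    "state_dict" in loaded  # what convert_file does
    raised = False
except TypeError as e:
    raised = True
    msg = str(e)

print(msg)  # argument of type 'TrainingArgsLike' is not iterable
```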

Expected behavior

I would expect the local model to load just as models downloaded from the Hugging Face Hub do. I'd appreciate any help!
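As a possible workaround (untested suggestion): either move training_args.bin out of the checkpoint directory before starting the container, or re-save the model with `model.save_pretrained(..., safe_serialization=True)` so TGI finds safetensors weights and skips the conversion entirely. On the server side, the conversion step could simply skip non-weight files; a minimal sketch of such a filter (hypothetical helper, not TGI's actual code):

```python
from pathlib import Path
import tempfile

def weight_files(ckpt_dir):
    """Return only weight .bin files, skipping training_args.bin.
    Hypothetical filter; TGI's real fix may look different."""
    return sorted(p for p in Path(ckpt_dir).glob("*.bin")
                  if p.name != "training_args.bin")

# Demo on a fake checkpoint directory:
with tempfile.TemporaryDirectory() as d:
    for name in ("pytorch_model.bin", "training_args.bin"):
        (Path(d) / name).touch()
    names = [p.name for p in weight_files(d)]

print(names)  # ['pytorch_model.bin']
```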
