"load weight error" and "Not compiled with GPU support" #4

Open
FJR-Nancy opened this issue Oct 9, 2023 · 10 comments

@FJR-Nancy

Calling torch.load('mq-glip-l') raises a RuntimeError: "PytorchStreamReader failed reading zip archive: not a ZIP archive". mq-glip-t fails in the same way. mq-glip-l and mq-glip-t were downloaded from https://drive.google.com/file/d/1O_eb1LrlNqpEsoxD23PAIxW8WB6sGoBO/view and https://drive.google.com/file/d/1n0_D-tisqN5v-IESUEIGzMuO-9wolXiu/view

@YifanXu74
Owner

Hi,
I didn't encounter any problem when following these steps:

  1. Download mq-glip-t from Google Drive.
  2. Load the checkpoint: ckpt = torch.load("mq-glip-t", map_location="cpu").

The problem might be due to incomplete file downloads. Please verify the file sizes and redownload the checkpoint files. The expected file sizes are as follows:
mq-glip-l: 1.8G
mq-glip-t: 1.1G
Let me know if you need any further assistance.
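
A quick way to check a downloaded file before calling torch.load is sketched below. It assumes the checkpoints were saved in PyTorch's zip-based format, which the error message suggests but the thread does not confirm. The function name `inspect_checkpoint` and the local file paths are illustrative, not part of the repository.

```python
import os
import zipfile


def inspect_checkpoint(path):
    """Return basic facts about a checkpoint file without loading it."""
    size_bytes = os.path.getsize(path)
    return {
        "size_gib": size_bytes / 1024 ** 3,
        # is_zipfile looks for the end-of-central-directory record near the
        # end of the file, so a truncated download usually fails this check.
        "is_zip": zipfile.is_zipfile(path),
    }


if __name__ == "__main__":
    # Assumed local paths; adjust to wherever the files were saved.
    for name in ("mq-glip-t", "mq-glip-l"):
        if os.path.exists(name):
            print(name, inspect_checkpoint(name))
```

If `is_zip` is False for a file of roughly the expected size, the file was probably damaged somewhere between the download and the machine where it is loaded.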

@FJR-Nancy
Author

FJR-Nancy commented Oct 10, 2023

The download is complete and the downloaded file sizes are as follows:
mq-glip-l: 2.0G
mq-glip-t: 1.2G
By the way, my Python version is 3.8, because 3.9 is incompatible with my environment. Maybe this is why the weights could not be loaded.

@YifanXu74
Owner

I tested this process in three different environments: Python 3.8 with torch 1.5.0, Python 3.8 with torch 2.0.1, and Python 3.8 with torch 2.1.0. It succeeded in all of them.
Could you share the details of your environment, including the Python and torch versions, and the exact steps to reproduce the error?

@FJR-Nancy
Author

FJR-Nancy commented Oct 10, 2023

I also tried Python 3.9 in another environment, with the same result. I also downloaded the weights several times on different devices, but nothing changed. My environment is Python 3.8/3.9 with torch 2.0.1+cu117. I just ran torch.load('mq-glip-l') and it failed.

@YifanXu74
Owner

This is very strange, and at the moment I'm not sure how to resolve it. I tested on different devices and saw no such issue. I have also used these checkpoints to reproduce the model, so I am certain that the released checkpoints are valid.

@FJR-Nancy
Author

Sorry, I found the problem: some data was lost while transferring the weights to the cluster. So the checkpoints are valid.
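
For anyone who hits the same thing, comparing a checksum on both machines catches this kind of transfer loss. This is a minimal sketch using only the standard library. The helper name `sha256_of_file` is illustrative, and no reference checksums are published in this thread, so the comparison is between your own two copies.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Hash a file in chunks so multi-gigabyte checkpoints are not read at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read 1 MiB at a time until read() returns an empty bytes object.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Run it on the machine where the download finished and again on the cluster. If the two hex digests differ, the copy on the cluster is not the file that was downloaded.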

However, there is another problem in the finetuning-free evaluation on a custom dataset: "RuntimeError: Not compiled with GPU support" is raised in _C.modulated_deform_conv_forward() in deform_conv.py. Is only CPU supported for now?

@YifanXu74
Owner

The code is GPU-compatible. This error is probably caused by your environment, for example an incorrect CUDA installation or a CPU-only build of torch.
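
The two causes mentioned above can be checked from Python. The sketch below only reports what the installed torch sees; the function name `describe_torch_build` is illustrative. It rests on an assumption the thread does not state: in maskrcnn_benchmark-style code, this message usually comes from the compiled `_C` extension having been built without CUDA, so the extension may need rebuilding once the environment is fixed.

```python
def describe_torch_build():
    """Collect facts that distinguish a CPU-only setup from a CUDA one."""
    try:
        import torch
        from torch.utils.cpp_extension import CUDA_HOME
    except ImportError:
        return {"torch_installed": False}
    return {
        "torch_installed": True,
        "torch_version": torch.__version__,
        # None means the installed torch is a CPU-only build.
        "torch_cuda_version": torch.version.cuda,
        "cuda_available": torch.cuda.is_available(),
        # None means no CUDA toolkit was found for compiling extensions.
        "cuda_home": CUDA_HOME,
    }


if __name__ == "__main__":
    for key, value in describe_torch_build().items():
        print(f"{key}: {value}")
```

If `torch_cuda_version` or `cuda_home` is None, or `cuda_available` is False, the environment is the likely culprit rather than the repository code.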

@FJR-Nancy
Author

The GPU problem was solved by following microsoft/GLIP#41.

But during finetuning, the error "too many values to unpack" occurs at "for iteration, (images, targets, idxs, positive_map, positive_map_eval, greenlight_map) in enumerate(data_loader, start_iter):" (line 92 of maskrcnn_benchmark/engine/trainer.py). What causes this?
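
In general, this error means each batch yielded by the loader holds more items than the six names in the loop header. The sketch below reproduces that with a hypothetical stand-in loader; `fake_loader` and its seven string entries are invented for illustration and say nothing about what the real data loader returns. The helper `batch_arity` is likewise illustrative.

```python
def batch_arity(data_loader):
    """Return how many items the first batch of a loader contains."""
    first_batch = next(iter(data_loader))
    return len(first_batch)


# Hypothetical stand-in for a data loader whose batches carry 7 items.
fake_loader = [("images", "targets", "idxs", "positive_map",
                "positive_map_eval", "greenlight_map", "extra")]

error_message = None
try:
    # Same unpacking pattern as the training loop: six names per batch.
    for iteration, (images, targets, idxs, positive_map,
                    positive_map_eval, greenlight_map) in enumerate(fake_loader):
        pass
except ValueError as err:
    error_message = str(err)

print(batch_arity(fake_loader))
print(error_message)
```

Calling a helper like `batch_arity` on the real `data_loader` just before the loop would show how many items each batch actually carries, which is useful detail to include in a bug report.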

@YifanXu74
Owner

YifanXu74 commented Oct 16, 2023

Hi, this issue seems unrelated to the original topic. To make it easier to track and address, would you mind opening a new issue for it? It would help if you could provide the detailed steps or commands that reproduce the error. I will look into it tomorrow.

@FJR-Nancy
Author

ok

@YifanXu74 changed the title from load weight error to "load weight error" and "Not compiled with GPU support" on Oct 17, 2023