when i running the code, have a bug #6

Closed
Cmnotjx opened this issue Apr 19, 2023 · 14 comments

Comments

@Cmnotjx

Cmnotjx commented Apr 19, 2023

image

@KarhouTam
Owner

KarhouTam commented Apr 19, 2023

Are you running the latest version of the code? If not, pull the latest changes and try again. In my environment, server/fedavg.py runs to completion without errors.

@Cmnotjx
Author

Cmnotjx commented Apr 19, 2023

Thank you very much for your reply. The bug has been fixed. I am also very curious how you reproduced the structure of the hypernetwork described in the pFedLA paper, since the original text does not give specific details. I think you are very talented!

@KarhouTam
Owner

KarhouTam commented Apr 19, 2023

First, thanks for the recognition. You are right that the details of the model structure are unclear. The rough structure of the hypernetwork is shown in the appendix of the pFedLA paper, but the actual parameters of pFedLA are guessed.

Actually, I was unsure whether to segment the neural network chunk-wise (each chunk includes a conv, bn, relu, and pool, with one $\alpha$ per chunk) or layer-wise (one $\alpha$ for each conv, bn, and linear layer).

The chunk-wise approach would bring a more complicated model structure into this benchmark, which is hard to generalize, and I don't want that.

So in this benchmark, I reproduce it layer-wise.

However, in my other repo, pFedLA, I reproduce pFedLA chunk-wise. If you are more interested in that approach, go check it out. 😏
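
To make the difference between the two segmentations concrete, here is a small sketch that groups the layers of a toy CNN both ways and counts how many $\alpha$ vectors each would need. The layer names, the prefix-based grouping, and the choice to treat a linear layer as its own chunk are all assumptions made for illustration; neither repo is confirmed to implement it this way.

```python
# Hypothetical layer list for a small CNN; the names are illustrative only.
LAYERS = ["conv1", "bn1", "relu1", "pool1", "conv2", "bn2", "relu2", "pool2", "fc1"]

# Assumption: only these layer kinds carry parameters that get aggregated.
PARAM_PREFIXES = ("conv", "bn", "fc")


def layer_wise_groups(layer_names):
    """One group (one alpha vector) per parameterised layer."""
    return [[name] for name in layer_names if name.startswith(PARAM_PREFIXES)]


def chunk_wise_groups(layer_names):
    """One group (one alpha vector) per chunk.

    Assumption: a new chunk starts at every conv or fc layer, and the
    bn/relu/pool layers that follow belong to the same chunk.
    """
    groups = []
    for name in layer_names:
        if name.startswith(("conv", "fc")) or not groups:
            groups.append([])
        groups[-1].append(name)
    return groups


if __name__ == "__main__":
    print("layer-wise:", layer_wise_groups(LAYERS))
    print("chunk-wise:", chunk_wise_groups(LAYERS))
```

With this illustrative list, the layer-wise grouping needs five $\alpha$ vectors and the chunk-wise grouping needs three, which is why the chunk-wise variant ties the aggregation logic more closely to the model architecture.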

@Cmnotjx
Author

Cmnotjx commented Apr 19, 2023

Your work is excellent! Thank you very much. When I tried to run fedavg.py and pfedla.py using the "easy" method, I encountered the following issues. I am not sure if it's due to my torch version.
image

@KarhouTam
Owner

KarhouTam commented Apr 19, 2023

Can you show me the command you used to run the code? On my Ubuntu 22.04 workstation, there is no problem.
Python: 3.9.13
PyTorch: 1.13.1
Torchvision: 0.14.1
image
image
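
For comparing environments, a short script like the one below prints the same three versions listed above. It is only a convenience sketch: the helper name `installed_versions` is made up here, and `importlib.metadata` requires Python 3.8 or newer.

```python
import sys
from importlib import metadata


def installed_versions(packages=("torch", "torchvision")):
    """Return {name: version or None} for Python and the given distributions."""
    versions = {"python": ".".join(map(str, sys.version_info[:3]))}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None  # not installed in this environment
    return versions


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name}: {version}")
```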

@Cmnotjx
Author

Cmnotjx commented Apr 19, 2023

image
But mine does not work.

@Cmnotjx
Author

Cmnotjx commented Apr 19, 2023

I also tried it on Ubuntu and hit the same problem, so I suspect it is a version issue.

@KarhouTam
Owner

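I have not tested the code on older versions of PyTorch, so it is probably a version issue. If it is convenient for you to set up an environment, you could use Anaconda to create one with a newer PyTorch version and run my code there.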

@KarhouTam
Owner

If the error still occurs, try removing the type hints from the function parameters, e.g. change
def clone_params(src: Union[OrderedDict[str, torch.Tensor], torch.nn.Module])
to
def clone_params(src)
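
As a rough illustration of what the annotation-free version could look like: the error in the screenshot is not reproduced in this thread, but one plausible cause is the annotation being evaluated when the function is defined (for example, subscripting `collections.OrderedDict` raises `TypeError` before Python 3.9). The function body below is a hypothetical sketch, not the repository's actual implementation; it relies only on duck typing, so it needs no annotations at all.

```python
from collections import OrderedDict


def clone_params(src):
    """Annotation-free sketch: copy parameters from a module or a mapping.

    Hypothetical body for illustration. `src` may be an object exposing
    state_dict() (such as a torch.nn.Module) or a mapping of name -> tensor.
    """
    # Unwrap module-like objects into their name -> tensor mapping.
    params = src.state_dict() if hasattr(src, "state_dict") else src
    # Clone and detach each tensor so the copy is independent of the source.
    return OrderedDict(
        (name, value.clone().detach()) for name, value in params.items()
    )
```

Because nothing is subscripted at definition time, this form imports cleanly regardless of how the interpreter handles generic type hints.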

@Cmnotjx
Author

Cmnotjx commented Apr 20, 2023

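OK, thank you. I have another question: what do the intermediate parameters mentioned in the paper consist of, i.e. the ones the server uses to produce each client's local model parameters? Equation 5: client model parameters = intermediate parameters of all clients * the current client's alpha (aggregation matrix). Are these intermediate parameters of all clients an aggregate of the gradients produced by every client's local training, or each client's individual gradients?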

@KarhouTam
Owner

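In pFedLA, the server stores the model parameters of all clients. For example, $\theta^{l1}$ is the result of applying torch.stack() to the layer-1 parameters of all client models. Its size along the stacked dimension equals the length of $\alpha^{l1}$, so the two are multiplied directly to produce a new tensor. This new tensor records how it was computed from $\alpha^{l1}$ and $\theta^{l1}$, so PyTorch can compute the gradient of $\alpha^{l1}$ from it and thereby update the hypernetwork that generates this $\alpha$.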
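
The gradient flow described here can be written out explicitly. The indices below are notation assumed for this sketch rather than taken from the paper: $N$ clients, $\theta_j^{l1}$ for client $j$'s layer-1 parameters, $\alpha_{i,j}^{l1}$ for the $j$-th entry of client $i$'s $\alpha^{l1}$, and $\mathcal{L}_i$ for client $i$'s loss. It also reads the multiplication as a weighted sum over clients, which is an interpretation.

```latex
% Layer-1 parameters for client i: weighted sum over the stacked client parameters
\bar{\theta}_i^{l1} = \sum_{j=1}^{N} \alpha_{i,j}^{l1}\, \theta_j^{l1}

% Chain rule: the gradient w.r.t. each weight is an inner product with the
% corresponding client's parameters
\frac{\partial \mathcal{L}_i}{\partial \alpha_{i,j}^{l1}}
  = \left\langle \nabla_{\bar{\theta}_i^{l1}} \mathcal{L}_i,\; \theta_j^{l1} \right\rangle
```

Since $\alpha^{l1}$ is itself the output of the hypernetwork, automatic differentiation can continue the chain rule from this gradient back into the hypernetwork's own parameters.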

@zingjk

zingjk commented May 19, 2023

> Thank you very much for your reply. The bug has been fixed. I am also very curious how you reproduced the structure of the hypernetwork described in the pFedLA paper, since the original text does not give specific details. I think you are very talented!

How did you solve it? I changed versions and still keep getting this problem. Could you explain in detail?

@KarhouTam
Owner

Hi, @zingjk.

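If you run into environment problems, you could try using poetry to build a virtual environment. With poetry, all the dependencies you install will match mine, so version problems should not occur.

The README has a short tutorial on how to set up the environment with poetry.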

@zingjk

zingjk commented May 19, 2023

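> Hi, @zingjk.
>
> If you run into environment problems, you could try using poetry to build a virtual environment. With poetry, all the dependencies you install will match mine, so version problems should not occur.
>
> The README has a short tutorial on how to set up the environment with poetry.

I'll go try it right away. Thank you very much!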
