Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fedopt baseline #919

Merged
merged 73 commits into from Mar 2, 2022
Merged

Fedopt baseline #919

merged 73 commits into from Mar 2, 2022

Conversation

pedropgusmao
Copy link
Contributor

80% of it done

@danieljanes danieljanes removed their assignment Jan 5, 2022
baselines/pyproject.toml Outdated Show resolved Hide resolved
torch = "^1.10.1"
torchvision = "^0.11.2"
omegaconf = { git = "https://github.com/pedropgusmao/omegaconf.git" }
hydra-core = { git = "https://github.com/pedropgusmao/hydra.git", branch="main" }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hydra-core and omegaconf deps make the process of poetry install hang in an endless resolving dependencies loop. If i comment these two lines out, poetry install runs (but ofcourse these two deps are not installed)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nevermind. It eventually ended... I found it faster to do first poetry install without those two deps. Then add them back and do poetry update....

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In my machine, it takes 1205.3s, which I think is way too long. I will have to dig deeper on this issue, but I think it could be a different PR.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, maybe i'm thinking the Readme can be updated to say ("oh, this can take a while...")


You can run specific experiments by passing a `conf` subfolder and a given `strategy` as follows.
```sh
python main.py --config-path conf/cifar10 strategy=fedyogi
Copy link
Contributor

@jafermarq jafermarq Feb 19, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After followiing the instructions in flower/baselines/README.md, I'm unable to run this:

➜  baselines poetry install # I ran this earlier and all got installed
Installing dependencies from lock file

No dependencies to install or update

Installing the current project: flwr_baselines (0.17.0)
➜  baselines cd flwr_baselines/publications/adaptive_federated_optimization 
➜  adaptive_federated_optimization python main.py --config-path conf/cifar10 strategy=fedyogi
Traceback (most recent call last):
  File "main.py", line 5, in <module>
    import flwr as fl
ModuleNotFoundError: No module named 'flwr'

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was probably the commented requirement you spotted earlier. I ran it again and it worked. Feel free to try again if you want

Copy link
Contributor

@jafermarq jafermarq Feb 21, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the readme should be updated to indicate that after doing the poetry install one should do first poetry shell and then run the python command you have. I'd say let's assume nobody knows anything about virtualenvs or poetry. The error i had above is because i ran the python line exactly as instructed in the readme

Copy link
Contributor

@jafermarq jafermarq Feb 21, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also, the strategy=fedyogi doesn't seem to work (it runs but hits NaN loss almost immediately) so maybe let's put good old FedAvg as the first example line?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you using version 1.44 of grpcio ? This might be related to: ray-project/ray#22518

Copy link
Contributor

@jafermarq jafermarq Mar 1, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it seems poetry install now does pull grpcio 1.43. I ran the example and seems to be fine (also w/ FedYogi)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But i still stand to my previous message that it would be good to have some very minor instructions on how to setup the environment. If someone just comes to the readme of this example and tries to run the command in the readme, only error messages await. I'd say is enough to add a link to the top baselines readme (showing how to setup the env) and then update the readme here and add poetry run python3 main.py<...> instead of directly doing python main.py <..>

@pedropgusmao
Copy link
Contributor Author

@jafermarq, would you mind running poetry add grpcio==1.43 and then re-running the example (with FedYogi). To verify that that was the issue, please?

@jafermarq
Copy link
Contributor

jafermarq commented Mar 1, 2022

@pedropgusmao, I ran the example from scratch. Poetry pulls grpcio==1.43 and the example runs fine.

@danieljanes danieljanes enabled auto-merge (squash) March 2, 2022 14:09
@danieljanes danieljanes merged commit 85f80a9 into main Mar 2, 2022
@danieljanes danieljanes deleted the fedopt_baseline branch March 2, 2022 14:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request new baseline
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants