Skip to content

Soulios graphs#22

Merged
mai00fti merged 53 commits intoyigbt:masterfrom
soulios:soulios-graphs
Aug 31, 2023
Merged

Soulios graphs#22
mai00fti merged 53 commits intoyigbt:masterfrom
soulios:soulios-graphs

Conversation

@soulios
Copy link
Contributor

@soulios soulios commented May 17, 2023

No description provided.

@soulios soulios marked this pull request as draft May 17, 2023 09:21
Copy link
Contributor Author

@soulios soulios left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, I checked the files, it doesnt seem to be something wrong with them. Can someone review this? do I have to do anything else besides checking the files?

@bernt-matthias
Copy link
Contributor

There seem to be still conflicts: This branch has conflicts that must be resolved

My strategy would be the following:

  • checkout main and your branch in two different directories
  • manually diff the files listed as conflicts using a tool like meld and try to solve all conflicts (if needed try a merge of the branches .. which should show the conflicts .. if not even better .. then you can skip all this)

If needed I can help next week. Should be around Monday and Wednesday.

Have a nice git-free weekend :)

@soulios
Copy link
Contributor Author

soulios commented Jul 3, 2023

Can you assign it to me and put someone else as a reviewer? so I can resolve some conflicts myself.

@soulios soulios marked this pull request as draft July 11, 2023 10:42
@soulios soulios marked this pull request as ready for review July 11, 2023 10:48
@bernt-matthias
Copy link
Contributor

@soulios cool that there are no more conflicts!

I started the workflows. Linter and tests revealed some (hopefully) minor problems :) Could you take care of these? If I find some time I may add some comments earlier.

ping @halirutan maybe you can also add some first comments.

@mai00fti
Copy link
Collaborator

@soulios

flake and pytests checks have failed.
Please resolve respective issues

Copy link
Contributor

@bernt-matthias bernt-matthias left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add some test calls for the new functionality to the smoke tests defined here:

dfpl train -f example/train.json

From today's perspective, I would implement it as shell scripts in tests/ and in the pr.yaml file just loop over all those scripts and execute them. Might simplify it to create new tests.

dfpl/options.py Outdated
metavar="DIR",
help="The directory where the full model of the encoder will be saved (if trainAE=True) or "
"loaded from (if trainAE=False). Provide a full path here.",
help="The directory where the full model of the encoder will be saved...",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why shorten the help?

dfpl/options.py Outdated
metavar="BOOL",
type=bool,
help="Track training performance via Weights & Biases, see https://wandb.ai.",
metavar="STRING",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it intended that the type changed? This parameter and several others.

Copy link
Contributor Author

@soulios soulios left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what should I do here in the setup.py where I install my fork of chemprop with
"chemprop @ git+https://github.com/soulios/chemprop.git@1d73523e49aa28a90b74edc04aaf45d7e124e338"

@mai00fti mai00fti merged commit b059e3a into yigbt:master Aug 31, 2023
@bernt-matthias
Copy link
Contributor

Not sure if I understand why this has been merged yet.

what should I do here in the setup.py where I install my fork of chemprop with
"chemprop @ git+https://github.com/soulios/chemprop.git@1d73523e49aa28a90b74edc04aaf45d7e124e338"

The solution^{TM} is to create a proper pull request of your changes upstream!

I also have to note that this is a blocker conda and Galaxy deployment .. besides being incredibly hard to maintain .. I would say impossible (because someone needs to sync this with upstream changes of chemprop).

@soulios
Copy link
Contributor Author

soulios commented Aug 31, 2023

But if the user installs my version of chemprop at this point, any code will be running. Why should it be always the latest version of chemprop? Mine has a few extra functionalities that the original project does not. i.e tracking the training metrics. I have asked them before if they would like it and said no(in a private chat). So I dont think I can open a pr to them.or shall I and we can decide from there?

@bernt-matthias
Copy link
Contributor

But if the user installs my version of chemprop at this point, any code will be running.

True

Why should it be always the latest version of chemprop?

I'm just saying that it's quite an effort to get updates (like bug fixes and improvements) .. my pessimistic prediction would be that we will be stuck with this version for a long time.

Also, think about the wording in a publication. We can't just write we used chemprop version x.y, but we used a modified version of chemprop and describe in detail what has been changed?

I have asked them before if they would like it and said no(in a private chat). So I don't think I can open a pr to them.or shall I and we can decide from there?

Sure. But it needs to be well prepared. At the moment a few changes seem wrong (e.g. changes in the imports). It's also important to submit PR(s) that are as small/atomic as possible. It's probably a good idea to make all new functionality optional.

We also should think twice if at least some of the changes may be implemented externally (or if we can suggest a change of chemprop such that it can be implemented externally).

) -> (np.ndarray, np.ndarray):
# check the value counts and abort if too imbalanced
df: pd.DataFrame, target: str, opts: options.Options, return_dataframe: bool = False
) -> Union[Tuple[np.ndarray, np.ndarray], pd.DataFrame]:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This makes it a bit hard to use because one always needs to check if a DataFrame or a tuple has been returned.

Copy link
Contributor

@bernt-matthias bernt-matthias Aug 31, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also the function is not used correctly in vae.py: the use assumes a triple (and there is no seed parameter)

train_data, val_data, test_data = weight_split(
                df, sizes=(1 - opts.testSize, 0.0, opts.testSize), bias="small", seed=42
            )

wondering if this works at all.

Unfortunately, you completely ignored my request for a test.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In order to find such problems easily you can use mypy dfpl (note that some errors predate your PR :) )

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This makes it a bit hard to use because one always needs to check if a DataFrame or a tuple has been returned.

Well the retun_dataframe argument solves this ambiguity,no?

Also the function is not used correctly in vae.py: the use assumes a triple (and there is no seed parameter)

train_data, val_data, test_data = weight_split(
                df, sizes=(1 - opts.testSize, 0.0, opts.testSize), bias="small", seed=42
            )

wondering if this works at all.

Unfortunately, you completely ignored my request for a test.

I didnt know about mypy. On it!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well the retun_dataframe argument solves this ambiguity,no?

Indeed

Regarding weight_split .. somehow I mixed up the functions .. but the comment regarding mypy is still valid. But I think you only need to fix errors concerning your contributions.

@bernt-matthias
Copy link
Contributor

Not sure if I understand why this has been merged yet.

Just to clarify: For some reason (temporary problem at github ...), the automated testing did not run for the last commits prior to the merge. Thus the PR may have appeared successful, but it wasn't (at least tests are failing locally).

@bernt-matthias bernt-matthias mentioned this pull request Sep 1, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants