Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enable installing the python helpers with pip #96

Merged
merged 12 commits into from
Nov 20, 2020

Conversation

Luthaf
Copy link
Contributor

@Luthaf Luthaf commented Nov 17, 2020

Fix #94

One unresolved question is what should be the name of the package. I went with chemiscope, but this might create a bit of confusion with the npm/javascript chemiscope package. An alternative would be chemiscope-input, but then if/when we create a jupyter plugin for chemiscope it would have to live in a different package.

@Luthaf Luthaf requested a review from ceriottm November 17, 2020 15:39
@Luthaf
Copy link
Contributor Author

Luthaf commented Nov 17, 2020

@rosecers do you think you could give this a look?

@ceriottm
Copy link
Contributor

A few thoughts.

  1. This is important, and an under-developed part of chemiscope. We should make it as easy as possible to generate chemiscopes as part of a ML analysis workflow. As far as I understand, currently the workflow goes through creating an ASE frames that store the various properties - it is unclear to me if this allows exploiting fully the features of chemiscope, see e.g. the definition of units or property descriptions
  2. Something like chemiscope-python, for the package name ?

@Luthaf
Copy link
Contributor Author

Luthaf commented Nov 17, 2020

As far as I understand, currently the workflow goes through creating an ASE frames that store the various properties

That's only true for the command line version. The function supports an extra argument (should probably be renamed to properties) to specify arbitrary properties, including the newly added description and units. I've also started to decouple these scripts from ASE to allow for other structure/properties backends.

We should make it as easy as possible to generate chemiscopes as part of a ML analysis workflow

Completely agree, but I'm not sure we want to tackle this in this repository, and definitively not in this PR =)

This was the goal of our initial idea with @rosecers for the chemistack proposal, and seems to be related to what I understand M-stack would be. The issue here is that there are a lot of different ways one could want to create the map part of chemiscope. While at COSMO we use SOAP/librascal + and a few different mapping algorithms, there is no reason to marry chemiscope with these specific descriptors and mapping algorithms if we want it to be used by the widest variety researchers.

We could create a prototype chemistack easily using notebooks, @rosecers already has one prototype using matplotlib to display the map. This also links into the FlowingSOAP idea at materials cloud. Overall I think we should have a meeting/brainstorming session to decide what we want to do here.

There is also the ASAP tool to so something like this, so we could also help and work with them instead of re-inventing our own wheel.

Something like chemiscope-python, for the package name ?

I think I prefer having the same name in pip and in code (so pip install XXX and then from XXX import), and from chemiscope_python import write_input feels a bit strange. Overall I'm not opposed to chemiscope as a name for the Python package as well, do you think there could be confusion with the npm package with the same name?

@ceriottm
Copy link
Contributor

We should make it as easy as possible to generate chemiscopes as part of a ML analysis workflow

Completely agree, but I'm not sure we want to tackle this in this repository, and definitively not in this PR =)

This PR I agree, this repo we should. I don't mean that the analysis should be part of this repo, but that we should make converting the output of the analysis into a chemiscope as seamless as possible. So M-stack is a separate discussion, here I just wanted to stress that the process of getting an xyz and an analysis output into a chemiscope could be easier.

Something like chemiscope-python, for the package name ?

I think I prefer having the same name in pip and in code (so pip install XXX and then from XXX import), and from chemiscope_python import write_input feels a bit strange. Overall I'm not opposed to chemiscope as a name for the Python package as well, do you think there could be confusion with the npm package with the same name?

My thinking is probably less informed than yours. To me it's not the end of the world to have a pip chemiscope that differs from npm chemiscope but perhaps someone better informed has discussed the case and come up with good practices?

Read the json file to access it in setup.py, and use setuptools
to set chemiscope.__version__ accordingly
@Luthaf
Copy link
Contributor Author

Luthaf commented Nov 18, 2020

I don't mean that the analysis should be part of this repo, but that we should make converting the output of the analysis into a chemiscope as seamless as possible. So M-stack is a separate discussion, here I just wanted to stress that the process of getting an xyz and an analysis output into a chemiscope could be easier.

Oh, right 😄. Completely agree with this, do you have specific pain points that could be improved? Or an ideal workflow that we could try to enable?

By analysis output, do you mean a in-memory numpy array, or some data save on the disk?

@ceriottm
Copy link
Contributor

ceriottm commented Nov 18, 2020 via email

Copy link
Collaborator

@rosecers rosecers left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This in general looks good -- there were a few places that I have questions and suggestions, but there's nothing prohibitive.

python/chemiscope/adapters.py Outdated Show resolved Hide resolved
python/chemiscope/adapters.py Outdated Show resolved Hide resolved
python/chemiscope/adapters.py Outdated Show resolved Hide resolved
python/chemiscope/input.py Outdated Show resolved Hide resolved
Comment on lines +59 to +61
parser.add_argument(
"--description", default="", type=str, help="description of the dataset"
)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we also make it possible to load the description from a text file? Seems likely that this will be a very long argument.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The command line script is more for quick & dirty JSON generation. I would rather make people use the python function directly instead if the command line script is not enough for them.

Comment on lines +76 to +77
if args.only_structures and args.cutoff is not None:
raise Exception("--only-structure can not be given with --cutoff")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be a warning, rather than an exception?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why? It is bogus, there is no obvious way to recover (should cutoff or only-structure take precedence?), and the user can correct this easily by changing the command line invocation.

@Luthaf
Copy link
Contributor Author

Luthaf commented Nov 18, 2020

One thing that often bugs me is that I want to drop info from the ASE file so there could also be a switch that allows you to drop those fields.

I think I get the idea, that should already be possible within python with del frame.info["whatever"] or del frame.arrays["whatever"]. We could add something like this to the script. I'll open a separate issue to discuss this, so that it does not ends up buried in the PR comments.

@Luthaf
Copy link
Contributor Author

Luthaf commented Nov 19, 2020

So I went a bit ahead and added unit tests for most of the python code. This should help to improve it later, but makes this PR a bit large. I'll try not to add anything else to it =)

@Luthaf

This comment has been minimized.

python/chemiscope/adapters.py Outdated Show resolved Hide resolved
python/chemiscope/adapters.py Show resolved Hide resolved
python/chemiscope/input.py Show resolved Hide resolved
python/chemiscope/input.py Show resolved Hide resolved
python/chemiscope/input.py Show resolved Hide resolved
python/chemiscope/adapters.py Show resolved Hide resolved
python/chemiscope/adapters.py Show resolved Hide resolved
python/chemiscope/adapters.py Show resolved Hide resolved
python/chemiscope/adapters.py Outdated Show resolved Hide resolved
python/chemiscope/adapters.py Show resolved Hide resolved
@rosecers rosecers self-requested a review November 20, 2020 14:34
@Luthaf
Copy link
Contributor Author

Luthaf commented Nov 20, 2020

Everything should be addressed!

Copy link
Collaborator

@rosecers rosecers left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me. On the record, I'd like us to revisit in the future the decision to omit unexpected/incomplete entries.

@Luthaf Luthaf merged commit 3b61a37 into lab-cosmo:master Nov 20, 2020
@Luthaf
Copy link
Contributor Author

Luthaf commented Nov 20, 2020

On the record, I'd like us to revisit in the future the decision to omit unexpected/incomplete entries.

As soon as we have support for them (#38), I would agree! See also #97 (comment) on this subject.

@Luthaf Luthaf deleted the pip-install branch November 20, 2020 17:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

install, and easy import of write_chemiscope_input utils
3 participants