Visualization Pass #228

Merged
merged 22 commits into from
Mar 24, 2019

dgasmith
Contributor

@dgasmith dgasmith commented Mar 24, 2019

Description

This is a first pass at Dataset visualization, which plots statistics in Jupyter notebooks. We picked the Plotly graph library because it produces nice interactive graphs and also has the Dash library built on top of it, which will let us build exploration web apps with the same code.

Bar plot example:
image

Violin plot example:
image (1)
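
For readers without the images, here is a minimal sketch of the two plot types, not the code added in this PR. It relies on the fact that Plotly accepts a figure as a plain dictionary with `data` and `layout` keys. The helper names (`bar_figure`, `violin_figure`, `show`), the method labels, and the error values are all made up for illustration.

```python
import statistics


def bar_figure(errors_by_method, title):
    """Build a Plotly-style bar figure (a plain dict) of mean unsigned errors."""
    methods = sorted(errors_by_method)
    mue = [statistics.mean(abs(e) for e in errors_by_method[m]) for m in methods]
    return {
        "data": [{"type": "bar", "x": methods, "y": mue}],
        "layout": {"title": {"text": title}},
    }


def violin_figure(errors_by_method, title):
    """Build a Plotly-style violin figure with one trace per method."""
    traces = [
        {"type": "violin", "y": list(errors_by_method[m]), "name": m}
        for m in sorted(errors_by_method)
    ]
    return {"data": traces, "layout": {"title": {"text": title}}}


def show(fig):
    """Render a figure dict, e.g. in a notebook (assumes plotly is installed)."""
    import plotly.io as pio

    pio.show(fig)


# Made-up signed errors, purely illustrative.
errors = {"B3LYP": [0.5, -1.5, 1.0], "MP2": [0.2, -0.4, 0.3]}
bar = bar_figure(errors, "Mean unsigned error")
violin = violin_figure(errors, "Error distribution")
```

Calling `show(bar)` or `show(violin)` in a notebook with Plotly installed should render the interactive figure. The same dictionaries can be handed to a Dash graph component, which is the reuse the description refers to.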

Other items:

  • Dataset.get_history() will now automatically perform a variety of queries to pull down all matching data.
  • Canonicalizes the names of the Dataset's internal DataFrames with a single canonicalization function.

Random bits:

  • qcfractal-manager config top level no longer accepts additional arguments.
  • Minor fixes to the server, which now includes a start-periodic flag as an optional argument.
  • Fixes a TorsionDriveDataset query index issue (@ChayaSt).
  • Provides the ability to filter submissions in TorsionDriveDataset.
  • Datasets now have a canonical unit, which can be shown by Dataset.units. For example, Dataset.units="eV" would change all internal units to electron volts. Fixes Dataset DataFrame units #208.
  • Removes OpenFF provenance information until we have a clear use case; we will bring it back then. Fixes OpenFFWorkflow Fragment Provenance Information #126.
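
The canonical-unit behavior described above can be sketched as a property whose setter rescales the stored values. This is an illustration under assumptions, not the QCPortal implementation: `ToyDataset`, its `values` attribute, and the supported unit names are hypothetical. The conversion factors are the CODATA 2018 values for one hartree.

```python
# Conversion factors from hartree (CODATA 2018 values).
_FROM_HARTREE = {
    "hartree": 1.0,
    "eV": 27.211386245988,
    "kcal/mol": 627.5094740631,
}


class ToyDataset:
    """Hypothetical stand-in for a Dataset that stores energies in one unit."""

    def __init__(self, values, units="hartree"):
        if units not in _FROM_HARTREE:
            raise KeyError(f"Unknown unit {units!r}.")
        self._units = units
        self.values = dict(values)

    @property
    def units(self):
        """The unit all stored values are currently expressed in."""
        return self._units

    @units.setter
    def units(self, new_units):
        if new_units not in _FROM_HARTREE:
            raise KeyError(f"Unknown unit {new_units!r}.")
        # Rescale every stored value from the old unit to the new one.
        factor = _FROM_HARTREE[new_units] / _FROM_HARTREE[self._units]
        self.values = {key: value * factor for key, value in self.values.items()}
        self._units = new_units


ds = ToyDataset({"water": -76.0})
ds.units = "eV"  # all stored values are now in electron volts
```

Doing the conversion in the setter keeps the stored data and the reported unit in sync, so later statistics or plots never mix units.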

Status

  • Changelog updated
  • Ready to go

@dgasmith dgasmith added the Interface Related to the Interface layer (QCPortal) label Mar 24, 2019
@dgasmith dgasmith added this to the v0.6.0 milestone Mar 24, 2019
@Lnaden
Collaborator

Lnaden commented Mar 24, 2019

Nice! I'll be able to get a proper review first thing in the morning. What outstanding things do you have for it?

@QCArchiveBot
Collaborator

This pull request introduces 2 alerts when merging b8c5dfd into 9a3aef5 - view on LGTM.com

new alerts:

  • 1 for Missing named arguments in formatting call
  • 1 for __init__ method calls overridden method

Comment posted by LGTM.com

@dgasmith
Contributor Author

Lots of things really, but good enough to get in.

@codecov

codecov bot commented Mar 24, 2019

Codecov Report

Merging #228 into master will decrease coverage by 16.23%.
The diff coverage is 29.74%.

@codecov

codecov bot commented Mar 24, 2019

Codecov Report

Merging #228 into master will decrease coverage by 0.24%.
The diff coverage is 82.22%.

@dgasmith dgasmith force-pushed the visualization branch 2 times, most recently from ba458df to 69a4957 Compare March 24, 2019 02:30
@QCArchiveBot
Collaborator

This pull request fixes 2 alerts when merging af42704 into 9a3aef5 - view on LGTM.com

fixed alerts:

  • 2 for Unused local variable

Comment posted by LGTM.com

Collaborator

@Lnaden Lnaden left a comment

This feature is awesome!

I have one question about the save method and when it should or should not be used. I suspect it's something I missed in how TorsionDrive works, but I want to make sure.

as_array : bool, optional
    Converts the returned values to NumPy arrays
force : bool, optional
    Forces a requery if data is already present

Returns
-------
success : bool
Collaborator

This is now a str.

@@ -80,6 +80,7 @@ def add_specification(self,
        spec = TorsionDriveSpecification(
            name=lname, optimization_spec=optimization_spec, qc_spec=qc_spec, description=description)
        self.data.td_specs[lname] = spec
        self.save()
Collaborator

Why does the TDDataset's add_ method here call a save back to the Fractal server? This seems to differ from the other Collection classes, which don't invoke save until it's manually called or compute is called.

Contributor Author

Yes, Procedure Datasets track their data by ObjectId because our querying is less exact for procedures. Result Datasets only need the molecule ObjectId to find the exact result they need.

@@ -141,14 +142,18 @@ def add_entry(self,
            raise KeyError(f"Record {name} already in the dataset.")

        self.data.records[lname] = record
        self.save()
Collaborator

Same question about save here in a get method

Contributor Author

This is an add_entry, so we must call save here to update the record. It is a slower but safer version of Dataset add_entry; we do not expect tens of thousands of entries here.
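
The trade-off discussed in this thread can be sketched with two toy collections: one that defers the server round trip until save is called explicitly, and one that saves on every add_entry. Everything here is hypothetical (`FakeServer`, `DeferredCollection`, `EagerCollection`); the real Collection classes and the Fractal server API are not reproduced.

```python
class FakeServer:
    """In-memory stand-in for a server; counts how often it is written to."""

    def __init__(self):
        self.stored = {}
        self.save_calls = 0

    def save(self, name, records):
        self.stored[name] = dict(records)
        self.save_calls += 1


class DeferredCollection:
    """Entries stay local until save() is called explicitly."""

    def __init__(self, name, server):
        self.name = name
        self.server = server
        self.records = {}

    def add_entry(self, name, record):
        self.records[name.lower()] = record

    def save(self):
        self.server.save(self.name, self.records)


class EagerCollection(DeferredCollection):
    """Every add_entry is pushed to the server right away (slower, safer)."""

    def add_entry(self, name, record):
        lname = name.lower()
        if lname in self.records:
            raise KeyError(f"Record {name} already in the dataset.")
        self.records[lname] = record
        self.save()  # one server round trip per entry


server = FakeServer()
results = DeferredCollection("results", server)
results.add_entry("Water", "mol-id-1")  # nothing is sent to the server yet

torsions = EagerCollection("torsions", server)
torsions.add_entry("Butane", "td-id-1")  # the server copy is updated immediately
```

The eager variant costs one write per entry, which is acceptable when a collection holds a modest number of entries and a lost record id would be hard to recover by querying.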

@dgasmith
Contributor Author

Everything is fine here except for the minor return type issue. Please approve and merge; I will fix the string issue in another PR.

@Lnaden Lnaden merged commit a1c257f into MolSSI:master Mar 24, 2019
@dgasmith dgasmith deleted the visualization branch June 28, 2019 17:00
Labels
Interface Related to the Interface layer (QCPortal)

Successfully merging this pull request may close these issues.

Dataset DataFrame units (#208)
OpenFFWorkflow Fragment Provenance Information (#126)
3 participants