-
Notifications
You must be signed in to change notification settings - Fork 157
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Graph Profiler: save() and load() Functionality #628
Conversation
with open(filepath, "rb") as infile: | ||
data = pickle.load(infile) | ||
|
||
profiler = cls(None, options=None) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just any class? should there not be some logic around loading the correct class for the profiler?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cls is the class it is loading within, this is same logic in BaseProfiler which will allow us to merge all 3 profilers in the long term.
|
||
import networkx as nx | ||
import numpy as np | ||
import pandas as pd | ||
import pickle |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pickle is built-in so I think should be within the top imports
"_continuous_distribution": self._continuous_distribution, | ||
"_categorical_distribution": self._categorical_distribution, | ||
"metadata": self.metadata, | ||
"__calculations": self.__calculations, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need __calculations bc we aren't saving options, is that the reason?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
don't see options used anywhere...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good call, we don't
save_profile.profile["continuous_distribution"]["weight"].pop("properties") | ||
load_profile.profile["continuous_distribution"]["weight"].pop("properties") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should also assert these props are the same.
e.g. above:
save_continuous_distribution_props = profile.profile["continuous_distribution"]["weight"].pop("properties")
# Then can (unsure if it is a dict or list or what format.
self.assertEqual(save_continuous_distribution_props, load_continuous_distribution_props)
… in actual graph data to ensure load and save work correctly
74e039a
to
821cf3b
Compare
@@ -2,17 +2,17 @@ | |||
[Great Expectations](https://greatexpectations.io/) is an open source tool built around the idea that should always know what to expect from your data. It helps data teams eliminate pipeline debt, through data testing, documentation, and profiling. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pre-commit formatting
@@ -183,4 +183,4 @@ | |||
}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pre-commit formatting
@@ -291,4 +291,4 @@ | |||
}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pre-commit formatting
@@ -300,4 +300,4 @@ | |||
}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pre-commit formatting
@@ -300,4 +300,4 @@ | |||
}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pre-commit formatting
@@ -23,7 +25,6 @@ def __init__(self, data, options=None): | |||
Initialize Graph Profiler. | |||
|
|||
:param data: data |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
:param data: data | |
:param data: data | |
:type name: String |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this should be: one of these two types:
data_readers.graph_data.GraphData or nx.Graph
@@ -23,7 +25,6 @@ def __init__(self, data, options=None): | |||
Initialize Graph Profiler. | |||
|
|||
:param data: data |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this should be: one of these two types:
data_readers.graph_data.GraphData or nx.Graph
save()
andload()
for theGraphProfiler