Added Interestingness Scoring for Colored Bar and Line charts #59
Added scoring functions for skew, kurtosis, and number of peaks
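For readers unfamiliar with these measures, here is a minimal standard-library sketch of what skew, kurtosis, and peak-count scores can look like. The function names `skewness`, `excess_kurtosis`, and `count_peaks` are hypothetical and are not taken from this PR; the actual implementation may use a statistics library and different definitions.

```python
def _central_moment(values, order):
    """Population central moment of the given order."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** order for v in values) / len(values)


def skewness(values):
    """Third standardized moment; 0.0 for a symmetric or constant series."""
    m2 = _central_moment(values, 2)
    if m2 == 0:
        return 0.0
    return _central_moment(values, 3) / m2 ** 1.5


def excess_kurtosis(values):
    """Fourth standardized moment minus 3 (a normal distribution scores 0)."""
    m2 = _central_moment(values, 2)
    if m2 == 0:
        return 0.0
    return _central_moment(values, 4) / m2 ** 2 - 3.0


def count_peaks(values):
    """Number of interior points strictly greater than both neighbours."""
    return sum(
        1
        for i in range(1, len(values) - 1)
        if values[i - 1] < values[i] > values[i + 1]
    )


# Example: score the aggregated y-values of a bar or line chart.
heights = [1, 3, 2, 5, 4]
scores = (skewness(heights), excess_kurtosis(heights), count_peaks(heights))
```

Keeping each measure in its own function lets a caller pick the score that suits the chart type, which is the general idea behind modular scoring.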
…o Interestingness
Also added test for this case in test_interestingness.py
Fixed bug where vis' stats and metadata were not being calculated in a specific case
Fixed bug in the Pandas executor where colored bar/line chart metadata was not being calculated. This also added roughly 1.5 seconds of run time on the census data.
This seems to be causing an issue in
@dorisjlee The error in the build check was due to the metadata of certain Vis objects not being calculated. In that case, vis.data.cardinality was None, which caused the key error. I've fixed this issue and the test runs fine locally, but I'm not sure why the build isn't working properly after I pushed my changes.
Updated PandasExecutor to recompute stats and metadata only for colored charts, since non-colored charts do not need this data to compute interestingness scores. Reverted test_performance to the previous version since performance improved.
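The guard described above can be sketched as follows. This is an illustration only: `ViewData`, `compute_metadata`, and `ensure_metadata` are hypothetical stand-ins, not the project's classes or functions. Only the condition itself (cardinality is None and the chart has a color channel) comes from the diff in this PR.

```python
class ViewData:
    """Hypothetical stand-in for view.data; rows are plain dicts."""

    def __init__(self, rows):
        self.rows = rows
        self.cardinality = None  # metadata has not been computed yet

    def compute_metadata(self):
        # Cardinality here means the number of distinct values per column.
        columns = list(self.rows[0].keys()) if self.rows else []
        self.cardinality = {
            col: len({row[col] for row in self.rows}) for col in columns
        }


def ensure_metadata(data, has_color):
    """Recompute metadata only when it is missing and a color channel needs it.

    Returns True if a recomputation happened.
    """
    if data.cardinality is None and has_color:
        data.compute_metadata()
        return True
    return False


data = ViewData([{"year": 2019, "sex": "F"}, {"year": 2020, "sex": "F"}])
ensure_metadata(data, has_color=False)  # skipped: non-colored charts do not need it
ensure_metadata(data, has_color=True)   # computes cardinality once
```

Skipping the recomputation for non-colored charts is what avoids the extra run time in the common case.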
@@ -200,6 +200,11 @@ def execute_aggregate(view: Vis,isFiltered = True):
    for col in columns[1:]:
        view.data[col] = view.data[col].fillna(0)
    assert len(list(view.data[groupby_attr.attribute])) == len(all_attr_vals), f"Aggregated data missing values compared to original range of values of `{groupby_attr.attribute}`."
#need to compute the statistics and metadata for the view's data if no new rows were added
else:
    if view.data.cardinality is None and has_color:
Is there a reason why we need to recompute the metadata here?
Merging this in for now; will see if the new metadata refresh changes fix this.
…xecutor by explicitly passing _metadata from GroupBy object to DataFrame view.data
This issue is fixed after we pull in the latest code for metadata maintenance. I tried to remove the hack on lines 203-205 in PandasExecutor.py, but it was still required. This is happening because when we do
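The fix named in the commit title (explicitly passing `_metadata` from the source object to the aggregated result) follows a general pattern that can be sketched without pandas. Everything below is a hypothetical stand-in: `Frame`, `aggregate_by`, and `copy_metadata` are illustrative names, and the assumption that the aggregated result starts without its parent's metadata is made for the sake of the example, not taken from the PR.

```python
class Frame:
    """Hypothetical table type that lists its custom attributes in _metadata."""

    _metadata = ["cardinality", "data_type"]

    def __init__(self, rows):
        self.rows = rows
        for name in self._metadata:
            setattr(self, name, None)

    def aggregate_by(self, key, value):
        """Sum `value` per `key`; the result starts with empty metadata."""
        totals = {}
        for row in self.rows:
            totals[row[key]] = totals.get(row[key], 0) + row[value]
        return Frame([{key: k, value: v} for k, v in totals.items()])


def copy_metadata(source, target):
    """Explicitly carry every attribute named in _metadata over to target."""
    for name in source._metadata:
        setattr(target, name, getattr(source, name))
    return target


src = Frame([{"k": "a", "v": 1}, {"k": "a", "v": 2}, {"k": "b", "v": 5}])
src.cardinality = {"k": 2}
result = copy_metadata(src, src.aggregate_by("k", "v"))
```

In pandas, `_metadata` is the class attribute a DataFrame subclass uses to declare custom properties; whether a given operation carries them over automatically is something to verify against the installed version.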
…g#59)
* Modular Scores
  Added scoring functions for skew, kurtosis, and number of peaks
* Correlation, Mutual Information, Skew
* Removing old unused files
* Added Interestingness Scoring for Colored Bar and Line charts
  Also added test for this case in test_interestingness.py
* Bug fix Pandas Executor
  Fixed bug where vis' stats and metadata were not being calculated in a specific case
* Updated PandasExecutor
  Updated PandasExecutor to recompute stats and metadata for colored charts since non-colored charts do not require this data to compute interestingness scores. Reverted test_performance to previous version since performance was improved
Former-commit-id: b04472d
…PandasExecutor by explicitly passing _metadata from GroupBy object to DataFrame view.data Former-commit-id: eaed32a
Added a test case to the test_interestingness_1_2_0 function in test_interestingness.py to address issue #52.