Better support for Pandas.Series #66

Closed
dorisjlee opened this issue Aug 18, 2020 · 1 comment
Labels: bug (Something isn't working), hard (issues that are non-trivial and hard to support), priority (high priority tasks, for dev)

Comments

@dorisjlee
Member

We should create a LuxSeries object to represent the sliced version of a LuxDataFrame, following the pandas guidelines for subclassing DataFrames. We need to pass the _metadata from LuxDataFrame to LuxSeries so that it is preserved across operations (and therefore doesn't need to be recomputed); this is related to #65. Currently, this code is commented out because LuxSeries causes issues that the original pd.Series does not.

import pandas as pd


class LuxDataFrame(pd.DataFrame):
    # ....
    @property
    def _constructor(self):
        return LuxDataFrame

    @property
    def _constructor_sliced(self):
        # Return a factory so each sliced column is built as a LuxSeries
        # and then finalized against this frame to inherit its metadata.
        def f(*args, **kwargs):
            # adapted from https://github.com/pandas-dev/pandas/issues/13208#issuecomment-326556232
            return LuxSeries(*args, **kwargs).__finalize__(self, method='inherit')
        return f


class LuxSeries(pd.Series):
    # _metadata = ['name', '_intent', 'data_type_lookup', 'data_type',
    #              'data_model_lookup', 'data_model', 'unique_values', 'cardinality',
    #              'min_max', 'plot_config', '_current_vis', '_widget', '_recommendation']
    def __init__(self, *args, **kw):
        super().__init__(*args, **kw)

    @property
    def _constructor(self):
        return LuxSeries

    @property
    def _constructor_expanddim(self):
        from lux.core.frame import LuxDataFrame
        # def f(*args, **kwargs):
        #     # adapted from https://github.com/pandas-dev/pandas/issues/13208#issuecomment-326556232
        #     return LuxDataFrame(*args, **kwargs).__finalize__(self, method='inherit')
        # return f
        return LuxDataFrame

In particular, the original name property of the Series is lost when we implement LuxSeries; see test_pandas.py:test_df_to_series for more details.
Example:

import pandas as pd
import lux

df = pd.read_csv("lux/data/car.csv")
df._repr_html_()
series = df["Weight"]
series.name  # returns None (BUG!), expected "Weight"
series.cardinality  # preserved
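
A likely explanation (an assumption, not confirmed in this thread) is that `__finalize__` copies metadata attributes from the parent frame onto the new Series, and when `name` is among them the frame has no such attribute, so the Series name is replaced with None. Below is a minimal sketch of one possible workaround: let pandas set the name as usual and copy only the custom fields by hand. `TaggedFrame`, `TaggedSeries`, and the `tag` field are hypothetical stand-ins for LuxDataFrame, LuxSeries, and Lux's metadata; this is not the fix that was actually merged.

```python
import pandas as pd


class TaggedSeries(pd.Series):
    # Hypothetical custom field, added on top of pandas' own Series metadata.
    _metadata = pd.Series._metadata + ["tag"]
    tag = None  # class-level default so the attribute always exists

    @property
    def _constructor(self):
        return TaggedSeries

    @property
    def _constructor_expanddim(self):
        return TaggedFrame


class TaggedFrame(pd.DataFrame):
    _metadata = ["tag"]
    tag = None

    @property
    def _constructor(self):
        return TaggedFrame

    @property
    def _constructor_sliced(self):
        def build(*args, **kwargs):
            # pandas supplies (or later sets) the column name itself,
            # so only the custom field is copied from the parent frame.
            series = TaggedSeries(*args, **kwargs)
            series.tag = self.tag
            return series
        return build


df = TaggedFrame({"Weight": [3504, 3693, 3436], "Origin": ["USA", "USA", "USA"]})
df.tag = "cars"
series = df["Weight"]
print(type(series).__name__, series.name, series.tag)
```

The design choice here is to avoid calling `__finalize__` inside the factory, so nothing can overwrite the name that pandas assigns to the sliced column.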

We should also add a repr to print out the basic histogram for Series objects.
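
As a rough illustration of what such a repr could print, here is a text-only sketch built on `numpy.histogram`. The helper name `text_histogram` and its `bins` and `width` parameters are hypothetical, and Lux itself may well render something richer (for example a widget); this only shows the basic idea for a numeric Series.

```python
import numpy as np
import pandas as pd


def text_histogram(series: pd.Series, bins: int = 5, width: int = 20) -> str:
    """Render a numeric Series as a plain-text histogram (hypothetical helper)."""
    values = series.dropna().to_numpy(dtype=float)
    counts, edges = np.histogram(values, bins=bins)
    peak = max(int(counts.max()), 1)  # avoid dividing by zero for empty input
    lines = []
    for count, left, right in zip(counts, edges[:-1], edges[1:]):
        # Scale each bar relative to the fullest bin.
        bar = "#" * int(round(width * int(count) / peak))
        lines.append(f"{left:.1f} to {right:.1f} | {bar:<{width}} {int(count)}")
    return "\n".join(lines)


weights = pd.Series([3504, 3693, 3436, 3433, 3449, 4341, 4354], name="Weight")
print(text_histogram(weights, bins=3))
```

A Series subclass could call a helper like this from its own `__repr__` override, falling back to the default pandas repr for non-numeric data.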

dorisjlee added the bug, priority, and hard labels Aug 18, 2020
dorisjlee added a commit that referenced this issue Aug 18, 2020
westernguy2 pushed a commit to westernguy2/lux that referenced this issue Sep 2, 2020
@dorisjlee
Member Author

Support for pandas Series is much more robust now with these changes. Thanks, @westernguy2!
