Use pandas' _constructor for metadata bookkeeping #7

Open
thvitt opened this issue Oct 10, 2020 · 0 comments

thvitt commented Oct 10, 2020

Since (I guess?) 0.16, pandas has supported a _constructor property on DataFrame and Series subclasses, which pandas uses whenever it needs to create a new instance of one of its data types. We could use this to keep track of our metadata more easily and more robustly than with the current manual implementation, where e.g. corpus / scalar always creates a plain DataFrame and we have to copy the metadata by hand. It could be implemented along these lines:

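import pandas as pd

class Corpus(pd.DataFrame):

    # attribute names that pandas treats as subclass metadata
    _metadata = ['metadata']

    @property
    def _constructor(self):
        # pandas calls this whenever it derives a new object from this one
        def constructor(*args, **kwargs):
            result = Corpus(*args, **kwargs)
            result.metadata.update_from(self)
            return result
        return constructor

    ...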

This would require getting rid of much of the existing conversion and copying code, though.
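
To make the idea concrete, here is a minimal, self-contained sketch of the pattern. It is not our Corpus class: the name TaggedFrame, the plain dict used for metadata, and the metadata keyword argument are all assumptions made for illustration, and the dict stands in for whatever update_from would do in the real implementation.

```python
import pandas as pd


class TaggedFrame(pd.DataFrame):
    """Hypothetical stand-in for Corpus: a DataFrame that carries a metadata dict."""

    # Declares 'metadata' as a subclass attribute, so pandas stores it as a
    # normal attribute and takes it into account in __finalize__.
    _metadata = ["metadata"]

    def __init__(self, *args, metadata=None, **kwargs):
        super().__init__(*args, **kwargs)
        # Assumption: metadata is a plain dict; the real class would use its own type.
        self.metadata = dict(metadata or {})

    @property
    def _constructor(self):
        # pandas calls this when an operation yields a new 2D result.
        # The closure hands the current metadata on to the new object.
        def constructor(*args, **kwargs):
            return TaggedFrame(
                *args, metadata=getattr(self, "metadata", None), **kwargs
            )

        return constructor


corpus = TaggedFrame(
    {"a": [2.0, 4.0], "b": [6.0, 8.0]},
    metadata={"language": "de"},
)

# Arithmetic goes through _constructor, so the result should keep
# both the subclass type and the metadata.
scaled = corpus / 2
```

With this in place, scaled should be a TaggedFrame that still carries the language entry, without any explicit copying at the call site. Whether every pandas operation we rely on goes through _constructor would still need to be checked against our test suite.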

thvitt self-assigned this Oct 10, 2020
thvitt added a commit that referenced this issue Oct 10, 2020
This includes the warning from #6, although the underlying task of
refactoring the inheritance stuff still needs to be tackled (#7).