Navigation Menu

Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Separate streaming and database streaming. Python 3-ify #49

Merged
merged 3 commits into from Aug 17, 2015

Conversation

mdeland
Copy link
Contributor

@mdeland mdeland commented Aug 15, 2015

This would be a breaking change since it separates text.streamers from text.database_streamers. The difference is that people can use rosetta.text without having to have database dependencies like pymongo and MySQL-python (the latter of which requires a mysql client dependency, which is kind of annoying to carry around if you aren't using mysql). This would partially address people's install issues (e.g. #48)

The other changes are so that rosetta installs using into python3 environments. There are a couple slightly dirty solutions here, implemented by catching ImportError, but for the most part rosetta is python3 compatible so it might as well work there.

@dkrasner
Copy link
Contributor

this looks good to me; @langmore - thoughts?

it might also make sense to move MySQLStreamer to work with sqlAlchemy, much like pandas.io.sql does; this would solve the compatibility issues with python 3, dependency on mysql from MySQL-python, and could very well be a lot nicer to work with.

@@ -1,11 +1,16 @@
"""
Common functions/classes for dataprep.
"""
from __future__ import print_function
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe also import

division, and unicode_literals

@langmore
Copy link
Contributor

Thanks for updating things. I like removing the dependencies. If a file split is the only way, then I think it's worth it.

@dkrasner
Copy link
Contributor

yeah i think a file split is the easiest/cleanest; we can also try/except import within the classes themselves (for example: import MySQLdb during MySQLStreamer__init__()) but this is prob easiest, even if it screws up some rosetta import in peoples' code here and there.

@langmore if you are ok with this I'll merge and I think we are due for a release

@langmore
Copy link
Contributor

I'm OK with it.

On Mon, Aug 17, 2015, 4:42 AM dkrasner notifications@github.com wrote:

yeah i think a file split is the easiest/cleanest; we can also try/except
import within the classes themselves (for example: import MySQLdb during
MySQLStreamer__init__()) but this is prob easiest, even if it screws up
some rosetta import in peoples' code here and there.

@langmore https://github.com/langmore if you are ok with this I'll
merge and I think we are due for a release


Reply to this email directly or view it on GitHub
#49 (comment)
.

dkrasner added a commit that referenced this pull request Aug 17, 2015
Separate streaming and database streaming.  Python 3-ify
@dkrasner dkrasner merged commit f405420 into columbia-applied-data-science:master Aug 17, 2015
@dkrasner dkrasner deleted the stream-module branch August 17, 2015 11:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants