Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Maybe drop pandas #35

Closed
sckott opened this issue Feb 16, 2016 · 22 comments
Closed

Maybe drop pandas #35

sckott opened this issue Feb 16, 2016 · 22 comments
Milestone

Comments

@sckott
Copy link
Owner

sckott commented Feb 16, 2016

very heavy dependency, and probably more idiomatic python to just return dicts

@sckott sckott modified the milestone: v0.5 Feb 16, 2016
@sckott
Copy link
Owner Author

sckott commented Mar 9, 2016

thoughts @ethanwhite ?

@ethanwhite
Copy link

I think just returning dicts is good. Easy enough to convert to dataframes if the format matches what Pandas takes.

@sckott
Copy link
Owner Author

sckott commented Mar 9, 2016

thanks! will do then

@ryneches
Copy link

Awe. But I like Pandas!

@sckott
Copy link
Owner Author

sckott commented Oct 20, 2016

doesn't it seem like a heavy dependency though?

@ryneches
Copy link

It is, darn it. Is the use-case of pytaxize without Pandas particularly appealing? As long as it still plays nice with Pandas by default, I suppose it doesn't need to be required per se.

@ryneches
Copy link

It also depends on how closely you want it to reflect the R package. If you want it be pretty close, DataFrames are the logical way to make Python behave in an R-ish kind of way.

@sckott
Copy link
Owner Author

sckott commented Oct 20, 2016

Indeed, the R client is focused on data.frame outputs, so in that way using pandas makes it more similar to that. How common will it be for the same person to be switching between taxize and pytaxize? Or is the most common case any one person uses one or the other?

Do you think returning dicts is the most idiomatic python? That is, is that what someone would expect to get back by default?

Dropping pandas makes this library simpler and easier to maintain, so that's a + for dropping

@ghost
Copy link

ghost commented Oct 26, 2016

Thanks for creating this package, I just came across it today.

I definitely see the case for making the library simpler and easier to maintain but my personal preference would be to maintain the option of getting a pandas dataframe returned if at all possible. I think it is becoming idiomatic, especially in the python data science community, to use pandas dataframes as inputs and returns for functions.

If I can be of assistance in maintaining or testing, please let me know.

@sckott
Copy link
Owner Author

sckott commented Oct 26, 2016

@ColinTalbert thanks for your thoughts. Two votes to keep pandas, so i'm willing to keep it. Maybe it makes sense to allow user to say whether they want a pandas dataframe back or a dict?

If I can be of assistance in maintaining or testing, please let me know.

Thanks for the issue #43 - that kind of help is great! using and reporting bugs. Also, any feature requests is great. There are still many things to port over from taxize - so this client is a ways off from being complete at least wrt its R cousin

@ghost
Copy link

ghost commented Oct 27, 2016

One option I've seen and liked is including an optional parameter to
specify the return format:
https://github.com/ulmo-dev/ulmo/blob/0c6337993542f7a46ed9d528ea9046628cd9f8bc/ulmo/nasa/daymet/core.py#L66

If you used this I think you could also set up the pandas import as
optional, which might might make installation easier for folks who aren't
using pandas (ArcMap users, perhaps)

Colin Talbert
GIS Analyst and Developer
US Geological Survey

talbertc@usgs.gov

USGS Fort Collins Science Center
2150 Centre Ave. Bldg. C
Fort Collins, CO 80526
(970) 226-9425

USGS North Central Climate Science Center
(970) 492-4283

Work schedule:
Monday - 7:00 - 3:00 (FORT)
Tuesday - 7:00 - 3:00 (NC CSC)
Wednesday - 7:00 - 3:00 (FORT)
Thursday - 7:00 - 5:00 (FORT)
Friday - 7:00 - 5:00 (FORT)

On Wed, Oct 26, 2016 at 10:41 AM, Scott Chamberlain <
notifications@github.com> wrote:

@ColinTalbert https://github.com/ColinTalbert thanks for your thoughts.
Two votes to keep pandas, so i'm willing to keep it. Maybe it makes sense
to allow user to say whether they want a pandas dataframe back or a dict?

If I can be of assistance in maintaining or testing, please let me know.

Thanks for the issue #43 #43 -
that kind of help is great! using and reporting bugs. Also, any feature
requests is great. There are still many things to port over from taxize -
so this client is a ways off from being complete at least wrt its R cousin


You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
#35 (comment), or mute
the thread
https://github.com/notifications/unsubscribe-auth/ACJURIu18lCaSa_nKQnK8EYkLal8mAbvks5q34LIgaJpZM4Hbdzo
.

@sckott
Copy link
Owner Author

sckott commented Oct 27, 2016

As a parameter sounds good to me! ✔️

@sckott
Copy link
Owner Author

sckott commented Oct 27, 2016

If you used this I think you could also set up the pandas import as
optional

how do you do optional imports in a python library?

@ghost
Copy link

ghost commented Oct 27, 2016

something along the lines of:

try:
import pandas as pd

except ImportError:
pd = None

Colin Talbert
GIS Analyst and Developer
US Geological Survey

talbertc@usgs.gov

USGS Fort Collins Science Center
2150 Centre Ave. Bldg. C
Fort Collins, CO 80526
(970) 226-9425

USGS North Central Climate Science Center
(970) 492-4283

Work schedule:
Monday - 7:00 - 3:00 (FORT)
Tuesday - 7:00 - 3:00 (NC CSC)
Wednesday - 7:00 - 3:00 (FORT)
Thursday - 7:00 - 5:00 (FORT)
Friday - 7:00 - 5:00 (FORT)

On Thu, Oct 27, 2016 at 9:56 AM, Scott Chamberlain <notifications@github.com

wrote:

If you used this I think you could also set up the pandas import as
optional

how do you do optional imports in a python library?


You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
#35 (comment), or mute
the thread
https://github.com/notifications/unsubscribe-auth/ACJURDzHLtUqSCCo2P6JxWvSf4L4GAfJks5q4Mm_gaJpZM4Hbdzo
.

@ghost
Copy link

ghost commented Oct 27, 2016

and then in in code:

if not pd:
raise Exception("Pandas need for this functionality")

Colin Talbert
GIS Analyst and Developer
US Geological Survey

talbertc@usgs.gov

USGS Fort Collins Science Center
2150 Centre Ave. Bldg. C
Fort Collins, CO 80526
(970) 226-9425

USGS North Central Climate Science Center
(970) 492-4283

Work schedule:
Monday - 7:00 - 3:00 (FORT)
Tuesday - 7:00 - 3:00 (NC CSC)
Wednesday - 7:00 - 3:00 (FORT)
Thursday - 7:00 - 5:00 (FORT)
Friday - 7:00 - 5:00 (FORT)

On Thu, Oct 27, 2016 at 9:58 AM, Talbert, Colin talbertc@usgs.gov wrote:

something along the lines of:

try:
import pandas as pd

except ImportError:
pd = None

Colin Talbert
GIS Analyst and Developer
US Geological Survey

talbertc@usgs.gov

USGS Fort Collins Science Center
2150 Centre Ave. Bldg. C
Fort Collins, CO 80526
(970) 226-9425

USGS North Central Climate Science Center
(970) 492-4283

Work schedule:
Monday - 7:00 - 3:00 (FORT)
Tuesday - 7:00 - 3:00 (NC CSC)
Wednesday - 7:00 - 3:00 (FORT)
Thursday - 7:00 - 5:00 (FORT)
Friday - 7:00 - 5:00 (FORT)

On Thu, Oct 27, 2016 at 9:56 AM, Scott Chamberlain <
notifications@github.com> wrote:

If you used this I think you could also set up the pandas import as
optional

how do you do optional imports in a python library?


You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
#35 (comment),
or mute the thread
https://github.com/notifications/unsubscribe-auth/ACJURDzHLtUqSCCo2P6JxWvSf4L4GAfJks5q4Mm_gaJpZM4Hbdzo
.

@sckott
Copy link
Owner Author

sckott commented Oct 27, 2016

ah, okay

@ghost
Copy link

ghost commented Oct 28, 2016

I've taken a stab at implementing this, take a look at:
https://github.com/ColinTalbert/pytaxize/blob/as_dataframe/pytaxize/itis.py#L9-L13
and:
https://github.com/ColinTalbert/pytaxize/blob/as_dataframe/pytaxize/itis.py#L71-L123
for how I was thinking this would look.

There are obviously many functions in the itis module that would need this change, let alone the modules for other services. If this seems like a workable design pattern, I'll see what I can do to add this functionality to the rest of the itis functions, before submitting a pr.

@sckott
Copy link
Owner Author

sckott commented Oct 28, 2016

thanks @ColinTalbert - looks good, though can we shorten the param as_dataframe to dataframe, just for brevity sake

i realized that there's at least some, didn't look thoroughly through package, internal munging of data with pandas - which I think should be changed to just munging native python data structures - so we'll need to do that too

@sckott
Copy link
Owner Author

sckott commented Oct 28, 2016

don't we also need extra_requires or similar in setup.py ?

@ghost
Copy link

ghost commented Oct 28, 2016

Yes, good point. I'll throw that in as well.

Colin Talbert
GIS Analyst and Developer
US Geological Survey

talbertc@usgs.gov

USGS Fort Collins Science Center
2150 Centre Ave. Bldg. C
Fort Collins, CO 80526
(970) 226-9425

USGS North Central Climate Science Center
(970) 492-4283

Work schedule:
Monday - 7:00 - 3:00 (FORT)
Tuesday - 7:00 - 3:00 (NC CSC)
Wednesday - 7:00 - 3:00 (FORT)
Thursday - 7:00 - 5:00 (FORT)
Friday - 7:00 - 5:00 (FORT)

On Fri, Oct 28, 2016 at 12:38 PM, Scott Chamberlain <
notifications@github.com> wrote:

don't we also need extra_requires or similar in setup.py ?


You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
#35 (comment), or mute
the thread
https://github.com/notifications/unsubscribe-auth/ACJURLFyDCqG4sJlFgjMfMRuTTOR5eyCks5q4kEPgaJpZM4Hbdzo
.

@dheerajpai
Copy link

Why is this thread still open? :( Why can't we sort this out? has there been a pull
request sent?

@sckott
Copy link
Owner Author

sckott commented Aug 7, 2018

@dheerajpai if @ColinTalbert send a PR we can discuss that. if we don't hear from him soon I can go ahead and do it

@sdtaylor sdtaylor mentioned this issue Jan 8, 2019
@sckott sckott closed this as completed in 2e5e5d6 Jun 9, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants