Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

compute API changes #1417

Merged
merged 25 commits into from
Feb 24, 2016
Merged

compute API changes #1417

merged 25 commits into from
Feb 24, 2016

Conversation

kwmsmith
Copy link
Member

@postelrich @llllllllll here's my proof-of-concept for the compute() API changes. It's messy, and some of the imports will have to change, and we'll need some small refactoring, but I like that it doesn't require huge code changes, and fits in with the existing compute implementation easily.

Example usage:

In [26]: s = bz.symbol('s', 'var * {name: string, id: int}')

In [27]: t = bz.symbol('t', 'var * {name: string, a: int}')

In [28]: sdata = [['a', 1], ['b', 2]]

In [29]: tdata = [['a', 5], ['c', 7]]

The default gives back a list as one would expect:

In [30]: bz.compute(bz.join(s, t, 'name', how='outer'), {s: sdata, t: tdata})
Out[30]: [('a', 1, 5), ('c', None, 7), ('b', 2, None)]

And the return type can be easily overridden:

In [31]: bz.compute(bz.join(s, t, 'name', how='outer'), {s: sdata, t: tdata}, return_type=pd.DataFrame)
Out[31]:
  name  id   a
0    a   1   5
1    c NaN   7
2    b   2 NaN

In [32]: bz.compute(bz.join(s, t, 'name', how='outer'), {s: sdata, t: tdata}, return_type=np.ndarray)
Out[32]:
array([('a', 1.0, 5.0), ('c', nan, 7.0), ('b', 2.0, nan)],
      dtype=[('name', 'O'), ('id', '<f4'), ('a', '<f4')])

Richard Postelnik and others added 4 commits February 11, 2016 14:06
Adds `return_type` kwarg that can be set to:
* 'core' (current default) -- odos return value into a core backend
* 'native' -- leaves return value alone, doesn't modify at all
* user supplied type -- passed on to internal `odo` call.
@@ -45,10 +45,13 @@

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we need the changes in here?

@kwmsmith
Copy link
Member Author

@llllllllll agreed on your comments, and if we go with this approach, we'll definitely make those improvements. This is just a quick and dirty POC to demonstrate the API I'm thinking of, and how it can be implemented without a major rename or big refactoring.

What are your thoughts on the API?

@llllllllll
Copy link
Member

I like the API. I would like to be able to register new return types like discussed in the other issue but for a first pass this looks great.

@sandhujasmine
Copy link

+1, it looks good.

  • returning results is a more intuitive response from compute()
  • list output by default makes sense for SQLA backends
  • being able to override output return_type is a good idea

@kwmsmith kwmsmith changed the title POC compute API changes compute API changes Feb 23, 2016
@kwmsmith
Copy link
Member Author

@llllllllll will be merging this in the next day or so, last call.

@llllllllll
Copy link
Member

No other comments, thanks!

kwmsmith added a commit that referenced this pull request Feb 24, 2016
@kwmsmith kwmsmith merged commit bd4bd78 into blaze:master Feb 24, 2016
@kwmsmith kwmsmith deleted the feature/compute-api branch February 24, 2016 14:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants