Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Append common actions to the PandasClient #1464

Closed
wants to merge 11 commits into from

Conversation

tonyfast
Copy link
Contributor

@tonyfast tonyfast commented Jun 1, 2018

This pull request adds list_tables, load_data, and create_table attributes for the PandasClient.

@cpcloud cpcloud self-assigned this Jun 1, 2018
@cpcloud cpcloud added feature Features or general enhancements expressions Issues or PRs related to the expression API pandas The pandas backend labels Jun 1, 2018
@cpcloud cpcloud added this to the 0.14 milestone Jun 1, 2018
Copy link
Member

@cpcloud cpcloud left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for doing this! A few comments.

return list(filter(lambda t: pattern.findall(t), tables))
return tables


Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Kill this newline.

dtypes = dict(ibis_schema_to_pandas(schema))
df = pd.DataFrame(columns=list(dtypes.keys())).astype(dtypes)

self.dictionary[table_name] =df
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Spacing after the = in =df

-------
if_exists : boolean
"""
return bool(self.list_tables(like=name))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could also implement this as name in self.dictionary which avoids going through every table.

@@ -35,12 +35,19 @@ def test_client_table(table):
def test_client_table_repr(table):
assert 'PandasTable' in repr(table)

def test_load_data(client):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use two newlines after function definitions.

df = pd.DataFrame(obj)
else:
dtypes = dict(ibis_schema_to_pandas(schema))
df = pd.DataFrame(columns=list(dtypes.keys())).astype(dtypes)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you should be able to write this as: schema.apply_to(pd.DataFrame()).


def test_literal(client):
lit = ibis.literal(1)
result = client.execute(lit)
assert result == 1

def test_list_tables(client):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

two newlines


def test_literal(client):
lit = ibis.literal(1)
result = client.execute(lit)
assert result == 1

def test_list_tables(client):
assert len(client.list_tables()) > 0
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you can implement this as assert client.list_tables()

tables = list(self.dictionary.keys())
if like is not None:
pattern = re.compile(like)
return list(filter(lambda t: pattern.findall(t), tables))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any particular reason you chose to use pattern.findall here instead of using re.search and checking if the match result is None? We only need to find the first matching occurrence and then terminate. Using findall will return every matching non-overlapping substring, which seems a bit wasteful.

@tonyfast
Copy link
Contributor Author

tonyfast commented Jun 4, 2018

Thanks for the comments. Are these failing circle tests something that this PR caused?

@cpcloud
Copy link
Member

cpcloud commented Jun 4, 2018

@tonyfast nope, I'm working on fixing them in #1458

Copy link
Member

@cpcloud cpcloud left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, except for the exists_table implementation. After that is fixed this is ready to merge.

-------
if_exists : boolean
"""
return name in self.list_tables()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still think this should be name in self.dictionary. No reason to make this unnecessarily slow.

@xmnlab
Copy link
Contributor

xmnlab commented Jun 4, 2018

@tonyfast maybe you can use this workaround https://github.com/Quansight/ibis/blob/master/ci/requirements-dev-3.6.yml#L17 until #1458 is not done.

@cpcloud
Copy link
Member

cpcloud commented Jun 4, 2018

@tonyfast rebase on master and you'll be able to get past any failures that aren't related this PR.

@cpcloud
Copy link
Member

cpcloud commented Jun 4, 2018

@tonyfast Looks like there's another issue, putting a PR up now to fix it.

@cpcloud
Copy link
Member

cpcloud commented Jun 4, 2018

#1468 should fix these errors.

# kwargs is a catch all for any options required by other backends.
self.dictionary[table_name] = pd.DataFrame(obj)

def create_table(self, table_name, obj=None, schema=None):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add a test for this?

@cpcloud
Copy link
Member

cpcloud commented Jun 4, 2018

@tonyfast it looks like you're out of date with master, can you rebase again?

Copy link
Member

@cpcloud cpcloud left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@tonyfast
Copy link
Contributor Author

tonyfast commented Jun 4, 2018

👌Thanks @cpcloud!

@cpcloud cpcloud closed this in f7392fa Jun 4, 2018
@cpcloud
Copy link
Member

cpcloud commented Jun 4, 2018

thanks @tonyfast!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
expressions Issues or PRs related to the expression API feature Features or general enhancements pandas The pandas backend
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants