rm inter sample/prep tables and fixing tests #1856

Merged: 4 commits, Jun 8, 2016

Member

@antgonza antgonza commented Jun 6, 2016

Removing:

  • study_sample_columns table
  • prep_columns table
  • type_lookup
  • get_datatypes
  • cast_to_python
  • as_python_types
  • convert_type

This also includes the code to convert all tables to varchar, but it is commented out because it increases the testing time dramatically.

@@ -1044,7 +985,7 @@ def to_dataframe(self):
meta = qdb.sql_connection.TRN.execute_fetchindex()

# Create the dataframe and clean it up a bit
-df = pd.DataFrame((list(x) for x in meta), columns=cols)
+df = pd.DataFrame((list(x) for x in meta), columns=cols, dtype=str)
Contributor

The only case I'm concerned about here is if there is a None in meta:

In [20]: l = [[1, 3], [None, 2]]

In [21]: df = pd.DataFrame(l, columns=['c1', 'c2'], dtype=str)

In [22]: df
Out[22]: 
    c1 c2
0    1  3
1  NaN  2

It changes the None to NaN. Can this case occur here?

Member Author

I guess that's possible if something in the database is NULL but if it is, my guess is that it's fine ...

Contributor

I'm concerned about the warning at the end of this section in the pandas docs.

Member Author

Not sure what the solution to your concern would be ...

Contributor

Ideally, all NaNs should be Nones, but for whatever reason pandas is applying some kind of transformation even though you're passing dtype=str. I think the only option would be to force everything to be None afterwards using this:

In [32]: df
Out[32]: 
    c1 c2
0    1  3
1  NaN  2

In [33]: df.where(pd.notnull(df), None)
Out[33]: 
     c1 c2
0     1  3
1  None  2
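
The workaround above can be sketched as a small standalone helper. This is only an illustration under stated assumptions: `nan_to_none` is a hypothetical name (it is not part of Qiita or pandas), and the cast to `object` is an added precaution, since newer pandas releases may coerce None back to NaN when the column is numeric.

```python
import pandas as pd


def nan_to_none(df):
    """Return a copy of df in which every missing cell (NaN or None) is None.

    Hypothetical helper for illustration; not part of Qiita or pandas.
    """
    # Cast to object first so pandas can store None in any column,
    # including ones that would otherwise be numeric.
    obj = df.astype(object)
    # Keep cells that are not null; replace the rest with None.
    return obj.where(pd.notnull(obj), None)


# A frame shaped like the one in the session above: one missing cell in c1.
df = pd.DataFrame({'c1': ['1', float('nan')], 'c2': ['3', '2']})
cleaned = nan_to_none(df)
```

Here `cleaned` is a new frame and `df` is left unchanged, so the call could be applied as a final step after the DataFrame is built.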

Member Author

K

@josenavas
Contributor

👍

@coveralls

Coverage Status

Changes Unknown when pulling 35a7dee on antgonza:rm-inter-tables into * on biocore:str-ing_info_fles*.

@antgonza
Member Author

antgonza commented Jun 7, 2016

@ElDeveloper, @wasade, @HannesHolste, do you guys have a (few) second(s) to review?

@josenavas
Contributor

👍 once tests pass

@ElDeveloper
Member

👍 once tests pass.

@HannesHolste
Contributor

Looks good.

@coveralls

Coverage Status

Changes Unknown when pulling 8d9c655 on antgonza:rm-inter-tables into * on biocore:str-ing_info_fles*.

@josenavas josenavas merged commit e8f5d83 into qiita-spots:str-ing_info_fles Jun 8, 2016