-
Notifications
You must be signed in to change notification settings - Fork 80
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
rm inter sample/prep tables and fixing tests #1856
rm inter sample/prep tables and fixing tests #1856
Conversation
@@ -1044,7 +985,7 @@ def to_dataframe(self): | |||
meta = qdb.sql_connection.TRN.execute_fetchindex() | |||
|
|||
# Create the dataframe and clean it up a bit | |||
df = pd.DataFrame((list(x) for x in meta), columns=cols) | |||
df = pd.DataFrame((list(x) for x in meta), columns=cols, dtype=str) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The only case that I'm concerned here is if there is a None
in meta
:
In [20]: l = [[1, 3], [None, 2]]
In [21]: df = pd.DataFrame(l, columns=['c1', 'c2'], dtype=str)
In [22]: df
Out[22]:
c1 c2
0 1 3
1 NaN 2
It changes the None
to NaN
. Can this case occur in here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess that's possible if something in the database is NULL but if it is, my guess is that it's fine ...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm concerned about the warning at the end of this section on the pandas docs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure what will be the solution to your concern ...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ideally, all NaN
s should be None
s, but pandas for whatever reason it's applying some kind of transformation although you're passing dtype=str
. I think the only option would be to force everything to be None
afterwards using this:
In [32]: df
Out[32]:
c1 c2
0 1 3
1 NaN 2
In [33]: df.where(pd.notnull(df), None)
Out[33]:
c1 c2
0 1 3
1 None 2
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
K
👍 |
Changes Unknown when pulling 35a7dee on antgonza:rm-inter-tables into * on biocore:str-ing_info_fles*. |
Changes Unknown when pulling 35a7dee on antgonza:rm-inter-tables into * on biocore:str-ing_info_fles*. |
@ElDeveloper, @wasade, @HannesHolste, do you guys have a (few) second(s) to review? |
👍 once test pass |
👍 once tests pass. |
Looks good. |
Changes Unknown when pulling 8d9c655 on antgonza:rm-inter-tables into * on biocore:str-ing_info_fles*. |
Removing:
This also includes the code to transform all tables to varchar but is commented cause it increases the testing time dramatically.