You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
I had implement MIMIC III as Postgres SQL localy.
When fetching the column names in Python with pyodbc they partially dont match to the .csv files in compairsion.
def query(table, sql):
"""
table: input tablename to query \n
sql: input SQL query (select * from mimiciii.table) \n
return --> Pandas Dataframe
"""
cnstring = f'DRIVER={{PostgreSQL ODBC Driver(UNICODE)}};SERVER={SERVER};DATABASE={DATABASE};UID={USERNAME};PWD={PASSWORD}'
cnxn = pyodbc.connect(cnstring)
cursor = cnxn.cursor()
colnames = cursor.execute(f"SELECT column_name FROM INFORMATION_SCHEMA.COLUMNS WHERE TABLE_NAME = '{table}';").fetchall()
rows = cursor.execute(sql).fetchall()
df = pd.DataFrame.from_records(data=rows, columns=[colname[0] for colname in colnames])
df.columns = [str(i).upper() for i in df.columns]
cursor.close()
return df
colnames=cursor.execute(f"SELECT column_name FROM INFORMATION_SCHEMA.COLUMNS WHERE TABLE_NAME = '{table}';").fetchall()
Returns column names in a non-deterministic order, so you wouldn't expect the order to match exactly. The two groups look the same, just in a different order. I would verify that the sets of columns are equal, in which case this is expected and you just need to re-order your columns as necessary.
Prerequisites
Description
I had implement MIMIC III as Postgres SQL localy.
When fetching the column names in Python with pyodbc they partially dont match to the .csv files in compairsion.
def query(table, sql):
"""
table: input tablename to query \n
sql: input SQL query (select * from mimiciii.table) \n
return --> Pandas Dataframe
"""
cnstring = f'DRIVER={{PostgreSQL ODBC Driver(UNICODE)}};SERVER={SERVER};DATABASE={DATABASE};UID={USERNAME};PWD={PASSWORD}'
cnxn = pyodbc.connect(cnstring)
cursor = cnxn.cursor()
colnames = cursor.execute(f"SELECT column_name FROM INFORMATION_SCHEMA.COLUMNS WHERE TABLE_NAME = '{table}';").fetchall()
rows = cursor.execute(sql).fetchall()
df = pd.DataFrame.from_records(data=rows, columns=[colname[0] for colname in colnames])
df.columns = [str(i).upper() for i in df.columns]
cursor.close()
return df
Example:
Wrong -- > ['ROW_ID', 'SUBJECT_ID', 'HADM_ID', 'ADMITTIME', 'DISCHTIME',
'DEATHTIME', 'EDREGTIME', 'EDOUTTIME', 'HOSPITAL_EXPIRE_FLAG',
'HAS_CHARTEVENTS_DATA', 'LANGUAGE', 'RELIGION', 'MARITAL_STATUS',
'ETHNICITY', 'DIAGNOSIS', 'ADMISSION_TYPE', 'ADMISSION_LOCATION',
'DISCHARGE_LOCATION', 'INSURANCE']
Correct --> ['ROW_ID', 'SUBJECT_ID', 'HADM_ID', 'ADMITTIME', 'DISCHTIME',
'DEATHTIME', 'EDREGTIME', 'EDOUTTIME', 'DISCHARGE_LOCATION',
'INSURANCE', 'LANGUAGE', 'RELIGION', 'MARITAL_STATUS',
'ETHNICITY', 'DIAGNOSIS', 'ADMISSION_TYPE', 'ADMISSION_LOCATION',
'HOSPITAL_EXPIRE_FLAG', 'HAS_CHARTEVENTS_DATA']
The text was updated successfully, but these errors were encountered: