Error Creating Index due to Bad Chars #14

ScarletGuo · 2018-11-14T02:00:23Z

Error:

LINE 1: ...ATE INDEX census_0126 ON census_01 (migration_code-change_in...
                                                             ^
 [SQL: 'CREATE INDEX census_0126 ON census_01 (migration_code-change_in_msa)'] (Background on this error at: http://sqlalche.me/e/f405)
For table: census_01 ERROR while creating index for table census_01 on attributes ['migration_code-change_in_msa']

===================

tmp solution added to my local holoclean/dataset/table:36

bad_chars = re.compile('[^a-zA-Z0-9]+')
renamed_cols = [bad_chars.sub(' ', col).strip().replace(' ', '_') for col in columns]

The text was updated successfully, but these errors were encountered:

ScarletGuo · 2018-11-14T05:43:44Z

However, the tmp solution above can result in inconsistency of attribute names when it comes to evaluation. Similar changes might be needed for eval.py when loading clean data

richardwu · 2018-11-15T17:34:24Z

The error arises because we don't quote columns (attributes) in our queries since Postgres's lexer attempts to resolve these unquoted tokens while also downcasing the attribute (see https://stackoverflow.com/questions/20878932/are-postgresql-column-names-case-sensitive).

I fixed how we treat attributes as columns in #18 by retaining their original format from the raw datasets. This should no longer be an issue: I tested this by replacing the attribute "HospitalName" with "Hospital-Name" in the hospital dataset

// hospital.csv
ProviderNumber,Hospital-Name,Address1,Address2,Address3,City,State,ZipCode,CountyName,PhoneNumber,HospitalType,HospitalOwner,EmergencyService,Condition,MeasureCode,MeasureName,Score,Sample,Stateavg
10018,callahan eye foundation hospital,1720 university blvd,,,birmingham,al,35233,jefferson,2053258100,acute care hospitals,voluntary non-profit - private,yes,surgical infection prevention,scip-card-2,surgery patients who were taking heart drugs caxxed beta bxockers before coming to the hospitax who were kept on the beta bxockers during the period just before and after their surgery,,,al_scip-card-2
...

// hospital_clean.csv
tid,attribute,correct_val
0,ProviderNumber,10018
0,Hospital-Name,callahan eye foundation hospital
0,Address1,1720 university blvd
...

// hospital_constraints_att.txt
t1&t2&EQ(t1.Condition,t2.Condition)&EQ(t1.MeasureName,t2.MeasureName)&IQ(t1.HospitalType,t2.HospitalType)
t1&t2&EQ(t1.Hospital-Name,t2.Hospital-Name)&IQ(t1.ZipCode,t2.ZipCode)
t1&t2&EQ(t1.Sample,t2.Sample)&IQ(t1.Score,t2.Score)
...

and the test ran perfectly

Precision = 0.96, Recall = 0.69, Repairing Recall = 0.80, F1 = 0.80, Repairing F1 = 0.88, Detected Errors = 435, Total Errors = 509, Correct Repairs = 350, Total Repairs = 365, Total Repairs (Grdth present) = 365

thodrek · 2018-11-22T22:22:27Z

@richardwu is this fixed? if so please close.

richardwu · 2018-11-22T23:18:33Z

Yes this is fixed. I can't seem to close this issue (I believe only the author and/or maintainers can).

This was referenced Nov 15, 2018

Groundwork for fusion implementation and assorted bug fixes #18

Merged

Do not downcase attributes and use quotes when referring to columns in Postgres. #24

Closed

ScarletGuo closed this as completed Nov 23, 2018

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Error Creating Index due to Bad Chars #14

Error Creating Index due to Bad Chars #14

ScarletGuo commented Nov 14, 2018 •

edited

Loading

ScarletGuo commented Nov 14, 2018 •

edited

Loading

richardwu commented Nov 15, 2018 •

edited

Loading

thodrek commented Nov 22, 2018

richardwu commented Nov 22, 2018

Error Creating Index due to Bad Chars #14

Error Creating Index due to Bad Chars #14

Comments

ScarletGuo commented Nov 14, 2018 • edited Loading

Error:

tmp solution added to my local holoclean/dataset/table:36

ScarletGuo commented Nov 14, 2018 • edited Loading

richardwu commented Nov 15, 2018 • edited Loading

thodrek commented Nov 22, 2018

richardwu commented Nov 22, 2018

ScarletGuo commented Nov 14, 2018 •

edited

Loading

ScarletGuo commented Nov 14, 2018 •

edited

Loading

richardwu commented Nov 15, 2018 •

edited

Loading