Invalid Source Data ROWID: ''2.1'' #37

okennedy · 2015-07-30T16:44:59Z

Follow the steps in the Demo and then run the following query:

SELECT * FROM FINALDATA where (rating > 4)

okennedy · 2015-07-31T23:17:45Z

The issue does not seem to be caused by the selection predicate itself. Rather, the FINALDATA lens runs into trouble when it is asked to classify rows that do not appear in the final result set. Concretely, MISSING_VALUE runs data-harvesting queries of the form:

PROJECT[ROWID <= JOIN_ROWIDS(__LHS_ROWID, __RHS_ROWID), ID <= PRODUCT_ID, CATEGORY <= PRODUCT_CATEGORY, NAME <= PRODUCT_NAME, RATING <= {{ TYPEDRATINGS1_1[__LHS_ROWID] }}, PID <= {{ TYPEDRATINGS1_0[__LHS_ROWID] }}, REVIEW_CT <= {{ TYPEDRATINGS1_2[__LHS_ROWID] }}, BRAND <= PRODUCT_BRAND, __MIMIR_CONDITION <=  ( ({{ TYPEDRATINGS1_0[__LHS_ROWID] }}=PRODUCT_ID)  AND  (JOIN_ROWIDS(__LHS_ROWID, __RHS_ROWID)='2.1') ) ](
  JOIN(
    PROJECT[RATINGS1_PID <= RATINGS1_PID, RATINGS1_RATING <= RATINGS1_RATING, RATINGS1_REVIEW_CT <= RATINGS1_REVIEW_CT, __LHS_ROWID <= ROWID](
      RATINGS1(RATINGS1_PID:string, RATINGS1_RATING:string, RATINGS1_REVIEW_CT:string // ROWID:rowid)
    ),
    PROJECT[PRODUCT_ID <= PRODUCT_ID, PRODUCT_NAME <= PRODUCT_NAME, PRODUCT_BRAND <= PRODUCT_BRAND, PRODUCT_CATEGORY <= PRODUCT_CATEGORY, __RHS_ROWID <= ROWID](
      PRODUCT(PRODUCT_ID:string, PRODUCT_NAME:string, PRODUCT_BRAND:string, PRODUCT_CATEGORY:string // ROWID:rowid)
    )
  )
)

Note the condition:

( ({{ TYPEDRATINGS1_0[__LHS_ROWID] }}=PRODUCT_ID)  AND  (JOIN_ROWIDS(__LHS_ROWID, __RHS_ROWID)='2.1') )

'2.1' is the rowid of the row that the MISSING_VALUE lens is being asked to classify: specifically, the pairing of the 2nd row of ratings1 with the 1st row of product. Looking at the data, these two rows do not join, so ({{ TYPEDRATINGS1_0[__LHS_ROWID] }}=PRODUCT_ID) is false for that pairing.

What seems to be happening is that rating>4 is triggering some sort of premature evaluation of classify() for a row that is straight up not in the result set.
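To make the failure mode concrete, here is a rough Python sketch of why a harvesting query pinned to rowid '2.1' comes back empty. The table contents are made up for illustration (the demo data is not reproduced in this thread), and `join_rowids`, `mimir_condition`, and `harvest` are hypothetical helpers that only mirror the shape of the condition above, not Mimir's actual code. The `lhs.rhs` rowid format is an assumption inferred from '2.1' meaning "row 2 of ratings1, row 1 of product".

```python
# Hypothetical stand-ins for RATINGS1 and PRODUCT, keyed by ROWID.
# These values are invented for illustration; they are not the demo data.
ratings1 = {1: {"pid": "P100"}, 2: {"pid": "P200"}}
product = {1: {"id": "P100"}, 2: {"id": "P200"}}


def join_rowids(lhs_rowid, rhs_rowid):
    # Assumed format: "<lhs>.<rhs>", so '2.1' is ratings1 row 2 with product row 1.
    return f"{lhs_rowid}.{rhs_rowid}"


def mimir_condition(lhs_rowid, rhs_rowid, target_rowid):
    # Same shape as __MIMIR_CONDITION: join predicate AND rowid match.
    joins = ratings1[lhs_rowid]["pid"] == product[rhs_rowid]["id"]
    return joins and join_rowids(lhs_rowid, rhs_rowid) == target_rowid


def harvest(target_rowid):
    # Cross product of both inputs, filtered by the condition.
    return [
        join_rowids(lhs, rhs)
        for lhs in ratings1
        for rhs in product
        if mimir_condition(lhs, rhs, target_rowid)
    ]


print(harvest("2.2"))  # a pairing that joins
print(harvest("2.1"))  # a pairing that does not join, so nothing is found
```

Under these assumptions, asking for '2.1' yields no source row at all, which is consistent with an "Invalid Source Data ROWID" complaint for a row that was never part of the join result.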

okennedy · 2015-07-31T23:18:14Z

For reference, here's the full query:

--- Optimized Query ---
PROJECT[NAME <= PRODUCT_NAME, BRAND <= PRODUCT_BRAND, CATEGORY <= PRODUCT_CATEGORY, REVIEW_CT <= {{ TYPEDRATINGS1_2[__LHS_ROWID] }}, PID <= {{ TYPEDRATINGS1_0[__LHS_ROWID] }}, ID <= PRODUCT_ID, RATING <= CASE WHEN {{ TYPEDRATINGS1_1[__LHS_ROWID] }} IS NULL THEN {{ FINALDATA_3[JOIN_ROWIDS(__LHS_ROWID, __RHS_ROWID)] }} ELSE {{ TYPEDRATINGS1_1[__LHS_ROWID] }} END, __MIMIR_CONDITION <=  ( ({{ TYPEDRATINGS1_0[__LHS_ROWID] }}=PRODUCT_ID)  AND  (CASE WHEN {{ TYPEDRATINGS1_1[__LHS_ROWID] }} IS NULL THEN {{ FINALDATA_3[JOIN_ROWIDS(__LHS_ROWID, __RHS_ROWID)] }} ELSE {{ TYPEDRATINGS1_1[__LHS_ROWID] }} END>4) ) ](
  JOIN(
    PROJECT[RATINGS1_PID <= RATINGS1_PID, RATINGS1_RATING <= RATINGS1_RATING, RATINGS1_REVIEW_CT <= RATINGS1_REVIEW_CT, __LHS_ROWID <= ROWID](
      RATINGS1(RATINGS1_PID:string, RATINGS1_RATING:string, RATINGS1_REVIEW_CT:string // ROWID:rowid)
    ),
    PROJECT[PRODUCT_ID <= PRODUCT_ID, PRODUCT_NAME <= PRODUCT_NAME, PRODUCT_BRAND <= PRODUCT_BRAND, PRODUCT_CATEGORY <= PRODUCT_CATEGORY, __RHS_ROWID <= ROWID](
      PRODUCT(PRODUCT_ID:string, PRODUCT_NAME:string, PRODUCT_BRAND:string, PRODUCT_CATEGORY:string // ROWID:rowid)
    )
  )
)

Legacy25 · 2015-08-22T20:48:02Z

I think this is also fixed by commit fe0ae23.

okennedy · 2015-08-23T12:07:28Z

I'd like to test things a bit more before closing the issue outright, since I still don't have an idea why this got broken in the first place. Do you know why the fix fixed things?

Legacy25 · 2015-08-23T13:48:25Z

The missing value lens was breaking for multiple columns because every missing value model it created was using the same iterator to get the results. So if there was a combination of multiple missing value models and no-op models, only one of the missing value models got the actual data: the first one consumed the iterator, and as of now there is no reset() in the iterator interface.

Now each model gets its own iterator. I think this is why this issue is being resolved.
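For anyone reading along later, the shared-iterator problem can be reproduced in a few lines. This is only a Python sketch of the general pattern, not the actual Mimir code: `Model`, `build_shared`, and `build_separate` are hypothetical names, and the rows are invented.

```python
class Model:
    """Stand-in for a per-column missing value model (hypothetical)."""

    def __init__(self, column, rows):
        self.column = column
        self.rows = rows  # a one-shot iterator over the source rows

    def train(self):
        # Consumes the iterator; there is no reset(), so this works only once.
        return list(self.rows)


def build_shared(columns, source):
    # Buggy pattern: every model holds the same iterator object.
    shared_iterator = iter(source)
    return [Model(column, shared_iterator) for column in columns]


def build_separate(columns, source):
    # Fixed pattern: each model gets a fresh iterator over the source.
    return [Model(column, iter(source)) for column in columns]


source = [("1", 4.5), ("2", None), ("3", 3.0)]  # invented rows
columns = ["RATING", "REVIEW_CT"]

shared_counts = [len(m.train()) for m in build_shared(columns, source)]
separate_counts = [len(m.train()) for m in build_separate(columns, source)]

print(shared_counts)    # [3, 0]: the second model sees no rows
print(separate_counts)  # [3, 3]: every model sees all rows
```

With a shared iterator, whichever model trains first drains it and the rest see nothing, which matches the symptom of only one missing value model getting real data.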

okennedy added bug compiler labels Jul 30, 2015

okennedy modified the milestone: Phase 2 Demo Jul 30, 2015

Legacy25 closed this as completed Aug 22, 2015

Legacy25 reopened this Aug 23, 2015

okennedy closed this as completed Dec 16, 2015
