
AnalysisException: resolved attribute(s) missing for blockEntityLinkage #150

Closed
yeikel opened this issue Feb 8, 2019 · 12 comments
yeikel commented Feb 8, 2019

Describe the bug

I'd like to link two dataframes, blocking on a field named cd, using blockEntityLinkage with the following schemas:

df1
 |-- name: string (nullable = true)
 |-- address: string (nullable = true)
 |-- cd: string (nullable = true)

df2
 |-- name: string (nullable = true)
 |-- address: string (nullable = true)
 |-- cd: string (nullable = true)

 val linkedResults = LuceneRDD.blockEntityLinkage(df1, df2,
    linker,
    Array("cd"),
    Array("cd"),
    500
  )

But it produces the following exception:

Exception in thread "main" org.apache.spark.sql.AnalysisException: resolved attribute(s) cd#351 missing from name#444,address#454,cd#101 in operator !Project [name#444, address#454, cd#101, concat(cd#351) AS __PARTITION_COLUMN__#481];;
!Project [name#444, address#454, cd#101, concat(cd#351) AS __PARTITION_COLUMN__#481]
+- Project [name#444, address#454, cd#101]

A self-join on df1 works just fine.

I tried renaming the column, but it did not work.

I am running Spark 2.1.0 with lucenerdd 0.3.3.

Edit: An explanation of the issue can be found here, but I don't believe it can be fixed on my end.

Thank you
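For reference, a minimal sketch (not from the thread; data and names are illustrative) of how this class of AnalysisException arises in plain Spark, without the library: a Column obtained from one DataFrame carries an attribute bound to that DataFrame's plan, so using it in a projection over a different DataFrame leaves a "resolved attribute(s) missing" error — the same shape as the `concat(cd#351)` projection in the stack trace above.

```scala
import org.apache.spark.sql.{AnalysisException, SparkSession}

object ResolvedAttributeRepro {
  // Returns true when Spark rejects the cross-plan column reference.
  def reproduces(): Boolean = {
    val spark = SparkSession.builder().master("local[1]").appName("repro").getOrCreate()
    import spark.implicits._

    val df1 = Seq(("a", "addr1", "01")).toDF("name", "address", "cd")
    val df2 = Seq(("b", "addr2", "01")).toDF("name", "address", "cd")

    // df1("cd") is an AttributeReference resolved against df1's plan.
    // Projecting it over df2 fails analysis with
    // "resolved attribute(s) cd#N missing from name#...,address#...,cd#...".
    val failed =
      try { df2.select(df1("cd")).collect(); false }
      catch { case _: AnalysisException => true }

    spark.stop()
    failed
  }

  def main(args: Array[String]): Unit =
    println(s"AnalysisException raised: ${reproduces()}")
}
```

If the library builds its internal __PARTITION_COLUMN__ from a column reference taken from the wrong side of the linkage, it would hit exactly this error, which would explain why a self-join (where both sides share one plan) works fine.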

@yeikel yeikel changed the title org.apache.spark.sql.AnalysisException: resolved attribute(s) missing org.apache.spark.sql.AnalysisException: resolved attribute(s) missing for blockEntityLinkage Feb 8, 2019
@yeikel yeikel changed the title org.apache.spark.sql.AnalysisException: resolved attribute(s) missing for blockEntityLinkage AnalysisException: resolved attribute(s) missing for blockEntityLinkage Feb 8, 2019

yeikel commented Feb 8, 2019

For now I can use a regular link with no blocking, but ideally I'd like to know if it is possible to use the blocking method.


zouzias commented Feb 8, 2019

Can you try with version 0.3.5 and report back?


yeikel commented Feb 25, 2019

Seems to be fixed in 0.3.5.

@yeikel yeikel closed this as completed Feb 25, 2019
@yeikel yeikel reopened this Feb 25, 2019

yeikel commented Feb 25, 2019

Actually, it is not.

I am still seeing this in 0.3.5:

org.apache.spark.sql.AnalysisException: resolved attribute(s) cd#594 missing from name#738,address#748,mkt_cd#101,id#2 in operator !Project [name#738, address#748, mkt_cd#101, id#2, concat(mkt_cd#594) AS __PARTITION_COLUMN__#778];;


zouzias commented Feb 25, 2019

Can you share the full exception here?


zouzias commented Feb 26, 2019

I pushed a hotfix here: https://github.com/zouzias/spark-lucenerdd/pull/151/files (feedback is welcome), and I plan to release it tonight under 0.3.6-SNAPSHOT.


yeikel commented Feb 27, 2019

Will this be available with Spark 2.1?

I will test it as soon as it is in Maven Central.


zouzias commented Feb 27, 2019

I reproduced it here. I will try to fix it now.


zouzias commented Feb 27, 2019

Released a fix under 0.3.6-SNAPSHOT. Can you check whether the exception still appears?

Tests are clean on the CI: https://travis-ci.org/zouzias/spark-lucenerdd/jobs/499546155

@zouzias zouzias added the bug label Feb 27, 2019
@zouzias zouzias self-assigned this Feb 27, 2019
@zouzias zouzias added this to To do in Kanban via automation Feb 27, 2019
@zouzias zouzias moved this from To do to In progress in Kanban Feb 27, 2019

yeikel commented Mar 8, 2019

I believe it is fixed now. Thank you.

@yeikel yeikel closed this as completed Mar 8, 2019
Kanban automation moved this from In progress to Done Mar 8, 2019

zouzias commented Mar 9, 2019

Glad to hear.


yeikel commented Mar 11, 2019

This is very strange.

When I run it in a cluster (reading from a Hive table), I don't see this error anymore. On the other hand, when I run it locally (reading Parquet files), I still see it. I am not sure how to replicate it, and the contents of the files are sensitive, so I can't share them here.

I need to test more, but let's leave it closed for now.
