"Unable to infer schema" often means Spark is reading a Parquet file with zero rows. This error can occur if one of your input files has zero rows. Another possibility is that a Splink intermediate output has zero rows. It might help to increase target_rows.
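To rule out the zero-row case on the input side, a quick check along these lines might help. This is only a sketch: find_empty_inputs is a hypothetical helper name, not part of Splink or PySpark, and it assumes each value behaves like a Spark DataFrame, since it only calls limit and count.

```python
def find_empty_inputs(frames):
    """Return the labels of DataFrames that contain no rows.

    `frames` maps a label of your choosing to a Spark DataFrame.
    Only `limit` and `count` are called, so at most one row per
    DataFrame needs to be counted.
    """
    return [name for name, df in frames.items() if df.limit(1).count() == 0]
```

Calling find_empty_inputs({"input": df}) before building the linker returns the labels of any empty inputs. An empty list means the inputs are fine, and the zero-row table is more likely a Splink intermediate output.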
I'm running into issues when trying to run the example code (https://moj-analytical-services.github.io/splink/demos/example_simple_pyspark.html) on my Spark cluster.
The following command fails:
linker.estimate_u_using_random_sampling(target_rows=5e5)
AnalysisException: 'Unable to infer schema for Parquet. It must be specified manually'
I had set break_lineage_method to "parquet".
When I set break_lineage_method to "persist" or "checkpoint", I instead get the error message "str function is not callable".
I'd appreciate any help in resolving this issue.