## Joins (doric join function)
Joins in Spark are tightly coupled to the dataframes involved. For example, given these dataframes:

In [22]:
val leftdf = List((1,"hi"), (2, "bye")).toDF("id-left", "value-left")
val rightdf = List((1,"hi"), (2, "bye")).toDF("id-right", "value-right")

[36mleftdf[39m: [32mDataFrame[39m = [id-left: int, value-left: string]
[36mrightdf[39m: [32mDataFrame[39m = [id-right: int, value-right: string]

In [23]:
leftdf.join(rightdf, col("id-left") === col("id-right")).show

+-------+----------+--------+-----------+
|id-left|value-left|id-right|value-right|
+-------+----------+--------+-----------+
|      1|        hi|       1|         hi|
|      2|       bye|       2|        bye|
+-------+----------+--------+-----------+



This works because the join columns have different names. When both dataframes share a column name, the condition has to be coupled even more tightly, either through aliases or by using the dataframes themselves to extract the column references.
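To see why that extra coupling is needed, here is a minimal sketch of what is expected to happen when the condition uses plain `col("id")` on both sides. It assumes Spark SQL is on the classpath and builds its own local session (in a notebook `spark` usually exists already); the names `l`, `r` and `attempt` are only illustrative.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.col
import scala.util.Try

// Assumption: a local session is enough for this illustration.
val spark = SparkSession.builder().master("local[1]").appName("ambiguous-join").getOrCreate()
import spark.implicits._

val l = List((1, "hi"), (2, "bye")).toDF("id", "value-left")
val r = List((1, "hi"), (2, "bye")).toDF("id", "value-right")

// Both sides expose a column called "id", so an unqualified reference
// cannot be resolved to one side; Spark should reject the join at analysis time.
val attempt = Try(l.join(r, col("id") === col("id")))
println(attempt.isFailure)
```

Wrapping the call in `Try` keeps the cell from aborting, so the failure can be inspected instead of stopping the notebook.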

In [24]:
val leftdf2 = List((1,"hi"), (2, "bye")).toDF("id", "value-left")
val rightdf2 = List((1,"hi"), (2, "bye")).toDF("id", "value-right")

[36mleftdf2[39m: [32mDataFrame[39m = [id: int, value-left: string]
[36mrightdf2[39m: [32mDataFrame[39m = [id: int, value-right: string]

In [25]:
leftdf2.alias("left")
.join(rightdf2.alias("right"), col("left.id") === col("right.id"))
.show

+---+----------+---+-----------+
| id|value-left| id|value-right|
+---+----------+---+-----------+
|  1|        hi|  1|         hi|
|  2|       bye|  2|        bye|
+---+----------+---+-----------+



In [26]:
leftdf2
.join(rightdf2, leftdf2("id") === rightdf2("id"))
.show

+---+----------+---+-----------+
| id|value-left| id|value-right|
+---+----------+---+-----------+
|  1|        hi|  1|         hi|
|  2|       bye|  2|        bye|
+---+----------+---+-----------+



Doric's purpose is to help you decouple your application and make it safer. These examples are simple, but imagine that id has a different type in each dataframe: we would run into the same problem as in the previous examples. Also, as noted, the condition is coupled to these two dataframes or their aliases.
Doric takes a different approach with its join function: a condition that matches a column from the left side with a column from the right side, without being coupled to dataframes that are already defined.

In [27]:
val joinFunc = LeftDF.colInt("id") === RightDF.colInt("id")
leftdf2.join(rightdf2, joinFunc, "inner").show

+---+----------+---+-----------+
| id|value-left| id|value-right|
+---+----------+---+-----------+
|  1|        hi|  1|         hi|
|  2|       bye|  2|        bye|
+---+----------+---+-----------+



[36mjoinFunc[39m: [32mDoricJoinColumn[39m = [33mDoricJoinColumn[39m(
  [33mKleisli[39m(habla.doric.package$LeftDoricColumn$$Lambda$5377/276981435@bfa711f)
)
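The `Kleisli` in the output above suggests that the join column is a function that only gets evaluated once the two dataframes are supplied. As a rough sketch of that idea in plain Spark (this is not doric's implementation; `JoinCondition`, `sameId` and `joinOn` are hypothetical names made up for the illustration), a condition can be modelled as a function of both sides:

```scala
import org.apache.spark.sql.{Column, DataFrame, SparkSession}

// Assumption: a local session is enough for this illustration.
val spark = SparkSession.builder().master("local[1]").appName("deferred-join").getOrCreate()
import spark.implicits._

// Hypothetical alias, not part of doric: a condition that waits for both sides.
type JoinCondition = (DataFrame, DataFrame) => Column

// Defined once, without mentioning any concrete dataframe.
val sameId: JoinCondition = (left, right) => left("id") === right("id")

// Hypothetical helper that applies the deferred condition to concrete dataframes.
def joinOn(left: DataFrame, right: DataFrame, on: JoinCondition, how: String = "inner"): DataFrame =
  left.join(right, on(left, right), how)

val a = List((1, "hi"), (2, "bye")).toDF("id", "value-left")
val b = List((1, "hi"), (2, "bye")).toDF("id", "value-right")

val joined = joinOn(a, b, sameId)
joined.show()
```

Unlike the doric version, this sketch does not check column types; it only shows how a condition can be written before the dataframes exist and reused across different pairs.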

When you need to preprocess a column without creating a new one, that is also very easy:

In [28]:
val leftdf2 = List(("1","hi"), ("2", "bye")).toDF("id", "value-left")
val rightdf2 = List((1,"hi"), (2, "bye")).toDF("id", "value-right")

[36mleftdf2[39m: [32mDataFrame[39m = [id: string, value-left: string]
[36mrightdf2[39m: [32mDataFrame[39m = [id: int, value-right: string]

In [29]:
val joinFunc2 = LeftDF.colString("id") === RightDF(colInt("id").cast[String])
leftdf2.join(rightdf2, joinFunc2, "inner").show

+---+----------+---+-----------+
| id|value-left| id|value-right|
+---+----------+---+-----------+
|  1|        hi|  1|         hi|
|  2|       bye|  2|        bye|
+---+----------+---+-----------+



[36mjoinFunc2[39m: [32mDoricJoinColumn[39m = [33mDoricJoinColumn[39m(
  [33mKleisli[39m(habla.doric.package$LeftDoricColumn$$Lambda$5377/276981435@43139be3)
)
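For comparison, the same preprocessing can be sketched in plain Spark by casting inside the condition. It also avoids adding a column, but the condition stays tied to the two concrete dataframes. The session setup and the names `l`, `r` and `joined` are assumptions for the illustration.

```scala
import org.apache.spark.sql.SparkSession

// Assumption: a local session is enough for this illustration.
val spark = SparkSession.builder().master("local[1]").appName("cast-join").getOrCreate()
import spark.implicits._

val l = List(("1", "hi"), ("2", "bye")).toDF("id", "value-left")
val r = List((1, "hi"), (2, "bye")).toDF("id", "value-right")

// Cast the right-hand id to string inside the condition only;
// the joined result keeps the original int column.
val joined = l.join(r, l("id") === r("id").cast("string"), "inner")
joined.show()
```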

And as you might imagine, we still get all the goodies that doric gives us, such as type checking and error location :D

In [30]:
val joinFunc2 = LeftDF.colInt("id") === RightDF.colInt("id-2")
leftdf2.join(rightdf2, joinFunc2, "inner").show

: 