This note shows how to convert a join into an arrow ([`R` version](https://github.com/WinVector/rquery/blob/master/Examples/Arrow/JoinArrow.md), [`Python` version](https://github.com/WinVector/data_algebra/blob/master/Examples/Arrow/JoinArrow.md)).

In [1]:
import pandas

from data_algebra.data_ops import *
from data_algebra.arrow import *

d1 = pandas.DataFrame({
    'key': ['a', 'b'],
    'x': [1, 2],
})

d1

,key,x
0,a,1
1,b,2


In [2]:
table_1_description = describe_table(d1, table_name='d1')

d2 = pandas.DataFrame({
    'key': ['b', 'c'],
    'y': [3, 4],
})

d2

,key,y
0,b,3
1,c,4


In [3]:
table_2_description = describe_table(d2, table_name='d2')

# full join of d1 and d2 on the shared 'key' column
ops = table_1_description.natural_join(
    b=table_2_description,
    by=['key'],
    jointype='FULL')

In [4]:
# treat d1 as the arrow's free (input) table; d2 stays fixed inside the pipeline
arrow_1 = DataOpArrow(ops, free_table_key=table_1_description.key)
#arrow_1.fit(d1)

print(arrow_1)

[
 'd1':
  [ key: <class 'str'>, x: <class 'numpy.int64'> ]
   ->
  [ key, x, y ]
]



In [5]:
# same pipeline, but now d2 is the free (input) table and d1 stays fixed
arrow_2 = DataOpArrow(ops, free_table_key=table_2_description.key)

print(arrow_2)

[
 'd2':
  [ key: <class 'str'>, y: <class 'numpy.int64'> ]
   ->
  [ key, x, y ]
]



In [6]:
# evaluate the underlying join pipeline, supplying both tables by name
arrow_1.pipeline.eval_pandas({
    'd1': d1,
    'd2': d2,
})


,key,x,y
0,a,1.0,
1,b,2.0,3.0
2,c,,4.0
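
As a cross-check, the same full join can be sketched in plain `pandas`, without `data_algebra`. This is only an illustration of what the result above means, not how `data_algebra` computes it. The `indicator=True` argument adds a `_merge` column showing which table supplied each row, which explains the missing values: `x` and `y` become floating point because `pandas` represents the missing entries as `NaN`.

```python
import pandas

d1 = pandas.DataFrame({'key': ['a', 'b'], 'x': [1, 2]})
d2 = pandas.DataFrame({'key': ['b', 'c'], 'y': [3, 4]})

# how='outer' is pandas' name for a full join: keep keys found in either table.
# indicator=True adds a `_merge` column: 'left_only', 'right_only', or 'both'.
res = d1.merge(d2, on='key', how='outer', indicator=True)

print(res)
```

Key `a` appears only in `d1`, so its `y` is missing; key `c` appears only in `d2`, so its `x` is missing; key `b` is matched in both tables.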


More on the category-theoretic ("arrow") treatment of data transformations can be found [here](https://github.com/WinVector/data_algebra/blob/master/Examples/Arrow/Arrow.md).
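
To make the arrow idea concrete, here is a minimal sketch, assuming nothing about how `data_algebra` implements `DataOpArrow`. The class `ToyArrow` and the names `join_d2` and `keep_key_y` are hypothetical and exist only for this illustration. A toy arrow records the columns it requires and the columns it produces, and two arrows compose only when the first produces everything the second requires. Fixing `d2` inside the transform mirrors the way `arrow_1` above leaves only `d1` free.

```python
from dataclasses import dataclass
from typing import Callable, Tuple

import pandas


@dataclass(frozen=True)
class ToyArrow:
    """Hypothetical stand-in for an arrow (not the data_algebra class)."""
    incoming: Tuple[str, ...]   # columns the transform requires
    outgoing: Tuple[str, ...]   # columns the transform produces
    fn: Callable[[pandas.DataFrame], pandas.DataFrame]

    def apply(self, d: pandas.DataFrame) -> pandas.DataFrame:
        missing = [c for c in self.incoming if c not in d.columns]
        if missing:
            raise ValueError(f"missing required columns: {missing}")
        # pass only the declared inputs, return only the declared outputs
        return self.fn(d[list(self.incoming)])[list(self.outgoing)]

    def __rshift__(self, other: "ToyArrow") -> "ToyArrow":
        # self >> other: apply self first, then other
        missing = [c for c in other.incoming if c not in self.outgoing]
        if missing:
            raise ValueError(f"cannot compose, missing columns: {missing}")
        return ToyArrow(self.incoming, other.outgoing,
                        lambda d: other.apply(self.apply(d)))


d1 = pandas.DataFrame({'key': ['a', 'b'], 'x': [1, 2]})
d2 = pandas.DataFrame({'key': ['b', 'c'], 'y': [3, 4]})

# d2 is captured as a fixed table; only the d1-shaped input is free
join_d2 = ToyArrow(('key', 'x'), ('key', 'x', 'y'),
                   lambda d: d.merge(d2, on='key', how='outer'))
keep_key_y = ToyArrow(('key', 'y'), ('key', 'y'), lambda d: d)

composed = join_d2 >> keep_key_y   # allowed: join_d2 produces 'key' and 'y'
print(composed.apply(d1))
```

Composing in the other order, `keep_key_y >> join_d2`, raises a `ValueError` in this sketch, because `keep_key_y` does not produce the `x` column that `join_d2` requires. Checking composability from the column signatures alone, before any data is seen, is the point of the arrow view.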

(Examples of advanced inter-operation among the [`R` `rquery` package](https://github.com/WinVector/rquery/), the [`Python` `data_algebra` package](https://github.com/WinVector/data_algebra), and `SQL` can be found [here](https://github.com/WinVector/data_algebra/blob/master/Examples/LogisticExample/ScoringExample.md).)
