# How to work with R

* **Difficulty level**: easy
* **Time needed to learn**: 10 minutes or less
  

## R <a id="R"></a>

Because there is no one-to-one correspondence between Python and R data types, SoS tries to translate variables in the most natural way. For example, although `3` and `c(1, 4)` are both of `numeric` type in R (the former has length 1), they are translated to the Python variables `3` (an integer) and `[1, 4]` (a list).

When variables are `%put` from SoS to R, they are converted as follows:

  
  | Python  |  condition |   R |
  | --- | --- |---|
  | `None` | |    `NULL` |
  | `bool` |   | `logical` |
  | `int` |  |  `integer` |
  | `float` |  |  `numeric` |
  | `complex` |  |  `complex` |
  | `str` |  | `character` |
  | Sequence (`list`, `tuple`, ...) |  homogeneous type |  `c()` |
  | Sequence (`list`, `tuple`, ...) |  multiple types |  `list` |
  | `set` |  |  `list` |
  | `dict` |  |  `list` with names |
  | `numpy.ndarray` |  | `array` |
  | `numpy.matrix` |  | `matrix` |
  | `pandas.DataFrame` |  |  `data.frame` |

  Python objects of any other type are transferred as the string `"Unsupported datatype"`.
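The rows of this table can be restated as a small lookup. The sketch below only illustrates the table and is not SoS's actual conversion code: the helper name `expected_r_type` is hypothetical, the `numpy` and `pandas` rows are left out to keep it dependency-free, and treating a sequence as homogeneous only when every element has exactly the same Python type is a simplifying assumption.

```python
from collections.abc import Sequence

# Scalar rows of the table, checked in order. bool comes before int
# because bool is a subclass of int in Python.
_SCALAR_ROWS = [
    (bool, "logical"),
    (int, "integer"),
    (float, "numeric"),
    (complex, "complex"),
    (str, "character"),
]


def expected_r_type(value):
    """Hypothetical helper: return the R type the table lists for a Python value."""
    if value is None:
        return "NULL"
    for py_type, r_type in _SCALAR_ROWS:
        if isinstance(value, py_type):
            return r_type
    if isinstance(value, dict):
        return "list with names"
    if isinstance(value, set):
        return "list"
    if isinstance(value, Sequence):
        # Assumption: "homogeneous" means all elements share one exact type.
        kinds = {type(item) for item in value}
        return "c()" if len(kinds) <= 1 else "list"
    # Anything not covered by the table.
    return "Unsupported datatype"


for sample in (None, True, 123, 1 + 2j, ["1", "2", "3"], [1, 2, "3"], {"a": 1}):
    print(repr(sample), "->", expected_r_type(sample))
```

Under these assumptions, `['1', '2', '3']` maps to `c()` while `[1, 2, '3']` maps to `list`.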

When data is `%get` from R, it is converted as follows:

  
  | R  |  length (n) |   Python |
  | --- | --- |---|
  | `NULL` | |    `None` |
  | `logical` |  `1` |  `bool` |
  | `integer` |  `1` |  `int` |
  | `numeric` |  `1` |  `float` |
  | `character` |  `1` |  `str` |
  | `complex` |  `1` |  `complex` |
  | `logical` |  `n > 1` |  `list` |
  | `integer` |  `n > 1` |  `list` |
  | `complex` |  `n > 1` |  `list` |
  | `numeric` |  `n > 1` |  `list` |
  | `character` |  `n > 1` |  `list` |
  | `list` without names |  `n > 0` | `list` |
  | `list` with names |  `n > 0` |  `dict` (with ordered keys)|
  | `matrix` |  `n > 0` |  `numpy.ndarray` |
  | `data.frame` |  `n > 0` |  `pandas.DataFrame` |
  | `array` |  `n >= 0` |  `numpy.ndarray` |
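The length column for atomic vectors follows one rule: a vector of length 1 becomes a Python scalar, and a longer vector becomes a `list`. The sketch below mimics that rule in plain Python. Modelling an R atomic vector as a Python list and the helper name `unbox` are assumptions made for illustration; this is not how SoS implements the transfer.

```python
def unbox(values):
    """Mimic the length rule for atomic vectors.

    A vector with exactly one element is returned as a bare scalar;
    anything longer is returned as a list. Length 0 is not covered by
    the table, so returning an empty list here is an assumption.
    """
    values = list(values)
    if len(values) == 1:
        return values[0]
    return values


print(unbox([123]))                # scalar
print(unbox([True, False, True]))  # list
```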

For example, scalar data is converted from SoS to R as follows:

In [1]:
null_var = None
num_var = 123
logic_var = True
char_var = '1"23'
comp_var = 1+2j

In [2]:
%get null_var num_var logic_var char_var comp_var
%preview -n null_var num_var logic_var char_var comp_var

NULL

One-dimensional (vector) data is converted from SoS to R as follows:

In [3]:
import numpy
import pandas
char_arr_var = ['1', '2', '3']
list_var = [1, 2, '3']
dict_var = dict(a=1, b=2, c='3')
set_var = {1, 2, '3'}
recursive_var = {'a': {'b': 123}, 'c': True}
logic_arr_var = [True, False, True]
seri_var = pandas.Series([1,2,3,3,3,3])

In [4]:
%get char_arr_var list_var dict_var set_var recursive_var logic_arr_var seri_var
%preview -n char_arr_var list_var dict_var set_var recursive_var logic_arr_var seri_var

Multi-dimensional data is converted from SoS to R as follows:

In [5]:
num_arr_var = numpy.array([1, 2, 3, 4]).reshape(2,2)
mat_var = numpy.matrix([[1,2],[3,4]])

In [6]:
%get num_arr_var mat_var
%preview -n num_arr_var mat_var

0,1
1,2
3,4


0,1
1,2
3,4


Scalar data is converted from R to SoS as follows:

In [7]:
null_var = NULL
num_var = 123
logic_var = TRUE
char_var = '1\"23'
comp_var = 1+2i

In [8]:
%get null_var num_var logic_var char_var comp_var --from R
%preview -n null_var num_var logic_var char_var comp_var

None

123

True

'1"23'

(1+2j)

One-dimensional (vector) data is converted from R to SoS as follows:

In [9]:
num_vector_var = c(1, 2, 3)
logic_vector_var = c(TRUE, FALSE, TRUE)
char_vector_var = c(1, 2, '3')
list_var = list(1, 2, '3')
named_list_var = list(a=1, b=2, c='3')
recursive_var = list(a=1, b=list(c=3, d='whatever'))
seri_var = setNames(c(1,2,3,3,3,3),c(0:5))

In [10]:
%get num_vector_var logic_vector_var char_vector_var list_var named_list_var recursive_var seri_var --from R
%preview -n num_vector_var logic_vector_var char_vector_var list_var named_list_var recursive_var seri_var

[1, 2, 3]

[True, False, True]

['1', '2', '3']

[1, 2, '3']

{'a': 1, 'b': 2, 'c': '3'}

{'a': 1, 'b': {'c': 3, 'd': 'whatever'}}

0    1
1    2
2    3
3    3
4    3
5    3
dtype: int64

Multi-dimensional data is converted from R to SoS as follows:

In [11]:
mat_var = matrix(c(1,2,3,4), nrow=2)
arr_var = array(c(1:16),dim=c(2,2,2,2))

In [12]:
%get mat_var arr_var --from R
%preview -n mat_var arr_var

array([[ 1.,  3.],
       [ 2.,  4.]])

array([[[[ 1,  3],
         [ 2,  4]],

        [[ 5,  7],
         [ 6,  8]]],


       [[[ 9, 11],
         [10, 12]],

        [[13, 15],
         [14, 16]]]])
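The first array in the output above has `1` and `3` in its first row because R's `matrix()` fills values column by column by default. The pure-Python sketch below reproduces that layout for the 2×2 case only. The helper name `fill_column_major` is hypothetical, and the sketch makes no claim about how SoS lays out higher-dimensional arrays.

```python
def fill_column_major(values, nrow):
    """Arrange a flat sequence into rows, filling column by column as R's matrix() does."""
    ncol = len(values) // nrow
    # Element (row, col) of the matrix sits at flat index row + col * nrow.
    return [[values[row + col * nrow] for col in range(ncol)] for row in range(nrow)]


mat_var = fill_column_major([1.0, 2.0, 3.0, 4.0], nrow=2)
print(mat_var)  # [[1.0, 3.0], [2.0, 4.0]]
```

With `numpy`, `numpy.array([1, 2, 3, 4]).reshape(2, 2, order='F')` produces the same column-major layout.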

Note that an R named `list` is transferred to Python as a dictionary, but SoS preserves the order of the keys so that you can recover the order of the list. For example,

In [13]:
Rlist = list(A=1, C='C', B=3, D=c(2, 3))

The dictionary might appear to have a different order when it is displayed:

In [14]:
%get Rlist --from R
Rlist

{'A': 1, 'B': 3, 'C': 'C', 'D': [2, 3]}

The order of the keys and values is actually preserved:

In [15]:
Rlist.keys()

dict_keys(['A', 'C', 'B', 'D'])

In [16]:
Rlist.values()

dict_values([1, 'C', 3, [2, 3]])

It is therefore safe to enumerate the R list in Python:

In [17]:
for idx, (key, val) in enumerate(Rlist.items()):
  print(f"{idx+1} item of Rlist has key {key} and value {val}")

1 item of Rlist has key A and value 1
2 item of Rlist has key C and value C
3 item of Rlist has key B and value 3
4 item of Rlist has key D and value [2, 3]
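A sorted display and a preserved key order can coexist, because some pretty-printers sort dictionary keys for display only. The sketch below uses the standard library's `pprint`, which sorts keys by default, on a plain Python dictionary built in the same order as the R list. Whether the notebook display above uses the same mechanism is an assumption; the variable name `rlist` is only for illustration.

```python
from pprint import pformat

# Keys inserted in the same order as the R list: A, C, B, D.
rlist = {"A": 1, "C": "C", "B": 3, "D": [2, 3]}

# Iteration follows insertion order (guaranteed since Python 3.7).
print(list(rlist))     # ['A', 'C', 'B', 'D']

# pformat sorts the keys for display but leaves the dictionary untouched.
print(pformat(rlist))  # {'A': 1, 'B': 3, 'C': 'C', 'D': [2, 3]}
```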


Note that SoS uses the `feather` modules in Python and R to exchange Python pandas `DataFrame` and R `data.frame` objects, so these modules are required if you would like to exchange data frames between the two languages.
