[![Binder](https://mybinder.org/badge_logo.svg)](https://notebooks.gesis.org/binder/v2/gh/jolin-io/fall-in-love-with-julia/main?filepath=13%20Julia-Python20-%2001%20Accelerating%20Python%20with%20PyJulia.ipynb)

<a href="https://www.jolin.io" target="_blank" rel="noreferrer noopener">
<img src="https://www.jolin.io/assets/Jolin/Jolin-Banner-Website-v1.3-darkmode.webp">
</a>

# Fall-in-love-with-Julia: Accelerating Python 101

Two packages handle interoperability between Julia and Python:

|    | [PyCall.jl](https://github.com/JuliaPy/PyCall.jl) | [PythonCall.jl](https://github.com/cjdoris/PythonCall.jl) |
| -: | --------- | ------------- |
| PyPI | pairs with the [`PyJulia`](https://github.com/JuliaPy/pyjulia) Python package (published as `julia` on PyPI) | pairs with the [`JuliaCall`](https://pypi.org/project/juliacall/) Python package |
| conversions | automatically converts between native types | no automatic conversion, only wrapping |
| dependencies | global package management via `Conda.jl` | per-project package management via `CondaPkg.jl` |
| run Python | use `py"..."` | use `@pyexec "..."` |

Outline
- **[01](https://notebooks.gesis.org/binder/v2/gh/jolin-io/fall-in-love-with-julia/main?filepath=13%20Julia-Python20-%2001%20Accelerating%20Python%20with%20PyJulia.ipynb) first notebook shows how to use PyJulia**
- [02](https://notebooks.gesis.org/binder/v2/gh/jolin-io/fall-in-love-with-julia/main?filepath=13%20Julia-Python20-%2002%20Accelerating%20Python%20with%20JuliaCall.ipynb) second notebook is about JuliaCall
- [03](https://notebooks.gesis.org/binder/v2/gh/jolin-io/fall-in-love-with-julia/main?filepath=13%20Julia-Python20-%2003%20Use%20Python%20with%20PyCall%20and%20PythonCall.ipynb) using Python from Julia with both PyCall and PythonCall
- [special extra](https://mybinder.org/v2/gh/jolin-io/workshop-accelerate-Python-with-Julia/main?filepath=03-example-cython-vs-cpp-vs-julia.ipynb) - Julia vs C++

# PyJulia

Let's look at the basics of calling Julia from Python with PyJulia. Note that this Jupyter notebook runs a **Python kernel**.

In [45]:
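from julia.api import Julia
# compiled_modules=False turns off Julia's precompilation cache (PyJulia's workaround for a statically linked Python)
_jl = Julia(compiled_modules=False)
# Register the %julia and %%julia magics
%load_ext julia.magic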

The julia.magic extension is already loaded. To reload it, use:
  %reload_ext julia.magic


In [75]:
import numpy as np
import pandas as pd

array = np.arange(10)
array

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

`PyJulia` converts between Python and Julia types automatically:

In [77]:
%%julia
println(typeof($array))  # caution! @show does not work because of the special handling of interpolation $
result = isodd.($array)
println(typeof(result))


Vector{Int64}
BitVector


In [14]:
result = %julia isodd.($array)
type(result)

numpy.ndarray
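
As a cross-check on the Python side, the same element-wise test can be written in plain NumPy. This is only a sketch for comparison: it does not call Julia, and `expected` is an illustrative name that is not used elsewhere in this notebook.

```python
import numpy as np

array = np.arange(10)

# NumPy counterpart of Julia's broadcast `isodd.(array)`
expected = array % 2 == 1

print(type(expected).__name__, expected.dtype)
print(expected)
```

With Julia available, `np.array_equal(result, expected)` should then hold for the `result` returned by `%julia isodd.($array)`.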

This is especially handy when working with `True`/`False`:

In [23]:
mycondition = (array == 40).any()
mycondition

False

In [24]:
%%julia
if $mycondition
    println("yes! works")
else
    println("also works :)! (we only want to test that julia actually gets a julia Bool value)")
end

also works :)! (we only want to test that julia actually gets a julia Bool value)
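
Note that `(array == 40).any()` returns a NumPy boolean scalar rather than a built-in Python `bool`. The cell above shows that the interpolated value still works as a Julia condition. If you prefer to be explicit, a plain `bool(...)` cast is a harmless extra step. A small sketch (the name `as_python_bool` is illustrative):

```python
import numpy as np

array = np.arange(10)
mycondition = (array == 40).any()

# NumPy reductions return NumPy scalars, not built-in Python types
print(type(mycondition))

# An explicit cast yields a plain Python bool
as_python_bool = bool(mycondition)
print(type(as_python_bool), as_python_bool)
```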


## DataFrames

In [26]:
data = {
  "calories": [420, 380, 390],
  "duration": [50, 40, 45]
}
df = pd.DataFrame(data, index = ["day1", "day2", "day3"])
df

|      | calories | duration |
| :--- | -------: | -------: |
| day1 | 420 | 50 |
| day2 | 380 | 40 |
| day3 | 390 | 45 |


PyCall.jl has little built-in support for converting a pandas `DataFrame` into a Julia `DataFrame`. Thankfully, there is a separate package, Pandas.jl, which is compatible with PyCall and wraps pandas within Julia.

In [33]:
%%julia
import Pandas
using DataFrames

DataFrame(Pandas.DataFrame($df))

<PyCall.jlwrap 3×2 DataFrame
 Row │ calories  duration
     │ Int64     Int64
─────┼────────────────────
   1 │      420        50
   2 │      380        40
   3 │      390        45>

The other direction works analogously:

In [40]:
%%julia
df2 = DataFrame(grp=repeat(1:2, 3), x=6:-1:1, y=4:9, z=[3:7; missing], id='a':'f')

<PyCall.jlwrap 6×5 DataFrame
 Row │ grp    x      y      z        id
     │ Int64  Int64  Int64  Int64?   Char
─────┼────────────────────────────────────
   1 │     1      6      4        3  a
   2 │     2      5      5        4  b
   3 │     1      4      6        5  c
   4 │     2      3      7        6  d
   5 │     1      2      8        7  e
   6 │     2      1      9  missing  f>

In [41]:
jdf2 = %julia df2
df2 = %julia Pandas.DataFrame(df2)
type(jdf2), type(df2)

(PyCall.jlwrap, pandas.core.frame.DataFrame)

In [43]:
df2

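|    | grp | x | y | z | id |
| -: | --: | -: | -: | --: | :-- |
| 0 | 1 | 6 | 4 | 3.0 | `<PyCall.jlwrap a>` |
| 1 | 2 | 5 | 5 | 4.0 | `<PyCall.jlwrap b>` |
| 2 | 1 | 4 | 6 | 5.0 | `<PyCall.jlwrap c>` |
| 3 | 2 | 3 | 7 | 6.0 | `<PyCall.jlwrap d>` |
| 4 | 1 | 2 | 8 | 7.0 | `<PyCall.jlwrap e>` |
| 5 | 2 | 1 | 9 | NaN | `<PyCall.jlwrap f>` |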
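
In the pandas output above, `z` has become a float column because the `missing` entry arrived as `NaN`. If integer values matter, pandas' nullable `Int64` dtype can restore them. The sketch below rebuilds only the `z` column by hand, so it runs without Julia. `z_float` and `z_int` are illustrative names.

```python
import numpy as np
import pandas as pd

# Hand-built stand-in for the converted `z` column shown above
z_float = pd.Series([3.0, 4.0, 5.0, 6.0, 7.0, np.nan], name="z")

# Nullable integer dtype: whole-number floats become ints, NaN becomes <NA>
z_int = z_float.astype("Int64")

print(z_int.dtype)
print(z_int.tolist())
```

On the real frame, the same idea would read `df2["z"] = df2["z"].astype("Int64")`.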


For more comparisons between pandas and DataFrames.jl, see https://dataframes.juliadata.org/stable/man/comparisons/

## How to do it without %julia magic?

Julia's `Main` module, importable in Python as `julia.Main`, lets you access almost everything in Julia programmatically.

In [68]:
from julia import Main as jl

In [69]:
jl.eval("""
broadcast_isodd(a) = isodd.(a)
""")

<PyCall.jlwrap broadcast_isodd>

In [70]:
jl.broadcast_isodd(array)

array([False,  True, False,  True, False,  True, False,  True, False,
        True])

Caution: autoconversion, while handy, can also become tricky.

While this works

In [60]:
%julia DataFrame(Pandas.DataFrame($df))

<PyCall.jlwrap 3×2 DataFrame
 Row │ calories  duration
     │ Int64     Int64
─────┼────────────────────
   1 │      420        50
   2 │      380        40
   3 │      390        45>

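This does not work. The stack trace shows `DataFrame(x::PyObject)`, which suggests that the result of `Pandas.DataFrame(df)` was converted back into a Python object before `jl.DataFrame` received it: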

In [71]:
from julia import Pandas
jl.DataFrame(Pandas.DataFrame(df))

ValueError: <PyCall.jlwrap (in a Julia function called from Python)
JULIA: ArgumentError: 'PyObject' iterates 'String' values, which doesn't satisfy the Tables.jl `AbstractRow` interface
Stacktrace:
  [1] invalidtable(#unused#::PyObject, #unused#::String)
    @ Tables ~/.julia/packages/Tables/AcRIE/src/tofromdatavalues.jl:41
  [2] iterate
    @ ~/.julia/packages/Tables/AcRIE/src/tofromdatavalues.jl:47 [inlined]
  [3] buildcolumns
    @ ~/.julia/packages/Tables/AcRIE/src/fallbacks.jl:209 [inlined]
  [4] _columns
    @ ~/.julia/packages/Tables/AcRIE/src/fallbacks.jl:274 [inlined]
  [5] columns
    @ ~/.julia/packages/Tables/AcRIE/src/fallbacks.jl:258 [inlined]
  [6] DataFrame(x::PyObject; copycols::Nothing)
    @ DataFrames ~/.julia/packages/DataFrames/JZ7x5/src/other/tables.jl:58
  [7] DataFrame(x::PyObject)
    @ DataFrames ~/.julia/packages/DataFrames/JZ7x5/src/other/tables.jl:48
  [8] invokelatest(::Any, ::Any, ::Vararg{Any}; kwargs::Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
    @ Base ./essentials.jl:816
  [9] invokelatest(::Any, ::Any, ::Vararg{Any})
    @ Base ./essentials.jl:813
 [10] _pyjlwrap_call(f::Type, args_::Ptr{PyCall.PyObject_struct}, kw_::Ptr{PyCall.PyObject_struct})
    @ PyCall ~/.julia/packages/PyCall/twYvK/src/callback.jl:28
 [11] pyjlwrap_call(self_::Ptr{PyCall.PyObject_struct}, args_::Ptr{PyCall.PyObject_struct}, kw_::Ptr{PyCall.PyObject_struct})
    @ PyCall ~/.julia/packages/PyCall/twYvK/src/callback.jl:44>

As a fallback, you can always define a custom Julia function and pass Python objects to it.

(Unfortunately, the `$` interpolation syntax is only available inside the Jupyter `%julia` magics, not in plain Python.)

In [73]:
jl.eval("""
pdf2jdf(df) = DataFrame(Pandas.DataFrame(df))
""")

<PyCall.jlwrap pdf2jdf>

In [74]:
jl.pdf2jdf(df)

<PyCall.jlwrap 3×2 DataFrame
 Row │ calories  duration
     │ Int64     Int64
─────┼────────────────────
   1 │      420        50
   2 │      380        40
   3 │      390        45>
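
When several such helpers are needed, the `jl.eval` calls can be grouped in one Python function. The following is a minimal sketch, not part of PyJulia: `install_julia_helpers` and `JULIA_HELPERS` are hypothetical names, and `jl` is assumed to be the `Main` module imported above (any object with an `eval` method works, which keeps the sketch importable without Julia). Each statement is evaluated separately, mirroring the single-definition `jl.eval` calls used in this notebook.

```python
# Julia statements to evaluate, one per jl.eval call
JULIA_HELPERS = [
    "import Pandas",
    "using DataFrames",
    "pdf2jdf(df) = DataFrame(Pandas.DataFrame(df))",
    "broadcast_isodd(a) = isodd.(a)",
]


def install_julia_helpers(jl, statements=JULIA_HELPERS):
    """Evaluate each Julia statement via jl.eval and return how many ran."""
    count = 0
    for statement in statements:
        jl.eval(statement)
        count += 1
    return count


# Usage with PyJulia (assumes Julia, Pandas.jl and DataFrames.jl are installed):
#   from julia import Main as jl
#   install_julia_helpers(jl)
#   jl.pdf2jdf(df)
```

Keeping the conversion logic inside Julia functions means the intermediate values never pass through Python, which avoids the autoconversion pitfall shown above.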

# Next: [02 JuliaCall](https://notebooks.gesis.org/binder/v2/gh/jolin-io/fall-in-love-with-julia/main?filepath=13%20Julia-Python20-%2002%20Accelerating%20Python%20with%20JuliaCall.ipynb)


For questions or suggestions please contact me at stephan.sahm@jolin.io

